77 changes: 39 additions & 38 deletions docs/en/sql-reference/00-sql-reference/10-data-types/vector.md

Expand Up @@ -3,7 +3,7 @@ title: ARRAY_AGG
title_includes: LIST
---

The ARRAY_AGG function (also known by its alias LIST) transforms all the values, including NULL, of a specific column in a query result into an array.
The ARRAY_AGG function (also known by its alias LIST) transforms all the values, excluding NULL, of a specific column in a query result into an array.

## Syntax

Expand Down Expand Up @@ -68,4 +68,4 @@ GROUP BY movie_title;
| movie_title | ratings |
|-------------|------------|
| Inception | [5, 5, 4] |
```
```
Expand Up @@ -46,13 +46,28 @@ This function performs vector computations within Databend and does not rely on

## Examples

### Basic Usage

```sql
-- Calculate cosine distance between two vectors
SELECT COSINE_DISTANCE([1.0, 2.0, 3.0]::vector(3), [4.0, 5.0, 6.0]::vector(3)) AS distance;
```

Result:
```
╭─────────────╮
│ distance │
├─────────────┤
│ 0.025368214 │
╰─────────────╯
```
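
The value above can be cross-checked outside Databend. The following is a minimal Python sketch, assuming the usual definition of cosine distance as one minus cosine similarity; the helper name `cosine_distance` is hypothetical and this is not Databend's implementation.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Same inputs as the SQL example: [1, 2, 3] and [4, 5, 6].
distance = cosine_distance([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(round(distance, 6))
```

The double-precision result agrees with the `0.025368214` shown above to about six decimal places; small differences in the trailing digits are expected if the database computes in single precision.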

Create a table with vector data:

```sql
CREATE OR REPLACE TABLE vectors (
id INT,
vec VECTOR(3),
VECTOR INDEX idx_vec(vec) distance='cosine'
vec VECTOR(3)
);

INSERT INTO vectors VALUES
Expand All @@ -64,20 +79,22 @@ INSERT INTO vectors VALUES
Find the vector most similar to [1, 2, 3]:

```sql
SELECT
SELECT
id,
vec,
COSINE_DISTANCE(vec, [1.0000, 2.0000, 3.0000]::VECTOR(3)) AS distance
FROM
vectors
ORDER BY
distance ASC
LIMIT 1;
distance ASC;
```

```
+-------------------------+----------+
| vec | distance |
+-------------------------+----------+
| [1.0000,2.2000,3.0000] | 0.0 |
+-------------------------+----------+
╭────────────────────────────────────╮
│ id │ vec │ distance │
├────┼───────────┼───────────────────┤
│ 1 │ [1,2,3] │ 0.000000059604645 │
│ 2 │ [1,2.2,3] │ 0.00096315145 │
│ 3 │ [4,5,6] │ 0.025368214 │
╰────────────────────────────────────╯
```
Expand Up @@ -49,13 +49,28 @@ Where v1ᵢ and v2ᵢ are the elements of the input vectors.

## Examples

### Basic Usage

```sql
-- Calculate L2 distance between two vectors
SELECT L2_DISTANCE([1.0, 2.0, 3.0]::vector(3), [4.0, 5.0, 6.0]::vector(3)) AS distance;
```

Result:
```
╭──────────╮
│ distance │
├──────────┤
│ 5.196152 │
╰──────────╯
```

Create a table with vector data:

```sql
CREATE OR REPLACE TABLE vectors (
id INT,
vec VECTOR(3),
VECTOR INDEX idx_vec(vec) distance='l2'
vec VECTOR(3)
);

INSERT INTO vectors VALUES
Expand All @@ -78,11 +93,12 @@ ORDER BY
```

```
+----+-------------------------+----------+
| id | vec | distance |
+----+-------------------------+----------+
| 1 | [1.0000,2.0000,3.0000] | 0.0 |
| 2 | [1.0000,2.2000,3.0000] | 0.2 |
| 3 | [4.0000,5.0000,6.0000] | 5.196152 |
+----+-------------------------+----------+
```
╭─────────────────────────────╮
│ id │ vec │ distance │
├────┼───────────┼────────────┤
│ 1 │ [1,2,3] │ 0 │
│ 2 │ [1,2.2,3] │ 0.20000005 │
│ 3 │ [4,5,6] │ 5.196152 │
╰─────────────────────────────╯
```
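
The `0.20000005` in the updated output, where the older output showed `0.2`, is consistent with the vector elements being held in single precision. The Python sketch below illustrates this under that assumption (this page does not state the storage type); `to_float32` and `l2_distance` are hypothetical helpers, not Databend functions.

```python
import math
import struct

def to_float32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def l2_distance(a, b):
    """Euclidean distance: square root of the sum of squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

query = [1.0, 2.0, 3.0]
rows = {1: [1.0, 2.0, 3.0], 2: [1.0, 2.2, 3.0], 3: [4.0, 5.0, 6.0]}

# Assumption: each stored element is rounded to float32 before the distance is taken.
distances = {}
for row_id, vec in rows.items():
    vec32 = [to_float32(v) for v in vec]
    distances[row_id] = l2_distance(vec32, query)
    print(row_id, distances[row_id])
```

Because 2.2 has no exact binary representation, its single-precision value is slightly above 2.2, so the distance for row 2 comes out slightly above 0.2. Rows 1 and 3 use values that are exactly representable and are unaffected.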

Expand Up @@ -37,38 +37,52 @@ Formula: `L1_DISTANCE(a, b) = |a1 - b1| + |a2 - b2| + ... + |an - bn|`

```sql
-- Calculate L1 distance between two vectors
SELECT L1_DISTANCE([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]) AS distance;
SELECT L1_DISTANCE([1.0, 2.0, 3.0]::vector(3), [4.0, 5.0, 6.0]::vector(3)) AS distance;
```

Result:
```
╭──────────╮
│ distance │
├──────────┤
│        9 │
╰──────────╯
```
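
The result follows directly from the formula `|a1 - b1| + |a2 - b2| + ... + |an - bn|`. As a minimal sketch in Python (the helper name `l1_distance` is hypothetical and only mirrors that formula):

```python
def l1_distance(a, b):
    """Manhattan distance: sum of absolute element-wise differences."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return sum(abs(x - y) for x, y in zip(a, b))

# |1 - 4| + |2 - 5| + |3 - 6| = 3 + 3 + 3
result = l1_distance([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(result)  # 9.0
```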

### Using with VECTOR Type
Create a table with vector data:

```sql
-- Create table with VECTOR columns
CREATE TABLE products (
CREATE OR REPLACE TABLE vectors (
id INT,
features VECTOR(3),
VECTOR INDEX idx_features(features) distance='l1'
vec VECTOR(3)
);

INSERT INTO products VALUES
(1, [1.0, 2.0, 3.0]::VECTOR(3)),
(2, [2.0, 3.0, 4.0]::VECTOR(3));
INSERT INTO vectors VALUES
(1, [1.0000, 2.0000, 3.0000]),
(2, [1.0000, 2.2000, 3.0000]),
(3, [4.0000, 5.0000, 6.0000]);
```

Find the vector closest to [1, 2, 3] using L1 distance:

-- Find products similar to a query vector using L1 distance
SELECT
```sql
SELECT
id,
features,
L1_DISTANCE(features, [1.5, 2.5, 3.5]::VECTOR(3)) AS distance
FROM products
ORDER BY distance ASC
LIMIT 5;
vec,
L1_DISTANCE(vec, [1.0000, 2.0000, 3.0000]::VECTOR(3)) AS distance
FROM
vectors
ORDER BY
distance ASC;
```

```
╭─────────────────────────────╮
│ id │ vec │ distance │
├────┼───────────┼────────────┤
│ 1 │ [1,2,3] │ 0 │
│ 2 │ [1,2.2,3] │ 0.20000005 │
│ 3 │ [4,5,6] │ 9 │
╰─────────────────────────────╯
```

Expand Up @@ -9,30 +9,30 @@ This section provides reference information for vector functions in Databend. Th

| Function | Description | Example |
|----------|-------------|--------|
| [COSINE_DISTANCE](./00-vector-cosine-distance.md) | Calculates angular distance between vectors (range: 0-1) | `COSINE_DISTANCE([1,2,3], [4,5,6])` |
| [L1_DISTANCE](./02-vector-l1-distance.md) | Calculates Manhattan (L1) distance between vectors | `L1_DISTANCE([1,2,3], [4,5,6])` |
| [L2_DISTANCE](./01-vector-l2-distance.md) | Calculates Euclidean (straight-line) distance | `L2_DISTANCE([1,2,3], [4,5,6])` |
| [COSINE_DISTANCE](./00-vector-cosine-distance.md) | Calculates cosine distance between vectors (range: 0-1) | `COSINE_DISTANCE([1,2,3]::VECTOR(3), [4,5,6]::VECTOR(3))` |
| [L1_DISTANCE](./02-vector-l1-distance.md) | Calculates Manhattan (L1) distance between vectors | `L1_DISTANCE([1,2,3]::VECTOR(3), [4,5,6]::VECTOR(3))` |
| [L2_DISTANCE](./01-vector-l2-distance.md) | Calculates Euclidean (straight-line) distance | `L2_DISTANCE([1,2,3]::VECTOR(3), [4,5,6]::VECTOR(3))` |
| [INNER_PRODUCT](./03-inner-product.md) | Calculates the inner product (dot product) of two vectors | `INNER_PRODUCT([1,2,3]::VECTOR(3), [4,5,6]::VECTOR(3))` |

## Vector Analysis Functions

| Function | Description | Example |
|----------|-------------|--------|
| [INNER_PRODUCT](./03-inner-product.md) | Calculates the inner product (dot product) of two vectors | `INNER_PRODUCT([1,2,3], [4,5,6])` |
| [VECTOR_NORM](./05-vector-norm.md) | Calculates the L2 norm (magnitude) of a vector | `VECTOR_NORM([1,2,3])` |
| [VECTOR_DIMS](./04-vector-dims.md) | Returns the dimensionality of a vector | `VECTOR_DIMS([1,2,3])` |
| [VECTOR_NORM](./05-vector-norm.md) | Calculates the L2 norm (magnitude) of a vector | `VECTOR_NORM([1,2,3]::VECTOR(3))` |
| [VECTOR_DIMS](./04-vector-dims.md) | Returns the dimensionality of a vector | `VECTOR_DIMS([1,2,3]::VECTOR(3))` |

## Distance Functions Comparison

| Function | Description | Range | Best For | Use Cases |
|----------|-------------|-------|----------|-----------|
| [COSINE_DISTANCE](./00-vector-cosine-distance.md) | Angular distance between vectors | [0, 1] | When direction matters more than magnitude | • Document similarity<br/>• Semantic search<br/>• Recommendation systems<br/>• Text analysis |
| [COSINE_DISTANCE](./00-vector-cosine-distance.md) | Cosine distance between vectors | [0, 1] | When direction matters more than magnitude | • Document similarity<br/>• Semantic search<br/>• Recommendation systems<br/>• Text analysis |
| [L1_DISTANCE](./02-vector-l1-distance.md) | Manhattan (L1) distance between vectors | [0, ∞) | Robust to outliers | • Feature comparison<br/>• Outlier detection<br/>• Grid-based pathfinding<br/>• Clustering algorithms |
| [L2_DISTANCE](./01-vector-l2-distance.md) | Euclidean (straight-line) distance | [0, ∞) | When magnitude matters | • Image similarity<br/>• Geographical data<br/>• Anomaly detection<br/>• Feature-based clustering |
| [L2_DISTANCE](./01-vector-l2-distance.md) | Euclidean (straight-line) distance | [0, ∞) | When magnitude and absolute differences are important | • Image similarity<br/>• Geographical data<br/>• Anomaly detection<br/>• Feature-based clustering |
| [INNER_PRODUCT](./03-inner-product.md) | Dot product of two vectors | (-∞, ∞) | When both magnitude and direction are important | • Neural networks<br/>• Machine learning<br/>• Physics calculations<br/>• Vector projections |

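To make the "Best For" column concrete, the Python sketch below compares the four measures on a vector and a scaled copy of it. It assumes the standard textbook definitions of each measure; the function names are hypothetical stand-ins, not Databend's implementation.

```python
import math

def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))

def l1_distance(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)

v = [1.0, 2.0, 3.0]
scaled = [2.0, 4.0, 6.0]  # same direction as v, twice the magnitude

results = {
    "cosine": cosine_distance(v, scaled),
    "l1": l1_distance(v, scaled),
    "l2": l2_distance(v, scaled),
    "inner_product": inner_product(v, scaled),
}
for name, value in results.items():
    print(name, value)
```

The cosine distance is zero up to rounding because the two vectors point the same way, while the L1 and L2 distances are non-zero because the magnitudes differ. The inner product grows with both alignment and magnitude.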
## Vector Analysis Functions Comparison

| Function | Description | Range | Best For | Use Cases |
|----------|-------------|-------|----------|-----------|
| [INNER_PRODUCT](./03-inner-product.md) | Dot product of two vectors | (-∞, ∞) | Measuring vector similarity and projections | • Neural networks<br/>• Machine learning<br/>• Physics calculations<br/>• Vector projections |
| [VECTOR_NORM](./05-vector-norm.md) | L2 norm (magnitude) of a vector | [0, ∞) | Vector normalization and magnitude | • Vector normalization<br/>• Feature scaling<br/>• Magnitude calculations<br/>• Physics applications |
| [VECTOR_DIMS](./04-vector-dims.md) | Number of vector dimensions | [1, 4096] | Vector validation and processing | • Data validation<br/>• Dynamic processing<br/>• Debugging<br/>• Compatibility checks |