docs/en/sql-reference/20-sql-functions/10-search-functions/index.md
+65 -24 (65 additions & 24 deletions)
@@ -2,50 +2,91 @@
2
2
title: Full-Text Search Functions
3
3
---
4
4
5
-
This section provides reference information for the full-text search functions in Databend. These functions enable powerful text search capabilities similar to those found in dedicated search engines.
5
+
Databend's full-text search functions deliver search-engine-style filtering for semi-structured `VARIANT` data and plain text columns that are indexed with an inverted index. They are ideal for AI-generated metadata, such as perception results from autonomous-driving video frames, stored alongside your assets.
6
6
7
7
:::info
8
-
Databend's full-text search functions are inspired by [Elasticsearch Full-Text Search Functions](https://www.elastic.co/guide/en/elasticsearch/reference/current/sql-functions-search.html).
8
+
Databend's search functions are inspired by [Elasticsearch Full-Text Search Functions](https://www.elastic.co/guide/en/elasticsearch/reference/current/sql-functions-search.html).
9
9
:::
10
10
11
+
Include an inverted index in the table definition for the columns you plan to search:
12
+
13
+
```sql
14
+
CREATE OR REPLACE TABLE frames (
15
+
id INT,
16
+
meta VARIANT,
17
+
INVERTED INDEX idx_meta (meta)
18
+
);
19
+
```
20
+
11
21
## Search Functions
12
22
13
23
| Function | Description | Example |
14
-
|----------|-------------|--------|
15
-
|[MATCH](match)|Searches for documents containing specified keywords in selected columns |`MATCH('title, body', 'technology')`|
16
-
|[QUERY](query)|Searches for documents satisfying a specified query expression with advanced syntax |`QUERY('title:technology AND society')`|
17
-
|[SCORE](score)| Returns the relevance score of search results when used with MATCH or QUERY |`SELECT title, SCORE() FROM articles WHERE MATCH('title', 'technology')`|
24
+
|----------|-------------|---------|
25
+
|[MATCH](match)|Performs a relevance-ranked search across the listed columns.|`MATCH('summary, tags', 'traffic light red')`|
26
+
|[QUERY](query)|Evaluates a Lucene-style query expression, including nested `VARIANT` fields. |`QUERY('meta.signals.traffic_light:red')`|
27
+
|[SCORE](score)| Returns the relevance score for the current row when used with `MATCH` or `QUERY`.|`SELECT summary, SCORE() FROM frame_notes WHERE MATCH('summary, tags', 'traffic light red')`|
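
As a hedged sketch of how `SCORE()` pairs with `MATCH`, the query below ranks matching rows by relevance. It assumes the `frame_notes` table created in the `match.md` part of this change. The `^2` boost on `summary` is an illustrative value that follows the column-weighting syntax used in the example this change removes.

```sql
-- Assumes frame_notes(summary, tags) is covered by an inverted index.
-- summary^2 gives hits in summary a higher weight than hits in tags (illustrative value).
SELECT id, summary, SCORE()
FROM frame_notes
WHERE MATCH('summary^2, tags', 'pedestrian')
ORDER BY SCORE() DESC;
```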
18
28
19
-
## Usage Examples
29
+
## Query Syntax Examples
20
30
21
-
### Basic Text Search
31
+
### Example: Single Keyword
22
32
23
33
```sql
24
-
-- Search for documents with 'technology' in title or body columns
25
-
SELECT * FROM articles
26
-
WHERE MATCH('title, body', 'technology');
34
+
SELECT id, meta['frame']['timestamp'] AS ts
35
+
FROM frames
36
+
WHERE QUERY('meta.detections.label:pedestrian')
37
+
LIMIT 100;
27
38
```
28
39
29
-
### Advanced Query Expressions
40
+
### Example: Boolean AND
30
41
31
42
```sql
32
-
-- Search for documents with 'technology' in title and 'impact' in body
33
-
SELECT * FROM articles
34
-
WHERE QUERY('title:technology AND body:impact');
43
+
SELECT id, meta['frame']['timestamp'] AS ts
44
+
FROM frames
45
+
WHERE QUERY('meta.signals.traffic_light:red AND meta.vehicle.lane:center')
46
+
LIMIT 100;
35
47
```
36
48
37
-
### Relevance Scoring
49
+
### Example: Boolean OR
38
50
39
51
```sql
40
-
-- Search with relevance scoring and sorting by relevance
41
-
SELECT title, body, SCORE()
42
-
FROM articles
43
-
WHERE MATCH('title^2, body', 'technology')
44
-
ORDER BY SCORE() DESC;
52
+
SELECT id, meta['frame']['timestamp'] AS ts
53
+
FROM frames
54
+
WHERE QUERY('meta.signals.traffic_light:red OR meta.detections.label:bike')
55
+
LIMIT 100;
45
56
```
46
57
47
-
Before using these functions, you need to create an inverted index on the columns you want to search:
58
+
### Example: IN List
48
59
49
60
```sql
50
-
CREATE INVERTED INDEX idx ON articles(title, body);
51
-
```
61
+
SELECT id, meta['frame']['timestamp'] AS ts
62
+
FROM frames
63
+
WHERE QUERY('meta.tags:IN [stop urban]')
64
+
LIMIT 100;
65
+
```
66
+
67
+
### Example: Inclusive Range
68
+
69
+
```sql
70
+
SELECT id, meta['frame']['timestamp'] AS ts
71
+
FROM frames
72
+
WHERE QUERY('meta.vehicle.speed_kmh:[0 TO 10]')
73
+
LIMIT 100;
74
+
```
75
+
76
+
### Example: Exclusive Range
77
+
78
+
```sql
79
+
SELECT id, meta['frame']['timestamp'] AS ts
80
+
FROM frames
81
+
WHERE QUERY('meta.vehicle.speed_kmh:{0 TO 10}')
82
+
LIMIT 100;
83
+
```
84
+
85
+
### Example: Boosted Fields
86
+
87
+
```sql
88
+
SELECT id, meta['frame']['timestamp'] AS ts, SCORE()
89
+
FROM frames
90
+
WHERE QUERY('meta.signals.traffic_light:red^1.0 AND meta.tags:urban^2.0')
docs/en/sql-reference/20-sql-functions/10-search-functions/match.md
+53 -74 (53 additions & 74 deletions)
@@ -5,7 +5,7 @@ import FunctionDescription from '@site/src/components/FunctionDescription';
5
5
6
6
<FunctionDescription description="Introduced or updated: v1.2.619"/>
7
7
8
-
Searches for documents containing specified keywords. Please note that the MATCH function can only be used in a WHERE clause.
8
+
`MATCH` searches for rows that contain the supplied keywords within the listed columns. The function can only appear in a `WHERE` clause.
9
9
10
10
:::info
11
11
Databend's MATCH function is inspired by Elasticsearch's [MATCH](https://www.elastic.co/guide/en/elasticsearch/reference/current/sql-functions-search.html#sql-functions-search-match).
@@ -14,83 +14,62 @@ Databend's MATCH function is inspired by Elasticsearch's [MATCH](https://www.ela
|`<columns>`| A comma-separated list of column names in the table to search for the specified keywords, with optional weighting using the syntax (^), which allows assigning different weights to each column, influencing the importance of each column in the search. |
23
-
|`<keywords>`| The keywords to match against the specified columns in the table. This parameter can also be used for suffix matching, where the search term followed by an asterisk (*) can match any number of characters or words. |
24
-
|`<options>`| A set of configuration options, separated by semicolons `;`, that customize the search behavior. See the table below for details. |
20
+
- `<columns>`: A comma-separated list of columns to search. Append `^<boost>` to weight a column higher than the others.
21
+
- `<keywords>`: The terms to search for. Append `*` for suffix matching, for example `rust*`.
22
+
- `<options>`: An optional semicolon-separated list of `key=value` pairs that fine-tune the search.
| fuzziness | Allows matching terms within a specified Levenshtein distance. `fuzziness` can be set to 1 or 2. | SELECT id, score(), content FROM t WHERE match(content, 'box', 'fuzziness=1'); | When matching the query term "box", `fuzziness=1` allows matching terms like "fox", since "box" and "fox" have a Levenshtein distance of 1. |
29
-
| operator | Specifies how multiple query terms are combined. Can be set to OR (default) or AND. OR returns results containing any of the query terms, while AND returns results containing all query terms. | SELECT id, score(), content FROM t WHERE match(content, 'action works', 'fuzziness=1;operator=AND'); | With `operator=AND`, the query requires both "action" and "works" to be present in the results. Due to `fuzziness=1`, it matches terms like "Actions" and "words", so "Actions speak louder than words" is returned. |
30
-
| lenient | Controls whether errors are reported when the query text is invalid. Defaults to `false`. If set to `true`, no error is reported, and an empty result set is returned if the query text is invalid. | SELECT id, score(), content FROM t WHERE match(content, '()', 'lenient=true'); | If the query text `()` is invalid, setting `lenient=true` prevents an error from being thrown and returns an empty result set instead. |
24
+
## Options
25
+
26
+
| Option | Values | Description | Example |
27
+
|--------|--------|-------------|---------|
28
+
|`fuzziness`|`1` or `2`| Matches keywords within the specified Levenshtein distance. |`MATCH('summary, tags', 'pedestrain', 'fuzziness=1')` matches rows that contain the correctly spelled `pedestrian`. |
29
+
|`operator`|`OR` (default) or `AND`| Controls how multiple keywords are combined when no boolean operator is specified. |`MATCH('summary, tags', 'traffic light red', 'operator=AND')` requires all three keywords. |
30
+
|`lenient`|`true` or `false` (default)| When `true`, suppresses parsing errors and returns an empty result set. |`MATCH('summary, tags', '()', 'lenient=true')` returns no rows instead of an error. |
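
Several options can be combined in one options string, separated by semicolons. The following is a minimal sketch, assuming the `frame_notes` table from the Examples section of this page. The misspelling `pedestrien` is deliberate and hypothetical: it is one substitution away from `pedestrian`, so it is intended to exercise `fuzziness=1`.

```sql
-- Assumes frame_notes(summary, tags) is covered by an inverted index.
-- 'pedestrien' is an intentional typo, one edit away from 'pedestrian'.
-- operator=AND requires every keyword to match, each within the fuzziness limit.
SELECT id, summary, SCORE()
FROM frame_notes
WHERE MATCH('summary, tags', 'pedestrien crosswalk', 'fuzziness=1;operator=AND');
```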
31
31
32
32
## Examples
33
33
34
+
In many AI pipelines you may capture structured metadata in a `VARIANT` column while also materializing human-readable summaries for search. The following example stores dashcam frame summaries and tags that were extracted from the JSON payload.
35
+
36
+
### Example: Build Searchable Summaries
37
+
38
+
```sql
39
+
CREATE OR REPLACE TABLE frame_notes (
40
+
id INT,
41
+
camera STRING,
42
+
summary STRING,
43
+
tags STRING,
44
+
INVERTED INDEX idx_notes (summary, tags)
45
+
);
46
+
47
+
INSERT INTO frame_notes VALUES
48
+
(1, 'dashcam_front',
49
+
'Green light at Market & 5th with pedestrian entering the crosswalk',
50
+
'downtown commute green-light pedestrian'),
51
+
(2, 'dashcam_front',
52
+
'Vehicle stopped at Mission & 6th red traffic light with cyclist ahead',
53
+
'stop urban red-light cyclist'),
54
+
(3, 'dashcam_front',
55
+
'School zone caution sign in SOMA with pedestrian waiting near crosswalk',
56
+
'school-zone caution pedestrian');
57
+
```
58
+
59
+
### Example: Boolean AND
60
+
61
+
```sql
62
+
SELECT id, summary
63
+
FROM frame_notes
64
+
WHERE MATCH('summary, tags', 'traffic light red', 'operator=AND');
65
+
-- Returns id 2
66
+
```
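
For contrast, here is a sketch of the same search without the `operator` option. Because `operator` defaults to `OR`, a row should qualify when it contains any one of the keywords, so this form is expected to be less selective than the `AND` version above. The query reuses the `frame_notes` table from this page and makes no claim about the exact rows returned.

```sql
-- Default operator is OR: a row matches if it contains any of the keywords.
SELECT id, summary
FROM frame_notes
WHERE MATCH('summary, tags', 'traffic light red');
```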
67
+
68
+
### Example: Fuzzy Matching
69
+
34
70
```sql
35
-
CREATE TABLE test(title STRING, body STRING);
36
-
37
-
CREATE INVERTED INDEX idx ON test(title, body);
38
-
39
-
INSERT INTO test VALUES
40
-
('The Importance of Reading', 'Reading is a crucial skill that opens up a world of knowledge and imagination.'),
41
-
('The Benefits of Exercise', 'Exercise is essential for maintaining a healthy lifestyle.'),
42
-
('The Power of Perseverance', 'Perseverance is the key to overcoming obstacles and achieving success.'),
43
-
('The Art of Communication', 'Effective communication is crucial in everyday life.'),
44
-
('The Impact of Technology on Society', 'Technology has revolutionized our society in countless ways.');
45
-
46
-
-- Retrieve documents where the 'title' column matches 'art power'
47
-
SELECT * FROM test WHERE MATCH('title', 'art power');