Commit c4a58cf

Merge pull request #829 from KarthikSubbarao/1.2
Merge from main into 1.2 branch
2 parents f31b235 + 35643b2 commit c4a58cf

56 files changed: +2868 -475 lines changed

.github/benchmark_configs/fts-benchmarks-arm.json

Lines changed: 1500 additions & 0 deletions
Large diffs are not rendered by default.

docs/commands/ft._list.md

Lines changed: 5 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -12,10 +12,8 @@ Example
1212

1313
```
1414
ft._list
15-
1) b
16-
2) a
17-
3) x
18-
4) index
19-
5) aa
20-
6) bb
21-
```
15+
1) index
16+
2) products
17+
3) users
18+
4) transactions
19+
```

docs/commands/ft.aggregate.md

Lines changed: 1 addition & 8 deletions
@@ -73,18 +73,11 @@ If the MAX clause is present, then the output is trimmed after the first N recor
7373
The `GROUPBY` stage organizes the input records into buckets based on the values of the specified fields.
7474
For each unique combination of values a separate bucket is created to hold all records that have that combination of values.
7575

76-
Each bucket of records is processed into a single output record, discarding the bucket contents. That output record has two sections. The first section has one value for each of the specified groupby fields. This section provides the values that formed (named) this unique bucket.
77-
78-
The second section is the output of the reducers for that bucket. Reducers provide an efficient mechanism for reducing (summarizing) the contents of a bucket. Each reducer function processes each record of the bucket and generates a single output value which is inserted into the second section of groupby output record for this bucket.
79-
80-
The `GROUPBY` stage organizes the input records into buckets based on the values of the specified fields.
81-
For each unique combination of values a separate bucket is created to hold all records that have that combination of values.
82-
8376
Each bucket of records is processed into a single output record, discarding the bucket contents. That output record has two sections. The first section has one value for each of the specified `GROUPBY` fields. This section provides the values that formed (named) this unique bucket.
8477

8578
The second section is the output of the reducers for that bucket. Reducers provide an efficient mechanism for reducing (summarizing) the contents of a bucket. Each reducer function processes each record of the bucket and generates a single output value which is inserted into the second section of `GROUPBY` output record for this bucket.
8679

87-
The output of the `GROUPBY` stage is one record for each unique bucket.
80+
The output of the `GROUPBY`stage is one record for each unique bucket.
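The bucket-and-reduce behavior described in this hunk can be sketched in a few lines of Python. This is only an illustration of the semantics, not the module's implementation; the function `group_by`, the sample records, and the reducer names `count` and `total_price` are hypothetical.

```python
from collections import defaultdict


def group_by(records, fields, reducers):
    """Bucket records by the values of `fields`, then emit one record per bucket."""
    buckets = defaultdict(list)
    for record in records:
        # One bucket per unique combination of the groupby field values.
        key = tuple(record.get(field) for field in fields)
        buckets[key].append(record)

    output = []
    for key, bucket in buckets.items():
        # First section: the values that named this bucket.
        out = dict(zip(fields, key))
        # Second section: one value per reducer; the bucket contents are discarded.
        for name, reducer in reducers.items():
            out[name] = reducer(bucket)
        output.append(out)
    return output


records = [
    {"category": "book", "price": 10},
    {"category": "book", "price": 30},
    {"category": "toy", "price": 5},
]
result = group_by(
    records,
    ["category"],
    {"count": len, "total_price": lambda bucket: sum(r["price"] for r in bucket)},
)
```

Each reducer sees the whole bucket and returns a single value, which mirrors the "one output record per unique bucket" rule stated above.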
8881

8982
### Reducers
9083

docs/commands/ft.create.md

Lines changed: 3 additions & 3 deletions
@@ -37,13 +37,13 @@ FT.CREATE <index-name>
3737

3838
- `LANGUAGE <language>` (optional): For text fields, the language used to control lexical parsing and stemming. Currently only the value `ENGLISH` is supported.
3939

40-
- `MINSTEMSIZE <min_stem_size>` (optional): For text fields with stemming enabled. This controls the minimum length of a word before it is subjected to stemming. The default value is 4.
40+
- `MINSTEMSIZE <min_stem_size>` (optional): For text fields with stemming enabled. This controls the minimum length of a word required for it to be subjected to stemming. The default value is 4.
4141

4242
- `WITHOFFSETS | NOOFFSETS` (optional): Enables/Disables the retention of per-word offsets within a text field. Offsets are required to perform exact phrase matching and slop-based proximity matching. Thus if offsets are disabled, those query operations will be rejected with an error. The default is `WITHOFFSETS`.
4343

44-
- `NOSTOPWORDS | STOPWORDS <count> <word1> <word2>...` (optional): Stop words are words which are not put into the indexes. The default value of `STOPWORDS`is language dependent. For`LANGUAGE ENGLISH` the default is: <?>.
44+
- `NOSTOPWORDS | STOPWORDS <count> <word1> <word2>...` (optional): Stop words are words which are not put into the indexes. The default value of `STOPWORDS`is language dependent. For`LANGUAGE ENGLISH` the default is: [a, an, and, are, as, at, be, but, by, for, if, in, into, is, it, no, not, of, on, or, such, that, their, then, there, these, they, this, to, was, will, with].
4545

46-
- `PUNCTUATION <punctuation>` (optional): A string of characters that are used to define words in the text field. The default value is `,.<>{}[]"':;!@#$%^&\*()-+=~/\|`.
46+
- `PUNCTUATION <punctuation>` (optional): A string of characters that define the separation points between words, in addition to whitespace characters (spaces, tabs, newlines, carriage returns, and control characters) which always break words. The default value is `,.<>{}[]"':;!@#$%^&\*()-+=~/\|?`.
4747

4848
- `SKIPINITIALSCAN` (optional): If specified, this option skips the normal backfill operation for an index. If this option is specified, pre-existing keys which match the `PREFIX` clause will not be loaded into the index during a backfill operation. This clause has no effect on processing of key mutations _after_ an index is created, i.e., keys which are mutated after an index is created and satisfy the data type and `PREFIX` clause will be inserted into that index.
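To make the interaction of `PUNCTUATION` and `STOPWORDS` concrete, here is a minimal Python sketch of word splitting followed by stop-word removal. It is an approximation under stated assumptions, not the module's tokenizer: the function `tokenize` is hypothetical, the punctuation string is an assumed reading of the documented default, lowercasing and case-insensitive stop-word matching are assumptions, and `\s` only approximates the "whitespace and control characters" rule.

```python
import re

# Default English stop words as listed in the FT.CREATE documentation above.
STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in",
    "into", "is", "it", "no", "not", "of", "on", "or", "such", "that", "their",
    "then", "there", "these", "they", "this", "to", "was", "will", "with",
}

# Assumed punctuation set for illustration; consult FT.CREATE for the real default.
PUNCTUATION = ",.<>{}[]\"':;!@#$%^&*()-+=~/\\|?"


def tokenize(text, punctuation=PUNCTUATION, stop_words=STOP_WORDS):
    """Split on whitespace and punctuation, lowercase, and drop stop words."""
    # Whitespace always breaks words; punctuation characters add more break points.
    pattern = "[" + re.escape(punctuation) + r"\s]+"
    words = [word.lower() for word in re.split(pattern, text) if word]
    return [word for word in words if word not in stop_words]


tokens = tokenize("A quick-brown fox, and a dog!")
```

Passing `stop_words=set()` corresponds to the `NOSTOPWORDS` option in this sketch.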
4949

docs/commands/ft.info.md

Lines changed: 4 additions & 5 deletions
@@ -40,17 +40,16 @@ An array of key value pairs.
4040
- Type-specific extension (see below)
4141
- `num_docs` (integer) Total keys in the index
4242
- `num_records` (integer) Total number of fields indexed.
43-
- `num_total_terms` (integer) Total number of terms in all text fields in this index.
44-
- `num_unique_terms` (integer) Total number of unique terms in all text fields in this index.
45-
- `total_postings` (integer) Total number of postings entries in all text fields in this index.
43+
- `total_term_occurrences` (integer) Total number of terms in all text fields in this index.
44+
- `num_terms` (integer) Total number of unique terms in all text fields in this index.
4645
- `hash_indexing_failures` (integer) Count of unsuccessful indexing attempts
4746
- `backfill_in_progress` (string). "1" if a backfill is currently running. "0" if not.
4847
- `backfill_complete_percent` (string) Estimated progress of background indexing. Percentage is expressed as a fractional value from 0 to 1.0.
4948
- `mutation_queue_size` (string) Number of keys contained in the mutation queue.
5049
- `recent_mutations_queue_delay` (string) 0 if the mutation queue is empty. Otherwise it is the mutation queue occupancy of the of the last key to be ingested in seconds.
51-
- `state` (string) Backfill state. `ready` indicates not backfill is in progress. `backfill_in_progress` backfill operation proceeding normally. `backfill_paused_by_oom` backfill is paused because the Valkey instance is out of memory.
50+
- `state` (string) Current backfill state. `ready` indicates not backfill is in progress. `backfill_in_progress` backfill operation proceeding normally. `backfill_paused_by_oom` backfill is paused because the Valkey instance is out of memory.
5251
- `punctuation` (string) list of punctuation characters.
53-
- `stopwords` (array of strings) list of stopwords.
52+
- `stopwords` (array of strings) list of `stopwords`.
5453
- `with_offsets` (string) "1" if offsets are included. "0" if offsets are not included
5554
- `min_stem_size` (integer) Minimum stemming size for this field.
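Because the reply is described as an array of key value pairs, a client typically pairs the elements up before reading individual fields. The following Python sketch shows one way to do that; the helper names `info_to_dict` and `backfill_percent` and the sample reply values are hypothetical, and nested, type-specific sections are not handled.

```python
def info_to_dict(reply):
    """Pair up a flat [key, value, key, value, ...] reply into a dict."""
    if len(reply) % 2 != 0:
        raise ValueError("expected an even number of elements")
    return dict(zip(reply[0::2], reply[1::2]))


def backfill_percent(info):
    """Convert the fractional string (0 to 1.0) into a percentage."""
    return float(info["backfill_complete_percent"]) * 100.0


# Hypothetical reply fragment, for illustration only.
sample_reply = [
    "num_docs", 1200,
    "backfill_in_progress", "1",
    "backfill_complete_percent", "0.25",
    "state", "backfill_in_progress",
]
info = info_to_dict(sample_reply)
percent = backfill_percent(info)
```

Note that `backfill_complete_percent` is documented as a string holding a fraction, so it needs an explicit conversion before it can be displayed as a percentage.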
5655

docs/commands/ft.search.md

Lines changed: 3 additions & 3 deletions
@@ -22,17 +22,17 @@ FT.SEARCH <index> <query>
2222
- `ALLSHARDS` (Optional): If specified, the command is terminated with a timeout error if a valid response from all shards is not received within the timeout interval. This is the default.
2323
- `CONSISTENT` (Optional): If specified, the command is terminated with an error if the cluster is in an inconsistent state. This is the default.
2424
- `DIALECT <dialect>` (optional): Specifies your dialect. The only supported dialect is 2.
25-
- `INORDER` (optional): Indicates that proximity matching of terms must be inorder.
2625
- `INCONSISTENT` (Optional): If specified, the command will generate a best-effort reply if the cluster remains inconsistent within the timeout interval.
2726
- `LIMIT <offset> <count>` (optional): Lets you choose a portion of the result. The first `<offset>` keys are skipped and only a maximum of `<count>` keys are included. The default is LIMIT 0 10, which returns at most 10 keys.
2827
- `NOCONTENT` (optional): When present, only the resulting key names are returned, no key values are included.
2928
- `PARAMS <count> <name> <value> [<name> <value> ...]` (optional): `count` is of the number of arguments, i.e., twice the number of value/name pairs. [Search - query language](../topics/search-query.md) for details.
3029
- `RETURN <count> <field> [AS <name>] <field> [AS <name>] ...` (options): `count` is the number of fields to return. Specifies the fields you want to retrieve from your documents, along with any renaming for the returned values. By default, all fields are returned unless the `NOCONTENT` option is set, in which case no fields are returned. If num is set to 0, it behaves the same as `NOCONTENT`.
31-
- `SLOP <slop>` (Optional): Specifies a slop value for proximity matching of terms.
30+
- `INORDER` (optional): Indicates that proximity matching of text terms in the query must be in order.
31+
- `SLOP <slop>` (Optional): Specifies a slop value for proximity matching of text terms in the query.
32+
- `VERBATIM` (Optional): If specified, stemming is not applied to text terms in the query.
3233
- `SOMESHARDS` (Optional): If specified, the command will generate a best-effort reply if all shards have not responded within the timeout interval.
3334
- `SORTBY <field> [ASC | DESC]` (Optional): If present, results are sorted according the value of the specified field and the optional sort-direction instruction. By default, vector results are sorted in distance order and non-vector results are not sorted in any particular order. Sorting is applied before the `LIMIT` clause is applied.
3435
- `TIMEOUT <timeout>` (optional): Lets you set a timeout value for the search command. This must be an integer in milliseconds.
35-
- `VERBATIM` (Optional): If specified stemming is not applied to term searches.
3636
- `WITHSORTKEYS` (Optional): If `SORTBY` is specified then enabling this option augments the output with the value of the field used for sorting.
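The ordering rule for `SORTBY` and `LIMIT` (sort first, then take the window) can be illustrated with a short Python sketch. The function `sort_then_limit` and the sample documents are hypothetical; the defaults mirror the documented `LIMIT 0 10`.

```python
def sort_then_limit(results, sort_field, descending=False, offset=0, count=10):
    """Sort all results first, then skip `offset` entries and keep at most `count`."""
    ordered = sorted(results, key=lambda doc: doc[sort_field], reverse=descending)
    return ordered[offset:offset + count]


docs = [
    {"key": "doc:1", "price": 30},
    {"key": "doc:2", "price": 10},
    {"key": "doc:3", "price": 20},
]
page = sort_then_limit(docs, "price", offset=1, count=1)
```

If the window were applied before sorting, the same call would return a different document, which is why the order of the two steps matters.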
3737

3838
Response

docs/topics/search-configurables.md

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
11
---
2-
title: "Search - Configuration"
3-
description: Search Module Configurable Settings
2+
title: "Valkey Search - Configuration"
3+
description: Valkey Search Module Configurable Settings
44
---
55

66
# Configurables

docs/topics/search-data-formats.md

Lines changed: 3 additions & 6 deletions
@@ -1,7 +1,4 @@
11
---
2-
title: "Search - Data Formats"
3-
description: Search Module Formats for Input Field Data
4-
---
52
title: "Valkey Search - Data Formats"
63
description: Valkey Search Module Formats for Input Field Data
74
---
@@ -260,15 +257,15 @@ Removed stop words do not occupy a position in the token sequence. For example,
260257

261258
### Stemming
262259

263-
Stemming reduces words to their root form so that morphological variants match each other. For example, "running", "runs", and "runner" all stem to "run". The stemming algorithm is language-specific; currently only English (Snowball stemmer) is supported.
260+
Stemming reduces words to their root form so that morphological variants match each other. For example, "running", "runs", and "run" all have the same stem: "run". The stemming algorithm is language-specific; currently only English (Snowball stemmer) is supported.
264261

265262
Stemming is controlled by these options:
266263

267264
- `LANGUAGE ENGLISH` (schema-level): Specifies the stemming language. Currently only `ENGLISH` is supported.
268265
- `NOSTEM` (per-field): Disables stemming for a specific text field.
269266
- `MINSTEMSIZE <size>` (schema-level): Words shorter than this length are not stemmed. The default is 4.
270267

271-
When stemming is enabled, the original word (not the stemmed form) is stored in the index. The stem mapping is recorded separately so that a search for a stemmed root expands to match all words that share that root.
268+
When stemming is enabled, the original word (not the stemmed form) is stored in the index. The stem mapping is recorded separately so that a search on a term expands to match all terms that share the same stem, as well as the stem itself.
272269

273270
## Text Ingestion Examples
274271

@@ -309,4 +306,4 @@ Given English stemming with the default `MINSTEMSIZE` of 4 and input `"The Runni
309306
| Indexed tokens | `running` (position 0), `searches` (position 1), `cat` (position 2) |
310307
| Stem mappings | `run` -> {`running`}, `search` -> {`searches`} |
311308

312-
The word `the` is removed as a stop word. The word `cat` (3 characters) is below the minimum stem size and is not stemmed. A search for `run` will match `running` through the stem mapping.
309+
The word `the` is removed as a stop word. The word `cat` (3 characters) is below the minimum stem size and is not stemmed. A search for `run` or `runs` will match `running` through the stem mapping.
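The stem-mapping behavior in this example can be sketched in Python. This is a rough illustration only: `toy_stem` is a hypothetical suffix stripper standing in for the Snowball English stemmer, and `build_stem_map` and `expand` are hypothetical helpers. Recording a mapping only when the stem differs from the word is also an assumption.

```python
def toy_stem(word):
    """Very rough stand-in for a real stemmer: strip a few common suffixes."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stem = word[: -len(suffix)]
            # Undo consonant doubling, e.g. "runn" -> "run".
            if len(stem) >= 2 and stem[-1] == stem[-2]:
                stem = stem[:-1]
            return stem
    return word


def build_stem_map(tokens, min_stem_size=4):
    """Map each stem to the original indexed words that share it."""
    stem_map = {}
    for token in tokens:
        if len(token) < min_stem_size:
            continue  # words shorter than MINSTEMSIZE are not stemmed
        stem = toy_stem(token)
        if stem != token:
            stem_map.setdefault(stem, set()).add(token)
    return stem_map


def expand(term, stem_map, min_stem_size=4):
    """Expand a query term to the term, its stem, and all words sharing that stem."""
    stem = toy_stem(term) if len(term) >= min_stem_size else term
    return {term, stem} | stem_map.get(stem, set())


# Indexed tokens from the example above, after stop-word removal.
stem_map = build_stem_map(["running", "searches", "cat"])
```

With this map, `expand("run", stem_map)` and `expand("runs", stem_map)` both include `running`, which matches the behavior described for the stem mapping.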

docs/topics/search-expressions.md

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
11
---
2-
title: "Search Aggregation Expressions"
3-
description: Aggregation Expression Language
2+
title: "Valkey Search - Aggregation Expressions"
3+
description: Valkey Search Module Aggregation Expression Language
44
---
55

66
The `FILTER`, `APPLY`, `SORTBY` and `GROUPBY` stages of `FT.AGGREGATE` use expressions to compute values.

docs/topics/search-observables.md

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
11
---
2-
title: "Search - Info Fields"
3-
description: Search Module INFO Fields
2+
title: "Valkey Search - Info Fields"
3+
description: Valkey Search Module INFO Fields
44
---
55

66
# INFO Fields
