diff --git a/content/commands/ft.aggregate.md b/content/commands/ft.aggregate.md
index 47aa2b4d28..ab8b8e83e0 100644
--- a/content/commands/ft.aggregate.md
+++ b/content/commands/ft.aggregate.md
@@ -489,6 +489,28 @@ Next, count GitHub events by user (actor), to produce the most active users.
+
+Use the case function for conditional logic
+{{< highlight bash >}}
+//Simple mapping
+FT.AGGREGATE products "*"
+APPLY case(@price > 100, "premium", "standard") AS category
+
+//Nested conditions where an error should be returned
+FT.AGGREGATE orders "*"
+APPLY case(@status == "pending",
+ case(@priority == "high", 1, 2),
+ case(@status == "completed", 3, 4)) AS status_code
+
+//Mapped approach
+FT.AGGREGATE orders "*"
+APPLY case(@status == "pending", 1, 0) AS is_pending
+APPLY case(@is_pending == 1 && @priority == "high", 1, 2) AS status_high
+APPLY case(@is_pending == 0 && @priority == "high", 3, 4) AS status_completed
+{{< / highlight >}}
+
+
+
 ## See also
 
 [`FT.CONFIG SET`]({{< relref "commands/ft.config-set/" >}}) | [`FT.SEARCH`]({{< relref "commands/ft.search/" >}})
diff --git a/content/develop/ai/search-and-query/advanced-concepts/aggregations.md b/content/develop/ai/search-and-query/advanced-concepts/aggregations.md
index 3813995e7a..d0a2e3f1b5 100644
--- a/content/develop/ai/search-and-query/advanced-concepts/aggregations.md
+++ b/content/develop/ai/search-and-query/advanced-concepts/aggregations.md
@@ -379,6 +379,7 @@ Note that these operators apply only to numeric values and numeric sub-expressio
 | Function | Description | Example |
 | -------- | ------------------------------------------------------------ | ------------------ |
 | exists(s)| Checks whether a field exists in a document. | `exists(@field)` |
+| case(condition, if_true, if_false) | If condition is non-zero, return if_true, otherwise return if_false. | `case(exists(@foo), @foo, "no foo")` |
 
 ### List of numeric APPLY functions
 
diff --git a/content/develop/ai/search-and-query/vectors.md b/content/develop/ai/search-and-query/vectors.md
index 27bd2a978b..b72153b61d 100644
--- a/content/develop/ai/search-and-query/vectors.md
+++ b/content/develop/ai/search-and-query/vectors.md
@@ -152,18 +152,39 @@ Choose the `SVS-VAMANA` index type when all of the following requirements apply:
 
 | Attribute | Description | Default value |
 |:---------------------------|:-----------------------------------------|:-------------:|
-| `COMPRESSION` | Compression algorithm (`LVQ8`, `LVQ4`, `LVQ4x4`, `LVQ4x8`, `LeanVec4x8`, or `LeanVec8x8`). Vectors will be compressed during indexing. See these Intel pages for best practices on using these algorithms: [`COMPRESSION` settings](https://intel.github.io/ScalableVectorSearch/howtos.html#compression-setting) and [`LeanVec`](https://intel.github.io/ScalableVectorSearch/python/experimental/leanvec.html). | None |
+| `COMPRESSION` | Compression algorithm; one of `LVQ8`, `LVQ4`, `LVQ4x4`, `LVQ4x8`, `LeanVec4x8`, or `LeanVec8x8`. Vectors will be compressed during indexing. See below for descriptions of each algorithm. Also, see these Intel pages for best practices on using these algorithms: [`COMPRESSION` settings](https://intel.github.io/ScalableVectorSearch/howtos.html#compression-setting) and [`LeanVec`](https://intel.github.io/ScalableVectorSearch/python/experimental/leanvec.html). | `LVQ4x4` |
 | `CONSTRUCTION_WINDOW_SIZE` | The search window size to use during graph construction. A higher search window size will yield a higher quality graph since more overall vertexes are considered, but will increase construction time. | 200 |
-| `GRAPH_MAX_DEGREE` | The maximum node degree in the graph. A higher max degree may yield a higher quality graph in terms of recall for performance, but the memory footprint of the graph is directly proportional to the maximum degree. | 32 |
-| `SEARCH_WINDOW_SIZE` | The size of the search window. Increasing the search window size and capacity generally yields more accurate but slower search results. | 10 |
-| `EPSILON` | The range search approximation factor. | 0.01 |
-| `TRAINING_THRESHOLD` | The number of vectors after which training is triggered. Applicable only when used with `COMPRESSION`. If a value is provided, it be less than `100 * DEFAULT_BLOCK_SIZE`, where `DEFAULT_BLOCK_SIZE` is 1024. | `10 * DEFAULT_BLOCK_SIZE` |
-| `LEANVEC_DIM` | The dimension used when using `LeanVec4x8` or `LeanVec8x8` compression for dimensionality reduction. If a value is provided, it should be less than `DIM`. | `DIM / 2` |
+| `GRAPH_MAX_DEGREE` | Sets the maximum number of edges per node; equivalent to HNSW's `M * 2`. A higher max degree may yield a higher quality graph in terms of recall for performance, but the memory footprint of the graph is directly proportional to the maximum degree. | 32 |
+| `SEARCH_WINDOW_SIZE` | The size of the search window; the same as HNSW's `EF_RUNTIME`. Increasing the search window size and capacity generally yields more accurate but slower search results. | 10 |
+| `EPSILON` | The range search approximation factor; the same as HNSW's `EPSILON`. | 0.01 |
+| `TRAINING_THRESHOLD` | Number of vectors needed to learn compression parameters. Applicable only when used with `COMPRESSION`. Increase if recall is low. Note: setting this too high may slow down search. If a value is provided, it must be less than `100 * DEFAULT_BLOCK_SIZE`, where `DEFAULT_BLOCK_SIZE` is 1024. | `10 * DEFAULT_BLOCK_SIZE` |
+| `LEANVEC_DIM` | The dimension used when using `LeanVec4x8` or `LeanVec8x8` compression for dimensionality reduction. If a value is provided, it should be less than `DIM`. Lowering it can speed up search and reduce memory use. | `DIM / 2` |
 
 {{< warning >}}
 On non-Intel platforms, `SVS-VAMANA` with `COMPRESSION` will fall back to Intel’s basic scalar quantization implementation.
 {{< /warning >}}
 
+**SVS-VAMANA vector compression algorithms**
+
+LVQ is a scalar quantization method that applies scaling constants for each vector. LeanVec builds on this by combining query-aware dimensionality reduction with LVQ-based scalar quantization for efficient vector compression.
+
+`LVQ4x4` (the default): Fast search with 4x vector compression relative to float32-encoded vectors (8 bits per dimension) and high accuracy.
+
+`LeanVec4x8`: Recommended for high-dimensional datasets. It offers the fastest search and ingestion. It's not the default because in rare cases it may reduce recall if the data does not compress well.
+
+`LeanVec` dimension reduction: For faster search and lower memory use, reduce the dimension further by lowering `LEANVEC_DIM` (the default is `DIM / 2`; try `DIM / 4` or an even greater reduction).
+
+`LVQ8`: Faster ingestion than the default, but with slower search.
+
+| Compression algorithm | Best for |
+|-----------------------|----------|
+| `LVQ4x4` (default) | Fast search in most cases with low memory use. |
+| `LeanVec4x8` | Fastest search and ingestion. |
+| `LVQ4` | Maximum memory savings. |
+| `LVQ8` | Faster ingestion than the default. |
+| `LeanVec8x8` | Improved recall in cases where `LeanVec4x8` is not sufficient. |
+| `LVQ4x8` | Improved recall in cases where the default is not sufficient. |
+
 **Example**
 
 ```