From 226c4c30dcf38b166450e96557d0551d736355aa Mon Sep 17 00:00:00 2001
From: Garrett Gu
Date: Mon, 16 Dec 2024 10:39:41 -0600
Subject: [PATCH] Add documentation for Vectorize range queries

---
 src/content/changelogs/vectorize.yaml         |  5 +++
 .../reference/metadata-filtering.mdx          | 37 +++++++++++++++----
 2 files changed, 34 insertions(+), 8 deletions(-)

diff --git a/src/content/changelogs/vectorize.yaml b/src/content/changelogs/vectorize.yaml
index 7aeb1ba5472cca..5cbd442f761421 100644
--- a/src/content/changelogs/vectorize.yaml
+++ b/src/content/changelogs/vectorize.yaml
@@ -5,6 +5,11 @@ productLink: "/vectorize/"
 productArea: Developer platform
 productAreaLink: /workers/platform/changelog/platform/
 entries:
+  - publish_date: "2024-12-19"
+    title: Added support for range queries in metadata filters
+    description: |-
+      Vectorize now supports `$lt`, `$lte`, `$gt`, and `$gte` clauses in [metadata filters](/vectorize/reference/metadata-filtering/).
+
   - publish_date: "2024-11-13"
     title: Added support for `$in` and `$nin` metadata filters
     description: |-
diff --git a/src/content/docs/vectorize/reference/metadata-filtering.mdx b/src/content/docs/vectorize/reference/metadata-filtering.mdx
index 957f0535681336..e7f17a823951db 100644
--- a/src/content/docs/vectorize/reference/metadata-filtering.mdx
+++ b/src/content/docs/vectorize/reference/metadata-filtering.mdx
@@ -31,19 +31,26 @@ Vectors upserted before a metadata index was created won't have their metadata c
 
 ## Supported operations
 
-Optional `filter` property on `query()` method specifies metadata filter:
-
-| Operator | Description |
-| -------- | ----------- |
-| `$eq`    | Equals      |
-| `$ne`    | Not equals  |
-| `$in`    | In          |
-| `$nin`   | Not in      |
+An optional `filter` property on `query()` method specifies metadata filters:
+
+| Operator | Description              |
+| -------- | ------------------------ |
+| `$eq`    | Equals                   |
+| `$ne`    | Not equals               |
+| `$in`    | In                       |
+| `$nin`   | Not in                   |
+| `$lt`    | Less than                |
+| `$lte`   | Less than or equal to    |
+| `$gt`    | Greater than             |
+| `$gte`   | Greater than or equal to |
 
 - `filter` must be non-empty object whose compact JSON representation must be less than 2048 bytes.
 - `filter` object keys cannot be empty, contain `" | .` (dot is reserved for nesting), start with `$`, or be longer than 512 characters.
 - For `$eq` and `$ne`, `filter` object non-nested values can be `string`, `number`, `boolean`, or `null` values.
 - For `$in` and `$nin`, `filter` object values can be arrays of `string`, `number`, `boolean`, or `null` values.
+- Upper-bound range queries (i.e. `$lt` and `$lte`) can be combined with lower-bound range queries (i.e. `$gt` and `$gte`) within the same filter. Other combinations are not allowed.
+- For range queries (i.e. `$lt`, `$lte`, `$gt`, `$gte`), `filter` object non-nested values can be `string` or `number` values. Strings are ordered lexicographically.
+- Range queries involving a large number of vectors (~10M and above) may experience reduced accuracy.
 
 ### Namespace versus metadata filtering
 
@@ -78,6 +85,20 @@ Both [namespaces](/vectorize/best-practices/insert-vectors/#namespaces) and meta
 { "someKey": { "$nin": ["hbo", "netflix"] } }
 ```
 
+#### Range query involving numbers
+
+```json
+{ "timestamp": { "$gte": 1734242400, "$lt": 1734328800 } }
+```
+
+#### Range query involving strings
+Range queries can be used to implement prefix searching on string metadata fields.
+For example, the following filter matches all values starting with "net":
+
+```json
+{ "someKey": { "$gte": "net", "$lt": "neu" } }
+```
+
 #### Implicit logical `AND` with multiple keys
 
 ```json