Commit b02df6d

[Vectorize] Mention index name reuse + high-precision scoring trigger
1 parent 539d913 commit b02df6d

3 files changed

+17
-6
lines changed

src/content/docs/vectorize/best-practices/create-indexes.mdx

Lines changed: 4 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -15,6 +15,8 @@ Creating an index requires three inputs:
1515
- The (fixed) [dimension size](#dimensions) of each vector, for example 384 or 1536.
1616
- The (fixed) [distance metric](#distance-metrics) to use for calculating vector similarity.
1717

18+
An index cannot be created using the same name as an index previously deleted on your account.
19+
1820
The configuration of an index cannot be changed after creation.
1921

2022
## Create an index
@@ -103,4 +105,6 @@ Distance metrics are functions that determine how close vectors are from each ot
103105

104106
Determining the similarity between vectors can be subjective, depending on how the machine-learning model represents features in the resulting vector embeddings. For example, a score of `0.8511` when using a `cosine` metric means that two vectors are close in distance, but whether the data they represent is _similar_ is a function of how well the model is able to represent the original content.
105107

108+
When querying vectors, you can tell Vectorize to use either high-precision scoring, which increases the precision of the match scores as well as the accuracy of the query results, or approximate scoring for faster response times. With approximate scoring, returned scores are an approximation of the real distance or similarity between your query and the returned vector. Refer to [Control over scoring precision and query accuracy](/vectorize/best-practices/query-vectors/#control-over-scoring-precision-and-query-accuracy).
109+
106110
Distance metrics cannot be changed after index creation, and each metric has a different scoring function.

src/content/docs/vectorize/best-practices/insert-vectors.mdx

Lines changed: 4 additions & 4 deletions
@@ -13,7 +13,7 @@ Vectorize indexes allow you to insert vectors at any point: Vectorize will optim
1313

1414
If the same vector id is _inserted_ twice in a Vectorize index, the index would reflect the vector that was added first.
1515

16-
If the same vector id is _upserted_ twice in a Vectorize index, the index would reflect the vector that was added second.
16+
If the same vector id is _upserted_ twice in a Vectorize index, the index would reflect the vector that was added last.
1717

1818
Use the upsert operation if you want to overwrite the vector value for a vector id that already exists in an index.
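
The difference between the two operations can be sketched with a small in-memory model. This is only an illustration of the semantics described above, not the Vectorize client: `ToyIndex` and `ToyVector` are hypothetical names, and the model assumes that a repeated _insert_ of an existing id leaves the stored vector untouched.

```typescript
// In-memory model of the documented insert/upsert semantics.
// NOT the Vectorize client: ToyIndex and ToyVector are hypothetical names.
type ToyVector = { id: string; values: number[] };

class ToyIndex {
  private store = new Map<string, ToyVector>();

  // insert: if the id already exists, the vector added first is kept.
  insert(vectors: ToyVector[]): void {
    for (const v of vectors) {
      if (!this.store.has(v.id)) this.store.set(v.id, v);
    }
  }

  // upsert: the vector written last overwrites any earlier one with the same id.
  upsert(vectors: ToyVector[]): void {
    for (const v of vectors) this.store.set(v.id, v);
  }

  get(id: string): ToyVector | undefined {
    return this.store.get(id);
  }
}

const toy = new ToyIndex();

toy.insert([{ id: "a", values: [1, 0] }]);
toy.insert([{ id: "a", values: [0, 1] }]); // no effect: id "a" already exists
const afterInsert = toy.get("a")?.values; // still the first vector

toy.upsert([{ id: "a", values: [0, 1] }]);
toy.upsert([{ id: "a", values: [0.5, 0.5] }]); // overwrites the previous upsert
const afterUpsert = toy.get("a")?.values; // the last vector
```
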
1919

@@ -38,8 +38,8 @@ Metadata keys cannot be empty, contain the dot character (`.`), contain the doub
3838
Metadata can be used to:
3939

4040
- Include the object storage key, database UUID or other identifier to look up the content the vector embedding represents.
41-
- The raw content (up to the [metadata limits](/vectorize/platform/limits/)), which can allow you to skip additional lookups for smaller content.
42-
- Dates, timestamps, or other metadata that describes when the vector embedding was generated or how it was generated.
41+
- Store JSON data (up to the [metadata limits](/vectorize/platform/limits/)), which can allow you to skip additional lookups for smaller content.
42+
- Keep track of dates, timestamps, or other metadata that describes when the vector embedding was generated or how it was generated.
4343

4444
For example, a vector embedding representing an image could include the path to the [R2 object](/r2/) it was generated from, the format, and a category lookup:
4545

@@ -55,7 +55,7 @@ To associate vectors with a namespace, you can optionally provide a `namespace:
5555

5656
A namespace can be up to 64 characters (bytes) in length and you can have up to 1,000 namespaces per index. Refer to the [Limits](/vectorize/platform/limits/) documentation for more details.
5757

58-
When a namespace is specified in a query operation, only vectors within that namespace are used for the search. Namespace filtering is applied before vector search, not after.
58+
When a namespace is specified in a query operation, only vectors within that namespace are used for the search. Namespace filtering is applied before vector search, increasing the precision of the matched results.
5959

6060
To insert vectors with a namespace:
6161

src/content/docs/vectorize/best-practices/query-vectors.mdx

Lines changed: 9 additions & 2 deletions
@@ -17,7 +17,7 @@ A query vector is either an array of JavaScript numbers, 32-bit floating point o
1717

1818
```ts
1919
// query vector dimensions must match the Vectorize index dimension being queried
20-
let queryVector = [54.8, 5.5, 3.1, ...];
20+
let queryVector = [54.8, 5.5, 3.1, ...];
2121
let matches = await env.YOUR_INDEX.query(queryVector);
2222
```
2323

@@ -42,7 +42,7 @@ You can optionally change the number of results returned and/or whether results
4242

4343
```ts
4444
// query vector dimensions must match the Vectorize index dimension being queried
45-
let queryVector = [54.8, 5.5, 3.1, ...];
45+
let queryVector = [54.8, 5.5, 3.1, ...];
4646
// topK defaults to 5; returnValues defaults to false; returnMetadata defaults to "none"
4747
let matches = await env.YOUR_INDEX.query(queryVector, {
4848
topK: 1,
@@ -71,6 +71,13 @@ This would return a set of matches resembling the following, based on the distan
7171

7272
Refer to [Vectorize API](/vectorize/reference/client-api/) for additional examples.
7373

74+
## Control over scoring precision and query accuracy
75+
76+
When querying vectors, you can choose either high-precision scoring, which increases the precision of the match scores as well as the accuracy of the query results, or approximate scoring for faster response times.
77+
Using approximate scoring, returned scores will be an approximation of the real distance/similarity between your query and the returned vector.
78+
79+
High-precision scoring is enabled by setting `returnValues: true` on your query. This tells Vectorize to fetch and use the original vector values for your matches, which lets it compute exact scores and increases the accuracy of the results.
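
As a minimal sketch of the trigger described above, the options passed to `query()` could be built as follows. `buildQueryOptions` and `QueryOptions` are hypothetical names introduced for illustration; only `topK` and `returnValues`, and their defaults of 5 and `false`, come from this page.

```typescript
// Hypothetical helper that builds the options object for env.YOUR_INDEX.query().
// Only topK and returnValues are modelled here.
type QueryOptions = { topK: number; returnValues: boolean };

function buildQueryOptions(highPrecision: boolean, topK = 5): QueryOptions {
  // returnValues: true asks Vectorize to fetch the original vector values,
  // which, per the text above, enables high-precision scoring.
  return { topK, returnValues: highPrecision };
}

const approximate = buildQueryOptions(false); // faster, approximate scores
const precise = buildQueryOptions(true, 3); // exact scores for the top 3 matches

// Inside a Worker with a Vectorize binding, usage would look like:
// const matches = await env.YOUR_INDEX.query(queryVector, precise);
```

The trade-off is that high-precision queries also return the vector values, so they are best reserved for cases where exact scores matter more than response time.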
80+
7481
## Workers AI
7582

7683
If you are generating embeddings from a [Workers AI](/workers-ai/models/#text-embeddings) text embedding model, the response type from `env.AI.run()` is an object that includes both the `shape` of the response vector - e.g. `[1,768]` - and the vector `data` as an array of vectors:
