[Vectorize] Mention index name reuse + high-precision scoring trigger

netgusto · netgusto · commit b02df6dc0d85 · 2024-09-27T12:22:39.000+02:00
diff --git a/src/content/docs/vectorize/best-practices/create-indexes.mdx b/src/content/docs/vectorize/best-practices/create-indexes.mdx
@@ -15,6 +15,8 @@ Creating an index requires three inputs:
 - The (fixed) [dimension size](#dimensions) of each vector, for example 384 or 1536.
 - The (fixed) [distance metric](#distance-metrics) to use for calculating vector similarity.
 
+An index cannot be created using the same name as an index previously deleted on your account.
+
 The configuration of an index cannot be changed after creation.
 
 ## Create an index
@@ -103,4 +105,6 @@ Distance metrics are functions that determine how close vectors are from each ot
 
 Determining the similarity between vectors can be subjective based on how the machine-learning model that represents features in the resulting vector embeddings. For example, a score of `0.8511` when using a `cosine` metric means that two vectors are close in distance, but whether data they represent is _similar_ is a function of how well the model is able to represent the original content.
 
+When querying vectors, you can tell Vectorize to use either high-precision scoring, which increases the precision of the match scores and the accuracy of the query results, or approximate scoring, which gives faster response times. With approximate scoring, the returned scores are an approximation of the real distance/similarity between your query and the returned vector. Refer to [Control over scoring precision and query accuracy](/vectorize/best-practices/query-vectors/#control-over-scoring-precision-and-query-accuracy).
+
 Distance metrics cannot be changed after index creation, and that each metric has a different scoring function.
diff --git a/src/content/docs/vectorize/best-practices/insert-vectors.mdx b/src/content/docs/vectorize/best-practices/insert-vectors.mdx
@@ -13,7 +13,7 @@ Vectorize indexes allow you to insert vectors at any point: Vectorize will optim
 
 If the same vector id is _inserted_ twice in a Vectorize index, the index would reflect the vector that was added first.
 
-If the same vector id is _upserted_ twice in a Vectorize index, the index would reflect the vector that was added second.
+If the same vector id is _upserted_ twice in a Vectorize index, the index would reflect the vector that was added last.
 
 Use the upsert operation if you want to overwrite the vector value for a vector id that already exists in an index.
 
@@ -38,8 +38,8 @@ Metadata keys cannot be empty, contain the dot character (`.`), contain the doub
 Metadata can be used to:
 
 - Include the object storage key, database UUID or other identifier to look up the content the vector embedding represents.
-- The raw content (up to the [metadata limits](/vectorize/platform/limits/)), which can allow you to skip additional lookups for smaller content.
-- Dates, timestamps, or other metadata that describes when the vector embedding was generated or how it was generated.
+- Store JSON data (up to the [metadata limits](/vectorize/platform/limits/)), which can allow you to skip additional lookups for smaller content.
+- Keep track of dates, timestamps, or other metadata that describes when the vector embedding was generated or how it was generated.
 
 For example, a vector embedding representing an image could include the path to the [R2 object](/r2/) it was generated from, the format, and a category lookup:
 
@@ -55,7 +55,7 @@ To associate vectors with a namespace, you can optionally provide a `namespace:
 
 A namespace can be up to 64 characters (bytes) in length and you can have up to 1,000 namespaces per index. Refer to the [Limits](/vectorize/platform/limits/) documentation for more details.
 
-When a namespace is specified in a query operation, only vectors within that namespace are used for the search. Namespace filtering is applied before vector search, not after.
+When a namespace is specified in a query operation, only vectors within that namespace are used for the search. Namespace filtering is applied before vector search, increasing the precision of the matched results.
 
 To insert vectors with a namespace:
 
diff --git a/src/content/docs/vectorize/best-practices/query-vectors.mdx b/src/content/docs/vectorize/best-practices/query-vectors.mdx
@@ -17,7 +17,7 @@ A query vector is either an array of JavaScript numbers, 32-bit floating point o
 
 ```ts
 // query vector dimensions must match the Vectorize index dimension being queried
-let queryVector = [54.8, 5.5, 3.1, ...]; 
+let queryVector = [54.8, 5.5, 3.1, ...];
 let matches = await env.YOUR_INDEX.query(queryVector);
 ```
 
@@ -42,7 +42,7 @@ You can optionally change the number of results returned and/or whether results
 
 ```ts
 // query vector dimensions must match the Vectorize index dimension being queried
-let queryVector = [54.8, 5.5, 3.1, ...]; 
+let queryVector = [54.8, 5.5, 3.1, ...];
 // topK defaults to 5; returnValues defaults to false; returnMetadata defaults to "none"
 let matches = await env.YOUR_INDEX.query(queryVector, {
 	topK: 1,
@@ -71,6 +71,13 @@ This would return a set of matches resembling the following, based on the distan
 
 Refer to [Vectorize API](/vectorize/reference/client-api/) for additional examples.
 
+## Control over scoring precision and query accuracy
+
+When querying vectors, you can choose either high-precision scoring, which increases the precision of the match scores and the accuracy of the query results, or approximate scoring, which gives faster response times.
+With approximate scoring, the returned scores are an approximation of the real distance/similarity between your query and the returned vector.
+
+High-precision scoring is enabled by setting `returnValues: true` on your query. This tells Vectorize to fetch and use the original vector values for your matches, which allows it to compute exact match scores and increases the accuracy of the results.
+
 ## Workers AI
 
 If you are generating embeddings from a [Workers AI](/workers-ai/models/#text-embeddings) text embedding model, the response type from `env.AI.run()` is an object that includes both the `shape` of the response vector - e.g. `[1,768]` - and the vector `data` as an array of vectors: