Commit 15b7829

Merge pull request #245099 from HeidiSteen/heidist-gh
Revising for readability
2 parents 7fe3df0 + cebe72e commit 15b7829

2 files changed

+20
-14
lines changed

articles/search/vector-search-how-to-create-index.md

Lines changed: 6 additions & 6 deletions
Original file line number | Diff line number | Diff line change
@@ -23,21 +23,21 @@ In Azure Cognitive Search, vector data is represented in fields in a [search ind
2323

2424
Most existing services support vector search. For a small subset of services created prior to January 2019, an index containing vector fields will fail on creation. In this situation, a new service must be created.
2525

26-
+ Pre-existing embeddings in your source documents. Cognitive Search doesn't generate embeddings. We recommend Azure OpenAI but you can use any model for vectorization. For more information, see [Create and use embeddings for search queries and documents](vector-search-how-to-generate-embeddings.md).
26+
+ Pre-existing vector embeddings in your source documents. Cognitive Search doesn't generate vectors. We recommend Azure OpenAI but you can use any model for vectorization. For more information, see [Create and use embeddings for search queries and documents](vector-search-how-to-generate-embeddings.md).
2727

28-
Be sure to use the same embedding model for both indexing and queries. At query time, you must include a step that converts the user's query into a vector.
28+
Be sure to use the same embedding model for both indexing and queries. At query time, you must include a step that converts the user's query string into a vector.
2929

3030
## Prepare documents for indexing
3131

32-
Prior to indexing, assemble a document payload that includes vector data. The document structure must conform to the index schema. Make sure your documents include the following elements:
32+
Prior to indexing, assemble a document payload that includes vector data. The document structure must conform to the index schema. Make sure your documents:
3333

34-
1. Provide a unique value or a metadata property that uniquely identifies each source document. All search indexes require a document key as a unique identifier, which means all documents must have one field that can be mapped to type `Edm.String` and `key=true` in the search index.
34+
1. Provide a field or a metadata property that uniquely identifies each document. All search indexes require a document key. Your documents must have one field or property that can be mapped to type `Edm.String` and `key=true` in the search index.
3535

3636
1. Provide vector data (an array of single-precision floating point numbers) in source fields.
3737

38-
Vector fields contain vector data generated by embedding models. We recommend the embedding models in [Azure OpenAI](https://aka.ms/oai/access), such as **text-embedding-ada-002** for text documents or the [Image Retrieval REST API](/rest/api/computervision/2023-02-01-preview/image-retrieval/vectorize-image) for images.
38+
Vector fields contain vector data generated by embedding models, one embedding per field. We recommend the embedding models in [Azure OpenAI](https://aka.ms/oai/access), such as **text-embedding-ada-002** for text documents or the [Image Retrieval REST API](/rest/api/computervision/2023-02-01-preview/image-retrieval/vectorize-image) for images.
3939

40-
1. Provide any other fields with alphanumeric content for any nonvector queries you want to support, as well as for hybrid query scenarios that include full text search or semantic ranking in the same request.
40+
1. Provide other fields with alphanumeric content for the search response and for hybrid query scenarios that include full text search or semantic ranking in the same request.
4141

4242
Your search index should include fields and content for all of the query scenarios you want to support. Suppose you want to search or filter over product names, versions, metadata, or addresses. In this case, similarity search isn't especially helpful. Keyword search, geo-search, or filters would be a better choice. A search index that includes a comprehensive field collection of vector and non-vector data provides maximum flexibility for query construction and response composition.
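The document requirements listed above (a string key, a vector of single-precision floats, and supporting text fields) can be sketched in Python. This is a minimal illustration, not the service's own tooling: the field names `id` and `content` and the helper names `to_float32` and `build_document` are hypothetical, and the three-element vector is a toy stand-in for a real embedding.

```python
import json
import struct


def to_float32(values):
    """Round each number to single precision by packing it as a 4-byte float."""
    return [struct.unpack("f", struct.pack("f", v))[0] for v in values]


def build_document(key, text, embedding, dimensions):
    """Assemble one document; 'id' and 'content' are hypothetical field names."""
    if not isinstance(key, str) or not key:
        raise ValueError("document key must be a non-empty string")
    if len(embedding) != dimensions:
        raise ValueError(f"expected {dimensions} dimensions, got {len(embedding)}")
    return {
        "id": key,                                # maps to the Edm.String key field
        "content": text,                          # text for the response and hybrid queries
        "contentVector": to_float32(embedding),   # one embedding per vector field
    }


doc = build_document("doc-1", "Azure Cognitive Search overview", [0.1, 0.25, -0.5], dimensions=3)
print(json.dumps(doc))
```

Checking the vector length before upload is a design choice in this sketch: it catches a mismatch between the embedding model's output and the dimensions the index field expects before the indexing call is made.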
4343

articles/search/vector-search-how-to-query.md

Lines changed: 14 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -81,28 +81,34 @@ The expected response is 202 for a successful call to the deployed model. The bo
8181

8282
## Configure a query response
8383

84-
When you're setting up the vector query, think about the response structure. You can control the shape of the response by choosing which fields are in the results and how many results are included. The search engine ranks the results. Ranking algorithms aren't generally configurable.
84+
When you're setting up the vector query, think about the response structure. The response is a flattened rowset. Parameters on the query determine which fields are in each row and how many rows are in the response. The search engine ranks the matching documents and returns the most relevant results.
8585

8686
### Fields in a response
8787

88-
Search results are composed of either all "retrievable" fields (a REST API default) or the fields explicitly listed in a "select" parameter on the query. In the examples that follow, each one includes a "select" statement that specifies text (non-vector) fields to include the response.
88+
Search results are composed of "retrievable" fields from your search index. A result is either:
8989

90-
Vectors aren't designed for readability, so avoid returning them in the response. Instead, choose non-vector fields that are representative of the search document. For example, if the query targets a "descriptionVector" field, return an equivalent text field if you have one ("description") in the response.
90+
+ All "retrievable" fields (a REST API default).
91+
+ Fields explicitly listed in a "select" parameter on the query.
92+
93+
The examples in this article include a "select" statement that specifies text (non-vector) fields to include in the response.
94+
95+
> [!NOTE]
96+
> Vectors aren't designed for readability, so avoid returning them in the response. Instead, choose non-vector fields that are representative of the search document. For example, if the query targets a "descriptionVector" field, return an equivalent text field if you have one ("description") in the response.
9197
9298
### Number of results
9399

94-
A query might match to any number of documents, up to all of them in the search index if the search criteria are weak. However, the size of the results sent back in the response is determined by the query parameters "k" and "top". Maximum results in a response are either:
100+
A query might match any number of documents, as many as all of them if the search criteria are weak (for example "search=*" for a null query). Because it's seldom practical to return unbounded results, you should specify a maximum for the response:
95101

96102
+ `"k": n` results for vector-only queries
97-
+ `"top": n` results for hybrid queries
103+
+ `"top": n` results for hybrid queries that include a "search" parameter
98104

99-
Both "k" and "top" are optional. Unspecified, the default number of results in a response is 50. You can set "top" and "skip" to [page through more results](search-pagination-page-layout.md#paging-results).
105+
Both "k" and "top" are optional. Unspecified, the default number of results in a response is 50. You can set "top" and "skip" to [page through more results](search-pagination-page-layout.md#paging-results) or to change the default.
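As a rough sketch of how "k", "top", and "select" shape a request, the following Python builds a query body for the vector-only and hybrid cases. The helper `build_query` is hypothetical, the top-level `vector` property name is an assumption about the request shape for the API version in use, and the `title` and `content` field names are placeholders.

```python
import json


def build_query(vector, fields, select, k=None, search=None, top=None):
    """Build a search request body as a dict (hypothetical helper)."""
    # Assumption: the vector query sits under a top-level "vector" property.
    body = {"vector": {"value": vector, "fields": fields}, "select": select}
    if k is not None:
        body["vector"]["k"] = k        # caps results for a vector-only query
    if search is not None:
        body["search"] = search        # adding "search" makes the query hybrid
        if top is not None:
            body["top"] = top          # caps results for a hybrid query
    return body


vector_only = build_query([0.1, 0.2], "contentVector", "title, content", k=5)
hybrid = build_query([0.1, 0.2], "contentVector", "title, content",
                     k=5, search="full text search", top=10)
print(json.dumps(hybrid, indent=2))
```

Leaving both "k" and "top" unset in this sketch simply omits them from the body, so the service default applies.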
100106

101107
### Ranking
102108

103109
Ranking of results is computed by either:
104110

105-
+ The similarity metric specified in the index `vectorConfiguration` for a vector-only query.
111+
+ The similarity metric specified in the index `vectorConfiguration` for a vector-only query. Valid values are `cosine`, `euclidean`, and `dotProduct`.
106112
+ Reciprocal Rank Fusion (RRF) if there are multiple sets of search results.
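To make the RRF option concrete, here is a minimal Python sketch of the general technique, in which each document scores the sum of `1 / (c + rank)` across the result sets it appears in. It is not the service's implementation: the constant `c=60` is a commonly used value and an assumption here, and the document IDs are made up.

```python
def reciprocal_rank_fusion(result_sets, c=60):
    """Fuse ranked lists of document IDs; c=60 is an assumed constant."""
    scores = {}
    for results in result_sets:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (c + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


vector_hits = ["a", "b", "c"]     # hypothetical ranking from a vector query
keyword_hits = ["b", "d", "a"]    # hypothetical ranking from a keyword query
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)  # ['b', 'a', 'd', 'c']
```

Documents that appear near the top of more than one result set ("a" and "b" here) rise above documents that appear in only one.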
107113

108114
Azure OpenAI embedding models use cosine similarity, so if you're using Azure OpenAI embedding models, `cosine` is the recommended metric. Other supported ranking metrics include `euclidean` and `dotProduct`.
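For reference, the three metrics named above can be written out in a few lines of plain Python. This is only an illustration of the math under the usual textbook definitions, not how the search engine computes or scales its scores, and the function names are hypothetical.

```python
import math


def dot_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    # Dot product divided by the product of the two vector lengths.
    return dot_product(a, b) / (math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b)))


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))   # orthogonal vectors
print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))  # 3-4-5 triangle
```

For vectors that are already normalized to unit length, cosine similarity and dot product give the same value.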
@@ -113,7 +119,7 @@ Multiple sets are created if the query targets multiple vector fields, or if the
113119

114120
In this vector query, which is shortened for brevity, the "value" contains the vectorized text of the query input. The "fields" property specifies which vector fields are searched. The "k" property specifies the number of nearest neighbors to return as top hits.
115121

116-
Recall that the vector query was generated from this string: `"what Azure services support full text search"`. The search targets the "contentVector" field.
122+
The sample vector query for this article is: `"what Azure services support full text search"`. The query targets the "contentVector" field.
117123

118124
```http
119125
POST https://{{search-service-name}}.search.windows.net/indexes/{{index-name}}/docs/search?api-version={{api-version}}
