articles/search/includes/quickstarts/search-get-started-vector-python.md
27 additions & 29 deletions (+27 -29)
@@ -7,9 +7,7 @@ ms.topic: include
7
7
ms.date: 06/19/2025
8
8
---
9
9
10
-
The [Azure AI Search client library](/python/api/overview/azure/search-documents-readme) allows you to create, load, and query vectors. It provides an abstraction over the REST API for access to index operations such as data ingestion, search operations and index management operations.
11
-
12
-
In this quickstart, you'll use a Jupyter notebook which contains the configuration, data, and code required to perform these operations.
10
+
In this quickstart, you use a Jupyter notebook to create, load, and query vectors. The code examples perform these operations by using the [Azure AI Search client library](/python/api/overview/azure/search-documents-readme). The library provides an abstraction over the REST API for access to index operations such as data ingestion, search operations, and index management operations.
13
11
14
12
In Azure AI Search, a [vector store](../../vector-store.md) has an index schema that defines vector and nonvector fields, a vector search configuration for algorithms that create the embedding space, and settings on vector field definitions that are evaluated at query time. The [Create Index](/rest/api/searchservice/indexes/create-or-update) REST API creates the vector store.
15
13
@@ -32,9 +30,9 @@ In Azure AI Search, a [vector store](../../vector-store.md) has an index schema
32
30
33
31
## Retrieve resource information
34
32
35
-
Requests to the search endpoint must be authenticated and authorized. While it is possible to use API keys or roles for this task, we recommend [using a keyless connection via Microsoft Entra ID](../../search-get-started-rbac.md).
33
+
Requests to the search endpoint must be authenticated and authorized. While it's possible to use API keys or roles for this task, we recommend [using a keyless connection via Microsoft Entra ID](../../search-get-started-rbac.md).
36
34
37
-
This quickstart uses `DefaultAzureCredential` which simplifies authentication in both development and production scenarios. However, for production scenarios, you may have more advanced requirements that require a different approach. See [Authenticate Python apps to Azure services by using the Azure SDK for Python](/azure/developer/python/sdk/authentication/overview) to understand all of your options.
35
+
This quickstart uses `DefaultAzureCredential`, which simplifies authentication in both development and production scenarios. However, for production scenarios, you might have more advanced requirements that require a different approach. See [Authenticate Python apps to Azure services by using the Azure SDK for Python](/azure/developer/python/sdk/authentication/overview) to understand all of your options.
38
36
39
37
40
38
## Clone the code and setup environment
@@ -57,7 +55,7 @@ This quickstart uses `DefaultAzureCredential` which simplifies authentication in
57
55
58
56
1. Rename the `sample.env` file to `.env` and modify the values in the `.env` file.
59
57
60
-
Use the Search service Url as the `AZURE_SEARCH_ENDPOINT`. You can find this in the Azure portal. Go to your Azure AI Search service resource, on the Overview page, look for the Url field. An example endpoint might look like `https://mydemo.search.windows.net`.
58
+
Use the search service URL as the `AZURE_SEARCH_ENDPOINT`. You can find the URL in the Azure portal: go to your Azure AI Search service resource and, on the Overview page, look for the Url field. An example endpoint might look like `https://mydemo.search.windows.net`.
61
59
62
60
Finally, choose a new `AZURE_SEARCH_INDEX_NAME` name, or use the one provided in the file.
63
61
@@ -72,11 +70,11 @@ This quickstart uses `DefaultAzureCredential` which simplifies authentication in
72
70
```
73
71
74
72
> [!Note]
75
-
> This assumes you're using Git Bash in your Terminal, and you're running on Windows. If you're using a different shell and/or a different operating system, you'll need to adjust these instructions for your specific environment.
73
+
> This assumes you're using Git Bash in your Terminal, and you're running on Windows. If you're using a different shell and/or a different operating system, adjust these instructions for your specific environment.
76
74
77
75
If prompted, allow Visual Studio Code to use the new environment.
78
76
79
-
The `where python` command will validate that you are working from the virtual environment by listing `python.exe` in the `Quickstart-Vector-Search\.venv\` folder, as well as other locations from your machine's directory.
77
+
The `where python` command validates that you're working from the virtual environment by listing `python.exe` in the `Quickstart-Vector-Search\.venv\` folder, along with any other Python locations on your machine.
80
78
81
79
1. Install the required libraries by running the following command.
82
80
@@ -90,7 +88,7 @@ This quickstart uses `DefaultAzureCredential` which simplifies authentication in
90
88
> If this is the first time you have used a Jupyter Notebook (.ipynb) in Visual Studio Code, you will be prompted to install the Jupyter Notebook kernel and possibly other tools. Choose to install the suggested tools to continue with this quickstart.
91
89
92
90
93
-
1. Run the cell in the section below the title "Install packages and set variables". This invokes the following code:
91
+
1. Find the cell below the section titled "Install packages and set variables" and select the **Execute Cell (`Ctrl` + `Alt` + `Enter`)** button (which looks like a typical run button) to the left of the cell. Executing the cell loads the environment variables, creates the `DefaultAzureCredential`, and prints values to the output to confirm that the notebook's dependencies and `.env` are set up correctly.
94
92
95
93
```python
96
94
# Load environment variables from .env file
@@ -109,7 +107,7 @@ This quickstart uses `DefaultAzureCredential` which simplifies authentication in
109
107
print(f"Using Azure Search index: {index_name}")
110
108
!pip list
111
109
```
112
-
The following output is displayed below this cell to confirm that the values are set up correctly.
110
+
Executing this cell produces the following output.
113
111
114
112
```output
115
113
Using Azure Search endpoint: https://<search-service-name>.search.windows.net
@@ -131,14 +129,14 @@ This quickstart uses `DefaultAzureCredential` which simplifies authentication in
131
129
...
132
130
```
133
131
134
-
There are many more packages which you can view in a scrollable element (see the message below the cell results).
132
+
There are many more packages that you can view in a scrollable element (see the message below the cell results).
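As a rough illustration of the kind of check this first cell performs, the following sketch parses `.env`-style text with only the Python standard library and confirms that the two settings named earlier are present. The `parse_env` and `validate_settings` helpers and the sample values are hypothetical; the notebook's own cell may load the file differently.

```python
from urllib.parse import urlparse

# Setting names taken from the quickstart's .env file
REQUIRED_KEYS = ("AZURE_SEARCH_ENDPOINT", "AZURE_SEARCH_INDEX_NAME")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blanks and # comments (hypothetical helper)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def validate_settings(settings):
    """Return a list of problems; an empty list means the settings look usable."""
    problems = [f"{key} is missing" for key in REQUIRED_KEYS if not settings.get(key)]
    endpoint = settings.get("AZURE_SEARCH_ENDPOINT", "")
    if endpoint:
        parsed = urlparse(endpoint)
        if parsed.scheme != "https" or not parsed.netloc:
            problems.append("AZURE_SEARCH_ENDPOINT should be an https URL")
    return problems


# Sample values for illustration only
sample_env = """
# Comment lines are ignored
AZURE_SEARCH_ENDPOINT=https://mydemo.search.windows.net
AZURE_SEARCH_INDEX_NAME=hotels-vector-quickstart
"""

settings = parse_env(sample_env)
problems = validate_settings(settings)
print(problems or "Settings look complete")
```

Returning a list of problems, rather than raising on the first one, makes it possible to report every missing value in a single pass.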
135
133
136
134
137
135
## Create the vector index
138
136
139
137
The code in the `vector-search-quickstart.ipynb` uses several methods from the `azure.search.documents` library to create the vector index and searchable fields.
140
138
141
-
1. Run the cell in the section below the title "Create an index". This invokes the following code:
139
+
1. Find the cell below the section titled "Create an index" and execute the cell to create the index.
142
140
143
141
```python
144
142
from azure.search.documents.indexes import SearchIndexClient
@@ -228,7 +226,7 @@ The code in the `vector-search-quickstart.ipynb` uses several methods from the `
228
226
229
227
Key takeaways when creating a vector index with the `azure.search.documents` library:
230
228
231
-
- You define an index by creating a list of fields, each one created with a helper method defining the field type, along with various settings for each field.
229
+
- You define an index by creating a list of fields. Each field is created using a helper method that defines the field type and its settings.
232
230
233
231
- This particular index supports multiple search capabilities, such as:
@@ -246,7 +244,7 @@ Creating and loading the index are separate steps. You created the index schema
246
244
247
245
In Azure AI Search, the index contains all searchable data and queries run on the search service.
248
246
249
-
1. In Visual Studio Code, run the cell in the section below "Create documents payload". This cell contains the following code (truncated for brevity):
247
+
1. Find the cell below the section titled "Create documents payload" and execute the cell. This cell contains the following code (truncated for brevity):
250
248
251
249
```python
252
250
# Create a documents payload
@@ -290,9 +288,9 @@ In Azure AI Search, the index contains all searchable data and queries run on th
290
288
This cell loads a variable named `documents` with a JSON object describing each document, along with the vectorized version of the article's description. This vector is what powers the search.
291
289
292
290
> [!IMPORTANT]
293
-
> The code in this example isn't runnable. Several characters or lines are truncated / removed for brevity. Use the code in your `vector-search-quickstart.ipynb` file to run the request.
291
+
> The code in this example isn't runnable. Several characters or lines are removed for brevity. Instead, run the code in the Jupyter notebook.
294
292
295
-
1. Run the cell in the section below "Upload the documents". This cell contains the following code (truncated for brevity):
293
+
1. Find the cell below the section titled "Upload the documents" and execute the cell. This cell contains the following code (truncated for brevity):
296
294
297
295
```python
298
296
# Upload documents to the index
@@ -326,7 +324,7 @@ In Azure AI Search, the index contains all searchable data and queries run on th
326
324
327
325
Key takeaways about the `upload_documents()` method and this example:
328
326
329
-
- The `SearchClient` is the main object provided by the Azure SDK for Python (azure-search-documents package) that allows your code to interact with a specific search index hosted in your Azure AI Search service. It is an abstraction over the REST API. It provides access to index operations such as:
327
+
- Your code interacts with a specific search index hosted in your Azure AI Search service through the `SearchClient`, which is the main object provided by the `azure-search-documents` package. The `SearchClient` provides access to index operations such as:
330
328
331
329
- **Data ingestion** - `upload_documents()`, `merge_documents()`, `delete_documents()`, etc.
@@ -354,13 +352,13 @@ The example vector queries are based on two strings:
354
352
355
353
The vector query string is semantically similar to the search string, but it includes terms that don't exist in the search index. If you do a keyword search for `quintessential lodging near running trails, eateries, retail`, results are zero. We use this example to show how you can get relevant results even if there are no matching terms.
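To make the contrast concrete, here's a minimal sketch that compares term overlap with vector similarity for the two strings. The three-dimensional vectors are made-up stand-ins for real embeddings, and `keyword_overlap` and `cosine_similarity` are hypothetical helpers, not part of the notebook.

```python
import math


def keyword_overlap(query, text):
    """Return the lowercase terms that two strings share."""
    return set(query.lower().split()) & set(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


search_string = "historic hotel walk to restaurants and shopping"
vector_query_string = "quintessential lodging near running trails, eateries, retail"

# Toy embeddings for illustration only; real embeddings have far more dimensions
search_vector = [0.8, 0.2, 0.4]
query_vector = [0.9, 0.1, 0.3]

shared_terms = keyword_overlap(vector_query_string, search_string)
similarity = cosine_similarity(query_vector, search_vector)

print(f"Shared terms: {shared_terms or 'none'}")
print(f"Cosine similarity: {similarity:.3f}")
```

The strings share no terms, so a keyword match finds nothing, while vectors that point in nearly the same direction still produce a high similarity value.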
356
354
357
-
1. Run the cell in the section below "Create the vector query string". this loads the `vector` variable with the vectorized query data required to run all of the searches in the next sections.
355
+
1. Find the cell below the section titled "Create the vector query string" and execute the cell. This loads the `vector` variable with the vectorized query data required to run all of the searches in the next sections.
358
356
359
357
### Single vector search
360
358
361
359
The first example demonstrates a basic scenario where you want to find document descriptions that closely match the search string.
362
360
363
-
1. Run the cell below the section called "Single vector search". This block contains the request to query the search index.
361
+
1. Find the cell below the section titled "Single vector search" and execute the cell. This block contains the request to query the search index.
364
362
365
363
```python
366
364
# IMPORTANT: Before you run this code, make sure the documents were successfully
@@ -400,11 +398,11 @@ The first example demonstrates a basic scenario where you want to find document
400
398
401
399
The vector query string is `quintessential lodging near running trails, eateries, retail`, which is vectorized into an embedding with 1,536 dimensions for this query.
402
400
403
-
The response for the vector equivalent of `quintessential lodging near running trails, eateries, retail` includes seven results but the code specifies `top=5` so only the first five results will be returned. Furthermore, only the fields specific by the `select` are returned.
401
+
The response for the vector equivalent of `quintessential lodging near running trails, eateries, retail` includes seven results, but the code specifies `top=5`, so only the first five results are returned. Furthermore, only the fields specified by the `select` parameter are returned.
404
402
405
-
`search_client.search()` returns a dict-like object. Each result provides a search score which can be accessed using `score = result.get("@search.score", "N/A")`. While not displayed in this example, in a similarity search, the response always includes `k` results ordered by the value similarity score.
403
+
`search_client.search()` returns an iterable of dict-like results. Each result provides a search score, which can be accessed using `score = result.get("@search.score", "N/A")`. While not displayed in this example, in a similarity search, the response always includes `k` results ordered by similarity score.
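The score access pattern can be tried without a search service. The following sketch uses plain dictionaries as stand-ins for search results; the `id` field, the scores, and the `top_results` helper are invented for illustration.

```python
def top_results(results, top=5):
    """Order dict-like results by score and keep the first `top` entries."""
    ranked = sorted(results, key=lambda r: r.get("@search.score", 0.0), reverse=True)
    return ranked[:top]


# Seven made-up candidates, mirroring "seven results, top=5"
candidates = [
    {"id": str(i), "@search.score": round(0.9 - 0.05 * i, 2)} for i in range(7)
]

returned = top_results(candidates, top=5)

print(f"Total results: {len(returned)}")
for result in returned:
    # Same access pattern as in the text, with a fallback when no score exists
    score = result.get("@search.score", "N/A")
    print(f"- id: {result['id']}, score: {score}")
```

This sketch only mimics the shape of the results. The output shown next comes from the notebook's query against the real index.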
406
404
407
-
When run, each result will be displayed:
405
+
When run, each result is displayed:
408
406
409
407
```output
410
408
Total results: 5
@@ -419,7 +417,7 @@ The first example demonstrates a basic scenario where you want to find document
419
417
420
418
You can add filters, but the filters are applied to the nonvector content in your index. In this example, the filter applies to the `Tags` field to filter out any hotels that don't provide free Wi-Fi.
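The division of labor between the filter and the vector ranking can be sketched with plain Python. The documents, scores, and the `free wifi` tag value are assumptions made for illustration, and `filtered_vector_results` is a hypothetical helper; the real filter runs inside the search service.

```python
def filtered_vector_results(documents, required_tag, top=5):
    """Keep documents whose nonvector Tags field has the tag, then rank by score."""
    matches = [doc for doc in documents if required_tag in doc.get("Tags", [])]
    matches.sort(key=lambda doc: doc["score"], reverse=True)
    return matches[:top]


# Made-up documents; "score" stands in for a vector similarity score
documents = [
    {"id": "1", "Tags": ["free wifi", "pool"], "score": 0.82},
    {"id": "2", "Tags": ["bar"], "score": 0.91},
    {"id": "3", "Tags": ["free wifi"], "score": 0.88},
]

results = filtered_vector_results(documents, "free wifi")

print(f"Total filtered results: {len(results)}")
for doc in results:
    print(f"- id: {doc['id']}, score: {doc['score']}")
```

Document `2` has the highest score but is excluded because the filter looks only at `Tags`, not at the vector field.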
421
419
422
-
1. Find the `### Run a vector query with a filter` code block in the file. This block contains the request to query the search index.
420
+
1. Find the cell below the section titled "Run a vector query with a filter" and execute the cell. This cell contains the request to query the search index.
423
421
424
422
```python
425
423
if vector:
@@ -450,7 +448,7 @@ You can add filters, but the filters are applied to the nonvector content in you
450
448
print("No vector loaded, skipping search.")
451
449
```
452
450
453
-
When run, each result will be displayed:
451
+
When run, each result is displayed:
454
452
455
453
```output
456
454
Total filtered results: 2
@@ -501,7 +499,7 @@ You can add filters, but the filters are applied to the nonvector content in you
501
499
print("No vector loaded, skipping search.")
502
500
```
503
501
504
-
The query was the same as the previous [single vector search example](#single-vector-search), but it includes a post-processing exclusion filter and returns only the two hotels hotels within 300 KM.
502
+
The query was the same as the previous [single vector search example](#single-vector-search), but it includes a post-processing exclusion filter and returns only the two hotels within 300 kilometers.
505
503
506
504
```output
507
505
Total semantic hybrid results: 2
@@ -525,7 +523,7 @@ Hybrid search consists of keyword queries and vector queries in a single search
525
523
- **Search string**: `historic hotel walk to restaurants and shopping`
526
524
- **Vector query string** (vectorized into a mathematical representation): `quintessential lodging near running trails, eateries, retail`
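The keyword and vector parts of a hybrid query each produce their own ranked list, and this article later names Reciprocal Rank Fusion (RRF) as the method that merges them. Here's a minimal sketch of RRF under stated assumptions: the document IDs are placeholders, the constant `k=60` is a commonly used value rather than one confirmed by this article, and `reciprocal_rank_fusion` is a hypothetical helper.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Merge ranked lists: each document scores 1 / (k + rank) per list it appears in."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Placeholder rankings, best match first
keyword_ranking = ["doc-a", "doc-b", "doc-c"]
vector_ranking = ["doc-b", "doc-d", "doc-a"]

fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])

for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Documents that appear in both lists accumulate two contributions, which is why `doc-b` and `doc-a` rise above documents found by only one of the queries.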
527
525
528
-
1. Run the cell in the section titled "Hybrid Search". This block contains the request to query the search index.
526
+
1. Find the cell below the section titled "Hybrid Search" and execute the cell. This block contains the request to query the search index.
529
527
530
528
```python
531
529
if vector:
@@ -603,7 +601,7 @@ Hybrid search consists of keyword queries and vector queries in a single search
Because RRF merges results, it helps to review the inputs. The following results are from only the full-text query. The top two results are Sublime Palace Hotel and History Lion Resort. The Sublime Palace Hotel has a stronger BM25 relevance score.
604
+
Because Reciprocal Rank Fusion (RRF) merges results, it helps to review the inputs. The following results are from only the full-text query. The top two results are Sublime Palace Hotel and History Lion Resort. The Sublime Palace Hotel has a stronger BM25 relevance score.
607
605
608
606
```json
609
607
{
@@ -678,7 +676,7 @@ Hybrid search consists of keyword queries and vector queries in a single search
678
676
679
677
Here's the last query in the collection. This hybrid query with semantic ranking is filtered to show only the hotels within a 500-kilometer radius of Washington D.C. You can set `vectorFilterMode` to null, which is equivalent to the default (`preFilter` for newer indexes and `postFilter` for older ones).
680
678
681
-
1. Run the cell below the section titled `Semantic hybrid search`. This code block contains the request to query the search index.
679
+
1. Find the cell below the section titled "Semantic hybrid search" and execute the cell. This code block contains the request to query the search index.
682
680
683
681
```python
684
682
if semantic_hybrid_query_vector:
@@ -785,7 +783,7 @@ When you're working in your own subscription, it's a good idea at the end of a p
785
783
786
784
You can find and manage resources in the Azure portal by using the **All resources** or **Resource groups** link in the leftmost pane.
787
785
788
-
If you want to keep the search service, but delete the index and documents, you can use the `SearchIndexClient` object's `delete_index()` method. The cell in the section "Clean up" at the bottom of the notebook deletes the `hotels-vector-quickstart` index:
786
+
If you want to keep the search service, but delete the index and documents, you can use the `SearchIndexClient` object's `delete_index()` method. Find the cell below the section titled "Clean up" and execute the cell if you want to delete the `hotels-vector-quickstart` index: