Commit 2c14740

Merge pull request #143 from mattgotteiner/main
Update samples
2 parents 62f451f + d25f167 commit 2c14740

13 files changed

+2583
-6
lines changed

Quickstart-Agentic-Retrieval/quickstart-agentic-retrieval.ipynb

Lines changed: 1090 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,7 @@
1+
azure-identity
2+
openai
3+
aiohttp
4+
ipykernel
5+
dotenv
6+
requests
7+
azure-search-documents==11.6.0b12
Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,4 @@
1+
AZURE_OPENAI_ENDPOINT=https://your-openai-service.openai.azure.com
2+
AZURE_OPENAI_GPT_DEPLOYMENT=gpt-4o-mini
3+
AZURE_SEARCH_ENDPOINT=https://your-search-service.search.windows.net
4+
AZURE_SEARCH_INDEX_NAME=agentic-retrieval-sample

Quickstart-Document-Permissions-Pull-API/document-permissions-pull-api.ipynb

Lines changed: 347 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,8 @@
1+
azure-identity
2+
aiohttp
3+
ipykernel
4+
dotenv
5+
requests
6+
msgraph-sdk
7+
azure-storage-file-datalake
8+
azure-search-documents==11.6.0b12
Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,8 @@
1+
AZURE_SEARCH_ENDPOINT=https://your-search-service.search.windows.net
2+
AZURE_SEARCH_INDEX=document-permissions-indexer-idx
3+
AZURE_SEARCH_INDEXER=document-permissions-indexer-idxr
4+
AZURE_SEARCH_DATASOURCE=document-permissions-indexer-ds
5+
AZURE_STORAGE_ACCOUNT_NAME=
6+
AZURE_STORAGE_CONTAINER_NAME=state-parks
7+
AZURE_STORAGE_CONNECTION_STRING=
8+
AZURE_STORAGE_RESOURCE_ID=
Lines changed: 243 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,243 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"id": "810ce279",
6+
"metadata": {},
7+
"source": [
8+
"# Document-level access example using the push document APIs\n",
9+
"\n",
10+
"In Azure AI Search, you can upload any JSON document payload to a search index for indexing. This notebook shows you how to index documents that contain [user access permissions at the document level](https://learn.microsoft.com/azure/search/search-document-level-access-overview), and then query the index to return only those results that the user is authorized to view.\n",
11+
"\n",
12+
"The security principal behind the query access token determines the \"user\". The permission metadata in the document determines whether the user has authorization to the content. Internally, the search engine filters out any documents that aren't associated with the security principal.\n",
13+
"\n",
14+
"This feature is currently in preview.\n",
15+
"\n",
16+
"For an alternative approach that uses indexers and the pull API, see [Quickstart-Document-Permissions-Pull-API](../Quickstart-Document-Permissions-Pull-API/document-permissions-pull-api.ipynb).\n"
17+
]
18+
},
19+
{
20+
"cell_type": "markdown",
21+
"id": "b6585426",
22+
"metadata": {},
23+
"source": [
24+
"## Prerequisites\n",
25+
"\n",
26+
"+ Azure AI Search, with [role-based access control](https://learn.microsoft.com/azure/search/search-security-enable-roles).\n",
27+
"\n",
28+
"## Permissions\n",
29+
"\n",
30+
"This walkthrough uses Microsoft Entra ID authentication and authorization.\n",
31+
"\n",
32+
"On Azure AI Search, you must have role assignments to create objects and run queries:\n",
33+
"\n",
34+
"+ **Search Service Contributor**\n",
35+
"+ **Search Index Data Contributor**\n",
36+
"+ **Search Index Data Reader**\n",
37+
"\n",
38+
"For more information, see [Connect to Azure AI Search using roles](https://learn.microsoft.com/azure/search/search-security-rbac) and [Quickstart: Connect without keys for local testing](https://learn.microsoft.com/azure/search/search-get-started-rbac).\n",
39+
"\n",
40+
"## Set the environment variables\n",
41+
"\n",
42+
"1. Rename `sample.env` to `.env`.\n",
43+
"1. In the `.env` file, provide a full endpoint to your search service (https://your-search-service.search.windows.net).\n",
44+
"1. Replace the default index name if you want a different name.\n",
45+
"\n",
46+
"## Load Connections\n",
47+
"\n",
48+
"We recommend creating a virtual environment to run this sample code. In Visual Studio Code, open the command palette (Ctrl+Shift+P) to create an environment. This notebook was tested on Python 3.10.\n",
49+
"\n",
50+
"Once your environment is created, load the environment variables."
51+
]
52+
},
53+
{
54+
"cell_type": "code",
55+
"execution_count": null,
56+
"id": "2975a7f5",
57+
"metadata": {},
58+
"outputs": [],
59+
"source": [
60+
"from dotenv import load_dotenv\n",
61+
"from azure.identity import DefaultAzureCredential, get_bearer_token_provider\n",
62+
"import os\n",
63+
"\n",
64+
"load_dotenv(override=True) # take environment variables from .env.\n",
65+
"\n",
66+
"# The following variables from your .env file are used in this notebook\n",
67+
"endpoint = os.environ[\"AZURE_SEARCH_ENDPOINT\"]\n",
68+
"credential = DefaultAzureCredential()\n",
69+
"index_name = os.getenv(\"AZURE_SEARCH_INDEX\")\n",
70+
"token_provider = get_bearer_token_provider(credential, \"https://search.azure.com/.default\")\n"
71+
]
72+
},
73+
{
74+
"cell_type": "markdown",
75+
"id": "9327cf01",
76+
"metadata": {},
77+
"source": [
78+
"## Create Sample Index\n",
79+
"\n",
80+
"The search index must include fields for your content and for permission metadata. Assign the new permission filter option to a string field and make sure the field is filterable. The search engine builds the filter internally at query time."
81+
]
82+
},
83+
{
84+
"cell_type": "code",
85+
"execution_count": null,
86+
"id": "9863061f",
87+
"metadata": {},
88+
"outputs": [],
89+
"source": [
90+
"from azure.search.documents.indexes.models import SearchField, SearchIndex, PermissionFilter, SearchIndexPermissionFilterOption\n",
91+
"from azure.search.documents.indexes import SearchIndexClient\n",
92+
"\n",
93+
"index_client = SearchIndexClient(endpoint=endpoint, credential=credential)\n",
94+
"index = SearchIndex(\n",
95+
" name=index_name,\n",
96+
" fields=[\n",
97+
" SearchField(name=\"id\", type=\"Edm.String\", key=True, filterable=True, sortable=True),\n",
98+
" SearchField(name=\"oid\", type=\"Collection(Edm.String)\", retrievable=True, filterable=True, permission_filter=PermissionFilter.USER_IDS),\n",
99+
" SearchField(name=\"group\", type=\"Collection(Edm.String)\", retrievable=True, filterable=True, permission_filter=PermissionFilter.GROUP_IDS),\n",
100+
" SearchField(name=\"name\", type=\"Edm.String\", searchable=True)\n",
101+
" ],\n",
102+
" permission_filter_option=SearchIndexPermissionFilterOption.ENABLED\n",
103+
")\n",
104+
"\n",
105+
"index_client.create_index(index=index)\n",
106+
"print(f\"Index '{index_name}' created with permission filter option enabled.\")"
107+
]
108+
},
109+
{
110+
"cell_type": "markdown",
111+
"id": "f5cf4169",
112+
"metadata": {},
113+
"source": [
114+
"## Connect to Graph to find your object ID (OID) and groups\n",
115+
"\n",
116+
"This step calls the Graph APIs to get a few group IDs for your Microsoft Entra identity. Your group IDs will be added to the access control list of the objects created in the next step."
117+
]
118+
},
119+
{
120+
"cell_type": "code",
121+
"execution_count": null,
122+
"id": "63904f09",
123+
"metadata": {},
124+
"outputs": [],
125+
"source": [
126+
"from msgraph import GraphServiceClient\n",
127+
"client = GraphServiceClient(credentials=credential, scopes=[\"https://graph.microsoft.com/.default\"])\n",
128+
"\n",
129+
"groups = await client.me.member_of.get()\n",
130+
"me = await client.me.get()\n",
131+
"oid = me.id"
132+
]
133+
},
134+
{
135+
"cell_type": "markdown",
136+
"id": "a9ce6d0f",
137+
"metadata": {},
138+
"source": [
139+
"## Upload Sample Data\n",
140+
"\n",
141+
"This step uploads sample documents directly to the search index using the push API. Each document sets its `oid` and `group` permission fields to your object ID, your first group ID, or the placeholder values `all` and `none`."
142+
]
143+
},
144+
{
145+
"cell_type": "code",
146+
"execution_count": null,
147+
"id": "8fb830a1",
148+
"metadata": {},
149+
"outputs": [],
150+
"source": [
151+
"from azure.search.documents import SearchClient\n",
152+
"search_client = SearchClient(endpoint=endpoint, index_name=index_name, credential=credential)\n",
153+
"\n",
154+
"documents = [\n",
155+
" { \"id\": \"1\", \"oid\": [oid], \"group\": [groups.value[0].id], \"name\": \"Document 1\" },\n",
156+
" { \"id\": \"2\", \"oid\": [\"all\"], \"group\": [groups.value[0].id], \"name\": \"Document 2\" },\n",
157+
" { \"id\": \"3\", \"oid\": [oid], \"group\": [\"all\"], \"name\": \"Document 3\" },\n",
158+
" { \"id\": \"4\", \"oid\": [\"none\"], \"group\": [\"none\"], \"name\": \"Document 4\" },\n",
159+
" { \"id\": \"5\", \"oid\": [\"none\"], \"group\": [groups.value[0].id], \"name\": \"Document 5\" },\n",
160+
"]\n",
161+
"search_client.upload_documents(documents=documents)\n",
162+
"print(\"Documents uploaded to the index.\")\n"
163+
]
164+
},
165+
{
166+
"cell_type": "markdown",
167+
"id": "e5c93f76",
168+
"metadata": {},
169+
"source": [
170+
"## Search sample data with x-ms-query-source-authorization\n",
171+
"\n",
172+
"This query uses a wildcard search string (`*`) to provide an unqualified search. It passes the caller's access token in `x_ms_query_source_authorization` and returns the name and permission metadata of each document that the caller is authorized to view."
173+
]
174+
},
175+
{
176+
"cell_type": "code",
177+
"execution_count": null,
178+
"id": "cd872e8c",
179+
"metadata": {},
180+
"outputs": [],
181+
"source": [
182+
"results = search_client.search(search_text=\"*\", x_ms_query_source_authorization=token_provider(), select=\"name,oid,group\", order_by=\"id asc\")\n",
183+
"\n",
184+
"for result in results:\n",
185+
" print(f\"Name: {result['name']}, OID: {result['oid']}, Group: {result['group']}\")"
186+
]
187+
},
188+
{
189+
"cell_type": "markdown",
190+
"id": "d31b67d8",
191+
"metadata": {},
192+
"source": [
193+
"## Search sample data without x-ms-query-source-authorization \n",
194+
"\n",
195+
"This step demonstrates the user experience when authorization fails. No results are returned in the response."
196+
]
197+
},
198+
{
199+
"cell_type": "code",
200+
"execution_count": null,
201+
"id": "a1f2f2a0",
202+
"metadata": {},
203+
"outputs": [],
204+
"source": [
205+
"results = search_client.search(search_text=\"*\", x_ms_query_source_authorization=None, select=\"name,oid,group\", order_by=\"id asc\")\n",
206+
"\n",
207+
"for result in results:\n",
208+
" print(f\"Name: {result['name']}, OID: {result['oid']}, Group: {result['group']}\")"
209+
]
210+
},
211+
{
212+
"cell_type": "markdown",
213+
"id": "5ad253ec",
214+
"metadata": {},
215+
"source": [
216+
"## Next steps\n",
217+
"\n",
218+
"To learn more, see [Document-level access control in Azure AI Search](https://learn.microsoft.com/azure/search/search-document-level-access-overview)."
219+
]
220+
}
221+
],
222+
"metadata": {
223+
"kernelspec": {
224+
"display_name": ".venv",
225+
"language": "python",
226+
"name": "python3"
227+
},
228+
"language_info": {
229+
"codemirror_mode": {
230+
"name": "ipython",
231+
"version": 3
232+
},
233+
"file_extension": ".py",
234+
"mimetype": "text/x-python",
235+
"name": "python",
236+
"nbconvert_exporter": "python",
237+
"pygments_lexer": "ipython3",
238+
"version": "3.12.10"
239+
}
240+
},
241+
"nbformat": 4,
242+
"nbformat_minor": 5
243+
}
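The notebook above relies on the search service to filter results by the caller's identity. As a rough way to reason about which of the five sample documents a given user might see, the following is a minimal local sketch, not the service's implementation. It assumes that a document is visible when the user matches either the `oid` field or the `group` field, that `all` matches every user, and that `none` matches no one. The helper names `is_visible` and `visible_names` and the placeholder IDs `user-1` and `group-1` are hypothetical and do not appear in the sample.

```python
# Local sketch only: approximates the service-side permission filter.
# Assumptions (not confirmed by the sample): user and group matches are
# OR-combined, "all" grants access to everyone, and "none" matches nobody.


def is_visible(doc, user_oid, user_groups):
    """Return True if the user matches the document's user or group ACL."""
    oids = doc.get("oid", [])
    groups = doc.get("group", [])
    user_match = "all" in oids or user_oid in oids
    group_match = "all" in groups or any(g in groups for g in user_groups)
    return user_match or group_match


def visible_names(documents, user_oid, user_groups):
    """Return the names of the documents the user would be allowed to see."""
    return [d["name"] for d in documents if is_visible(d, user_oid, user_groups)]


# Hypothetical placeholders standing in for real Microsoft Entra object IDs.
oid = "user-1"
group_id = "group-1"

# Same shape as the documents uploaded by the notebook.
documents = [
    {"id": "1", "oid": [oid], "group": [group_id], "name": "Document 1"},
    {"id": "2", "oid": ["all"], "group": [group_id], "name": "Document 2"},
    {"id": "3", "oid": [oid], "group": ["all"], "name": "Document 3"},
    {"id": "4", "oid": ["none"], "group": ["none"], "name": "Document 4"},
    {"id": "5", "oid": ["none"], "group": [group_id], "name": "Document 5"},
]

print(visible_names(documents, oid, [group_id]))
print(visible_names(documents, "someone-else", []))
```

Under these assumptions, Document 4 is hidden from every user, while Documents 2 and 3 remain visible even to a user who has no matching object ID or group. Compare the printed lists against the notebook's actual query output before relying on them.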
Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,7 @@
1+
azure-identity
2+
aiohttp
3+
ipykernel
4+
dotenv
5+
requests
6+
msgraph-sdk
7+
azure-search-documents==11.6.0b12
Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,2 @@
1+
AZURE_SEARCH_ENDPOINT=https://your-search-service.search.windows.net
2+
AZURE_SEARCH_INDEX=document-permissions-push-idx
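The notebook reads `AZURE_SEARCH_INDEX` with `os.getenv`, which returns `None` without raising an error when the variable is unset. A small check run before creating any clients can surface a missing or malformed setting early. The following is a minimal sketch using only the standard library; the function name `check_settings` and the `REQUIRED` tuple are hypothetical and are not part of the sample.

```python
from urllib.parse import urlparse

# Variable names taken from the sample.env file for the push API quickstart.
REQUIRED = ("AZURE_SEARCH_ENDPOINT", "AZURE_SEARCH_INDEX")


def check_settings(env):
    """Return a list of human-readable problems found in a settings mapping."""
    problems = []
    for name in REQUIRED:
        if not env.get(name):
            problems.append(f"{name} is missing or empty")
    endpoint = env.get("AZURE_SEARCH_ENDPOINT", "")
    if endpoint:
        parsed = urlparse(endpoint)
        # The sample expects a full endpoint such as
        # https://your-search-service.search.windows.net
        if parsed.scheme != "https" or not parsed.netloc:
            problems.append("AZURE_SEARCH_ENDPOINT should be a full https:// URL")
    return problems


sample = {
    "AZURE_SEARCH_ENDPOINT": "https://your-search-service.search.windows.net",
    "AZURE_SEARCH_INDEX": "document-permissions-push-idx",
}
print(check_settings(sample))
```

In the notebook, the same function could be called as `check_settings(os.environ)` right after `load_dotenv(override=True)`, raising an error if the returned list is not empty.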

README.md

Lines changed: 17 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1,16 +1,27 @@
11
# Python samples for Azure AI Search
22

3-
This repository contains Python code samples used in Azure AI Search documentation. Unless noted otherwise, all samples run on the shared (free) pricing tier of an [Azure AI Search service](https://learn.microsoft.com/azure/search/search-create-service-portal).
3+
This repository contains Python code samples used in Azure AI Search documentation. Unless noted otherwise, all samples run on the shared (free) pricing tier of an [Azure AI Search service](https://learn.microsoft.com/azure/search/search-create-service-portal). If your configuration uses a search service managed identity for indexer connections, or if a sample uses semantic ranker, your search service must be basic tier or higher.
4+
5+
## Day-one quickstarts and tutorials
6+
7+
| Sample | Description |
8+
|--------|-------------|
9+
| [Quickstart](Quickstart/README.md) | "Day One" introduction to the fundamental tasks of working with a classic search index: create, load, and query. This sample is a Jupyter notebook (.ipynb) file. The index is modeled on a subset of the Hotels dataset, widely used in Azure AI Search samples, but reduced here for readability and comprehension. |
10+
| [Quickstart-Semantic-Search](Quickstart-Semantic-Search/semantic-search-quickstart.ipynb) | Extends the quickstart through modifications that invoke semantic ranking. This notebook adds a semantic configuration to the index and semantic query options that formulate the query and response. You must have basic tier or higher for this quickstart.|
11+
| [Quickstart-RAG](Quickstart-RAG/quickstart-rag.ipynb) | "Day One" introduction to LLM integration with a chat model such as GPT-3.5-turbo or equivalent. We recommend the basic tier or higher for this quickstart.|
12+
| [Quickstart-Document-Permissions-Pull-API](Quickstart-Document-Permissions-Pull-API/document-permissions-pull-api.ipynb) | Using an indexer "pull API" approach, flow access control lists from a data source to search results and apply permission filters that restrict access to authorized content. Indexer support is limited to Azure Data Lake Storage (ADLS) Gen2 permission metadata. You must have basic tier or higher for this quickstart.|
13+
| [Quickstart-Document-Permissions-Push-API](Quickstart-Document-Permissions-Push-API/document-permissions-push-api.ipynb) | Using the push APIs for indexing a JSON payload, flow embedded permission metadata to indexed documents, and to search results that are filtered based on user access to authorized content. We recommend the basic tier or higher for this quickstart.|
14+
| [Quickstart-Agentic-Retrieval](Quickstart-Agentic-Retrieval/quickstart-agentic-retrieval.ipynb) | Set up a search agent in Azure AI Search to integrate LLM reasoning into query planning. We recommend the basic tier or higher for this quickstart. |
15+
| [Tutorial-RAG](Tutorial-RAG/tutorial-rag.ipynb) | A deeper dive into LLM integration with a chat model such as GPT-3.5-turbo or equivalent. We recommend the basic tier or higher for this tutorial. |
16+
17+
## Deeper dive tutorials
418

519
| Sample | Description |
620
|--------|-------------|
21+
| [agentic-retrieval-pipeline-example](agentic-retrieval-pipeline-example/agent-example.ipynb) | This sample demonstrates integration with Azure AI Agent service, adding an AI agent and tool for an end-to-end conversational search experience. |
722
| [azure-function-search](azure-function-search/readme.md) | This sample is an Azure Function that sends query requests to an Azure AI Search service. You can substitute this code to replace the contents of the `api` folder in the C# sample [azure-search-static-web-app](https://github.com/Azure-Samples/azure-search-static-web-app). |
823
| [bulk-insert](bulk-insert/readme.md) | This sample shows you how to create and load an index using the push APIs and sample data. You can substitute this code to replace the contents of the `bulk-insert` folder in the C# sample [azure-search-static-web-app](https://github.com/Azure-Samples/azure-search-static-web-app) |
9-
| cmk-encryption | This example shows you how to encrypt content using customer-managed keys.|
10-
| quickstart | "Day One" introduction to the fundamental tasks of working with a search index: create, load, and query. This sample is a notebook .ipynb file. The index is modeled on a subset of the Hotels dataset, widely used in Azure AI Search samples, but reduced here for readability and comprehension. |
11-
| quickstart-semantic-search | Extends the quickstart through modifications that invoke semantic search. This notebook adds a semantic configuration to the index and semantic query options that formulate the query and response. |
12-
| quickstart-rag | "Day One" introduction to LLM integration with a chat model such as GPT-3.5-turbo or equivalent. |
13-
| tutorial-rag | A deeper dive into LLM integration with a chat model such as GPT-3.5-turbo or equivalent. |
24+
| [cmk-encryption](cmk-example/cmk-example.ipynb) | This example shows you how to encrypt content using customer-managed keys.|
1425

1526
## Archived samples
1627
