Commit d8b7a15

committed
updating code
1 parent bd807da commit d8b7a15

File tree

4 files changed

+130
-126
lines changed

articles/ai-services/agents/how-to/tools/file-search.md

Lines changed: 1 addition & 109 deletions
Original file line numberDiff line numberDiff line change
@@ -129,116 +129,8 @@ As a fallback, there's a 60-second maximum wait in the run object when the threa
129129
::: zone-end
130130

131131
::: zone pivot="deep-dive"
132-
## Creating vector stores and adding files
133132

134-
You can create a vector store and add files to it in a single API call:
133+
[!INCLUDE [deep-dive](../../includes/file-search/deep-dive.md)]
135134

136-
```python
137-
vector_store = client.beta.vector_stores.create(
138-
name="Product Documentation",
139-
file_ids=['file_1', 'file_2', 'file_3', 'file_4', 'file_5']
140-
)
141-
```
142-
143-
Adding files to vector stores is an async operation. To ensure the operation is complete, we recommend that you use the 'create and poll' helpers in our official SDKs. If you're not using the SDKs, you can retrieve the `vector_store` object and monitor its `file_counts` property to see the result of the file ingestion operation.
144-
145-
Files can also be added to a vector store after it's created by creating vector store files.
146-
147-
```python
148-
file = client.beta.vector_stores.files.create_and_poll(
149-
vector_store_id="vs_abc123",
150-
file_id="file-abc123"
151-
)
152-
```
153-
154-
Alternatively, you can add several files to a vector store by creating batches of up to 500 files.
155-
156-
```python
157-
batch = client.beta.vector_stores.file_batches.create_and_poll(
158-
vector_store_id="vs_abc123",
159-
file_ids=['file_1', 'file_2', 'file_3', 'file_4', 'file_5']
160-
)
161-
```
162-
163-
Similarly, these files can be removed from a vector store by either:
164-
165-
* Deleting the vector store file object or,
166-
* By deleting the underlying file object (which removes the file it from all vector_store and code_interpreter configurations across all agents and threads in your organization)
167-
168-
The maximum file size is 512 MB. Each file should contain no more than 5,000,000 tokens per file (computed automatically when you attach a file).
169-
170-
171-
## Attaching vector stores
172-
173-
You can attach vector stores to your agent or thread using the tool_resources parameter.
174-
175-
```python
176-
assistant = client.beta.assistants.create(
177-
instructions="You are a helpful product support assistant and you answer questions based on the files provided to you.",
178-
model="gpt-4-turbo",
179-
tools=[{"type": "file_search"}],
180-
tool_resources={
181-
"file_search": {
182-
"vector_store_ids": ["vs_1"]
183-
}
184-
}
185-
)
186-
187-
thread = client.beta.threads.create(
188-
messages=[ { "role": "user", "content": "How do I cancel my subscription?"} ],
189-
tool_resources={
190-
"file_search": {
191-
"vector_store_ids": ["vs_2"]
192-
}
193-
}
194-
)
195-
```
196-
197-
You can also attach a vector store to Threads or Assistants after they're created by updating them with the right `tool_resources`.
198-
199-
200-
## Ensuring vector store readiness before creating runs
201-
202-
We highly recommend that you ensure all files in a vector_store are fully processed before you create a run. This ensures that all the data in your vector store is searchable. You can check for vector store readiness by using the polling helpers in the SDKs, or by manually polling the `vector_store` object to ensure the status is completed.
203-
204-
As a fallback, there's a 60-second maximum wait in the run object when the thread's vector store contains files that are still being processed. This is to ensure that any files your users upload in a thread a fully searchable before the run proceeds. This fallback wait does not apply to the agent's vector store.
205-
206-
## Managing costs with expiration policies
207-
208-
The `file_search` tool uses the `vector_stores` object as its resource and you will be billed based on the size of the vector_store objects created. The size of the vector store object is the sum of all the parsed chunks from your files and their corresponding embeddings.
209-
210-
In order to help you manage the costs associated with these vector_store objects, we have added support for expiration policies in the `vector_store` object. You can set these policies when creating or updating the `vector_store` object.
211-
212-
```python
213-
vector_store = client.beta.vector_stores.create_and_poll(
214-
name="Product Documentation",
215-
file_ids=['file_1', 'file_2', 'file_3', 'file_4', 'file_5'],
216-
expires_after={
217-
"anchor": "last_active_at",
218-
"days": 7
219-
}
220-
)
221-
```
222-
223-
### Thread vector stores have default expiration policies
224-
225-
Vector stores created using thread helpers (like `tool_resources.file_search.vector_stores` in Threads or `message.attachments` in Messages) have a default expiration policy of seven days after they were last active (defined as the last time the vector store was part of a run).
226-
227-
When a vector store expires, the runs on that thread fail. To fix this, you can recreate a new vector_store with the same files and reattach it to the thread.
228-
229-
```python
230-
all_files = list(client.beta.vector_stores.files.list("vs_expired"))
231-
232-
vector_store = client.beta.vector_stores.create(name="rag-store")
233-
client.beta.threads.update(
234-
"thread_abc123",
235-
tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}},
236-
)
237-
238-
for file_batch in chunked(all_files, 100):
239-
client.beta.vector_stores.file_batches.create_and_poll(
240-
vector_store_id=vector_store.id, file_ids=[file.id for file in file_batch]
241-
)
242-
```
243135
::: zone-end
244136

articles/ai-services/agents/includes/file-search/azure-blob-storage-code-examples.md

Lines changed: 4 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -25,16 +25,15 @@ project_client = AIProjectClient.from_connection_string(
2525
)
2626
```
2727
### Step 2: Upload local files to your project Azure Blob Storage container
28-
We upload the local file to your project Azure Blob Storage container. This is the same storage account you connected to your agent during the agent setup.
29-
The project asset ID is the URI of the uploaded file and we print this value. If you create more agents in the same project that want to use the uploaded file, you can reuse this asset ID. That way you don't need to upload the file again.
28+
Upload your local file to the project's Azure Blob Storage container. This is the same storage account you connected to your agent during setup. If you create more agents in the same project that need the uploaded file(s), you can reuse this asset URI instead of uploading the file again.
3029
```python
3130
# We'll upload the local file to your project Azure Blob Storage container and will use it for vector store creation.
32-
_, asset_uri = project_client.upload_file("C:\\Users\\fosteramanda\\Downloads\\hub bicep\\azure-ai-agents\\data\\product_info_1.md")
31+
_, asset_uri = project_client.upload_file("sample_file_for_upload.md")
3332
print(f"Uploaded file, asset URI: {asset_uri}")
3433

3534
# create a vector store from the uploaded asset and wait for it to be processed
3635
ds = VectorStoreDataSource(asset_identifier=asset_uri, asset_type=VectorStoreDataSourceAssetType.URI_ASSET)
37-
vector_store = project_client.agents.create_vector_store_and_poll(data_sources=[ds], name="sample_vector_store-3")
36+
vector_store = project_client.agents.create_vector_store_and_poll(data_sources=[ds], name="sample_vector_store")
3837
print(f"Created vector store, vector store ID: {vector_store.id}")
3938
```
4039
### Step 3: Create an agent with access to the file search tool
@@ -75,7 +74,7 @@ print(f"Messages: {messages}")
7574
```
7675

7776
### Step 4: Create second vector store using the previously uploaded file
78-
Now we create a second vector store using the previously uploaded file. Using the asset_uri of file already in Azure Blob Storage is useful if you have multiple agents that need access to the same files. That way you don't need to upload the same file multiple times.
77+
Now, create a second vector store using the previously uploaded file. Using the `asset_uri` of a file already in Azure Blob Storage is useful if you have multiple agents that need access to the same files, because it eliminates the need to upload the same file multiple times.
7978
```python
8079

8180
# create a vector store with a previously uploaded file and wait for it to be processed
articles/ai-services/agents/includes/file-search/deep-dive.md

Lines changed: 112 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,112 @@
1+
2+
## Creating vector stores and adding files
3+
4+
You can create a vector store and add files to it in a single API call:
5+
6+
```python
7+
vector_store = project_client.agents.create_vector_store_and_poll(
8+
name="my_vector_store",
9+
file_ids=['file_path_1', 'file_path_2', 'file_path_3', 'file_path_4', 'file_path_5']
10+
)
11+
```
12+
13+
Adding files to vector stores is an async operation. To ensure the operation is complete, we recommend that you use the 'create and poll' helpers in our official SDKs. If you're not using the SDKs, you can retrieve the `vector_store` object and monitor its `file_counts` property to see the result of the file ingestion operation.
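
If you poll manually, the loop can look like the following minimal sketch. It's not an SDK API: `wait_for_ingestion` and its `get_file_counts` callable are hypothetical names, and the sketch assumes `file_counts` can be read as a mapping with an `in_progress` key. Supply a callable that retrieves the `vector_store` object and returns its `file_counts`.

```python
import time
from typing import Callable, Mapping


def wait_for_ingestion(
    get_file_counts: Callable[[], Mapping[str, int]],
    poll_interval_s: float = 1.0,
    timeout_s: float = 60.0,
) -> Mapping[str, int]:
    """Poll until no files are in progress, then return the last counts.

    get_file_counts is a hypothetical callable that returns the current
    file_counts of a vector store as a mapping (assumed shape).
    """
    deadline = time.monotonic() + timeout_s
    while True:
        counts = get_file_counts()
        # Ingestion is finished once nothing is left in progress.
        if counts.get("in_progress", 0) == 0:
            return counts
        # Give up instead of looping forever if processing stalls.
        if time.monotonic() >= deadline:
            raise TimeoutError(f"ingestion still in progress: {dict(counts)}")
        time.sleep(poll_interval_s)
```

Passing a callable keeps the loop independent of any particular client, so the same helper can be exercised with a stub in tests.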
14+
15+
Files can also be added to a vector store after it's created by creating vector store files.
16+
17+
```python
18+
19+
# create a vector store with no file and wait for it to be processed
20+
vector_store = project_client.agents.create_vector_store_and_poll(data_sources=[], name="sample_vector_store")
21+
print(f"Created vector store, vector store ID: {vector_store.id}")
22+
23+
# add the file to the vector store or you can supply file ids in the vector store creation
24+
vector_store_file_batch = project_client.agents.create_vector_store_file_batch_and_poll(
25+
vector_store_id=vector_store.id, file_ids=[file.id]
26+
)
27+
print(f"Created vector store file batch, vector store file batch ID: {vector_store_file_batch.id}")
28+
29+
```
30+
31+
Alternatively, you can add several files to a vector store by creating batches of up to 500 files.
32+
33+
```python
34+
batch = project_client.agents.create_vector_store_file_batch_and_poll(
35+
vector_store_id=vector_store.id,
36+
file_ids=['file_1', 'file_2', 'file_3', 'file_4', 'file_5']
37+
)
38+
```
39+
40+
### Basic agent setup: Deleting files from vector stores
41+
Files can be removed from a vector store by either:
42+
43+
* Deleting the vector store file object or,
44+
* Deleting the underlying file object (which removes it from all vector_store and code_interpreter configurations across all agents and threads in your organization)
45+
46+
The maximum file size is 512 MB. Each file should contain no more than 5,000,000 tokens (computed automatically when you attach a file).
47+
48+
49+
## Remove vector store
50+
51+
You can remove a vector store from the file search tool.
52+
53+
```python
54+
file_search_tool.remove_vector_store(vector_store.id)
55+
print(f"Removed vector store from file search, vector store ID: {vector_store.id}")
56+
57+
project_client.agents.update_agent(
58+
assistant_id=agent.id, tools=file_search_tool.definitions, tool_resources=file_search_tool.resources
59+
)
60+
print(f"Updated agent, agent ID: {agent.id}")
61+
62+
```
63+
64+
## Deleting vector stores
65+
```python
66+
project_client.agents.delete_vector_store(vector_store.id)
67+
print("Deleted vector store")
68+
```
69+
70+
## Ensuring vector store readiness before creating runs
71+
72+
We highly recommend that you ensure all files in a vector_store are fully processed before you create a run. This ensures that all the data in your vector store is searchable. You can check for vector store readiness by using the polling helpers in the SDKs, or by manually polling the `vector_store` object to ensure the status is completed.
73+
74+
As a fallback, there's a 60-second maximum wait in the run object when the thread's vector store contains files that are still being processed. This is to ensure that any files your users upload in a thread are fully searchable before the run proceeds. This fallback wait does not apply to the agent's vector store.
75+
76+
## Managing costs with expiration policies
77+
78+
For basic agent setup, the `file_search` tool uses the `vector_stores` object as its resource, and you are billed based on the size of the `vector_store` objects created. The size of the vector store object is the sum of all the parsed chunks from your files and their corresponding embeddings.
79+
80+
In order to help you manage the costs associated with these vector_store objects, we have added support for expiration policies in the `vector_store` object. You can set these policies when creating or updating the `vector_store` object.
81+
82+
```python
83+
vector_store = project_client.agents.create_vector_store_and_poll(
84+
name="Product Documentation",
85+
file_ids=['file_1', 'file_2', 'file_3', 'file_4', 'file_5'],
86+
expires_after={
87+
"anchor": "last_active_at",
88+
"days": 7
89+
}
90+
)
91+
```
92+
93+
### Thread vector stores have default expiration policies
94+
95+
Vector stores created using thread helpers (like `tool_resources.file_search.vector_stores` in Threads or `message.attachments` in Messages) have a default expiration policy of seven days after they were last active (defined as the last time the vector store was part of a run).
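
To make the timing concrete, here is a small sketch of how a `last_active_at` anchor combines with a `days` value. It assumes expiry is simply the last-active time plus the configured number of days; `expiry_time` and `is_expired` are hypothetical helper names for illustration, not SDK functions, and the service's exact cutoff may differ.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def expiry_time(last_active_at: datetime, days: int) -> datetime:
    """Return the assumed expiry moment: last activity plus `days`."""
    return last_active_at + timedelta(days=days)


def is_expired(last_active_at: datetime, days: int, now: Optional[datetime] = None) -> bool:
    """Report whether the assumed expiry moment has passed."""
    current = now if now is not None else datetime.now(timezone.utc)
    return current >= expiry_time(last_active_at, days)
```

Under this assumption, each run that uses the vector store moves `last_active_at` forward, which pushes the expiry moment out by the same amount.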
96+
97+
When a vector store expires, the runs on that thread fail. To fix this, you can recreate a new vector_store with the same files and reattach it to the thread.
98+
99+
```python
100+
all_files = list(client.beta.vector_stores.files.list("vs_expired"))
101+
102+
vector_store = client.beta.vector_stores.create(name="rag-store")
103+
client.beta.threads.update(
104+
"thread_abc123",
105+
tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}},
106+
)
107+
108+
for file_batch in chunked(all_files, 100):
109+
client.beta.vector_stores.file_batches.create_and_poll(
110+
vector_store_id=vector_store.id, file_ids=[file.id for file in file_batch]
111+
)
112+
```
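
The snippet above calls `chunked` without defining or importing it. One option is `chunked` from the third-party `more_itertools` package. If you'd rather avoid the dependency, a minimal standard-library sketch with the same name, offered here as an assumption about what the snippet intends, could be:

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items from `items`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        # Take the next `size` items; the final batch may be shorter.
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch
```

Any batch size up to the 500-file limit mentioned earlier works here; the snippet uses 100.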

articles/ai-services/agents/includes/file-search/upload-files-code-examples.md

Lines changed: 13 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -14,7 +14,7 @@ Create a client object, that contains the connection string for connecting to yo
1414
```python
1515
import os
1616
from azure.ai.projects import AIProjectClient
17-
from azure.ai.projects.models import FileSearchTool
17+
from azure.ai.projects.models import FileSearchTool, MessageAttachment, FilePurpose
1818
from azure.identity import DefaultAzureCredential
1919

2020

@@ -42,29 +42,23 @@ using NUnit.Framework;
4242
// At the moment, it should be in the format "<HostName>;<AzureSubscriptionId>;<ResourceGroup>;<ProjectName>"
4343
// Customer needs to login to Azure subscription via Azure CLI and set the environment variables
4444
var connectionString = TestEnvironment.AzureAICONNECTIONSTRING;
45-
AgentsClient client = new AgentsClient(connectionString, new DefaultAzureCredentia());
45+
AgentsClient client = new AgentsClient(connectionString, new DefaultAzureCredential());
4646
```
4747

4848
---
4949

5050
## Step 2: Upload files and add them to a Vector Store
5151

5252
To access your files, the file search tool uses the vector store object. Upload your files and create a vector store to contain them. Once the vector store is created, you should poll its status until all files are out of the `in_progress` state to ensure that all content has finished processing. The SDK provides helpers for uploading and polling.
53-
54-
Vector stores are created using message attachments that have a default expiration policy of seven days after they were last active (defined as the last time the vector store was part of a run). This default exists to help you manage your vector storage costs. You can override these expiration policies at any time.
55-
5653
# [Python](#tab/python)
5754

5855
```python
5956
# We will upload the local file and will use it for vector store creation.
6057

6158
#upload a file
62-
file = project_client.agents.upload_file_and_poll(file_path='./data/product_info_1.md', purpose="assistants")
59+
file = project_client.agents.upload_file_and_poll(file_path='./data/product_catelog.md', purpose=FilePurpose.AGENTS)
6360
print(f"Uploaded file, file ID: {file.id}")
6461

65-
_, asset_uri = project_client.upload_file("./data/product_info_1.md")
66-
print(f"Uploaded file, asset URI: {asset_uri}")
67-
6862
# create a vector store with the file you uploaded
6963
vector_store = project_client.agents.create_vector_store_and_poll(file_ids=[file.id], name="my_vectorstore")
7064
print(f"Created vector store, vector store ID: {vector_store.id}")
@@ -91,7 +85,7 @@ VectorStore vectorStore = await client.CreateVectorStoreAsync(
9185
```
9286
---
9387

94-
## Step 3: Enable file search
88+
## Step 3: Create an agent with access to file search
9589

9690
To make the files accessible to your agent, create a `FileSearchTool` object with the `vector_store` ID, and attach `tools` and `tool_resources` to the agent.
9791

@@ -132,15 +126,22 @@ Agent agent = agentResponse.Value;
132126
---
133127

134128
## Step 4: Create a thread
135-
129+
You can also attach files as message attachments on your thread. Doing so creates another `vector_store` associated with the thread or, if a vector store is already attached to this thread, adds the new files to the existing thread vector store. When you create a run on this thread, the file search tool queries both the `vector_store` from your agent and the `vector_store` on the thread.
136130
# [Python](#tab/python)
137131

138132
```python
139133
thread = project_client.agents.create_thread()
140134
print(f"Created thread, thread ID: {thread.id}")
141135

136+
# Upload the user-provided file to use as a message attachment
137+
message_file = project_client.agents.upload_file_and_poll(file_path='product_info_1.md', purpose=FilePurpose.AGENTS)
138+
print(f"Uploaded file, file ID: {message_file.id}")
139+
140+
# Create a message with the file search attachment
141+
# Note that a temporary vector store is created when using attachments, with a default expiration policy of seven days.
142+
attachment = MessageAttachment(file_id=message_file.id, tools=FileSearchTool().definitions)
142143
message = project_client.agents.create_message(
143-
thread_id=thread.id, role="user", content="What feature does Smart Eyewear offer?"
144+
thread_id=thread.id, role="user", content="What feature does Smart Eyewear offer?", attachments=[attachment]
144145
)
145146
print(f"Created message, message ID: {message.id}")
146147
```
