Commit 8c84ffd

Add vector search exercise
1 parent 2a5ea25 commit 8c84ffd

2 files changed: +212 -1 lines changed

README.md

Lines changed: 174 additions & 1 deletion
@@ -15,7 +15,7 @@ In this workshop, you’ll build a **Grocery Shopping AI agent** step by step. E
2. How could this data be useful to a shopping agent?
3. Are there any unusual fields in the documents?

-## Exercise 1: Initialize the Agent
+## Exercise 1: Initialize the Agent with Google ADK

In this step, you’ll create your first AI Agent with ADK. At this stage, the agent won’t have any tools, which means it won’t be able to do much yet. This will demonstrate why tools are essential.

@@ -84,3 +84,176 @@ In this step, you’ll create your first AI Agent with ADK. At this stage, the a
**<span aria-hidden="true">👉</span> Discussion point:**
What risks do you see if an agent makes up products or fetches information from outside sources instead of the inventory?

## Exercise 2: Find Similar Products with MongoDB Atlas Vector Search

In this exercise, you’ll add a tool that lets the agent find products relevant to a user’s question. To do this, you’ll use **vector search**, which compares the embedding of the user’s query with the stored product embeddings and returns the semantically closest products.
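
To make "semantically similar" concrete, here is a minimal sketch of cosine similarity, the measure that the index definition in this commit selects (`"similarity": "cosine"`). The function names and the three-dimensional vectors are made up for illustration; real embeddings have far more dimensions, and Atlas performs this comparison inside the index rather than in Python.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vector, products):
    """Sort (name, vector) pairs from most to least similar to the query."""
    scored = [(name, cosine_similarity(query_vector, vec)) for name, vec in products]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: product names paired with invented 3-value vectors.
toy_products = [
    ("sourdough bread", [0.9, 0.1, 0.0]),
    ("whole wheat loaf", [0.8, 0.3, 0.1]),
    ("dish soap", [0.0, 0.2, 0.9]),
]
toy_query = [1.0, 0.2, 0.0]

ranking = rank_by_similarity(toy_query, toy_products)
print([name for name, _ in ranking])
```

In this toy data, the two bread-like vectors point in nearly the same direction as the query, so they rank ahead of the unrelated product.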

### Step 1: Generate Embeddings

1. Open the MongoDB extension by clicking the green MongoDB leaf in the sidebar.
2. Expand the **Groceries Database** connection, then the **grocery_store** database, and finally the **inventory** collection.
3. Open any document. Notice the **gemini_embedding** field: it already contains the product’s vector embedding.

We pre-generated these embeddings to save time and resources. Otherwise, every workshop attendee would have to rerun the same embedding process, producing identical vectors at unnecessary cost. For this exercise, you can work directly with the stored vectors.

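The stored vector can also be inspected from Python instead of the extension. The sketch below counts the values in a document's embedding field; the helper name `embedding_length` and the `product` field in the sample document are hypothetical, while `gemini_embedding`, `grocery_store`, and `inventory` come from the steps above.

```python
def embedding_length(document: dict, field: str = "gemini_embedding") -> int:
    """Return the number of values stored in the document's embedding field."""
    vector = document.get(field)
    if not isinstance(vector, list):
        raise ValueError(f"field {field!r} is missing or is not an array")
    return len(vector)


# Stand-in document with an invented 4-value vector; real documents hold far more.
sample_document = {"product": "example item", "gemini_embedding": [0.1, 0.2, 0.3, 0.4]}
print(embedding_length(sample_document))

# Against the workshop database, the same helper could be used like this
# (assumes the CONNECTION_STRING environment variable is set):
#   import os, pymongo
#   client = pymongo.MongoClient(os.environ["CONNECTION_STRING"])
#   doc = client["grocery_store"]["inventory"].find_one()
#   print(embedding_length(doc))
```
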
### Step 2: Create a Vector Search Index

A vector search index is a specialized data structure optimized for similarity search. It lets MongoDB compare vectors efficiently and return the closest matches.

> **<span aria-hidden="true">📗</span> Extra credit:** If you’d like to dive deeper into how vector search indexes work in MongoDB, check out [this video](https://www.youtube.com/watch?v=AvCuiRs2cxw).

**<span aria-hidden="true">💡</span> Key things to know:**
- You only need to create the index once. As new documents with vectors are added, MongoDB keeps the index updated automatically.
- You can create the index from any MongoDB driver, the MongoDB Shell, or directly in Atlas. For this workshop, you’ll define it programmatically using the Python driver, which makes the process reproducible.

#### Task: Update the Index Creation Script

Open **`mongodb_groceries_agent/create-vector-search-index.py`** and fill in the placeholders:

1. `<DATABASE_NAME>`: the name of the database you explored through the MongoDB extension
2. `<COLLECTION_NAME>`: the collection with grocery products
3. `<VECTOR_FIELD_IN_THE_DOCUMENT>`: the field storing the embedding (the numeric array)
4. `<LENGTH_OF_THE_VECTOR>`: the size of the embedding array
5. `<VECTOR_SEARCH_DEFINITION>`: the index definition variable already declared in the script; pass it to the method

In the bottom panel, open a new terminal tab and run the script to create the index:

```bash
python mongodb_groceries_agent/create-vector-search-index.py
```

Creating the index takes only a few seconds for the 5,000 documents in the dataset. The script pauses for 10 seconds and then prints the collection’s search indexes. At that point, the vector search index should be ready for the agent to use to find similar products.

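A fixed 10-second pause is fine for a dataset of this size. For larger collections, one alternative is to poll until the index reports that it can serve queries. The sketch below is not part of the workshop script: `wait_until_queryable` and `FakeCollection` are hypothetical names, and it assumes each document returned by PyMongo's `list_search_indexes` carries a boolean `queryable` field, as Atlas search index listings do.

```python
import time


def wait_until_queryable(collection, index_name, timeout_s=60.0, poll_s=2.0):
    """Poll the collection's search indexes until index_name is queryable.

    Returns True once the index reports queryable, or False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        for index in collection.list_search_indexes(index_name):
            if index.get("queryable"):
                return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


class FakeCollection:
    """Stand-in for a PyMongo collection: queryable from the third poll onward."""

    def __init__(self, ready_after=3):
        self.calls = 0
        self.ready_after = ready_after

    def list_search_indexes(self, name=None):
        self.calls += 1
        return [{"name": name, "queryable": self.calls >= self.ready_after}]


fake = FakeCollection()
print(wait_until_queryable(fake, "vector_index", timeout_s=5.0, poll_s=0.0))
```

With a real collection, the call would be `wait_until_queryable(collection, "vector_index")`, placed after `create_search_index`.
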
### Step 3: Implement the Vector Search Tool

With the index in place, let’s wire up the agent so it can actually use it.

In this step, you’ll:
- Configure placeholders for your database and collection.
- Define a helper to generate embeddings with Gemini.
- Implement the `find_similar_products` tool that performs a vector search against MongoDB.
- Register the tool with the agent so it becomes part of the shopping workflow.

Open **`mongodb_groceries_agent/agent.py`** and add the following code:

```python
# Imports needed by this snippet (skip any that agent.py already contains)
import os

import pymongo
from google import genai
from google.adk.agents import Agent
from google.genai import types

# Initialize the GenAI client to vectorize the user queries
genai_client = genai.Client()
# Initialize the MongoDB client to communicate with the database
CONNECTION_STRING = os.environ.get("CONNECTION_STRING")
database_client = pymongo.MongoClient(CONNECTION_STRING)

DATABASE_NAME = "grocery_store"
INVENTORY_COLLECTION_NAME = "inventory"

# 3. Helper function: Generate embeddings for a user query
def generate_embeddings(query):
    """Generate embeddings for the user query using the Gemini embedding model."""
    result = genai_client.models.embed_content(
        model="gemini-embedding-001",
        contents=query,
        # 1. Replace with the desired size of the vector. This must match the vector size in the document.
        config=types.EmbedContentConfig(output_dimensionality=<OUTPUT_VECTOR_SIZE>)
    )
    return result.embeddings[0].values

# 4. Tool: Perform a vector search against MongoDB Atlas
def find_similar_products(query: str) -> str:
    """Search for products with names semantically similar to the query.

    Args:
        query: The user’s request (e.g., product name or description).
    Returns:
        A list of product documents with details (excluding embeddings),
        or an error message if the search fails.
    """
    vector_embeddings = generate_embeddings(query)

    pipeline = [
        {
            "$vectorSearch": {
                "index": "vector_index",  # <-- Leave as is. This is the name of the index you created in Step 2
                "path": "<VECTOR_FIELD_IN_THE_DOCUMENT>",  # <-- 2. Replace with the document field that holds the vector embedding
                "queryVector": vector_embeddings,
                "numCandidates": 100,
                "limit": 10
            },
        },
        {
            "$project": {
                "_id": 0,
                # 3. Replace with the document field that holds the embedding. Excluding it reduces
                # network traffic and the number of tokens the agent has to include in the LLM prompt.
                "<VECTOR_FIELD_IN_THE_DOCUMENT>": 0
            }
        }
    ]

    try:
        documents = database_client[DATABASE_NAME][INVENTORY_COLLECTION_NAME].aggregate(pipeline).to_list()
        return documents
    except pymongo.errors.OperationFailure:
        return "Failed to find similar products."

instruction = """
You are the **Online Groceries Agent**, a friendly and helpful virtual assistant for our e-commerce grocery store.
Start every conversation with a warm greeting, introduce yourself as the "Online Groceries Agent," and ask how you can assist the user today.
Your role is to guide customers through their shopping experience.

What you can do:
- Help users discover and explore products in the store.
- Suggest alternatives when the exact item is not available.
- Add products to the user’s shopping cart.
- Answer product-related questions in a clear and concise way.
- Return the total in the user’s shopping cart.

Available tools:
1. **find_similar_products**: Search for products with names semantically similar to the user’s request.
2. **add_to_cart**: Add a product to the user’s cart in MongoDB. Pass only the product name (as it appears in the inventory collection) and the user’s username.
3. **calculate_cart_total**: Sum the total of all products in a user’s cart and return it. Pass the user’s username.

Core guidelines:
- **Always search first**: If a user asks for a product, call `find_similar_products` before attempting to add it to the cart.
- **Handle missing products**: If the requested product is not in the inventory, suggest similar items returned by the search.
- **Parallel tool use**: You may call multiple tools in parallel when appropriate (e.g., searching for several items at once).
- **Clarify only when necessary**: Ask for more details if the request is unclear and you cannot perform a search.
- Keep your tone positive, approachable, and customer-focused throughout the interaction.

Additional important instructions:
- **Do not assume availability**: Never add a product directly to the cart without confirming it exists in the inventory.
- **Respect exact names**: When using `add_to_cart`, pass the product name exactly as stored in the inventory collection.
- **Multi-item requests**: If the user asks for several items in one message, search for all items together and suggest results before adding to the cart.
- **Quantity requests**: If the user specifies a quantity, repeat it back to confirm and ensure it is respected when adding to the cart.
- **Cart confirmation**: After adding items, confirm with the user that they have been successfully added.
- **Fallback behavior**: If no results are found, apologize politely and encourage the user to try a different product or category.
- **Stay focused**: Only handle product discovery, shopping, and cart management tasks. Politely decline requests unrelated to groceries.
- **Answering product questions**: If the question is about a product (e.g., "Is this organic?" or "How much does it cost?"), use the search results to answer. If the information is not available, respond transparently that you don’t have that detail.

Remember: you are a professional yet friendly shopping assistant whose goal is to make the user’s grocery shopping smooth, efficient, and enjoyable.
"""

# 5. Define the agent and register the tools
root_agent = Agent(
    model="gemini-2.5-flash",
    name="grocery_shopping_agent",
    instruction=instruction,
    tools=[
        find_similar_products,
    ],
)
```
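
The two `$vectorSearch` numbers play different roles: `numCandidates` is how many nearest neighbors the index considers, and `limit` is how many documents the stage returns, so `numCandidates` cannot be smaller than `limit`. The sketch below builds the same two-stage pipeline from parameters and checks that relationship up front. `build_vector_search_pipeline` is a hypothetical helper, not part of the workshop code, and the query vector in the example is invented.

```python
def build_vector_search_pipeline(
    query_vector,
    vector_field,
    index_name="vector_index",
    limit=10,
    num_candidates=100,
):
    """Build a $vectorSearch + $project pipeline like the one in the tool above."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    if num_candidates < limit:
        raise ValueError("num_candidates must be greater than or equal to limit")
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": vector_field,
                "queryVector": list(query_vector),
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        # Drop _id and the embedding itself so results stay small.
        {"$project": {"_id": 0, vector_field: 0}},
    ]


# Invented 3-value query vector; a real one comes from the embedding model.
example_pipeline = build_vector_search_pipeline([0.1, 0.2, 0.3], "gemini_embedding", limit=5)
print(example_pipeline[0]["$vectorSearch"]["limit"])
```

Raising `num_candidates` generally trades some latency for better recall, which is why it is kept well above `limit` in the tool.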

Finally, restart the agent with the following command:

```bash
adk web
```

Hold CMD (Mac) or CTRL (Windows/Linux) and click the link: http://127.0.0.1:8000. As before, this opens the development UI where you can chat with your agent.

Try asking your agent:

```
Find me sourdough bread in the inventory.
```

**<span aria-hidden="true">👉</span> Discussion point:**
Does the agent respond differently? Is it running any tools? What happens when you click on the tool execution boxes?
Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
import os
import pymongo
import time
import pprint

CONNECTION_STRING = os.environ.get("CONNECTION_STRING")
database_client = pymongo.MongoClient(CONNECTION_STRING)

DATABASE_NAME = "<DATABASE_NAME>"  # <-- 1. Insert the correct database name here
COLLECTION_NAME = "<COLLECTION_NAME>"  # <-- 2. Insert the correct collection name here

collection = database_client[DATABASE_NAME][COLLECTION_NAME]

index_definition = {
    "name": "vector_index",
    "type": "vectorSearch",
    "definition": {
        "fields": [
            {
                "path": "<VECTOR_FIELD_IN_THE_DOCUMENT>",  # <-- 3. Insert the correct field name here
                "numDimensions": <LENGTH_OF_THE_VECTOR>,  # <-- 4. Insert the correct vector length here
                "similarity": "cosine",
                "type": "vector"
            }
        ]
    }
}

print("Creating the vector search index...")
collection.create_search_index(<VECTOR_SEARCH_DEFINITION>)  # <-- 5. Use the variable defined above

print("Waiting 10 seconds for the index to finish building...")
time.sleep(10)

indexes = list(collection.list_search_indexes())
print("Search indexes: ")
pprint.pp(indexes)
