---
title: "Part 8: Agentic AI and Qdrant: Building Semantic Memory with MCP Protocol"
date: 2025-07-21T10:50:25.839Z
author: Dinesh R Singh
authorimage: /img/dinesh-192-192.jpg
disable: false
---

As Agentic AI systems evolve from reactive language models into structured thinkers, a new challenge emerges — how do we give these agents memory? Not just logs or files, but real, searchable memory that understands context. Enter Qdrant and the Model Context Protocol (MCP) — a modular pairing that brings semantic search and knowledge storage to agent workflows.

Inspired by a Medium post by Dinesh R, this article explores how MCP standardizes interactions between intelligent agents and vector databases like Qdrant. Because MCP makes storing and retrieving embeddings seamless, agents can "remember" useful information and draw on it in future reasoning.

Let's walk through the full architecture and code implementation of this pattern.

## Why This Matters: Agentic AI + MCP

In Agentic AI, a language model doesn't just generate — it thinks, acts, and reflects using external tools. That's where MCP comes in.

Think of MCP as a "USB interface" for AI — it lets agents plug into tools like Qdrant, APIs, or structured databases using a consistent protocol.

Qdrant itself is a high-performance vector database, capable of powering semantic search, knowledge retrieval, and long-term memory for AI agents. However, direct integration with agents can be messy and non-standardized.

This is solved by wrapping Qdrant inside an MCP server, giving agents a semantic API they can call like a function.

### Architecture Overview

```
LLM Agent
 |
 |-- [MCP Client]
 |
[MCP Protocol]
 |
 |-- [Qdrant MCP Server]
 |   |-- Tool: qdrant-store
 |   |-- Tool: qdrant-find
 |
[Qdrant Vector DB]
```

### Use Case: Support Ticket Memory for AI Assistants

Imagine an AI assistant answering support queries.

* It doesn't have all the answers built in.
* But it does have semantic memory from prior support logs stored in Qdrant.
* It uses qdrant-find to semantically retrieve similar issues, then formulates a contextual response.

## Step-by-Step Implementation

### Step 1: Launch Qdrant MCP Server

```
export COLLECTION_NAME="support-tickets"
export QDRANT_LOCAL_PATH="./qdrant_local_db"
export EMBEDDING_MODEL="sentence-transformers/all-MiniLM-L6-v2"
```

```
uvx mcp-server-qdrant --transport sse
```

**Key Parameters:**

* COLLECTION_NAME: Name of the Qdrant collection
* QDRANT_LOCAL_PATH: Local vector DB storage path
* EMBEDDING_MODEL: Embedding model for vectorization

### Step 2: Connect the MCP Client

```
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Spawn the Qdrant MCP server as a subprocess and configure it via env vars.
    server_params = StdioServerParameters(
        command="uvx",
        args=["mcp-server-qdrant"],
        env={
            "QDRANT_LOCAL_PATH": "./qdrant_local_db",
            "COLLECTION_NAME": "support-tickets",
            "EMBEDDING_MODEL": "sentence-transformers/all-MiniLM-L6-v2"
        }
    )

    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print(tools)

asyncio.run(main())
```

Expected output: a tool listing that includes qdrant-store and qdrant-find.

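One wrinkle worth noting: Step 1 launched the server with `--transport sse`, while the client above spawns its own stdio instance of the server. To attach to the already-running SSE server instead, the MCP Python SDK also ships an SSE client. A minimal sketch, assuming the server listens on its default `http://localhost:8000/sse` endpoint (verify the port for your setup):

```
import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client

async def main():
    # Connect to the SSE server launched in Step 1 rather than spawning a new process.
    async with sse_client("http://localhost:8000/sse") as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            print(await session.list_tools())

asyncio.run(main())
```
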
### Step 3: Ingest a New Memory

```
ticket_info = "Order #1234 was delayed due to heavy rainfall in transit zone."

result = await session.call_tool("qdrant-store", arguments={
    "information": ticket_info,
    "metadata": {"order_id": 1234}
})
```

This stores an embedded version of the text in Qdrant.

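Since qdrant-store is just a tool call, seeding the collection from an existing backlog is a simple loop run inside the same session. A minimal sketch; the ticket texts and order IDs here are invented for illustration:

```
# Hypothetical backlog of past tickets to seed the semantic memory.
past_tickets = [
    (1235, "Order #1235 arrived damaged; a replacement shipped within 48 hours."),
    (1236, "Order #1236 was held at customs pending an invoice correction."),
]

# Each call embeds the text and stores it alongside its metadata.
for order_id, text in past_tickets:
    await session.call_tool("qdrant-store", arguments={
        "information": text,
        "metadata": {"order_id": order_id},
    })
```
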
### Step 4: Perform a Semantic Search

```
query = "Why was order 1234 delayed?"

search_response = await session.call_tool("qdrant-find", arguments={
    "query": query
})
```

Example output:

```json
[
  {
    "content": "Order #1234 was delayed due to heavy rainfall in transit zone.",
    "metadata": {"order_id": 1234}
  }
]
```

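Strictly speaking, in the MCP Python SDK call_tool returns a CallToolResult object rather than the bare list shown above; the matches arrive as text content items that need unpacking before they can be iterated as in Step 5. A minimal sketch of that bridging step, assuming one matched record per content item:

```
# Each match comes back as a TextContent entry on the result's .content list.
hits = [item.text for item in search_response.content]
for hit in hits:
    print(hit)
```
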
### Step 5: Use with LLM

```
import openai

context = "\n".join([r["content"] for r in search_response])

prompt = f"""
You are a helpful assistant. Use this context to answer:
\"\"\"
{context}
\"\"\"
Question: Why was order #1234 delayed?
"""

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}]
)

print(response["choices"][0]["message"]["content"])
```

```
Final Answer:
"Order #1234 was delayed due to heavy rainfall in the transit zone."
```

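One caveat for anyone running this today: openai.ChatCompletion was removed in openai>=1.0. With the current SDK, the same call looks like this (model name kept from the example above; assumes OPENAI_API_KEY is set in the environment):

```
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
)

print(response.choices[0].message.content)
```
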
## Parameter Reference

<table border="1" cellpadding="8" cellspacing="0" style="border-collapse: collapse; width: 100%;">
<thead style="background-color:#f2f2f2">
<tr>
<th>Tool</th>
<th>Parameter</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>qdrant-store</code></td>
<td><code>information</code></td>
<td>Raw string to embed</td>
</tr>
<tr>
<td><code>qdrant-store</code></td>
<td><code>metadata</code></td>
<td>Optional metadata for filtering</td>
</tr>
<tr>
<td><code>qdrant-find</code></td>
<td><code>query</code></td>
<td>Natural language query</td>
</tr>
<tr>
<td>env var</td>
<td><code>EMBEDDING_MODEL</code></td>
<td>Model used to create embeddings</td>
</tr>
<tr>
<td>env var</td>
<td><code>COLLECTION_NAME</code></td>
<td>Qdrant vector collection name</td>
</tr>
</tbody>
</table>

## Pro Tip: Chain MCP Servers

You can deploy multiple MCP servers for different tools and plug them into agent workflows:

* qdrant-find for memory
* google-search for web data
* postgres-query for structured facts

Then orchestrate it all using Agentic AI teams to perform high-level, multi-tool reasoning.

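A minimal sketch of what that wiring can look like with the MCP Python SDK: two stdio servers opened side by side and routed by name. Only mcp-server-qdrant comes from this article; the second command is an illustrative placeholder, not a verified package name:

```
import asyncio
from contextlib import AsyncExitStack

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Roster of MCP servers; "facts" uses a placeholder command for illustration.
SERVERS = {
    "memory": StdioServerParameters(
        command="uvx", args=["mcp-server-qdrant"],
        env={"QDRANT_LOCAL_PATH": "./qdrant_local_db",
             "COLLECTION_NAME": "support-tickets"},
    ),
    "facts": StdioServerParameters(command="uvx", args=["mcp-server-postgres"]),
}

async def main():
    async with AsyncExitStack() as stack:
        sessions = {}
        for name, params in SERVERS.items():
            read, write = await stack.enter_async_context(stdio_client(params))
            session = await stack.enter_async_context(ClientSession(read, write))
            await session.initialize()
            sessions[name] = session
        # An agent would route each sub-task to the matching server's tools.
        memories = await sessions["memory"].call_tool(
            "qdrant-find", arguments={"query": "order 1234 delay"})
        print(memories)

asyncio.run(main())
```
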
## Final Thought

By pairing Qdrant with MCP, Agentic AI gains powerful semantic memory — a critical enabler of contextual understanding and long-term knowledge retention. This pattern abstracts the complexity of vector DBs behind a unified protocol, empowering agents to think, recall, and act without manual data plumbing.

As the AI stack modularizes further, approaches like this will form the backbone of scalable, pluggable, and intelligent multi-agent ecosystems.
