---
title: "Part 8: Agentic AI and Qdrant: Building semantic memory with MCP protocol"
date: 2025-07-21T10:50:25.839Z
author: Dinesh R Singh
authorimage: /img/dinesh-192-192.jpg
disable: false
tags:
  - MCP
  - Agentic AI
  - Generative AI
---
<style>
li {
  font-size: 27px;
  line-height: 33px;
  max-width: none;
}
</style>

As **Agentic AI** systems evolve from reactive language models into structured thinkers, a new challenge emerges: **how do we give these agents memory?** Not just basic logs or static files, but real, **searchable memory** that understands and adapts to context over time.

This is where tools like **Qdrant** and the **Model Context Protocol (MCP)** come in—a modular pairing that brings semantic search and long-term knowledge storage into agent workflows. Together, they enable agents not only to recall relevant information but also to reason across past experiences, making **Agentic AI** systems more intelligent, adaptive, and human-like in their decision-making.

[Inspired by my Medium post](https://dineshr1493.medium.com/all-you-need-to-know-about-the-evolution-of-generative-ai-to-agentic-ai-part-8-agentic-ai-mcp-281567e26838), this article explores how **MCP**, the **Model Context Protocol**—a kind of connective tissue between LLMs and external tools or data sources—**standardizes interactions** between intelligent agents and vector databases like **Qdrant**. By enabling seamless storage and retrieval of embeddings, agents can now “remember” useful information and leverage it in future reasoning.

Let’s walk through the full architecture and code implementation of this cutting-edge combination.

## LLMs + MCP + Database = Thoughtful Agentic AI

In Agentic AI, a language model doesn’t just generate — it thinks, acts, and reflects using external tools. That’s where MCP comes in.

Think of MCP as a “USB interface” for AI — it lets agents plug into tools like Qdrant, APIs, or structured databases using a consistent protocol.

Qdrant itself is a high-performance vector database capable of powering semantic search and knowledge retrieval, and of acting as long-term memory for AI agents. However, direct integration with agents can be messy and non-standardized.

This is solved by wrapping Qdrant inside an MCP server, giving agents a semantic API they can call like a function.
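
Concretely, “calling it like a function” means the agent invokes a named tool over the protocol. As a one-line preview (`session` is the MCP client session built in Step 2 below):

```python
# Ask the Qdrant MCP server for semantically similar memories
results = await session.call_tool("qdrant-find", arguments={"query": "delayed orders"})
```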

### Architecture overview

```text
[LLM Agent]
   |
   |-- [MCP Client]
[MCP Protocol]
   |
   |-- [Qdrant MCP Server]
   |     |-- Tool: qdrant-store
   |     |-- Tool: qdrant-find
   |
[Qdrant Vector DB]
```

### Use case: Support ticket memory for AI assistants

Imagine an AI assistant answering support queries.

* It doesn't have all answers built in.
* But it has semantic memory from prior support logs stored in Qdrant.
* It uses `qdrant-find` to semantically retrieve similar issues.
* It then formulates a contextual response.

## Step-by-step implementation

### Step 1: Launch Qdrant MCP Server

```shell
export COLLECTION_NAME="support-tickets"
export QDRANT_LOCAL_PATH="./qdrant_local_db"
export EMBEDDING_MODEL="sentence-transformers/all-MiniLM-L6-v2"
```

```shell
uvx mcp-server-qdrant --transport sse
```
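
Note: `--transport sse` exposes the server as a long-running HTTP endpoint using Server-Sent Events. The client code in Step 2 uses the stdio transport instead and spawns its own server process, so the SSE launch is optional if you only follow the Python examples below.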

### Key parameters:

* `COLLECTION_NAME`: Name of the Qdrant collection
* `QDRANT_LOCAL_PATH`: Local vector DB storage path
* `EMBEDDING_MODEL`: Embedding model for vectorization
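
If you already run a standalone Qdrant instance, the server can point at it instead of a local embedded store. A minimal sketch, assuming the `QDRANT_URL` and `QDRANT_API_KEY` variables supported by `mcp-server-qdrant` and a placeholder endpoint:

```shell
# Point the MCP server at a remote Qdrant instance
# (set either QDRANT_URL or QDRANT_LOCAL_PATH, not both)
export QDRANT_URL="https://your-qdrant-host:6333"   # placeholder endpoint
export QDRANT_API_KEY="your-api-key"                # placeholder key
export COLLECTION_NAME="support-tickets"
```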

### Step 2: Connect the MCP Client

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # stdio_client below will spawn its own mcp-server-qdrant process
    # with this configuration, so Step 1's SSE server is not required here
    server_params = StdioServerParameters(
        command="uvx",
        args=["mcp-server-qdrant"],
        env={
            "QDRANT_LOCAL_PATH": "./qdrant_local_db",
            "COLLECTION_NAME": "support-tickets",
            "EMBEDDING_MODEL": "sentence-transformers/all-MiniLM-L6-v2"
        }
    )
```

```python
    # Still inside main(): open the transport and start a client session
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print(tools)

asyncio.run(main())
```

```
Expected output: lists tools like qdrant-store, qdrant-find
```
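
`list_tools()` returns a result object whose `.tools` attribute holds the advertised tools. To print them individually (a small sketch based on the MCP Python SDK):

```python
# Show each tool's name and description
for tool in tools.tools:
    print(f"{tool.name}: {tool.description}")
```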

### Step 3: Ingest a new memory

```python
# Runs inside the ClientSession block from Step 2:
# store one support-ticket memory; the server embeds the text for us
ticket_info = "Order #1234 was delayed due to heavy rainfall in transit zone."
result = await session.call_tool("qdrant-store", arguments={
    "information": ticket_info,
    "metadata": {"order_id": 1234}
})
```

This stores an embedded version of the text in Qdrant.
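
In practice you would seed the memory with many past tickets, not just one. A minimal batch sketch under the same session, using hypothetical ticket data:

```python
# Hypothetical historical tickets to seed the memory
past_tickets = [
    {"text": "Order #5678 was refunded after damaged packaging was reported.", "order_id": 5678},
    {"text": "Order #9012 was rerouted because the delivery address changed.", "order_id": 9012},
]

for ticket in past_tickets:
    await session.call_tool("qdrant-store", arguments={
        "information": ticket["text"],
        "metadata": {"order_id": ticket["order_id"]},
    })
```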

### Step 4: Perform a semantic search

```python
# Retrieve memories semantically similar to the user's question
query = "Why was order 1234 delayed?"
search_response = await session.call_tool("qdrant-find", arguments={
    "query": query
})
```

### Example output:

```
[
  {
    "content": "Order #1234 was delayed due to heavy rainfall in transit zone.",
    "metadata": {"order_id": 1234}
  }
]
```

### Step 5: Use with LLM

```python
import openai

# call_tool returns a result object; its .content is a list of text items,
# so join their .text fields into a single context block
context = "\n".join(
    item.text for item in search_response.content if hasattr(item, "text")
)

prompt = f"""
You are a helpful assistant. Use this context to answer:

{context}

Question: Why was order #1234 delayed?
"""

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}]
)
print(response["choices"][0]["message"]["content"])
```

### Final answer:

```
"Order #1234 was delayed due to heavy rainfall in the transit zone."
```

## Parameter reference

<table border="1" cellpadding="8" cellspacing="0" style="border-collapse: collapse; width: 100%;">
  <thead style="background-color:#f2f2f2">
    <tr>
      <th>Tool</th>
      <th>Parameter</th>
      <th>Description</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code>qdrant-store</code></td>
      <td><code>information</code></td>
      <td>Raw string to embed</td>
    </tr>
    <tr>
      <td><code>qdrant-store</code></td>
      <td><code>metadata</code></td>
      <td>Optional metadata for filtering</td>
    </tr>
    <tr>
      <td><code>qdrant-find</code></td>
      <td><code>query</code></td>
      <td>Natural language query</td>
    </tr>
    <tr>
      <td>env var</td>
      <td><code>EMBEDDING_MODEL</code></td>
      <td>Model used to create embeddings</td>
    </tr>
    <tr>
      <td>env var</td>
      <td><code>COLLECTION_NAME</code></td>
      <td>Qdrant vector collection name</td>
    </tr>
  </tbody>
</table>

## Pro tip: Chain MCP servers

You can deploy multiple MCP servers for different tools and plug them into agent workflows:

* `qdrant-find` for memory
* `google-search` for web data
* `postgres-query` for structured facts

Then, orchestrate it all using Agentic AI Teams to perform high-level, multi-tool reasoning.
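
As a sketch of what that wiring can look like, the snippet below opens one MCP session per server and prints the tools each exposes. Only `mcp-server-qdrant` comes from this article; the other command names are placeholders for whichever MCP servers you actually run:

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# One entry per tool server; the non-Qdrant commands are placeholders
SERVERS = {
    "memory": StdioServerParameters(command="uvx", args=["mcp-server-qdrant"]),
    "web": StdioServerParameters(command="uvx", args=["mcp-server-google-search"]),  # hypothetical
    "facts": StdioServerParameters(command="uvx", args=["mcp-server-postgres"]),     # hypothetical
}

async def discover_tools():
    # Open each server in turn and list what it offers the agent
    for name, params in SERVERS.items():
        async with stdio_client(params) as (read, write):
            async with ClientSession(read, write) as session:
                await session.initialize()
                result = await session.list_tools()
                print(name, "->", [tool.name for tool in result.tools])

asyncio.run(discover_tools())
```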

## Final thoughts

By pairing Qdrant with MCP, Agentic AI gains powerful semantic memory — a critical enabler of contextual understanding and long-term knowledge retention. This pattern abstracts the complexity of vector databases behind a unified protocol, empowering agents to think, recall, and act without manual data plumbing.

As the AI stack modularizes further, approaches like this will form the backbone of scalable, pluggable, and intelligent multi-agent ecosystems.