Step 6. Use Gel's advanced features to create a RAG
====================================================

.. note::

   mention httpx-sse

At this point we have a decent search bot that can refine a search query over
multiple turns of a conversation.

It's time to add the final touch: we can make the bot remember previous similar
interactions with the user using retrieval-augmented generation (RAG).

To achieve this we need to implement similarity search across message history:
First, we need to add the `ai` extension to the schema in
the `dbschema/default.esdl`:
.. code-block:: sdl

    using extension ai;

    module default {
        # type definitions
    }

... and do the migration:

.. code-block:: bash

    $ gel migration create
    $ gel migrate

Next, we need to configure the API key in Gel for whatever embedding provider
we're going to be using. As per the documentation, let's open up the CLI by
typing `gel` and run the following command (assuming we're using OpenAI):

.. code-block:: edgeql

    searchbot:main> configure current database
    insert ext::ai::OpenAIProviderConfig {
        secret := 'sk-....',
    };

    OK: CONFIGURE DATABASE

In order to get Gel to automatically keep track of creating and updating message
embeddings, all we need to do is create a deferred index like this:

.. code-block:: sdl

    type Message {
        # ... properties as before
        deferred index ext::ai::index(embedding_model := 'text-embedding-3-small')
            on (.body);
    }

... and run a migration one more time.

And we're done! Gel is going to cook in the background for a while and generate
embedding vectors for our queries. To make sure nothing broke, we can follow
Gel's AI documentation and take a look at the instance logs:

.. code-block:: bash

    $ gel instance logs -I searchbot | grep api.openai.com

    INFO 50121 searchbot 2025-01-30T14:39:53.364 httpx: HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"

It's time to create the second half of the similarity search: the search query.
The query needs to fetch the `k` chats that contain the messages most similar
to our current message. This can be a little difficult to visualize in your
head, so here's the query itself:

.. code-block:: edgeql

    with
        user := (select User filter .name = <str>$username),
        chats := (select Chat filter .<chats[is User] = user)

    select chats {
        distance := min(
            ext::ai::search(
                .messages,
                <array<float32>>$embedding,
            ).distance,
        ),
        messages: {
            role, body, sources
        }
    }

    order by .distance
    limit <int64>$limit;

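To make the ranking concrete, here's a plain-Python sketch of what the query
computes, using toy two-dimensional embeddings and a hand-rolled cosine
distance (the actual distance function Gel uses depends on the index
configuration, so treat this purely as an illustration):

```python
# For each chat, take the minimum distance between the query embedding and
# any message embedding in that chat, then return the k closest chats.
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def closest_chats(chats: list[list[list[float]]], query_embedding: list[float], k: int) -> list[int]:
    # chats: one inner list of message embeddings per chat
    ranked = sorted(
        range(len(chats)),
        key=lambda i: min(cosine_distance(m, query_embedding) for m in chats[i]),
    )
    return ranked[:k]

chats = [
    [[1.0, 0.0], [0.0, 1.0]],  # chat 0 contains a message identical to the query
    [[-1.0, 0.0]],             # chat 1 points the opposite way
]
print(closest_chats(chats, [1.0, 0.0], k=1))  # → [0]
```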
Let's place it in `app/queries/search_chats.edgeql`, run the codegen, and modify
our `post_messages` endpoint to keep track of those similar chats.

.. code-block:: python

    from edgedb.ai import create_async_ai, AsyncEdgeDBAI
    from .queries.search_chats_async_edgeql import (
        search_chats as search_chats_query,
    )

    @app.post("/messages", status_code=HTTPStatus.CREATED)
    async def post_messages(
        search_terms: SearchTerms,
        username: str = Query(),
        chat_id: str = Query(),
    ) -> SearchResult:
        # 1. Fetch chat history
        chat_history = await get_messages_query(
            gel_client, username=username, chat_id=chat_id
        )

        # 2. Add incoming message to Gel
        _ = await add_message_query(
            gel_client,
            username=username,
            message_role="user",
            message_body=search_terms.query,
            sources=[],
            chat_id=chat_id,
        )

        # 3. Generate a query and perform googling
        search_query = await generate_search_query(search_terms.query, chat_history)
        web_sources = await search_web(search_query)

        # 4. Fetch similar chats
        db_ai: AsyncEdgeDBAI = await create_async_ai(gel_client, model="gpt-4o-mini")
        embedding = await db_ai.generate_embeddings(
            search_query, model="text-embedding-3-small"
        )
        similar_chats = await search_chats_query(
            gel_client, username=username, embedding=embedding, limit=1
        )

        # 5. Generate answer
        search_result = await generate_answer(
            search_terms.query, chat_history, web_sources, similar_chats
        )

        # 6. Add LLM response to Gel
        _ = await add_message_query(
            gel_client,
            username=username,
            message_role="assistant",
            message_body=search_result.response,
            sources=search_result.sources,
            chat_id=chat_id,
        )

        # 7. Send result back to the client
        return search_result

Finally, the answer generator needs to be updated one more time, since we need
to inject the additional messages into the prompt.

.. code-block:: python

    async def generate_answer(
        query: str,
        chat_history: list[GetMessagesResult],
        web_sources: list[WebSource],
        similar_chats: list[list[GetMessagesResult]],
    ) -> SearchResult:
        system_prompt = (
            "You are a helpful assistant that answers user's questions"
            + " by finding relevant information in web search results."
            + " You can reference previous conversations with the user that"
            + " are provided to you, if they are relevant, by explicitly"
            + " referring to them."
        )

        prompt = f"User search query: {query}\n\nWeb search results:\n"

        for i, source in enumerate(web_sources):
            prompt += f"Result {i} (URL: {source.url}):\n"
            prompt += f"{source.text}\n\n"

        prompt += "Similar chats with the same user:\n"

        for i, chat in enumerate(similar_chats):
            prompt += f"Chat {i}: \n"
            for message in chat.messages:
                prompt += f"{message.role}: {message.body} (sources: {message.sources})\n"

        completion = llm_client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {
                    "role": "system",
                    "content": system_prompt,
                },
                {
                    "role": "user",
                    "content": prompt,
                },
            ],
        )

        llm_response = completion.choices[0].message.content
        search_result = SearchResult(
            response=llm_response, sources=[source.url for source in web_sources]
        )

        return search_result

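For a quick sanity check of the prompt layout, the string-assembly part of
`generate_answer` can be isolated into a pure function and run on toy data.
The dataclasses below are minimal stand-ins for the app's `WebSource` and
message types; no LLM call is involved:

```python
# Isolates the prompt-building logic so the layout can be inspected directly.
from dataclasses import dataclass, field

@dataclass
class WebSource:
    url: str
    text: str

@dataclass
class Message:
    role: str
    body: str
    sources: list[str] = field(default_factory=list)

@dataclass
class Chat:
    messages: list[Message]

def build_prompt(query: str, web_sources: list[WebSource], similar_chats: list[Chat]) -> str:
    prompt = f"User search query: {query}\n\nWeb search results:\n"
    for i, source in enumerate(web_sources):
        prompt += f"Result {i} (URL: {source.url}):\n"
        prompt += f"{source.text}\n\n"
    prompt += "Similar chats with the same user:\n"
    for i, chat in enumerate(similar_chats):
        prompt += f"Chat {i}: \n"
        for message in chat.messages:
            prompt += f"{message.role}: {message.body} (sources: {message.sources})\n"
    return prompt

prompt = build_prompt(
    "what is edgeql?",
    [WebSource(url="https://example.com", text="EdgeQL is a query language.")],
    [Chat(messages=[Message(role="user", body="tell me about gel")])],
)
print(prompt)
```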
And one last time, let's check to make sure everything works:

.. code-block:: bash

    $ curl -X 'POST' \
      'http://127.0.0.1:8000/messages?username=charlie&chat_id=544ef3f2-ded8-11ef-ba16-f7f254b95e36' \
      -H 'accept: application/json' \
      -H 'Content-Type: application/json' \
      -d '{
      "query": "how do i write a simple query in it?"
      }'

    {
      "response": "To write a simple query in EdgeQL...",
      "sources": [
        "https://docs.edgedb.com/cli/edgedb_query"
      ]
    }
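The same request can be issued from Python. Here's a hypothetical stdlib-only
sketch that constructs the request exactly as the `curl` command above does;
`build_messages_request` is our own helper name, not part of the app:

```python
# Builds the POST /messages request with query parameters and a JSON body,
# mirroring the curl invocation above. Sending it requires the app to be
# running, so we only construct the request here.
import json
from urllib.parse import urlencode
from urllib.request import Request

def build_messages_request(base_url: str, username: str, chat_id: str, query: str) -> Request:
    params = urlencode({"username": username, "chat_id": chat_id})
    return Request(
        url=f"{base_url}/messages?{params}",
        data=json.dumps({"query": query}).encode(),
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )

req = build_messages_request(
    "http://127.0.0.1:8000",
    "charlie",
    "544ef3f2-ded8-11ef-ba16-f7f254b95e36",
    "how do i write a simple query in it?",
)
print(req.full_url)  # urllib.request.urlopen(req) would send it to the running app
```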