docs/guides/python/llama-rag.mdx (10 additions & 2 deletions)
@@ -276,33 +276,41 @@ uv run model_utilities.py
Let's create our resources in a common file so that they can be imported by both the subscriber and chat modules. We'll create a websocket that the user will interface with for prompts, and a topic to handle the backend query engine. When a prompt message arrives, the websocket will publish to the topic, which will trigger the subscriber to handle the prompt. Once the subscriber is finished, it will send the response back over the socket. We route prompts through the topic so that the websocket doesn't time out after 30 seconds, as most queries take longer than that to process.
```python title:common/resources.py
-from nitric.resources import websocket, topic
+from nitric.resources import websocket, topic, kv

socket = websocket("socket")
chat_topic = topic("chat")
+connections = kv("connections")
```
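
To see how these shared resources serve the flow described above, here is a minimal sketch of the subscriber side. It is not part of this diff: the `@chat_topic.subscribe()` decorator, the `socket.send()` call, and the message shape (`prompt` plus `connection_id`) are assumptions based on Nitric's Python SDK, and the query step is only a placeholder for the LlamaIndex engine built in `model_utilities.py`.

```python
# services/subscriber.py — hypothetical sketch, not part of the diff above
from nitric.application import Nitric

from common.resources import socket, chat_topic


@chat_topic.subscribe()
async def handle_prompt(ctx):
    # The chat service (next section) is assumed to publish the prompt
    # together with the caller's websocket connection id.
    message = ctx.req.data
    prompt = message["prompt"]
    connection_id = message["connection_id"]

    # Placeholder for the (potentially slow) LlamaIndex query engine
    # built in model_utilities.py.
    answer = f"(answer to: {prompt})"

    # Reply over the websocket once the query finishes, so the original
    # websocket request never has to wait out the long-running query.
    await socket.send(connection_id, answer.encode())


Nitric.run()
```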
## Use the resources for querying the model
With our LLM downloaded and given the context documentation for querying, we can use our websocket to handle prompts. The main piece of logic here is publishing the prompt to the chat topic.
```python title:services/chat.py
-from common.resources import socket, chat_topic
+from common.resources import socket, chat_topic, connections
```
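
The diff above only shows the changed import. For context, here is a rough sketch of how the chat service might tie these resources together, storing connection ids in the new `connections` store and publishing each prompt to the chat topic. The permission names (`allow("publish")`, `allow("get", "set", "delete")`), the `@socket.on(...)` events, and the message payload shape are assumptions drawn from Nitric's Python SDK rather than the guide's exact code.

```python
# services/chat.py — hypothetical sketch of how the handlers might look
from nitric.application import Nitric

from common.resources import socket, chat_topic, connections

# Request the permissions this service needs (assumed permission names).
chat_publisher = chat_topic.allow("publish")
connection_store = connections.allow("get", "set", "delete")


@socket.on("connect")
async def on_connect(ctx):
    # Remember the connection so the subscriber's response can be routed back.
    await connection_store.set(ctx.req.connection_id, {"connected": True})


@socket.on("disconnect")
async def on_disconnect(ctx):
    await connection_store.delete(ctx.req.connection_id)


@socket.on("message")
async def on_message(ctx):
    # Hand the prompt off to the chat topic so this handler returns
    # immediately instead of waiting out (and timing out on) the query.
    await chat_publisher.publish(
        {
            "connection_id": ctx.req.connection_id,
            "prompt": ctx.req.data.decode(),
        }
    )


Nitric.run()
```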