Commit d35e15f

docs: add missing RAG documentation to fix build
The sidebars.ts references core-concepts/rag but the file was missing, causing the Docusaurus build to fail.
1 parent 0762205 commit d35e15f

1 file changed: 169 additions, 0 deletions

docs/core-concepts/rag.md

Lines changed: 169 additions & 0 deletions
@@ -0,0 +1,169 @@
# RAG (Retrieval-Augmented Generation)

SpoonOS ships a minimal, switchable RAG stack under `spoon_ai.rag`:

- Index local files/dirs/URLs
- Retrieve top-k chunks
- Answer questions with `[n]` citations

## Installation

The RAG system supports multiple vector-store backends.

### Basic (FAISS / Offline)

```bash
pip install faiss-cpu  # Optional: real FAISS backend
# or no extra install for offline/in-memory testing
```

### Advanced Backends

```bash
pip install chromadb         # Chroma
pip install pinecone-client  # Pinecone
pip install qdrant-client    # Qdrant
```
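
Because every backend library is optional, it can help to check which ones are importable before choosing a value for `RAG_BACKEND`. The following is a minimal sketch, not part of `spoon_ai`: the helper name `installed_backends` is hypothetical, and the mapping assumes the usual import names for the packages listed above (`faiss`, `chromadb`, `pinecone`, `qdrant_client`).

```python
from importlib.util import find_spec

# Assumed import names for the optional client libraries. These are module
# names, which can differ from the pip package names used for installation.
BACKEND_MODULES = {
    "faiss": "faiss",
    "chroma": "chromadb",
    "pinecone": "pinecone",
    "qdrant": "qdrant_client",
}


def installed_backends() -> dict:
    """Report, per backend name, whether its client library can be found."""
    return {
        name: find_spec(module) is not None
        for name, module in BACKEND_MODULES.items()
    }


if __name__ == "__main__":
    for name, ok in installed_backends().items():
        print(f"{name:<9} {'installed' if ok else 'missing'}")
```

A missing `faiss` entry is not necessarily a problem, since the `faiss` backend is described as falling back to an in-memory store.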

## Basic Usage

### 1) Initialize components

```python
import os
from spoon_ai.rag import (
    get_default_config,
    get_vector_store,
    get_embedding_client,
    RagIndex,
    RagRetriever,
    RagQA,
)
from spoon_ai.chat import ChatBot

# Example: enable embeddings
# os.environ["OPENAI_API_KEY"] = "sk-..."
# or
# os.environ["OPENROUTER_API_KEY"] = "sk-or-..."

cfg = get_default_config()
store = get_vector_store(cfg.backend)
embed = get_embedding_client(
    cfg.embeddings_provider,
    openai_model=cfg.openai_embeddings_model,
)
```

### 2) Ingest

```python
index = RagIndex(config=cfg, store=store, embeddings=embed)
count = index.ingest(["./my_documents", "https://example.com/article"])
print(f"Ingested {count} chunks.")
```
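
To make the chunk count above more concrete, here is a rough sketch of overlapping fixed-size chunking. It is an illustration only, not the `RagIndex` implementation: the function `chunk_text` is hypothetical, and it assumes sizes are measured in characters, which the documentation does not state. The defaults mirror `CHUNK_SIZE=800` and `CHUNK_OVERLAP=120` from the configuration table.

```python
def chunk_text(text: str, size: int = 800, overlap: int = 120) -> list:
    """Split text into windows of `size` characters.

    Consecutive windows share `overlap` characters, so a sentence that
    straddles a boundary still appears whole in at least one chunk.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    sample = "".join(chr(97 + i % 26) for i in range(2000))
    parts = chunk_text(sample)
    print(len(parts), [len(p) for p in parts])
```

A larger overlap improves recall at chunk boundaries but increases the number of stored vectors.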

### 3) Retrieve

```python
retriever = RagRetriever(config=cfg, store=store, embeddings=embed)
chunks = retriever.retrieve("How do I use SpoonAI?", top_k=3)
for c in chunks:
    print(f"[{c.score:.2f}] {c.text[:100]}... (Source: {c.metadata.get('source')})")
```

### 4) QA with citations

```python
import asyncio

llm = ChatBot()  # uses the configured core LLM provider
qa = RagQA(config=cfg, llm=llm)


async def main() -> None:
    # qa.answer is awaited, so in a plain script it must run inside a coroutine
    result = await qa.answer("How do I use SpoonAI?", chunks)

    print("Answer:", result.answer)
    for cite in result.citations:
        print(f"- {cite.marker} {cite.source}")


asyncio.run(main())
```
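
The `[n]` markers in the answer refer back to the retrieved chunks. One plausible way to set this up, shown here purely as a sketch, is to number each chunk in the prompt and keep a marker-to-source map for rendering citations afterwards. The `Chunk` dataclass, the `build_prompt` function, and the prompt wording are all hypothetical; how `RagQA` builds its prompt internally is not documented here.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in for a retrieved chunk (not the spoon_ai type)."""
    text: str
    source: str


def build_prompt(question: str, chunks: list) -> tuple:
    """Number each chunk as [n]; return the prompt and a marker -> source map."""
    context_lines = []
    citations = {}
    for n, chunk in enumerate(chunks, start=1):
        marker = f"[{n}]"
        context_lines.append(f"{marker} {chunk.text}")
        citations[marker] = chunk.source
    prompt = (
        "Answer the question using only the context below. "
        "Cite supporting passages with their [n] markers.\n\n"
        "Context:\n" + "\n".join(context_lines) + f"\n\nQuestion: {question}"
    )
    return prompt, citations


if __name__ == "__main__":
    prompt, cites = build_prompt(
        "How do I ingest files?",
        [Chunk("Call index.ingest with paths.", "docs/rag.md")],
    )
    print(prompt)
    print(cites)
```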

## Configuration

### Core env vars

| Variable | Description | Default |
|----------|-------------|---------|
| `RAG_BACKEND` | Vector store backend (`faiss`, `chroma`, `pinecone`, `qdrant`) | `faiss` |
| `RAG_COLLECTION` | Collection name | `default` |
| `RAG_DIR` | Persistence directory (used by some backends) | `.rag_store` |
| `TOP_K` | Default number of chunks to retrieve | `5` |
| `CHUNK_SIZE` | Chunk size | `800` |
| `CHUNK_OVERLAP` | Chunk overlap | `120` |
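
As an illustration of how these variables and defaults fit together, the sketch below reads them into a small settings object. It is not the implementation behind `get_default_config()`: the `RagSettings` class, its field names, and `load_settings` are hypothetical, and only the variable names and defaults come from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RagSettings:
    """Hypothetical settings holder; field names are illustrative only."""
    backend: str
    collection: str
    directory: str
    top_k: int
    chunk_size: int
    chunk_overlap: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> RagSettings:
    """Read the core RAG variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return RagSettings(
        backend=env.get("RAG_BACKEND", "faiss"),
        collection=env.get("RAG_COLLECTION", "default"),
        directory=env.get("RAG_DIR", ".rag_store"),
        top_k=int(env.get("TOP_K", "5")),
        chunk_size=int(env.get("CHUNK_SIZE", "800")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "120")),
    )


if __name__ == "__main__":
    print(load_settings())
```

Passing a plain dict instead of `os.environ` makes the defaults easy to check in tests.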

### Embeddings selection

| Variable | Description | Default |
|----------|-------------|---------|
| `RAG_EMBEDDINGS_PROVIDER` | `auto`, `openai`, `openrouter`, `gemini`, `openai_compatible`, `ollama`, `hash` (`auto` uses OpenAI > OpenRouter > Gemini > openai_compatible) | `auto` |
| `RAG_EMBEDDINGS_MODEL` | Embedding model id (provider-specific) | `text-embedding-3-small` |
| `RAG_EMBEDDINGS_API_KEY` | API key for `openai_compatible` embeddings | None |
| `RAG_EMBEDDINGS_BASE_URL` | Base URL for `openai_compatible` embeddings (OpenAI-compatible `/embeddings`) | None |
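
The `auto` priority order can be pictured as a first-match scan over the providers. The sketch below is an assumption-laden illustration: it treats a provider as available when its key variable is set and non-empty, and it returns `None` when nothing is configured, because the table does not say what `auto` falls back to in that case. The function name `resolve_auto_provider` is hypothetical.

```python
from typing import Mapping, Optional

# Priority order from the table: OpenAI > OpenRouter > Gemini > openai_compatible.
# Assumption: a provider counts as available when its key variable is non-empty.
AUTO_ORDER = [
    ("openai", "OPENAI_API_KEY"),
    ("openrouter", "OPENROUTER_API_KEY"),
    ("gemini", "GEMINI_API_KEY"),
    ("openai_compatible", "RAG_EMBEDDINGS_API_KEY"),
]


def resolve_auto_provider(env: Mapping[str, str]) -> Optional[str]:
    """Return the first provider in priority order whose key is set, else None."""
    for provider, key_var in AUTO_ORDER:
        if env.get(key_var):
            return provider
    return None


if __name__ == "__main__":
    import os
    print(resolve_auto_provider(os.environ))
```

Setting `RAG_EMBEDDINGS_PROVIDER` explicitly avoids any ambiguity when several keys are present.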

### Provider keys (when used)

- `OPENAI_API_KEY` (OpenAI embeddings)
- `OPENROUTER_API_KEY` (OpenRouter embeddings)
- `GEMINI_API_KEY` (Gemini embeddings)
- `OLLAMA_BASE_URL` (Ollama embeddings, default: `http://localhost:11434`)

## Backends & Smoke Tests

### Vector stores (`RAG_BACKEND`)

- `faiss` (default): local/offline friendly. Falls back to an in-memory cosine store if FAISS is not installed.
- `pinecone`: cloud vector DB (requires `PINECONE_API_KEY`, optional `RAG_PINECONE_INDEX`).
- `qdrant`: local/cloud (requires `qdrant-client`; uses `QDRANT_URL` / `QDRANT_PATH`).
- `chroma`: local (requires `chromadb`; persists under `${RAG_DIR:-.rag_store}/chroma`).
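
For intuition about the in-memory cosine fallback mentioned for `faiss`, here is a minimal sketch of a brute-force cosine-similarity store. The class `InMemoryCosineStore` and its methods are hypothetical and written for illustration; the actual fallback in `spoon_ai.rag` may differ in interface and behaviour.

```python
import math


def _cosine(a, b) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryCosineStore:
    """Hypothetical minimal vector store; not the spoon_ai implementation."""

    def __init__(self) -> None:
        self._items = []  # list of (vector, text) pairs

    def add(self, vector, text: str) -> None:
        self._items.append((list(vector), text))

    def search(self, query, top_k: int = 5):
        """Return up to top_k (score, text) pairs, highest similarity first."""
        scored = [(_cosine(query, vec), text) for vec, text in self._items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]


if __name__ == "__main__":
    store = InMemoryCosineStore()
    store.add([1.0, 0.0], "about indexing")
    store.add([0.0, 1.0], "about billing")
    store.add([1.0, 1.0], "about both")
    for score, text in store.search([1.0, 0.0], top_k=2):
        print(f"{score:.2f} {text}")
```

A linear scan like this is adequate for small test corpora; an approximate-nearest-neighbour index such as FAISS becomes worthwhile as the collection grows.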

### Smoke tests

```bash
# Offline (no LLM calls)
RAG_BACKEND=faiss RAG_FAKE_QA=1 python examples/smoke/rag_faiss_smoke.py

# Pinecone
export PINECONE_API_KEY=...
RAG_BACKEND=pinecone RAG_FAKE_QA=1 python examples/smoke/rag_pinecone_smoke.py

# Qdrant
pip install qdrant-client
export QDRANT_URL=http://localhost:6333
RAG_BACKEND=qdrant RAG_FAKE_QA=1 python examples/smoke/rag_qdrant_smoke.py

# Chroma
pip install chromadb
RAG_BACKEND=chroma RAG_FAKE_QA=1 python examples/smoke/rag_chroma_smoke.py
```
143+
144+
## Runnable Examples
145+
146+
```bash
147+
python examples/rag_react_agent_demo.py
148+
python examples/rag_graph_agent_demo.py
149+
```
