Commit 4036328

docs: update README with complete project documentation

- Add detailed setup instructions
- Add usage examples for all components
- Add Chain of Thought documentation
- Update API endpoints section
1 parent 806ccac commit 4036328


agentic_rag/README.md

Lines changed: 92 additions & 82 deletions
@@ -1,5 +1,7 @@
 # Agentic RAG System
 
+## Introduction
+
 An intelligent RAG (Retrieval Augmented Generation) system that uses an LLM agent to make decisions about information retrieval and response generation. The system processes PDF documents and can intelligently decide which knowledge base to query based on the user's question.
 
 The system has the following features:
@@ -12,7 +14,23 @@ The system has the following features:
 - Support for both OpenAI-based agents or local, transformer-based agents (`Mistral-7B` by default)
 - Optional Chain of Thought (CoT) reasoning for more detailed and structured responses
 
-## Setup
+## 0. Prerequisites and setup
+
+### Prerequisites
+
+- Python 3.8 or higher
+- OpenAI API key (optional, for OpenAI-based agent)
+- HuggingFace token (optional, for local Mistral model)
+
+### Hardware Requirements
+
+- For the OpenAI Agent: Standard CPU machine
+- For the Local Agent:
+  - Minimum 16GB RAM (>24GB recommended)
+  - GPU with 8GB VRAM recommended for better performance
+  - Will run on CPU if GPU is not available, but will be significantly slower.
+
+### Setup
 
 1. Clone the repository and install dependencies:
 

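The hardware notes in this new section boil down to the usual device-fallback pattern for local models. The following is a minimal sketch of what that looks like, assuming the standard `torch`/`transformers` APIs and a hypothetical Mistral checkpoint (the README only says `Mistral-7B`; the repo's actual loading code may differ):

```python
# Hedged sketch: pick the GPU when present, otherwise fall back to CPU.
# The checkpoint name is an assumption, not taken from the repo.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-Instruct-v0.2"  # assumed; README just says `Mistral-7B`
device = "cuda" if torch.cuda.is_available() else "cpu"  # CPU works, just much slower

# Pass token=<your HF token> if the checkpoint is gated (README keeps it in config.yaml).
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,  # half precision on GPU
).to(device)
```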
@@ -59,7 +77,38 @@ python main.py
 
 The API will be available at `http://localhost:8000`. You can then use the API endpoints as described in the API Endpoints section below.
 
-### 2. Using Individual Python Components via Command Line
+### 2. Using the Gradio Interface
+
+The system provides a user-friendly web interface using Gradio, which allows you to:
+- Upload and process PDF documents
+- Process web content from URLs
+- Chat with your documents using either local or OpenAI models
+- Toggle Chain of Thought reasoning
+
+To launch the interface:
+
+```bash
+python gradio_app.py
+```
+
+This will start the Gradio server and automatically open the interface in your default browser at `http://localhost:7860`. The interface has two main tabs:
+
+1. **Document Processing**:
+   - Upload PDFs using the file uploader
+   - Process web content by entering URLs
+   - View processing status and results
+
+2. **Chat Interface**:
+   - Select between Local (Mistral) and OpenAI models
+   - Toggle Chain of Thought reasoning for more detailed responses
+   - Chat with your documents using natural language
+   - Clear chat history as needed
+
+Note: The interface will automatically detect available models based on your configuration:
+- Local Mistral model requires HuggingFace token in `config.yaml`
+- OpenAI model requires API key in `.env` file
+
+### 3. Using Individual Python Components via Command Line
 
 #### Process PDFs
 
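The two-tab layout this new section describes maps naturally onto a small Gradio Blocks app. Here is a hedged sketch of that structure only, with placeholder callbacks standing in for the real processing and chat logic; it is not the repo's `gradio_app.py`:

```python
# Minimal sketch of the two-tab layout; callbacks are placeholders.
import gradio as gr

def process_pdf(file):
    return f"Processed {file.name}"  # stand-in for PDF chunking + vector store insert

def respond(message, history, model_choice, use_cot):
    reply = f"[{model_choice}, CoT {'on' if use_cot else 'off'}] {message}"  # stand-in for the agent
    return history + [(message, reply)], ""

with gr.Blocks() as demo:
    with gr.Tab("Document Processing"):
        pdf = gr.File(label="Upload PDF")
        status = gr.Textbox(label="Status")
        pdf.upload(process_pdf, inputs=pdf, outputs=status)
    with gr.Tab("Chat Interface"):
        model = gr.Radio(["Local (Mistral)", "OpenAI"], value="Local (Mistral)", label="Model")
        cot = gr.Checkbox(label="Chain of Thought")
        chatbot = gr.Chatbot()
        msg = gr.Textbox(label="Message")
        msg.submit(respond, inputs=[msg, chatbot, model, cot], outputs=[chatbot, msg])

demo.launch(server_port=7860)  # same port the README mentions
```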
@@ -76,9 +125,11 @@ python pdf_processor.py --input path/to/pdf/directory --output chunks.json
 python pdf_processor.py --input https://example.com/document.pdf --output chunks.json
 # sample pdf: https://arxiv.org/pdf/2203.06605
 ```
+
 #### Process Websites with Trafilatura
 
 Process a single website and save the content to a JSON file:
+
 ```bash
 python web_processor.py --input https://example.com --output docs/web_content.json
 ```
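Both processors write their chunks to a JSON file that the vector store later consumes. A quick way to sanity-check that output before indexing, assuming the file holds a JSON array of chunk records (the exact schema is not shown in this diff):

```python
# Hedged sketch: inspect pdf_processor.py / web_processor.py output.
# Assumes a JSON array of chunk records; adjust if the schema differs.
import json

with open("chunks.json") as f:
    chunks = json.load(f)

print(f"{len(chunks)} chunks loaded")
print(chunks[0])  # eyeball one record before adding it to the vector store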
@@ -117,7 +168,7 @@ python rag_agent.py --query "Can you explain the DaGAN Approach proposed in the
 python local_rag_agent.py --query "Can you explain the DaGAN Approach proposed in the Depth-Aware Generative Adversarial Network for Talking Head Video Generation article?"
 ```
 
-### 3. Complete Pipeline Example
+### 4. Complete Pipeline Example
 
 First, we process a document and query it using the local model. Then, we add the document to the vector store and query from the knowledge base to get the RAG system in action.
 
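The renamed pipeline section chains the commands documented above. As a hedged sketch, the two documented steps can be scripted from Python; the vector-store ingestion step sits between them in the prose, but its exact command is outside this diff, so it is left as a comment:

```python
# Hedged sketch: run the documented pipeline steps from one script.
import subprocess

# 1. Process a document into chunks (sample PDF from the README)
subprocess.run(
    ["python", "pdf_processor.py",
     "--input", "https://arxiv.org/pdf/2203.06605",
     "--output", "chunks.json"],
    check=True,
)

# (Vector-store ingestion happens here; that command is not shown in this diff.)

# 2. Query with the local agent, exactly as documented above
subprocess.run(
    ["python", "local_rag_agent.py",
     "--query", "Can you explain the DaGAN Approach proposed in the "
                "Depth-Aware Generative Adversarial Network for Talking "
                "Head Video Generation article?"],
    check=True,
)
```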
@@ -135,63 +186,7 @@ python local_rag_agent.py --query "Can you explain the DaGAN Approach proposed i
 python rag_agent.py --query "Can you explain the DaGAN Approach proposed in the Depth-Aware Generative Adversarial Network for Talking Head Video Generation article?"
 ```
 
-## Annex: API Endpoints
-
-### Upload PDF
-
-```http
-POST /upload/pdf
-Content-Type: multipart/form-data
-
-file: <pdf-file>
-```
-
-This endpoint uploads and processes a PDF file, storing its contents in the vector database.
-
-### Query
-
-```http
-POST /query
-Content-Type: application/json
-
-{
-    "query": "your question here"
-}
-```
-
-This endpoint processes a query through the agentic RAG pipeline and returns a response with context.
-
-## Annex: Architecture
-
-The system consists of several key components:
-
-1. **PDF Processor**: we use Docling to extract and chunk text from PDF documents
-2. **Vector Store**: Manages document embeddings and similarity search using ChromaDB
-3. **RAG Agent**: Makes intelligent decisions about query routing and response generation
-   - OpenAI Agent: Uses `gpt-4-turbo-preview` for high-quality responses, but requires an OpenAI API key
-   - Local Agent: Uses `Mistral-7B` as an open-source alternative
-4. **FastAPI Server**: Provides REST API endpoints for document upload and querying
-
-The RAG Agent flow is the following:
-
-1. Analyzes query type
-2. Try to find relevant PDF context, regardless of query type
-3. If PDF context is found, use it to generate a response.
-4. If no PDF context is found OR if it's a general knowledge query, use the pre-trained LLM directly
-5. Fall back to a "no information" response only in edge cases.
-
-## Hardware Requirements
-
-- For the OpenAI Agent: Standard CPU machine
-- For the Local Agent:
-  - Minimum 16GB RAM (recommended >24GBs)
-  - GPU with 8GB VRAM recommended for better performance
-  - Will run on CPU if GPU is not available, but will be significantly slower.
-
-TODO: integrate with Trafilatura to crawl web content apart from PDF
-
-
-## Chain of Thought (CoT) Support
+## 2. Chain of Thought (CoT) Support
 
 The system implements Chain of Thought prompting, allowing the LLMs to break down complex queries into steps and show their reasoning process. This feature can be activated in several ways:
 
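The hunk cuts off before the activation options, but a common way to implement a CoT toggle like this is a prompt wrapper. A hedged sketch of that idea, with an illustrative template rather than the project's actual one:

```python
# Hedged sketch: CoT as a prompt-wrapping toggle; the template is illustrative.
def build_prompt(query: str, context: str, use_cot: bool) -> str:
    base = f"Context:\n{context}\n\nQuestion: {query}"
    if not use_cot:
        return base
    return (
        "Think step by step: break the question into sub-problems, reason "
        "about each one using the context, then state the final answer.\n\n"
        + base
    )
```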
@@ -237,36 +232,51 @@ This is particularly useful for:
 - Questions requiring detailed explanations
 - Queries that need careful consideration of multiple pieces of context
 
-## Using the Gradio Interface
-
-The system provides a user-friendly web interface using Gradio, which allows you to:
-- Upload and process PDF documents
-- Process web content from URLs
-- Chat with your documents using either local or OpenAI models
-- Toggle Chain of Thought reasoning
-
-To launch the interface:
-
-```bash
-python gradio_app.py
-```
-
-This will start the Gradio server and automatically open the interface in your default browser at `http://localhost:7860`. The interface has two main tabs:
-
-1. **Document Processing**:
-   - Upload PDFs using the file uploader
-   - Process web content by entering URLs
-   - View processing status and results
-
-2. **Chat Interface**:
-   - Select between Local (Mistral) and OpenAI models
-   - Toggle Chain of Thought reasoning for more detailed responses
-   - Chat with your documents using natural language
-   - Clear chat history as needed
-
-Note: The interface will automatically detect available models based on your configuration:
-- Local Mistral model requires HuggingFace token in `config.yaml`
-- OpenAI model requires API key in `.env` file
+## Annex: API Endpoints
+
+### Upload PDF
+
+```http
+POST /upload/pdf
+Content-Type: multipart/form-data
+
+file: <pdf-file>
+```
+
+This endpoint uploads and processes a PDF file, storing its contents in the vector database.
+
+### Query
+
+```http
+POST /query
+Content-Type: application/json
+
+{
+    "query": "your question here"
+}
+```
+
+This endpoint processes a query through the agentic RAG pipeline and returns a response with context.
+
+## Annex: Architecture
+
+The system consists of several key components:
+
+1. **PDF Processor**: we use Docling to extract and chunk text from PDF documents
+2. **Vector Store**: Manages document embeddings and similarity search using ChromaDB
+3. **RAG Agent**: Makes intelligent decisions about query routing and response generation
+   - OpenAI Agent: Uses `gpt-4-turbo-preview` for high-quality responses, but requires an OpenAI API key
+   - Local Agent: Uses `Mistral-7B` as an open-source alternative
+4. **FastAPI Server**: Provides REST API endpoints for document upload and querying
+
+The RAG Agent flow is the following:
+
+1. Analyzes query type
+2. Try to find relevant PDF context, regardless of query type
+3. If PDF context is found, use it to generate a response.
+4. If no PDF context is found OR if it's a general knowledge query, use the pre-trained LLM directly
+5. Fall back to a "no information" response only in edge cases.
 
 ## Contributing
 
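For reference, the two endpoints moved into the annex can be exercised from a short client script. A hedged sketch assuming the defaults stated in the README (FastAPI server from `python main.py` on `http://localhost:8000`); the file name is illustrative:

```python
# Hedged sketch: exercise the documented endpoints with requests.
import requests

BASE = "http://localhost:8000"  # default from the README

# POST /upload/pdf: multipart/form-data with a `file` field
with open("paper.pdf", "rb") as f:  # illustrative file name
    resp = requests.post(f"{BASE}/upload/pdf", files={"file": f})
print(resp.json())

# POST /query: application/json with a `query` field
resp = requests.post(f"{BASE}/query", json={"query": "your question here"})
print(resp.json())
```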

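Likewise, the five-step agent flow listed in the architecture annex reduces to a small routing function. A hedged sketch in which `classify`, `retrieve`, and `llm` are hypothetical stand-ins, not functions from the repo:

```python
# Hedged sketch of the RAG Agent flow; all callables are stand-ins.
def answer(query: str, classify, retrieve, llm) -> str:
    is_general = classify(query)            # 1. analyze the query type
    context = retrieve(query)               # 2. try to find PDF context regardless of type
    if context:                             # 3. PDF context found: generate a grounded response
        docs = "\n\n".join(context)
        return llm(f"Context:\n{docs}\n\nQuestion: {query}")
    if is_general or not context:           # 4. no context or general knowledge: LLM directly
        try:
            return llm(query)
        except Exception:
            pass
    return "No relevant information found."  # 5. "no information" only in edge cases
```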