
Commit 7b922b0

Merge pull request #22 from oracle-devrel/update
Big update
2 parents a8eab42 + 9d81563

File tree

109 files changed: +7620 −4 lines changed


.gitignore

Lines changed: 5 additions & 1 deletion

```diff
@@ -35,4 +35,8 @@ Temporary Items
 oci-language-translation/config.yaml
 oci-subtitle-translation/config.yaml
 oci-csv-json-translation/config.yaml
-oci-language-multiple-translation/config.yaml
+oci-language-multiple-translation/config.yaml
+
+agentic_rag/config.yaml
+agentic_rag/chroma_db
+agentic_rag/embeddings
```

agentic_rag/.gitignore

Lines changed: 29 additions & 0 deletions

New file:

```
# Python
__pycache__/
*.py[cod]
*$py.class

# Virtual Environment
venv/
env/
.env

# IDE
.vscode/
.idea/

# Gradio
.gradio/

# Generated files
embeddings/
chroma_db/
docs/*.json

# Distribution / packaging
dist/
build/
*.egg-info/

# Logs
*.log
```

agentic_rag/README.md

Lines changed: 333 additions & 0 deletions

New file:
# Agentic RAG System

## Introduction

An intelligent RAG (Retrieval Augmented Generation) system that uses an LLM agent to make decisions about information retrieval and response generation. The system processes PDF documents and can intelligently decide which knowledge base to query based on the user's question.

The system has the following features:

- Intelligent query routing
- PDF processing using Docling for accurate text extraction and chunking
- Persistent vector storage with ChromaDB (PDFs and websites)
- Smart context retrieval and response generation
- FastAPI-based REST API for document upload and querying
- Support for both OpenAI-based agents and local, transformer-based agents (`Mistral-7B` by default)
- Optional Chain of Thought (CoT) reasoning for more detailed and structured responses

## 0. Prerequisites and Setup

### Prerequisites

- Python 3.8 or higher
- OpenAI API key (optional, for the OpenAI-based agent)
- HuggingFace token (optional, for the local Mistral model)

### Hardware Requirements

- For the OpenAI Agent: a standard CPU machine
- For the Local Agent:
  - Minimum 16 GB of RAM (more than 24 GB recommended)
  - GPU with 8 GB of VRAM recommended for better performance
  - Will run on CPU if no GPU is available, but significantly slower

### Setup

1. Clone the repository and install dependencies:

   ```bash
   git clone https://github.com/oracle-devrel/devrel-labs.git
   cd devrel-labs/agentic_rag
   pip install -r requirements.txt
   ```

2. Authenticate with HuggingFace:

   The system uses `Mistral-7B` by default, which requires authentication with HuggingFace:

   a. Create a HuggingFace account [here](https://huggingface.co/join) if you don't have one yet.

   b. Accept the Mistral-7B model terms & conditions [here](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2).

   c. Create an access token [here](https://huggingface.co/settings/tokens).

   d. Create a `config.yaml` file (you can copy from `config_example.yaml`) and add your HuggingFace token:

   ```yaml
   HUGGING_FACE_HUB_TOKEN: your_token_here
   ```

3. (Optional) If you want to use the OpenAI-based agent instead of the default local model, create a `.env` file with your OpenAI API key:

   ```bash
   OPENAI_API_KEY=your-api-key-here
   ```

If no API key is provided, the system automatically downloads and uses `Mistral-7B-Instruct-v0.2` for text generation when running the local model. No additional configuration is needed.

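For reference, here is a minimal sketch of how such a local pipeline is typically loaded with Hugging Face `transformers`. This is illustrative only; the actual loading code in `local_rag_agent.py` may differ in details such as quantization, prompt format, or generation settings:

```python
# Illustrative sketch: the standard transformers loading pattern for
# Mistral-7B-Instruct-v0.2, not necessarily the code in local_rag_agent.py.
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "mistralai/Mistral-7B-Instruct-v0.2"
tokenizer = AutoTokenizer.from_pretrained(model_id)  # picks up your HF token
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",  # GPU if available, else CPU (assumes `accelerate` is installed)
)
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
result = generator("What is retrieval augmented generation?", max_new_tokens=128)
print(result[0]["generated_text"])
```
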
## 1. Getting Started

You can launch this solution in three ways:

### 1. Using the Complete REST API

Start the API server:

```bash
python main.py
```

The API will be available at `http://localhost:8000`. You can then use the API endpoints as described in the API Endpoints section below.

### 2. Using the Gradio Interface

The system provides a user-friendly web interface using Gradio, which allows you to:

- Upload and process PDF documents
- Process web content from URLs
- Chat with your documents using either local or OpenAI models
- Toggle Chain of Thought reasoning

To launch the interface:

```bash
python gradio_app.py
```

This will start the Gradio server and automatically open the interface in your default browser at `http://localhost:7860`. The interface has two main tabs:

1. **Document Processing**:
   - Upload PDFs using the file uploader
   - Process web content by entering URLs
   - View processing status and results

2. **Chat Interface**:
   - Select between Local (Mistral) and OpenAI models
   - Toggle Chain of Thought reasoning for more detailed responses
   - Chat with your documents using natural language
   - Clear chat history as needed

Note: The interface will automatically detect available models based on your configuration:

- The local Mistral model requires a HuggingFace token in `config.yaml`
- The OpenAI model requires an API key in the `.env` file

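One plausible way this detection could work is sketched below, assuming the `pyyaml` and `python-dotenv` packages are available; the actual checks in `gradio_app.py` may differ:

```python
# Hedged sketch: detect which models are usable from config.yaml and .env.
# Mirrors the setup steps above; not necessarily the gradio_app.py logic.
import os
from pathlib import Path

import yaml
from dotenv import load_dotenv

def detect_available_models():
    available = []
    config_path = Path("config.yaml")
    if config_path.exists():
        config = yaml.safe_load(config_path.read_text()) or {}
        if config.get("HUGGING_FACE_HUB_TOKEN"):
            available.append("Local (Mistral)")
    load_dotenv()  # loads OPENAI_API_KEY from .env if present
    if os.getenv("OPENAI_API_KEY"):
        available.append("OpenAI")
    return available

print(detect_available_models())
```
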
### 3. Using Individual Python Components via Command Line

#### Process PDFs

To process a PDF file and save the chunks to a JSON file, run:

```bash
# Process a single PDF
python pdf_processor.py --input path/to/document.pdf --output chunks.json

# Process multiple PDFs in a directory
python pdf_processor.py --input path/to/pdf/directory --output chunks.json

# Process a single PDF from a URL
python pdf_processor.py --input https://example.com/document.pdf --output chunks.json
# sample pdf: https://arxiv.org/pdf/2203.06605
```

#### Process Websites with Trafilatura

Process a single website and save the content to a JSON file:

```bash
python web_processor.py --input https://example.com --output docs/web_content.json
```

Or process multiple URLs from a file and save them into a single JSON file:

```bash
python web_processor.py --input urls.txt --output docs/web_content.json
```

#### Manage Vector Store

To add documents to the vector store and query them, run:

```bash
# Add documents from a chunks file (stored in pdf_collection by default)
python store.py --add chunks.json
# For websites, use the --add-web flag
python store.py --add-web docs/web_content.json

# Query the vector store directly (searches both the PDF and web collections)
python store.py --query "your search query"
# Or let the agent's LLM decide which collection best fits your input
python local_rag_agent.py --query "your search query"
```

#### Use RAG Agent

To query documents using either OpenAI or a local model, run:

```bash
# Using OpenAI (requires API key in .env)
python rag_agent.py --query "Can you explain the DaGAN Approach proposed in the Depth-Aware Generative Adversarial Network for Talking Head Video Generation article?"

# Using local Mistral model
python local_rag_agent.py --query "Can you explain the DaGAN Approach proposed in the Depth-Aware Generative Adversarial Network for Talking Head Video Generation article?"
```

### 4. Complete Pipeline Example

First, we process a document and add it to the vector store; then we query the knowledge base to see the RAG system in action.

```bash
# 1. Process the PDF
python pdf_processor.py --input example.pdf --output chunks.json

# Or process it straight from a URL:
#python pdf_processor.py --input https://arxiv.org/pdf/2203.06605 --output chunks.json

# 2. Add to vector store
python store.py --add chunks.json

# 3. Query using local model
python local_rag_agent.py --query "Can you explain the DaGAN Approach proposed in the Depth-Aware Generative Adversarial Network for Talking Head Video Generation article?"

# Or using OpenAI (requires API key):
python rag_agent.py --query "Can you explain the DaGAN Approach proposed in the Depth-Aware Generative Adversarial Network for Talking Head Video Generation article?"
```

## 2. Chain of Thought (CoT) Support

The system implements a multi-agent Chain of Thought pipeline, allowing complex queries to be broken down and processed through multiple specialized agents. This feature enhances the reasoning capabilities of both local and cloud-based models.

### Multi-Agent System

The CoT system consists of four specialized agents (a sketch of how they might be chained follows the list):

1. **Planner Agent**: Breaks down complex queries into clear, manageable steps
2. **Research Agent**: Gathers and analyzes relevant information from knowledge bases
3. **Reasoning Agent**: Applies logical analysis to information and draws conclusions
4. **Synthesis Agent**: Combines multiple pieces of information into a coherent response

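The sketch below shows how these four stages might be chained. Every function here is a hypothetical stub standing in for a specialized LLM prompt (plus a vector-store lookup for the research stage), not the repository's actual API:

```python
# Hypothetical sketch of the four-stage CoT flow; stubs replace real LLM calls.
from typing import List

def planner_agent(query: str) -> List[str]:
    # Real version: ask the LLM to decompose the query into steps.
    return [f"Gather background on: {query}", f"Analyze details of: {query}"]

def research_agent(step: str) -> str:
    # Real version: retrieve relevant chunks from ChromaDB for this step.
    return f"findings for '{step}'"

def reasoning_agent(query: str, findings: List[str]) -> str:
    # Real version: ask the LLM to draw conclusions from the findings.
    return "conclusions based on: " + "; ".join(findings)

def synthesis_agent(query: str, conclusions: str) -> str:
    # Real version: ask the LLM to compose the final, sourced answer.
    return f"answer to '{query}' built from {conclusions}"

def answer_with_cot(query: str) -> str:
    steps = planner_agent(query)                    # 1. plan
    findings = [research_agent(s) for s in steps]   # 2. research
    conclusions = reasoning_agent(query, findings)  # 3. reason
    return synthesis_agent(query, conclusions)      # 4. synthesize

print(answer_with_cot("How does DaGAN generate talking-head videos?"))
```
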
### Using CoT

You can activate the multi-agent CoT system in several ways:

1. **Command Line**:

   ```bash
   # Using local Mistral model (default)
   python local_rag_agent.py --query "your query" --use-cot

   # Using OpenAI model
   python rag_agent.py --query "your query" --use-cot
   ```

2. **Testing the System**:

   ```bash
   # Test with local model (default)
   python tests/test_new_cot.py

   # Test with OpenAI model
   python tests/test_new_cot.py --model openai
   ```

3. **API Endpoint**:

   ```http
   POST /query
   Content-Type: application/json

   {
       "query": "your query",
       "use_cot": true
   }
   ```

### Example Output

When CoT is enabled, the system will show:

- The initial plan for answering the query
- Research findings for each step
- Reasoning process and conclusions
- Final synthesized answer
- Sources used from the knowledge base

Example:

```
Step 1: Planning
- Break down the technical components
- Identify key features
- Analyze implementation details

Step 2: Research
[Research findings for each step...]

Step 3: Reasoning
[Logical analysis and conclusions...]

Final Answer:
[Comprehensive response synthesized from all steps...]

Sources used:
- document.pdf (pages: 1, 2, 3)
- implementation.py
```

### Benefits

The multi-agent CoT approach offers several advantages:

- More structured and thorough analysis of complex queries
- Better integration with knowledge bases
- Transparent reasoning process
- Improved answer quality through specialized agents
- Works with both local and cloud-based models

## Annex: API Endpoints

### Upload PDF

```http
POST /upload/pdf
Content-Type: multipart/form-data

file: <pdf-file>
```

This endpoint uploads and processes a PDF file, storing its contents in the vector database.

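For example, with the server running locally (`python main.py`), the endpoint can be exercised from Python. This assumes the `requests` package is installed and says nothing about the exact response schema:

```python
# Hedged example: exercises POST /upload/pdf on a locally running server.
import requests

with open("path/to/document.pdf", "rb") as f:
    resp = requests.post(
        "http://localhost:8000/upload/pdf",
        files={"file": f},  # multipart/form-data field named "file"
    )
print(resp.status_code, resp.text)
```
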
### Query

```http
POST /query
Content-Type: application/json

{
    "query": "your question here"
}
```

This endpoint processes a query through the agentic RAG pipeline and returns a response with context.

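An equivalent call from Python (again assuming `requests`; `use_cot` is the optional flag described in the CoT section):

```python
# Hedged example: exercises POST /query on a locally running server.
import requests

resp = requests.post(
    "http://localhost:8000/query",
    json={"query": "your question here", "use_cot": False},
)
print(resp.status_code, resp.text)
```
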
## Annex: Architecture

The system consists of several key components:

1. **PDF Processor**: Uses Docling to extract and chunk text from PDF documents
2. **Vector Store**: Manages document embeddings and similarity search using ChromaDB
3. **RAG Agent**: Makes intelligent decisions about query routing and response generation
   - OpenAI Agent: Uses `gpt-4-turbo-preview` for high-quality responses, but requires an OpenAI API key
   - Local Agent: Uses `Mistral-7B` as an open-source alternative
4. **FastAPI Server**: Provides REST API endpoints for document upload and querying

The RAG Agent flow is the following (a rough sketch follows the list):

1. Analyze the query type
2. Try to find relevant PDF context, regardless of query type
3. If PDF context is found, use it to generate a response
4. If no PDF context is found, or it's a general knowledge query, use the pre-trained LLM directly
5. Fall back to a "no information" response only in edge cases

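In rough Python pseudocode, that flow might look like the sketch below; every helper is a hypothetical placeholder, not the actual code in `rag_agent.py` or `local_rag_agent.py`:

```python
# Hedged sketch of the routing flow above. All helpers are hypothetical
# stand-ins for the real logic in rag_agent.py / local_rag_agent.py.
from typing import Optional

def classify_query(query: str) -> str:
    return "general"  # placeholder: the real agent asks the LLM to classify

def retrieve_pdf_context(query: str) -> Optional[str]:
    return None  # placeholder: the real agent queries ChromaDB

def generate_with_context(query: str, context: str) -> str:
    return f"[answer grounded in retrieved context: {context}]"

def generate_direct(query: str) -> str:
    return "[answer from the pre-trained LLM alone]"

def answer(query: str) -> str:
    query_type = classify_query(query)       # 1. analyze the query type
    context = retrieve_pdf_context(query)    # 2. always attempt PDF retrieval
    if context:                              # 3. context found: grounded answer
        return generate_with_context(query, context)
    if query_type == "general" or context is None:
        return generate_direct(query)        # 4. no context or general query
    return "No relevant information found."  # 5. edge-case fallback

print(answer("What is a depth-aware GAN?"))
```
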
## Contributing

This project is open source. Submit your contributions by forking this repository and opening a pull request! Oracle appreciates any contributions made by the open source community.

## License

Copyright (c) 2024 Oracle and/or its affiliates.

Licensed under the Universal Permissive License (UPL), Version 1.0.

See [LICENSE](../LICENSE) for more details.

ORACLE AND ITS AFFILIATES DO NOT PROVIDE ANY WARRANTY WHATSOEVER, EXPRESS OR IMPLIED, FOR ANY SOFTWARE, MATERIAL OR CONTENT OF ANY KIND CONTAINED OR PRODUCED WITHIN THIS REPOSITORY, AND IN PARTICULAR SPECIFICALLY DISCLAIM ANY AND ALL IMPLIED WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. FURTHERMORE, ORACLE AND ITS AFFILIATES DO NOT REPRESENT THAT ANY CUSTOMARY SECURITY REVIEW HAS BEEN PERFORMED WITH RESPECT TO ANY SOFTWARE, MATERIAL OR CONTENT CONTAINED OR PRODUCED WITHIN THIS REPOSITORY. IN ADDITION, AND WITHOUT LIMITING THE FOREGOING, THIRD PARTIES MAY HAVE POSTED SOFTWARE, MATERIAL OR CONTENT TO THIS REPOSITORY WITHOUT ANY REVIEW. USE AT YOUR OWN RISK.
