Merged
87 commits
08cbe32
feat: updated files in testing csv json translation
jasperan Jan 8, 2025
74d075b
feat: added v1 files
jasperan Jan 14, 2025
ba48efd
feat: added config loading option for srt generation
jasperan Jan 14, 2025
5592894
feat: added config loading option for srt generation
jasperan Jan 14, 2025
76700cd
feat(reqs): updated to latest oci version
jasperan Jan 16, 2025
fdd8819
feat: updated instructions about multi identity domains
jasperan Jan 16, 2025
b4785b0
feat: print job id for traceability in object storage
jasperan Jan 16, 2025
09b3316
feat: added images for readme
jasperan Jan 16, 2025
a05f4d7
feat: updated instructions to latest version, with examples, code pre…
jasperan Jan 16, 2025
793ca1f
feat: re-enabled column-specific translations
jasperan Jan 16, 2025
7332f96
feat: improved data quality of input files, added output file example
jasperan Jan 16, 2025
4c5c06e
feat: updated to 60% width
jasperan Jan 17, 2025
7cb4fcc
feat: added images, added config example and updated readme
jasperan Jan 20, 2025
8b929e7
feat: added images, added config example and updated readme
jasperan Jan 20, 2025
474e209
feat: added output image
jasperan Jan 22, 2025
65ca514
fix(bucket_translation): update to use correct OCI Language API calls…
jasperan Jan 23, 2025
23ec454
chore(deps): update OCI SDK version requirement - Set minimum OCI SDK…
jasperan Jan 23, 2025
2095b56
chore(config): update example configuration - Ad language translation…
jasperan Jan 23, 2025
8ccfcef
feat(samples): add sample texts for testing - Add tchnology-focused E…
jasperan Jan 23, 2025
3f0e9a8
feat(bucket): enhance script output and monitoring - Add detailed buc…
jasperan Jan 23, 2025
bd00d0c
feat: added final output image
jasperan Jan 23, 2025
b5964fa
feat: added translated and original files
jasperan Jan 23, 2025
739d3f0
feat: removed file directive
jasperan Jan 23, 2025
647685c
feat: added images
jasperan Jan 23, 2025
f47f7bd
feat: changed html directive
jasperan Jan 23, 2025
2678d68
build: initial project dependencies - Add docling, langchain, OpenAI,…
jasperan Jan 25, 2025
683f2ca
feat(processor): implement PDF processing with Docling - Add PDFProce…
jasperan Jan 25, 2025
5f2dbd9
feat(store): implement vector store with ChromaDB - Add VectorStore c…
jasperan Jan 25, 2025
0f7bf0b
feat(agent): implement RAG agents with OpenAI and local models - Add …
jasperan Jan 25, 2025
57b781a
feat(api): implement FastAPI server - Add REST API endpoints for docu…
jasperan Jan 25, 2025
576818d
docs: add comprehensive documentation and usage examples
jasperan Jan 25, 2025
1176db6
feat(bucket): enhance script output and monitoring - Add detailed pro…
jasperan Jan 26, 2025
7c6e749
feat: added editorial updates
jasperan Jan 26, 2025
df90db0
feat: Add URL support for PDF processing - Add process_pdf_url method…
jasperan Jan 27, 2025
b3b4a9d
refactor: Flatten agentic_rag project structure - Move all core files…
jasperan Jan 28, 2025
e8371fe
cleanup: Remove old directory structure - Delete old subdirectories a…
jasperan Jan 28, 2025
31e7aac
docs: Update README with new execution commands
jasperan Jan 28, 2025
58dba31
feat: Make OpenAI optional and use local model by default - Update ma…
jasperan Jan 28, 2025
2fdb510
docs: Add HuggingFace authentication instructions - Add step-by-step …
jasperan Jan 28, 2025
653d6f2
feat: Use YAML config for HuggingFace token - Add confg.yaml support …
jasperan Jan 28, 2025
86dd608
feat: reflect new hidden files and dirs in agentic_rag
jasperan Jan 28, 2025
3f1fd46
fix: Add missing is_url function to pdf_processor - Add URL validatio…
jasperan Jan 28, 2025
1573aaf
fix: Improve PDF processing robustness - Handle both dictionary and D…
jasperan Jan 28, 2025
2308831
fix: Improve PDF processing robustness - Add better metadata extracti…
jasperan Jan 28, 2025
ac5fdea
fix: Improve error handling and warnings
jasperan Jan 28, 2025
ab75086
feat: Improve RAG agent flow - Use LLM directly for general knowledge…
jasperan Jan 30, 2025
45748aa
feat: Enhance RAG system with web processing - Add web content proces…
jasperan Feb 3, 2025
baa03dc
feat: updated gitignore
jasperan Feb 4, 2025
faa6528
feat: Enhance speech-to-text transcription - Add detailed logging sys…
jasperan Feb 4, 2025
55597ec
feat: added translated images
jasperan Feb 4, 2025
2db44c4
feat: added translated and original docs to docs/
jasperan Feb 4, 2025
0b84292
feat: updated file names in docs/
jasperan Feb 4, 2025
094c8c4
feat: removed stale files
jasperan Feb 4, 2025
f082c04
feat(rag): implement Chain of Thought prompting in RAG agents - Add u…
jasperan Feb 17, 2025
e26061c
feat(api): expose Chain of Thought option in API - Add use_cot parame…
jasperan Feb 17, 2025
c4c43eb
test(rag): add Chain of Thought comparison tests - Create test suite …
jasperan Feb 17, 2025
0275ff3
docs(rag): add Chain of Thought documentation - Document CoT feature …
jasperan Feb 17, 2025
d20670c
build(deps): add Gradio and update dependencies - Add Gradio for web …
jasperan Feb 17, 2025
e56ac5d
feat(ui): add Gradio web interface - Create interactive web UI with d…
jasperan Feb 17, 2025
b5f5e3c
docs: add Gradio interface documentation - Add instructions for launc…
jasperan Feb 17, 2025
4e0a2b5
fix(ui): correct web content processing - Fix method name mismatch in…
jasperan Feb 17, 2025
33d7743
fix(web): handle metadata extraction failures - Add proper object to …
jasperan Feb 17, 2025
18a1013
feat(web): add special handling for diferent domains - Add support fo…
jasperan Feb 17, 2025
806ccac
chore: add gitignore file - Exclude Python cache files - Ignore Gradi…
jasperan Feb 17, 2025
4036328
docs: update README with complete project documentation - Add detaile…
jasperan Feb 18, 2025
bd7b8a7
fix: improve PDF processor error handling and reduce token size - Add…
jasperan Feb 18, 2025
554e0d6
fix: add document_id handling in PDF processor - Generate unique docu…
jasperan Feb 18, 2025
75b195e
fix: update PDF processing in web interfaces - Update gradio_app to h…
jasperan Feb 18, 2025
a64f264
fix: update Gradio chat function to properly format messages - Fix ch…
jasperan Feb 18, 2025
dd1ac41
feat: enhance RAG agents and PDF processing
jasperan Feb 19, 2025
f934e53
fix: correct HybridChunker initialization and chunk size setting
jasperan Feb 19, 2025
314c7d6
feat: add Spanish language support
jasperan Feb 19, 2025
5d2d6f3
feat: add repository processing and language fix - Add RepoProcessor …
jasperan Feb 19, 2025
a6052b0
fix: improve repo processor error handling and output - Fix error wit…
jasperan Feb 19, 2025
7e1628b
fix: improve content type handling in repo processor
jasperan Feb 19, 2025
a106d83
feat(agentic_rag): Enhance reository processing and collection-specif…
jasperan Feb 19, 2025
78729be
feat: Implement multi-agent Chain of Thought system
jasperan Feb 21, 2025
ed54773
feat: move file
jasperan Feb 21, 2025
27f514d
feat: Enhance CoT display in Gradio interface
jasperan Feb 21, 2025
f1c4a33
fix: Handle missing researcher agent in CoT
jasperan Feb 21, 2025
05836c2
fix: Enhance error handling in RAG agents
jasperan Feb 21, 2025
3b7ca5c
fix: Improve Chain of Thought handling in Gradio interface
jasperan Feb 21, 2025
0a91bac
fix: Properly define agent fields in Pydantic models
jasperan Feb 21, 2025
6de3ca8
feat: Add comprehensive logging system
jasperan Feb 21, 2025
baf2ed2
feat: Improve Gradio interface
jasperan Feb 21, 2025
f49444e
feat: Optimize agent prompts and improve CoT visibility
jasperan Feb 21, 2025
9d81563
feat: Enhance RAG in a Box with multi-agent CoT and deployment options
jasperan Feb 24, 2025
6 changes: 5 additions & 1 deletion .gitignore
@@ -35,4 +35,8 @@ Temporary Items
oci-language-translation/config.yaml
oci-subtitle-translation/config.yaml
oci-csv-json-translation/config.yaml
oci-language-multiple-translation/config.yaml
oci-language-multiple-translation/config.yaml

agentic_rag/config.yaml
agentic_rag/chroma_db
agentic_rag/embeddings
29 changes: 29 additions & 0 deletions agentic_rag/.gitignore
@@ -0,0 +1,29 @@
# Python
__pycache__/
*.py[cod]
*$py.class

# Virtual Environment
venv/
env/
.env

# IDE
.vscode/
.idea/

# Gradio
.gradio/

# Generated files
embeddings/
chroma_db/
docs/*.json

# Distribution / packaging
dist/
build/
*.egg-info/

# Logs
*.log
333 changes: 333 additions & 0 deletions agentic_rag/README.md
@@ -0,0 +1,333 @@
# Agentic RAG System

## Introduction

An intelligent RAG (Retrieval Augmented Generation) system that uses an LLM agent to make decisions about information retrieval and response generation. The system processes PDF documents and can intelligently decide which knowledge base to query based on the user's question.

The system has the following features:

- Intelligent query routing
- PDF processing using Docling for accurate text extraction and chunking
- Persistent vector storage with ChromaDB (PDF and Websites)
- Smart context retrieval and response generation
- FastAPI-based REST API for document upload and querying
- Support for both OpenAI-based agents and local, transformer-based agents (`Mistral-7B` by default)
- Optional Chain of Thought (CoT) reasoning for more detailed and structured responses

## 0. Prerequisites and setup

### Prerequisites

- Python 3.8 or higher
- OpenAI API key (optional, for OpenAI-based agent)
- HuggingFace token (optional, for local Mistral model)

### Hardware Requirements

- For the OpenAI Agent: Standard CPU machine
- For the Local Agent:
- Minimum 16 GB RAM (more than 24 GB recommended)
- GPU with 8GB VRAM recommended for better performance
- Will run on CPU if GPU is not available, but will be significantly slower.

### Setup

1. Clone the repository and install dependencies:

```bash
git clone https://github.com/oracle-devrel/devrel-labs.git
cd devrel-labs/agentic_rag
pip install -r requirements.txt
```

2. Authenticate with HuggingFace:

The system uses `Mistral-7B` by default, which requires authentication with HuggingFace:

a. Create a HuggingFace account [here](https://huggingface.co/join), if you don't have one yet.

b. Accept the Mistral-7B model terms & conditions [here](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)

c. Create an access token [here](https://huggingface.co/settings/tokens)

d. Create a `config.yaml` file (you can copy from `config_example.yaml`), and add your HuggingFace token:
```yaml
HUGGING_FACE_HUB_TOKEN: your_token_here
```

3. (Optional) If you want to use the OpenAI-based agent instead of the default local model, create a `.env` file with your OpenAI API key:

```bash
OPENAI_API_KEY=your-api-key-here
```

If no API key is provided, the system will automatically download and use `Mistral-7B-Instruct-v0.2` for text generation when using the local model. No additional configuration is needed.

## 1. Getting Started

You can launch this solution in three ways:

### 1. Using the Complete REST API

Start the API server:

```bash
python main.py
```

The API will be available at `http://localhost:8000`. You can then use the API endpoints as described in the API Endpoints section below.
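For example, a client can call the server's `/query` endpoint from Python. This is an illustrative sketch, not part of the project: the payload shape follows the API Endpoints annex (`query`, and the optional `use_cot` flag described in the CoT section), and only the standard library is used.

```python
import json
from urllib import request

API_URL = "http://localhost:8000"  # default address used by main.py

def build_query_payload(query: str, use_cot: bool = False) -> dict:
    """Build the JSON body expected by the POST /query endpoint."""
    payload = {"query": query}
    if use_cot:
        payload["use_cot"] = True  # opt in to Chain of Thought reasoning
    return payload

def query_rag(query: str, use_cot: bool = False) -> dict:
    """POST a query to the running API server and return the JSON response."""
    body = json.dumps(build_query_payload(query, use_cot)).encode()
    req = request.Request(f"{API_URL}/query", data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())

# query_rag("Can you explain the DaGAN Approach?")  # needs `python main.py` running
```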

### 2. Using the Gradio Interface

The system provides a user-friendly web interface using Gradio, which allows you to:
- Upload and process PDF documents
- Process web content from URLs
- Chat with your documents using either local or OpenAI models
- Toggle Chain of Thought reasoning

To launch the interface:

```bash
python gradio_app.py
```

This will start the Gradio server and automatically open the interface in your default browser at `http://localhost:7860`. The interface has two main tabs:

1. **Document Processing**:
- Upload PDFs using the file uploader
- Process web content by entering URLs
- View processing status and results

2. **Chat Interface**:
- Select between Local (Mistral) and OpenAI models
- Toggle Chain of Thought reasoning for more detailed responses
- Chat with your documents using natural language
- Clear chat history as needed

Note: The interface will automatically detect available models based on your configuration:
- Local Mistral model requires HuggingFace token in `config.yaml`
- OpenAI model requires API key in `.env` file

### 3. Using Individual Python Components via Command Line

#### Process PDFs

To process a PDF file and save the chunks to a JSON file, run:

```bash
# Process a single PDF
python pdf_processor.py --input path/to/document.pdf --output chunks.json

# Process multiple PDFs in a directory
python pdf_processor.py --input path/to/pdf/directory --output chunks.json

# Process a single PDF from a URL
python pdf_processor.py --input https://example.com/document.pdf --output chunks.json
# sample pdf: https://arxiv.org/pdf/2203.06605
```

#### Process Websites with Trafilatura

Process a single website and save the content to a JSON file:

```bash
python web_processor.py --input https://example.com --output docs/web_content.json
```

Or, process multiple URLs from a file and save them into a single JSON file:

```bash
python web_processor.py --input urls.txt --output docs/web_content.json
```

#### Manage Vector Store

To add documents to the vector store and query them, run:

```bash
# Add documents from a chunks file, by default to the pdf_collection
python store.py --add chunks.json
# for websites, use the --add-web flag
python store.py --add-web docs/web_content.json

# Query the vector store directly (both the pdf and web collections)
# the LLM decides which collection to query based on your input
python store.py --query "your search query"
python local_rag_agent.py --query "your search query"
```

#### Use RAG Agent

To query documents using either OpenAI or a local model, run:

```bash
# Using OpenAI (requires API key in .env)
python rag_agent.py --query "Can you explain the DaGAN Approach proposed in the Depth-Aware Generative Adversarial Network for Talking Head Video Generation article?"

# Using local Mistral model
python local_rag_agent.py --query "Can you explain the DaGAN Approach proposed in the Depth-Aware Generative Adversarial Network for Talking Head Video Generation article?"
```

### 4. Complete Pipeline Example

First, we process a document and add it to the vector store. Then, we query the knowledge base to see the RAG system in action, using either the local model or OpenAI.

```bash
# 1. Process the PDF
python pdf_processor.py --input example.pdf --output chunks.json

# Alternatively, process directly from a URL:
# python pdf_processor.py --input https://arxiv.org/pdf/2203.06605 --output chunks.json

# 2. Add to vector store
python store.py --add chunks.json

# 3. Query using local model
python local_rag_agent.py --query "Can you explain the DaGAN Approach proposed in the Depth-Aware Generative Adversarial Network for Talking Head Video Generation article?"

# Or using OpenAI (requires API key):
python rag_agent.py --query "Can you explain the DaGAN Approach proposed in the Depth-Aware Generative Adversarial Network for Talking Head Video Generation article?"
```

## 2. Chain of Thought (CoT) Support

The system implements an advanced multi-agent Chain of Thought system, allowing complex queries to be broken down and processed through multiple specialized agents. This feature enhances the reasoning capabilities of both local and cloud-based models.

### Multi-Agent System

The CoT system consists of four specialized agents:

1. **Planner Agent**: Breaks down complex queries into clear, manageable steps
2. **Research Agent**: Gathers and analyzes relevant information from knowledge bases
3. **Reasoning Agent**: Applies logical analysis to information and draws conclusions
4. **Synthesis Agent**: Combines multiple pieces of information into a coherent response
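Conceptually, the four agents form a pipeline in which each stage transforms the output of the previous one. The sketch below is only an illustration of that data flow, not the project's actual implementation: the real agents wrap LLM calls, while these placeholder functions use trivial string logic.

```python
from typing import List

def planner(query: str) -> List[str]:
    """Planner Agent: break the query into steps (placeholder heuristic)."""
    return [f"Research: {query}", f"Reason about: {query}"]

def researcher(steps: List[str]) -> List[str]:
    """Research Agent: gather context for each step (placeholder)."""
    return [f"findings for '{step}'" for step in steps]

def reasoner(findings: List[str]) -> str:
    """Reasoning Agent: draw conclusions from the findings (placeholder)."""
    return "; ".join(findings)

def synthesizer(query: str, conclusions: str) -> str:
    """Synthesis Agent: combine everything into one answer (placeholder)."""
    return f"Answer to '{query}': {conclusions}"

def answer_with_cot(query: str) -> str:
    """Chain the four agents: plan -> research -> reason -> synthesize."""
    steps = planner(query)
    findings = researcher(steps)
    conclusions = reasoner(findings)
    return synthesizer(query, conclusions)
```

The key design point is that each agent only sees structured output from the previous stage, which is what makes the intermediate reasoning inspectable in the CoT output shown below.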

### Using CoT

You can activate the multi-agent CoT system in several ways:

1. **Command Line**:
```bash
# Using local Mistral model (default)
python local_rag_agent.py --query "your query" --use-cot

# Using OpenAI model
python rag_agent.py --query "your query" --use-cot
```

2. **Testing the System**:
```bash
# Test with local model (default)
python tests/test_new_cot.py

# Test with OpenAI model
python tests/test_new_cot.py --model openai
```

3. **API Endpoint**:
```http
POST /query
Content-Type: application/json

{
"query": "your query",
"use_cot": true
}
```

### Example Output

When CoT is enabled, the system will show:
- The initial plan for answering the query
- Research findings for each step
- Reasoning process and conclusions
- Final synthesized answer
- Sources used from the knowledge base

Example:
```
Step 1: Planning
- Break down the technical components
- Identify key features
- Analyze implementation details

Step 2: Research
[Research findings for each step...]

Step 3: Reasoning
[Logical analysis and conclusions...]

Final Answer:
[Comprehensive response synthesized from all steps...]

Sources used:
- document.pdf (pages: 1, 2, 3)
- implementation.py
```

### Benefits

The multi-agent CoT approach offers several advantages:
- More structured and thorough analysis of complex queries
- Better integration with knowledge bases
- Transparent reasoning process
- Improved answer quality through specialized agents
- Works with both local and cloud-based models

## Annex: API Endpoints

### Upload PDF

```http
POST /upload/pdf
Content-Type: multipart/form-data

file: <pdf-file>
```

This endpoint uploads and processes a PDF file, storing its contents in the vector database.

### Query

```http
POST /query
Content-Type: application/json

{
"query": "your question here"
}
```

This endpoint processes a query through the agentic RAG pipeline and returns a response with context.

## Annex: Architecture

The system consists of several key components:

1. **PDF Processor**: Uses Docling to extract and chunk text from PDF documents
2. **Vector Store**: Manages document embeddings and similarity search using ChromaDB
3. **RAG Agent**: Makes intelligent decisions about query routing and response generation
- OpenAI Agent: Uses `gpt-4-turbo-preview` for high-quality responses, but requires an OpenAI API key
- Local Agent: Uses `Mistral-7B` as an open-source alternative
4. **FastAPI Server**: Provides REST API endpoints for document upload and querying

The RAG Agent flow is the following:

1. Analyze the query type
2. Try to find relevant PDF context, regardless of query type
3. If PDF context is found, use it to generate a response
4. If no PDF context is found, or if it is a general knowledge query, use the pre-trained LLM directly
5. Fall back to a "no information" response only in edge cases
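The routing flow above can be condensed into a few lines. This is a sketch under stated assumptions: `retrieve_pdf_context` and `llm_generate` are hypothetical stand-ins for the real ChromaDB retrieval and Mistral/GPT generation calls.

```python
def answer(query, retrieve_pdf_context, llm_generate):
    """Route a query: prefer retrieved PDF context, fall back to the LLM."""
    context = retrieve_pdf_context(query)          # always try retrieval first
    if context:
        return llm_generate(query, context=context)   # grounded response
    response = llm_generate(query, context=None)      # direct LLM answer
    return response or "No relevant information found."  # edge-case fallback
```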

## Contributing

This project is open source. Please contribute by forking this repository and submitting a pull request! Oracle appreciates any contributions made by the open source community.

## License

Copyright (c) 2024 Oracle and/or its affiliates.

Licensed under the Universal Permissive License (UPL), Version 1.0.

See [LICENSE](../LICENSE) for more details.

ORACLE AND ITS AFFILIATES DO NOT PROVIDE ANY WARRANTY WHATSOEVER, EXPRESS OR IMPLIED, FOR ANY SOFTWARE, MATERIAL OR CONTENT OF ANY KIND CONTAINED OR PRODUCED WITHIN THIS REPOSITORY, AND IN PARTICULAR SPECIFICALLY DISCLAIM ANY AND ALL IMPLIED WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. FURTHERMORE, ORACLE AND ITS AFFILIATES DO NOT REPRESENT THAT ANY CUSTOMARY SECURITY REVIEW HAS BEEN PERFORMED WITH RESPECT TO ANY SOFTWARE, MATERIAL OR CONTENT CONTAINED OR PRODUCED WITHIN THIS REPOSITORY. IN ADDITION, AND WITHOUT LIMITING THE FOREGOING, THIRD PARTIES MAY HAVE POSTED SOFTWARE, MATERIAL OR CONTENT TO THIS REPOSITORY WITHOUT ANY REVIEW. USE AT YOUR OWN RISK.