feat: Enhance RAG system with web processing

- Add web content processing support with Trafilatura
- Update requirements.txt with new dependencies
- Add documentation and example outputs
- Improve store.py for handling web content
- Add example processed content in docs/gan.json
`agentic_rag/README.md` (36 additions, 24 deletions)
@@ -5,11 +5,11 @@ An intelligent RAG (Retrieval Augmented Generation) system that uses an LLM agen

The system has the following features:

- Intelligent query routing
- PDF processing using Docling for accurate text extraction and chunking
- Persistent vector storage with ChromaDB (PDF and Websites)
- Smart context retrieval and response generation
- FastAPI-based REST API for document upload and querying
- Support for both OpenAI-based agents or local, transformer-based agents (`Mistral-7B` by default)

## Setup
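The "intelligent query routing" feature listed above can be pictured with a small heuristic router. This is an illustrative sketch only, not the project's actual implementation (the function name `route_query` and the keyword-overlap heuristic are made up; a real agentic router typically asks the LLM itself to choose a tool):

```python
def route_query(query: str, indexed_sources: list[str]) -> str:
    """Decide whether a query should hit the vector store or go straight to the LLM.

    Illustrative heuristic only: route to retrieval when the query mentions
    a document that has been indexed, otherwise answer from the model alone.
    """
    q = query.lower()
    if any(src.lower() in q for src in indexed_sources):
        return "vector_store"   # retrieve chunks, then generate
    return "direct_llm"         # answer from the model's own knowledge


# Example: a store that has indexed a paper about DaGAN
sources = ["DaGAN", "Talking Head Video Generation"]
print(route_query("Can you explain the DaGAN approach?", sources))  # vector_store
print(route_query("What is the capital of France?", sources))       # direct_llm
```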
@@ -23,15 +23,15 @@

2. Authenticate with HuggingFace:

The system uses `Mistral-7B` by default, which requires authentication with HuggingFace:

a. Create a HuggingFace account [here](https://huggingface.co/join), if you don't have one yet.

b. Accept the Mistral-7B model terms & conditions [here](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)

c. Create an access token [here](https://huggingface.co/settings/tokens)

d. Create a `config.yaml` file (you can copy from `config_example.yaml`), and add your HuggingFace token:

```yaml
HUGGING_FACE_HUB_TOKEN: your_token_here
```
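For the flat one-key `config.yaml` shown above, the token can be read with a few lines of standard-library Python. This is a minimal sketch with a hypothetical helper name (`load_hf_token`); real code would use `yaml.safe_load` and pass the result to `huggingface_hub.login`:

```python
from pathlib import Path


def load_hf_token(path: str = "config.yaml") -> str:
    """Read HUGGING_FACE_HUB_TOKEN from a flat 'key: value' YAML file.

    Minimal stand-in for yaml.safe_load, enough for the one-key config above.
    """
    for line in Path(path).read_text().splitlines():
        if line.strip().startswith("HUGGING_FACE_HUB_TOKEN:"):
            return line.split(":", 1)[1].strip()
    raise KeyError("HUGGING_FACE_HUB_TOKEN not found in " + path)
```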
@@ -42,11 +42,11 @@

```
OPENAI_API_KEY=your-api-key-here
```

If no API key is provided, the system will automatically download and use `Mistral-7B-Instruct-v0.2` for text generation when using the local model. No additional configuration is needed.

## 1. Getting Started

You can launch this solution in three ways:

### 1. Using the Complete REST API
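The fallback behavior described above (use OpenAI when a key is present, otherwise the local model) boils down to a simple environment check. A hedged sketch, with a hypothetical helper name (`pick_backend` is not from the repo):

```python
import os


def pick_backend() -> str:
    """Choose the generation backend as the README describes:
    OpenAI when OPENAI_API_KEY is set, otherwise the local
    Mistral-7B-Instruct-v0.2 model downloaded from HuggingFace.
    """
    if os.getenv("OPENAI_API_KEY"):
        return "openai"
    return "mistralai/Mistral-7B-Instruct-v0.2"
```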
@@ -58,11 +58,11 @@ python main.py

The API will be available at `http://localhost:8000`. You can then use the API endpoints as described in the API Endpoints section below.

### 2. Using Individual Python Components via Command Line

#### Process PDFs

To process a PDF file and save the chunks to a JSON file, run:
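The PDF-processing command itself is truncated out of this diff. As a rough sketch of the chunks-to-JSON idea, here is a simplified stand-in: the repo uses Docling for extraction, whereas this toy splits plain text into overlapping character windows, and the function names and JSON field names (`chunk_id`, `text`) are assumptions, not Docling's actual schema:

```python
import json


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[dict]:
    """Split text into overlapping character chunks, the typical shape
    of extractor output fed into a vector store."""
    chunks, start = [], 0
    while start < len(text):
        piece = text[start:start + chunk_size]
        chunks.append({"chunk_id": len(chunks), "text": piece})
        if start + chunk_size >= len(text):
            break
        start += chunk_size - overlap
    return chunks


def save_chunks(text: str, out_path: str) -> None:
    """Write the chunk list to a JSON file, mirroring the CLI's output step."""
    with open(out_path, "w") as f:
        json.dump(chunk_text(text), f, indent=2)
```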
To query documents using either OpenAI or a local model, run:

```bash
# Using OpenAI (requires API key in .env)
python rag_agent.py --query "Can you explain the DaGAN Approach proposed in the Depth-Aware Generative Adversarial Network for Talking Head Video Generation article?"
```
@@ -103,7 +118,7 @@

### 3. Complete Pipeline Example

First, we process a document and query it using the local model. Then, we add the document to the vector store and query from the knowledge base to get the RAG system in action.

```bash
python local_rag_agent.py --query "Can you explain the DaGAN Approach proposed in the Depth-Aware Generative Adversarial Network for Talking Head Video Generation article?"

# Or using OpenAI (requires API key):
python rag_agent.py --query "Can you explain the DaGAN Approach proposed in the Depth-Aware Generative Adversarial Network for Talking Head Video Generation article?"
```
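The retrieve-then-generate flow behind those commands can be sketched end to end with a toy retriever. This is illustrative only: the real system uses ChromaDB embeddings, while this sketch ranks chunks by bag-of-words cosine similarity, and all names here are made up for the example:

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    num = sum(a[w] * b[w] for w in set(a) & set(b))
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query (stand-in for a vector store)."""
    q = Counter(query.lower().split())
    return sorted(chunks, key=lambda c: cosine(q, Counter(c.lower().split())), reverse=True)[:k]


chunks = [
    "DaGAN uses depth-aware attention for talking head generation.",
    "ChromaDB stores embeddings persistently on disk.",
    "FastAPI exposes upload and query endpoints.",
]
context = retrieve("how does DaGAN generate talking head video?", chunks, k=1)
# The retrieved context is then stuffed into the LLM prompt:
prompt = "Answer using this context:\n" + "\n".join(context)
```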