Commit ab75086

feat: Improve RAG agent flow - Use LLM directly for general knowledge queries - Update store path to embeddings/ - Add detailed logging - Update documentation
1 parent ac5fdea commit ab75086

4 files changed: +112 -30 lines changed


agentic_rag/README.md

Lines changed: 26 additions & 12 deletions
@@ -73,47 +73,53 @@ python pdf_processor.py --input path/to/pdf/directory --output chunks.json
 
 # Process a single PDF from a URL
 python pdf_processor.py --input https://example.com/document.pdf --output chunks.json
+# sample pdf: https://arxiv.org/pdf/2203.06605
 ```
 
 #### Manage Vector Store
 
 Add documents to the vector store and query them:
+
 ```bash
 # Add documents from a chunks file
-python store.py --add chunks.json --store-path my_chroma_db
+python store.py --add chunks.json
 
-# Query the vector store
-python store.py --query "your search query" --store-path my_chroma_db
+# Query the vector store directly, or with local_rag_agent.py
+python store.py --query "your search query"
+python local_rag_agent.py --query "your search query"
 ```
 
 #### Use RAG Agent
-Query documents using either the OpenAI or local model:
+
+Query documents using either OpenAI or a local model:
+
 ```bash
 # Using OpenAI (requires API key in .env)
-python rag_agent.py --query "What are the main topics?" --store-path my_chroma_db
+python rag_agent.py --query "Can you explain the DaGAN Approach proposed in the Depth-Aware Generative Adversarial Network for Talking Head Video Generation article?"
 
 # Using local Mistral model
-python local_rag_agent.py --query "What are the main topics?" --store-path my_chroma_db
+python local_rag_agent.py --query "Can you explain the DaGAN Approach proposed in the Depth-Aware Generative Adversarial Network for Talking Head Video Generation article?"
 ```
 
 ### 3. Complete Pipeline Example
 
 Here's how to process a document and query it using the local model:
+
 ```bash
 # 1. Process the PDF
 python pdf_processor.py --input example.pdf --output chunks.json
 
 # 2. Add to vector store
-python store.py --add chunks.json --store-path my_chroma_db
+python store.py --add chunks.json
 
 # 3. Query using local model
-python local_rag_agent.py --query "What is the main conclusion?" --store-path my_chroma_db
+python local_rag_agent.py --query "Can you explain the DaGAN Approach proposed in the Depth-Aware Generative Adversarial Network for Talking Head Video Generation article?"
 ```
 
 Or using OpenAI (requires API key):
 ```bash
 # Same steps 1 and 2 as above, then:
-python rag_agent.py --query "What is the main conclusion?" --store-path my_chroma_db
+python rag_agent.py --query "Can you explain the DaGAN Approach proposed in the Depth-Aware Generative Adversarial Network for Talking Head Video Generation article?"
 ```
 
 ## API Endpoints
@@ -153,11 +159,19 @@ The system consists of several key components:
 - Local Agent: Uses `Mistral-7B` as an open-source alternative
 4. **FastAPI Server**: Provides REST API endpoints for document upload and querying
 
+The RAG Agent flow is the following:
+
+1. Analyze the query type
+2. Try to find relevant PDF context, regardless of query type
+3. If PDF context is found, use it to generate a response
+4. If no PDF context is found, or if it is a general knowledge query, use the pre-trained LLM directly
+5. Fall back to a "no information" response only in edge cases
+
 ## Hardware Requirements
 
-- For OpenAI Agent: Standard CPU machine
-- For Local Agent:
-  - Minimum 16GB RAM, recommended more than 24GBs
+- For the OpenAI Agent: Standard CPU machine
+- For the Local Agent:
+  - Minimum 16GB RAM (recommended >24 GB)
   - GPU with 8GB VRAM recommended for better performance
   - Will run on CPU if GPU is not available, but will be significantly slower.

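For reference, the flow described in the updated README corresponds to roughly the following control flow (a condensed sketch; `answer` is a hypothetical helper, but the method and attribute names are taken from this commit's local_rag_agent.py, shown in the diff below):

```python
from typing import Any, Dict

def answer(agent, query: str) -> Dict[str, Any]:
    """Condensed sketch of LocalRAGAgent.process_query from this commit."""
    analysis = agent._analyze_query(query)                     # 1. analyze the query type
    if analysis.query_type == "unsupported":
        return {"answer": "I don't have the information to answer this query.",
                "reasoning": analysis.reasoning, "context": []}
    context = agent.vector_store.query_pdf_collection(query)   # 2. always try PDF context first
    if context:
        return agent._generate_response(query, context)        # 3. ground the answer in the PDFs
    return agent._generate_direct_response(query)              # 4./5. fall back to the bare LLM
```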
agentic_rag/local_rag_agent.py

Lines changed: 83 additions & 15 deletions
@@ -6,6 +6,15 @@
 import argparse
 import yaml
 import os
+import logging
+
+# Configure logging
+logging.basicConfig(
+    level=logging.INFO,
+    format='%(asctime)s - %(levelname)s - %(message)s',
+    datefmt='%H:%M:%S'
+)
+logger = logging.getLogger(__name__)
 
 class QueryAnalysis(BaseModel):
     """Pydantic model for query analysis output"""
@@ -107,42 +116,89 @@ def _analyze_query(self, query: str) -> QueryAnalysis:
             requires_context=True
         )
 
+    def _generate_direct_response(self, query: str) -> Dict[str, Any]:
+        """Generate a response directly from the LLM without context"""
+        logger.info("Generating direct response from LLM without context...")
+
+        prompt = f"""You are a helpful AI assistant. Please answer the following query to the best of your ability.
+If you're not confident about the answer, please say so.
+
+Query: {query}
+
+Answer:"""
+
+        logger.info("Generating response using local model...")
+        response = self._generate_text(prompt, max_length=1024)
+        logger.info("Response generation complete")
+
+        return {
+            "answer": response,
+            "context": []
+        }
+
     def process_query(self, query: str) -> Dict[str, Any]:
         """Process a user query using the agentic RAG pipeline"""
+        logger.info(f"Starting to process query: {query}")
+
         # Analyze the query
+        logger.info("Analyzing query type and context requirements...")
         analysis = self._analyze_query(query)
+        logger.info(f"Query analysis results:")
+        logger.info(f"- Type: {analysis.query_type}")
+        logger.info(f"- Requires context: {analysis.requires_context}")
+        logger.info(f"- Reasoning: {analysis.reasoning}")
 
         # If query type is unsupported, return early
         if analysis.query_type == "unsupported":
+            logger.warning("Query type is unsupported")
             return {
                 "answer": "I apologize, but I don't have the information to answer this query.",
                 "reasoning": analysis.reasoning,
                 "context": []
             }
 
-        # Retrieve relevant context based on query type
-        if analysis.query_type == "pdf_documents":
-            context = self.vector_store.query_pdf_collection(query)
-        else:
-            context = self.vector_store.query_general_collection(query)
+        # First try to get context from PDF documents
+        logger.info("Querying PDF collection...")
+        context = self.vector_store.query_pdf_collection(query)
+        logger.info(f"Retrieved {len(context)} context chunks")
 
-        # Generate response using context
-        if context and analysis.requires_context:
+        if context:
+            # If we found relevant PDF context, use it
+            for i, ctx in enumerate(context):
+                source = ctx["metadata"].get("source", "Unknown")
+                pages = ctx["metadata"].get("page_numbers", [])
+                logger.info(f"Context chunk {i+1}:")
+                logger.info(f"- Source: {source}")
+                logger.info(f"- Pages: {pages}")
+                logger.info(f"- Content preview: {ctx['content'][:100]}...")
+
+            logger.info("Generating response with PDF context...")
             response = self._generate_response(query, context)
-        else:
-            response = {
-                "answer": "I couldn't find relevant information to answer your query.",
-                "reasoning": analysis.reasoning,
-                "context": []
-            }
+            logger.info("Response generated successfully")
+            return response
+
+        # If no PDF context found or if it's a general knowledge query,
+        # use the LLM directly
+        if analysis.query_type == "general_knowledge" or not context:
+            logger.info("No relevant PDF context found or general knowledge query detected")
+            logger.info("Falling back to direct LLM response...")
+            return self._generate_direct_response(query)
 
-        return response
+        # This case should rarely happen, but just in case
+        logger.warning("No relevant context found and query type is not general knowledge")
+        return {
+            "answer": "I couldn't find relevant information to answer your query.",
+            "reasoning": analysis.reasoning,
+            "context": []
+        }
 
     def _generate_response(self, query: str, context: List[Dict[str, Any]]) -> Dict[str, Any]:
         """Generate a response using the retrieved context"""
+        logger.info("Preparing context for response generation...")
         context_str = "\n\n".join([f"Context {i+1}:\n{item['content']}"
                                    for i, item in enumerate(context)])
 
+        logger.info("Building prompt with context...")
         prompt = f"""Answer the following query using the provided context.
 If the context doesn't contain enough information to answer accurately,
 say so explicitly.
@@ -154,7 +210,9 @@ def _generate_response(self, query: str, context: List[Dict[str, Any]]) -> Dict[
 
 Answer:"""
 
+        logger.info("Generating response using local model...")
         response = self._generate_text(prompt, max_length=1024)
+        logger.info("Response generation complete")
 
         return {
             "answer": response,
@@ -164,16 +222,25 @@ def _generate_response(self, query: str, context: List[Dict[str, Any]]) -> Dict[
 def main():
     parser = argparse.ArgumentParser(description="Query documents using local Mistral model")
     parser.add_argument("--query", required=True, help="Query to process")
-    parser.add_argument("--store-path", default="chroma_db", help="Path to the vector store")
+    parser.add_argument("--store-path", default="embeddings", help="Path to the vector store")
     parser.add_argument("--model", default="mistralai/Mistral-7B-Instruct-v0.2", help="Model to use")
+    parser.add_argument("--quiet", action="store_true", help="Disable verbose logging")
 
     args = parser.parse_args()
 
+    # Set logging level based on quiet flag
+    if args.quiet:
+        logger.setLevel(logging.WARNING)
+    else:
+        logger.setLevel(logging.INFO)
+
     print("\nInitializing RAG agent...")
     print("=" * 50)
 
     try:
+        logger.info(f"Initializing vector store from: {args.store_path}")
         store = VectorStore(persist_directory=args.store_path)
+        logger.info("Initializing local RAG agent...")
         agent = LocalRAGAgent(store, model_name=args.model)
 
         print(f"\nProcessing query: {args.query}")
@@ -193,6 +260,7 @@ def main():
             print(f"- {source} (pages: {pages})")
 
     except Exception as e:
+        logger.error(f"Error during execution: {str(e)}", exc_info=True)
         print(f"\n✗ Error: {str(e)}")
         exit(1)
 

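A note on the new `--quiet` flag: it only raises the module logger's level, which is enough to silence the INFO records added throughout this file while keeping warnings and errors. A minimal, self-contained sketch of the mechanism (names mirror this commit):

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s',
    datefmt='%H:%M:%S'
)
logger = logging.getLogger(__name__)

logger.info("visible by default")     # emitted at INFO
logger.setLevel(logging.WARNING)      # what --quiet does
logger.info("suppressed when quiet")  # filtered by the logger's own level
logger.warning("still visible")       # warnings and errors still get through
```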
agentic_rag/main.py

Lines changed: 1 addition & 1 deletion
@@ -34,7 +34,7 @@
 pdf_processor = PDFProcessor()
 vector_store = VectorStore()
 
-# Initialize RAG agent - use OpenAI if API key is available, otherwise use local model
+# Initialize RAG agent - use OpenAI if API key is available, otherwise fall back to the local model (the default)
 openai_api_key = os.getenv("OPENAI_API_KEY")
 if openai_api_key:
     print("\nUsing OpenAI GPT-4 for RAG...")

agentic_rag/store.py

Lines changed: 2 additions & 2 deletions
@@ -5,7 +5,7 @@
 from chromadb.config import Settings
 
 class VectorStore:
-    def __init__(self, persist_directory: str = "chroma_db"):
+    def __init__(self, persist_directory: str = "embeddings"):
         """Initialize vector store with ChromaDB"""
         self.client = chromadb.PersistentClient(
             path=persist_directory,
@@ -113,7 +113,7 @@ def main():
     parser = argparse.ArgumentParser(description="Manage vector store")
     parser.add_argument("--add", help="JSON file containing chunks to add")
    parser.add_argument("--query", help="Query to search for")
-    parser.add_argument("--store-path", default="chroma_db", help="Path to vector store")
+    parser.add_argument("--store-path", default="embeddings", help="Path to vector store")
 
     args = parser.parse_args()
     store = VectorStore(persist_directory=args.store_path)

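With both entry points now defaulting to the same persist directory, the store can also be exercised on its own. A quick usage sketch (assuming store.py is importable from agentic_rag/; query_pdf_collection is the retrieval method the agent calls in this commit):

```python
from store import VectorStore

# With no argument, data now persists under embeddings/ rather than chroma_db/
store = VectorStore()

results = store.query_pdf_collection("your search query")
for r in results:
    # Each result carries the chunk text plus source metadata, as used by the agent's logging
    print(r["metadata"].get("source", "Unknown"), "-", r["content"][:80])
```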