Feature/rag agent script #21
Open: Schimuneck wants to merge 5 commits into opendatahub-io:main from Schimuneck:feature/rag-agent-script
Changes from 3 commits
Commits (5):
- 443590d Add RAG agent script with multi-document support
- 7b0de73 feat: Simplified RAG agent script with comprehensive configuration an…
- 4dc6a89 docs: enhance README to emphasize development lifecycle and agent dep…
- 5904e5c Enhance RAG agent with advanced processing and better performance
- c46a047 feat: update RAG agent setup with improvements and ignore input files
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
# Python virtual environment | ||
venv/ | ||
__pycache__/ | ||
*.pyc | ||
|
||
# IDE files | ||
.vscode/ | ||
.idea/ | ||
|
||
# Environment variables | ||
.env | ||
|
||
# Logs | ||
*.log |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,249 @@ | ||
# Simple RAG Agent Demo | ||
|
||
A teaching example that **shows how to create RAG agents in llama-stack**. The demo takes PDF and TXT documents as input and deploys an agent with RAG capabilities, which makes it a quick starting point during development. | |
|
||
## Purpose | ||
|
||
This simple RAG script is designed to **facilitate the development lifecycle** by providing a quick and easy way to: | ||
- **Deploy agents rapidly** with RAG capabilities | ||
- **Process documents** (PDF and TXT) for knowledge base creation | ||
- **Create vector databases** automatically from your documents | ||
- **Set up AI agents** that can answer questions based on your specific documents | ||
- **Streamline the development process** for RAG-enabled applications | ||
|
||
## What is RAG? | ||
|
||
Retrieval Augmented Generation (RAG) is a technique that combines: | ||
1. **Document Retrieval**: Finding relevant information from a knowledge base | ||
2. **Text Generation**: Using an AI model to generate answers based on the retrieved information | ||
|
||
This approach helps AI models provide more accurate and up-to-date answers by grounding their responses in specific documents. | ||
|
||
## Development Lifecycle Benefits | ||
|
||
This script is particularly useful for: | ||
|
||
### 🚀 **Rapid Prototyping** | ||
- Quickly test RAG concepts with your documents | ||
- Iterate on agent configurations without complex setup | ||
- Validate document processing pipelines | ||
|
||
### 🔄 **Development Workflow** | ||
- Easy integration into CI/CD pipelines | ||
- Consistent agent creation across environments | ||
- Simplified testing of RAG functionality | ||
|
||
### 📚 **Document Processing** | ||
- Automated handling of PDF and TXT files | ||
- Built-in text extraction and chunking | ||
- Vector database setup without manual configuration | ||
|
||
### 🤖 **Agent Deployment** | ||
- One-command agent creation | ||
- Configurable agent parameters | ||
- Ready-to-use chat sessions | ||
|
||
## How This Demo Works | ||
|
||
The script demonstrates these simple steps: | ||
|
||
1. **📁 Load Documents**: Read text and PDF files from the `input_files` folder | ||
2. **🔄 Convert to Text**: Extract text content from different file formats | ||
3. **🗄️ Store in Vector DB**: Save documents in a searchable vector database | ||
4. **🤖 Create Agent**: Set up an AI agent that can query the documents | ||
5. **💬 Ask Questions**: Query the agent to get answers based on your documents | ||
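The first step above (finding the documents to load) can be sketched with the standard library alone. This is a minimal illustration, not the script's actual code: `find_input_files` is a hypothetical helper name, and the two constants simply mirror the values listed in the Configuration section.

```python
from pathlib import Path
from typing import List

# Values mirror the README's Configuration section.
INPUT_FOLDER = "input_files"
SUPPORTED_EXTENSIONS = [".txt", ".pdf"]


def find_input_files(folder: str = INPUT_FOLDER) -> List[Path]:
    """Return the supported files in `folder`, sorted by path.

    Hypothetical helper for illustration; the real script may differ.
    """
    root = Path(folder)
    if not root.is_dir():
        # A missing folder yields no documents instead of an exception.
        return []
    return sorted(
        p for p in root.iterdir()
        # Compare suffixes case-insensitively so "report.PDF" is accepted.
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

Because `pathlib` ships with Python, this step needs no extra dependency.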
|
||
## Prerequisites | ||
|
||
- Python 3.8+ | ||
- A running llama-stack instance (see setup below) | ||
- Some text or PDF files to process | ||
|
||
## Setup | ||
|
||
### 1. Install Dependencies | ||
|
||
```bash | ||
# Create and activate virtual environment | ||
python3 -m venv venv | ||
source venv/bin/activate # On Windows: venv\Scripts\activate | ||
|
||
# Install required packages | ||
pip install -r requirements.txt | ||
``` | ||
|
||
### 2. Start llama-stack | ||
|
||
Make sure you have llama-stack running and accessible. You can use port-forwarding to access it locally: | ||
|
||
```bash | ||
# If running on OpenShift | ||
oc port-forward svc/lsd-llama-milvus 8081:8081 | ||
|
||
# Or if running locally | ||
# Follow llama-stack installation instructions | ||
``` | ||
|
||
### 3. Add Your Documents | ||
|
||
Place your text (`.txt`) and PDF (`.pdf`) files in the `input_files` folder: | ||
|
||
``` | ||
input_files/ | ||
├── document1.txt | ||
├── document2.pdf | ||
└── ... | ||
``` | ||
|
||
## Usage | ||
|
||
### Run the RAG Setup | ||
|
||
```bash | ||
python setup_rag_agent.py | ||
``` | ||
|
||
The script will: | ||
- Load all documents from `input_files/` | ||
- Create a vector database | ||
- Set up a RAG agent | ||
- Provide you with the IDs and a curl command to query the agent | ||
|
||
### Query Your RAG Agent | ||
|
||
After running the script, you'll get a curl command like this: | ||
|
||
```bash | ||
curl -X POST http://localhost:8081/v1/agents/{agent_id}/session/{session_id}/turn \ | ||
-H "Content-Type: application/json" \ | ||
-d '{ | ||
"messages": [ | ||
{ | ||
"role": "user", | ||
"content": "What is this document about?" | ||
} | ||
], | ||
"stream": true | ||
}' | ||
``` | ||
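The same request can be assembled from Python. The sketch below only builds the request object with the standard library; the URL path and JSON body are copied from the curl command above, while `build_turn_request` is a hypothetical helper name and not part of the script or of llama-stack.

```python
import json
import urllib.request

# Same default as LLAMA_STACK_URL in the Configuration section.
LLAMA_STACK_URL = "http://localhost:8081"


def build_turn_request(agent_id: str, session_id: str, question: str,
                       base_url: str = LLAMA_STACK_URL) -> urllib.request.Request:
    """Build the POST request that the curl command above sends (illustrative)."""
    url = f"{base_url}/v1/agents/{agent_id}/session/{session_id}/turn"
    payload = {
        "messages": [{"role": "user", "content": question}],
        "stream": True,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it to a running llama-stack instance. Since `stream` is true, the response is expected to arrive incrementally rather than as a single JSON document.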
|
||
### Example Questions | ||
|
||
Try asking questions like: | ||
- "What is the main topic of the documents?" | ||
- "What are the key points mentioned?" | ||
- "Can you summarize the content?" | ||
- "What specific details are mentioned about [topic]?" | ||
|
||
## Configuration | ||
|
||
You can modify these settings at the top of `setup_rag_agent.py`: | ||
|
||
### Basic Settings | ||
```python | ||
LLAMA_STACK_URL = "http://localhost:8081" # Your llama-stack URL | ||
INFERENCE_MODEL = "vllm" # Model for generating answers | ||
EMBEDDING_MODEL = "granite-embedding-125m" # Model for embeddings | ||
AGENT_NAME = "Simple RAG Agent" # Custom name for your agent | ||
``` | ||
|
||
### Document Processing | ||
```python | ||
INPUT_FOLDER = "input_files" # Folder containing your documents | ||
SUPPORTED_EXTENSIONS = [".txt", ".pdf"] # File types to process | ||
CHUNK_SIZE_IN_TOKENS = 256 # Size of text chunks for vector database | ||
``` | ||
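To give a feel for what `CHUNK_SIZE_IN_TOKENS` controls, here is a rough sketch of fixed-size chunking. It is an approximation under stated assumptions: whitespace-separated words stand in for tokens, and `chunk_text` is a hypothetical name. The chunking that llama-stack performs uses a real tokenizer and may split text differently.

```python
from typing import List

CHUNK_SIZE_IN_TOKENS = 256  # same default as in the README


def chunk_text(text: str, chunk_size: int = CHUNK_SIZE_IN_TOKENS) -> List[str]:
    """Split text into chunks of at most `chunk_size` whitespace-separated words.

    Words are only a stand-in for model tokens in this illustration.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]
```

Smaller chunks tend to make retrieval more precise but give the model less surrounding context per hit; larger chunks do the opposite.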
|
||
### Vector Database | ||
```python | ||
VECTOR_DB_PROVIDER = "milvus" # Vector database provider | ||
VECTOR_DB_PREFIX = "simple-rag-db" # Prefix for vector database ID | ||
``` | ||
|
||
### RAG Agent Settings | ||
```python | ||
TOP_K = 3 # Number of most relevant chunks to retrieve | ||
SIMILARITY_THRESHOLD = 0.0 # Minimum similarity score for retrieval | ||
MAX_INFER_ITERS = 10 # Maximum inference iterations | ||
ENABLE_SESSION_PERSISTENCE = False # Whether to persist sessions | ||
``` | ||
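The interaction between `TOP_K` and `SIMILARITY_THRESHOLD` can be illustrated with a small, self-contained sketch. It assumes that a higher score means a more similar chunk and that the threshold is applied before the top-k cut; the diff does not show how the vector database provider actually applies these settings, and `select_chunks` is a hypothetical name.

```python
from typing import List, Tuple

TOP_K = 3                   # same defaults as in the README
SIMILARITY_THRESHOLD = 0.0


def select_chunks(scored: List[Tuple[str, float]],
                  top_k: int = TOP_K,
                  threshold: float = SIMILARITY_THRESHOLD) -> List[Tuple[str, float]]:
    """Keep chunks scoring at least `threshold`, then return the best `top_k`.

    `scored` holds (chunk_text, similarity_score) pairs; higher is assumed better.
    """
    kept = [item for item in scored if item[1] >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:top_k]
```

With the default threshold of `0.0` almost nothing is filtered out, so `TOP_K` alone decides how many chunks reach the model.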
|
||
### PDF Processing | ||
```python | ||
PDF_DO_OCR = False # Whether to perform OCR on PDFs | ||
PDF_DO_TABLE_STRUCTURE = True # Whether to extract table structures | ||
PDF_DO_CELL_MATCHING = True # Whether to perform cell matching in tables | ||
``` | ||
|
||
### Session & Logging | ||
```python | ||
SESSION_NAME = "simple-rag-session" # Name for the chat session | ||
LOG_LEVEL = "INFO" # Logging level (DEBUG, INFO, WARNING, ERROR) | ||
``` | ||
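A string setting such as `LOG_LEVEL` is typically mapped onto Python's `logging` module along these lines. This is a sketch only: `configure_logging` and the logger name `setup_rag_agent` are assumptions made for illustration, and the script's own logging setup may look different.

```python
import logging

LOG_LEVEL = "INFO"  # same default as in the README


def configure_logging(level_name: str = LOG_LEVEL) -> logging.Logger:
    """Translate a level name (DEBUG, INFO, WARNING, ERROR) into a configured logger."""
    level = getattr(logging, level_name.upper(), None)
    if not isinstance(level, int):
        raise ValueError(f"Unknown log level: {level_name!r}")
    # basicConfig only takes effect the first time the root logger is configured.
    logging.basicConfig(level=level, format="%(asctime)s %(levelname)s %(message)s")
    logger = logging.getLogger("setup_rag_agent")  # hypothetical logger name
    logger.setLevel(level)
    return logger
```

Setting the level to `DEBUG` is the usual first move when a document fails to load or the connection to llama-stack is refused.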
|
||
### Agent Instructions | ||
```python | ||
AGENT_INSTRUCTIONS = """You are a helpful assistant...""" # Custom instructions for the agent | ||
``` | ||
|
||
## Supported File Types | ||
|
||
- **Text files** (`.txt`): Plain text documents | ||
- **PDF files** (`.pdf`): PDF documents with text extraction and table structure | ||
|
||
## Troubleshooting | ||
|
||
### Connection Issues | ||
- Make sure llama-stack is running and accessible | ||
- Check the `LLAMA_STACK_URL` configuration | ||
- Verify port-forwarding is working | ||
|
||
### Document Processing Issues | ||
- Ensure files are in supported formats (`.txt`, `.pdf`) | ||
- Check file permissions and encoding | ||
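- For PDFs, make sure they contain extractable text; OCR is off by default (`PDF_DO_OCR = False`), so scanned, image-only PDFs will yield little or no text | |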
|
||
### Model Issues | ||
- Verify the specified models are available in your llama-stack | ||
- Check model names match exactly | ||
|
||
## Understanding the Code | ||
|
||
The script is structured in simple, clear functions: | ||
|
||
- `load_text_file()`: Reads plain text files | ||
- `load_pdf_file()`: Extracts text from PDFs using docling | ||
- `load_documents_from_folder()`: Processes all files in the input folder | ||
- `setup_vector_database()`: Creates and populates the vector database | ||
- `create_rag_agent()`: Sets up the AI agent with RAG capabilities | ||
- `create_session()`: Creates a chat session for the agent | ||
|
||
Each function has a single responsibility and clear error handling, making it easy to understand and modify. | ||
|
||
## Next Steps | ||
|
||
Once you understand this basic RAG setup, you can explore: | ||
|
||
### 🔧 **Development Enhancements** | ||
- **Custom agent configurations** for specific use cases | ||
- **Advanced document processing** pipelines | ||
- **Integration with CI/CD** for automated agent deployment | ||
- **Environment-specific configurations** (dev, staging, prod) | ||
|
||
### 🚀 **Production Deployment** | ||
- **Web interface** for agent management | ||
- **API endpoints** for programmatic agent creation | ||
- **Monitoring and logging** for agent performance | ||
- **Scalable vector database** configurations | ||
|
||
### 📊 **Advanced Features** | ||
- **Custom retrieval strategies** for better document matching | ||
- **Multi-modal document support** (images, audio, etc.) | ||
- **Real-time document updates** and re-indexing of the vector database | |
- **Performance optimization** for large document sets | ||
|
||
### 🔗 **Integration Possibilities** | ||
- **Chatbot interfaces** for end users | ||
- **Knowledge management systems** | ||
- **Documentation assistants** | ||
- **Customer support automation** |
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
The Simpsons – Overview | ||
The Simpsons is a long-running animated television sitcom created by Matt Groening. First airing in 1989, the show is set in the fictional town of Springfield and satirizes American culture, society, and television. The series centers around the Simpson family — Homer, Marge, Bart, Lisa, and Maggie — and their interactions with a diverse range of supporting characters. Known for its humor, cultural references, and social commentary, The Simpsons has become a cornerstone of modern pop culture and is one of the most influential TV shows in history. | ||
|
||
Homer Simpson | ||
Homer Jay Simpson is the bumbling but lovable father of the Simpson family. He works at the Springfield Nuclear Power Plant as a safety inspector, despite being lazy and incompetent. Homer is known for his love of donuts, beer (especially Duff Beer), and television. He's overweight, bald, and often acts impulsively, but he deeply loves his family in his own misguided way. | ||
|
||
Marge Simpson | ||
Marjorie "Marge" Simpson is the caring and patient matriarch of the family. She has a tall blue beehive hairstyle and is known for her moral integrity and common sense. Marge is a homemaker who often acts as the voice of reason in the chaotic Simpson household. She is deeply devoted to her husband and children, even when they drive her crazy. | ||
|
||
Bart Simpson | ||
Bartholomew JoJo "Bart" Simpson is the 10-year-old son of Homer and Marge. A rebellious troublemaker, Bart is known for his mischievous pranks, slingshot, and catchphrases like “Eat my shorts!” He struggles academically but is street-smart and clever. Bart often clashes with authority figures and is a constant source of stress for his teachers and Principal Skinner. | ||
|
||
Lisa Simpson | ||
Lisa Marie Simpson is the 8-year-old daughter of the Simpsons. Highly intelligent and talented, Lisa excels in school and plays the saxophone. She is a vegetarian, a Buddhist, and an environmentalist with strong social values. Despite being the middle child, younger than Bart and older only than Maggie, she is often the most mature member of the family and frequently challenges societal norms. | |
|
||
Maggie Simpson | ||
Margaret "Maggie" Simpson is the baby of the family. She rarely speaks but is known for her pacifier-sucking and surprising displays of intelligence. Maggie is observant and occasionally performs impressive feats for her age, often unnoticed by the adults around her. | ||
|
||
Grandpa Simpson | ||
Abraham "Abe" Simpson is Homer’s elderly father. A resident of the Springfield Retirement Castle, Grandpa often tells long-winded, rambling stories about his past. He’s forgetful and sometimes grumpy, but he genuinely cares for his family despite being a little out of touch. | ||
|
||
Milhouse Van Houten | ||
Milhouse is Bart’s best friend. He’s awkward, nerdy, and often the victim of school bullies. Milhouse is loyal but easily manipulated, and he has an unrequited crush on Lisa. | ||
|
||
Mr. Burns | ||
Charles Montgomery Burns is the wealthy, elderly owner of the Springfield Nuclear Power Plant. He’s greedy, ruthless, and power-hungry, often putting profit over safety or morality. Mr. Burns is one of the main antagonists of the series, frequently showing disregard for others, especially his employees. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
llama-stack-client>=0.1.0 | ||
pathlib>=1.0.1 | ||
fire>=0.5.0 | ||
requests>=2.31.0 | ||
docling>=0.1.0 | ||
pypdfium2>=4.0.0 | ||
Pillow>=9.0.0 | ||
numpy>=1.21.0 | ||
pandas>=1.3.0 |
🛠️ Refactor suggestion
Remove unnecessary pathlib dependency.
The `pathlib` package has been part of Python's standard library since Python 3.4 and should not be listed as an external dependency. This could cause installation issues or confusion.
-pathlib>=1.0.1