Feature/rag agent script #21

Open
wants to merge 5 commits into
base: main
Changes from 3 commits
14 changes: 14 additions & 0 deletions demos/local/simple_rag/.gitignore
@@ -0,0 +1,14 @@
# Python virtual environment
venv/
__pycache__/
*.pyc

# IDE files
.vscode/
.idea/

# Environment variables
.env

# Logs
*.log
249 changes: 249 additions & 0 deletions demos/local/simple_rag/README.md
@@ -0,0 +1,249 @@
# Simple RAG Agent Demo

A teaching example that shows **how to create RAG agents in llama-stack**. The demo takes PDF and TXT documents as input and sets up an agent that can answer questions about them, which makes it a quick starting point during development.

## Purpose

This script **supports the development lifecycle** by giving you a quick way to:
- **Deploy agents rapidly** with RAG capabilities
- **Process documents** (PDF and TXT) for knowledge base creation
- **Create vector databases** automatically from your documents
- **Set up AI agents** that can answer questions based on your specific documents
- **Streamline the development process** for RAG-enabled applications

## What is RAG?

Retrieval Augmented Generation (RAG) is a technique that combines:
1. **Document Retrieval**: Finding relevant information from a knowledge base
2. **Text Generation**: Using an AI model to generate answers based on the retrieved information

This approach helps AI models provide more accurate and up-to-date answers by grounding their responses in specific documents.
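
To make the two steps concrete, here is a toy sketch in plain Python. It is not how llama-stack implements retrieval: word-count vectors stand in for a real embedding model, and `embed`, `retrieve`, and `build_prompt` are hypothetical names used only for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, top_k=1):
    """Step 1 (retrieval): rank chunks by similarity to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, context_chunks):
    """Step 2 (generation input): ground the answer in the retrieved text."""
    context = "\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


chunks = [
    "Homer works at the Springfield Nuclear Power Plant.",
    "Lisa plays the saxophone and excels in school.",
]
best = retrieve("Who plays the saxophone?", chunks)
prompt = build_prompt("Who plays the saxophone?", best)
```

In a real deployment the prompt would be sent to the inference model; here it is only constructed so the flow from retrieval to generation is visible.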

## Development Lifecycle Benefits

This script is particularly useful for:

### 🚀 **Rapid Prototyping**
- Quickly test RAG concepts with your documents
- Iterate on agent configurations without complex setup
- Validate document processing pipelines

### 🔄 **Development Workflow**
- Easy integration into CI/CD pipelines
- Consistent agent creation across environments
- Simplified testing of RAG functionality

### 📚 **Document Processing**
- Automated handling of PDF and TXT files
- Built-in text extraction and chunking
- Vector database setup without manual configuration

### 🤖 **Agent Deployment**
- One-command agent creation
- Configurable agent parameters
- Ready-to-use chat sessions

## How This Demo Works

The script demonstrates these simple steps:

1. **📁 Load Documents**: Read text and PDF files from the `input_files` folder
2. **🔄 Convert to Text**: Extract text content from different file formats
3. **🗄️ Store in Vector DB**: Save documents in a searchable vector database
4. **🤖 Create Agent**: Set up an AI agent that can query the documents
5. **💬 Ask Questions**: Query the agent to get answers based on your documents

## Prerequisites

- Python 3.8+
- A running llama-stack instance (see setup below)
- Some text or PDF files to process

## Setup

### 1. Install Dependencies

```bash
# Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate

# Install required packages
pip install -r requirements.txt
```

### 2. Start llama-stack

Make sure you have llama-stack running and accessible. You can use port-forwarding to access it locally:

```bash
# If running on OpenShift
oc port-forward svc/lsd-llama-milvus 8081:8081

# Or if running locally
# Follow llama-stack installation instructions
```

### 3. Add Your Documents

Place your text (`.txt`) and PDF (`.pdf`) files in the `input_files` folder:

```
input_files/
├── document1.txt
├── document2.pdf
└── ...
```

## Usage

### Run the RAG Setup

```bash
python setup_rag_agent.py
```

The script will:
- Load all documents from `input_files/`
- Create a vector database
- Set up a RAG agent
- Provide you with the IDs and a curl command to query the agent

### Query Your RAG Agent

After running the script, you'll get a curl command like this:

```bash
curl -X POST http://localhost:8081/v1/agents/{agent_id}/session/{session_id}/turn \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": "What is this document about?"
      }
    ],
    "stream": true
  }'
```
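
The same request can be assembled from Python. The sketch below uses only the standard library and copies the endpoint path and payload from the curl command; `build_turn_request` is a hypothetical helper, and the IDs are placeholders for the values the script prints. It builds the request without sending it, and the streamed response format is not covered here.

```python
import json
import urllib.request


def build_turn_request(base_url, agent_id, session_id, question, stream=True):
    """Build (but do not send) the POST request shown in the curl command."""
    url = f"{base_url}/v1/agents/{agent_id}/session/{session_id}/turn"
    payload = {
        "messages": [{"role": "user", "content": question}],
        "stream": stream,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder IDs; substitute the values printed by setup_rag_agent.py.
req = build_turn_request(
    "http://localhost:8081",
    "my-agent-id",
    "my-session-id",
    "What is this document about?",
)

# To send it against a running llama-stack instance:
# with urllib.request.urlopen(req) as resp:
#     for line in resp:
#         print(line.decode("utf-8"), end="")
```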

### Example Questions

Try asking questions like:
- "What is the main topic of the documents?"
- "What are the key points mentioned?"
- "Can you summarize the content?"
- "What specific details are mentioned about [topic]?"

## Configuration

You can modify these settings at the top of `setup_rag_agent.py`:

### Basic Settings
```python
LLAMA_STACK_URL = "http://localhost:8081" # Your llama-stack URL
INFERENCE_MODEL = "vllm" # Model for generating answers
EMBEDDING_MODEL = "granite-embedding-125m" # Model for embeddings
AGENT_NAME = "Simple RAG Agent" # Custom name for your agent
```

### Document Processing
```python
INPUT_FOLDER = "input_files" # Folder containing your documents
SUPPORTED_EXTENSIONS = [".txt", ".pdf"] # File types to process
CHUNK_SIZE_IN_TOKENS = 256 # Size of text chunks for vector database
```
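
To get a feel for what `CHUNK_SIZE_IN_TOKENS` controls, the sketch below splits text into fixed-size chunks. It assumes whitespace-separated words as a rough proxy for tokens; a real tokenizer counts differently, and the actual chunking in this demo is not shown in this README. `chunk_words` is a hypothetical name.

```python
def chunk_words(text, chunk_size=256):
    """Split text into consecutive chunks of at most chunk_size words."""
    words = text.split()
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]


# A 600-word sample yields two full chunks and one partial chunk.
sample = " ".join(f"w{i}" for i in range(600))
chunks = chunk_words(sample, chunk_size=256)
sizes = [len(c.split()) for c in chunks]
```

Smaller chunks give more precise matches but less context per retrieved piece; larger chunks do the opposite.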

### Vector Database
```python
VECTOR_DB_PROVIDER = "milvus" # Vector database provider
VECTOR_DB_PREFIX = "simple-rag-db" # Prefix for vector database ID
```

### RAG Agent Settings
```python
TOP_K = 3 # Number of most relevant chunks to retrieve
SIMILARITY_THRESHOLD = 0.0 # Minimum similarity score for retrieval
MAX_INFER_ITERS = 10 # Maximum inference iterations
ENABLE_SESSION_PERSISTENCE = False # Whether to persist sessions
```
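
`TOP_K` and `SIMILARITY_THRESHOLD` interact: the threshold filters out weak matches, and `TOP_K` caps how many of the remaining chunks reach the model. The sketch below illustrates that logic on made-up scores. It assumes higher scores mean closer matches and an inclusive threshold; the real behavior depends on the vector database provider, and `select_chunks` is a hypothetical name.

```python
def select_chunks(scored_chunks, top_k=3, similarity_threshold=0.0):
    """Keep chunks scoring at or above the threshold, then return the best top_k."""
    kept = [
        (text, score)
        for text, score in scored_chunks
        if score >= similarity_threshold
    ]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


# Made-up similarity scores for illustration only.
scored = [("chunk A", 0.82), ("chunk B", 0.15), ("chunk C", 0.64), ("chunk D", 0.41)]

default_pick = select_chunks(scored)  # threshold 0.0: only top_k limits the result
strict_pick = select_chunks(scored, top_k=3, similarity_threshold=0.5)
```

With the default threshold of `0.0`, the three highest-scoring chunks are always returned, even if they are only loosely related to the question.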

### PDF Processing
```python
PDF_DO_OCR = False # Whether to perform OCR on PDFs
PDF_DO_TABLE_STRUCTURE = True # Whether to extract table structures
PDF_DO_CELL_MATCHING = True # Whether to perform cell matching in tables
```

### Session & Logging
```python
SESSION_NAME = "simple-rag-session" # Name for the chat session
LOG_LEVEL = "INFO" # Logging level (DEBUG, INFO, WARNING, ERROR)
```

### Agent Instructions
```python
AGENT_INSTRUCTIONS = """You are a helpful assistant...""" # Custom instructions for the agent
```

## Supported File Types

- **Text files** (`.txt`): Plain text documents
- **PDF files** (`.pdf`): PDF documents, processed with text extraction and table-structure recognition

## Troubleshooting

### Connection Issues
- Make sure llama-stack is running and accessible
- Check the `LLAMA_STACK_URL` configuration
- Verify port-forwarding is working
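
A quick way to tell a network problem from an application problem is to test whether the port accepts TCP connections at all. The sketch below is a standard-library check, not part of the demo script; `is_reachable` is a hypothetical helper, and a successful connection only shows that something is listening, not that llama-stack is healthy.

```python
import socket
from urllib.parse import urlparse


def is_reachable(url, timeout=2.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    # Fall back to the scheme's default port when the URL does not name one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example: is_reachable("http://localhost:8081")
```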

### Document Processing Issues
- Ensure files are in supported formats (`.txt`, `.pdf`)
- Check file permissions and encoding
- For PDFs, make sure they contain extractable text

### Model Issues
- Verify the specified models are available in your llama-stack
- Check model names match exactly

## Understanding the Code

The script is structured in simple, clear functions:

- `load_text_file()`: Reads plain text files
- `load_pdf_file()`: Extracts text from PDFs using docling
- `load_documents_from_folder()`: Processes all files in the input folder
- `setup_vector_database()`: Creates and populates the vector database
- `create_rag_agent()`: Sets up the AI agent with RAG capabilities
- `create_session()`: Creates a chat session for the agent

Each function has a single responsibility and clear error handling, making it easy to understand and modify.
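
As an illustration of the file-discovery step, the sketch below shows one way to collect supported files from a folder. It is not the code from `setup_rag_agent.py`: `list_input_files` and `read_text_document` are hypothetical helpers, and PDF extraction with docling is left out.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = [".txt", ".pdf"]


def list_input_files(folder, extensions=SUPPORTED_EXTENSIONS):
    """Return supported files in the folder, sorted by path (non-recursive)."""
    root = Path(folder)
    if not root.is_dir():
        raise FileNotFoundError(f"Input folder not found: {root}")
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )


def read_text_document(path, encoding="utf-8"):
    """Read a plain text file, replacing undecodable bytes instead of failing."""
    return Path(path).read_text(encoding=encoding, errors="replace")
```

Matching on the lowercased suffix means files such as `report.PDF` are picked up as well.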

## Next Steps

Once you understand this basic RAG setup, you can explore:

### 🔧 **Development Enhancements**
- **Custom agent configurations** for specific use cases
- **Advanced document processing** pipelines
- **Integration with CI/CD** for automated agent deployment
- **Environment-specific configurations** (dev, staging, prod)

### 🚀 **Production Deployment**
- **Web interface** for agent management
- **API endpoints** for programmatic agent creation
- **Monitoring and logging** for agent performance
- **Scalable vector database** configurations

### 📊 **Advanced Features**
- **Custom retrieval strategies** for better document matching
- **Multi-modal document support** (images, audio, etc.)
- **Real-time document updates** with automatic re-indexing
- **Performance optimization** for large document sets

### 🔗 **Integration Possibilities**
- **Chatbot interfaces** for end users
- **Knowledge management systems**
- **Documentation assistants**
- **Customer support automation**
Binary file added demos/local/simple_rag/input_files/RAGSurvey.pdf
Binary file not shown.
26 changes: 26 additions & 0 deletions demos/local/simple_rag/input_files/example.txt
@@ -0,0 +1,26 @@
The Simpsons – Overview
The Simpsons is a long-running animated television sitcom created by Matt Groening. First airing in 1989, the show is set in the fictional town of Springfield and satirizes American culture, society, and television. The series centers around the Simpson family — Homer, Marge, Bart, Lisa, and Maggie — and their interactions with a diverse range of supporting characters. Known for its humor, cultural references, and social commentary, The Simpsons has become a cornerstone of modern pop culture and is one of the most influential TV shows in history.

Homer Simpson
Homer Jay Simpson is the bumbling but lovable father of the Simpson family. He works at the Springfield Nuclear Power Plant as a safety inspector, despite being lazy and incompetent. Homer is known for his love of donuts, beer (especially Duff Beer), and television. He's overweight, bald, and often acts impulsively, but he deeply loves his family in his own misguided way.

Marge Simpson
Marjorie "Marge" Simpson is the caring and patient matriarch of the family. She has a tall blue beehive hairstyle and is known for her moral integrity and common sense. Marge is a homemaker who often acts as the voice of reason in the chaotic Simpson household. She is deeply devoted to her husband and children, even when they drive her crazy.

Bart Simpson
Bartholomew JoJo "Bart" Simpson is the 10-year-old son of Homer and Marge. A rebellious troublemaker, Bart is known for his mischievous pranks, slingshot, and catchphrases like “Eat my shorts!” He struggles academically but is street-smart and clever. Bart often clashes with authority figures and is a constant source of stress for his teachers and Principal Skinner.

Lisa Simpson
Lisa Marie Simpson is the 8-year-old daughter of the Simpsons. Highly intelligent and talented, Lisa excels in school and plays the saxophone. She is a vegetarian, a Buddhist, and an environmentalist with strong social values. Despite being younger than Bart, she is often the most mature member of the family and frequently challenges societal norms.

Maggie Simpson
Margaret "Maggie" Simpson is the baby of the family. She rarely speaks but is known for her pacifier-sucking and surprising displays of intelligence. Maggie is observant and occasionally performs impressive feats for her age, often unnoticed by the adults around her.

Grandpa Simpson
Abraham "Abe" Simpson is Homer’s elderly father. A resident of the Springfield Retirement Castle, Grandpa often tells long-winded, rambling stories about his past. He’s forgetful and sometimes grumpy, but he genuinely cares for his family despite being a little out of touch.

Milhouse Van Houten
Milhouse is Bart’s best friend. He’s awkward, nerdy, and often the victim of school bullies. Milhouse is loyal but easily manipulated, and he has an unrequited crush on Lisa.

Mr. Burns
Charles Montgomery Burns is the wealthy, elderly owner of the Springfield Nuclear Power Plant. He’s greedy, ruthless, and power-hungry, often putting profit over safety or morality. Mr. Burns is one of the main antagonists of the series, frequently showing disregard for others, especially his employees.
9 changes: 9 additions & 0 deletions demos/local/simple_rag/requirements.txt
@@ -0,0 +1,9 @@
llama-stack-client>=0.1.0
pathlib>=1.0.1

🛠️ Refactor suggestion

Remove unnecessary pathlib dependency.

The pathlib package has been part of Python's standard library since Python 3.4 and should not be listed as an external dependency. This could cause installation issues or confusion.

-pathlib>=1.0.1
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
pathlib>=1.0.1
🤖 Prompt for AI Agents
In demos/local/simple_rag/requirements.txt at line 2, remove the line specifying
the pathlib dependency because pathlib is included in Python's standard library
since version 3.4 and does not need to be installed separately.

fire>=0.5.0
requests>=2.31.0
docling>=0.1.0
pypdfium2>=4.0.0
Pillow>=9.0.0
numpy>=1.21.0
pandas>=1.3.0