opendatahub-io · Schimuneck · Jun 12, 2025 · Jun 30, 2025 · Jun 30, 2025 · Jul 2, 2025
diff --git a/demos/local/simple_rag/.gitignore b/demos/local/simple_rag/.gitignore
@@ -0,0 +1,14 @@
+# Python virtual environment
+venv/
+__pycache__/
+*.pyc
+
+# IDE files
+.vscode/
+.idea/
+
+# Environment variables
+.env
+
+# Logs
+*.log
diff --git a/demos/local/simple_rag/README.md b/demos/local/simple_rag/README.md
@@ -0,0 +1,249 @@
+# Simple RAG Agent Demo
+
+A teaching example that shows how to **create RAG agents in llama-stack**. The demo script ingests PDF and TXT documents, builds a vector database from them, and deploys an agent that answers questions from that content, which makes it a convenient starting point during development.
+
+## Purpose
+
+This simple RAG script is designed to **facilitate the development lifecycle** by providing a quick and easy way to:
+- **Deploy agents rapidly** with RAG capabilities
+- **Process documents** (PDF and TXT) for knowledge base creation
+- **Create vector databases** automatically from your documents
+- **Set up AI agents** that can answer questions based on your specific documents
+- **Streamline the development process** for RAG-enabled applications
+
+## What is RAG?
+
+Retrieval Augmented Generation (RAG) is a technique that combines:
+1. **Document Retrieval**: Finding relevant information from a knowledge base
+2. **Text Generation**: Using an AI model to generate answers based on the retrieved information
+
+This approach helps AI models provide more accurate and up-to-date answers by grounding their responses in specific documents.
+
+## Development Lifecycle Benefits
+
+This script is particularly useful for:
+
+### 🚀 **Rapid Prototyping**
+- Quickly test RAG concepts with your documents
+- Iterate on agent configurations without complex setup
+- Validate document processing pipelines
+
+### 🔄 **Development Workflow**
+- Easy integration into CI/CD pipelines
+- Consistent agent creation across environments
+- Simplified testing of RAG functionality
+
+### 📚 **Document Processing**
+- Automated handling of PDF and TXT files
+- Built-in text extraction and chunking
+- Vector database setup without manual configuration
+
+### 🤖 **Agent Deployment**
+- One-command agent creation
+- Configurable agent parameters
+- Ready-to-use chat sessions
+
+## How This Demo Works
+
+The script demonstrates these simple steps:
+
+1. **📁 Load Documents**: Read text and PDF files from the `input_files` folder
+2. **🔄 Convert to Text**: Extract text content from different file formats
+3. **🗄️ Store in Vector DB**: Save documents in a searchable vector database
+4. **🤖 Create Agent**: Set up an AI agent that can query the documents
+5. **💬 Ask Questions**: Query the agent to get answers based on your documents
+
+## Prerequisites
+
+- Python 3.8+
+- A running llama-stack instance (see setup below)
+- Some text or PDF files to process
+
+## Setup
+
+### 1. Install Dependencies
+
+```bash
+# Create and activate virtual environment
+python3 -m venv venv
+source venv/bin/activate  # On Windows: venv\Scripts\activate
+
+# Install required packages
+pip install -r requirements.txt
+```
+
+### 2. Start llama-stack
+
+Make sure you have llama-stack running and accessible. You can use port-forwarding to access it locally:
+
+```bash
+# If running on OpenShift
+oc port-forward svc/lsd-llama-milvus 8081:8081
+
+# Or if running locally
+# Follow llama-stack installation instructions
+```
+
+### 3. Add Your Documents
+
+Place your text (`.txt`) and PDF (`.pdf`) files in the `input_files` folder:
+
+```
+input_files/
+├── document1.txt
+├── document2.pdf
+└── ...
+```
+
+## Usage
+
+### Run the RAG Setup
+
+```bash
+python setup_rag_agent.py
+```
+
+The script will:
+- Load all documents from `input_files/`
+- Create a vector database
+- Set up a RAG agent
+- Print the generated IDs and a curl command for querying the agent
+
+### Query Your RAG Agent
+
+After running the script, you'll get a curl command like this:
+
+```bash
+curl -X POST http://localhost:8081/v1/agents/{agent_id}/session/{session_id}/turn \
+  -H "Content-Type: application/json" \
+  -d '{
+    "messages": [
+      {
+        "role": "user",
+        "content": "What is this document about?"
+      }
+    ],
+    "stream": true
+  }'
+```
+
+### Example Questions
+
+Try asking questions like:
+- "What is the main topic of the documents?"
+- "What are the key points mentioned?"
+- "Can you summarize the content?"
+- "What specific details are mentioned about [topic]?"
+
+## Configuration
+
+You can modify these settings at the top of `setup_rag_agent.py`:
+
+### Basic Settings
+```python
+LLAMA_STACK_URL = "http://localhost:8081"  # Your llama-stack URL
+INFERENCE_MODEL = "vllm"                   # Model for generating answers
+EMBEDDING_MODEL = "granite-embedding-125m"  # Model for embeddings
+AGENT_NAME = "Simple RAG Agent"            # Custom name for your agent
+```
+
+### Document Processing
+```python
+INPUT_FOLDER = "input_files"               # Folder containing your documents
+SUPPORTED_EXTENSIONS = [".txt", ".pdf"]    # File types to process
+CHUNK_SIZE_IN_TOKENS = 256                 # Size of text chunks for vector database
+```
+
+### Vector Database
+```python
+VECTOR_DB_PROVIDER = "milvus"              # Vector database provider
+VECTOR_DB_PREFIX = "simple-rag-db"         # Prefix for vector database ID
+```
+
+### RAG Agent Settings
+```python
+TOP_K = 3                                  # Number of most relevant chunks to retrieve
+SIMILARITY_THRESHOLD = 0.0                 # Minimum similarity score for retrieval
+MAX_INFER_ITERS = 10                       # Maximum inference iterations
+ENABLE_SESSION_PERSISTENCE = False         # Whether to persist sessions
+```
+
+### PDF Processing
+```python
+PDF_DO_OCR = False                         # Whether to perform OCR on PDFs
+PDF_DO_TABLE_STRUCTURE = True              # Whether to extract table structures
+PDF_DO_CELL_MATCHING = True                # Whether to perform cell matching in tables
+```
+
+### Session & Logging
+```python
+SESSION_NAME = "simple-rag-session"        # Name for the chat session
+LOG_LEVEL = "INFO"                         # Logging level (DEBUG, INFO, WARNING, ERROR)
+```
+
+### Agent Instructions
+```python
+AGENT_INSTRUCTIONS = """You are a helpful assistant..."""  # Custom instructions for the agent
+```
+
+## Supported File Types
+
+- **Text files** (`.txt`): Plain text documents
+- **PDF files** (`.pdf`): PDF documents, processed with text extraction and table-structure recognition
+
+## Troubleshooting
+
+### Connection Issues
+- Make sure llama-stack is running and accessible
+- Check the `LLAMA_STACK_URL` configuration
+- Verify port-forwarding is working
+
+### Document Processing Issues
+- Ensure files are in supported formats (`.txt`, `.pdf`)
+- Check file permissions and encoding
+- For PDFs, make sure they contain extractable text; scanned PDFs need OCR, which is off by default (`PDF_DO_OCR = False`)
+
+### Model Issues
+- Verify the specified models are available in your llama-stack
+- Check that the model names match exactly
+
+## Understanding the Code
+
+The script is structured in simple, clear functions:
+
+- `load_text_file()`: Reads plain text files
+- `load_pdf_file()`: Extracts text from PDFs using docling
+- `load_documents_from_folder()`: Processes all files in the input folder
+- `setup_vector_database()`: Creates and populates the vector database
+- `create_rag_agent()`: Sets up the AI agent with RAG capabilities
+- `create_session()`: Creates a chat session for the agent
+
+Each function has a single responsibility and clear error handling, making it easy to understand and modify.
+
+## Next Steps
+
+Once you understand this basic RAG setup, you can explore:
+
+### 🔧 **Development Enhancements**
+- **Custom agent configurations** for specific use cases
+- **Advanced document processing** pipelines
+- **Integration with CI/CD** for automated agent deployment
+- **Environment-specific configurations** (dev, staging, prod)
+
+### 🚀 **Production Deployment**
+- **Web interface** for agent management
+- **API endpoints** for programmatic agent creation
+- **Monitoring and logging** for agent performance
+- **Scalable vector database** configurations
+
+### 📊 **Advanced Features**
+- **Custom retrieval strategies** for better document matching
+- **Multi-modal document support** (images, audio, etc.)
+- **Real-time document updates** with automatic re-indexing
+- **Performance optimization** for large document sets
+
+### 🔗 **Integration Possibilities**
+- **Chatbot interfaces** for end users
+- **Knowledge management systems**
+- **Documentation assistants**
+- **Customer support automation**
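
The README names `load_text_file()` and `load_documents_from_folder()`, but `setup_rag_agent.py` itself is not part of the diff shown here. The following is a minimal sketch of what that loading step might look like, using only the standard library. The function bodies, the returned dictionary layout, and the choice to leave PDF content empty are assumptions for illustration; the README says the real script extracts PDF text with docling.

```python
from pathlib import Path

INPUT_FOLDER = "input_files"             # same default as in the README
SUPPORTED_EXTENSIONS = [".txt", ".pdf"]  # same default as in the README


def load_text_file(path):
    """Read a plain-text file, replacing undecodable bytes instead of failing."""
    return Path(path).read_text(encoding="utf-8", errors="replace")


def load_documents_from_folder(folder=INPUT_FOLDER):
    """Return a list of {"name", "content"} dicts for supported files.

    Assumption: PDFs are listed with empty content, because the real
    script is described as delegating PDF extraction to docling.
    """
    documents = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix not in SUPPORTED_EXTENSIONS:
            continue  # ignore unsupported file types
        content = load_text_file(path) if suffix == ".txt" else ""
        documents.append({"name": path.name, "content": content})
    return documents
```

Sorting the directory listing keeps the document order stable across runs, which can make it easier to compare results between environments.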
diff --git a/demos/local/simple_rag/input_files/RAGSurvey.pdf b/demos/local/simple_rag/input_files/RAGSurvey.pdf
diff --git a/demos/local/simple_rag/input_files/example.txt b/demos/local/simple_rag/input_files/example.txt
@@ -0,0 +1,26 @@
+The Simpsons – Overview
+The Simpsons is a long-running animated television sitcom created by Matt Groening. First airing in 1989, the show is set in the fictional town of Springfield and satirizes American culture, society, and television. The series centers around the Simpson family — Homer, Marge, Bart, Lisa, and Maggie — and their interactions with a diverse range of supporting characters. Known for its humor, cultural references, and social commentary, The Simpsons has become a cornerstone of modern pop culture and is one of the most influential TV shows in history.
+
+Homer Simpson
+Homer Jay Simpson is the bumbling but lovable father of the Simpson family. He works at the Springfield Nuclear Power Plant as a safety inspector, despite being lazy and incompetent. Homer is known for his love of donuts, beer (especially Duff Beer), and television. He's overweight, bald, and often acts impulsively, but he deeply loves his family in his own misguided way.
+
+Marge Simpson
+Marjorie "Marge" Simpson is the caring and patient matriarch of the family. She has a tall blue beehive hairstyle and is known for her moral integrity and common sense. Marge is a homemaker who often acts as the voice of reason in the chaotic Simpson household. She is deeply devoted to her husband and children, even when they drive her crazy.
+
+Bart Simpson
+Bartholomew JoJo "Bart" Simpson is the 10-year-old son of Homer and Marge. A rebellious troublemaker, Bart is known for his mischievous pranks, slingshot, and catchphrases like “Eat my shorts!” He struggles academically but is street-smart and clever. Bart often clashes with authority figures and is a constant source of stress for his teachers and Principal Skinner.
+
+Lisa Simpson
+Lisa Marie Simpson is the 8-year-old daughter of the Simpsons. Highly intelligent and talented, Lisa excels in school and plays the saxophone. She is a vegetarian, a Buddhist, and an environmentalist with strong social values. Despite being the middle child, she is often the most mature member of the family and frequently challenges societal norms.
+
+Maggie Simpson
+Margaret "Maggie" Simpson is the baby of the family. She rarely speaks but is known for her pacifier-sucking and surprising displays of intelligence. Maggie is observant and occasionally performs impressive feats for her age, often unnoticed by the adults around her.
+
+Grandpa Simpson
+Abraham "Abe" Simpson is Homer’s elderly father. A resident of the Springfield Retirement Castle, Grandpa often tells long-winded, rambling stories about his past. He’s forgetful and sometimes grumpy, but he genuinely cares for his family despite being a little out of touch.
+
+Milhouse Van Houten
+Milhouse is Bart’s best friend. He’s awkward, nerdy, and often the victim of school bullies. Milhouse is loyal but easily manipulated, and he has an unrequited crush on Lisa.
+
+Mr. Burns
+Charles Montgomery Burns is the wealthy, elderly owner of the Springfield Nuclear Power Plant. He’s greedy, ruthless, and power-hungry, often putting profit over safety or morality. Mr. Burns is one of the main antagonists of the series, frequently showing disregard for others, especially his employees.
diff --git a/demos/local/simple_rag/requirements.txt b/demos/local/simple_rag/requirements.txt
@@ -0,0 +1,9 @@
+llama-stack-client>=0.1.0
+pathlib>=1.0.1
+fire>=0.5.0
+requests>=2.31.0
+docling>=0.1.0
+pypdfium2>=4.0.0
+Pillow>=9.0.0
+numpy>=1.21.0
+pandas>=1.3.0
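
Since the README's troubleshooting section assumes the dependencies are installed, a quick environment check can help before running the script. The sketch below reports which of the distributions listed in `requirements.txt` are present in the active virtual environment. `installed_versions` is a hypothetical helper, not part of the demo; `pathlib` is left out of the list because it ships with the Python 3 standard library.

```python
from importlib import metadata

# Distribution names taken from the requirements.txt hunk above.
REQUIRED = [
    "llama-stack-client", "fire", "requests", "docling",
    "pypdfium2", "Pillow", "numpy", "pandas",
]


def installed_versions(names=REQUIRED):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions
```

Calling `installed_versions()` after `pip install -r requirements.txt` should return a version string for every entry; any `None` value points to a package that failed to install.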