A multi-agent Retrieval-Augmented Generation (RAG) system for question answering over PDF documents. Built with LangGraph and LangChain, deployable on the DigitalOcean Gradient AI Platform.
Build an AI assistant that answers questions using your own documents as the knowledge source. This template demonstrates document retrieval, query optimization, and answer generation with citations.
When to use this template:
- You need to answer questions from a document collection
- You want to build a knowledge base chatbot
- You need query rewriting for better retrieval
Retrieval-Augmented Generation (RAG) enhances LLM responses by fetching relevant documents before answering. Instead of relying solely on training data, the agent searches a vector index of your documents, retrieves the most relevant passages, and uses them as context for generating accurate, grounded responses with citations.
Query rewriting improves retrieval quality when initial results are poor. If the retrieved documents don't answer the question well, the agent reformulates the query with different terms or more specific phrasing, then retries the search. For example, "How does Hubble see UV?" might be rewritten as "Which Hubble instruments observe ultraviolet wavelengths?". This multi-step approach handles ambiguous or complex questions more effectively than single-pass retrieval.
┌───────────┐
│ Start │
└─────┬─────┘
│
▼
┌──────────────────────────────┐
│ generate_query_or_respond │
│ │
│ LLM decides: retrieve docs │
│ or answer directly? │
└──────────────┬───────────────┘
┌───────┴───────┐
│ │
(needs docs) (can answer)
│ │
▼ │
┌──────────────┐ │
│ retrieve │ │
│ │ │
│ Vector search│◄──────────────────────┐
│ for relevant │ │ │
│ documents │ │ │
└──────┬───────┘ │ │
│ │ │
▼ │ │
┌──────────────┐ │ │
│ grade │ │ │
│ │ │ │
│ LLM evaluates│ │ │
│ if docs are │ │ │
│ relevant to │ │ │
│ the question │ │ │
└──────┬───────┘ │ │
│ │ │
┌─────────────┴─────────┐ │ │
│ │ │ │
(relevant) (not relevant) │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ rewrite_question│ │
│ │ │ │
│ │ LLM reformulates│ │
│ │ query for better│────────────────┘
│ │ retrieval │ (retry with
│ └─────────────────┘ new query)
│ │
└───────────┬───────────┘
│
▼
┌─────────────────┐
│ generate_answer │
│ │
│ LLM writes │
│ response with │
│ citations │
└────────┬────────┘
│
▼
┌───────────┐
│ End │
└───────────┘
The workflow uses LLMs at each decision point (a wiring sketch follows the list):
- generate_query_or_respond: LLM decides if retrieval is needed or if it can answer directly
- retrieve: Vector search fetches potentially relevant documents
- grade: LLM evaluates whether retrieved documents actually answer the question
- rewrite_question: If documents aren't relevant, LLM reformulates the query and loops back to retrieve
- generate_answer: LLM synthesizes the final response with citations from relevant documents
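For orientation, here is a minimal LangGraph sketch of how this flow can be wired. The node and router functions below are illustrative stubs, not the template's actual code; the real implementations live in main.py and agents/.

```python
from typing import Literal

from langgraph.graph import END, START, MessagesState, StateGraph

# Stub nodes: each returns a (here empty) state update.
def generate_query_or_respond(state: MessagesState):
    return {}  # real node: LLM decides whether to search or answer directly

def retrieve(state: MessagesState):
    return {}  # real node: vector search over the PDF index

def rewrite_question(state: MessagesState):
    return {}  # real node: LLM reformulates the query

def generate_answer(state: MessagesState):
    return {}  # real node: LLM writes the cited answer

# Stub routers: each returns the name of the branch to take.
def needs_retrieval(state: MessagesState) -> Literal["retrieve", "respond"]:
    return "retrieve"  # real router: inspect the LLM output for a search request

def grade_documents(state: MessagesState) -> Literal["relevant", "not_relevant"]:
    return "relevant"  # real router: LLM grades document relevance

workflow = StateGraph(MessagesState)
workflow.add_node("generate_query_or_respond", generate_query_or_respond)
workflow.add_node("retrieve", retrieve)
workflow.add_node("rewrite_question", rewrite_question)
workflow.add_node("generate_answer", generate_answer)

workflow.add_edge(START, "generate_query_or_respond")
# First decision: retrieve documents or answer directly
workflow.add_conditional_edges(
    "generate_query_or_respond",
    needs_retrieval,
    {"retrieve": "retrieve", "respond": END},
)
# Second decision: are the retrieved documents relevant?
workflow.add_conditional_edges(
    "retrieve",
    grade_documents,
    {"relevant": "generate_answer", "not_relevant": "rewrite_question"},
)
workflow.add_edge("rewrite_question", "retrieve")  # retry with the new query
workflow.add_edge("generate_answer", END)
graph = workflow.compile()
```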
- Python 3.10+
- DigitalOcean account
- OpenAI API key (for embeddings)
- PDF documents to index
Get your credentials:

- DigitalOcean API Token:
  - Go to API Settings
  - Generate a new token with read/write access
- DigitalOcean Inference Key:
  - Go to GenAI Settings
  - Create or copy your inference key
- OpenAI API Key (for embeddings):
  - Go to OpenAI API Keys
  - Create a new API key
Note: OpenAI is required because DigitalOcean GenAI Serverless Inference does not yet support embedding models. The LLM inference still uses DigitalOcean.
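A minimal sketch of what this split can look like in LangChain: embeddings via OpenAI, chat completions via DigitalOcean's OpenAI-compatible inference endpoint. The base URL and model name below are assumptions; check your GenAI settings for the exact values.

```python
import os

from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Embeddings: OpenAI (used to index and search your PDFs)
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# LLM inference: DigitalOcean's OpenAI-compatible serverless endpoint.
# The base URL and model name here are illustrative assumptions.
llm = ChatOpenAI(
    model="llama3.3-70b-instruct",
    api_key=os.environ["DIGITALOCEAN_INFERENCE_KEY"],
    base_url="https://inference.do-ai.run/v1",
)
```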
cd RAG
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
cp .env.example .env

Edit .env with your credentials:
DIGITALOCEAN_INFERENCE_KEY=your_inference_key
OPENAI_API_KEY=your_openai_key
Copy PDF files to the pdfs/ folder:
cp /path/to/your/documents/*.pdf ./pdfs/

The template includes sample Hubble Space Telescope fact sheets by default.
export DIGITALOCEAN_API_TOKEN=your_token
gradient agent run

Then send a test request:

curl --location 'http://localhost:8080/run' \
--header 'Content-Type: application/json' \
--data '{
"prompt": {
"messages": [
{
"role": "user",
"content": "What is the difference between the STIS and the COS?"
}
]
}
}'

To deploy, edit .gradient/agent.yml:
agent_name: my-rag-agent

Then deploy:

gradient agent deploy

Once deployed, query the agent:

curl --location 'https://agents.do-ai.run/<DEPLOYED_AGENT_ID>/main/run' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <DIGITALOCEAN_API_TOKEN>' \
--data '{
"prompt": {
"messages": [
{
"role": "user",
"content": "What is the difference between the STIS and the COS?"
}
]
}
}'

Example request:

{
"prompt": {
"messages": [
{
"role": "user",
"content": "What instruments does Hubble use to observe ultraviolet light?"
}
]
}
}

Example response:

{
"response": "Hubble uses two primary instruments for ultraviolet observations:\n\n1. **Cosmic Origins Spectrograph (COS)**: Installed in 2009, COS is optimized for observing faint ultraviolet emissions from distant sources like quasars and intergalactic gas.\n\n2. **Space Telescope Imaging Spectrograph (STIS)**: A versatile instrument that can observe in ultraviolet, visible, and near-infrared wavelengths, often used for studying planetary atmospheres and stellar winds.\n\nBoth instruments take advantage of Hubble's position above Earth's atmosphere, which blocks most ultraviolet light from reaching ground-based telescopes.",
"sources": [
"hubble_cos_factsheet.pdf",
"hubble_stis_factsheet.pdf"
]
}

Project structure:

RAG/
├── .gradient/
│ └── agent.yml # Deployment configuration
├── agents/
│ ├── __init__.py
│ ├── grader.py # Document relevance evaluation
│ ├── rewriter.py # Query reformulation
│ └── answer_writer.py # Response generation
├── tools/
│ ├── __init__.py
│ └── doc_retriever.py # PDF loading and vector search
├── pdfs/ # Your PDF documents
│ └── *.pdf
├── main.py # LangGraph workflow
├── prompts.py # All agent prompts (edit this to customize!)
├── requirements.txt
├── .env.example
└── README.md
The easiest way to adapt this template is by editing prompts.py. This file contains all the prompts used by the RAG pipeline agents.
Key prompts you can customize:
| Variable | Purpose | Example Change |
|---|---|---|
| GRADE_PROMPT | Evaluates document relevance | Make stricter for high-precision retrieval |
| REWRITE_PROMPT | Reformulates failed queries | Add domain-specific synonyms |
| GENERATE_PROMPT | Creates the final answer | Change style (concise vs. detailed) |
Example: Detailed Technical Answers
# In prompts.py, change GENERATE_PROMPT to:
GENERATE_PROMPT = (
"You are a technical documentation assistant. "
"Use the following retrieved context to answer the question thoroughly. "
"Provide step-by-step explanations when applicable. "
"If the context doesn't contain enough information, say so and explain what's missing.\n"
"Question: {question} \n"
"Context: {context}"
)

Example: Strict Document Grading
# For higher precision retrieval:
GRADE_PROMPT = (
"You are a strict grader assessing relevance of a retrieved document to a user question. \n "
"Here is the retrieved document: \n\n {context} \n\n"
"Here is the user question: {question} \n"
"The document must directly answer the question or contain key information needed. "
"Tangentially related content should be graded as 'no'. \n"
"Give a binary score 'yes' or 'no' score to indicate whether the document is relevant."
)

Example: Citation-focused Answers
GENERATE_PROMPT = (
"You are a research assistant that provides well-cited answers. "
"Use the retrieved context to answer the question. "
"Reference specific parts of the documents in your answer. "
"If you don't have enough information, acknowledge the limitations.\n"
"Question: {question} \n"
"Context: {context}"
)

Sample documents are included in /pdfs. Swap in your own documents to have the agent answer questions about your data.
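For context, the default index is built in memory from the PDFs at startup. A rough sketch of that approach (the loader, splitter, and chunk sizes here are assumptions; see tools/doc_retriever.py for the actual implementation):

```python
from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

def create_retriever():
    # Load every PDF in the folder and split into overlapping chunks
    docs = PyPDFDirectoryLoader("./pdfs").load()
    chunks = RecursiveCharacterTextSplitter(
        chunk_size=1000, chunk_overlap=100
    ).split_documents(docs)
    # Embed and index in memory; rebuilt on every restart
    vectorstore = InMemoryVectorStore.from_documents(chunks, OpenAIEmbeddings())
    return vectorstore.as_retriever()
```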
Replace the in-memory retriever in tools/doc_retriever.py:
from langchain_community.vectorstores import Pinecone
from langchain_openai import OpenAIEmbeddings

def create_retriever():
    # Use Pinecone for persistent storage; assumes the index already
    # exists and the Pinecone client is configured with your API key
    vectorstore = Pinecone.from_existing_index(
        index_name="my-index",
        embedding=OpenAIEmbeddings()
    )
    return vectorstore.as_retriever(search_kwargs={"k": 5})

Modify tools/doc_retriever.py to use a different embedding model:
from langchain_community.embeddings import HuggingFaceEmbeddings

# Runs locally; requires the sentence-transformers package
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

Change the number of documents retrieved:
retriever = vectorstore.as_retriever(
search_type="similarity",
search_kwargs={"k": 10} # Retrieve 10 documents instead of default
)
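The retrieval strategy itself can also be swapped. For example, maximal marginal relevance (MMR) trades pure similarity for diversity among the returned chunks:

```python
retriever = vectorstore.as_retriever(
    search_type="mmr",                      # maximal marginal relevance
    search_kwargs={"k": 5, "fetch_k": 20},  # pick 5 diverse chunks from the top 20
)
```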
Support additional file types by modifying the loader:

from langchain_community.document_loaders import DirectoryLoader, PyPDFLoader, TextLoader

# Use a dedicated loader per file type (TextLoader cannot parse PDFs)
pdf_loader = DirectoryLoader("./documents", glob="**/*.pdf", loader_cls=PyPDFLoader)
text_loader = DirectoryLoader("./documents", glob="**/*.txt", loader_cls=TextLoader)
docs = pdf_loader.load() + text_loader.load()

| Issue | Solution |
|---|---|
| "No documents found" | Check that PDFs exist in the pdfs/ folder |
| Embedding errors | Verify your OPENAI_API_KEY is set correctly |
| Poor retrieval results | Try adding more context to your question |
| Memory issues | Reduce the number/size of PDFs or use an external vector DB |
- In-memory storage: Document index is rebuilt on each restart. For production, use a persistent vector store.
- PDF only: The default loader only supports PDF files.
- Embedding costs: Each document is embedded using OpenAI, which incurs API costs.