In this chapter you will learn how to deploy a Haystack retriever pipeline as a REST API endpoint with FastAPI. This is Part I: deploying a retrieval pipeline from a prepopulated document store. Part II covers an advanced case in which you deploy indexing and advanced retrieval pipelines that can dynamically ingest uploaded PDFs and web URLs and answer questions about the uploaded material.
- Set up the environment:
uv sync
source .venv/bin/activate
./scripts/setup_local.sh
- Configure the environment:
Generate a secure API key:
# macOS/Linux
openssl rand -hex 32
Enter your key in the .env file:
# Copy the example environment file
cp .env.example .env
# Edit .env and add your API keys:
# OPENAI_API_KEY=your_actual_openai_key
# RAG_API_KEY=your_secret_api_key_for_authentication
- Run indexing:
./scripts/run_indexing.sh
- Start the API:
./scripts/run_api.sh
- Test the API:
uv run python tests/test_api.py
- Build the Docker image:
docker build -t hybrid-rag-api .
- Run the container:
docker run -d \
--name hybrid-rag \
-p 8000:8000 \
-e OPENAI_API_KEY=your_actual_openai_key \
-e RAG_API_KEY=your_secret_api_key \
hybrid-rag-api
- Check logs:
docker logs -f hybrid-rag
- Test the API:
# Once container shows "✅ Indexing complete!" and "🚀 Starting API server..."
curl http://localhost:8000/health
- Stop the container:
docker stop hybrid-rag
docker rm hybrid-rag
Note: The Docker container automatically runs the indexing pipeline on startup before launching the API. This takes a few minutes depending on the size of your data.
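Because the container indexes before it serves requests, a script that talks to the API right after `docker run` may need to wait. The following is a minimal sketch of such a wait loop using only the Python standard library. It assumes that `/health` answers with HTTP 200 once the API is up; the function names `fetch_status` and `wait_for_health`, the timeout, and the polling interval are illustrative choices, not part of the project.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url: str) -> int:
    """Return the HTTP status code of a GET request, or 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx/5xx).
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused or timed out: the API is probably still indexing.
        return 0


def wait_for_health(url, timeout=600.0, interval=5.0, fetch=fetch_status, sleep=time.sleep):
    """Poll `url` until it returns 200 (assumed to mean "ready") or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if fetch(url) == 200:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


# Example call (not executed here):
# wait_for_health("http://localhost:8000/health")
```

The `fetch` and `sleep` parameters exist only so the loop can be exercised without a running server; in normal use the defaults are enough.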
- GET /: Basic API information
- GET /health: Health check with component status
- GET /info: Configuration and model information
- POST /query: Submit queries to the RAG system (requires authentication)
- GET /docs: Interactive API documentation (Swagger UI)
The /query endpoint is protected with API key authentication to prevent unauthorized access.
- Generate a secure API key:
# Generate a random 32-byte hex string (recommended)
openssl rand -hex 32
- Add to your .env file:
RAG_API_KEY=<your secret>
- For Docker deployment, pass it as an environment variable:
docker run -d \
-p 8000:8000 \
-e OPENAI_API_KEY=sk-... \
-e RAG_API_KEY=your-secret-key \
hybrid-rag-api
Include the API key in the X-API-Key header with every request to /query:
cURL Example:
curl -X POST http://localhost:8000/query \
-H "Content-Type: application/json" \
-H "X-API-Key: your-secret-key" \
-d '{"query": "What is retrieval-augmented generation?"}'
Python Example:
import requests
response = requests.post(
"http://localhost:8000/query",
headers={
"Content-Type": "application/json",
"X-API-Key": "your-secret-key"
},
json={"query": "What is retrieval-augmented generation?"}
)
print(response.json())
JavaScript/Fetch Example:
fetch('http://localhost:8000/query', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'X-API-Key': 'your-secret-key'
},
body: JSON.stringify({
query: 'What is retrieval-augmented generation?'
})
})
.then(response => response.json())
.then(data => console.log(data));
- ✅ Only the /query endpoint requires authentication
- ✅ All other endpoints (/, /health, /info, /docs) remain publicly accessible
- ✅ Invalid or missing API keys return 401 Unauthorized
- ⚠️ Never commit your API key to version control
- ⚠️ Use different keys for development, staging, and production
- ⚠️ Store production keys in secure secret management systems (GitHub Secrets, AWS Secrets Manager, etc.)
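To make the rules above concrete, here is a small standard-library sketch of how an API key might be generated and checked. It is not the project's actual implementation, which is not shown in this chapter; the helper names `generate_api_key`, `is_valid_api_key`, and `status_for_request` are hypothetical, and the constant-time comparison via `hmac.compare_digest` is an assumption about good practice rather than a statement of what the API does internally.

```python
import hmac
import secrets


def generate_api_key(num_bytes: int = 32) -> str:
    """Return a random hex string, comparable to `openssl rand -hex 32`."""
    return secrets.token_hex(num_bytes)


def is_valid_api_key(provided, expected: str) -> bool:
    """Compare the X-API-Key header value with the configured key in constant time."""
    if not provided or not expected:
        return False
    return hmac.compare_digest(provided.encode("utf-8"), expected.encode("utf-8"))


def status_for_request(path: str, headers: dict, expected_key: str) -> int:
    """Mirror the documented behavior: only /query needs a key, bad keys get 401.

    Hypothetical helper: a real web framework treats header names
    case-insensitively; this sketch does an exact dictionary lookup.
    """
    if path != "/query":
        return 200
    if is_valid_api_key(headers.get("X-API-Key"), expected_key):
        return 200
    return 401
```

In a FastAPI application a check like `is_valid_api_key` would typically sit in a dependency attached to the protected route, so that the public endpoints never run it.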
Project structure can be found here
All configuration is handled via environment variables. See .env.example for all available options:
- OPENAI_API_KEY: Your OpenAI API key (required)
- RAG_API_KEY: Secret key for API authentication (required)
- QDRANT_PATH: Qdrant storage directory path (default: ./qdrant_storage)
- QDRANT_INDEX: Index name for documents (default: documents)
- EMBEDDER_MODEL: Embedding model (default: text-embedding-3-small)
- LLM_MODEL: Language model (default: gpt-4o-mini)
- RANKER_MODEL: Reranker model (default: BAAI/bge-reranker-base)
- TOP_K: Number of documents to retrieve (default: 3)
- API_HOST: API host address (default: 0.0.0.0)
- API_PORT: API port number (default: 8000)
- DEBUG: Enable debug mode (default: false)
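As an illustration of how these variables and defaults fit together, the sketch below reads them into a typed settings object. The `Settings` class and `load_settings` function are hypothetical names used for this example; the project's own configuration code may be organized differently. The variable names and default values are the ones listed above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the documented configuration options."""
    openai_api_key: str
    rag_api_key: str
    qdrant_path: str = "./qdrant_storage"
    qdrant_index: str = "documents"
    embedder_model: str = "text-embedding-3-small"
    llm_model: str = "gpt-4o-mini"
    ranker_model: str = "BAAI/bge-reranker-base"
    top_k: int = 3
    api_host: str = "0.0.0.0"
    api_port: int = 8000
    debug: bool = False


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping (defaults to os.environ), failing fast on missing keys."""
    env = os.environ if env is None else env
    missing = [name for name in ("OPENAI_API_KEY", "RAG_API_KEY") if not env.get(name)]
    if missing:
        raise ValueError("Missing required environment variables: " + ", ".join(missing))
    return Settings(
        openai_api_key=env["OPENAI_API_KEY"],
        rag_api_key=env["RAG_API_KEY"],
        qdrant_path=env.get("QDRANT_PATH", "./qdrant_storage"),
        qdrant_index=env.get("QDRANT_INDEX", "documents"),
        embedder_model=env.get("EMBEDDER_MODEL", "text-embedding-3-small"),
        llm_model=env.get("LLM_MODEL", "gpt-4o-mini"),
        ranker_model=env.get("RANKER_MODEL", "BAAI/bge-reranker-base"),
        top_k=int(env.get("TOP_K", "3")),
        api_host=env.get("API_HOST", "0.0.0.0"),
        api_port=int(env.get("API_PORT", "8000")),
        # Assumption: "true", "1", and "yes" all enable debug mode.
        debug=env.get("DEBUG", "false").strip().lower() in ("1", "true", "yes"),
    )
```

Raising an error for the two required keys at load time matches the troubleshooting note that the API will not start without `RAG_API_KEY`.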
- Local First: Develop and test locally using the scripts
- Test Thoroughly: Use the comprehensive test suite
- Docker Testing: Test with Docker Compose before deployment
- Production: Deploy using Docker in your preferred environment
- 401 Unauthorized on /query endpoint:
  - Verify RAG_API_KEY is set in your .env file
  - Ensure you're including the X-API-Key header in your requests
  - Check that the key in the header matches the one in your .env file
- OpenAI API errors:
  - Verify your OPENAI_API_KEY is set correctly in .env
  - Check your OpenAI account has credits
- No documents found:
  - Run indexing first: ./scripts/run_indexing.sh
  - Check if Qdrant storage directory exists: ls -la ./qdrant_storage
- API won't start - missing RAG_API_KEY:
  - Add RAG_API_KEY=your-secret-key to your .env file
  - Generate a secure key: openssl rand -hex 32
- Import errors in development:
  - Make sure you're running from the project root
  - Use uv run python -m src.app instead of direct imports
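Several of the checks above can be automated before starting the API. The following is a rough sketch of such a preflight check, assuming the variables from `.env` have already been loaded into the environment mapping passed in; the function name `preflight` and the message wording are illustrative and not part of the project.

```python
import os
from pathlib import Path


def preflight(env, storage_path="./qdrant_storage"):
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []
    if not env.get("RAG_API_KEY"):
        problems.append("RAG_API_KEY is not set; generate one with: openssl rand -hex 32")
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set in .env")
    storage = Path(storage_path)
    # An absent or empty storage directory suggests indexing has not been run yet.
    if not storage.is_dir() or not any(storage.iterdir()):
        problems.append(f"No Qdrant data at {storage}; run ./scripts/run_indexing.sh first")
    return problems


# Example call (not executed here):
# for problem in preflight(os.environ):
#     print("-", problem)
```

This only covers the causes that can be detected locally; an exhausted OpenAI balance or a mismatched key in a client request still has to be checked by hand.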
- API logs: Check terminal output when running ./scripts/run_api.sh
- Docker logs: docker-compose logs api
- Enable debug mode: Set DEBUG=true in .env