An AI contract analysis and Q&A system that runs containerized by default (Ollama + TEI + LiteLLM + Streamlit), with an optional online mode via OpenAI.
- Containerized by default: One command spins up the entire stack
- Dual mode: Local (offline) or OpenAI (online) without code changes
- Grounded QA: Retrieval-augmented Q&A over your contracts
- NER: Extract parties, dates, monetary values
- Persistent storage: Contracts and vectors persist across restarts
- Web UI + CLI: Streamlit app and Typer CLI
- Clone and configure

```bash
git clone <repository-url>
cd contract-assistant
cp env.example .env
# Edit .env as needed (defaults target the local container stack)
```
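For reference, a sketch of what the local-mode defaults might look like — `LLM_PROVIDER` and `LOCAL_LLM_MODEL` appear later in this README, but the exact values below are assumptions; `env.example` is authoritative:

```env
LLM_PROVIDER=local            # assumed default; "openai" enables the online mode below
LOCAL_LLM_MODEL=llama3.2:3b   # assumed tag; pulled by the ollama entry script on first start
```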
- Start the stack (dev)

```bash
make dev-up
```
- First-run behavior
  - The Ollama model is pulled automatically by the container entry script on first start (no need to visit the UI); you can watch the pull with `make dev-logs`.
  - TEI downloads its embedding model artifacts on first start.
- Open the app at http://localhost:8501
Notes
- Services: `app` (Streamlit), `litellm`, `ollama`, `tei`
- Persistence:
  - Vectors: `data/chroma_data`
  - Contracts (PDFs, metadata, entities): `data/contracts`
  - Ollama cache: `data/weights/ollama_data`
  - TEI cache: `data/weights/tei_data`
- Default local models: LLM `ollama/llama3.2:3b`, embeddings `intfloat/multilingual-e5-large-instruct`
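Once the stack is up, you can sanity-check the embeddings server directly: TEI exposes an `/embed` endpoint that returns one vector per input string. A minimal sketch, assuming the `tei` port is published to the host as 8080 (the real mapping lives in the compose files):

```python
import requests

# POST /embed is TEI's standard embedding endpoint.
# localhost:8080 is an assumed port mapping; check docker/docker-compose.base.yml.
resp = requests.post(
    "http://localhost:8080/embed",
    json={"inputs": "What is the governing law?"},
    timeout=30,
)
resp.raise_for_status()
vector = resp.json()[0]
print(len(vector))  # 1024 dimensions for multilingual-e5-large-instruct
```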
Stop the stack

```bash
make dev-down
```
- Upload a PDF to ingest. It will be chunked, embedded, and registered persistently.
- The sidebar lists previously ingested contracts; select and click “Load Contract.”
- Ask questions in chat; responses are grounded in retrieved excerpts.
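Under the hood this is a standard retrieval-augmented flow: embed the question, pull the nearest chunks from the vector store, and prompt the LLM with those excerpts. A minimal sketch of the idea — the collection name is hypothetical, the router port is an assumption, and for brevity the query here uses Chroma's built-in embedder rather than the TEI service the app actually uses:

```python
import chromadb
from openai import OpenAI

# Persistent Chroma store, matching the bind-mounted path from the compose files.
store = chromadb.PersistentClient(path="data/chroma_data")
collection = store.get_collection("contracts")  # hypothetical collection name

# LiteLLM speaks the OpenAI API, so the stock client works against it.
# The base_url/port is an assumption; check your compose/env settings.
llm = OpenAI(base_url="http://localhost:4000", api_key="unused")

question = "What is the governing law?"
hits = collection.query(query_texts=[question], n_results=4)
excerpts = "\n---\n".join(hits["documents"][0])

answer = llm.chat.completions.create(
    model="ollama/llama3.2:3b",
    messages=[
        {"role": "system", "content": "Answer only from the excerpts provided."},
        {"role": "user", "content": f"Excerpts:\n{excerpts}\n\nQuestion: {question}"},
    ],
)
print(answer.choices[0].message.content)
```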
Run the CLI inside the app container in dev mode:

```bash
docker compose -f docker/docker-compose.base.yml -f docker/docker-compose.dev.yml exec app \
  poetry run typer cli/main.py ingest data/examples/sample_nda.pdf

docker compose -f docker/docker-compose.base.yml -f docker/docker-compose.dev.yml exec app \
  poetry run typer cli/main.py ask "What is the governing law?" --id <contract_id>
```
If you prefer OpenAI instead of local models:
- Set in `.env`:

```env
LLM_PROVIDER=openai
OPENAI_API_KEY=sk-...
```
- Run the app locally (no containers needed):

```bash
make setup
make run-app
```
- Compose files
  - `docker/docker-compose.base.yml`: common services & volumes
  - `docker/docker-compose.dev.yml`: dev overrides (hot reload, volumes)
  - `docker/docker-compose.prod.yml`: prod overrides (restart policies)
- Services
  - `app`: Streamlit frontend and orchestration
  - `litellm`: OpenAI-compatible router, healthchecked via `/health/readiness` (per LiteLLM docs)
  - `ollama`: local LLM server; the entry script pulls `LOCAL_LLM_MODEL` if missing
  - `tei`: embeddings server (`intfloat/multilingual-e5-large-instruct`)
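Because `litellm` exposes `/health/readiness`, you can script a quick readiness probe yourself. A sketch assuming the router is published on host port 4000 (LiteLLM's default; the actual mapping is in the compose files):

```python
import requests

# /health/readiness is the endpoint the compose healthcheck uses (per LiteLLM docs).
# Host port 4000 is an assumption; check the compose files for the real mapping.
r = requests.get("http://localhost:4000/health/readiness", timeout=5)
print(r.status_code, r.json())
```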
- Make targets
  - Dev: `dev-up`, `dev-down`, `dev-logs`, `dev-shell`
  - Prod: `prod-up`, `prod-down`, `prod-logs`, `prod-shell`
  - Local run (OpenAI): `run-app`
  - Code quality: `setup`, `format`, `lint`
- See CONFIGURATION.md for environment variables and advanced options. Note that document chunking is dynamic based on your embedding model's token limits, which can be configured in your `.env` file.
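To illustrate what token-limit-aware chunking means, here is a minimal sketch; the function name and overlap value are hypothetical, and 512 is the published max sequence length for `intfloat/multilingual-e5-large-instruct` (the app reads the actual limit from configuration):

```python
from transformers import AutoTokenizer

def chunk_by_tokens(text: str, max_tokens: int = 512, overlap: int = 64) -> list[str]:
    """Hypothetical chunker: split text into overlapping windows that fit the embedder."""
    tok = AutoTokenizer.from_pretrained("intfloat/multilingual-e5-large-instruct")
    ids = tok.encode(text, add_special_tokens=False)
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(ids), step):
        window = ids[start : start + max_tokens]
        chunks.append(tok.decode(window))
        if start + max_tokens >= len(ids):
            break
    return chunks
```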
- Vector store: ChromaDB in persistent mode at `data/chroma_data`
- Contract registry: `data/contracts/<contract_id>/` with
  - the original PDF
  - `meta.json` (filename, pages, timestamps)
  - `entities.json` (NER output)

These paths are mounted into containers via bind volumes to persist across restarts.
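For orientation, a sketch of reading one contract's registry entry; the contract id and JSON keys are assumptions based on the descriptions above (parties, dates, monetary values), not the app's verified schema:

```python
import json
from pathlib import Path

contract_dir = Path("data/contracts") / "2024-abc123"  # hypothetical contract_id

meta = json.loads((contract_dir / "meta.json").read_text())
entities = json.loads((contract_dir / "entities.json").read_text())

print(meta.get("filename"), meta.get("pages"))
# Assumed keys, mirroring the NER feature list (parties, dates, monetary values):
for party in entities.get("parties", []):
    print("party:", party)
```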