An AI-powered curation assistant for the Alliance of Genome Resources, helping biocurators extract and validate biological data from research papers.
- Config-Driven Agents - Shipped agents live in package-owned YAML, not hardcoded Python files. Standard installs customize via `~/.agr_ai_curation/runtime/packages/*` and `~/.agr_ai_curation/runtime/config/*`.
- Agent Studio - Browse agents, inspect prompts, discuss behavior with Claude, and submit improvement suggestions.
- Agent Workshop - Clone any agent, customize its prompt, select a model, attach tools, and test it against live documents -- all without writing code.
- Visual Workflow Builder - Create reusable curation flows by chaining agents together in a drag-and-drop interface.
- Multi-Provider LLM Support - Pluggable provider system (`config/providers.yaml`) supporting OpenAI, Gemini, and Groq out of the box. Models are declared in `config/models.yaml`.
- PDF Processing - Upload research papers and extract structured data with AI assistance.
- Batch Processing - Process multiple documents through saved workflows.
- Real-time Audit Trail - Full transparency into AI decisions, database queries, and tool calls.
- Tool Policy System - Centralized YAML-based policies (`config/tool_policy_defaults.yaml`) governing which tools are available to curators and agents.
- Docker and Docker Compose
- OpenAI API key (for embeddings and GPT models)
- Optional: Anthropic API key (for Claude in Agent Studio chat)
- Optional: Groq API key, Gemini API key (additional LLM providers)
For the published modular runtime, use the installer instead of editing the repository in place:
- Get the installer

      git clone https://github.com/alliance-genome/agr_ai_curation.git
      cd agr_ai_curation

- Run the standalone installer

      scripts/install/install.sh

  Stage 2 prompts `Package profile [1=core only, 2=core + alliance]` and defaults to `core only`. That seeds `agr.core` (Alliance Core) only, and a core-only install is expected to start healthy. To pin a published release, pass `--image-tag vX.Y.Z`.

  To add `agr.alliance` (Alliance Defaults) later, re-run Stage 2:

      scripts/install/install.sh --from-stage 2 --package-profile core-plus-alliance

- Review the installed runtime
  - Secrets and image tags: `~/.agr_ai_curation/.env`
  - Selected package profile: `~/.agr_ai_curation/.install_package_profile.env`
  - Runtime config: `~/.agr_ai_curation/runtime/config/`
  - Shipped and custom packages: `~/.agr_ai_curation/runtime/packages/`
  - Mutable data: `~/.agr_ai_curation/data/`
- Access the application
  - Frontend: http://localhost:3002
  - Backend API: http://localhost:8000
  - API Documentation: http://localhost:8000/docs
See Modular Packages and Upgrades for package authoring, override behavior, standard upgrades, and repo-install migration.
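The runtime locations listed under "Review the installed runtime" can be sanity-checked with a short script. The following is a minimal sketch, not part of the installer: the function name `missing_runtime_paths` and the `EXPECTED_PATHS` mapping are hypothetical helpers, and only the relative paths themselves come from the list above.

```python
from pathlib import Path

# Relative paths taken from the "Review the installed runtime" list.
# The install root is assumed to be ~/.agr_ai_curation.
EXPECTED_PATHS = {
    "secrets and image tags": ".env",
    "selected package profile": ".install_package_profile.env",
    "runtime config": "runtime/config",
    "shipped and custom packages": "runtime/packages",
    "mutable data": "data",
}


def missing_runtime_paths(root: Path) -> dict[str, Path]:
    """Return the expected entries that do not exist under `root`."""
    missing = {}
    for label, relative in EXPECTED_PATHS.items():
        candidate = root / relative
        if not candidate.exists():
            missing[label] = candidate
    return missing
```

Calling `missing_runtime_paths(Path.home() / ".agr_ai_curation")` after an install should return an empty dictionary if every listed location is present; any entries it returns name the locations worth inspecting first.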
For local product development in a repository checkout:
- Configure environment

      make setup
      # Edit ~/.agr_ai_curation/.env with your API keys and settings

  At minimum, set these values in `~/.agr_ai_curation/.env`:

      OPENAI_API_KEY=your_openai_api_key_here

- Start the services

      make dev-detached
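A quick way to confirm the minimum `.env` requirement from the "Configure environment" step is a small script like the one below. It is a sketch under stated assumptions: the parser handles only plain `KEY=VALUE` lines (no `export` prefixes, inline comments, or multi-line values), and the names `parse_env_text` and `missing_keys` are hypothetical, not part of this project.

```python
# OPENAI_API_KEY is the only key the setup step names as required.
REQUIRED_KEYS = ("OPENAI_API_KEY",)
# Placeholder value shown in the setup step; treated here as "not set".
PLACEHOLDER = "your_openai_api_key_here"


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding double or single quotes, if any.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return required keys that are absent, empty, or still the placeholder."""
    return [k for k in REQUIRED_KEYS if values.get(k, "") in ("", PLACEHOLDER)]
```

To use it, read the file with `Path.home().joinpath(".agr_ai_curation", ".env").read_text()`, pass the text to `parse_env_text`, and check that `missing_keys` returns an empty list.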
# Standalone install
docker compose --env-file ~/.agr_ai_curation/.env -f docker-compose.production.yml ps
# Source development
docker compose ps
# Shared health check
curl http://localhost:8000/health

For a generic standalone install, `curation_db` is optional and can remain
`not_configured`. It should be treated as a later integration step, not a base
install requirement.
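The rule that `curation_db` may stay `not_configured` can be expressed as a small check. This is a sketch only. The shape of the `/health` response is not documented here, so `unhealthy_components` takes a flat mapping of component name to status string that you would build from the real payload. The status strings in `HEALTHY_STATUSES` are assumptions, and every function name is hypothetical.

```python
import json
import urllib.request

# Components that may report "not_configured" on a generic standalone install.
OPTIONAL_COMPONENTS = {"curation_db"}

# ASSUMPTION: these "good" status strings are illustrative, not confirmed.
HEALTHY_STATUSES = {"ok", "healthy", "connected"}


def fetch_health(url: str = "http://localhost:8000/health") -> object:
    """Fetch and decode the health endpoint's JSON body (shape not assumed)."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return json.load(response)


def unhealthy_components(statuses: dict[str, str]) -> list[str]:
    """Return component names whose status should be treated as a problem."""
    bad = []
    for name, status in sorted(statuses.items()):
        if status in HEALTHY_STATUSES:
            continue
        # "not_configured" is acceptable only for optional components.
        if status == "not_configured" and name in OPTIONAL_COMPONENTS:
            continue
        bad.append(name)
    return bad
```

Keeping the optional set explicit means a later `curation_db` integration only requires removing it from `OPTIONAL_COMPONENTS` to make the check strict.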
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Frontend │────▶│ Backend │────▶│ Weaviate │
│ (React/MUI) │ │ (FastAPI) │ │ (Vector Store) │
└─────────────────┘ └─────────────────┘ └─────────────────┘
│
▼
┌─────────────────┐
│ PostgreSQL │
│ (Metadata) │
└─────────────────┘
Agents are no longer hardcoded Python files. The default standalone install
seeds agr.core (Alliance Core), which provides the minimal supervisor/startup
contract. Installing agr.alliance (Alliance Defaults) restores the shipped
specialist catalog and tool bindings, while standalone deployments customize
behavior through additional packages under
~/.agr_ai_curation/runtime/packages/ and deployment YAML overrides under
~/.agr_ai_curation/runtime/config/.
~/.agr_ai_curation/
├── runtime/config/ # Deployment override YAML
│ ├── models.yaml
│ ├── providers.yaml
│ └── tool_policy_defaults.yaml
└── runtime/packages/
├── core/ # `agr.core` (Alliance Core)
├── alliance/ # `agr.alliance` (Alliance Defaults, optional)
└── org-custom/ # Your custom package(s)
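To see which agents a given runtime layout would expose, you can walk the packages directory. The sketch below assumes each agent lives at `<package>/agents/<agent>/agent.yaml`, the pattern used for package-owned agent definitions in this README. The function name `find_agent_definitions` is hypothetical, and the script only lists files; it does not parse or validate the YAML.

```python
from pathlib import Path


def find_agent_definitions(packages_dir: Path) -> dict[str, list[str]]:
    """Map each package folder name to the agent folders containing agent.yaml."""
    found: dict[str, list[str]] = {}
    for agent_file in sorted(packages_dir.glob("*/agents/*/agent.yaml")):
        agent_dir = agent_file.parent            # <package>/agents/<agent>
        package_dir = agent_dir.parent.parent    # <package>
        found.setdefault(package_dir.name, []).append(agent_dir.name)
    return found
```

For a default install you would call it as `find_agent_definitions(Path.home() / ".agr_ai_curation" / "runtime" / "packages")`. Agent folders without an `agent.yaml` are skipped.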
| Service | Port | Description |
|---|---|---|
| frontend | 3002 | React web application |
| backend | 8000 | FastAPI server with config-driven AI agents |
| weaviate | 8080 | Vector database for document chunks |
| postgres | 5432 | Metadata storage (users, documents, flows, custom agents) |
| langfuse | 3000 | Observability and tracing (optional) |
| trace_review | 3001/8001 | Langfuse trace analysis UI (separate service) |
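The port assignments in the table can be probed from the host with a short script. This sketch assumes each port is published on `localhost`, which this README states only for the frontend and backend; a service reachable only inside the Docker network will show as closed. The helper names are hypothetical.

```python
import socket

# Ports copied from the services table; trace_review lists two (3001/8001).
SERVICE_PORTS = {
    "frontend": [3002],
    "backend": [8000],
    "weaviate": [8080],
    "postgres": [5432],
    "langfuse": [3000],
    "trace_review": [3001, 8001],
}


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host: str = "localhost") -> dict[str, bool]:
    """Report, per service, whether all of its listed ports accept connections."""
    return {
        name: all(port_is_open(host, port) for port in ports)
        for name, ports in SERVICE_PORTS.items()
    }
```

An open port only shows that something is listening; it does not replace the `/health` check for the backend. Optional services such as langfuse are expected to report `False` when they are not running.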
- Getting Started - First-time setup and basic usage
- Best Practices - Tips for effective queries
- Available Agents - All specialist agents
- Curation Flows - Visual workflow builder
- Batch Processing - Process multiple documents
- Agent Studio - Browse prompts, Agent Workshop, and chat with Claude
- Harness Health - Current automation and validation health notes
- Trace Review - Langfuse trace analysis tool for debugging agent behavior
- Independent Deployment - Standalone deployment guide
- Modular Packages and Upgrades - Runtime layout, package model, overrides, and upgrade paths
- LLM Provider Rollout Runbook - Adding new LLM providers
- LLM Provider Smoke Test Matrix - Provider validation checklist
# Run all healthy tests
docker compose exec backend pytest tests/unit/ -v
# Run specific test file
docker compose exec backend pytest tests/unit/test_config.py -v
# Run with coverage
docker compose exec backend pytest tests/unit/ --cov=src --cov-report=html

# Lint with ruff
docker compose exec backend ruff check .
# Format code
docker compose exec backend ruff format .

# Create a new migration
docker compose exec backend alembic revision --autogenerate -m "description"
# Apply migrations
docker compose exec backend alembic upgrade head

For a standalone/public install, add agents through a custom runtime package
under ~/.agr_ai_curation/runtime/packages/<your-package>/agents/ rather than
editing the repo-local config/agents/ directory directly. See
config/agents/README.md and
Modular Packages and Upgrades for the
package contract.
If you are developing the built-in catalog from a repository checkout,
config/agents/supervisor/ mirrors the shipped agr.core supervisor bundle and
the remaining config/agents/<name>/ folders mirror the shipped
agr.alliance specialist bundles.
All runtime configuration is done through environment variables. See .env.example for available options.
| Section | Variables | Description |
|---|---|---|
| API Keys | `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `GROQ_API_KEY`, `GEMINI_API_KEY` | LLM provider credentials |
| Database | `DATABASE_URL` | PostgreSQL connection |
| Weaviate | `WEAVIATE_HOST`, `WEAVIATE_PORT` | Vector store connection |
| LLM Settings | `DEFAULT_AGENT_MODEL`, `DEFAULT_AGENT_REASONING` | Global model defaults |
| Auth | `COGNITO_*` | AWS Cognito authentication |
| File | Description |
|---|---|
| `~/.agr_ai_curation/runtime/packages/*/agents/*/agent.yaml` | Package-owned agent definitions for standalone installs |
| `~/.agr_ai_curation/runtime/config/models.yaml` | Deployment model overrides |
| `~/.agr_ai_curation/runtime/config/providers.yaml` | Deployment provider overrides |
| `~/.agr_ai_curation/runtime/config/tool_policy_defaults.yaml` | Deployment tool policy overrides |
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Alliance of Genome Resources
- OpenAI for GPT models
- Anthropic for Claude
- Groq for high-throughput inference
- Weaviate for vector storage