A production-grade multi-agent AI research pipeline built with LangGraph, FastAPI, Docker, and Azure DevOps. The system uses a Supervisor → Researcher → Writer agent architecture to autonomously research any topic and produce a polished, structured report.
```
User Request (POST /research)
        │
        ▼
┌──────────────────┐
│   FastAPI App    │  ← REST API layer
└────────┬─────────┘
         │
         ▼
┌──────────────────────────────────────────┐
│         LangGraph State Machine          │
│                                          │
│   ┌────────────┐                         │
│   │ Supervisor │ ← validates & routes    │
│   └─────┬──────┘                         │
│         │                                │
│         ▼                                │
│   ┌────────────┐     ┌──────────────┐    │
│   │ Researcher │────▶│ Tavily Search│    │
│   │   Agent    │     │  Web Reader  │    │
│   └─────┬──────┘     └──────────────┘    │
│         │                                │
│         ▼                                │
│   ┌────────────┐     ┌──────────────┐    │
│   │   Writer   │────▶│ Self Critique│    │
│   │   Agent    │     │     Tool     │    │
│   └─────┬──────┘     └──────────────┘    │
│         │                                │
└─────────┼────────────────────────────────┘
          │
          ▼
Structured JSON Response
(report + sources + agent trace)
```

| Layer | Technology |
|---|---|
| Agent Framework | LangGraph (StateGraph, ReAct pattern) |
| LLM | OpenAI GPT-4o / GPT-4o-mini via init_chat_model |
| Web Search | Tavily Search API |
| API Framework | FastAPI + Pydantic v2 |
| Dependency Management | uv |
| Containerization | Docker |
| CI/CD | Azure DevOps Pipelines (3-stage) |
| Image Registry | Docker Hub |
| Code Quality | Ruff (linting) |
| Testing | Pytest |
**Supervisor** — orchestrates the pipeline. Validates state between agents and handles routing. If the Researcher produces no findings, the pipeline fails gracefully before the Writer is invoked.
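The fail-fast routing described above amounts to a conditional-edge predicate. A minimal sketch of that check (illustrative only; the real logic lives in `agents/supervisor.py` and the exact state keys may differ):

```python
def route_after_research(state: dict) -> str:
    # Hypothetical LangGraph conditional-edge function: hand off to the
    # Writer only if the Researcher actually produced findings.
    if not state.get("research_findings"):
        return "fail"    # end the graph gracefully, never invoke the Writer
    return "writer"
```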
**Researcher**
- Model: `gpt-4o-mini`
- Tools: Tavily web search + webpage content reader
- Behaviour: Executes 2-3 targeted searches, reads full articles when needed, outputs structured JSON findings
- Max tool calls: 5 (cost control)
**Writer**
- Model: `gpt-4o`
- Tools: Self-critique tool
- Behaviour: Transforms raw research into a polished Markdown report, self-reviews the draft, revises before finalising
All agents communicate through a typed AgentState (a Python TypedDict that serves as the LangGraph state schema) — no direct agent-to-agent calls. State carries the topic, research findings, final report, agent trace, and token count.
3-stage pipeline triggered on every push to master:
```
Stage 1: 🧪 Quality Gate
├── Set Python 3.11
├── Install dependencies via uv
├── Ruff lint check
└── Pytest (mocked pipeline tests)

Stage 2: 🐳 Build & Ship
├── Docker build
└── Push to Docker Hub (:latest + :build_id)

Stage 3: 🔍 Container Health Verification
├── Pull image from Docker Hub
├── Run container with env vars
├── Hit /health endpoint → assert HTTP 200
├── Print health response
└── Cleanup container
```
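The quality-gate tests mock the pipeline so CI never calls real LLM or search APIs. A sketch of that pattern (names hypothetical; the actual tests live in `tests/test_api.py`):

```python
from unittest.mock import MagicMock

# Hypothetical sketch: the real pipeline entry point is replaced with a
# mock, so tests assert on routing and response shape without spending
# tokens or needing API keys.
pipeline = MagicMock(return_value={"status": "success", "tokens_used": 0})

result = pipeline("Impact of LLMs on software engineering jobs in 2025")
pipeline.assert_called_once()
```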
Secrets (OPENAI_API_KEY, TAVILY_API_KEY, DOCKER_HUB_USERNAME) are stored in Azure DevOps Variable Groups — never in code.
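In `azure-pipelines.yml`, pulling in such a variable group typically looks like the fragment below (the group name is illustrative; note that secret variables must be mapped into script steps explicitly):

```yaml
variables:
  - group: llmops-secrets              # hypothetical group, defined under Pipelines → Library

steps:
  - script: echo "secrets are injected as pipeline variables"
    env:
      OPENAI_API_KEY: $(OPENAI_API_KEY)  # secret variables are not exposed to scripts unless mapped
```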
```
llmops-multi-agent-cicd-pipeline/
├── agents/
│   ├── researcher.py      # Researcher ReAct agent
│   ├── writer.py          # Writer ReAct agent
│   └── supervisor.py      # Routing & validation logic
├── graph/
│   ├── state.py           # Shared AgentState TypedDict
│   └── pipeline.py        # LangGraph StateGraph wiring
├── tools/
│   ├── search.py          # Tavily search tool
│   └── web_reader.py      # URL content extractor
├── models/
│   └── schemas.py         # Pydantic request/response schemas
├── tests/
│   └── test_api.py        # Pytest tests (mocked pipeline)
├── main.py                # App entry point
├── api/
│   └── main.py            # FastAPI application
├── Dockerfile
├── docker-compose.yml
├── azure-pipelines.yml
└── pyproject.toml
```
`GET /health` returns service health status.

```json
{
  "status": "healthy",
  "service": "multi-agent-pipeline",
  "version": "1.0.0"
}
```

`POST /research` runs the full multi-agent pipeline.
Request:

```json
{
  "topic": "Impact of LLMs on software engineering jobs in 2025",
  "max_search_results": 3
}
```

Response:
```json
{
  "topic": "Impact of LLMs on software engineering jobs in 2025",
  "research_summary": {
    "key_findings": ["Finding 1", "Finding 2"],
    "sources": ["https://..."],
    "search_queries_used": ["query 1", "query 2"]
  },
  "final_report": "# Impact of LLMs...\n\n## Executive Summary\n...",
  "agent_trace": [
    {"agent": "researcher", "action": "start", "detail": "Starting research on: ..."},
    {"agent": "researcher", "action": "complete", "detail": "Found 5 key findings"},
    {"agent": "writer", "action": "start", "detail": "Starting report writing"},
    {"agent": "writer", "action": "complete", "detail": "Report written — 520 words"}
  ],
  "tokens_used": 3420,
  "status": "success"
}
```

- Python 3.11+
- uv installed
- Docker Desktop
- OpenAI API key
- Tavily API key (free tier at tavily.com)
```bash
# Clone the repo
git clone https://github.com/your-username/llmops-multi-agent-cicd-pipeline.git
cd llmops-multi-agent-cicd-pipeline

# Create and activate virtual environment
uv venv .venv
source .venv/bin/activate    # Linux/Mac
# .venv\Scripts\Activate.ps1 # Windows PowerShell

# Install dependencies
uv sync

# Add environment variables
cp .env.example .env
# Edit .env and add your API keys

# Run the API
uvicorn main:app --reload --host 0.0.0.0 --port 8000
```

```bash
docker build -t llmops-multi-agent-cicd-pipeline .
docker run -p 8000:8000 \
  -e OPENAI_API_KEY=your_key \
  -e TAVILY_API_KEY=your_key \
  llmops-multi-agent-cicd-pipeline
```

```bash
uv run pytest tests/ -v
uv run ruff check .
```

| Variable | Description |
|---|---|
| `OPENAI_API_KEY` | OpenAI API key |
| `TAVILY_API_KEY` | Tavily Search API key |
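Reading these variables at startup might look like the sketch below (a hypothetical helper, not the project's actual code) so a missing key fails fast rather than surfacing mid-request:

```python
import os

def load_settings() -> dict:
    # Hypothetical helper: read required keys from the environment and
    # fail fast at startup if any are missing.
    required = ("OPENAI_API_KEY", "TAVILY_API_KEY")
    missing = [key for key in required if not os.environ.get(key)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {missing}")
    return {key: os.environ[key] for key in required}
```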
Why LangGraph over LangChain AgentExecutor? LangGraph models each agent as an explicit node in a state graph, giving full control over state, routing, and error handling. Each agent's inputs and outputs are typed and traceable.
Why separate Researcher and Writer agents? Separation of concerns — the Researcher never writes prose, the Writer never searches the web. This prevents hallucination (Writer can only use what Researcher found) and makes each agent's behaviour predictable and testable.
Why init_chat_model? Provider-agnostic LLM initialisation. Swapping from OpenAI to Anthropic or any other provider requires changing one string, not the entire codebase.
Why GPT-4o-mini for Researcher and GPT-4o for Writer? Cost optimisation — the Researcher makes many tool calls and processes raw text, where speed and cost matter more than prose quality. The Writer makes fewer calls but needs higher quality output.
Every API response includes a full agent_trace showing every action taken by every agent. This is intentional — it provides transparency into how the answer was produced, which is essential for debugging and for demonstrating the system's reasoning in interviews and demos.
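Trace entries are simple structured records, one per agent action. Appending them might look like this (a hypothetical helper; field names taken from the response example above):

```python
def trace(state: dict, agent: str, action: str, detail: str) -> None:
    # Hypothetical helper: append one structured entry to the shared
    # agent_trace list carried in the pipeline state.
    state.setdefault("agent_trace", []).append(
        {"agent": agent, "action": action, "detail": detail}
    )

state = {}
trace(state, "researcher", "start", "Starting research on: LLMs")
trace(state, "researcher", "complete", "Found 5 key findings")
```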
MIT