

LLMOps Multi-Agent CI/CD Pipeline

A production-grade multi-agent AI research pipeline built with LangGraph, FastAPI, Docker, and Azure DevOps. The system uses a Supervisor → Researcher → Writer agent architecture to autonomously research any topic and produce a polished, structured report.


Architecture Overview

User Request (POST /research)
        │
        ▼
┌──────────────────┐
│   FastAPI App    │  ← REST API layer
└────────┬─────────┘
         │
         ▼
┌──────────────────────────────────────────┐
│         LangGraph State Machine          │
│                                          │
│   ┌────────────┐                         │
│   │ Supervisor │  ← validates & routes   │
│   └─────┬──────┘                         │
│         │                                │
│         ▼                                │
│   ┌────────────┐     ┌──────────────┐   │
│   │ Researcher │────▶│ Tavily Search│   │
│   │   Agent    │     │  Web Reader  │   │
│   └─────┬──────┘     └──────────────┘   │
│         │                                │
│         ▼                                │
│   ┌────────────┐     ┌──────────────┐   │
│   │   Writer   │────▶│ Self Critique│   │
│   │   Agent    │     │    Tool      │   │
│   └─────┬──────┘     └──────────────┘   │
│         │                                │
└─────────┼────────────────────────────────┘
          │
          ▼
   Structured JSON Response
   (report + sources + agent trace)

Tech Stack

| Layer | Technology |
|---|---|
| Agent Framework | LangGraph (StateGraph, ReAct pattern) |
| LLM | OpenAI GPT-4o / GPT-4o-mini via init_chat_model |
| Web Search | Tavily Search API |
| API Framework | FastAPI + Pydantic v2 |
| Dependency Management | uv |
| Containerization | Docker |
| CI/CD | Azure DevOps Pipelines (3-stage) |
| Image Registry | Docker Hub |
| Code Quality | Ruff (linting) |
| Testing | Pytest |

Multi-Agent Design

Supervisor

Orchestrates the pipeline. Validates state between agents and handles routing. If the Researcher produces no findings, the pipeline fails gracefully before the Writer is invoked.
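
The fail-gracefully rule can be sketched in plain Python (names here are illustrative assumptions; the real logic lives in agents/supervisor.py as a LangGraph conditional edge):

```python
# Illustrative sketch of the Supervisor's routing rule after the Researcher runs.
# Hypothetical state keys; not the repo's exact implementation.

def route_after_research(state: dict) -> str:
    """Decide the next node once the Researcher has finished."""
    findings = state.get("research_findings") or []
    if not findings:
        # Fail gracefully: the Writer is never invoked on empty research.
        return "fail"
    return "writer"

print(route_after_research({"research_findings": ["Finding 1"]}))  # writer
print(route_after_research({"research_findings": []}))             # fail
```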

Researcher Agent (ReAct)

  • Model: gpt-4o-mini
  • Tools: Tavily web search + webpage content reader
  • Behaviour: Executes 2-3 targeted searches, reads full articles when needed, outputs structured JSON findings
  • Max tool calls: 5 (cost control)
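
The 5-call cap could be enforced with a small budget wrapper around each tool; this is a hypothetical sketch of the idea, not the repo's actual mechanism:

```python
# Sketch of a tool-call budget (the "max 5" cost control above).
from functools import wraps

class ToolBudgetExceeded(RuntimeError):
    pass

def with_budget(max_calls: int):
    """Wrap a tool so it raises once its call budget is spent."""
    def decorate(tool_fn):
        calls = {"n": 0}  # closure-held call counter
        @wraps(tool_fn)
        def wrapper(*args, **kwargs):
            if calls["n"] >= max_calls:
                raise ToolBudgetExceeded(
                    f"{tool_fn.__name__}: budget of {max_calls} calls spent"
                )
            calls["n"] += 1
            return tool_fn(*args, **kwargs)
        return wrapper
    return decorate

@with_budget(5)
def tavily_search(query: str) -> list[str]:
    return [f"result for {query}"]  # stand-in for the real Tavily API call
```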

Writer Agent (ReAct)

  • Model: gpt-4o
  • Tools: Self-critique tool
  • Behaviour: Transforms raw research into a polished Markdown report, self-reviews the draft, revises before finalising

Shared State

All agents communicate through a typed AgentState (LangGraph TypedDict) — no direct agent-to-agent calls. State carries topic, research findings, final report, agent trace, and token count.
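
A minimal sketch of that shared state, assuming field names matching the response schema (the actual definition lives in graph/state.py):

```python
# Hypothetical shape of the shared AgentState described above.
from typing import TypedDict

class AgentState(TypedDict):
    topic: str
    research_findings: list[str]
    final_report: str
    agent_trace: list[dict]
    tokens_used: int

# Agents read and write this one dict; they never call each other directly.
state: AgentState = {
    "topic": "Impact of LLMs",
    "research_findings": [],
    "final_report": "",
    "agent_trace": [],
    "tokens_used": 0,
}
```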


CI/CD Pipeline (Azure DevOps)

3-stage pipeline triggered on every push to master:

Stage 1: 🧪 Quality Gate
├── Set Python 3.11
├── Install dependencies via uv
├── Ruff lint check
└── Pytest (mocked pipeline tests)

Stage 2: 🐳 Build & Ship
├── Docker build
└── Push to Docker Hub (:latest + :build_id)

Stage 3: 🔍 Container Health Verification
├── Pull image from Docker Hub
├── Run container with env vars
├── Hit /health endpoint → assert HTTP 200
├── Print health response
└── Cleanup container

Secrets (OPENAI_API_KEY, TAVILY_API_KEY, DOCKER_HUB_USERNAME) are stored in Azure DevOps Variable Groups — never in code.


Project Structure

llmops-multi-agent-cicd-pipeline/
├── agents/
│   ├── researcher.py      # Researcher ReAct agent
│   ├── writer.py          # Writer ReAct agent
│   └── supervisor.py      # Routing & validation logic
├── graph/
│   ├── state.py           # Shared AgentState TypedDict
│   └── pipeline.py        # LangGraph StateGraph wiring
├── tools/
│   ├── search.py          # Tavily search tool
│   └── web_reader.py      # URL content extractor
├── models/
│   └── schemas.py         # Pydantic request/response schemas
├── tests/
│   └── test_api.py        # Pytest tests (mocked pipeline)
├── main.py                # App entry point
├── api/
│   └── main.py            # FastAPI application
├── Dockerfile
├── docker-compose.yml
├── azure-pipelines.yml
└── pyproject.toml

API Endpoints

GET /health

Returns service health status.

{
  "status": "healthy",
  "service": "multi-agent-pipeline",
  "version": "1.0.0"
}

POST /research

Runs the full multi-agent pipeline.

Request:

{
  "topic": "Impact of LLMs on software engineering jobs in 2025",
  "max_search_results": 3
}

Response:

{
  "topic": "Impact of LLMs on software engineering jobs in 2025",
  "research_summary": {
    "key_findings": ["Finding 1", "Finding 2"],
    "sources": ["https://..."],
    "search_queries_used": ["query 1", "query 2"]
  },
  "final_report": "# Impact of LLMs...\n\n## Executive Summary\n...",
  "agent_trace": [
    {"agent": "researcher", "action": "start", "detail": "Starting research on: ..."},
    {"agent": "researcher", "action": "complete", "detail": "Found 5 key findings"},
    {"agent": "writer", "action": "start", "detail": "Starting report writing"},
    {"agent": "writer", "action": "complete", "detail": "Report written — 520 words"}
  ],
  "tokens_used": 3420,
  "status": "success"
}
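
A quick client-side sanity check of this response shape, using only the stdlib (the service's own validation lives in models/schemas.py as Pydantic models):

```python
# Minimal check that a /research response body carries the documented keys.
import json

REQUIRED_KEYS = {"topic", "research_summary", "final_report",
                 "agent_trace", "tokens_used", "status"}

def check_response(raw: str) -> dict:
    """Parse the JSON body and verify the documented top-level keys exist."""
    body = json.loads(raw)
    missing = REQUIRED_KEYS - body.keys()
    if missing:
        raise ValueError(f"response missing keys: {sorted(missing)}")
    return body

sample = json.dumps({
    "topic": "LLMs", "research_summary": {}, "final_report": "# ...",
    "agent_trace": [], "tokens_used": 0, "status": "success",
})
print(check_response(sample)["status"])  # success
```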

Local Setup

Prerequisites

  • Python 3.11+
  • uv installed
  • Docker Desktop
  • OpenAI API key
  • Tavily API key (free tier at tavily.com)

Run Locally

# Clone the repo
git clone https://github.com/your-username/llmops-multi-agent-cicd-pipeline.git
cd llmops-multi-agent-cicd-pipeline

# Create and activate virtual environment
uv venv .venv
source .venv/bin/activate  # Linux/Mac
# .venv\Scripts\Activate.ps1  # Windows PowerShell

# Install dependencies
uv sync

# Add environment variables
cp .env.example .env
# Edit .env and add your API keys

# Run the API
uvicorn main:app --reload --host 0.0.0.0 --port 8000

Run with Docker

docker build -t llmops-multi-agent-cicd-pipeline .

docker run -p 8000:8000 \
  -e OPENAI_API_KEY=your_key \
  -e TAVILY_API_KEY=your_key \
  llmops-multi-agent-cicd-pipeline

Run Tests

uv run pytest tests/ -v
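
The "mocked pipeline" idea means tests inject a stub in place of the LLM pipeline, so they run without API keys. A sketch with illustrative names (the repo's actual tests are in tests/test_api.py):

```python
# Sketch of testing the endpoint logic with a mocked pipeline; no API keys needed.
from unittest.mock import MagicMock

def run_research_endpoint(topic: str, pipeline) -> dict:
    """Stand-in for the FastAPI handler: delegates to the injected pipeline."""
    result = pipeline.invoke({"topic": topic})
    return {"topic": topic, "status": "success",
            "final_report": result["final_report"]}

def test_research_uses_pipeline():
    fake = MagicMock()
    fake.invoke.return_value = {"final_report": "# Stub report"}
    out = run_research_endpoint("LLMs", pipeline=fake)
    assert out["status"] == "success"
    assert out["final_report"] == "# Stub report"
    fake.invoke.assert_called_once_with({"topic": "LLMs"})

test_research_uses_pipeline()
```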

Lint

uv run ruff check .

Environment Variables

| Variable | Description |
|---|---|
| OPENAI_API_KEY | OpenAI API key |
| TAVILY_API_KEY | Tavily Search API key |

Key Design Decisions

Why LangGraph over LangChain AgentExecutor? LangGraph models each agent as an explicit node in a state graph, giving full control over state, routing, and error handling. Each agent's inputs and outputs are typed and traceable.

Why separate Researcher and Writer agents? Separation of concerns — the Researcher never writes prose, the Writer never searches the web. This prevents hallucination (Writer can only use what Researcher found) and makes each agent's behaviour predictable and testable.

Why init_chat_model? Provider-agnostic LLM initialisation. Swapping from OpenAI to Anthropic or any other provider requires changing one string, not the entire codebase.

Why GPT-4o-mini for Researcher and GPT-4o for Writer? Cost optimisation — the Researcher makes many tool calls and processes raw text, where speed and cost matter more than prose quality. The Writer makes fewer calls but needs higher quality output.


Agent Trace — Observability

Every API response includes a full agent_trace showing every action taken by every agent. This is intentional — it provides transparency into how the answer was produced, which is essential for debugging and for demonstrating the system's reasoning in interviews and demos.
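
As an illustration, the trace format shown above can be folded into per-agent action counts with a few lines of stdlib Python (a hypothetical helper, not part of the repo):

```python
# Summarize an agent_trace (as returned by /research) into per-agent counts.
from collections import Counter

def summarize_trace(trace: list[dict]) -> dict[str, Counter]:
    by_agent: dict[str, Counter] = {}
    for step in trace:
        by_agent.setdefault(step["agent"], Counter())[step["action"]] += 1
    return by_agent

trace = [
    {"agent": "researcher", "action": "start", "detail": "..."},
    {"agent": "researcher", "action": "complete", "detail": "..."},
    {"agent": "writer", "action": "start", "detail": "..."},
    {"agent": "writer", "action": "complete", "detail": "..."},
]
print(summarize_trace(trace)["writer"]["complete"])  # 1
```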


License

MIT
