An AI agent leveraging fine-tuned LLMs (Google/Gemma-3-1B-it with PEFT) and LangGraph to proactively identify, assess, and suggest mitigations for supply chain disruptions, tailored to specific client needs (e.g., clients in the Petroleum and Pharma industries). Includes MLOps integration with Vertex AI and MLflow for robust deployment and maintenance.
Managing complex global supply chains for diverse clients in the Petroleum industry (volatile energy markets) and the Pharma industry (temperature-sensitive products) involves constant risk of disruptions (e.g., port closures, extreme weather, customs delays, geopolitical events). Reactive management of such events has led to significant costs, SLA penalties, and client dissatisfaction for major global logistics providers. A system was needed to proactively identify potential disruptions, assess their impact on each client's specific supply chain, and suggest mitigation actions before issues escalate.
This project implements a proactive AI agent built with LangGraph that orchestrates multiple components:
- Event Ingestion & Filtering: Monitors real-time data streams (news APIs like GDELT, weather feeds, internal alerts) and filters for potentially relevant supply chain events.
- Context Gathering: Retrieves client-specific data including network maps, contractual obligations (SLAs), product sensitivities (e.g., temperature constraints), and historical performance.
- Client-Specific Risk Assessment: Utilizes a Large Language Model (LLM), fine-tuned using PEFT/LoRA on client-specific data, to assess the potential impact of the event on that client's unique supply chain configuration and sensitivities.
- Impact Forecasting: Employs a hybrid approach combining a traditional ML model (XGBoost trained on historical data) to predict quantitative impact (e.g., probability of delay) and LLM reasoning to forecast qualitative impact (cost, operational issues).
- RAG-Enhanced Mitigation Suggestions: Generates context-aware mitigation actions using an LLM augmented with Retrieval-Augmented Generation (RAG). The RAG component retrieves relevant Standard Operating Procedures (SOPs) and successful past mitigation examples from a vector database (a minimal retrieval sketch follows this list).
- Notification: Dispatches tailored alerts to relevant internal teams (Operations, Client Managers) with the assessment, forecast, and suggested actions.
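To make the RAG-Enhanced Mitigation step above concrete, here is a minimal, hypothetical retrieval sketch using Sentence Transformers and FAISS (both listed in the tech stack below). The SOP snippets, embedding model name, and query are invented for illustration; the production retrieval logic lives in `src/proactive_disruption_agent/retrieval/`.

```python
# Minimal RAG retrieval sketch (hypothetical data; real logic lives in src/.../retrieval/).
import numpy as np
import faiss
from sentence_transformers import SentenceTransformer

# Hypothetical SOP snippets standing in for the SOP corpus.
sops = [
    "SOP-12: Reroute temperature-sensitive shipments via qualified cold-chain carriers.",
    "SOP-07: For port closures, pre-book alternative berths or switch to rail within 24h.",
    "SOP-21: Notify the client manager when SLA breach probability exceeds 30%.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # embedding model name is an assumption
embeddings = encoder.encode(sops, convert_to_numpy=True).astype("float32")

index = faiss.IndexFlatL2(embeddings.shape[1])     # flat local index for the demo
index.add(embeddings)

query = "Typhoon forecast near Shanghai port; pharma client with a 48h temperature window"
query_vec = encoder.encode([query], convert_to_numpy=True).astype("float32")
distances, ids = index.search(query_vec, k=2)

retrieved = [sops[i] for i in ids[0]]
# `retrieved` would then be injected into the mitigation-suggestion prompt for the LLM.
print(retrieved)
```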
The core differentiator of this agent is the client-specific fine-tuning of the risk assessment LLM. Unlike generic alerting systems, this agent understands the unique vulnerabilities, contractual nuances, and product requirements of individual clients (demonstrated with Pharma vs. Petroleum Industry archetypes). This allows for highly contextualized risk scoring and mitigation suggestions, moving beyond a one-size-fits-all approach.
- Proactive Disruption Identification from multiple sources.
- Client-Specific Impact Assessment using fine-tuned LLMs (PEFT/LoRA on Google/Gemma-3-1B-it).
- Hybrid Impact Forecasting (XGBoost for delay probability + LLM reasoning).
- RAG-Enhanced Mitigation leveraging SOPs and historical data via Vector Search.
- Stateful and Conditional Agent Workflow using LangGraph.
- End-to-End MLOps Pipeline on Google Cloud Vertex AI (Pipelines, Training, Prediction, Monitoring).
- Experiment Tracking & Model Registry integration with MLflow.
- Containerized FastAPI application for potential integration.
- Comprehensive testing suite (pytest).
- Orchestration: Langchain, LangGraph
- LLMs: Google/Gemma-3-1B-it (Base Model), Hugging Face Transformers, PEFT (LoRA)
- ML: PyTorch (underlying Transformers), Scikit-learn, XGBoost
- Vector DB (RAG): FAISS (for local demo), Vertex AI Vector Search (for cloud)
- Embeddings: Sentence Transformers (Hugging Face)
- Cloud Platform: Google Cloud Platform (GCP)
- MLOps: Vertex AI (Pipelines, Custom Training, Endpoints, Prediction, Model Registry, Experiments [via MLflow integration or native], Model Monitoring)
- Data: BigQuery, Cloud Storage, Pub/Sub
- Compute/Serving: Google Kubernetes Engine (GKE) or Cloud Run
- Experiment Tracking: MLflow
- API: FastAPI, Uvicorn
- Containerization: Docker
- Testing: Pytest, pytest-asyncio, HTTPX
- Language: Python 3.9+
The agent operates as a state machine defined using LangGraph. The state (AgentState) tracks information about the event, affected clients, assessments, forecasts, and suggestions as it moves through the nodes. Edges determine the next step, often conditionally based on the current state (e.g., event relevance, presence of errors).
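The following is a minimal, hypothetical sketch of what such a state definition might look like; the field names are illustrative and the actual `AgentState` lives in `src/proactive_disruption_agent/agent/`.

```python
# Hypothetical sketch of the agent state; the real definition lives in src/.../agent/ (file name assumed).
from typing import Any, Dict, List, Optional, TypedDict


class AgentState(TypedDict, total=False):
    event_id: str                                   # assigned in entry_node
    raw_event: Dict[str, Any]                       # parsed/fetched event payload
    is_relevant: bool                               # set by filter_event
    affected_clients: List[str]                     # clients identified by filter_event
    client_context: Dict[str, Dict[str, Any]]       # per-client routes, SLAs, sensitivities
    risk_assessments: Dict[str, str]                # per-client fine-tuned LLM assessments
    impact_forecasts: Dict[str, Dict[str, Any]]     # XGBoost probability + LLM reasoning
    mitigation_suggestions: Dict[str, List[str]]    # per-client RAG-grounded actions
    errors: List[str]                               # non-critical errors accumulated along the way
    status: Optional[str]                           # Completed / Failed
```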
Node Descriptions:
- `entry_node`: Initializes state, assigns an event ID, parses or fetches the initial event data.
- `filter_event`: Checks event relevance using keywords/rules and identifies potentially affected clients. Edge: proceeds to `gather_context` if relevant AND clients are identified, else END.
- `gather_context`: Fetches client-specific details (routes, SLAs) for the relevant clients. Edge: proceeds to `assess_risk` if context was gathered for at least one client, else END.
- `assess_risk`: For each client, calls the fine-tuned LLM for a client-specific risk assessment. Edge: proceeds to `forecast_impact` (even with partial errors), END on critical failure.
- `forecast_impact`: For each client, calls the XGBoost model for delay probability and the LLM for qualitative impact/reasoning. Edge: proceeds to `suggest_mitigation` (even with partial errors), END on critical failure.
- `suggest_mitigation`: For each client, performs a RAG search over SOPs and calls the LLM to generate mitigation steps. Edge: proceeds to `dispatch_notification` (even with partial errors), END on critical failure.
- `dispatch_notification`: Formats the analysis and simulates sending alerts.
- `terminate_workflow` / `final_node`: Logs the final status (Completed/Failed) and performs cleanup, leading to LangGraph's `END`.
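A condensed, hypothetical sketch of how these nodes and conditional edges might be wired with LangGraph, reusing the `AgentState` sketch above. Node bodies are stubs and the routing is simplified; the real graph construction lives in `src/proactive_disruption_agent/agent/`.

```python
# Condensed wiring sketch; node bodies are stubs and routing logic is simplified.
from langgraph.graph import StateGraph, END


def filter_event(state: AgentState) -> AgentState:
    # ... keyword/rule checks populate is_relevant and affected_clients ...
    return state


def route_after_filter(state: AgentState) -> str:
    # Conditional edge: continue only when the event is relevant and clients were found.
    if state.get("is_relevant") and state.get("affected_clients"):
        return "gather_context"
    return END


workflow = StateGraph(AgentState)
workflow.add_node("entry_node", entry_node)          # assumed to be defined elsewhere
workflow.add_node("filter_event", filter_event)
workflow.add_node("gather_context", gather_context)  # assumed to be defined elsewhere
# ... assess_risk, forecast_impact, suggest_mitigation, dispatch_notification ...

workflow.set_entry_point("entry_node")
workflow.add_edge("entry_node", "filter_event")
workflow.add_conditional_edges("filter_event", route_after_filter,
                               {"gather_context": "gather_context", END: END})
# ... remaining edges follow the same pattern ...

graph = workflow.compile()
result = graph.invoke({"raw_event": {"headline": "Port closure at Rotterdam"}})
```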
Diagram
graph LR
A[entry_node] --> B(filter_event);
B -- Relevant --> C{gather_context};
B -- Irrelevant --> Z(END);
C -- Context Found --> D(assess_risk);
C -- Context Error / No Clients --> Z;
D -- Assessment Done --> E(forecast_impact);
D -- Critical Assessment Error --> Z;
E -- Forecast Done --> F(suggest_mitigation);
E -- Critical Forecast Error --> Z;
F -- Suggestions Done --> G(dispatch_notification);
F -- Critical Suggestion Error --> Z;
G --> H(terminate_workflow / final_node);
H --> Z;
style Z fill:#f9f,stroke:#333,stroke-width:2px;
proactive-supply-chain-disruption-agent/
├── .github/workflows/ # CI/CD workflows (e.g., lint, test, build)
├── .gitignore
├── LICENSE
├── README.md # This file
├── requirements.txt # Python dependencies
├── setup.py # Setup script
├── setup.cfg # Build configuration
├── pyproject.toml # Build system definition, tool configs
├── configs/ # YAML configuration files (agent, api, ml)
├── data/ # Sample/hypothetical data & schemas
│ ├── schemas/ # JSON schemas for data sources
│ ├── sample_*.csv/txt # Small sample data files
│ └── fine-tuning/ # Sample raw/processed data for LLM tuning
├── deployment/ # Deployment artifacts (Dockerfile, GKE/Cloud Run YAMLs)
├── mlops/ # MLOps pipelines and components (Vertex AI)
│ ├── vertex_pipelines/ # KFP pipeline definitions and components
│ └── monitoring/ # Monitoring config placeholders (e.g., dashboard JSON)
├── notebooks/ # Jupyter notebooks for exploration and prototyping
├── src/ # Source code
│ └── proactive_disruption_agent/ # Main Python package
│ ├── agent/ # LangGraph agent implementation (graph, nodes, state)
│ ├── api/ # FastAPI application (main, models, dependencies)
│ ├── data_ingestion/ # Scripts/modules for event/data ingestion
│ ├── fine_tuning/ # LLM fine-tuning scripts (preprocess, train)
│ ├── forecasting/ # XGBoost forecasting model (train, predict)
│ ├── llm_services/ # Clients for interacting with LLMs (Vertex AI, local)
│ ├── retrieval/ # RAG / Vector store interaction
│ └── utils/ # Utility functions (config, logging)
└── tests/ # Unit and integration tests (pytest)
├── agent/
├── api/
└── forecasting/
- Python 3.9+
- Google Cloud SDK initialized and authenticated (if using GCP resources).
- Access to a GCP project with Vertex AI, BigQuery, Pub/Sub, Cloud Storage APIs enabled (for full MLOps/deployment).
- Docker installed (for containerization).
- MLflow Tracking Server (local, hosted, or Vertex AI Experiment) configured (optional, but recommended).
- Access to a base LLM like Gemma-3-1B-it (e.g., via Hugging Face Hub).
- GPU (preferably NVIDIA with CUDA, >=16GB VRAM recommended) for local fine-tuning/inference experimentation.
- Clone the repository:

      git clone https://github.com/Smit-Parekh/proactive-supply-chain-disruption-agent.git
      cd proactive-supply-chain-disruption-agent

- Create and activate a virtual environment:

      python -m venv venv
      source venv/bin/activate  # On Windows use `venv\Scripts\activate`

- Install dependencies:

      pip install --upgrade pip
      pip install -r requirements.txt
      # Install the src package in editable mode
      pip install -e .
(Note: Some libraries like `torch`, `faiss`, and `bitsandbytes` might require specific installation commands depending on your OS and CUDA version. Refer to their official documentation if the standard pip install fails.)
- Copy & Customize: Examine the configuration files in
configs/. You might copyagent_config.yamltoagent_config_local.yamletc. for local overrides. - Fill Placeholders: Crucially, you MUST replace placeholders in
configs/*.yaml(e.g.,your-gcp-project-id,YOUR_*_ENDPOINT_ID,your-gcs-bucket-name,your_mlflow_or_vertex_experiment_uri) with your actual values. - Secrets Management: For sensitive values (API keys, service account keys), DO NOT hardcode them in YAML files. Use environment variables, Google Secret Manager, or other secure methods, especially for cloud deployments. The code should be adapted to read from these sources.
- Authentication:
- GCP: Use
gcloud auth application-default loginfor local development or configure service accounts for deployments (Workload Identity for GKE/Cloud Run recommended). SetGOOGLE_APPLICATION_CREDENTIALSenvironment variable if using service account keys directly (keep key files secure and out of git). - MLflow: Set
MLFLOW_TRACKING_URIenvironment variable if needed.
- GCP: Use
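A minimal sketch of one way to load a YAML config while sourcing sensitive values from environment variables rather than the file itself. The key names below are hypothetical; adapt them to the actual structure of `configs/*.yaml`.

```python
# Hypothetical config loader: secrets come from the environment, not from YAML.
import os
import yaml


def load_config(path: str = "configs/agent_config.yaml") -> dict:
    with open(path, "r") as f:
        config = yaml.safe_load(f)

    # Override/augment with environment variables (key names are illustrative).
    config["gcp_project_id"] = os.environ.get("GCP_PROJECT_ID", config.get("gcp_project_id"))
    config["mlflow_tracking_uri"] = os.environ.get("MLFLOW_TRACKING_URI",
                                                   config.get("mlflow_tracking_uri"))
    # API keys should only ever come from the environment or a secret manager.
    config["news_api_key"] = os.environ.get("NEWS_API_KEY")
    return config
```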
- The `data/` folder contains small, hypothetical sample files (`.csv`, `.txt`) and schemas (`data/schemas/`).
- The `data/fine-tuning/tuning_raw_data.csv` file provides a small sample of the format needed for fine-tuning.
- For local testing, the application reads from these files (configured via `configs/agent_config.yaml`).
- For cloud deployment, data would typically reside in GCS, BigQuery, etc., and MLOps pipelines would reference those sources.
The easiest way to test the agent logic locally, especially with a locally fine-tuned LLM adapter:
- Ensure you have run necessary preprocessing/training steps (e.g., Notebook 03 for forecast model, Notebook 02 for LLM adapter).
- Modify and run `notebooks/04_langgraph_agent_prototype.ipynb`. This notebook shows how to (see the sketch after this list):
  - Load the local forecasting model/vector store.
  - Load the locally fine-tuned LLM adapter.
  - Patch the agent's LLM calls to use the local model.
  - Execute the graph with sample input and inspect the state.
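A minimal sketch of the adapter-loading step (the second and third bullets above). The adapter path and prompt are assumptions; in the notebook, the resulting `generate` call is what gets patched into the agent's LLM client.

```python
# Sketch: load the base model plus a locally fine-tuned LoRA adapter for local inference.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL = "google/gemma-3-1b-it"          # base model from the Hugging Face Hub
ADAPTER_DIR = "outputs/gemma_lora_adapter"   # assumed local adapter path

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
base = AutoModelForCausalLM.from_pretrained(BASE_MODEL, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(base, ADAPTER_DIR)
model.eval()

prompt = ("Event: typhoon near Shanghai port.\n"
          "Client: pharma, 48h temperature window.\n"
          "Assess the risk:")
inputs = tokenizer(prompt, return_tensors="pt").to(base.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```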
- Ensure API configuration is appropriate (port usually set via command line).
- Ensure the agent dependencies (models, vector store) can be loaded based on `configs/*.yaml`. You might need to mock cloud LLM calls if not testing with a local adapter via patching.
- Run from the project root:

      uvicorn src.proactive_disruption_agent.api.main:app --reload --port 8000

- Access the API docs via Swagger UI at http://localhost:8000/docs.
- Send POST requests to `/analyze-event` and check status using GET `/analysis/{analysis_id}` (see the example after this list).
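An illustrative client interaction using `httpx` (already a test dependency). The event payload fields here are hypothetical and should match the Pydantic models in `api/models.py`.

```python
# Illustrative API usage; payload fields are hypothetical.
import time
import httpx

BASE_URL = "http://localhost:8000"

event = {
    "source": "news",
    "headline": "Major port closure announced in Rotterdam",
    "severity": "high",
}

with httpx.Client(base_url=BASE_URL, timeout=30.0) as client:
    resp = client.post("/analyze-event", json=event)
    resp.raise_for_status()
    analysis_id = resp.json()["analysis_id"]

    # Poll until the background agent run finishes.
    while True:
        status = client.get(f"/analysis/{analysis_id}").json()
        if status["status"] in ("COMPLETED", "FAILED"):
            break
        time.sleep(2)

print(status)
```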
- Build Container Images: Build Docker images for training (`xgboost_training_container`, `llm_tuning_container`) containing the `src` code and dependencies. Push them to Google Container Registry (GCR) or Artifact Registry. Also build the image for the API service specified in `deployment/`.
- Configure Pipeline Parameters: Update `mlops/vertex_pipelines/pipeline.py` with your GCP project details, GCS paths, container image URIs, desired machine types, model names, etc.
- Compile the Pipeline:

      python -m mlops.vertex_pipelines.compile_pipeline --output mlops/vertex_pipelines/compiled_ml_pipeline.yaml

- Submit the Pipeline Job via the Google Cloud Console (Vertex AI > Pipelines) or using `gcloud` (see also the Python sketch after this list):

      gcloud ai pipelines jobs submit \
        --region=<your-region> \
        --pipeline-spec=mlops/vertex_pipelines/compiled_ml_pipeline.yaml \
        --parameter-values=project_id="<your-project-id>",pipeline_root="gs://<your-bucket>/pipeline-root",incident_data_gcs_uri="gs://...", ...  # Pass all required pipeline parameters
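Alternatively, the compiled pipeline can be submitted with the Vertex AI Python SDK. A minimal sketch follows; the project, region, bucket, and parameter names are placeholders mirroring those above.

```python
# Sketch: submit the compiled pipeline with the Vertex AI SDK instead of gcloud.
from google.cloud import aiplatform

aiplatform.init(project="<your-project-id>", location="<your-region>")

job = aiplatform.PipelineJob(
    display_name="proactive-disruption-agent-pipeline",
    template_path="mlops/vertex_pipelines/compiled_ml_pipeline.yaml",
    pipeline_root="gs://<your-bucket>/pipeline-root",
    parameter_values={
        "project_id": "<your-project-id>",
        "incident_data_gcs_uri": "gs://...",
        # ... all other required pipeline parameters ...
    },
)
job.submit()  # use job.run() to block until completion
```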
- Objective: Adapt the base LLM (Google/Gemma-3-1B-it) to better understand client-specific context and generate relevant risk assessments or mitigation steps.
- Data: Requires curated, high-quality instruction-following data. See `data/fine-tuning/tuning_raw_data.csv` for the format (input = event + client context, output = desired assessment/mitigation).
- Process (see the sketch after this list):
  - Preprocessing: Use `src/fine_tuning/data_preprocessor.py` to convert raw data into JSONL format suitable for training.
  - Training: Use `src/fine_tuning/train.py` (or Notebook 02 for experiments). Leverages Hugging Face `transformers`, `peft` (LoRA), and `Trainer`. Can be run locally (GPU required) or orchestrated via the Vertex AI MLOps pipeline (`launch_llm_fine_tuning_job` component) for scalable cloud training.
  - Output: PEFT adapter weights saved to GCS (or locally).
- Configuration: See `configs/ml_config.yaml` for fine-tuning parameters and `configs/agent_config.yaml` for referencing the resulting models/endpoints. See the Notebook 02 discussion in the project history/documentation for recommended hyperparameter starting points.
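A condensed sketch of the LoRA setup used in training scripts like `train.py`. The hyperparameters, target modules, and file paths here are illustrative assumptions, not the project's tuned values.

```python
# Condensed LoRA fine-tuning sketch; hyperparameters and paths are illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments
from peft import LoraConfig, get_peft_model
from datasets import load_dataset

BASE_MODEL = "google/gemma-3-1b-it"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

lora_config = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections; adjust per architecture
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# JSONL produced by data_preprocessor.py (output path assumed); tokenization/collation omitted.
dataset = load_dataset("json", data_files="data/fine-tuning/processed_train.jsonl")["train"]

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="outputs/gemma_lora_adapter",
                           per_device_train_batch_size=1,
                           num_train_epochs=1,
                           learning_rate=2e-4),
    train_dataset=dataset,
)
# trainer.train()  # requires a GPU and a tokenized dataset with labels
model.save_pretrained("outputs/gemma_lora_adapter")
```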
- Vertex AI Pipelines: Orchestrate the ML workflow (data validation, training, evaluation, deployment). See `mlops/vertex_pipelines/`.
- Vertex AI Custom Training: Execute containerized training scripts (`model.py`, `train.py`) scalably.
- Vertex AI Prediction Endpoints: Host the trained XGBoost model and the fine-tuned LLM (base + adapter, using suitable serving containers like TGI/vLLM) for low-latency inference.
- Vertex AI Model Registry: Store and version trained models (XGBoost and potentially LLM adapters/pointers).
- Vertex AI Experiments / MLflow: Track hyperparameters, metrics, and artifacts for both XGBoost and LLM fine-tuning runs using MLflow (configure `MLFLOW_TRACKING_URI`; see the sketch after this list).
- Vertex AI Model Monitoring: Configure monitoring jobs to detect drift in predictions or skew in input features for the deployed XGBoost model (and potentially for LLM inputs/outputs if feasible).
- Automation: Pipelines can be triggered manually, on a schedule (e.g., for retraining), or via Pub/Sub events (e.g., on new data arrival).
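A minimal sketch of the MLflow tracking pattern referenced above; the experiment name, parameters, metric value, and artifact path are illustrative.

```python
# Minimal MLflow tracking sketch; names, metrics, and paths are illustrative.
import os
import mlflow

mlflow.set_tracking_uri(os.environ.get("MLFLOW_TRACKING_URI", "file:./mlruns"))
mlflow.set_experiment("disruption-agent-forecasting")

with mlflow.start_run(run_name="xgboost-delay-model"):
    mlflow.log_params({"max_depth": 6, "n_estimators": 300})
    mlflow.log_metric("val_auc", 0.87)                        # placeholder value
    mlflow.log_artifact("outputs/xgb_delay_model.joblib")      # path assumed
```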
- Built with FastAPI (`src/proactive_disruption_agent/api/`).
- Uses Pydantic models (`api/models.py`) for request/response validation.
- Provides endpoints:
  - `POST /analyze-event`: Asynchronously triggers the agent workflow for a given event. Returns an `analysis_id`.
  - `GET /analysis/{analysis_id}`: Retrieves the status (PENDING, RUNNING, COMPLETED, FAILED) and results of a specific analysis.
- Uses a background task runner for the potentially long-running agent execution (see the sketch after this list).
- Leverages FastAPI's dependency injection (`api/dependencies.py`) to manage the agent graph instance.
- Uses a simple in-memory dictionary (`analysis_store`) for tracking job status (replace with Redis/a database for production).
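A stripped-down sketch of the asynchronous endpoint pattern described above; the actual implementation in `api/main.py` differs in detail (Pydantic models, dependency injection, error handling).

```python
# Stripped-down sketch of the async analysis endpoints; details differ from api/main.py.
import uuid
from fastapi import BackgroundTasks, FastAPI, HTTPException

app = FastAPI()
analysis_store: dict[str, dict] = {}   # in-memory only; use Redis/a database in production


def run_agent_analysis(analysis_id: str, event: dict) -> None:
    analysis_store[analysis_id]["status"] = "RUNNING"
    # result = agent_graph.invoke({"raw_event": event})   # real agent execution would go here
    analysis_store[analysis_id].update(status="COMPLETED", result={"summary": "..."})


@app.post("/analyze-event")
def analyze_event(event: dict, background_tasks: BackgroundTasks) -> dict:
    analysis_id = str(uuid.uuid4())
    analysis_store[analysis_id] = {"status": "PENDING", "result": None}
    background_tasks.add_task(run_agent_analysis, analysis_id, event)
    return {"analysis_id": analysis_id}


@app.get("/analysis/{analysis_id}")
def get_analysis(analysis_id: str) -> dict:
    if analysis_id not in analysis_store:
        raise HTTPException(status_code=404, detail="Unknown analysis_id")
    return analysis_store[analysis_id]
```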
- Unit and integration tests are located in the `tests/` directory and use `pytest`.
  - `tests/agent/`: Tests the LangGraph nodes and graph logic, heavily utilizing mocking (`unittest.mock`) for external dependencies (LLM, forecasting, RAG).
  - `tests/forecasting/`: Tests data preprocessing logic and the prediction function, potentially loading locally saved model artifacts.
  - `tests/api/`: Tests FastAPI endpoints using `TestClient`, mocking the agent execution or dependencies (see the example after this list).
- Run tests from the project root using:

      pytest tests/
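An illustrative test in the style of `tests/api/`; the request payload and response fields mirror the endpoint sketch above and may not match the real Pydantic models exactly.

```python
# Illustrative API test; payload and response fields are assumptions based on the endpoints above.
from fastapi.testclient import TestClient

# Adjust the import path to your installed package layout (see the uvicorn command above).
from src.proactive_disruption_agent.api.main import app

client = TestClient(app)


def test_analyze_event_returns_analysis_id():
    response = client.post("/analyze-event", json={"headline": "Port closure in Rotterdam"})
    assert response.status_code == 200
    assert "analysis_id" in response.json()
```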
- Reduction in critical shipment delays: 18% (on pilot clients)
- Response time improvement (proactive insight): Hours -> Minutes
- Accuracy: LLM risk explanations verified by SMEs.
- Qualitative feedback: Positive from pilot stakeholders (Ops VPs, Client Directors).
- More sophisticated forecasting models (Deep Learning Time Series, ensemble methods).
- Multi-agent systems (e.g., specialized agents for different disruption types or clients).
- Reinforcement learning to optimize mitigation suggestions based on feedback/outcomes.
- Integration with real-time shipment tracking data.
- Full UI/Dashboard for visualizing risks, impacts, and suggested actions.
- More robust error handling and state management (persistent job store).
- Implementing LLM evaluation within the MLOps pipeline.
This project is licensed under the MIT License - see the LICENSE file for details.
- Langchain/LangGraph Team & Community
- Hugging Face Community & Model Contributors (Google DeepMind)
- MLflow Team
- Google Cloud Vertex AI Team
- GDELT Project and OpenWeatherMap