Anomaly Agent

🤖 A powerful Python library for detecting anomalies in time series data using Large Language Models (LLMs). Built with modern LangGraph architecture for robust, scalable anomaly detection across multiple variables and domains.

✨ Key Features

  • 🧠 LLM-Powered Detection: Leverages advanced language models for intelligent anomaly identification
  • 🔄 Two-Stage Pipeline: Detection and optional verification phases to reduce false positives
  • 📊 Multi-Variable Support: Analyze multiple time series variables simultaneously
  • 🎯 Domain Awareness: Contextual understanding of different data types and domains
  • ⚡ Modern Architecture: Built on LangGraph with Pydantic validation and robust error handling
  • 🛠️ Customizable: Custom prompts, configurable verification, and flexible model selection
  • 📈 Rich Output: Structured anomaly descriptions with timestamps and confidence indicators

🚀 Installation

pip install anomaly-agent

🏗️ How It Works

The Anomaly Agent uses a sophisticated two-stage pipeline powered by LangGraph state machines:

graph TD
    A[📊 Input Time Series Data] --> B[🔍 Detection Stage]
    B --> C{📋 Verification Enabled?}
    C -->|Yes| D[✅ Verification Stage]
    C -->|No| E[📤 Output Anomalies]
    D --> F{🎯 Anomalies Confirmed?}
    F -->|Yes| E
    F -->|No| G[❌ Filtered Out]
    G --> E
    
    style A fill:#e1f5fe
    style B fill:#f3e5f5
    style D fill:#fff3e0
    style E fill:#e8f5e8

🔧 Architecture Components

  1. 🔍 Detection Node: Uses LLM to identify potential anomalies with statistical and contextual analysis
  2. ✅ Verification Node (Optional): Secondary LLM review to reduce false positives with stricter criteria
  3. 🎯 State Management: Pydantic-based validation and error handling throughout the pipeline
  4. 📊 Multi-Variable Processing: Parallel analysis of multiple time series columns
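The detect-then-verify flow described above can be sketched in plain Python. This is only an illustration of the control flow, not the library's implementation: `Anomaly`, `run_pipeline`, `toy_detect`, and `toy_verify` are hypothetical names, the two toy functions stand in for LLM calls, and a dataclass with a `__post_init__` check stands in for Pydantic validation.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Anomaly:
    """Hypothetical record type; a stand-in for a Pydantic model."""
    timestamp: str
    variable_name: str
    value: float
    description: str

    def __post_init__(self) -> None:
        # Minimal validation standing in for Pydantic field checks.
        if not self.timestamp or not self.variable_name:
            raise ValueError("timestamp and variable_name must be non-empty")


Row = Tuple[str, float]


def run_pipeline(
    rows: List[Row],
    detect: Callable[[List[Row]], List[Anomaly]],
    verify: Optional[Callable[[Anomaly], bool]] = None,
) -> List[Anomaly]:
    """Stage 1 proposes candidates; optional stage 2 filters them."""
    candidates = detect(rows)
    if verify is None:  # verification disabled: output candidates directly
        return candidates
    return [a for a in candidates if verify(a)]


def toy_detect(rows: List[Row]) -> List[Anomaly]:
    # Stand-in for the detection LLM call: flag anything above 30.
    return [Anomaly(ts, "temperature", v, "value above 30") for ts, v in rows if v > 30]


def toy_verify(anomaly: Anomaly) -> bool:
    # Stand-in for the verification LLM call: a stricter criterion.
    return anomaly.value > 40


rows = [("2024-01-01", 20.0), ("2024-01-02", 35.0), ("2024-01-03", 45.0)]
unverified = run_pipeline(rows, toy_detect)
verified = run_pipeline(rows, toy_detect, toy_verify)
print(len(unverified), len(verified))
```

With verification enabled, the candidate at 35.0 is filtered out and only the 45.0 reading survives, which mirrors the "Filtered Out" branch of the diagram.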

⚡ Quick Start

Basic Usage

import pandas as pd
from anomaly_agent import AnomalyAgent

# Your time series data
df = pd.DataFrame({
    'timestamp': pd.date_range('2024-01-01', periods=100, freq='D'),
    'temperature': [20 + i*0.1 + (10 if i==50 else 0) for i in range(100)],
    'pressure': [1013 + i*0.2 + (50 if i==75 else 0) for i in range(100)]
})

# Create agent and detect anomalies
agent = AnomalyAgent()
anomalies = agent.detect_anomalies(df)

# Convert to DataFrame for analysis
df_anomalies = agent.get_anomalies_df(anomalies)
print(df_anomalies)
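To know what to expect from the example data, it can help to confirm where the spikes were injected. The sketch below rebuilds the two series with plain lists (no pandas) and lists the indices and dates that deviate from the linear trend; `injected_indices` is a hypothetical helper, and what the LLM actually reports may differ from this ground truth.

```python
from datetime import date, timedelta

# Same formulas as the Quick Start DataFrame, built as plain lists.
temperature = [20 + i * 0.1 + (10 if i == 50 else 0) for i in range(100)]
pressure = [1013 + i * 0.2 + (50 if i == 75 else 0) for i in range(100)]


def injected_indices(series, start, slope):
    """Return indices whose value departs from the line start + slope * i."""
    return [i for i, v in enumerate(series) if abs(v - (start + slope * i)) > 1e-6]


temperature_spikes = injected_indices(temperature, 20, 0.1)
pressure_spikes = injected_indices(pressure, 1013, 0.2)

# Map indices to the daily timestamps that start on 2024-01-01.
first_day = date(2024, 1, 1)
temperature_dates = [(first_day + timedelta(days=i)).isoformat() for i in temperature_spikes]
pressure_dates = [(first_day + timedelta(days=i)).isoformat() for i in pressure_spikes]

print(temperature_spikes, temperature_dates)
print(pressure_spikes, pressure_dates)
```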

Advanced Configuration

from anomaly_agent import AnomalyAgent

# Customize model and verification behavior
agent = AnomalyAgent(
    model_name="gpt-4o-mini",           # Choose your preferred model
    verify_anomalies=True,              # Enable verification stage
    timestamp_col="date"                # Custom timestamp column name
)

# Custom prompts for domain-specific detection
financial_detection_prompt = """
You are a financial analyst detecting market anomalies.
Focus on: unusual price movements, volume spikes, trend reversals.
Consider market hours and economic events in your analysis.
"""

# This creates a new agent; repeat model_name, verify_anomalies, etc. here if you need them
agent = AnomalyAgent(detection_prompt=financial_detection_prompt)
# financial_data is a placeholder for your own DataFrame of financial time series
anomalies = agent.detect_anomalies(financial_data)

📚 Examples and Notebooks

📁 Examples Directory

Explore comprehensive examples in the examples/ folder:

🎮 Interactive Examples

# Run basic example
python examples/examples.py --example basic --plot

# Try real-world sensor data scenario  
python examples/examples.py --example real-world --plot

# Custom model and plotting
python examples/examples.py --model gpt-4o-mini --example multiple --plot

📓 Jupyter Notebooks

Launch the interactive notebook:

  • Local: Open examples/examples.ipynb
  • Colab: Open In Colab

📊 Output Formats

Long Format (Default)

df_anomalies = agent.get_anomalies_df(anomalies)

| timestamp | variable_name | value | anomaly_description |
|---|---|---|---|
| 2024-01-15 | temperature | 35.2 | Significant temperature spike... |
| 2024-01-20 | pressure | 1089.3 | Unusual pressure reading... |

Wide Format

df_anomalies = agent.get_anomalies_df(anomalies, format="wide")

| timestamp | temperature | temperature_description | pressure | pressure_description |
|---|---|---|---|---|
| 2024-01-15 | 35.2 | Significant spike... | NaN | NaN |
| 2024-01-20 | NaN | NaN | 1089.3 | Unusual reading... |
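The relationship between the two layouts can be illustrated with a small pure-Python sketch. `long_to_wide` is a hypothetical helper written for this illustration, not the library's code; it uses `None` where the DataFrame output shows NaN.

```python
# Long-format rows, one per (timestamp, variable) anomaly.
long_rows = [
    {"timestamp": "2024-01-15", "variable_name": "temperature", "value": 35.2,
     "anomaly_description": "Significant temperature spike..."},
    {"timestamp": "2024-01-20", "variable_name": "pressure", "value": 1089.3,
     "anomaly_description": "Unusual pressure reading..."},
]


def long_to_wide(rows):
    """Pivot to one row per timestamp with <var> and <var>_description columns."""
    variables = sorted({r["variable_name"] for r in rows})
    by_timestamp = {}
    for r in rows:
        row = by_timestamp.setdefault(r["timestamp"], {"timestamp": r["timestamp"]})
        row[r["variable_name"]] = r["value"]
        row[r["variable_name"] + "_description"] = r["anomaly_description"]
    wide = []
    for ts in sorted(by_timestamp):
        row = by_timestamp[ts]
        for name in variables:
            # Variables with no anomaly at this timestamp stay empty.
            row.setdefault(name, None)
            row.setdefault(name + "_description", None)
        wide.append(row)
    return wide


wide_rows = long_to_wide(long_rows)
for row in wide_rows:
    print(row)
```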

🎛️ Model Configuration

Choose the right model for your needs and budget:

| Model | Cost (Input/Output per 1M tokens) | Best For | Performance |
|---|---|---|---|
| gpt-5-nano | $0.05 / $0.40 | Cost-effective anomaly detection | ⭐⭐⭐ |
| gpt-5-mini | $0.25 / $2.00 | Enhanced reasoning for complex patterns | ⭐⭐⭐⭐ |
| gpt-5 | $1.25 / $10.00 | Sophisticated domain-specific analysis | ⭐⭐⭐⭐⭐ |
| gpt-4o-mini | $0.60 / $2.40 | Legacy support with good performance | ⭐⭐⭐⭐ |

# Cost-optimized (default)
agent = AnomalyAgent(model_name="gpt-5-nano")

# Enhanced reasoning
agent = AnomalyAgent(model_name="gpt-5-mini") 

# Premium analysis
agent = AnomalyAgent(model_name="gpt-5")
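As a rough worked example of what the per-token prices mean in practice, the sketch below estimates the cost of a run from token counts. The prices are copied from the table above; the token counts are made-up inputs, and `estimate_cost` is a hypothetical helper rather than part of the library.

```python
# (input, output) price in USD per 1M tokens, taken from the table above.
PRICES_PER_1M = {
    "gpt-5-nano": (0.05, 0.40),
    "gpt-5-mini": (0.25, 2.00),
    "gpt-5": (1.25, 10.00),
    "gpt-4o-mini": (0.60, 2.40),
}


def estimate_cost(model_name, input_tokens, output_tokens):
    """Return the estimated USD cost for the given token counts."""
    input_price, output_price = PRICES_PER_1M[model_name]
    return (input_tokens / 1_000_000) * input_price + (output_tokens / 1_000_000) * output_price


# Illustrative workload: 200k input tokens and 20k output tokens.
nano_cost = estimate_cost("gpt-5-nano", 200_000, 20_000)
full_cost = estimate_cost("gpt-5", 200_000, 20_000)
print(f"gpt-5-nano: ${nano_cost:.3f}, gpt-5: ${full_cost:.3f}")
```

For this workload the estimate is about $0.018 on gpt-5-nano and $0.45 on gpt-5. Enabling the verification stage adds further LLM calls, so actual usage would likely be higher than a detection-only estimate.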

🎯 Use Cases

🏢 Business & Operations

  • 📈 Sales Analytics: Detect unusual sales patterns, seasonal anomalies
  • 🏭 Manufacturing: Monitor equipment performance, quality metrics
  • 💰 Financial Services: Fraud detection, market anomaly identification
  • 🌐 Web Analytics: Traffic spikes, user behavior anomalies

🔬 Science & Engineering

  • 🌡️ IoT Sensors: Temperature, humidity, pressure monitoring
  • ⚡ Energy Systems: Power consumption, grid stability analysis
  • 🩺 Healthcare: Patient monitoring, medical device readings
  • 🌍 Environmental: Weather patterns, pollution levels

📊 Data Quality

  • 🔍 Data Validation: Identify measurement errors, sensor failures
  • 📋 ETL Monitoring: Pipeline anomalies, data drift detection
  • 🎯 Quality Assurance: Automated anomaly flagging in data workflows

🛠️ Development

This project uses uv for fast, reliable dependency management. All commands automatically handle virtual environment management.

🏗️ Setup

# Clone the repository
git clone https://github.com/andrewm4894/anomaly-agent.git
cd anomaly-agent

# Install dependencies (creates .venv automatically)
make sync-dev

🧪 Testing

# Run all tests with coverage
make test

# Run specific test categories
uv run pytest tests/test_agent.py -v                    # Core functionality
uv run pytest tests/test_prompts.py -v                  # Prompt system
uv run pytest tests/test_graph_architecture.py -v       # Advanced architecture

# Integration tests (requires OPENAI_API_KEY in .env)
uv run pytest tests/ -m integration -v

📋 Code Quality

# Install pre-commit hooks
make pre-commit-install

# Run all quality checks
make pre-commit

# Individual tools
uv run black anomaly_agent/    # Formatting
uv run isort anomaly_agent/    # Import sorting  
uv run flake8 anomaly_agent/   # Linting
uv run mypy anomaly_agent/     # Type checking

📦 Dependencies

# Add new dependencies
make add PACKAGE=pandas              # Runtime dependency
make add-dev PACKAGE=pytest          # Development dependency

# Update all dependencies
make update

# Remove dependencies
make remove PACKAGE=old-package

⚙️ Environment Setup

Create a .env file in your project root:

# Required for anomaly detection
OPENAI_API_KEY=your-openai-api-key-here

# Optional: Custom model defaults
DEFAULT_MODEL_NAME=gpt-5-nano

The agent automatically loads environment variables via python-dotenv.
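A minimal sketch of how these two variables might be resolved once they are present in the environment. It uses only `os.environ` and does not read the `.env` file itself; `resolve_settings` is a hypothetical helper, and the fallback to `gpt-5-nano` is an assumption based on the default named in the Model Configuration section, not a description of the library's actual lookup logic.

```python
import os


def resolve_settings(env=None):
    """Read the API key and model name from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        # Fail early with a clear message instead of at the first LLM call.
        raise RuntimeError("OPENAI_API_KEY is not set")
    # Assumed fallback: the documented default model.
    model_name = env.get("DEFAULT_MODEL_NAME", "gpt-5-nano")
    return {"api_key": api_key, "model_name": model_name}


# Passing a dict keeps the example independent of the real environment.
settings = resolve_settings({"OPENAI_API_KEY": "your-openai-api-key-here"})
print(settings["model_name"])
```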

🏗️ Architecture Deep Dive

For detailed technical information about the internal architecture, see ARCHITECTURE.md.

Key architectural features:

  • 🔧 LangGraph State Machines: Robust workflow management with proper error handling
  • ✅ Pydantic Validation: Type-safe data models throughout the pipeline
  • 🎯 GraphManager Caching: Optimized performance with reusable compiled graphs
  • 📊 Class-based Nodes: Modular, maintainable node architecture
  • 🔄 Async Support: Streaming and parallel processing capabilities
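The idea behind reusing compiled graphs can be sketched with `functools.lru_cache`: build once per configuration, then return the same object on later requests. This is an assumption about the general technique, not the actual GraphManager code; `get_compiled_graph` and the tuple it returns are hypothetical stand-ins for a compiled LangGraph graph.

```python
from functools import lru_cache

build_log = []  # records each real build so cache hits are observable


@lru_cache(maxsize=None)
def get_compiled_graph(model_name, verify_anomalies):
    """Build (or fetch from cache) a graph for this configuration."""
    build_log.append((model_name, verify_anomalies))
    # Stand-in for an expensive graph compilation step.
    stages = ("detect", "verify") if verify_anomalies else ("detect",)
    return (model_name, stages)


first = get_compiled_graph("gpt-5-nano", True)
second = get_compiled_graph("gpt-5-nano", True)   # cache hit, no rebuild
third = get_compiled_graph("gpt-5-nano", False)   # new configuration, rebuilt

print(first is second, len(build_log))
```

Keying the cache on the configuration means agents that share a model and verification setting share one compiled graph, while a different setting triggers a separate build.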

🤝 Contributing

We welcome contributions! Please see our contributing guidelines for details.

  1. 🍴 Fork the repository
  2. 🌿 Create a feature branch (git checkout -b feature/amazing-feature)
  3. ✅ Test your changes (make test)
  4. 📝 Commit your changes (git commit -m 'Add amazing feature')
  5. 🚀 Push to the branch (git push origin feature/amazing-feature)
  6. 🎯 Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Built with LangChain and LangGraph
  • Powered by OpenAI's language models
  • Inspired by the need for intelligent, contextual anomaly detection
