A comprehensive command-line tool for benchmarking API providers with advanced monitoring capabilities. Built with Go and Vegeta, it tests multiple providers simultaneously, tracks server-side memory usage, and provides detailed performance analytics.
- Multi-Provider Support: Benchmark Bifrost, LiteLLM, Portkey, and Helicone simultaneously or individually
- Advanced Metrics: Latency percentiles, throughput, success rates, and server memory monitoring
- Real-time Memory Tracking: Monitor server-side RAM usage during benchmarks
- Dynamic Payload Generation: Support for small and large payloads with dynamic content injection
- Environment-based Configuration: Automatic port detection via `.env` file
- Comprehensive Results: JSON output with aggregated metrics and historical data
- Error Analysis: Track and categorize failed requests and drop reasons
- Cooldown Management: Configurable rest periods between provider tests
- Go (version 1.19 or higher)
- `.env` file with provider port configurations (see Environment Setup)
- Target API providers must be running and accessible
Bifrost serves as a high-performance HTTP API gateway that provides seamless connections to various AI providers like OpenAI, Anthropic, and AWS Bedrock. This section covers the complete setup process for optimal benchmarking performance.
npx -y @maximhq/bifrost
This command downloads and runs the latest version of Bifrost instantly without requiring a separate installation.
# Pull the Bifrost image
docker pull maximhq/bifrost
# Run Bifrost with port mapping
docker run -p 8080:8080 maximhq/bifrost
# For persistent configuration across restarts
docker run -p 8080:8080 -v $(pwd)/data:/app/data maximhq/bifrost
After installation, Bifrost will be accessible at `http://localhost:8080` with a web-based management interface.
📖 For detailed installation instructions: Bifrost Setup Guide
If you want to test without making actual API calls to external providers:
# Navigate to the mocker directory
cd mocker
# Run the mock OpenAI server
go run main.go -port 8000
# Or with custom latency and jitter for realistic testing
go run main.go -port 8000 -latency 50 -jitter 20 -big-payload
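For reference, a condensed sketch of what such a mock endpoint can look like is shown below. This is an illustrative stand-in, not the actual `mocker/main.go`; the response is a minimal OpenAI-style completion, and the `-big-payload` flag is omitted for brevity.

```go
package main

import (
	"encoding/json"
	"flag"
	"fmt"
	"log"
	"math/rand"
	"net/http"
	"time"
)

func main() {
	port := flag.Int("port", 8000, "port to listen on")
	latency := flag.Int("latency", 0, "base response latency in ms")
	jitter := flag.Int("jitter", 0, "max random extra latency in ms")
	flag.Parse()

	http.HandleFunc("/v1/chat/completions", func(w http.ResponseWriter, r *http.Request) {
		// Simulate upstream model latency: fixed base delay plus random jitter.
		delay := time.Duration(*latency) * time.Millisecond
		if *jitter > 0 {
			delay += time.Duration(rand.Intn(*jitter)) * time.Millisecond
		}
		time.Sleep(delay)

		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(map[string]any{
			"id":     "chatcmpl-mock",
			"object": "chat.completion",
			"model":  "gpt-4o-mini",
			"choices": []map[string]any{{
				"index":   0,
				"message": map[string]string{"role": "assistant", "content": "Hello from the mock server!"},
			}},
		})
	})
	log.Fatal(http.ListenAndServe(fmt.Sprintf(":%d", *port), nil))
}
```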
For Real API Testing:
- Access the Bifrost Web UI at `http://localhost:8080`
- Navigate to the Providers section
- Add your desired provider (OpenAI, Anthropic, etc.) with API keys
- Configure routing rules as needed
For Mock Testing (Optional):
- In the Bifrost Web UI, go to Providers.
- Add an OpenAI provider with:
  - Base URL: Use the host URL where your mock server is running:
    - For Docker installations:
      - macOS / Windows (Docker Desktop): `http://host.docker.internal:<mock server port>`
      - Linux: Docker doesn't provide `host.docker.internal` by default. Find your host's Docker bridge IP with `ip addr show docker0`; it is usually `172.17.0.1`, so configure the provider URL as `http://172.17.0.1:<mock server port>`
  - API Key: Any dummy key (e.g., `sk-mock-key`)
- Note: Alternatively, on Linux, you can run the container with host networking (`--network host`) and use `http://127.0.0.1:<mock server port>` inside the container.
For accurate benchmarking results, configure these performance settings in the Bifrost Web UI:
Buffer and Pool Configuration:
- Navigate to Providers → OpenAI → Performance Settings
- Set Concurrency: `>7500` (recommended for high-load testing with many concurrent connections)
- Set Buffer Size: `>10000` (recommended for high-load testing)
- Navigate to Settings
- Set Initial Pool Size: `>10000` (recommended for high-load testing)
To minimize latency overhead and memory usage during benchmarking:
- Access Settings in the Bifrost Web UI
- Disable any non-essential plugins such as:
- Logging plugins (if detailed logs aren't required)
- Governance plugins (if not needed for testing)
- Semantic caching (if not needed for testing)
Note: Plugin overhead is typically minimal as nothing is computed in the hot path, but disabling them ensures the purest performance measurements.
Before running benchmarks, verify your setup:
# Test Bifrost connectivity
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"messages": [{"role": "user", "content": "Hello"}],
"model": "gpt-4o-mini"
}'
Expected response: JSON with completion data or mock response (depending on your provider configuration).
Update your `.env` file to match your Bifrost setup:
BIFROST_PORT=8080 # Default Bifrost port
# ... other provider ports
📖 For advanced configuration options: Bifrost Documentation
Create a `.env` file in the project root with provider ports:
BIFROST_PORT=3001
LITELLM_PORT=4000
PORTKEY_PORT=8787
HELICONE_PORT=3002
OPENAI_API_KEY=sk-your-openai-api-key
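For illustration, the automatic port detection could look like the sketch below; it assumes a loader such as `github.com/joho/godotenv`, which may differ from what the tool actually uses.

```go
package main

import (
	"fmt"
	"log"
	"os"
	"strings"

	"github.com/joho/godotenv" // assumption: any .env loader works similarly
)

func main() {
	// Load .env from the working directory so <PROVIDER>_PORT vars are visible.
	if err := godotenv.Load(); err != nil {
		log.Fatal("no .env file found; see Environment Setup")
	}
	for _, provider := range []string{"bifrost", "litellm", "portkey", "helicone"} {
		port := os.Getenv(strings.ToUpper(provider) + "_PORT")
		if port == "" {
			log.Printf("warning: no port configured for %s, skipping", provider)
			continue
		}
		fmt.Printf("%s -> http://localhost:%s\n", provider, port)
	}
}
```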
# Clone or download the project
git clone <repository-url>
cd maxim-bifrost-benchmarks
# Install dependencies
go mod tidy
# Build the executable (optional)
go build benchmark.go
Run all providers (default configuration):
go run benchmark.go
# Tests all configured providers with 500 RPS for 10 seconds each
Benchmark a specific provider:
go run benchmark.go -provider bifrost
High-intensity testing:
go run benchmark.go -rate 1000 -duration 30 -provider litellm
Large payload testing:
go run benchmark.go -big-payload -provider portkey -duration 60
| Flag | Type | Default | Description |
|---|---|---|---|
| `-rate` | int | 500 | Requests per second to send |
| `-duration` | int | 10 | Test duration in seconds |
| `-output` | string | `results.json` | Output file for benchmark results |
| `-cooldown` | int | 60 | Cooldown period between tests (seconds) |
| `-provider` | string | `""` | Specific provider to test (bifrost, litellm, portkey, helicone) |
| `-big-payload` | bool | false | Use large ~10KB payloads instead of small ones |
| `-model` | string | `gpt-4o-mini` | Model identifier to use in requests |
| `-suffix` | string | `v1` | URL route suffix (e.g., v1, v2) |
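As a sketch, these flags map naturally onto Go's standard `flag` package; the variable names below are illustrative, not necessarily those used in `benchmark.go`.

```go
package main

import (
	"flag"
	"fmt"
)

// Flag definitions mirroring the table above.
var (
	rate       = flag.Int("rate", 500, "requests per second to send")
	duration   = flag.Int("duration", 10, "test duration in seconds")
	output     = flag.String("output", "results.json", "output file for benchmark results")
	cooldown   = flag.Int("cooldown", 60, "cooldown period between tests (seconds)")
	provider   = flag.String("provider", "", "specific provider to test")
	bigPayload = flag.Bool("big-payload", false, "use large ~10KB payloads instead of small ones")
	model      = flag.String("model", "gpt-4o-mini", "model identifier to use in requests")
	suffix     = flag.String("suffix", "v1", "URL route suffix (e.g., v1, v2)")
)

func main() {
	flag.Parse()
	fmt.Printf("benchmarking %q at %d RPS for %ds\n", *provider, *rate, *duration)
}
```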
Production load simulation:
go run benchmark.go -rate 2000 -duration 300 -cooldown 120 -big-payload
# 2000 RPS for 5 minutes each provider, 2-minute cooldowns, large payloads
Quick smoke test:
go run benchmark.go -rate 100 -duration 5 -cooldown 10
# Light testing: 100 RPS for 5 seconds each, 10-second cooldowns
Single provider deep dive:
go run benchmark.go -provider bifrost -rate 1500 -duration 120 -output bifrost_detailed.json
# Focus test: Bifrost only, 1500 RPS for 2 minutes
Results are saved in JSON format with detailed metrics for each provider:
{
"bifrost": {
"requests": 5000,
"rate": 500.12,
"success_rate": 99.8,
"mean_latency_ms": 45.2,
"p50_latency_ms": 42.1,
"p99_latency_ms": 156.7,
"max_latency_ms": 203.4,
"throughput_rps": 498.5,
"timestamp": "2024-01-15T10:30:00Z",
"status_code_counts": {
"200": 4990,
"500": 10
},
"server_peak_memory_mb": 256.7,
"server_avg_memory_mb": 189.3,
"drop_reasons": {
"HTTP 500": 10
}
}
}
- Success Rate: Percentage of HTTP 200 responses
- Latency Metrics: P50, P99, mean, and max response times in milliseconds
- Throughput: Actual requests processed per second
- Memory Tracking: Peak and average server RAM usage during test
- Drop Reasons: Categorized failure analysis (timeouts, HTTP errors, etc.)
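If you are extending the tool, these fields line up closely with Vegeta's `Metrics` struct. A hedged mapping sketch follows; the helper name and memory arguments are illustrative, with the memory figures coming from the separate gopsutil sampler.

```go
package main

import (
	"time"

	vegeta "github.com/tsenart/vegeta/v12/lib"
)

// toResult maps Vegeta's computed metrics onto the JSON fields shown above.
func toResult(m *vegeta.Metrics, peakMB, avgMB float64) map[string]any {
	ms := func(d time.Duration) float64 { return float64(d) / float64(time.Millisecond) }
	return map[string]any{
		"requests":              m.Requests,
		"rate":                  m.Rate,
		"success_rate":          m.Success * 100, // Vegeta reports success as a 0-1 ratio
		"mean_latency_ms":       ms(m.Latencies.Mean),
		"p50_latency_ms":        ms(m.Latencies.P50),
		"p99_latency_ms":        ms(m.Latencies.P99),
		"max_latency_ms":        ms(m.Latencies.Max),
		"throughput_rps":        m.Throughput,
		"timestamp":             time.Now().UTC().Format(time.RFC3339),
		"status_code_counts":    m.StatusCodes,
		"server_peak_memory_mb": peakMB,
		"server_avg_memory_mb":  avgMB,
		// drop_reasons can be tallied from each Result's Error string during the attack.
	}
}
```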
The tool uses:
- Vegeta: High-performance HTTP load testing library
- gopsutil: System monitoring for memory usage tracking
- Dynamic Targeting: Real-time payload generation with request indexing
- Concurrent Monitoring: Parallel memory sampling during load tests
- Process Discovery: Automatic server process detection by port
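The pieces above can be seen together in the condensed sketch below. It assumes the Vegeta v12 and gopsutil v3 APIs; the server PID is a placeholder for whatever process discovery finds, and the URL and rate are hard-coded for brevity.

```go
package main

import (
	"fmt"
	"net/http"
	"sync/atomic"
	"time"

	"github.com/shirou/gopsutil/v3/process"
	vegeta "github.com/tsenart/vegeta/v12/lib"
)

func main() {
	// Dynamic targeter: every request gets a fresh index and timestamp.
	var counter uint64
	targeter := func(tgt *vegeta.Target) error {
		i := atomic.AddUint64(&counter, 1)
		tgt.Method = "POST"
		tgt.URL = "http://localhost:8080/v1/chat/completions"
		tgt.Header = http.Header{"Content-Type": []string{"application/json"}}
		tgt.Body = []byte(fmt.Sprintf(
			`{"messages":[{"role":"user","content":"This is a benchmark request #%d at %s. How are you?"}],"model":"gpt-4o-mini"}`,
			i, time.Now().UTC().Format(time.RFC3339)))
		return nil
	}

	// Concurrent monitoring: sample the server's RSS while the attack runs.
	serverPID := int32(12345) // placeholder: found by scanning listeners on the provider's port
	done := make(chan struct{})
	peakCh := make(chan float64, 1)
	go func() {
		var peak float64
		proc, err := process.NewProcess(serverPID)
		ticker := time.NewTicker(500 * time.Millisecond)
		defer ticker.Stop()
		for {
			select {
			case <-done:
				peakCh <- peak
				return
			case <-ticker.C:
				if err != nil {
					continue
				}
				if mi, e := proc.MemoryInfo(); e == nil {
					if mb := float64(mi.RSS) / 1024 / 1024; mb > peak {
						peak = mb
					}
				}
			}
		}
	}()

	attacker := vegeta.NewAttacker()
	var metrics vegeta.Metrics
	for res := range attacker.Attack(targeter, vegeta.Rate{Freq: 500, Per: time.Second}, 10*time.Second, "bifrost") {
		metrics.Add(res)
	}
	metrics.Close()
	close(done)
	fmt.Printf("p99=%v peak_memory=%.1fMB\n", metrics.Latencies.P99, <-peakCh)
}
```

Each request carries a small JSON payload of the following shape: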
{
"messages": [
{
"role": "user",
"content": "This is a benchmark request #123 at 2024-01-15T10:30:00Z. How are you?"
}
],
"model": "openai/gpt-4o-mini"
}
With `-big-payload`, an extended payload containing comprehensive AI proxy gateway analysis questions is used instead, suitable for testing larger request handling.
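A payload builder along these lines could produce both shapes. `buildPayload` is a hypothetical helper, and the large variant here simply pads the prompt to roughly 10KB rather than reproducing the tool's actual question text.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"time"
)

// buildPayload returns the request body for request i, injecting the index
// and timestamp so every request is unique.
func buildPayload(i int, model string, big bool) ([]byte, error) {
	content := fmt.Sprintf("This is a benchmark request #%d at %s. How are you?",
		i, time.Now().UTC().Format(time.RFC3339))
	if big {
		// Pad to ~10KB; the real tool uses fixed gateway-analysis questions instead.
		content += " " + strings.Repeat(
			"How should an AI proxy gateway handle retries, routing, and caching? ", 140)
	}
	return json.Marshal(map[string]any{
		"messages": []map[string]string{{"role": "user", "content": content}},
		"model":    model,
	})
}
```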
- Portkey: Automatic `x-portkey-config` header injection with the OpenAI API key (see the sketch after this list)
- Dynamic Content: Request indexing and timestamps in all payloads
- Memory Monitoring: Per-provider server process tracking
- Error Categorization: Provider-specific failure analysis
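A sketch of how that header injection might look in the targeter; the config shape shown is illustrative only, so consult Portkey's documentation for the full schema.

```go
package main

import (
	"fmt"
	"net/http"
	"os"

	vegeta "github.com/tsenart/vegeta/v12/lib"
)

// setHeaders shows how a provider-specific header such as Portkey's
// x-portkey-config could be injected per target.
func setHeaders(tgt *vegeta.Target, provider string) {
	tgt.Header = http.Header{"Content-Type": []string{"application/json"}}
	if provider == "portkey" {
		// Illustrative config payload referencing the OpenAI key from the environment.
		cfg := fmt.Sprintf(`{"provider":"openai","api_key":"%s"}`, os.Getenv("OPENAI_API_KEY"))
		tgt.Header.Set("x-portkey-config", cfg)
	}
}
```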
Common Issues:
- "No process found on port": Ensure your provider is running and the
.env
file has correct ports - "Provider not found": Check provider name spelling (bifrost, litellm, portkey, helicone)
- Memory monitoring fails: Run with sufficient permissions to access process information
- High latency/timeouts: Reduce rate or increase server resources
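For reference, discovering the server process by port can be done with gopsutil's network connection listing, roughly as below (`pidOnPort` is a hypothetical name):

```go
package main

import (
	"fmt"

	gnet "github.com/shirou/gopsutil/v3/net"
)

// pidOnPort returns the PID of the process listening on the given TCP port.
func pidOnPort(port uint32) (int32, error) {
	conns, err := gnet.Connections("tcp")
	if err != nil {
		return 0, err
	}
	for _, c := range conns {
		// A LISTEN socket on the local port identifies the server process.
		if c.Status == "LISTEN" && c.Laddr.Port == port && c.Pid != 0 {
			return c.Pid, nil
		}
	}
	return 0, fmt.Errorf("no process found on port %d", port)
}
```

Note that reading other processes' socket tables may require elevated permissions, which is why memory monitoring can fail without them.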
Debug Tips:
- Start with low rates (`-rate 50`) to verify basic connectivity
- Use single provider tests to isolate issues
- Check server logs during benchmark runs
- Monitor system resources on both client and server sides
- Client Resources: High rates (>2000 RPS) may require tuning client machine resources
- Network: Consider network bandwidth and latency in results interpretation
- Server Capacity: Monitor target server CPU/memory to avoid resource exhaustion
- Cooldown Periods: Allow servers to recover between intensive tests
This tool is designed for comprehensive API performance analysis in development, staging, and production environments.