---
layout: default
title: Configuration
nav_order: 4
---
This guide covers how to configure OneLLM for different providers and use cases.
OneLLM uses environment variables for API keys and configuration.
Set API keys for the providers you want to use:
```bash
# OpenAI
export OPENAI_API_KEY="sk-..."

# Anthropic
export ANTHROPIC_API_KEY="sk-ant-..."

# Google AI Studio
export GOOGLE_API_KEY="..."

# Mistral
export MISTRAL_API_KEY="..."

# Groq
export GROQ_API_KEY="..."

# X.AI
export XAI_API_KEY="..."

# And more...
```

Configure OneLLM behavior:
```bash
# Set default timeout (seconds)
export ONELLM_TIMEOUT=60

# Set default max retries
export ONELLM_MAX_RETRIES=3

# Set logging level
export ONELLM_LOG_LEVEL=INFO
```

Create a `.env` file in your project:
```bash
# Provider API Keys
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=...
MISTRAL_API_KEY=...

# OneLLM Configuration
ONELLM_TIMEOUT=60
ONELLM_MAX_RETRIES=3
```
Load in Python:
```python
from dotenv import load_dotenv

load_dotenv()

from onellm import OpenAI

client = OpenAI()
```

Configure the client at initialization:
```python
from onellm import OpenAI

client = OpenAI(
    api_key="sk-...",  # Override environment variable
    timeout=120,       # Custom timeout
    max_retries=5      # Custom retry count
)
```

Some providers need special configuration:
Create `azure.json`:

```json
{
  "endpoint": "https://your-name.openai.azure.com",
  "api_key": "your-azure-key",
  "api_version": "2024-02-01",
  "deployments": {
    "gpt-4": "your-gpt4-deployment",
    "gpt-35-turbo": "your-gpt35-deployment"
  }
}
```

Use in code:
```python
client = OpenAI(azure_config_path="azure.json")
```

Create `bedrock.json`:

```json
{
  "region": "us-east-1",
  "aws_access_key_id": "AKIA...",
  "aws_secret_access_key": "..."
}
```

Set the service account credentials:
```bash
export GOOGLE_APPLICATION_CREDENTIALS="path/to/service-account.json"
```

Create custom model aliases:
```python
from onellm import OpenAI

# Create client with model mappings
client = OpenAI()

# Use provider/model format
response = client.chat.completions.create(
    model="anthropic/claude-3-5-sonnet-20241022",
    messages=[{"role": "user", "content": "Hello"}]
)
```

Set default parameters for all requests:
```python
from functools import partial

# Create a configured create method
create_chat = partial(
    client.chat.completions.create,
    temperature=0.7,
    max_tokens=500,
    top_p=0.9
)

# Use with defaults
response = create_chat(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}]
)
```

Configure OneLLM behavior at runtime:
```python
import onellm

# Set API keys programmatically
onellm.openai_api_key = "sk-..."     # OpenAI API key
onellm.anthropic_api_key = "sk-..."  # Anthropic API key

# Configure fallback behavior
onellm.config.fallback = {
    "enabled": True,
    "default_chains": {
        "chat": ["openai/gpt-4", "anthropic/claude-3-opus", "groq/llama3-70b"],
        "embedding": ["openai/text-embedding-3-small", "cohere/embed-english"]
    },
    "retry_delay": 1.0,
    "max_retries": 3
}
```

Configure retry behavior:
```python
from onellm import OpenAI
from onellm.utils.retry import RetryConfig

client = OpenAI(
    retry_config=RetryConfig(
        max_retries=5,        # Maximum retry attempts
        initial_backoff=1.0,  # First delay, in seconds
        max_backoff=60.0,     # Cap on any single delay
        exponential_base=2.0  # Delays grow as 1s, 2s, 4s, ...
    )
)
```

Set different timeouts:
```python
# Global timeout
client = OpenAI(timeout=120)

# Per-request timeout
response = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
    timeout=30  # Override for this request
)
```

Configure logging:
```python
import logging

# Set logging level
logging.basicConfig(level=logging.DEBUG)

# Or configure the OneLLM logger specifically
logger = logging.getLogger("onellm")
logger.setLevel(logging.INFO)
```

For Ollama:

```python
# Default Ollama endpoint
client = OpenAI()  # Uses http://localhost:11434

# Custom Ollama endpoint
response = client.chat.completions.create(
    model="ollama/llama3:8b@192.168.1.100:11434",
    messages=[{"role": "user", "content": "Hello"}]
)
```

For llama.cpp:

```python
import os

# Set model directory
os.environ["LLAMA_CPP_MODEL_DIR"] = "/path/to/models"

# Configure GPU layers
client = OpenAI(
    llama_cpp_config={
        "n_gpu_layers": 35,  # GPU acceleration
        "n_ctx": 4096,       # Context window
        "n_threads": 8       # CPU threads
    }
)
```

❌ Bad:
```python
client = OpenAI(api_key="sk-1234567890")  # Hardcoded key (avoid)
```

✅ Good:

```python
client = OpenAI()  # Uses environment variable
```

Use separate keys per environment:

```bash
# Development
export OPENAI_API_KEY="sk-dev-..."

# Production
export OPENAI_API_KEY="sk-prod-..."
```

Keep track of key usage and rotate periodically.
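If you keep one `.env` file per environment, a loader can pick the right one at startup. A minimal stdlib sketch, assuming an `APP_ENV` variable as a naming convention (it is not a OneLLM variable; python-dotenv's `load_dotenv` does this and more):

```python
import os

def load_env_file(path: str) -> None:
    """Minimal .env loader: skips comments, does not overwrite
    existing variables, and does no quote stripping or interpolation."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                os.environ.setdefault(key.strip(), value.strip())

# Pick the file by deployment environment, e.g. .env.development:
# load_env_file(f".env.{os.environ.get('APP_ENV', 'development')}")
```

Because `setdefault` is used, variables already exported in the shell win over values in the file, matching python-dotenv's default behavior.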
Many providers allow restricting keys by:
- IP address
- Usage limits
- Specific models
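Alongside provider-side restrictions, it helps to fail fast in your own code when a required key is missing, rather than discovering it on the first API call. A minimal sketch (the helper name is illustrative, not a OneLLM API):

```python
import os

def require_keys(names):
    """Raise at startup if any required API key is unset or empty."""
    missing = [n for n in names if not os.environ.get(n)]
    if missing:
        raise RuntimeError("Missing API keys: " + ", ".join(missing))

# Call before creating clients, e.g.:
# require_keys(["OPENAI_API_KEY", "ANTHROPIC_API_KEY"])
```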
For the Claude Code assistant, create `CLAUDE.md`:

```markdown
# CLAUDE.md

This project uses OneLLM for LLM interactions.

## Configuration

- API keys are in .env file
- Default model: openai/gpt-4o-mini
- Timeout: 60 seconds

## Common Commands

- Run tests: pytest
- Format: black .
- Lint: ruff check .
```

Configure OneLLM in `pyproject.toml`:
```toml
[tool.onellm]
default_provider = "openai"
default_model = "gpt-4o-mini"
timeout = 60
max_retries = 3
```

To inspect the current configuration:

```python
from onellm.config import config

# Print current configuration
print(config)

# Check a specific provider
print(config["providers"]["openai"])
```

To check which API keys are set:

```python
import os

providers = ["OPENAI", "ANTHROPIC", "GOOGLE", "MISTRAL"]
for provider in providers:
    key_name = f"{provider}_API_KEY"
    if os.environ.get(key_name):
        print(f"✅ {key_name} is set")
    else:
        print(f"❌ {key_name} is not set")
```

- [Provider Setup]({{ site.baseurl }}/providers/setup) - Detailed provider configuration
- [Best Practices]({{ site.baseurl }}/guides/best-practices) - Configuration best practices
- [Troubleshooting]({{ site.baseurl }}/guides/troubleshooting) - Common configuration issues