The preferred way to add custom providers is through the `providers` section of your `agent.yaml` configuration file. This lets you define reusable provider configurations without modifying Cagent's source code.
```yaml
providers:
  my_custom_provider:
    api_type: openai_chatcompletions # or openai_responses
    base_url: https://api.example.com/v1
    token_key: API_KEY_ENV_VAR_NAME

models:
  my_model:
    provider: my_custom_provider
    model: gpt-4o
    max_tokens: 32768

agents:
  root:
    model: my_model
    instruction: You are a helpful assistant.

  # You can also use the shorthand syntax
  subagent:
    model: my_custom_provider/gpt-4o-mini
    instruction: You are a specialized assistant.
```

| Field | Description | Default |
|---|---|---|
| `api_type` | API schema to use (`openai_chatcompletions` or `openai_responses`) | `openai_chatcompletions` |
| `base_url` | Base URL for the provider's API endpoint | - |
| `token_key` | Environment variable name containing the API token | - |
- `openai_chatcompletions`: Use the OpenAI Chat Completions API schema. This is the default and works with most OpenAI-compatible endpoints.
- `openai_responses`: Use the OpenAI Responses API schema. Use this for newer models that require the Responses API format.
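
For instance, a hypothetical provider whose models require the Responses API could be declared like this (the provider name, URL, and environment variable are placeholders):

```yaml
providers:
  my_responses_provider:
    api_type: openai_responses
    base_url: https://api.example.com/v1
    token_key: MY_RESPONSES_API_KEY
```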
When you reference a custom provider in your model configuration:
- The provider's `base_url` is applied to the model if not already set
- The provider's `token_key` is applied to the model if not already set
- The provider's `api_type` is stored in `provider_opts.api_type` (model-level overrides take precedence; see the sketch below)
- The model can then be used with the appropriate API client
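
As a minimal sketch of this precedence (all names illustrative): a model that sets its own `base_url` and `api_type` keeps those values and inherits only the provider's `token_key`:

```yaml
providers:
  my_custom_provider:
    api_type: openai_chatcompletions
    base_url: https://api.example.com/v1
    token_key: API_KEY_ENV_VAR_NAME

models:
  my_model:
    provider: my_custom_provider
    model: gpt-4o
    # Model-level settings win over the provider defaults
    base_url: https://other.example.com/v1
    provider_opts:
      api_type: openai_responses
    # token_key is not set here, so API_KEY_ENV_VAR_NAME is used
```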
If you want to add a new built-in provider alias to Cagent itself, add a new `Alias` to `Aliases` in `pkg/model/provider/provider.go`:
```go
var Aliases = map[string]Alias{
	"requesty": {
		APIType:     "openai",
		BaseURL:     "https://router.requesty.ai/v1",
		TokenEnvVar: "REQUESTY_API_KEY",
	},
	"azure": {
		APIType:     "openai",
		TokenEnvVar: "AZURE_API_KEY",
	},
	"YOUR_PROVIDER": {
		APIType:     "openai",
		TokenEnvVar: "YOUR_PROVIDER_API_KEY",
		BaseURL:     "https://your-provider.ai/v1",
	},
}
```

If your provider requires custom configuration, like Azure's `api_version` or DMR's speculative decoding options, declare it under `provider_opts` in the model configuration:
```yaml
models:
  azure_model:
    provider: azure
    model: gpt-4o
    base_url: https://your-llm.openai.azure.com
    provider_opts:
      api_version: 2024-12-01-preview

  # custom option example
  your_model:
    provider: your_provider
    model: gpt-4o
    provider_opts:
      your_custom_option: your_custom_value

  # DMR with speculative decoding
  dmr_model:
    provider: dmr
    model: ai/qwen3:14B
    provider_opts:
      speculative_draft_model: ai/qwen3:1B
      speculative_num_tokens: 5
      speculative_acceptance_rate: 0.8
```

then handle the new option in `pkg/model/provider/openai/client.go`:
```go
switch cfg.Provider { //nolint:gocritic
case "azure":
	if apiVersion, exists := cfg.ProviderOpts["api_version"]; exists {
		slog.Debug("Setting API version", "api_version", apiVersion)
		if apiVersionStr, ok := apiVersion.(string); ok {
			openaiConfig.APIVersion = apiVersionStr
		}
	}
case "your_provider":
	if yourCustomOption, exists := cfg.ProviderOpts["your_custom_option"]; exists {
		slog.Debug("Setting your custom option", "your_custom_option", yourCustomOption)
		if yourCustomOptionStr, ok := yourCustomOption.(string); ok {
			openaiConfig.yourCustomOption = yourCustomOptionStr
		}
	}
}
```

The DMR provider supports speculative decoding for faster inference. Configure it using `provider_opts`:
- `speculative_draft_model` (string): Model to use for draft predictions
- `speculative_num_tokens` (int): Number of tokens to generate speculatively
- `speculative_acceptance_rate` (float): Acceptance rate threshold for speculative tokens
All three options are sent to Model Runner via its internal `POST /engines/_configure` API endpoint.
You can also pass any flag of the underlying model runtime (llama.cpp or vLLM) using the `runtime_flags` option:
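
A sketch, assuming `runtime_flags` accepts a list of flag strings under `provider_opts`; the flags shown are illustrative llama.cpp server options:

```yaml
models:
  dmr_model:
    provider: dmr
    model: ai/qwen3:14B
    provider_opts:
      # Forwarded to the underlying runtime (here, llama.cpp)
      runtime_flags: ["--ctx-size=8192", "--ngl=33"]
```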