Labels: enhancement (New feature or request)
Problem
When using agents heavily, models hit rate limits (429 errors). Currently, the only options are to:
- Wait for the rate limit to reset
- Manually edit `~/.config/opencode/oh-my-opencode.json` to switch models

This breaks workflow and wastes time.
Proposed Solution
Allow configuring multiple models per agent with automatic fallback on rate limits or errors.
Example Configuration
```json
{
  "agents": {
    "explore": {
      "models": [
        { "model": "opencode/grok-code", "priority": 1 },
        { "model": "google/gemini-3-flash", "priority": 2 },
        { "model": "anthropic/claude-sonnet-4", "priority": 3 }
      ],
      "fallback_strategy": "sequential",
      "fallback_on": ["rate_limit", "timeout", "error"]
    },
    "librarian": {
      "model": "google/gemini-3-pro" // backward compatible - single model still works
    }
  }
}
```

Fallback Strategies
| Strategy | Behavior |
|---|---|
| `sequential` | Try models in priority order until one succeeds |
| `round_robin` | Distribute load across models to prevent hitting limits |
| `random` | Random selection from available models |
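To make the `sequential` strategy concrete, here is a minimal sketch of what trying models in priority order could look like. The names (`ModelEntry`, `tryModels`, `attempt`) are illustrative, not the actual oh-my-opencode API:

```typescript
// Hypothetical sketch: sequential fallback over a prioritized model list.
interface ModelEntry {
  model: string;
  priority: number;
}

// Try each model in ascending priority order until `attempt` succeeds;
// if every model fails, rethrow the last error.
async function tryModels<T>(
  models: ModelEntry[],
  attempt: (model: string) => Promise<T>,
): Promise<T> {
  const ordered = [...models].sort((a, b) => a.priority - b.priority);
  let lastError: unknown;
  for (const entry of ordered) {
    try {
      return await attempt(entry.model);
    } catch (err) {
      lastError = err; // fall through to the next model
    }
  }
  throw lastError;
}
```

`round_robin` and `random` would differ only in how the next model is picked, not in the retry loop itself.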
Fallback Triggers
| Trigger | Description |
|---|---|
| `rate_limit` | HTTP 429 or provider rate limit error |
| `timeout` | Request timeout exceeded |
| `error` | Any API error (5xx, connection failed, etc.) |
| `token_limit` | Context too large for model |
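Deciding which trigger fired could be a small classifier over the error the provider returns. This is a hedged sketch; `status` and `message` stand in for whatever fields the provider SDK actually exposes, and the regexes are illustrative:

```typescript
// Hypothetical sketch: map a raw provider error to one of the fallback triggers.
type FallbackTrigger = "rate_limit" | "timeout" | "error" | "token_limit";

function classifyError(status: number | undefined, message: string): FallbackTrigger {
  if (status === 429 || /rate limit/i.test(message)) return "rate_limit";
  if (/timeout|timed out/i.test(message)) return "timeout";
  if (/context|token.*limit|too (large|long)/i.test(message)) return "token_limit";
  return "error"; // 5xx, connection failures, anything else
}
```

An agent would then consult its `fallback_on` list to decide whether that trigger should cause a model switch.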
Benefits
- Zero downtime: Agents keep working when the primary model hits limits
- Cost optimization: Use cheaper models as fallbacks
- Reliability: Multiple providers = redundancy
- Backward compatible: A single `"model": "string"` still works
Implementation Hints
Based on existing codebase patterns:

- Schema update (`assets/oh-my-opencode.schema.json`):
  - Change `model` from `string` to `oneOf: [string, object]`
  - Add a `models` array option with priority/strategy
- Model resolution (`src/cli/config-manager.ts`):
  - Already has fallback logic for install-time model selection
  - Could extend to runtime model resolution
- Error handling (`src/hooks/anthropic-auto-compact/executor.ts`):
  - Already tracks `RetryState` and `FallbackState`
  - Could trigger a model switch instead of just retry/compact
- State tracking:
  - Track which models are currently rate-limited
  - Implement cooldown timers per model
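The state-tracking point above could be as simple as a per-model expiry map. A minimal sketch, assuming a fixed cooldown window per rate-limit event (the `ModelCooldowns` name and method signatures are hypothetical):

```typescript
// Hypothetical sketch: per-model cooldown state for rate-limited models.
class ModelCooldowns {
  // model id -> epoch-ms timestamp when it becomes available again
  private until = new Map<string, number>();

  // Mark a model as rate-limited for `cooldownMs` milliseconds.
  markLimited(model: string, cooldownMs: number, now = Date.now()): void {
    this.until.set(model, now + cooldownMs);
  }

  // A model is available if it was never limited or its cooldown has expired.
  isAvailable(model: string, now = Date.now()): boolean {
    const expiry = this.until.get(model);
    return expiry === undefined || now >= expiry;
  }
}
```

The fallback loop would skip models where `isAvailable` is false, so a cooling-down primary is not retried on every request. Providers that send a `Retry-After` header could feed that value in as `cooldownMs`.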
Use Case
I run multiple parallel background agents (explore, librarian, oracle) and frequently hit rate limits on primary models. Having automatic fallback would let agents continue working without manual intervention.
Alternatives Considered
- Manual config editing — Works but disrupts workflow
- Single high-limit model — Expensive, still has limits
- External proxy with fallback — Complex, adds latency
Native support in oh-my-opencode would be the cleanest solution.