Open
Labels
enhancement (New feature or request), triage (Issue needs to be triaged/prioritized)
Description
Feature Description
Introduce a first-class failover mechanism so that when a primary LLM request fails (timeout, HTTP 429, HTTP 5xx, or network error), LlamaIndex automatically routes the same request to the next configured provider, without surfacing errors to end users and without duplicating fallback code across apps.
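To make the requested behavior concrete, here is a minimal sketch of ordered failover for non-streaming calls. It is not an existing LlamaIndex API: `Provider`, `ProviderError`, `AllProvidersFailed`, and `complete_with_failover` are hypothetical names, and each provider is modeled as a plain callable that takes a prompt and returns text.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


class ProviderError(Exception):
    """Hypothetical retryable failure: timeout, HTTP 429, HTTP 5xx, network error."""


class AllProvidersFailed(Exception):
    """Raised only after every configured provider has failed."""


@dataclass
class Provider:
    name: str                       # label used only for error reporting
    complete: Callable[[str], str]  # hypothetical: takes a prompt, returns text


def complete_with_failover(providers: Sequence[Provider], prompt: str) -> str:
    """Try each provider in order and return the first successful response."""
    errors = []
    for provider in providers:
        try:
            return provider.complete(prompt)
        except (ProviderError, TimeoutError, ConnectionError) as exc:
            # Record the failure and fall through to the next provider.
            errors.append((provider.name, exc))
    # Nothing succeeded: surface every collected error to the caller at once.
    raise AllProvidersFailed(errors)
```

Only errors treated as transient trigger failover in this sketch; anything else (for example, a malformed request) propagates immediately, since another provider would presumably reject it too.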
Reason
- Today, apps built on LlamaIndex are tightly bound to a single LLM per call. Any transient provider issue becomes a user-visible failure or requires every app to re-implement retries and fallbacks in inconsistent ways.
- This gap is more acute during peak periods (rate limits), provider incidents, or regional network degradation.
- Teams need dependable behavior across standard and streaming responses; hand-rolled fallbacks cause UX glitches and added latency.
- There is clear market precedent for aggregating multiple models/providers behind one surface (e.g., OpenRouter), but users still need unified, provider-agnostic failover within LlamaIndex itself for end-to-end consistency. See: OpenRouter example docs.
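Streaming is the harder case mentioned above: once some tokens have reached the user, switching providers would duplicate or garble output. One possible policy, sketched below under that assumption, is to fail over only if the stream breaks before emitting anything. `stream_with_failover` and `ProviderError` are hypothetical names, not LlamaIndex APIs, and each provider is modeled as a function returning an iterator of text chunks.

```python
from typing import Callable, Iterator, Sequence


class ProviderError(Exception):
    """Hypothetical retryable failure: timeout, HTTP 429, HTTP 5xx, network error."""


# Hypothetical provider shape: takes a prompt, yields text chunks.
StreamFn = Callable[[str], Iterator[str]]


def stream_with_failover(streams: Sequence[StreamFn], prompt: str) -> Iterator[str]:
    """Yield chunks from the first provider that starts streaming successfully."""
    last_error = None
    for start_stream in streams:
        emitted = False
        try:
            for chunk in start_stream(prompt):
                emitted = True
                yield chunk
            return  # stream completed normally
        except ProviderError as exc:
            if emitted:
                # Mid-stream failure: the user already saw partial output,
                # so re-raise instead of restarting on another provider.
                raise
            last_error = exc  # failed before the first chunk: try the next one
    raise RuntimeError("all providers failed before streaming any output") from last_error
```

Other policies are conceivable (for instance, buffering the first few chunks before committing to a provider); this sketch just picks the simplest one that never shows the user duplicated text.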
Value of Feature
- Minimizes business impact during provider outages or throttling; requests continue seamlessly with the next available model.
- Gives enterprises a clear path to meeting reliability/compliance targets without bespoke plumbing per product team.