
[Feature Request]: Built-in LLM Failover for Reliability #19631

@YassinNouh21

Description


Feature Description

Introduce a first-class failover mechanism so that when a primary LLM request fails (timeout, 429, 5xx, or network error), LlamaIndex automatically routes the same request to the next configured provider, without surfacing errors to end users and without duplicating fallback logic across apps.
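A rough sketch of the intended routing behavior, for illustration only. The class and error names here (`FailoverLLM`, `TransientError`, the toy providers) are hypothetical, not an existing LlamaIndex API; a real implementation would wrap LlamaIndex `LLM` objects and map provider-specific exceptions:

```python
class TransientError(Exception):
    """Stand-in for failures worth failing over on: timeouts, 429s, 5xx, network."""

class FailoverLLM:
    """Hypothetical wrapper: try each configured LLM in order until one succeeds."""

    def __init__(self, llms):
        self.llms = llms  # ordered by preference: primary first, fallbacks after

    def complete(self, prompt):
        last_err = None
        for llm in self.llms:
            try:
                return llm.complete(prompt)
            except TransientError as err:
                last_err = err  # transient failure: move on to the next provider
        raise last_err  # every configured provider failed

# Toy providers to demonstrate the routing.
class FlakyLLM:
    def complete(self, prompt):
        raise TransientError("429 rate limited")

class GoodLLM:
    def complete(self, prompt):
        return f"echo: {prompt}"

llm = FailoverLLM([FlakyLLM(), GoodLLM()])
print(llm.complete("hello"))  # → echo: hello
```

The same ordered-list approach would need to extend to streaming responses, where a failure mid-stream is harder to mask; that case likely needs either buffering until the first token or restarting the request on the fallback provider.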

Reason

  • Today, apps built on LlamaIndex are tightly bound to a single LLM per call. Any transient provider issue becomes a user-visible failure or requires every app to re-implement retries and fallbacks in inconsistent ways.
  • This gap is more acute during peak periods (rate limits), provider incidents, or regional network degradation.
  • Teams need dependable behavior across both standard and streaming responses; manual fallbacks cause UX glitches and slowdowns.
  • There is clear market precedent for aggregating multiple models/providers under one surface (e.g., OpenRouter), but users still need unified, provider-agnostic failover within LlamaIndex itself for end-to-end consistency. See: OpenRouter example docs.

Value of Feature

  • Minimizes business impact during provider outages or throttling; requests continue seamlessly with the next available model.
  • Clear pathway for enterprises to meet reliability/compliance targets without bespoke plumbing per product team.

Metadata

Labels

enhancement (New feature or request), triage (Issue needs to be triaged/prioritized)
