[v0.1] Multi-factor routing algorithm #37

@rootfs

Description

Acceptance

  • Routing formula combining quality (model_scores), load (ModelLoad counter), latency (ModelCompletionLatency histogram), token usage, and pricing
  • Configurable for broad SLO-based targets
  • Documented in the architecture guide
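
One possible shape for such a formula, offered as a minimal sketch rather than the intended design: a weighted sum in which quality is used directly and load, latency, and estimated cost are each normalized against a configurable SLO target. Everything named here (`ModelStats`, `SloTargets`, `Weights`, `score`, `route`, the default weights, and the units) is hypothetical and only illustrates how the inputs listed above could be combined. It assumes quality is already scaled to [0, 1] and that a single latency figure (for example a percentile) has been read from the histogram.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ModelStats:
    """Per-model inputs (hypothetical structure, not the project's types)."""
    name: str
    quality: float              # from model_scores, assumed scaled to [0, 1]
    load: float                 # in-flight requests, e.g. from the ModelLoad counter
    latency_s: float            # one latency figure derived from ModelCompletionLatency
    price_per_1k_tokens: float  # assumed pricing unit (USD per 1k tokens)


@dataclass(frozen=True)
class SloTargets:
    """Configurable limits; a factor at or beyond its limit contributes 0."""
    max_load: float
    max_latency_s: float
    max_cost_usd: float


@dataclass(frozen=True)
class Weights:
    """Relative importance of each factor (illustrative defaults)."""
    quality: float = 0.4
    load: float = 0.2
    latency: float = 0.2
    cost: float = 0.2


def _headroom(value: float, limit: float) -> float:
    """Map a lower-is-better value to [0, 1], where 1 means far below the limit."""
    if limit <= 0:
        return 0.0
    return 1.0 - min(max(value, 0.0) / limit, 1.0)


def score(m: ModelStats, w: Weights, slo: SloTargets, expected_tokens: int) -> float:
    """Weighted sum of quality and the remaining headroom on each SLO target."""
    est_cost = expected_tokens / 1000.0 * m.price_per_1k_tokens
    return (
        w.quality * m.quality
        + w.load * _headroom(m.load, slo.max_load)
        + w.latency * _headroom(m.latency_s, slo.max_latency_s)
        + w.cost * _headroom(est_cost, slo.max_cost_usd)
    )


def route(models: Sequence[ModelStats], w: Weights, slo: SloTargets,
          expected_tokens: int) -> ModelStats:
    """Pick the candidate with the highest combined score."""
    if not models:
        raise ValueError("no candidate models")
    return max(models, key=lambda m: score(m, w, slo, expected_tokens))
```

With this shape, shifting weight toward `quality` favors the strongest model regardless of cost, while tightening `SloTargets` penalizes models that are slow, busy, or expensive. Whether a linear combination is the right choice, as opposed to hard SLO filters followed by ranking, is left open here.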
