[3.13] AI Gateway: New and updated load balancing algorithms (least-connections / semantic) #3401

@tomek-labuk

Description

Jobs to be done (optional)

With 3.13, we add a new semantic-priority-group algorithm to ai-proxy-advanced. It groups and selects candidate targets by similarity score, then applies load balancing within the selected group for resilience and even distribution.
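
As a rough illustration of the behavior described above, the following Python sketch groups targets by similarity score and then rotates through the selected group. It is not the plugin's implementation: the `Target` type, the `tolerance` threshold used to form the group, and the round-robin choice inside the group are all assumptions made for illustration. The design document defines the actual behavior.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Target:
    """Hypothetical upstream target with a precomputed similarity score."""
    name: str
    similarity: float  # assumed to be in [0, 1], higher is a better match


def select_group(targets, tolerance=0.05):
    """Return the targets whose score is within `tolerance` of the best score.

    `tolerance` is an assumed grouping rule, not a documented option.
    """
    if not targets:
        return []
    best = max(t.similarity for t in targets)
    group = [t for t in targets if best - t.similarity <= tolerance]
    # Sort by name so the rotation order is stable across calls.
    return sorted(group, key=lambda t: t.name)


class GroupRoundRobin:
    """Round-robin over the selected group (an assumed in-group strategy)."""

    def __init__(self):
        self._counter = count()

    def pick(self, targets, tolerance=0.05):
        group = select_group(targets, tolerance)
        if not group:
            raise ValueError("no candidate targets")
        return group[next(self._counter) % len(group)]


if __name__ == "__main__":
    targets = [Target("a", 0.91), Target("b", 0.90), Target("c", 0.40)]
    balancer = GroupRoundRobin()
    for _ in range(4):
        print(balancer.pick(targets).name)
```

In this sketch, targets `a` and `b` fall in the same group because their scores are close, so requests alternate between them, while `c` is only considered if the better-matching targets are removed from the candidate list.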

Design document: https://docs.google.com/document/d/115t_GAXe3uOG8fOWNl6FrkCSAr0WMnr4R8ywh6sw5Qw/edit?tab=t.0#heading=h.lm7xbylqv511

Definition of done

  • Updated Load balancing section
  • Configuration example added to the 'Load balancing' group for AI Proxy Advanced
  • Updated Load balancing reference page

Information

Contact: @oowl

Size

S
