Mitigate potential overloads in Round Robin load balancing in the event of node failure #676

The default load balancing policy uses round robin, which can overload a node when another node fails. Under the usual round robin order, if a node A fails, the next node in the sequence (here, B) takes on all of A's requests in addition to its own, potentially becoming overloaded.

A potential solution is to shuffle the chosen nodes within each group of the load balancing plan, which would distribute the failed node's load more evenly among the remaining nodes. Currently, however, random shuffling is implemented only for replica selection in `scylla::transport::load_balancing::DefaultPolicy`. Shuffling all nodes in the later stages of constructing a load balancing plan was considered but deemed too costly, which is why round robin is used there (https://github.com/scylladb/scylla-rust-driver/pull/612#discussion_r1055755872).
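To make the effect concrete, here is a toy simulation of the two fallback orders. It is only a sketch and does not use or mirror the driver's actual code: `simulate`, `XorShift`, and the 4-node, 12,000-request setup are hypothetical names and numbers chosen for illustration, and the hand-rolled generator stands in for a real RNG so that the example needs no external crates.

```rust
/// Tiny deterministic xorshift generator (illustrative stand-in for a real RNG).
struct XorShift(u64);

impl XorShift {
    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Fisher-Yates shuffle of `items` in place.
    fn shuffle(&mut self, items: &mut [usize]) {
        for i in (1..items.len()).rev() {
            let j = (self.next() % (i as u64 + 1)) as usize;
            items.swap(i, j);
        }
    }
}

/// Returns how many requests each node serves while node `failed` is down.
/// Each request gets a plan that starts at a round-robin position; the
/// fallback nodes after the first one are either left in ring order or shuffled.
fn simulate(nodes: usize, failed: usize, requests: usize, shuffle_fallbacks: bool) -> Vec<usize> {
    let mut rng = XorShift(0x9E37_79B9_7F4A_7C15);
    let mut served = vec![0usize; nodes];
    for r in 0..requests {
        let start = r % nodes;
        // Plain round robin plan: start, start + 1, ... wrapping around the ring.
        let mut plan: Vec<usize> = (0..nodes).map(|k| (start + k) % nodes).collect();
        if shuffle_fallbacks {
            // Keep the first choice, randomize the order of the fallbacks.
            rng.shuffle(&mut plan[1..]);
        }
        // The request lands on the first node in the plan that is alive.
        let target = plan
            .into_iter()
            .find(|&n| n != failed)
            .expect("at least one live node");
        served[target] += 1;
    }
    served
}

fn main() {
    let ring_order = simulate(4, 0, 12_000, false);
    let shuffled = simulate(4, 0, 12_000, true);
    println!("ring-order fallbacks: {:?}", ring_order);
    println!("shuffled fallbacks:   {:?}", shuffled);
}
```

With ring-order fallbacks, node 1 serves its own 3,000 requests plus all 3,000 that were destined for the failed node 0, while nodes 2 and 3 stay at 3,000 each. With shuffled fallbacks, the failed node's share is spread roughly evenly over the three survivors. The per-request shuffle in this sketch is also the kind of extra work that the linked discussion describes as too costly for the full plan.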
