Start the allocation architecture guide section #121940
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -229,19 +229,43 @@ works in parallel with the storage engine.) | |||||
|
|
||||||
| # Allocation | ||||||
|
|
||||||
| (AllocationService runs on the master node) | ||||||
|
|
||||||
| (Discuss different deciders that limit allocation. Sketch / list the different deciders that we have.) | ||||||
|
|
||||||
| ### APIs for Balancing Operations | ||||||
|
|
||||||
| (Significant internal APIs for balancing a cluster) | ||||||
|
|
||||||
| ### Heuristics for Allocation | ||||||
|
|
||||||
| ### Cluster Reroute Command | ||||||
|
|
||||||
| (How does this command behave with the desired auto balancer.) | ||||||
| ### Core Components | ||||||
|
|
||||||
| The `DesiredBalanceShardsAllocator` makes shard allocation decisions. It uses the `DesiredBalanceComputer` to produce | ||||||
| `DesiredBalance` instances for the cluster based on the latest cluster changes (nodes added or removed, indices created or deleted, load changes, etc.). Then | ||||||
| the `DesiredBalanceReconciler` is invoked to choose the next steps to take to move the cluster from the current shard allocation to the | ||||||
| latest computed `DesiredBalance` shard allocation. The `DesiredBalanceReconciler` applies changes to a copy of the `RoutingNodes`, which is then | ||||||
| published in a cluster state update that reaches the data nodes and starts the individual shard creation/recovery/move work. | ||||||
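As a rough illustration of the compute-then-reconcile split described above, here is a minimal toy sketch. It assumes a drastically simplified model in which an allocation is just a map from shard name to node name. The class `DesiredBalanceSketch` and its methods are hypothetical and are not the Elasticsearch implementation; the real computer weighs far more than a round-robin spread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model: these names are illustrative, not Elasticsearch classes.
final class DesiredBalanceSketch {

    // Toy stand-in for a desired balance computation: spreads shards over nodes
    // round-robin. This is an assumption made for illustration only.
    static Map<String, String> computeDesired(List<String> shards, List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        Map<String, String> desired = new LinkedHashMap<>();
        for (int i = 0; i < shards.size(); i++) {
            desired.put(shards.get(i), nodes.get(i % nodes.size()));
        }
        return desired;
    }

    // Toy stand-in for the reconciler's input: every shard whose current node
    // differs from its desired node (a shard missing from `current` also counts).
    static List<String> shardsToMove(Map<String, String> current, Map<String, String> desired) {
        List<String> pending = new ArrayList<>();
        for (Map.Entry<String, String> entry : desired.entrySet()) {
            if (!entry.getValue().equals(current.get(entry.getKey()))) {
                pending.add(entry.getKey());
            }
        }
        return pending;
    }
}
```

With four shards `s0` to `s3` all currently on `n1` and two nodes `n1` and `n2`, the toy desired balance alternates between the nodes, so `shardsToMove` reports `s1` and `s3` as the work left for reconciliation.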
|
|
||||||
| The `DesiredBalanceReconciler` is throttled by cluster settings, such as the maximum number of concurrent shard moves and recoveries per cluster and per node. This | ||||||
| is why the `DesiredBalanceReconciler` makes incremental changes to the cluster shard allocation and publishes them via cluster state updates. The | ||||||
| `DesiredBalanceShardsAllocator` is the endpoint for reroute requests, which may trigger immediate requests to the `DesiredBalanceReconciler` but | ||||||
| asynchronous requests to the `DesiredBalanceComputer` via the `ContinuousComputation` component. Cluster state changes that affect shard | ||||||
| balancing (for example, index deletion) all call some reroute method that reaches the `DesiredBalanceShardsAllocator` to run | ||||||
| reconciliation and queue a request for the `DesiredBalanceComputer`, leading to desired balance computation and reconciliation actions. | ||||||
| Asynchronous completion of a new `DesiredBalance` also invokes a reconciliation action, as do cluster state updates that complete shard | ||||||
| moves/recoveries (unthrottling the next shard move/recovery). | ||||||
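To make the effect of throttling concrete, the following toy sketch shows why a limit on concurrent moves turns reconciliation into a series of incremental rounds. It assumes a single cluster-wide limit and models an allocation as a shard-to-node map. The class `ThrottledReconcileSketch`, its methods, and the `maxConcurrentMoves` parameter are all hypothetical names chosen for illustration; the real throttling is driven by several cluster settings and allocation deciders.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical toy model: these names are illustrative, not Elasticsearch classes.
final class ThrottledReconcileSketch {

    // Picks at most maxConcurrentMoves shards whose current node differs from the
    // desired node. Returns shard -> target node for the moves chosen this round.
    // Shards with no entry in `desired` are left where they are.
    static Map<String, String> nextMoves(Map<String, String> current, Map<String, String> desired, int maxConcurrentMoves) {
        Map<String, String> moves = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : current.entrySet()) {
            if (moves.size() >= maxConcurrentMoves) {
                break; // throttled: remaining moves wait for a later round
            }
            String target = desired.get(entry.getKey());
            if (target != null && !target.equals(entry.getValue())) {
                moves.put(entry.getKey(), target);
            }
        }
        return moves;
    }

    // Applies the chosen moves to a copy, leaving the input allocation untouched.
    static Map<String, String> apply(Map<String, String> current, Map<String, String> moves) {
        Map<String, String> next = new LinkedHashMap<>(current);
        next.putAll(moves);
        return next;
    }

    // Counts how many throttled rounds are needed before current matches desired.
    static int roundsToConverge(Map<String, String> current, Map<String, String> desired, int maxConcurrentMoves) {
        if (maxConcurrentMoves < 1) {
            throw new IllegalArgumentException("maxConcurrentMoves must be at least 1");
        }
        Map<String, String> state = current;
        int rounds = 0;
        while (true) {
            Map<String, String> moves = nextMoves(state, desired, maxConcurrentMoves);
            if (moves.isEmpty()) {
                return rounds;
            }
            state = apply(state, moves); // each round corresponds to one published update
            rounds++;
        }
    }
}
```

In this model, if two of three shards need to move and the limit is 1, convergence takes two rounds; with a limit of 5 it takes one. Each round stands in for one cluster state update, and the completion of a round is what allows the next batch of moves to start.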
|
|
||||||
| The `ContinuousComputation` maintains a queue of desired balance computation requests, each of which holds the latest cluster information at | ||||||
| the time of the request, and a thread that runs the `DesiredBalanceComputer`. The `ContinuousComputation` thread grabs the latest request, | ||||||
Done
Outdated
violate node resource limits or hard limits
Really, violating any rule as defined by an AllocationDecider.
Rephrased 👍
Outdated
There is a decider
Maybe name these deciders here so folks can go and look them up?
Done
nit: `DesiredBalanceReconciler` not just `Reconciler` (here and below)
Done