@@ -195,6 +195,18 @@ works in parallel with the storage engine.)
195195
196196# Allocation
197197
198+ ### Indexes and Shards
199+
200+ Each index consists of a fixed number of primary shards. The number of primary shards cannot be changed for the lifetime of the index. Each
201+ primary shard can have zero-to-many replicas used for data redundancy. The number of replicas per shard can be changed dynamically.
202+
203+ The allocation assignment status of each shard copy is tracked by its [ ShardRoutingState] [ ] . The ` RoutingTable ` and ` RoutingNodes ` objects
204+ are responsible for tracking the data nodes to which each shard in the cluster is allocated: see the [ routing package javadoc] [ ] for more
205+ details about these structures.
206+
207+ [ routing package javadoc ] : https://github.com/elastic/elasticsearch/blob/v9.0.0-beta1/server/src/main/java/org/elasticsearch/cluster/routing/package-info.java
208+ [ ShardRoutingState ] : https://github.com/elastic/elasticsearch/blob/4c9c82418ed98613edcd91e4d8f818eeec73ce92/server/src/main/java/org/elasticsearch/cluster/routing/ShardRoutingState.java#L12-L46
209+
198210### Core Components
199211
200212The ` DesiredBalanceShardsAllocator ` is what runs shard allocation decisions. It leverages the ` DesiredBalanceComputer ` to produce
@@ -235,6 +247,22 @@ of shards, and an incentive to distribute shards within the same index across di
235247` NodeAllocationStatsAndWeightsCalculator ` classes for more details on the weight calculations that support the ` DesiredBalanceComputer `
236248decisions.
237249
250+ ### Inter-Node Communicaton
251+
252+ The elected master node creates a shard allocation plan with the ` DesiredBalanceShardsAllocator ` and then selects incremental shard
253+ movements towards the target allocation plan with the ` DesiredBalanceReconciler ` . The results of the ` DesiredBalanceReconciler ` is an
254+ updated ` RoutingTable ` . The ` RoutingTable ` is part of the cluster state, so the master node updates the cluster state with the new
255+ (incremental) desired shard allocation information. The updated cluster state is then published to the data nodes. Each data node will
256+ observe any change in shard allocation related to itself and take action to achieve the new shard allocation by: initiating creation of a
257+ new empty shard; starting recovery (copying) of an existing shard from another data node; or removing a shard. When the data node finishes
258+ a shard change, a request is sent to the master node to update the shard as having finished recovery/removal in the cluster state. The
259+ cluster state is used by allocation as a fancy work queue: the master node conveys new work to the data nodes, which pick up the work and
260+ report back when done.
261+
262+ - See ` DesiredBalanceShardsAllocator#submitReconcileTask ` for the master node's cluster state update post-reconciliation.
263+ - See ` IndicesClusterStateService#doApplyClusterState ` for the data node hook to observe shard changes in the cluster state.
264+ - See ` ShardStateAction#sendShardAction ` for the data node request to the master node on completion of a shard state change.
265+
238266# Autoscaling
239267
240268The Autoscaling API in ES (Elasticsearch) uses cluster and node level statistics to provide a recommendation
0 commit comments