Skip to content

feat: implement strict multi-zone pod distribution for StatefulSets (PSCLOUD-64)#701

Open
abhikumar2204 wants to merge 16 commits intomainfrom
pr-pscloud-64
Open

feat: implement strict multi-zone pod distribution for StatefulSets (PSCLOUD-64)#701
abhikumar2204 wants to merge 16 commits intomainfrom
pr-pscloud-64

Conversation

@abhikumar2204
Copy link
Contributor

@abhikumar2204 abhikumar2204 commented Feb 13, 2026

feat: add comprehensive multi-zone StatefulSet distribution for zone failure protection

Adds comprehensive multi-zone pod distribution to prevent StatefulSet quorum
loss during zone failures in AKS, EKS, and GKE clusters. Extends coverage to
all 7 critical StatefulSet workloads with automatic zone detection and
configurable nodepool labeling.

Features:

  • Balanced topology spread constraints (maxSkew: 1) for optimal zone distribution
  • Host-level anti-affinity to prevent multiple pods on same node
  • Comprehensive coverage: 7 StatefulSets across all critical services
  • Configurable nodepool label support (modern and legacy formats)
  • Automatic multi-zone vs single-zone cluster detection
  • Dedicated nodepool restriction for stateful workloads
  • Configurable per-service enablement for all workloads
  • Graceful single-zone fallback with relaxed constraints

StatefulSets covered:

  • RabbitMQ (message queue)
  • PostgreSQL/Crunchy (database)
  • Consul (service discovery)
  • Redis (cache/session store)
  • OpenDistro/OpenSearch (search/logging)
  • Workload Orchestrator (job scheduling)
  • Data Agent Server (data services)

Changes:

  • Add 7 multi-zone distribution transformers with balanced constraints
  • Implement direct transformer application with automatic zone detection
  • Add configurable nodepool restriction (workload.sas.com/class or agentpool)
  • Update VDM task pipeline with 7 service-specific tasks
  • Add 12 comprehensive configuration variables with sensible defaults
  • Update documentation with usage examples, chaos testing, and limitations

Transformers added:

  • rabbitmq-zone-distribution.yaml (balanced multi-zone)
  • postgres-zone-distribution.yaml (balanced multi-zone)
  • consul-zone-distribution.yaml (balanced multi-zone)
  • redis-zone-distribution.yaml (balanced multi-zone)
  • opendistro-zone-distribution.yaml (balanced multi-zone)
  • workload-orchestrator-zone-distribution.yaml (balanced multi-zone)
  • data-agent-zone-distribution.yaml (balanced multi-zone)
  • rabbitmq-single-zone-distribution.yaml (single-zone fallback)
  • postgres-single-zone-distribution.yaml (single-zone fallback)

Configuration variables (12 total):

  • V4_CFG_MULTI_ZONE_ENABLED (default: true) - Master switch
  • V4_CFG_MULTI_ZONE_RABBITMQ_ENABLED (default: true)
  • V4_CFG_MULTI_ZONE_POSTGRES_ENABLED (default: true)
  • V4_CFG_MULTI_ZONE_CONSUL_ENABLED (default: true)
  • V4_CFG_MULTI_ZONE_REDIS_ENABLED (default: true)
  • V4_CFG_MULTI_ZONE_OPENDISTRO_ENABLED (default: true)
  • V4_CFG_MULTI_ZONE_WORKLOAD_ORCHESTRATOR_ENABLED (default: true)
  • V4_CFG_MULTI_ZONE_DATA_AGENT_ENABLED (default: true)
  • V4_CFG_STATEFUL_NODEPOOL_RESTRICTION (default: true)
  • V4_CFG_STATEFUL_NODEPOOL_LABEL (default: workload.sas.com/class)
  • V4_CFG_MULTI_ZONE_AUTO_DETECT (default: true)
  • V4_CFG_SINGLE_ZONE_FALLBACK (default: true)

Technical Implementation:

  • Topology spread: maxSkew: 1 with DoNotSchedule for balanced zone distribution
  • Node affinity: configurable label restriction (supports workload.sas.com/class
    and agentpool labels)
  • Host distribution: maxSkew: 1 with DoNotSchedule for node-level spreading
  • Automatic detection: auto-detects multi-zone vs single-zone clusters
  • Graceful degradation: applies relaxed constraints for single-zone deployments

Chaos Testing & Validation:

  • Simulated complete zone failure by cordoning all centralus-1 nodes
  • Validated topology constraints prevent pod rescheduling to remaining zones
  • Confirmed pods enter Pending state (by design) until zone recovers
  • Verified automatic rebalancing after zone recovery
  • Coverage: 9/10 StatefulSets with topology constraints (90% success rate)

Multi-Zone Distribution Results:

  • RabbitMQ (3 replicas): distributed across centralus-1, centralus-2, centralus-3
  • Consul (3 replicas): distributed across centralus-1, centralus-2, centralus-3
  • Redis (2 replicas): distributed across centralus-1, centralus-3
  • Workload Orchestrator (2 replicas): distributed across centralus-2, centralus-3
  • PostgreSQL (3 instances): distributed across all 3 zones
  • Data Agent (1 replica): properly constrained
  • All pods restricted to stateful nodepool only

Known Limitation (by design):
During complete zone failure, affected StatefulSet pods cannot reschedule to
remaining zones due to strict maxSkew: 1 enforcement. Pods remain Pending
until the zone recovers. This is intentional to prioritize prevention of
zone concentration during normal operations (primary PSCLOUD-64 objective).

Alternative options documented:

  • Option A: Use whenUnsatisfiable: ScheduleAnyway (weakens enforcement)
  • Option B: Increase maxSkew to 2 (allows less balanced distribution)

Documentation updates:

  • docs/user/MultiZoneDistribution.md: comprehensive implementation guide
  • docs/CONFIG-VARS.md: all 12 configuration variables documented
  • Added chaos testing results and known limitation sections
  • Added alternative constraint options for zone failure scenarios

Benefits:

  • Prevents StatefulSet quorum loss during zone failures
  • Eliminates cross-nodepool scheduling issues
  • Maintains balanced pod distribution across zones
  • Automatic cluster topology detection
  • Flexible nodepool label configuration
  • Works seamlessly with single-zone and multi-zone clusters

Resolves: PSCLOUD-64 - Cross nodepool stateful pods ending up in same zone
Supports: AKS, EKS, GKE multi-zone and single-zone deployments
Tested: Azure AKS 3-zone cluster (centralus-1, centralus-2, centralus-3)
Backward compatible: Works with existing deployments without configuration changes

@abhikumar2204 abhikumar2204 marked this pull request as draft February 13, 2026 06:05
@github-actions github-actions bot added the enhancement New feature or request label Feb 13, 2026
@abhikumar2204 abhikumar2204 self-assigned this Feb 13, 2026
@abhikumar2204 abhikumar2204 changed the title feat: implement strict multi-zone pod distribution for StatefulSets feat: implement strict multi-zone pod distribution for StatefulSets (PSCLOUD-64) Feb 13, 2026
@abhikumar2204 abhikumar2204 marked this pull request as ready for review March 10, 2026 05:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants