Zone-aware load balancing doesn't work with Spring Cloud Kubernetes Discovery #2072

@deathcoder

Environment

  • Spring Boot: 3.2.0
  • Spring Cloud: 2023.0.0
  • Spring Cloud Kubernetes: 3.1.0
  • Kubernetes: v1.27+ (tested with Kind)
  • Load Balancer Mode: POD

Summary

The built-in ZonePreferenceServiceInstanceListSupplier does not work with Spring Cloud Kubernetes Discovery because zone information stored in DefaultKubernetesServiceInstance.podMetadata() is not accessible through the ServiceInstance.getMetadata() interface that the zone preference logic uses.

Expected Behavior

When using Spring Cloud Kubernetes Discovery with zone-aware load balancing configuration:

spring:
  cloud:
    kubernetes:
      loadbalancer:
        enabled: true
        mode: POD
        zone-preference-enabled: true
    loadbalancer:
      zone: ${ZONE}
      configurations: zone-preference

And building a ServiceInstanceListSupplier with:

ServiceInstanceListSupplier.builder()
    .withDiscoveryClient()
    .withZonePreference()
    .build(context);

Expected: Requests should be routed only to service instances in the same availability zone as the client.

Actual Behavior

Requests are distributed randomly across all zones (approximately 50/50 split between zones), indicating that zone filtering is not working.

Reproduction

A complete reproduction repository is available.

More details are in the project's Markdown files and in the Details section.

Questions

  1. Is there a configuration option we're missing that would expose pod labels in getMetadata() for use with the built-in ZonePreferenceServiceInstanceListSupplier?

  2. Is this an architectural limitation where ServiceInstance.getMetadata() is intentionally separate from pod-specific metadata?

  3. Should Spring Cloud Kubernetes automatically populate zone information from pod labels into getMetadata() to work with the standard Spring Cloud LoadBalancer zone preference?

  4. Is this a bug that should be fixed, or is the recommended approach to use custom suppliers when zone-aware routing is needed with Kubernetes?

Thank you for your time! Any guidance on the recommended approach for zone-aware load balancing with Spring Cloud Kubernetes would be greatly appreciated.

Test Results

{
  "clientZone": "zone-a",
  "totalCalls": 20,
  "sameZoneCalls": 10,
  "crossZoneCalls": 10,
  "sameZonePercentage": "50.0%"
}

Root Cause Analysis

After extensive investigation, we found that:

  1. Zone information IS available in Spring Cloud Kubernetes Discovery - it's stored in pod labels
  2. But it's in the wrong place for the built-in zone preference logic to find it

Where zone information exists:

DefaultKubernetesServiceInstance instance = ...; // from discovery

// ✅ Zone IS available here:
Map<String, Map<String, String>> podMetadata = instance.podMetadata();
String podZone = podMetadata.get("labels").get("topology.kubernetes.io/zone");
// Returns: "zone-a"

// ❌ But NOT available here (where ZonePreferenceServiceInstanceListSupplier looks):
Map<String, String> metadata = instance.getMetadata();
String metadataZone = metadata.get("zone"); // Returns: null
String topologyZone = metadata.get("topology.kubernetes.io/zone"); // Returns: null

Investigation details:

Available in getMetadata():

[app, port.http, k8s_namespace, type, kubectl.kubernetes.io/last-applied-configuration]

Available in podMetadata():

{
  "labels": {
    "app": "sample-service",
    "pod-template-hash": "6f74896b6d",
    "topology.kubernetes.io/zone": "zone-a",
    "zone": "zone-a"
  },
  "annotations": {
    "kubectl.kubernetes.io/restartedAt": "2025-10-14T14:53:02+02:00"
  }
}

Why ZonePreferenceServiceInstanceListSupplier doesn't work:

Looking at the Spring Cloud LoadBalancer source, ZonePreferenceServiceInstanceListSupplier uses:

private String getZone(ServiceInstance serviceInstance) {
    Map<String, String> metadata = serviceInstance.getMetadata();
    if (metadata != null) {
        return metadata.get(ZONE); // Looks for "zone" key
    }
    return null;
}

This method only checks getMetadata(), not podMetadata(), so it never finds the zone information.
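
The mismatch can be reproduced without a cluster. The following is a minimal sketch using plain maps in place of the real instance objects; the class name ZoneMetadataSketch and its methods are hypothetical and not part of Spring Cloud Kubernetes. It shows the flat lookup missing the zone, the pod-label lookup finding it, and one possible way of copying the label into the flat map so that a "zone" lookup would succeed.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of any Spring Cloud module. */
final class ZoneMetadataSketch {

    // Key read by the zone preference logic, per the getZone snippet above.
    static final String ZONE = "zone";
    // Standard Kubernetes topology label used in the pod labels above.
    static final String TOPOLOGY_ZONE_LABEL = "topology.kubernetes.io/zone";

    private ZoneMetadataSketch() {
    }

    /** Mirrors the flat lookup: only the metadata map is consulted. */
    static String zoneFromMetadata(Map<String, String> metadata) {
        return metadata == null ? null : metadata.get(ZONE);
    }

    /** Reads the zone from pod labels, preferring the topology label over "zone". */
    static String zoneFromPodLabels(Map<String, Map<String, String>> podMetadata) {
        if (podMetadata == null) {
            return null;
        }
        Map<String, String> labels = podMetadata.get("labels");
        if (labels == null) {
            return null;
        }
        String zone = labels.get(TOPOLOGY_ZONE_LABEL);
        return zone != null ? zone : labels.get(ZONE);
    }

    /** Returns a copy of the flat metadata with "zone" filled in from pod labels if absent. */
    static Map<String, String> withZone(Map<String, String> metadata,
                                        Map<String, Map<String, String>> podMetadata) {
        Map<String, String> merged = new HashMap<>();
        if (metadata != null) {
            merged.putAll(metadata);
        }
        String zone = zoneFromPodLabels(podMetadata);
        if (zone != null) {
            merged.putIfAbsent(ZONE, zone);
        }
        return merged;
    }
}
```

With the maps shown earlier, zoneFromMetadata finds nothing, zoneFromPodLabels finds the zone, and zoneFromMetadata(withZone(...)) finds it as well. Whether the framework should perform such a merge itself is exactly what the questions above ask.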

Workarounds

We've identified three working approaches:

Workaround 1: Custom supplier accessing podMetadata()

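import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

import org.springframework.cloud.client.ServiceInstance;
import org.springframework.cloud.kubernetes.commons.discovery.DefaultKubernetesServiceInstance;
import org.springframework.cloud.loadbalancer.core.ServiceInstanceListSupplier;
import reactor.core.publisher.Flux;

public class PodMetadataZoneServiceInstanceListSupplier implements ServiceInstanceListSupplier {

    private final ServiceInstanceListSupplier delegate;
    private final String clientZone;

    public PodMetadataZoneServiceInstanceListSupplier(ServiceInstanceListSupplier delegate, String clientZone) {
        this.delegate = delegate;
        this.clientZone = clientZone;
    }

    @Override
    public String getServiceId() {
        return delegate.getServiceId();
    }

    @Override
    public Flux<List<ServiceInstance>> get() {
        return delegate.get().map(instances -> {
            // No usable client zone: skip filtering entirely.
            if (clientZone == null || "unknown".equalsIgnoreCase(clientZone)) {
                return instances;
            }

            // Strict filter: the result is empty when no instance is in the client zone.
            return instances.stream()
                .filter(instance -> clientZone.equalsIgnoreCase(getZoneFromPodMetadata(instance)))
                .collect(Collectors.toList());
        });
    }

    private String getZoneFromPodMetadata(ServiceInstance instance) {
        if (!(instance instanceof DefaultKubernetesServiceInstance)) {
            return null;
        }

        DefaultKubernetesServiceInstance k8sInstance = (DefaultKubernetesServiceInstance) instance;
        Map<String, Map<String, String>> podMetadata = k8sInstance.podMetadata();

        if (podMetadata != null && podMetadata.containsKey("labels")) {
            Map<String, String> labels = podMetadata.get("labels");
            // Prefer the standard topology label, then fall back to the plain "zone" label.
            String zone = labels.get("topology.kubernetes.io/zone");
            if (zone == null) {
                zone = labels.get("zone");
            }
            return zone;
        }

        return null;
    }
}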
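
Workaround 1 filters strictly, so it can return an empty list when no instance runs in the client's zone. If a preference with fallback is wanted instead (which, as far as we can tell, is how the built-in supplier behaves), the filtering step could look like the sketch below. It is written against plain JDK types so it can be tried in isolation; ZonePreferenceSketch and preferZone are hypothetical names, and the zoneOf function stands in for a lookup such as getZoneFromPodMetadata.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Hypothetical helper for illustration; not part of Spring Cloud LoadBalancer. */
final class ZonePreferenceSketch {

    private ZonePreferenceSketch() {
    }

    /**
     * Returns the instances whose zone matches clientZone (case-insensitive).
     * Falls back to the full list when clientZone is null or nothing matches,
     * so callers never receive an empty list just because of zone filtering.
     */
    static <T> List<T> preferZone(List<T> instances, String clientZone, Function<T, String> zoneOf) {
        if (clientZone == null) {
            return instances;
        }
        List<T> sameZone = instances.stream()
            // equalsIgnoreCase(null) is false, so instances without a zone are skipped.
            .filter(instance -> clientZone.equalsIgnoreCase(zoneOf.apply(instance)))
            .collect(Collectors.toList());
        return sameZone.isEmpty() ? instances : sameZone;
    }
}
```

The trade-off is availability versus strict affinity: with the fallback, a zone that has no healthy instances still gets served, at the price of cross-zone traffic.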

Workaround 2: Using Kubernetes EndpointSlices API

// Query EndpointSlices (Fabric8-style client DSL), which expose the zone natively via endpoint.getZone()
EndpointSliceList slices = kubernetesClient.discovery().v1()
    .endpointSlices()
    .inNamespace(namespace)
    .withLabel("kubernetes.io/service-name", serviceId)
    .list();

// Build IP to zone mapping (ipToZoneCache is assumed to be a Map<String, String> field)
for (EndpointSlice slice : slices.getItems()) {
    for (Endpoint endpoint : slice.getEndpoints()) {
        String zone = endpoint.getZone(); // Native zone support!
        for (String ip : endpoint.getAddresses()) {
            ipToZoneCache.put(ip, zone);
        }
    }
}
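
Once the IP-to-zone map is built, it still has to be applied to the discovered instances. A minimal sketch of that step follows, assuming that in POD mode each instance's host is the pod IP; IpZoneIndexSketch and hostsInZone are hypothetical names, and plain strings stand in for the service instances.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical helper for illustration; operates on host strings rather than real instances. */
final class IpZoneIndexSketch {

    private IpZoneIndexSketch() {
    }

    /** Keeps only the hosts whose cached zone matches the given zone (case-insensitive). */
    static List<String> hostsInZone(Map<String, String> ipToZone, List<String> hosts, String zone) {
        return hosts.stream()
            // Hosts missing from the cache map to null and are filtered out.
            .filter(host -> zone != null && zone.equalsIgnoreCase(ipToZone.get(host)))
            .collect(Collectors.toList());
    }
}
```

Hosts that are not yet in the cache are dropped here, so the cache would need refreshing whenever the EndpointSlices change.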

Workaround 3: Direct Kubernetes API queries for pod labels

// Query pods by IP to get their labels
List<Pod> pods = kubernetesClient.pods()
    .inNamespace(namespace)
    .list()
    .getItems();

Pod matchingPod = pods.stream()
    .filter(pod -> podIp.equals(pod.getStatus().getPodIP()))
    .findFirst()
    .orElse(null);

if (matchingPod != null) {
    String zone = matchingPod.getMetadata().getLabels()
        .get("topology.kubernetes.io/zone");
}

Reproduction

A complete reproduction repository is available with:

  • Working Kind cluster setup
  • Sample services with zone labels
  • Three client implementations showing the problem and workarounds
  • Test scripts demonstrating the issue

Configuration used:

Pod Labels:

labels:
  app: sample-service
  topology.kubernetes.io/zone: zone-a  # Standard Kubernetes zone label
  zone: zone-a                          # Alternative zone label

Discovery Configuration:

spring:
  cloud:
    kubernetes:
      discovery:
        enabled: true
        metadata:
          add-pod-labels: true
          add-pod-annotations: true
          labels-prefix: ""
          annotations-prefix: ""
      loadbalancer:
        enabled: true
        mode: POD
        zone-preference-enabled: true

LoadBalancer Configuration:

spring:
  cloud:
    loadbalancer:
      zone: ${ZONE}
      configurations: zone-preference

Additional Context

This issue is critical for production deployments where:

  • Services are deployed across multiple availability zones
  • Cross-zone traffic incurs additional latency and costs
  • Zone affinity is required for performance and resilience

The workarounds are functional but require custom code that should ideally be handled by the framework. Understanding whether this is expected behavior or a gap in the integration between Spring Cloud LoadBalancer and Spring Cloud Kubernetes would help the community implement zone-aware routing correctly.

Versions

<properties>
    <spring-boot.version>3.2.0</spring-boot.version>
    <spring-cloud.version>2023.0.0</spring-cloud.version>
    <spring-cloud-kubernetes.version>3.1.0</spring-cloud-kubernetes.version>
</properties>

<dependencies>
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-kubernetes-client</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-kubernetes-client-loadbalancer</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-loadbalancer</artifactId>
    </dependency>
</dependencies>
