[Feature Request] Support multiple --service-account-issuer flags for zero-downtime endpoint migration #11694

@jbonzo

Feature Request

Add support for configuring multiple --service-account-issuer flags in the Talos API server configuration to enable zero-downtime migration when changing cluster control plane endpoints.

Problem Statement

By default, Talos sets service-account-issuer to the value of cluster.controlPlane.endpoint. This is a problem in two situations: when the control plane endpoint has to change (e.g., during load balancer migration, DNS changes, or infrastructure updates), and when the issuer has to differ from the control plane endpoint (e.g., because it is hosted externally). In both cases, changing the issuer causes immediate authentication failures and cluster downtime because:

  1. Existing service account tokens were issued by the old issuer URL
  2. The API server immediately switches to accepting only tokens from the new issuer URL
  3. All running workloads with mounted service account tokens lose authentication until pods are restarted

Current Behavior

# Current Talos configuration
cluster:
  controlPlane:
    endpoint: https://new-endpoint.example.com:6443

Changing the endpoint in this configuration results in:

  • Immediate authentication failures for existing workloads
  • Required pod restarts across the entire cluster
  • Operational downtime during endpoint changes

Proposed Solution

Enable configuration of multiple service account issuers, as the Kubernetes API server has supported natively since v1.22:

Option 1: Array configuration in controlPlane section

cluster:
  controlPlane:
    endpoint: https://new-endpoint.example.com:6443
    serviceAccountIssuers:
      - https://new-endpoint.example.com:6443  # Generate new tokens (first in list)
      - https://old-endpoint.example.com:6443  # Still validate existing tokens

Option 2: Enhanced extraArgs support for repeatable flags

cluster:
  apiServer:
    extraArgs:
      service-account-issuer:
        - https://new-endpoint.example.com:6443  # first entry issues new tokens
        - https://old-endpoint.example.com:6443  # still accepted for validation

Note: This would require Talos to enhance extraArgs to detect array values and convert them to multiple instances of the same flag, in list order (e.g., --service-account-issuer=https://new-endpoint.example.com:6443 --service-account-issuer=https://old-endpoint.example.com:6443). Order matters because the API server issues new tokens with the first issuer. This approach would also benefit other repeatable Kubernetes API server flags.
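To make the array-to-flags conversion in Option 2 concrete, here is a minimal sketch in Python. It is only an illustration of the proposed behavior, not Talos code: the function name render_extra_args is hypothetical, and the real implementation would live in Talos's own config handling.

```python
from typing import Mapping, Sequence, Union

# A value is either a single string (today's behavior) or a list of strings
# (the proposed extension for repeatable flags).
ArgValue = Union[str, Sequence[str]]


def render_extra_args(extra_args: Mapping[str, ArgValue]) -> list[str]:
    """Render an extraArgs-style mapping as command-line flags.

    A string value yields one flag. A list value yields one flag per item,
    in list order, so the first item stays the first flag on the command line.
    """
    flags: list[str] = []
    for name, value in extra_args.items():
        # Normalize both shapes to a list so one loop handles them.
        values = [value] if isinstance(value, str) else list(value)
        for item in values:
            flags.append(f"--{name}={item}")
    return flags
```

Called with a service-account-issuer list of two URLs, this yields two --service-account-issuer flags in the same order, while single-valued entries render exactly as before.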

Technical Background

The Kubernetes API server has supported multiple --service-account-issuer flags since v1.22. When the flag is repeated:

  • The first issuer generates new tokens
  • All issuers are used to validate existing tokens
  • This enables non-disruptive issuer changes per Kubernetes documentation
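The issuer semantics listed above can be modelled in a few lines of Python. This is a deliberately simplified sketch: the helper names are hypothetical, and it only covers the issuer check. The real API server also verifies the token signature, expiry, and audience.

```python
from typing import Sequence


def signing_issuer(issuers: Sequence[str]) -> str:
    """Return the issuer placed in the iss claim of newly issued tokens."""
    if not issuers:
        raise ValueError("at least one issuer must be configured")
    # Only the first configured issuer is used for new tokens.
    return issuers[0]


def is_issuer_accepted(token_issuer: str, issuers: Sequence[str]) -> bool:
    """Return True if a token's iss claim matches any configured issuer."""
    return token_issuer in issuers


OLD = "https://old-endpoint.example.com:6443"
NEW = "https://new-endpoint.example.com:6443"

# During migration: the new issuer is first, the old one is kept for validation.
during_migration = [NEW, OLD]
# Without multi-issuer support: only the new issuer is configured.
single_issuer = [NEW]
```

With during_migration, a token minted under OLD is still accepted while new tokens carry NEW. With single_issuer, the same OLD token is rejected, which is the failure described in the problem statement.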

Use Cases

  1. Load Balancer Migration: Moving from one load balancer to another
  2. DNS Changes: Updating cluster endpoint DNS without service interruption
  3. Multi-Region Setup: Supporting multiple endpoint URLs for the same cluster
  4. Certificate Rotation: Changing endpoint certificates with different CN/SANs
  5. Infrastructure Migration: Moving control plane infrastructure

Benefits

  • Zero-downtime endpoint migrations
  • Improved operational safety during infrastructure changes
  • Alignment with Kubernetes best practices
  • Enhanced cluster reliability during maintenance operations

Implementation Considerations

  • Maintain backward compatibility with current single endpoint configuration
  • Ensure proper validation of issuer URLs (HTTPS, OIDC compliance)
  • Consider configuration precedence (explicit vs. derived from controlPlane.endpoint)
  • Integration with existing service account key management
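One possible shape for the precedence and validation rules, sketched in Python under stated assumptions: an explicit serviceAccountIssuers list (the field proposed in Option 1) takes precedence, the issuer falls back to the control plane endpoint when the list is absent, and every issuer must be an HTTPS URL. The function names and the exact validation rules are hypothetical and would be up to the Talos maintainers.

```python
from typing import Optional, Sequence
from urllib.parse import urlsplit


def validate_issuer(url: str) -> None:
    """Raise ValueError unless url is an HTTPS URL with a host and no query or fragment."""
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.hostname:
        raise ValueError(f"issuer must be an HTTPS URL with a host: {url!r}")
    if parts.query or parts.fragment:
        raise ValueError(f"issuer must not contain a query or fragment: {url!r}")


def resolve_issuers(
    endpoint: str,
    explicit_issuers: Optional[Sequence[str]] = None,
) -> list[str]:
    """Return the effective issuer list.

    An explicit, non-empty list wins. Otherwise the single issuer is derived
    from the control plane endpoint, which keeps existing configs working.
    """
    issuers = list(explicit_issuers) if explicit_issuers else [endpoint]
    for issuer in issuers:
        validate_issuer(issuer)
    if len(set(issuers)) != len(issuers):
        raise ValueError("duplicate issuers are not allowed")
    return issuers
```

A configuration with no explicit list resolves to a single issuer equal to the endpoint, so existing clusters behave as they do today.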

This feature would significantly improve operational workflows for Talos clusters by eliminating forced downtime during common infrastructure operations while following established Kubernetes patterns.
