
Klusterlet bootstrap fails with TLS handshake timeout when cluster network MTU mismatches hub MTU #1314

@agno01

Description

Describe the bug

Klusterlet bootstrap fails to complete when the managed cluster's network MTU is higher than the hub cluster's network MTU. During bootstrap, the SelfSubjectAccessReview API calls produce TLS handshake packets that exceed the hub's path MTU; those packets are silently dropped, and the klusterlet fails with net/http: TLS handshake timeout errors.

Notably, post-registration communication works fine even with the MTU mismatch because those API calls generate smaller packets that fit within the MTU constraint.

To Reproduce

Steps to reproduce the behavior:

  1. Deploy a hub cluster with cluster network MTU 1400
  2. Deploy a managed cluster with cluster network MTU 8900 (jumbo frames)
  3. Create a ManagedCluster resource and apply the import YAML to the managed cluster
  4. Observe klusterlet bootstrap failure with TLS handshake timeout
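
The mismatch can be confirmed before applying the import YAML. A minimal sketch, assuming `hub` and `managed` kubeconfig contexts (placeholder names) and that both clusters report the OpenShift network operator's status field:

```shell
# Hedged sketch: read the effective cluster network MTU on both sides.
# status.clusterNetworkMTU is reported by the OpenShift network operator;
# the context names "hub" and "managed" are placeholders.
oc --context hub get network.config cluster \
  -o jsonpath='{.status.clusterNetworkMTU}{"\n"}'     # expected: 1400
oc --context managed get network.config cluster \
  -o jsonpath='{.status.clusterNetworkMTU}{"\n"}'     # expected: 8900
```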

The error appears in the klusterlet status:

conditions:
  - type: HubConnectionDegraded
    status: "True"
    reason: BootstrapSecretError
    message: "Failed to create SelfSubjectAccessReview with bootstrap secret: Post \"https://api.hub:6443/apis/authorization.k8s.io/v1/selfsubjectaccessreviews\": net/http: TLS handshake timeout"

Expected behavior

Klusterlet should do one of the following:

  1. Successfully complete bootstrap despite MTU difference (by fragmenting packets or using smaller TLS handshake packets), OR
  2. Provide a clear error message indicating MTU mismatch as the likely cause, OR
  3. Validate and warn about MTU mismatch during cluster import

Environment (i.e. OCM version, Kubernetes version and provider):

Hub Cluster:

  • Platform: Red Hat OpenShift Container Platform (OCP) 4.19.2
  • Kubernetes: v1.32.5
  • RHACM: 2.14.1
  • Multicluster Engine (MCE): 2.9.1
  • Cluster Network MTU: 1400
  • Machine MTU: 1500

Managed Cluster:

  • Platform: OKD 4.20.0-okd-scos.15
  • Kubernetes: v1.33.6
  • Deployment: Bare metal
  • Cluster Network MTU: 8900
  • Machine MTU: 9000

Additional context

Diagnostic Evidence:

  1. Network connectivity verified - curl to hub API succeeds (small packets work)
  2. Bootstrap fails - Go client-go TLS handshake times out (large packets dropped)
  3. MTU mismatch confirmed between clusters (8900 vs 1400)
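
The path MTU toward the hub can be verified from a managed-cluster node with a Don't-Fragment ping, a standard Linux iputils technique. The sizes below are the ICMP payload: MTU minus 20 bytes of IPv4 header and 8 bytes of ICMP header; "api.hub" is the hostname from the error message above and should be replaced with the real hub API host:

```shell
# Hedged sketch: probe the path MTU toward the hub API endpoint.
# 1372 = 1400 - 28; 8872 = 8900 - 28.
ping -c 3 -M do -s 1372 api.hub   # should succeed if the path MTU is >= 1400
ping -c 3 -M do -s 8872 api.hub   # expected to fail when the path MTU is 1400
```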

Workaround:

Temporarily lower the managed cluster MTU to match the hub during bootstrap:

# On managed cluster - lower MTU to 1400
oc patch Network.operator.openshift.io cluster --type=merge --patch '{
  "spec": {
    "migration": {
      "mtu": {
        "network": {"from": 8900, "to": 1400},
        "machine": {"to": 1500}
      }
    }
  }
}'

# Wait for rolling reboot and bootstrap completion (~5-10 minutes)

# After successful registration, raise MTU back to 8900
oc patch Network.operator.openshift.io cluster --type=merge --patch '{
  "spec": {
    "migration": {
      "mtu": {
        "network": {"from": 1400, "to": 8900},
        "machine": {"to": 9000}
      }
    }
  }
}'

Post-registration communication continues to work fine at MTU 8900.

Root Cause:

During the bootstrap phase, the klusterlet makes SelfSubjectAccessReview API calls that generate TLS handshake packets larger than 1400 bytes. When the hub cluster's network path MTU is 1400, these packets are silently dropped by intermediate network equipment, causing the client-go library to time out after 30 seconds.

After successful registration, klusterlet uses client certificates and makes different API calls (lease updates, status updates) that generate smaller packets fitting within the MTU constraint, which is why post-registration communication works fine.
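
Back-of-envelope numbers (the chain size below is assumed for illustration, not measured) show why the first handshake flight spans several full-MTU packets while steady-state traffic does not:

```shell
# Hedged illustration: an API server's certificate chain in the first TLS
# flight is often several KB (12000 bytes assumed here).
# With a 1400-byte path MTU and ~60 bytes of IP/TCP overhead per segment,
# the flight is split across multiple full-size packets; losing any one
# of them stalls the handshake until the client-side timeout.
chain_bytes=12000
payload_per_pkt=$((1400 - 60))
pkts=$(( (chain_bytes + payload_per_pkt - 1) / payload_per_pkt ))
echo "${pkts} full-MTU segments needed"   # -> 9 for these assumed numbers
```

By contrast, a lease or status update fits in one or two packets, which matches the observed post-registration behavior.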

Testing:

This was confirmed through systematic testing with 5 complete MTU migrations and rolling reboots:

  • Bootstrap consistently fails at MTU 8900 (reproducible)
  • Bootstrap consistently succeeds at MTU 1400 (reproducible)
  • Post-registration communication consistently works at MTU 8900 (reproducible)

Impact:

  • Affects any multi-cluster deployment with heterogeneous MTU configurations
  • Common in mixed cloud/on-prem environments where cloud defaults to MTU 1500 and on-prem uses jumbo frames (MTU 9000)
  • Silent failure makes root cause extremely difficult to diagnose without deep networking knowledge

Suggested Fixes:

  1. Documentation: Add MTU requirements to cluster registration/import documentation
  2. Validation: Check for MTU mismatch during import and warn users proactively
  3. Resilience: Implement packet fragmentation or reduce bootstrap TLS packet sizes
  4. Error messaging: Detect timeout patterns during bootstrap and suggest MTU investigation in error messages
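
Fix 2 could be as simple as a preflight comparison during import. A minimal sketch (hypothetical script, not part of OCM; the MTU values are hard-coded here but would come from each cluster's network status):

```shell
# Hedged sketch of an import-time MTU preflight check (hypothetical).
hub_mtu=1400     # e.g. the hub's status.clusterNetworkMTU
spoke_mtu=8900   # e.g. the managed cluster's status.clusterNetworkMTU
if [ "$spoke_mtu" -gt "$hub_mtu" ]; then
  echo "WARNING: managed-cluster MTU ($spoke_mtu) exceeds hub MTU ($hub_mtu);" >&2
  echo "bootstrap TLS handshakes may be silently dropped en route to the hub." >&2
fi
```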

Related (what led us to discover the issue):

Metadata

Labels: bug (Something isn't working)