64 changes: 64 additions & 0 deletions content/en/blog/_posts/2026/ccm-metric-route-sync-total.md
@@ -0,0 +1,64 @@
---
layout: blog
title: "Kubernetes v1.36: New Metric for Route Sync in the Cloud Controller Manager"
date: 2026-02-26
slug: ccm-new-metric-route-sync-total
author: >
[Lukas Metzner](https://github.com/lukasmetzner) (Hetzner)
---

Kubernetes v1.36 adds a new alpha counter metric, `route_controller_route_sync_total`,
to the Cloud Controller Manager (CCM) route controller implementation in
[`k8s.io/cloud-provider`](https://github.com/kubernetes/cloud-provider). The counter
increments each time routes are synced with the cloud provider.

## A/B testing watch-based route reconciliation

This metric was added to help operators validate the
`CloudControllerManagerWatchBasedRoutesReconciliation` feature gate introduced in
[Kubernetes v1.35](/blog/2025/12/30/kubernetes-v1-35-watch-based-route-reconciliation-in-ccm/).
That feature gate switches the route controller from a fixed-interval loop to a watch-based
approach that only reconciles when nodes actually change. This reduces unnecessary API calls
to the infrastructure provider, lowering pressure on rate-limited APIs and allowing operators
to make more efficient use of their available quota.

To A/B test this, compare `route_controller_route_sync_total` with the feature gate
Contributor

Might be good to have an example of what the metric may look like and how to query and poke it in a running cluster? What's the rate of change usually with and without this feature enabled?

This seems like the metric should stay steady, maybe show that when we have the feature disabled it increments steadily and on the other side the metric should stay still until we update the routes.

Contributor Author

@michaelasp I have added a small example outlining the expected behavior of the metric with the feature gate enabled and disabled.

As this metric is part of the k8s.io/cloud-provider library, giving a concrete example about how to query this is not feasible here. This depends on the concrete cloud-controller-manager implementation of a cloud-provider.

disabled (default) versus enabled. In clusters where node changes are infrequent, you should
see a significant drop in the sync rate with the feature gate turned on.

### Example: expected behavior

**With the feature gate disabled** (the default fixed-interval loop), the counter increments
steadily regardless of whether any node changes occurred:

```
# After 10 minutes with no node changes
route_controller_route_sync_total 60
# After 20 minutes, still no node changes
route_controller_route_sync_total 120
```

**With the feature gate enabled** (watch-based reconciliation), the counter only increments
when nodes are actually added, removed, or updated:

```
# After 10 minutes with no node changes
route_controller_route_sync_total 1
# After 20 minutes, still no node changes — counter unchanged
route_controller_route_sync_total 1
# A new node joins the cluster — counter increments
route_controller_route_sync_total 2
```

The difference is especially visible in stable clusters where nodes rarely change.
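
To turn two readings of the counter into a comparable sync rate, a small script can be enough. The following is a minimal sketch, not part of `k8s.io/cloud-provider`: the helper names `read_counter` and `syncs_per_minute` are hypothetical, the input is assumed to be Prometheus text-format output, and the sample values are the illustrative ones from the example above rather than measurements. How you obtain the metrics text depends on your cloud-controller-manager implementation.

```python
METRIC = "route_controller_route_sync_total"


def read_counter(exposition_text: str, name: str = METRIC) -> float:
    """Sum every sample of `name` found in Prometheus text-format output."""
    total = 0.0
    found = False
    for raw in exposition_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE/comment lines
        if "{" in line:
            # Labeled sample, for example: name{label="value"} 12
            metric, _, rest = line.partition("{")
            rest = rest.rpartition("}")[2]
        else:
            # Unlabeled sample, for example: name 12
            metric, _, rest = line.partition(" ")
        if metric.strip() == name:
            total += float(rest.split()[0])
            found = True
    if not found:
        raise ValueError(f"metric {name!r} not found")
    return total


def syncs_per_minute(before: float, after: float, elapsed_seconds: float) -> float:
    """Average number of route syncs per minute between two scrapes."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (after - before) / elapsed_seconds * 60


# Illustrative samples copied from the example above, taken 10 minutes apart.
gate_disabled = (
    "route_controller_route_sync_total 60",
    "route_controller_route_sync_total 120",
)
gate_enabled = (
    "route_controller_route_sync_total 1",
    "route_controller_route_sync_total 1",
)

for label, (first, second) in (("disabled", gate_disabled), ("enabled", gate_enabled)):
    rate = syncs_per_minute(read_counter(first), read_counter(second), 600)
    print(f"feature gate {label}: {rate:.1f} syncs/minute")
```

With these sample values the script reports 6.0 syncs per minute with the feature gate disabled and 0.0 with it enabled. On a real cluster, take both scrapes over the same time window and under comparable node churn so the two rates can be compared fairly.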

## Where can I give feedback?

If you have feedback, feel free to reach out through any of the following channels:
- The [#sig-cloud-provider](https://kubernetes.slack.com/messages/sig-cloud-provider) channel on [Kubernetes Slack](https://slack.k8s.io/)
- The [KEP-5237 issue](https://kep.k8s.io/5237) on GitHub
- The [SIG Cloud Provider community page](https://github.com/kubernetes/community/tree/05223ecbd2d6f960edb40684dc83d053d49f8b68/sig-cloud-provider) for other communication channels

## How can I learn more?

For more details, refer to [KEP-5237](https://kep.k8s.io/5237).