Track Kubernetes Channels for latest versions #351

Open
wants to merge 7 commits into
base: main
24 changes: 17 additions & 7 deletions README.md
@@ -6,20 +6,29 @@
 ![GitHub go.mod Go version](https://img.shields.io/github/go-mod/go-version/jetstack/version-checker)

 version-checker is a Kubernetes utility for observing the current versions of
-images running in the cluster, as well as the latest available upstream. These
-checks get exposed as Prometheus metrics to be viewed on a dashboard, or _soft_
-alert cluster operators.
+images running in the cluster, as well as the latest available upstream. Additionally,
+it monitors the Kubernetes cluster version against the latest available releases
+using official Kubernetes release channels. These checks get exposed as Prometheus
+metrics to be viewed on a dashboard, or _soft_ alert cluster operators.

+## Features
+
+- **Container Image Version Checking**: Monitor and compare container image versions running in the cluster against their latest upstream versions
+- **Kubernetes Version Monitoring**: Track your cluster's Kubernetes version against the latest available releases from official Kubernetes channels
+- **Prometheus Metrics Integration**: Export all version information as Prometheus metrics for monitoring and alerting
+- **Flexible Channel Selection**: Configure which Kubernetes release channel to track (stable, latest, etc.)
+
 ---

 ## Why Use version-checker?

-- **Improved Security**: Ensures images are up-to-date, reducing the risk of using vulnerable or compromised versions.
-- **Enhanced Visibility**: Provides a clear overview of all running container versions across clusters.
-- **Operational Efficiency**: Automates image tracking and reduces manual intervention in version management.
-- **Compliance and Policy**: Enforcement: Helps maintain version consistency and adherence to organizational policies.
+- **Improved Security**: Ensures images and Kubernetes clusters are up-to-date, reducing the risk of using vulnerable or compromised versions.
+- **Enhanced Visibility**: Provides a clear overview of all running container versions and cluster versions across clusters.
+- **Operational Efficiency**: Automates image and Kubernetes version tracking and reduces manual intervention in version management.
+- **Compliance and Policy Enforcement**: Helps maintain version consistency and adherence to organizational policies for both applications and infrastructure.
 - **Incremental Upgrades**: Facilitates frequent, incremental updates to reduce the risk of large, disruptive upgrades.
 - **Add-On Compatibility**: Ensures compatibility with the latest versions of Kubernetes add-ons and dependencies.
+- **Proactive Cluster Management**: Stay informed about Kubernetes security updates and new features through automated version monitoring.

 ---

@@ -45,6 +54,7 @@ These registries support authentication.

 - [Installation Guide](docs/installation.md)
 - [Metrics](docs/metrics.md)
+- [New Features](docs/new_features.md)

 ---

23 changes: 20 additions & 3 deletions cmd/app/app.go
@@ -110,19 +110,36 @@ func NewCommand(ctx context.Context) *cobra.Command {
 				return fmt.Errorf("failed to setup image registry clients: %s", err)
 			}

-			c := controller.NewPodReconciler(opts.CacheTimeout,
+			_ = client
+
+			podController := controller.NewPodReconciler(opts.CacheTimeout,
 				metricsServer,
-				client,
 				mgr.GetClient(),
 				log,
 				opts.RequeueDuration,
 				opts.DefaultTestAll,
 			)

-			if err := c.SetupWithManager(mgr); err != nil {
+			if err := podController.SetupWithManager(mgr); err != nil {
 				return err
 			}

+			kubeController := controller.NewKubeReconciler(

Collaborator Author

I was wondering whether it's worth checking if opts.KubeChannel is empty and, if so, not starting the Kubernetes channel version reconciler... Maybe that's SetupWithManager's responsibility? wdyt?

+				log,
+				mgr.GetConfig(),
+				metricsServer,
+				opts.KubeInterval,
+				opts.KubeChannel,
+			)
+
+			// Only add to manager if controller was created (channel was specified)
+			if kubeController != nil {
+				if err := mgr.Add(kubeController); err != nil {
+					return err
+				}
+				log.WithField("channel", opts.KubeChannel).Info("Kubernetes version checking enabled")
+			}
+
 			// Start the manager and all controllers
 			log.Info("Starting controller manager")
 			if err := mgr.Start(ctx); err != nil {
14 changes: 13 additions & 1 deletion cmd/app/options.go
@@ -75,6 +75,10 @@ type Options struct {
 	CacheSyncPeriod time.Duration
 	RequeueDuration time.Duration

+	KubeChannel  string
+	KubeInterval time.Duration
+
+	// kubeConfigFlags holds the flags for the kubernetes client
 	kubeConfigFlags *genericclioptions.ConfigFlags

 	selfhosted selfhosted.Options
@@ -141,7 +145,15 @@ func (o *Options) addAppFlags(fs *pflag.FlagSet) {

 	fs.DurationVarP(&o.CacheSyncPeriod,
 		"cache-sync-period", "", 5*time.Hour,
-		"The time in which all resources should be updated.")
+		"The duration in which all resources should be updated.")
+
+	fs.DurationVarP(&o.KubeInterval,
+		"kube-interval", "", o.CacheSyncPeriod,
+		"The time in which kubernetes channels updates are checked.")
+
+	fs.StringVarP(&o.KubeChannel,
+		"kube-channel", "", "stable",
+		"The Kubernetes channel to check against for cluster updates.")

 	fs.DurationVarP(&o.GracefulShutdownTimeout,
 		"graceful-shutdown-timeout", "", 10*time.Second,
20 changes: 18 additions & 2 deletions docs/metrics.md
@@ -2,16 +2,32 @@

 By default, version-checker exposes the following Prometheus metrics on `0.0.0.0:8080/metrics`:

+## Container Image Metrics
+
 - `version_checker_is_latest_version`: Indicates whether the container in use is using the latest upstream registry version.
 - `version_checker_last_checked`: Timestamp when the image was last checked.
 - `version_checker_image_lookup_duration`: Duration of the image version check.
 - `version_checker_image_failures_total`: Total of errors encountered during image version checks.

+## Kubernetes Version Metrics
+
+- `version_checker_is_latest_kube_version`: Indicates whether the cluster is running the latest version from the configured Kubernetes release channel.
+  - Labels: `current_version`, `latest_version`, `channel`
+  - Value `1`: Cluster is up-to-date
+  - Value `0`: Update available
+
 ---

-## Example Prometheus Query
+## Example Prometheus Queries

+### Check container image versions
 ```sh
 QUERY="version_checker_is_latest_version"
 curl -s --get --data-urlencode query=$QUERY <PROMETHEUS_URL>
-```
+```
+
+### Check Kubernetes cluster version
+```sh
+QUERY="version_checker_is_latest_kube_version"
+curl -s --get --data-urlencode query=$QUERY <PROMETHEUS_URL>
+```
62 changes: 62 additions & 0 deletions docs/new_features.md
@@ -0,0 +1,62 @@
# Kubernetes Version Monitoring

version-checker now includes built-in Kubernetes cluster version monitoring capabilities. This feature automatically compares your cluster's current Kubernetes version against the latest available versions from official Kubernetes release channels.

### How It Works

The Kubernetes version checker:
- Fetches the current cluster version using the Kubernetes Discovery API
- Compares it against the latest version from the configured Kubernetes release channel (using official `https://dl.k8s.io/release/` endpoints)
- Exposes the comparison as Prometheus metrics for monitoring and alerting
- Strips metadata from versions for accurate semantic version comparison (e.g., `v1.28.2-gke.1` becomes `v1.28.2`)
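The metadata stripping in the last bullet could look roughly like the sketch below. `stripMetadata` is a hypothetical helper, not the function used in this PR, and it assumes everything from the first `-` or `+` onward is provider metadata.

```go
package main

import (
	"fmt"
	"strings"
)

// stripMetadata drops everything from the first '-' or '+' onward, so
// provider suffixes such as "-gke.1" do not affect the comparison.
// Assumption: this also drops genuine pre-release tags such as "-rc.1".
func stripMetadata(v string) string {
	if i := strings.IndexAny(v, "-+"); i >= 0 {
		return v[:i]
	}
	return v
}

func main() {
	for _, v := range []string{"v1.28.2-gke.1", "v1.28.2-eks-abc123", "v1.29.1"} {
		fmt.Println(v, "->", stripMetadata(v))
	}
}
```

Because a plain cut like this treats `v1.30.0-rc.1` the same as `v1.30.0`, channels that can return pre-releases (such as `latest`) may need a more careful rule.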

### Configuration

You can configure the Kubernetes version checking behavior using the following CLI flags:

- `--kube-channel`: Specifies which Kubernetes release channel to check against (default: `"stable"`)
  - Examples: `stable`, `latest`, `stable-1.28`, `latest-1.29`
- `--kube-interval`: How often to check for Kubernetes version updates (default: 5 hours, the default value of `--cache-sync-period`)

### Metrics

The Kubernetes version monitoring exposes the following Prometheus metric:

```
version_checker_is_latest_kube_version{current_version="1.28.2", latest_version="1.29.1", channel="stable"} 0
```

- Value `1`: Cluster is running the latest version from the specified channel
- Value `0`: Cluster is not running the latest version (update available)
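As an illustration of how the `1`/`0` value might be derived, here is a minimal sketch that compares versions numerically. `parseVersion` and `gaugeValue` are hypothetical names, and treating a cluster that is ahead of the channel as up to date is an assumption, not something the PR states.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion turns "v1.28.2" or "1.28.2" into [major, minor, patch].
func parseVersion(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("want major.minor.patch, got %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("bad component %q in %q: %w", p, v, err)
		}
		out[i] = n
	}
	return out, nil
}

// gaugeValue returns 1 when current is at or ahead of latest, otherwise 0.
func gaugeValue(current, latest string) (float64, error) {
	c, err := parseVersion(current)
	if err != nil {
		return 0, err
	}
	l, err := parseVersion(latest)
	if err != nil {
		return 0, err
	}
	// Compare component by component; the first difference decides.
	for i := range c {
		if c[i] != l[i] {
			if c[i] > l[i] {
				return 1, nil
			}
			return 0, nil
		}
	}
	return 1, nil
}

func main() {
	v, err := gaugeValue("1.28.2", "1.29.1")
	fmt.Println(v, err) // prints "0 <nil>"
}
```

Comparing the components as integers avoids the string-ordering trap where `1.9.0` would sort after `1.10.0`.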

### Supported Channels

version-checker uses official Kubernetes release channels:

- `stable` - Latest stable Kubernetes release (recommended)
- `latest` - Latest Kubernetes release (including pre-releases)
- `latest-1.28` - Latest patch for Kubernetes 1.28.x
- `latest-1.27` - Latest patch for Kubernetes 1.27.x

### Examples

```bash
# Check against latest stable Kubernetes
version-checker --kube-channel=stable

# Check against latest Kubernetes (including alpha/beta)
version-checker --kube-channel=latest

# Check against latest 1.28.x patch
version-checker --kube-channel=latest-1.28

# Monitor against a specific version channel with custom interval
version-checker --kube-channel=stable-1.28 --kube-interval=1h
```

### Managed Kubernetes Support

Provider-specific version suffixes are stripped before comparison, so managed Kubernetes services are supported, for example:
- **Amazon EKS**: Compares `v1.28.2-eks-abc123` against upstream `v1.28.2`
- **Google GKE**: Compares `v1.28.2-gke.1034000` against upstream `v1.28.2`
- **Azure AKS**: Compares `v1.28.2-aks-xyz789` against upstream `v1.28.2`
4 changes: 2 additions & 2 deletions pkg/client/fallback/fallback.go
@@ -56,9 +56,9 @@ func (c *Client) Tags(ctx context.Context, host, repo, image string) (tags []api

 		remaining := len(c.clients) - i - 1
 		if remaining == 0 {
-			c.log.Debugf("failed to lookup via %q, Giving up, no more clients", client.Name())
+			c.log.Infof("failed to lookup via %q, Giving up, no more clients", client.Name())
 		} else {
-			c.log.Debugf("failed to lookup via %q, continuing to search with %v clients remaining", client.Name(), remaining)
+			c.log.Infof("failed to lookup via %q, continuing to search with %v clients remaining", client.Name(), remaining)
 		}
 	}
