🐛 Fix KEDA APIService race condition in nightly E2E #761
KEDA operator continuously reconciles v1beta1.external.metrics.k8s.io APIService back to keda-metrics-apiserver within ~2 minutes of any patch. This causes persistent 404 errors when HPA queries external metrics for wva_desired_replicas, breaking scale-up tests.

Add a background APIService guard loop that checks every 10 seconds and re-patches the APIService to prometheus-adapter if KEDA reclaims it. The guard starts when KEDA is detected during deploy and stops cleanly during cleanup, without modifying KEDA itself.

Also increase external metrics and scale-to-zero test timeouts from 5 to 10 minutes for additional margin on busy clusters.

Signed-off-by: Andrew Anderson <andy@clubanderson.com>
GPU Pre-flight Check ✅ GPUs are available for e2e-openshift tests. Proceeding with deployment.
Pull request overview
This PR aims to stabilize OpenShift nightly E2E runs on clusters with KEDA installed by preventing the v1beta1.external.metrics.k8s.io APIService from being reconciled away from prometheus-adapter, and by extending E2E timeouts to tolerate slower external-metrics readiness / scale-to-zero convergence.
Changes:
- Add a background “APIService guard” loop in `deploy/install.sh` that periodically re-patches `v1beta1.external.metrics.k8s.io` back to `prometheus-adapter`.
- Increase the external-metrics accessibility assertion timeout in the ShareGPT scale-up OpenShift E2E test from 5m to 10m.
- Increase the scale-to-zero “recommend zero replicas” polling window from 5m to 10m.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `deploy/install.sh` | Adds a background guard process intended to keep external-metrics APIService pointed at Prometheus Adapter and stops it during undeploy/cleanup paths. |
| `test/e2e-openshift/sharegpt_scaleup_test.go` | Extends external metrics API availability polling timeout to reduce flakes. |
| `test/e2e-openshift/scale_to_zero_test.go` | Extends scale-to-zero polling deadline to accommodate slower environments. |

    fi

    # Start background guard to prevent KEDA from reclaiming the APIService.
    # KEDA's operator continuously reconciles the APIService back to its own
    # metrics server within ~2 minutes of any patch. The guard re-patches it
    # every 10 seconds without modifying KEDA itself.
    start_apiservice_guard "$MONITORING_NAMESPACE"

start_apiservice_guard is started whenever the APIService exists, even if it already points at prometheus-adapter/$MONITORING_NAMESPACE. If the guard is only intended to mitigate KEDA’s reconciliation, consider starting it only when the APIService is observed pointing somewhere else (or when it’s specifically pointing to keda-metrics-apiserver) to avoid running an always-on background loop on clusters that don’t need it.

    - fi
    - # Start background guard to prevent KEDA from reclaiming the APIService.
    - # KEDA's operator continuously reconciles the APIService back to its own
    - # metrics server within ~2 minutes of any patch. The guard re-patches it
    - # every 10 seconds without modifying KEDA itself.
    - start_apiservice_guard "$MONITORING_NAMESPACE"
    + # Start background guard to prevent KEDA from reclaiming the APIService.
    + # KEDA's operator continuously reconciles the APIService back to its own
    + # metrics server within ~2 minutes of any patch. The guard re-patches it
    + # every 10 seconds without modifying KEDA itself.
    + start_apiservice_guard "$MONITORING_NAMESPACE"
    + fi

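
One way to act on this suggestion is to record where the APIService points before it is patched, and start the guard only if something other than the adapter owned it. The sketch below is an illustration under stated assumptions, not code from this PR: `guard_needed` is a hypothetical helper, and treating an empty service name as a failed lookup (rather than as drift) is an assumption.

```shell
#!/usr/bin/env bash
# Sketch only: guard_needed is a hypothetical helper, not part of install.sh.

# Decide whether the guard is needed, given the owner observed BEFORE patching.
# Arguments: observed service name, observed namespace, expected namespace.
# Returns 0 when the guard should be started, 1 otherwise.
guard_needed() {
    local observed_svc="$1" observed_ns="$2" monitoring_ns="$3"
    # Assumption: an empty name means the lookup failed, which by itself
    # is not evidence that another controller owns the APIService.
    [ -n "$observed_svc" ] || return 1
    [ "$observed_svc" != "prometheus-adapter" ] || [ "$observed_ns" != "$monitoring_ns" ]
}
```

In `deploy/install.sh` this could wrap the existing call, for example `if guard_needed "$svc_before" "$ns_before" "$MONITORING_NAMESPACE"; then start_apiservice_guard "$MONITORING_NAMESPACE"; fi`, where `svc_before` and `ns_before` are hypothetical variables captured before the patch is applied.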

    # Start background guard to prevent KEDA from reclaiming the APIService.
    # KEDA's operator continuously reconciles the APIService back to its own
    # metrics server within ~2 minutes of any patch. The guard re-patches it
    # every 10 seconds without modifying KEDA itself.
    start_apiservice_guard "$MONITORING_NAMESPACE"

start_apiservice_guard is started here, but the normal deployment path never calls stop_apiservice_guard before main exits. In a non-interactive script, this background loop can outlive the script and continue patching the cluster unexpectedly (and may interfere with later CI cleanup that deletes the adapter/namespace). Consider giving the guard an explicit lifecycle (e.g., start only for E2E runs and add a corresponding stop action invoked by CI cleanup, or stop it automatically on script exit when persistence isn’t required).
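
The "stop it automatically on script exit" option could be sketched with an `EXIT` trap. This is a minimal sketch, not the PR's implementation: `GUARD_PERSIST` is a hypothetical opt-in variable for runs that want the guard to outlive the script, and the `stop_apiservice_guard` body here is a stand-in for the real function in `deploy/install.sh`.

```shell
#!/usr/bin/env bash
# Sketch only: GUARD_PERSIST is a hypothetical flag, and this
# stop_apiservice_guard is a stand-in for the function in install.sh.

APISERVICE_GUARD_PID=""

# Stop the guard if it is still running, then clear the recorded PID.
stop_apiservice_guard() {
    if [ -n "$APISERVICE_GUARD_PID" ] && kill -0 "$APISERVICE_GUARD_PID" 2>/dev/null; then
        kill "$APISERVICE_GUARD_PID" 2>/dev/null || true
        # wait only succeeds for children of this shell; ignore failures.
        wait "$APISERVICE_GUARD_PID" 2>/dev/null || true
    fi
    APISERVICE_GUARD_PID=""
}

# Register the exit hook unless the caller asked for the guard to persist.
install_guard_exit_hook() {
    if [ "${GUARD_PERSIST:-false}" != "true" ]; then
        trap stop_apiservice_guard EXIT
    fi
}
```

With this shape, an E2E run that needs the guard to stay up between deploy and test steps would set `GUARD_PERSIST=true` and rely on the CI cleanup step to stop it explicitly.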

    APISERVICE_GUARD_PID=$!
    echo "$APISERVICE_GUARD_PID" > /tmp/apiservice-guard.pid
    log_success "APIService guard started (PID: $APISERVICE_GUARD_PID)"

The guard PID is written to a fixed path (/tmp/apiservice-guard.pid). This can collide across concurrent runs on the same runner and can become stale; since PIDs are reused, later stop_apiservice_guard invocations risk killing an unrelated process. Consider making the pidfile unique per run (e.g., include $$ / a unique suffix) and validating the process identity before killing.
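
A per-run pidfile plus an identity check might look like the sketch below. The helper names and the `<pid> <start-time>` pidfile format are hypothetical, and the start-time lookup assumes a Linux `/proc` filesystem, which is an assumption about the CI runners rather than something the PR states.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and pidfile format are hypothetical, and the
# start-time lookup assumes a Linux /proc filesystem.

# Print the kernel start time (field 22 of /proc/<pid>/stat) for a PID.
proc_start_time() {
    local stat rest
    stat=$(cat "/proc/$1/stat" 2>/dev/null) || return 1
    rest=${stat##*') '}   # drop "pid (comm) "; comm may itself contain spaces
    set -- $rest          # remaining fields begin at field 3 (state)
    echo "${20}"          # 20th remaining field is field 22 overall
}

# Create a unique pidfile, record "<pid> <start-time>", and print its path.
write_guard_pidfile() {
    local pid="$1" pidfile
    pidfile=$(mktemp "${TMPDIR:-/tmp}/apiservice-guard.XXXXXX") || return 1
    printf '%s %s\n' "$pid" "$(proc_start_time "$pid")" > "$pidfile"
    echo "$pidfile"
}

# Kill the recorded process only if its start time still matches, so a
# recycled PID belonging to an unrelated process is left alone.
stop_guard_from_pidfile() {
    local pidfile="$1" pid started
    [ -f "$pidfile" ] || return 0
    read -r pid started < "$pidfile"
    if [ -n "$pid" ] && [ -n "$started" ] && [ "$(proc_start_time "$pid")" = "$started" ]; then
        kill "$pid" 2>/dev/null || true
    fi
    rm -f "$pidfile"
}
```

The pidfile path returned by `write_guard_pidfile` would need to be passed to the cleanup step (for example through an environment variable), since it is no longer a fixed location.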

    current_ns=$(kubectl get apiservice v1beta1.external.metrics.k8s.io \
        -o jsonpath='{.spec.service.namespace}' 2>/dev/null || echo "")
    if [ "$current_svc" != "prometheus-adapter" ] || [ "$current_ns" != "$monitoring_ns" ]; then
        echo "[apiservice-guard] KEDA reclaimed APIService (now: $current_svc/$current_ns), re-patching to prometheus-adapter/$monitoring_ns"

This log line hard-codes “KEDA reclaimed APIService”, but the condition triggers for any mismatch (e.g., a different external-metrics provider or an empty service/namespace when the APIService is temporarily unavailable). Consider making the message provider-agnostic (e.g., “APIService drift detected”) and include the observed service/namespace without attributing it to KEDA unless KEDA is explicitly detected.

    - echo "[apiservice-guard] KEDA reclaimed APIService (now: $current_svc/$current_ns), re-patching to prometheus-adapter/$monitoring_ns"
    + echo "[apiservice-guard] APIService drift detected: v1beta1.external.metrics.k8s.io now points to $current_svc/$current_ns; re-patching to prometheus-adapter/$monitoring_ns"

Summary
- Add a background APIService guard that re-patches `v1beta1.external.metrics.k8s.io` every 10s if KEDA reclaims it
- The guard starts in `deploy_prometheus_adapter()`, stops during cleanup
- [...] `prometheus-adapter`

Problem
KEDA operator continuously reconciles the `v1beta1.external.metrics.k8s.io` APIService back to `keda-metrics-apiserver` within ~2 minutes. The existing one-shot patch in `install.sh` gets overwritten, causing the HPA to receive 404 errors when querying `wva_desired_replicas`. This makes the nightly E2E tests fail intermittently on OCP clusters with KEDA installed.

KEDA has no built-in option to disable external metrics registration (issue #470, open since 2019).
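
To illustrate why a repeated check helps where a one-shot patch does not, the guard's check-and-repatch step can be sketched as below. This is a rough sketch rather than the code in `deploy/install.sh`: the function names, the `GUARD_INTERVAL` variable, and the merge-patch payload are assumptions, and the real patch may set additional fields.

```shell
#!/usr/bin/env bash
# Sketch only: function names, GUARD_INTERVAL, and the merge-patch payload
# are assumptions; the actual patch in deploy/install.sh may differ.

APISERVICE="v1beta1.external.metrics.k8s.io"

# Check the APIService once and re-patch it if it no longer points to
# prometheus-adapter in the expected namespace.
guard_once() {
    local monitoring_ns="$1" svc ns
    svc=$(kubectl get apiservice "$APISERVICE" \
        -o jsonpath='{.spec.service.name}' 2>/dev/null || echo "")
    ns=$(kubectl get apiservice "$APISERVICE" \
        -o jsonpath='{.spec.service.namespace}' 2>/dev/null || echo "")
    if [ "$svc" != "prometheus-adapter" ] || [ "$ns" != "$monitoring_ns" ]; then
        echo "[apiservice-guard] drift detected (now: $svc/$ns), re-patching to prometheus-adapter/$monitoring_ns"
        kubectl patch apiservice "$APISERVICE" --type=merge \
            -p "{\"spec\":{\"service\":{\"name\":\"prometheus-adapter\",\"namespace\":\"$monitoring_ns\"}}}" \
            >/dev/null 2>&1 || echo "[apiservice-guard] patch failed, will retry"
    fi
}

# Repeat the check on a fixed interval; intended to run in the background.
run_apiservice_guard() {
    local monitoring_ns="$1"
    while true; do
        guard_once "$monitoring_ns"
        sleep "${GUARD_INTERVAL:-10}"
    done
}
```

With a 10-second interval and a reconcile period of roughly 2 minutes, the window in which the HPA can see the wrong backend shrinks to about one interval per reconcile, though it does not disappear entirely.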
Test plan