Memory leak caused by high cardinality Prometheus metrics with user emails in path labels #1559
Description
Report
The KEDA HTTP Add-on interceptor is experiencing a severe memory leak caused by high-cardinality Prometheus metrics. The interceptor_request_count_total metric includes the full request path as a label, and our request paths contain unique user email addresses. This creates thousands (potentially millions) of unique time series that are never released from memory, eventually leading to OOM (Out of Memory) conditions.
Our interceptor pod approached its memory limit within hours of operation, and analysis showed that the /metrics endpoint exposed on port 2223 contained thousands of metric entries with unique email addresses in the path label.
Expected Behavior
- Prometheus metrics should use low cardinality labels to prevent memory exhaustion
- The path label should be normalized (e.g., /user/:email instead of /user/actual@email.com)
- Memory usage should remain stable over time regardless of the number of unique users
- Metrics should follow Prometheus best practices for label cardinality
Actual Behavior
- Each unique user email creates a new time series in memory
- Memory consumption grows continuously and is never released
- The pod eventually reaches its memory limit and requires restart
- Metrics endpoint shows thousands of entries like:
interceptor_request_count_total{code="400",host="example-app.example.com",method="GET",otel_scope_name="keda-interceptor-proxy",otel_scope_schema_url="",otel_scope_version="",path="/user/john.doe@example.com"} 2
interceptor_request_count_total{code="400",host="example-app.example.com",method="GET",otel_scope_name="keda-interceptor-proxy",otel_scope_schema_url="",otel_scope_version="",path="/user/jane.smith@example.com"} 1
interceptor_request_count_total{code="400",host="example-app.example.com",method="GET",otel_scope_name="keda-interceptor-proxy",otel_scope_schema_url="",otel_scope_version="",path="/user/bob.johnson@example.com"} 1
interceptor_request_count_total{code="400",host="example-app.example.com",method="GET",otel_scope_name="keda-interceptor-proxy",otel_scope_schema_url="",otel_scope_version="",path="/user/alice.williams@example.com"} 1
interceptor_request_count_total{code="400",host="example-app.example.com",method="GET",otel_scope_name="keda-interceptor-proxy",otel_scope_schema_url="",otel_scope_version="",path="/user/charlie.brown@example.com"} 1
interceptor_request_count_total{code="400",host="example-app.example.com",method="GET",otel_scope_name="keda-interceptor-proxy",otel_scope_schema_url="",otel_scope_version="",path="/user/david.miller@example.com"} 3
interceptor_request_count_total{code="400",host="example-app.example.com",method="GET",otel_scope_name="keda-interceptor-proxy",otel_scope_schema_url="",otel_scope_version="",path="/user/emma.davis@example.com"} 1
interceptor_request_count_total{code="400",host="example-app.example.com",method="GET",otel_scope_name="keda-interceptor-proxy",otel_scope_schema_url="",otel_scope_version="",path="/user/frank.wilson@example.com"} 2
Steps to Reproduce the Problem
- Deploy KEDA HTTP Add-on with the interceptor proxy
- Configure an HTTPScaledObject for an application that has endpoints with dynamic path parameters (e.g., /user/{email})
- Generate traffic with multiple unique users (hundreds or thousands of different email addresses)
- Monitor memory usage of the interceptor pod:
  kubectl top pod <interceptor-pod-name>
- Check the number of per-path series on the metrics endpoint:
  kubectl exec <interceptor-pod-name> -- curl http://localhost:2223/metrics | grep "path=" | wc -l
- Observe that:
- The number of metric lines grows continuously
- Memory usage increases over time
- Each unique path creates a new time series
Logs from KEDA HTTP Add-on
2026-03-25T21:21:30Z INFO setup starting interceptor {"timeoutConfig": {"Connect":10000000000,"KeepAlive":1000000000,"ResponseHeader":10000000000,"WorkloadReplicas":20000000000,"ForceHTTP2":false,"MaxIdleConns":1000,"MaxIdleConnsPerHost":200,"IdleConnTimeout":90000000000,"TLSHandshakeTimeout":10000000000,"ExpectContinueTimeout":1000000000,"DialRetryTimeout":15000000000}, "servingConfig": {"CurrentNamespace":"keda","WatchNamespace":"","ProxyPort":8080,"AdminPort":9090,"CacheSyncPeriod":3600000000000,"EndpointsCachePollIntervalMS":1000,"ProxyTLSEnabled":false,"TLSCertPath":"/certs/tls.crt","TLSKeyPath":"/certs/tls.key","TLSCertStorePaths":"","TLSSkipVerify":false,"TLSPort":8443,"ProfilingAddr":"","EnableColdStartHeader":true,"LogRequests":false}, "metricsConfig": {"OtelPrometheusExporterEnabled":true,"OtelPrometheusExporterPort":2223,"OtelHTTPExporterEnabled":false}}
2026-03-25T21:21:30Z INFO setup Interceptor starting
2026-03-25T21:21:30Z INFO setup starting the controller-runtime cache
2026-03-25T21:21:30Z INFO Interceptor Version: 0.12.2
2026-03-25T21:21:30Z INFO Interceptor Commit: 480a41506b0922f427b22799a8885007026bc887
2026-03-25T21:21:30Z INFO Go Version: go1.25.5
2026-03-25T21:21:30Z INFO Go OS/Arch: linux/amd64
2026-03-25T21:21:30Z INFO setup starting the routing table
2026-03-25T21:21:30Z INFO setup starting the admin server {"port": 9090}
2026-03-25T21:21:30Z INFO runAdminServer.pkg.queue.AddCountsRoute adding queue counts route {"path": "/queue"}
2026-03-25T21:21:30Z INFO setup starting the endpoints cache
2026-03-25T21:21:30Z INFO runAdminServer admin server starting {"address": "0.0.0.0:9090"}
2026-03-25T21:21:30Z INFO setup starting the proxy server with TLS disabled {"port": 8080}
2026-03-25T21:21:30Z INFO proxy server starting {"address": "0.0.0.0:8080"}
2026-03-25T21:21:30Z INFO starting the prometheus metrics server {"port": 2223, "path": "/metrics"}
2026-03-26T06:48:19Z ERROR wait function failed, not forwarding request {"error": "context marked done while waiting for workload reach > 0 replicas: context deadline exceeded"}
main.BuildProxyHandler.newForwardingHandler.func3
github.com/kedacore/http-add-on/interceptor/proxy_handlers.go:76
net/http.HandlerFunc.ServeHTTP
net/http/server.go:2322
github.com/kedacore/http-add-on/interceptor/middleware.(*Counting).ServeHTTP
github.com/kedacore/http-add-on/interceptor/middleware/counting.go:35
github.com/kedacore/http-add-on/interceptor/middleware.(*Routing).ServeHTTP
github.com/kedacore/http-add-on/interceptor/middleware/routing.go:72
github.com/kedacore/http-add-on/interceptor/middleware.(*Metrics).ServeHTTP
github.com/kedacore/http-add-on/interceptor/middleware/metrics.go:24
net/http.serverHandler.ServeHTTP
net/http/server.go:3340
net/http.(*conn).serve
net/http/server.go:2109
HTTP Add-on Version
0.12.2
Kubernetes Version
1.32.4
Platform
Microsoft Azure
Would you be open to contributing a fix?
No
Anything else?
Suggested Solution
Implement path normalization in the metrics collection to replace dynamic segments with placeholders:
- Before: path="/user/email@example.com"
- After: path="/user/:email" or path="/user/{email}"
This is a common pattern in observability tools to prevent cardinality explosion while maintaining useful metrics.
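A minimal sketch of such normalization (stdlib only; this is an illustration of the idea, not the interceptor's actual code, and the email regex is an assumption about what counts as a dynamic segment): rewrite email-like path segments to a fixed placeholder before the path is attached as a metric label.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// emailSegment matches a whole path segment that looks like an email
// address (something@domain.tld). Illustrative heuristic only.
var emailSegment = regexp.MustCompile(`^[^/]+@[^/]+\.[^/]+$`)

// normalizePath replaces high-cardinality segments (here, emails) with
// a fixed placeholder so all such requests map to one time series.
func normalizePath(path string) string {
	segments := strings.Split(path, "/")
	for i, seg := range segments {
		if emailSegment.MatchString(seg) {
			segments[i] = ":email"
		}
	}
	return strings.Join(segments, "/")
}

func main() {
	fmt.Println(normalizePath("/user/john.doe@example.com")) // /user/:email
	fmt.Println(normalizePath("/healthz"))                   // /healthz
}
```

A more general fix could make the set of normalization patterns configurable, or drop the path attribute entirely from this metric, since both approaches bound the number of time series.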