
Memory leak caused by high cardinality Prometheus metrics with user emails in path labels #1559

@onemoreluis


Report

The KEDA HTTP Add-on interceptor is experiencing a severe memory leak caused by high-cardinality Prometheus metrics. The interceptor_request_count_total metric includes the full request path as a label; because our paths contain unique user email addresses, thousands or even millions of unique time series are created and never released from memory, eventually leading to out-of-memory (OOM) conditions.
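The mechanism can be illustrated with a simplified model (a toy sketch, not the add-on's actual code): a counter vector holds one entry per distinct label combination, so every unique path value allocates a new time series that is never freed.

```go
package main

import "fmt"

// labeledCounter is a toy model of a Prometheus counter vector:
// one map entry (one time series) per distinct label combination.
type labeledCounter struct {
	series map[string]int
}

func newLabeledCounter() *labeledCounter {
	return &labeledCounter{series: make(map[string]int)}
}

// Inc increments the series for the given label set, creating it on
// first use. Entries are never deleted, so memory grows with the
// number of distinct label values.
func (c *labeledCounter) Inc(method, path string) {
	c.series[method+" "+path]++
}

func main() {
	requests := newLabeledCounter()
	// Each unique email in the path creates a brand-new series.
	for _, email := range []string{"a@example.com", "b@example.com", "c@example.com"} {
		requests.Inc("GET", "/user/"+email)
	}
	fmt.Println("series held in memory:", len(requests.series))
}
```

With a bounded label (e.g. a route template) the map would stay at one entry per route, regardless of traffic.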


Our interceptor pod approached its memory limit within hours of operation, and analysis revealed that the /metrics endpoint exposed on port 2223 contained thousands of metric entries with unique email addresses in the path label.

Expected Behavior

  • Prometheus metrics should use low cardinality labels to prevent memory exhaustion
  • The path label should be normalized (e.g., /user/:email instead of /user/actual@email.com)
  • Memory usage should remain stable over time regardless of the number of unique users
  • Metrics should follow Prometheus best practices for label cardinality

Actual Behavior

  • Each unique user email creates a new time series in memory
  • Memory consumption grows continuously and is never released
  • The pod eventually reaches its memory limit and requires restart
  • Metrics endpoint shows thousands of entries like:
interceptor_request_count_total{code="400",host="example-app.example.com",method="GET",otel_scope_name="keda-interceptor-proxy",otel_scope_schema_url="",otel_scope_version="",path="/user/john.doe@example.com"} 2
interceptor_request_count_total{code="400",host="example-app.example.com",method="GET",otel_scope_name="keda-interceptor-proxy",otel_scope_schema_url="",otel_scope_version="",path="/user/jane.smith@example.com"} 1
interceptor_request_count_total{code="400",host="example-app.example.com",method="GET",otel_scope_name="keda-interceptor-proxy",otel_scope_schema_url="",otel_scope_version="",path="/user/bob.johnson@example.com"} 1
interceptor_request_count_total{code="400",host="example-app.example.com",method="GET",otel_scope_name="keda-interceptor-proxy",otel_scope_schema_url="",otel_scope_version="",path="/user/alice.williams@example.com"} 1
interceptor_request_count_total{code="400",host="example-app.example.com",method="GET",otel_scope_name="keda-interceptor-proxy",otel_scope_schema_url="",otel_scope_version="",path="/user/charlie.brown@example.com"} 1
interceptor_request_count_total{code="400",host="example-app.example.com",method="GET",otel_scope_name="keda-interceptor-proxy",otel_scope_schema_url="",otel_scope_version="",path="/user/david.miller@example.com"} 3
interceptor_request_count_total{code="400",host="example-app.example.com",method="GET",otel_scope_name="keda-interceptor-proxy",otel_scope_schema_url="",otel_scope_version="",path="/user/emma.davis@example.com"} 1
interceptor_request_count_total{code="400",host="example-app.example.com",method="GET",otel_scope_name="keda-interceptor-proxy",otel_scope_schema_url="",otel_scope_version="",path="/user/frank.wilson@example.com"} 2

Steps to Reproduce the Problem

  1. Deploy KEDA HTTP Add-on with the interceptor proxy
  2. Configure an HTTPScaledObject for an application that has endpoints with dynamic path parameters (e.g., /user/{email})
  3. Generate traffic with multiple unique users (hundreds or thousands of different email addresses)
  4. Monitor memory usage of the interceptor pod:
    kubectl top pod <interceptor-pod-name>
  5. Check the metrics endpoint:
    kubectl exec <interceptor-pod-name> -- curl http://localhost:2223/metrics | grep "path=" | wc -l
  6. Observe that:
  • The number of metric lines grows continuously
  • Memory usage increases over time
  • Each unique path creates a new time series
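As a rough check, the growth in step 5 can also be measured programmatically by counting distinct path label values in a /metrics scrape. This sketch parses exposition-format text with a regular expression; in practice the input would come from fetching http://localhost:2223/metrics (the port and label name are taken from the report above), but a literal string keeps the example self-contained.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// pathLabel matches the path="..." label in Prometheus exposition format.
var pathLabel = regexp.MustCompile(`path="([^"]*)"`)

// distinctPaths returns the number of unique path label values in a
// metrics scrape. A steadily growing result under user traffic
// indicates unbounded label cardinality.
func distinctPaths(metrics string) int {
	seen := make(map[string]struct{})
	for _, line := range strings.Split(metrics, "\n") {
		if m := pathLabel.FindStringSubmatch(line); m != nil {
			seen[m[1]] = struct{}{}
		}
	}
	return len(seen)
}

func main() {
	scrape := `interceptor_request_count_total{code="400",path="/user/a@example.com"} 2
interceptor_request_count_total{code="400",path="/user/b@example.com"} 1
interceptor_request_count_total{code="200",path="/user/a@example.com"} 5`
	fmt.Println("distinct paths:", distinctPaths(scrape))
}
```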

Logs from KEDA HTTP Add-on

2026-03-25T21:21:30Z	INFO	setup	starting interceptor	{"timeoutConfig": {"Connect":10000000000,"KeepAlive":1000000000,"ResponseHeader":10000000000,"WorkloadReplicas":20000000000,"ForceHTTP2":false,"MaxIdleConns":1000,"MaxIdleConnsPerHost":200,"IdleConnTimeout":90000000000,"TLSHandshakeTimeout":10000000000,"ExpectContinueTimeout":1000000000,"DialRetryTimeout":15000000000}, "servingConfig": {"CurrentNamespace":"keda","WatchNamespace":"","ProxyPort":8080,"AdminPort":9090,"CacheSyncPeriod":3600000000000,"EndpointsCachePollIntervalMS":1000,"ProxyTLSEnabled":false,"TLSCertPath":"/certs/tls.crt","TLSKeyPath":"/certs/tls.key","TLSCertStorePaths":"","TLSSkipVerify":false,"TLSPort":8443,"ProfilingAddr":"","EnableColdStartHeader":true,"LogRequests":false}, "metricsConfig": {"OtelPrometheusExporterEnabled":true,"OtelPrometheusExporterPort":2223,"OtelHTTPExporterEnabled":false}}
2026-03-25T21:21:30Z	INFO	setup	Interceptor starting
2026-03-25T21:21:30Z	INFO	setup	starting the controller-runtime cache
2026-03-25T21:21:30Z	INFO	Interceptor Version: 0.12.2
2026-03-25T21:21:30Z	INFO	Interceptor Commit: 480a41506b0922f427b22799a8885007026bc887
2026-03-25T21:21:30Z	INFO	Go Version: go1.25.5
2026-03-25T21:21:30Z	INFO	Go OS/Arch: linux/amd64
2026-03-25T21:21:30Z	INFO	setup	starting the routing table
2026-03-25T21:21:30Z	INFO	setup	starting the admin server	{"port": 9090}
2026-03-25T21:21:30Z	INFO	runAdminServer.pkg.queue.AddCountsRoute	adding queue counts route	{"path": "/queue"}
2026-03-25T21:21:30Z	INFO	setup	starting the endpoints cache
2026-03-25T21:21:30Z	INFO	runAdminServer	admin server starting	{"address": "0.0.0.0:9090"}
2026-03-25T21:21:30Z	INFO	setup	starting the proxy server with TLS disabled	{"port": 8080}
2026-03-25T21:21:30Z	INFO	proxy server starting	{"address": "0.0.0.0:8080"}
2026-03-25T21:21:30Z	INFO	starting the prometheus metrics server	{"port": 2223, "path": "/metrics"}
2026-03-26T06:48:19Z	ERROR	wait function failed, not forwarding request	{"error": "context marked done while waiting for workload reach > 0 replicas: context deadline exceeded"}
main.BuildProxyHandler.newForwardingHandler.func3
	github.com/kedacore/http-add-on/interceptor/proxy_handlers.go:76
net/http.HandlerFunc.ServeHTTP
	net/http/server.go:2322
github.com/kedacore/http-add-on/interceptor/middleware.(*Counting).ServeHTTP
	github.com/kedacore/http-add-on/interceptor/middleware/counting.go:35
github.com/kedacore/http-add-on/interceptor/middleware.(*Routing).ServeHTTP
	github.com/kedacore/http-add-on/interceptor/middleware/routing.go:72
github.com/kedacore/http-add-on/interceptor/middleware.(*Metrics).ServeHTTP
	github.com/kedacore/http-add-on/interceptor/middleware/metrics.go:24
net/http.serverHandler.ServeHTTP
	net/http/server.go:3340
net/http.(*conn).serve
	net/http/server.go:2109

HTTP Add-on Version

0.12.2

Kubernetes Version

1.32.4

Platform

Microsoft Azure

Would you be open to contributing a fix?

No

Anything else?

Suggested Solution
Implement path normalization in the metrics collection to replace dynamic segments with placeholders:

  • Before: path="/user/email@example.com"
  • After: path="/user/:email" or path="/user/{email}"

This is a common pattern in observability tools to prevent cardinality explosion while maintaining useful metrics.
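A minimal sketch of the suggested normalization, assuming an email-matching regex is acceptable for this route (the pattern, function name, and placeholder are illustrative, not the add-on's actual code):

```go
package main

import (
	"fmt"
	"regexp"
)

// emailSegment matches a path segment that looks like an email address.
// The pattern is deliberately simple; a real implementation might
// instead normalize against the route template (e.g. /user/{email}).
var emailSegment = regexp.MustCompile(`/[^/@]+@[^/@]+`)

// normalizePath replaces email-like segments with a fixed placeholder
// so the path metric label stays low-cardinality.
func normalizePath(path string) string {
	return emailSegment.ReplaceAllString(path, "/{email}")
}

func main() {
	fmt.Println(normalizePath("/user/john.doe@example.com")) // /user/{email}
	fmt.Println(normalizePath("/healthz"))                   // /healthz
}
```

The metrics middleware would record the normalized path as the label value instead of the raw request path, collapsing all per-user paths into a single time series per route.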
