Skip to content

Conversation

@aantn
Copy link
Collaborator

@aantn aantn commented Dec 23, 2025

Summary

  • add proxy-aware service discovery that can return Kubernetes API proxy URLs when running outside the cluster
  • include discovered proxy authorization headers when building Prometheus connections to support Holmes running locally

Testing

  • python3 -m py_compile src/robusta/utils/service_discovery.py src/robusta/integrations/prometheus/utils.py

Codex Task

@coderabbitai
Copy link

coderabbitai bot commented Dec 23, 2025

Walkthrough

This refactoring introduces a new DiscoveredServiceUrl data container to encapsulate URLs and optional headers from service discovery. It adds proxy detection logic to the core discovery function and updates all downstream discovery method signatures to return this new type instead of plain strings, enabling headers to flow through the discovery chain.

Changes

Cohort / File(s) Summary
Service Discovery Core
src/robusta/utils/service_discovery.py
Added DiscoveredServiceUrl data class with url and optional headers fields. Added internal helpers for proxy detection (_should_use_proxy, _get_kube_proxy_headers, _build_proxy_url, _derive_service_scheme). Modified find_service_url() to return DiscoveredServiceUrl with proxy support and header attachment.
Prometheus Integration Utilities
src/robusta/integrations/prometheus/utils.py
Added helper functions _unpack_discovery_result() and _merge_discovered_headers() for extracting and merging headers. Updated return type of ServiceDiscovery.find_url(), PrometheusDiscovery.find_prometheus_url(), AlertManagerDiscovery.find_alert_manager_url(), and HolmesDiscovery.find_holmes_url() from Optional[str] to Optional[DiscoveredServiceUrl]. Updated callers to unpack results and merge discovered headers with user-supplied headers.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 15.38% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title directly summarizes the main change: adding proxy-based Prometheus discovery for Holmes. It is concise and reflects the primary objective of the changeset.
Description check ✅ Passed The description clearly relates to the changeset, explaining proxy-aware service discovery and proxy headers for Prometheus connections. It includes testing verification and links to the task.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch codex/linear-mention-rob-71-holmes-extend-prometheus-toolset-di

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/robusta/integrations/prometheus/utils.py (1)

178-188: Inconsistent return type annotation for find_vm_url.

find_vm_url returns Optional[str] but the parent find_url method now returns Optional[DiscoveredServiceUrl]. This is a type mismatch that will cause issues when unpacking at line 66.

🔎 Proposed fix
     @classmethod
-    def find_vm_url(cls) -> Optional[str]:
+    def find_vm_url(cls) -> Optional[DiscoveredServiceUrl]:
         return super().find_url(
             selectors=[
                 "app.kubernetes.io/name=vmsingle",
🧹 Nitpick comments (1)
src/robusta/utils/service_discovery.py (1)

84-86: Headers fetched unconditionally even if proxy URL build fails.

If _should_use_proxy() returns True but _build_proxy_url() returns None (e.g., no host configured), proxy_url is None, so headers becomes None correctly. However, this means we silently fall back to cluster-local URL when proxy is expected but unavailable. Consider logging a warning in this case.

🔎 Suggested improvement
     cluster_local_url = f"{scheme}://{name}.{namespace}.svc.{CLUSTER_DOMAIN}:{port}"
-    proxy_url = _build_proxy_url(name, namespace, port, scheme) if _should_use_proxy() else None
+    use_proxy = _should_use_proxy()
+    proxy_url = _build_proxy_url(name, namespace, port, scheme) if use_proxy else None
+    if use_proxy and not proxy_url:
+        logging.warning("Proxy mode detected but failed to build proxy URL, falling back to cluster-local URL")
     headers = _get_kube_proxy_headers() if proxy_url else None
     final_url = proxy_url or cluster_local_url
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a91ecd5 and 1d3ef58.

📒 Files selected for processing (2)
  • src/robusta/integrations/prometheus/utils.py
  • src/robusta/utils/service_discovery.py
🧰 Additional context used
🪛 Ruff (0.14.10)
src/robusta/utils/service_discovery.py

45-45: Consider moving this statement to an else block

(TRY300)


46-46: Do not catch blind exception: Exception

(BLE001)


54-54: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: run_tests
  • GitHub Check: run_tests
🔇 Additional comments (11)
src/robusta/utils/service_discovery.py (5)

13-19: LGTM! Clean dataclass for encapsulating discovery results.

The DiscoveredServiceUrl dataclass nicely encapsulates the URL and optional headers. The __str__ method returning the URL is a good choice for backward compatibility when the object is used in string contexts.


22-23: Consider a more descriptive variable name or documentation.

The logic correctly detects out-of-cluster execution by checking for the absence of KUBERNETES_SERVICE_HOST. This is a standard approach since that env var is automatically set inside pods.


26-33: Scheme derivation logic looks correct.

The function properly checks port name, app_protocol, and port number (443) to determine HTTPS. Using getattr for app_protocol handles cases where the attribute might not exist on older API versions.


36-48: Broad exception handling is acceptable here for resilience.

The static analyzer flags the bare Exception catch (BLE001), but in this context it's appropriate—we want to gracefully fall back to None if any unexpected error occurs during header extraction rather than crash the discovery flow. The debug logging ensures visibility.


88-92: Good observability with logging.

The info-level log clearly indicates the discovered URL and whether a proxy was used, which aids debugging.

src/robusta/integrations/prometheus/utils.py (6)

36-41: Clean helper for unpacking discovery results.

This helper nicely handles the None case and extracts the URL and headers tuple.


44-52: Header merging logic is correct.

The function properly handles all combinations of None inputs and returns None when no headers exist. Note that additional_headers takes precedence over discovered_headers when keys conflict, which is the expected behavior for user-provided overrides.


57-67: Logic correctly integrates discovered headers.

The flow properly unpacks discovery results and preserves headers from whichever discovery method succeeds.


79-81: Good: User-provided headers override discovered headers.

The merge order ensures user-provided prometheus_additional_headers can override any discovered proxy headers if needed.


139-155: Cache now stores DiscoveredServiceUrl objects.

The caching correctly stores and returns the full DiscoveredServiceUrl object, preserving headers across cached lookups.


214-216: Wrapping user-provided URL correctly.

When holmes_url is provided directly, it's wrapped in DiscoveredServiceUrl without headers, which is correct since no proxy is needed for explicitly configured URLs.

Comment on lines +51 to +63
def _build_proxy_url(name: str, namespace: str, port: int, scheme: str) -> Optional[str]:
try:
configuration = client.Configuration.get_default_copy()
except Exception as e:
logging.debug(f"Unable to load Kubernetes configuration for proxy url: {e}")
return None

host = (configuration.host or "").rstrip("/")
if not host:
return None

path = f"/api/v1/namespaces/{namespace}/services/{scheme}:{name}:{port}/proxy"
return f"{host}{path}"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Kubernetes API proxy URL format services /api/v1/namespaces services proxy

💡 Result:

Kubernetes API server proxy URL patterns (service-level)

  • Base form:
    /api/v1/namespaces/{namespace}/services/{service-ident}/proxy{/path...}.[1][2]

  • {service-ident} supported forms:

    • {service-name} → proxy to service’s default/unnamed port over http.
    • {service-name}:{port} → proxy to the named numeric port (or port-name) over http.
    • https:{service-name}: → proxy to service’s default/unnamed port over HTTPS (note the leading "https:" and trailing colon).
    • https:{service-name}:{port} → proxy to the specified port over HTTPS.
      Examples:
    • /api/v1/namespaces/default/services/web-frontend/proxy/
    • /api/v1/namespaces/default/services/web-frontend:80/proxy/
    • /api/v1/namespaces/kube-system/services/https:kube-ui:/proxy/ (HTTPS to default port). [2][4]
  • You can append additional path/query after /proxy, e.g. /proxy/_search?q=foo or /proxy/api/v1/items. [1][3]

Notes:

  • The API also exposes verb-specific endpoints (GET/POST/HEAD/PUT) at the same path. [1]
  • When using kubectl proxy, call the same path against the local proxy (e.g. http://localhost:8001/api/v1/...). [3]

Sources: Kubernetes API reference and docs; examples/discussion. [1] Kubernetes API reference — Proxy operations. [2] kubectl cluster-info / proxy examples and formats. [3] kubectl proxy docs. [4] community examples/Q&A showing https: prefix.


Fix proxy path format for HTTP services.

The proxy URL format is incorrect for HTTP services. Per Kubernetes API documentation, the service identifier should be:

  • HTTP with port: {name}:{port} (no scheme prefix)
  • HTTPS with port: https:{name}:{port} (with scheme prefix)

The current code uses {scheme}:{name}:{port} for all schemes, which generates invalid URLs for HTTP (e.g., http:my-service:8080 instead of my-service:8080). Conditional logic is needed: omit the scheme prefix for HTTP, include it only for HTTPS.

🧰 Tools
🪛 Ruff (0.14.10)

54-54: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
In src/robusta/utils/service_discovery.py around lines 51 to 63, the proxy path
uses "{scheme}:{name}:{port}" for all services which produces invalid proxy
paths for HTTP; change the path construction so that for scheme == "https" the
service identifier is "https:{name}:{port}" but for "http" (and empty/default)
it is just "{name}:{port}". Keep the rest of the function intact (load
configuration, validate host) and return f"{host}{path}" as before.

Comment on lines +79 to +80
port_obj = svc.spec.ports[0]
port = port_obj.port
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Potential IndexError if service has no ports.

If svc.spec.ports is empty or None, accessing svc.spec.ports[0] will raise an IndexError. Consider adding a guard.

🔎 Proposed fix
     name = svc.metadata.name
     namespace = svc.metadata.namespace
+    if not svc.spec.ports:
+        logging.warning(f"Service {name} in {namespace} has no ports defined")
+        return None
     port_obj = svc.spec.ports[0]
     port = port_obj.port

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In src/robusta/utils/service_discovery.py around lines 79-80, the code
unconditionally accesses svc.spec.ports[0] which will raise an IndexError if
svc.spec.ports is None or empty; add a guard that checks if svc.spec and
svc.spec.ports are truthy and that len(svc.spec.ports) > 0 before indexing, and
handle the empty case by logging/raising a clear error or skipping this service
(returning None or continuing) so the function fails gracefully instead of
throwing an IndexError.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants