[codex] add /metrics endpoint for operational alerting #38

Merged
zachyzissou merged 1 commit into main from codex/metrics-endpoint-observability
Feb 24, 2026

@zachyzissou
Owner

Summary

  • add /metrics endpoint with Prometheus text format (text/plain; version=0.0.4)
  • expose key gauges for operational alerting:
    • downlink_last_update_age_seconds
    • downlink_guide_stale
    • downlink_scheduler_consecutive_failures
    • downlink_cron_supported
  • add integration coverage validating Prometheus output and expected gauge values
  • document /metrics usage and observability role in README
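
As a rough illustration of the Prometheus text format named above, the sketch below renders label-free gauges as `# HELP` / `# TYPE` / sample lines. The function name `render_gauges`, the help strings, and the sample values are assumptions made for this example; only the metric names and the content type come from this PR, and the actual code in app/server.py may be structured differently.

```python
from typing import Mapping

# Content type as stated in the PR summary.
CONTENT_TYPE = "text/plain; version=0.0.4"


def render_gauges(gauges: Mapping[str, tuple[str, float]]) -> str:
    """Render label-free gauges in Prometheus text exposition format.

    `gauges` maps a metric name to (help text, value). Hypothetical helper,
    not the project's actual implementation.
    """
    lines = []
    for name, (help_text, value) in gauges.items():
        # HELP text must escape backslashes and newlines.
        escaped = help_text.replace("\\", "\\\\").replace("\n", "\\n")
        lines.append(f"# HELP {name} {escaped}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    # The format requires a trailing newline after the last sample.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Illustrative values only.
    print(
        render_gauges(
            {
                "downlink_guide_stale": ("1 if the guide is stale, else 0.", 0),
                "downlink_scheduler_consecutive_failures": (
                    "Consecutive scheduler failures.",
                    2,
                ),
            }
        ),
        end="",
    )
```

A web handler would return this string with `CONTENT_TYPE` as the response media type.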

Validation

  • python3 -m black app/server.py test_integration.py
  • python3 -m py_compile app/server.py test_integration.py
  • npm run -s lint:js
  • npm run -s lint:md

Notes

  • local runtime tests were not executed here because this environment has Python 3.9 while the project targets 3.11+.

Copilot AI review requested due to automatic review settings February 24, 2026 22:39
@zachyzissou zachyzissou merged commit a9d8f3d into main Feb 24, 2026
15 checks passed
@zachyzissou zachyzissou deleted the codex/metrics-endpoint-observability branch February 24, 2026 22:40

Copilot AI left a comment

Pull request overview

Adds a new operational /metrics endpoint that exposes Prometheus-scrapable gauges for guide freshness, scheduler failure state, and cron support, enabling alerting and monitoring alongside the existing /health and /status endpoints.

Changes:

  • Introduces /metrics endpoint returning Prometheus text exposition format gauges.
  • Adds integration coverage to validate expected metric names/values are present.
  • Documents /metrics usage and its role in observability.
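
One way a test could check metric names and values, rather than searching for substrings, is to parse the scraped body into a dictionary first. The sketch below is a minimal parser that assumes label-free samples (as the four gauges in this PR appear to be); `parse_gauges` is a hypothetical helper and is not part of the PR.

```python
def parse_gauges(text: str) -> dict[str, float]:
    """Parse label-free samples from Prometheus text format into a dict.

    Comment lines (# HELP / # TYPE) and blank lines are skipped. An optional
    trailing timestamp after the value is ignored. Labels are NOT handled.
    """
    samples: dict[str, float] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, rest = line.partition(" ")
        fields = rest.split()
        if not fields:
            continue  # malformed sample line without a value
        samples[name] = float(fields[0])
    return samples


if __name__ == "__main__":
    # Illustrative body; real values depend on runtime state.
    body = (
        "# HELP downlink_guide_stale 1 if the guide is stale, else 0.\n"
        "# TYPE downlink_guide_stale gauge\n"
        "downlink_guide_stale 0\n"
        "downlink_cron_supported 1\n"
    )
    metrics = parse_gauges(body)
    assert metrics["downlink_guide_stale"] == 0.0
    assert metrics["downlink_cron_supported"] == 1.0
```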

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| app/server.py | Adds /metrics endpoint and emits key runtime gauges in Prometheus text format. |
| test_integration.py | Adds an integration test that scrapes /metrics and validates gauge presence/values. |
| README.md | Documents /metrics endpoint and observability guidance. |

Comment on lines +1110 to +1113
now = datetime.now(UTC)
state = read_state()
freshness = _guide_freshness_snapshot(now)
cron_supported, _ = _cron_support_status(os.environ.get("CRON_SCHEDULE", CRON_SCHEDULE))

Copilot AI Feb 24, 2026

metrics() reads state via read_state() and then calls _guide_freshness_snapshot(now), which calls read_state() again. This doubles file I/O and can produce internally inconsistent metric values if state changes between reads. Consider passing the already-read state into the freshness calculation (e.g., allow _guide_freshness_snapshot to accept an optional state dict) or compute freshness inputs in metrics() without re-reading state.
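
The optional-parameter approach suggested here could look roughly like the sketch below. It is not the project's code: `read_state` is replaced by an in-memory stand-in, and the state keys (`last_update`, `consecutive_failures`), the 24-hour staleness threshold, and the function names without a leading underscore are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Optional

READ_CALLS: list[int] = []  # instrumentation for this sketch only


def read_state() -> dict[str, Any]:
    """Stand-in for the project's read_state(); the real one reads a file."""
    READ_CALLS.append(1)
    return {"last_update": "2026-02-24T22:00:00+00:00", "consecutive_failures": 0}


def guide_freshness_snapshot(
    now: datetime,
    state: Optional[dict[str, Any]] = None,
    stale_after: timedelta = timedelta(hours=24),  # assumed threshold
) -> dict[str, Any]:
    """Compute freshness from `state`, reading it only if the caller did not."""
    if state is None:
        state = read_state()
    last_update = datetime.fromisoformat(state["last_update"])
    age = (now - last_update).total_seconds()
    return {"age_seconds": age, "stale": age > stale_after.total_seconds()}


def collect_metrics(now: datetime) -> dict[str, float]:
    """Read state once and derive every gauge from that single snapshot."""
    state = read_state()
    freshness = guide_freshness_snapshot(now, state=state)
    return {
        "downlink_last_update_age_seconds": freshness["age_seconds"],
        "downlink_guide_stale": int(freshness["stale"]),
        "downlink_scheduler_consecutive_failures": state["consecutive_failures"],
    }


if __name__ == "__main__":
    print(collect_metrics(datetime.now(timezone.utc)))
```

Because the default remains `state=None`, existing callers that pass only `now` would keep their current behavior under this design.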

with TestClient(app) as client:
    response = client.get("/metrics")
    assert response.status_code == 200
    assert "text/plain" in response.headers.get("content-type", "")

Copilot AI Feb 24, 2026

This test claims to validate Prometheus text format (version=0.0.4), but it only checks that the content-type contains text/plain. To prevent regressions, assert the response Content-Type includes version=0.0.4 (and/or the exact media type parameters you intend to support).

Suggested change
- assert "text/plain" in response.headers.get("content-type", "")
+ content_type = response.headers.get("content-type", "")
+ assert "text/plain" in content_type
+ assert "version=0.0.4" in content_type
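
If a stricter check than substring matching is wanted, the header can be split into a media type and its parameters before asserting. The helper below is a hypothetical sketch (`parse_content_type` is not part of the PR or of any library used here), and it deliberately ignores edge cases such as semicolons inside quoted parameter values.

```python
def parse_content_type(header: str) -> tuple[str, dict[str, str]]:
    """Split a Content-Type header into (media type, parameters).

    Minimal sketch: parameter names are lowercased and surrounding double
    quotes are stripped from values. Quoted values containing ';' are not
    supported.
    """
    media_type, *raw_params = [part.strip() for part in header.split(";")]
    params: dict[str, str] = {}
    for raw in raw_params:
        key, sep, value = raw.partition("=")
        if sep:
            params[key.strip().lower()] = value.strip().strip('"')
    return media_type.lower(), params


if __name__ == "__main__":
    media_type, params = parse_content_type(
        "text/plain; version=0.0.4; charset=utf-8"
    )
    assert media_type == "text/plain"
    assert params.get("version") == "0.0.4"
```

Comparing the parsed pieces keeps the assertion independent of parameter order and of whether the framework appends a `charset` parameter.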

Comment on lines +96 to 103
```bash
# Scrape Prometheus-style metrics
curl http://localhost:7004/metrics
```

```text
{"status":"ok","version":"1.0.0","artifacts":"m3u xml"}
```

Copilot AI Feb 24, 2026

The README now shows a /metrics curl example immediately followed by a JSON output block. Since /metrics returns Prometheus text format (not JSON), this placement is misleading. Either add a representative /metrics text output example, or move/label the JSON snippet so it’s clearly associated with the correct endpoint.
