Add Playwright perf budgets and canary telemetry gating #126
Caution: Review failed
The pull request is closed.
Note: Other AI code review bot(s) detected
CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.
Walkthrough
This pull request introduces comprehensive performance budgeting and monitoring infrastructure. It adds GitHub Actions workflow jobs for Playwright-based performance testing and canary metrics validation, a new performance test suite, Python and Node.js tooling scripts for metrics processing and budget checking, a v2 performance budget configuration, and corresponding documentation updates.
Sequence Diagram(s)
sequenceDiagram
participant GHA as GitHub Actions
participant PB as playwright-budgets Job
participant CM as canary-metrics Job
participant Prom as Prometheus
participant Tempo as Tempo (optional)
GHA->>PB: Trigger
activate PB
PB->>PB: Checkout, setup Node.js, install deps
PB->>PB: Build & start stack
PB->>PB: Run Playwright perf tests<br/>(load perf-budget.yml, measure scenarios)
PB->>PB: Convert metrics to JUnit
PB->>PB: Upload artifacts
PB->>PB: Shutdown stack
PB-->>GHA: Complete
deactivate PB
GHA->>CM: Trigger (depends on playwright-budgets)
activate CM
CM->>Prom: Query P95 latency & 5xx rate
activate Prom
Prom-->>CM: Return metrics
deactivate Prom
alt Tempo enabled
CM->>Tempo: Optional trace regression check
activate Tempo
Tempo-->>CM: Trace results
deactivate Tempo
end
CM->>CM: Compare canary metrics vs budgets
CM->>CM: Generate JUnit report
CM-->>GHA: Complete (pass/fail)
deactivate CM
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~45 minutes
The PR introduces multiple new components across different languages and technologies: a comprehensive Playwright performance test suite with metric collection and aggregation logic, two standalone utility scripts (Python for canary monitoring and Node.js for metrics conversion), workflow orchestration, and configuration. While individual components are relatively focused, the breadth of new code, integration points (Prometheus, Tempo, JUnit), and performance-critical logic require careful review of metric computation, threshold comparisons, and CI/CD flow correctness.
Pre-merge checks and finishing touches
❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
💡 Codex Review
Here are some automated review suggestions for this pull request.
tools/perf/check-canary-metrics.py
Outdated
if PREVIOUS_BUILD and previous_value is not None:
    message += f', previous {previous_value:.3f}{unit}'
    allowed_regression = previous_value * (1 + regression_pct)
    if current_value > allowed_regression:
        passed = False
Skip regression comparison when previous value is zero
Regression limits are computed as previous_value * (1 + regression_pct). When the baseline value is 0—common for error rates—the allowed regression remains 0, so any non‑zero current value fails even if it is well below the absolute threshold (e.g. 0.1% errors vs a 1% limit). This will cause the workflow to fail whenever a metric moves from exactly zero to a small value. Guard the regression check when the previous value is near zero (or use the absolute threshold in that case) to avoid false failures.
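The guard described above can be sketched as follows. This is a minimal illustration, not the script's actual code: the helper name `within_regression_budget`, the `NEAR_ZERO` epsilon, and the `absolute_limit` parameter are all hypothetical. When the baseline is effectively zero, the relative check is skipped and only the absolute threshold applies.

```python
import math
from typing import Optional

# Hypothetical epsilon below which a baseline is treated as zero (an assumption).
NEAR_ZERO = 1e-9


def within_regression_budget(
    current: float,
    previous: Optional[float],
    regression_pct: float,
    absolute_limit: float,
) -> bool:
    """Return True when the metric passes the absolute and the relative check."""
    # The absolute threshold always applies.
    if current > absolute_limit:
        return False
    # No baseline, or a baseline of ~0: a relative comparison is meaningless.
    if previous is None or math.isclose(previous, 0.0, abs_tol=NEAR_ZERO):
        return True
    # Otherwise allow growth of at most regression_pct over the baseline.
    return current <= previous * (1 + regression_pct)
```

With a 1% error-rate limit and a 10% regression allowance, a move from 0 to 0.1% errors passes under this sketch, while a move from 0.5% to 0.56% still fails the relative check.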
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (1)
frontend/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (9)
.github/workflows/perf-light.yml (2 hunks)
docs/index.md (1 hunks)
docs/release-checklist.md (1 hunks)
frontend/package.json (3 hunks)
frontend/tests/perf/budget.spec.ts (1 hunks)
mkdocs.yml (1 hunks)
perf-budget.yml (2 hunks)
tools/perf/check-canary-metrics.py (1 hunks)
tools/perf/metrics-to-junit.mjs (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
perf-budget.yml (2)
frontend/src/app/components/configurator-panel.tsx (1)
ConfiguratorPanel (7-65)
frontend/src/app/configurator/page.tsx (1)
ConfiguratorPage (5-18)
tools/perf/check-canary-metrics.py (2)
tests/perf/k6-quote-cnc.js (1)
file (107-107)
tools/perf/metrics-to-junit.mjs (3)
name (49-49)
base (62-62)
xml (92-92)
🪛 actionlint (1.7.8)
.github/workflows/perf-light.yml
52-52: could not parse as YAML: could not find expected ':'
(syntax-check)
🪛 GitHub Actions: CI
tools/perf/check-canary-metrics.py
[error] 122-123: Type-check: Argument type mismatch for query_prometheus; expected str, got str | None.
.github/workflows/perf-light.yml
[error] 52-53: check-yaml: Could not parse YAML file due to missing ':' in perf-light.yml.
🪛 LanguageTool
docs/release-checklist.md
[grammar] ~1-~1: Use correct spacing
Context: # Release Checklist The release process ties together CI per...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~3-~3: Use correct spacing
Context: ...branch or promote a build to production. ## 1. Verify CI performance gates 1. Confi...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~5-~5: There might be a mistake here.
Context: ...tion. ## 1. Verify CI performance gates 1. Confirm the perf-light workflow is gre...
(QB_NEW_EN_OTHER)
[grammar] ~7-~7: There might be a mistake here.
Context: ...t` workflow is green. This workflow now publishes: - Playwright-based performance budgets sou...
(QB_NEW_EN_OTHER)
[grammar] ~8-~8: There might be a mistake here.
Context: ...budgets sourced from .perf-budget.yml. - k6 smoke test summaries for the configur...
(QB_NEW_EN)
[grammar] ~11-~11: There might be a mistake here.
Context: ...rometheus/Tempo. 2. Review the uploaded artifacts: - perf-budgets/perf-metrics.junit.xml for build-time performance metrics. -...
(QB_NEW_EN_OTHER)
[grammar] ~12-~12: There might be a problem here.
Context: ...xmlfor build-time performance metrics. -perf-budgets/frontend/playwright-report/results.xmlfor Playwright run details. -canary...
(QB_NEW_EN_MERGED_MATCH)
[grammar] ~13-~13: There might be a problem here.
Context: ...results.xmlfor Playwright run details. -canary-metrics/canary-metrics.junit.xml` for Prometheus/Tempo comparisons. 3. Ens...
(QB_NEW_EN_MERGED_MATCH)
[grammar] ~14-~14: There might be a mistake here.
Context: ...t.xml` for Prometheus/Tempo comparisons. 3. Ensure no performance regression comment...
(QB_NEW_EN)
[grammar] ~15-~15: Use correct spacing
Context: ...ilures immediately; they block releases. ## 2. Check Grafana and alerting state 1. ...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~17-~17: There might be a mistake here.
Context: ... ## 2. Check Grafana and alerting state 1. Open the "Configurator Release" Grafana ...
(QB_NEW_EN_OTHER)
[grammar] ~20-~20: There might be a mistake here.
Context: ...ledge or resolve any active alerts tied to: - P95 homepage → configurator navigation l...
(QB_NEW_EN_OTHER)
[grammar] ~21-~21: There might be a mistake here.
Context: ... navigation latency breaching 3 seconds. - Configurator LCP exceeding 4 seconds. ...
(QB_NEW_EN)
[grammar] ~24-~24: Use correct spacing
Context: ...t acknowledgements in the release notes. ## 3. Validate preview environments 1. Ens...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~26-~26: There might be a mistake here.
Context: ...es. ## 3. Validate preview environments 1. Ensure preview environments (staging/QA)...
(QB_NEW_EN_OTHER)
[grammar] ~31-~31: Use correct spacing
Context: ...ember signs off on the remediation plan. ## 4. Final release sign-off 1. Update the...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~33-~33: Use correct spacing
Context: ...tion plan. ## 4. Final release sign-off 1. Update the release ticket with links to:...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~40-~40: Use correct spacing
Context: ...n-call leads before promoting the build. Keeping performance, telemetry, and on-c...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
🪛 Ruff (0.14.0)
tools/perf/check-canary-metrics.py
62-62: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
65-65: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
65-65: Avoid specifying long messages outside the exception class
(TRY003)
68-68: Avoid specifying long messages outside the exception class
(TRY003)
76-76: Avoid specifying long messages outside the exception class
(TRY003)
116-116: Boolean positional value in function call
(FBT003)
178-178: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
181-181: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
181-181: Avoid specifying long messages outside the exception class
(TRY003)
🪛 YAMLlint (1.37.1)
.github/workflows/perf-light.yml
[error] 53-53: syntax error: could not find expected ':'
(syntax)
python - <<'PY'
import json
import os
import urllib.error
import urllib.request

BASE_URL = os.environ.get("BASE_URL", "http://localhost:8000")

def post(path: str, payload: dict) -> dict:
    req = urllib.request.Request(
        f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as exc:
        detail = exc.read().decode("utf-8", "ignore")
        raise SystemExit(f"Seed request failed ({exc.code}): {detail}")

material = post(
    "/api/materials/",
    {"name": "Walnut", "texture_url": None, "cost_per_sq_ft": 12.5},
)
material_id = material.get("id")
if not material_id:
    raise SystemExit("Material creation failed; missing id")

post(
    "/api/modules/",
    {
        "name": "Base600",
        "width": 600.0,
        "height": 720.0,
        "depth": 580.0,
        "base_price": 100.0,
        "material_id": material_id,
    },
)
PY
Fix heredoc indentation: workflow YAML is currently invalid
The new inline Python block is flush-left, so the YAML scalar breaks parsing (actionlint/check-yaml fail on Line 52: “could not find expected ':'”). CI jobs cannot run until the block contents are indented at least as much as the python - <<'PY' line.
-          python - <<'PY'
-import json
-import os
-import urllib.error
-import urllib.request
-
-BASE_URL = os.environ.get("BASE_URL", "http://localhost:8000")
-
-def post(path: str, payload: dict) -> dict:
-    req = urllib.request.Request(
-        f"{BASE_URL}{path}",
-        data=json.dumps(payload).encode("utf-8"),
-        headers={"Content-Type": "application/json"},
-    )
-    try:
-        with urllib.request.urlopen(req, timeout=10) as resp:
-            return json.loads(resp.read().decode("utf-8"))
-    except urllib.error.HTTPError as exc:
-        detail = exc.read().decode("utf-8", "ignore")
-        raise SystemExit(f"Seed request failed ({exc.code}): {detail}")
-
-material = post(
-    "/api/materials/",
-    {"name": "Walnut", "texture_url": None, "cost_per_sq_ft": 12.5},
-)
-material_id = material.get("id")
-if not material_id:
-    raise SystemExit("Material creation failed; missing id")
-
-post(
-    "/api/modules/",
-    {
-        "name": "Base600",
-        "width": 600.0,
-        "height": 720.0,
-        "depth": 580.0,
-        "base_price": 100.0,
-        "material_id": material_id,
-    },
-)
-PY
+          python - <<'PY'
+          import json
+          import os
+          import urllib.error
+          import urllib.request
+
+          BASE_URL = os.environ.get("BASE_URL", "http://localhost:8000")
+
+          def post(path: str, payload: dict) -> dict:
+              req = urllib.request.Request(
+                  f"{BASE_URL}{path}",
+                  data=json.dumps(payload).encode("utf-8"),
+                  headers={"Content-Type": "application/json"},
+              )
+              try:
+                  with urllib.request.urlopen(req, timeout=10) as resp:
+                      return json.loads(resp.read().decode("utf-8"))
+              except urllib.error.HTTPError as exc:
+                  detail = exc.read().decode("utf-8", "ignore")
+                  raise SystemExit(f"Seed request failed ({exc.code}): {detail}")
+
+          material = post(
+              "/api/materials/",
+              {"name": "Walnut", "texture_url": None, "cost_per_sq_ft": 12.5},
+          )
+          material_id = material.get("id")
+          if not material_id:
+              raise SystemExit("Material creation failed; missing id")
+
+          post(
+              "/api/modules/",
+              {
+                  "name": "Base600",
+                  "width": 600.0,
+                  "height": 720.0,
+                  "depth": 580.0,
+                  "base_price": 100.0,
+                  "material_id": material_id,
+              },
+          )
+          PY
This preserves the zero-indent Python source once YAML strips the common leading spaces, while restoring a valid workflow file.
🧰 Tools
🪛 actionlint (1.7.8)
52-52: could not parse as YAML: could not find expected ':'
(syntax-check)
🪛 GitHub Actions: CI
[error] 52-53: check-yaml: Could not parse YAML file due to missing ':' in perf-light.yml.
🪛 YAMLlint (1.37.1)
[error] 53-53: syntax error: could not find expected ':'
(syntax)
🤖 Prompt for AI Agents
.github/workflows/perf-light.yml around lines 51 to 91: the inline heredoc
python block is not indented to match the "python - <<'PY'" line which breaks
YAML parsing; indent every line of the embedded Python (including the closing PY
delimiter) by the same amount of whitespace as the "python - <<'PY'" line so the
YAML scalar is valid and the Python source retains its zero-relative indentation
after YAML unindentation.
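To illustrate why the flush-left heredoc body breaks the workflow, here is a simplified sketch of how a YAML literal block scalar (`run: |`) treats indentation. The function `block_scalar_content` is hypothetical and is not a YAML parser: it assumes the indentation of the first non-empty line sets the content indent and treats any less-indented non-empty line as falling outside the scalar. A real parser would stop the scalar at that line and try to read it as ordinary YAML, which is presumably what produces the "could not find expected ':'" error.

```python
from typing import List


def block_scalar_content(lines: List[str]) -> str:
    """Strip block-scalar indentation the way `run: |` content is unindented.

    Simplified: the first non-empty line sets the content indent, and a
    non-empty line with less indentation is rejected as outside the scalar.
    """
    non_empty = [line for line in lines if line.strip()]
    if not non_empty:
        return ""
    indent = len(non_empty[0]) - len(non_empty[0].lstrip(" "))
    out = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            out.append("")  # blank lines are kept as empty lines
            continue
        current = len(line) - len(line.lstrip(" "))
        if current < indent:
            raise ValueError(
                f"line {number} is indented less than the block ({current} < {indent})"
            )
        out.append(line[indent:])
    return "\n".join(out)


# Indented heredoc body: the shell receives zero-indent Python and a bare PY.
good = ["    python - <<'PY'", "    import json", "    PY"]
# Flush-left body, as in the broken workflow: rejected at line 2.
bad = ["    python - <<'PY'", "import json", "PY"]
```

Because the common indent is stripped before the shell sees the script, the closing `PY` delimiter still starts at column zero and the heredoc terminates as expected.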
current_value = query_prometheus(PROMETHEUS_URL, current_query)
previous_value = None
if PREVIOUS_BUILD:
    previous_query = render_query(template, PREVIOUS_BUILD)
    previous_value = query_prometheus(PROMETHEUS_URL, previous_query)
Make PROMETHEUS_URL non-optional before calling query_prometheus.
PROMETHEUS_URL is declared as an Optional[str], so the type-checker (and Ruff, per the pipeline) sees a str | None argument flowing into query_prometheus. Even though the early guard at Line 48 exits when the env var is missing, the type system cannot infer that, causing the CI failure. Force the variable to str (e.g., via typing.cast, or by rebinding after the guard) before passing it to query_prometheus.
🤖 Prompt for AI Agents
In tools/perf/check-canary-metrics.py around lines 108 to 112, PROMETHEUS_URL is
typed Optional[str] but is passed directly to query_prometheus causing
type-check failures; rebind or cast PROMETHEUS_URL to str after the early guard
(e.g., url: str = PROMETHEUS_URL or use typing.cast(str, PROMETHEUS_URL)) so the
variable passed into query_prometheus is a non-optional str, then use that url
variable in the current_query and previous_query calls.
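One way to give the type checker a non-optional value is to read the variable through a small helper whose return type is `str`. The sketch below assumes a hypothetical `require_env` function that is not part of check-canary-metrics.py. Note one behavioral difference: the original script exits successfully when `PROMETHEUS_URL` is missing, whereas this sketch raises `SystemExit` with a message, so it would need adapting before use.

```python
import os
from typing import Mapping, Optional


def require_env(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return a required environment variable as a plain ``str``.

    Hypothetical helper: because it raises when the value is missing,
    callers only ever receive ``str``, never ``None``.
    """
    source = os.environ if env is None else env
    value = source.get(name)
    if not value:
        raise SystemExit(f"{name} not provided")
    return value
```

A call such as `url = require_env("PROMETHEUS_URL")` yields a value that can be passed to a function annotated with `url: str` without a cast.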
Actionable comments posted: 2
♻️ Duplicate comments (1)
tools/perf/check-canary-metrics.py (1)
108-112: Make PROMETHEUS_URL non-optional before calling query_prometheus.
This is the same type incompatibility issue flagged in the pipeline and previous review. PROMETHEUS_URL is declared as Optional[str], so passing it to query_prometheus (which expects str) causes type-checker failures. Although the early guard at line 48 ensures it's non-None, the type system cannot infer this.
Force PROMETHEUS_URL to str after the guard by rebinding or using typing.cast:
+from typing import Optional, cast

 # ... after line 49, add:
+PROMETHEUS_URL = cast(str, PROMETHEUS_URL)

 # ... then lines 108-112 work correctly
Or rebind after the guard:
 if not PROMETHEUS_URL:
     exit_success('PROMETHEUS_URL not provided; skipping canary metric comparison.')
+else:
+    PROMETHEUS_URL_STR: str = PROMETHEUS_URL

 # ... then use PROMETHEUS_URL_STR in evaluate_metric
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
tools/perf/check-canary-metrics.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tools/perf/check-canary-metrics.py (1)
tools/perf/metrics-to-junit.mjs (3)
name(49-49)base(62-62)xml(92-92)
🪛 GitHub Actions: CI
tools/perf/check-canary-metrics.py
[error] 122-123: Type hints: incompatible types for query_prometheus/render_query arguments.
🪛 Ruff (0.14.0)
tools/perf/check-canary-metrics.py
62-62: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
65-65: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
65-65: Avoid specifying long messages outside the exception class
(TRY003)
68-68: Avoid specifying long messages outside the exception class
(TRY003)
76-76: Avoid specifying long messages outside the exception class
(TRY003)
116-116: Boolean positional value in function call
(FBT003)
182-182: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
185-185: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
185-185: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (2)
tools/perf/check-canary-metrics.py (2)
123-130: LGTM! Near-zero baseline handling prevents false positives.
The logic correctly addresses the concern raised in the previous review about regression checks failing when the baseline is zero. By checking if previous_value is near zero (line 123) and skipping the regression check in that case (line 125), you avoid false failures when a metric moves from exactly zero to a small non-zero value.
Based on the previous review comment about zero baseline regression.
239-288: LGTM! JUnit XML generation is correctly structured.
The JUnit XML generation properly follows the JUnit schema with:
- Correct XML escaping (lines 244-251)
- Appropriate structure with testsuites/testsuite/testcase hierarchy
- Failure messages for budget violations
- System output with metric details in CDATA
- Consistent classname ("canary.metrics") matching the companion script's approach
The structure aligns with the expected format shown in the relevant code snippets from metrics-to-junit.mjs.
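For readers unfamiliar with the report shape, the testsuites/testsuite/testcase hierarchy listed above can be sketched as follows. This is not the script's implementation: the reviewed code escapes XML by hand and uses CDATA, while this illustration swaps in `xml.etree.ElementTree`, which escapes attributes and text automatically and does not emit CDATA. The function name `build_junit`, the suite name, and the input shape are assumptions.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple


def build_junit(results: List[Tuple[str, Optional[str]]]) -> str:
    """Build a JUnit-style report from (metric name, failure message or None)."""
    failures = sum(1 for _, message in results if message is not None)
    suites = ET.Element("testsuites")
    suite = ET.SubElement(
        suites,
        "testsuite",
        name="canary-metrics",  # hypothetical suite name
        tests=str(len(results)),
        failures=str(failures),
    )
    for name, message in results:
        case = ET.SubElement(suite, "testcase", classname="canary.metrics", name=name)
        if message is not None:
            # A <failure> child marks the test case as failed for CI reporters.
            failure = ET.SubElement(case, "failure", message=message)
            failure.text = message
    return ET.tostring(suites, encoding="unicode")
```

Passing `[("p95_latency_ms", None), ("error_rate", "0.02 exceeds budget 0.01")]` yields one passing and one failing test case under a single suite.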
P95_QUERY_TEMPLATE = os.getenv(
    'CANARY_P95_QUERY',
    'histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket{service="' + CANARY_SERVICE + '",deployment="canary",build="$BUILD"}[5m])) by (le)) * 1000',
)
ERROR_QUERY_TEMPLATE = os.getenv(
    'CANARY_ERROR_RATE_QUERY',
    'sum(rate(http_requests_total{service="' + CANARY_SERVICE + '",status=~"5..",deployment="canary",build="$BUILD"}[5m])) '
    '/ sum(rate(http_requests_total{service="' + CANARY_SERVICE + '",deployment="canary",build="$BUILD"}[5m]))',
)
🧹 Nitpick | 🔵 Trivial
Consider using f-strings or .format() for query template construction.
The current approach of concatenating strings with + in the default values for P95_QUERY_TEMPLATE and ERROR_QUERY_TEMPLATE works but reduces readability. Using f-strings or .format() would make the template structure clearer.
For example:
 P95_QUERY_TEMPLATE = os.getenv(
     'CANARY_P95_QUERY',
-    'histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket{service="' + CANARY_SERVICE + '",deployment="canary",build="$BUILD"}[5m])) by (le)) * 1000',
+    f'histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket{{service="{CANARY_SERVICE}",deployment="canary",build="$BUILD"}}[5m])) by (le)) * 1000',
 )
 ERROR_QUERY_TEMPLATE = os.getenv(
     'CANARY_ERROR_RATE_QUERY',
-    'sum(rate(http_requests_total{service="' + CANARY_SERVICE + '",status=~"5..",deployment="canary",build="$BUILD"}[5m])) '
-    '/ sum(rate(http_requests_total{service="' + CANARY_SERVICE + '",deployment="canary",build="$BUILD"}[5m]))',
+    f'sum(rate(http_requests_total{{service="{CANARY_SERVICE}",status=~"5..",deployment="canary",build="$BUILD"}}[5m])) '
+    f'/ sum(rate(http_requests_total{{service="{CANARY_SERVICE}",deployment="canary",build="$BUILD"}}[5m]))',
 )
Note: Double braces {{ and }} are used to escape literal braces in f-strings.
🤖 Prompt for AI Agents
In tools/perf/check-canary-metrics.py around lines 27 to 35, the default
Prometheus query strings are built by concatenating pieces with '+' which is
hard to read; replace these concatenations with f-strings or str.format() to
construct P95_QUERY_TEMPLATE and ERROR_QUERY_TEMPLATE more clearly (remember to
escape literal braces as '{{' and '}}' when using f-strings) and keep the
environment variable fallback behavior the same so the default query text
remains identical to current value; apply the same change to both
P95_QUERY_TEMPLATE and ERROR_QUERY_TEMPLATE.
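A quick way to confirm that such a refactor leaves the default query text unchanged is to build one selector both ways and compare the results. The snippet below is a shortened illustration: the service name `configurator` is a made-up value, and the query is a fragment rather than either full template.

```python
CANARY_SERVICE = "configurator"  # hypothetical service name, for illustration only

# Original style: string concatenation around the service name.
concatenated = (
    'sum(rate(http_requests_total{service="' + CANARY_SERVICE
    + '",deployment="canary",build="$BUILD"}[5m]))'
)

# f-string style: literal PromQL braces are doubled, the placeholder is not.
formatted = (
    f'sum(rate(http_requests_total{{service="{CANARY_SERVICE}",'
    f'deployment="canary",build="$BUILD"}}[5m]))'
)

# Both forms produce the same query text.
assert concatenated == formatted
```

The `$BUILD` marker is untouched by f-string formatting, so it remains available for later substitution.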
def query_prometheus(url: str, query: str) -> Optional[float]:
    encoded_query = urllib.parse.urlencode({'query': query})
    endpoint = f"{url.rstrip('/')}/api/v1/query?{encoded_query}"
    try:
        with urllib.request.urlopen(endpoint, timeout=15) as response:
            payload = json.loads(response.read().decode('utf-8'))
    except urllib.error.URLError as exc:  # pragma: no cover - network failure
        raise RuntimeError(f'Failed to query Prometheus at {endpoint}: {exc}')

    if payload.get('status') != 'success':
        raise RuntimeError(f'Prometheus query {query} failed: {payload}')

    results = payload.get('data', {}).get('result', [])
    if not results:
        return None
    try:
        return float(results[0]['value'][1])
    except (KeyError, IndexError, ValueError) as exc:  # pragma: no cover - unexpected payload
        raise RuntimeError(f'Unexpected response format for query {query}: {payload}') from exc
Validate URL schemes to prevent SSRF or file:// access.
urllib.request.urlopen accepts arbitrary schemes including file:// and custom protocols. If PROMETHEUS_URL or TEMPO_URL could be set to untrusted values, this creates a security risk. Consider validating that the URL starts with http:// or https:// before opening.
Apply this diff to add URL scheme validation:
 def query_prometheus(url: str, query: str) -> Optional[float]:
+    if not url.startswith(('http://', 'https://')):
+        raise ValueError(f'Invalid URL scheme: {url}')
     encoded_query = urllib.parse.urlencode({'query': query})
     endpoint = f"{url.rstrip('/')}/api/v1/query?{encoded_query}"
     try:
         with urllib.request.urlopen(endpoint, timeout=15) as response:
             payload = json.loads(response.read().decode('utf-8'))
-    except urllib.error.URLError as exc:  # pragma: no cover - network failure
-        raise RuntimeError(f'Failed to query Prometheus at {endpoint}: {exc}')
+    except urllib.error.URLError as exc:  # pragma: no cover - network failure
+        raise RuntimeError(f'Failed to query Prometheus at {endpoint}: {exc}') from exc
Additionally, apply the same validation in maybe_check_tempo at line 180:
def maybe_check_tempo() -> Optional[MetricResult]:
    if not TEMPO_URL:
        print('TEMPO_URL not provided; skipping trace regression checks.')
        return None
    if not TEMPO_URL.startswith(('http://', 'https://')):
        raise ValueError(f'Invalid URL scheme: {TEMPO_URL}')
    # ... rest of function
Note: The diff also adds from exc to improve exception chaining (addresses Ruff B904).
🧰 Tools
🪛 Ruff (0.14.0)
62-62: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
65-65: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
65-65: Avoid specifying long messages outside the exception class
(TRY003)
68-68: Avoid specifying long messages outside the exception class
(TRY003)
76-76: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
In tools/perf/check-canary-metrics.py around lines 58 to 77, validate the URL
scheme before calling urllib.request.urlopen to prevent SSRF/file:// access:
check that the provided url starts with "http://" or "https://", and if not
raise a ValueError with the offending URL; also chain the original exception
when raising the RuntimeError in the urllib.error.URLError handler (the failed
Prometheus request) by using "from exc", since the unexpected-payload handler
already does so. Additionally, apply the same http/https scheme
validation in maybe_check_tempo at ~line 180 (raise ValueError if TEMPO_URL does
not start with "http://" or "https://").
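As an alternative to the `startswith` check, the scheme can be validated by parsing the URL first. The sketch below assumes a hypothetical `require_http_url` helper that is not in the reviewed script. It uses `urllib.parse.urlsplit`, which lowercases the scheme, so an uppercase `HTTPS://` is also accepted, and it additionally rejects URLs without a host.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def require_http_url(url: str) -> str:
    """Return url unchanged if it is an http(s) URL with a host, else raise."""
    parts = urlsplit(url)
    if parts.scheme.lower() not in ALLOWED_SCHEMES or not parts.netloc:
        raise ValueError(f"Invalid URL (expected http:// or https://): {url!r}")
    return url
```

Calling this once on `PROMETHEUS_URL` and `TEMPO_URL` before any `urlopen` call would cover both query paths; whether that fits the script's structure is left to the author.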
Note: Docstrings generation - SUCCESS
…resholds` Docstrings generation was requested by @shayancoin. * #126 (comment) The following files were modified: * `tools/perf/check-canary-metrics.py`
…resholds` (#285) Docstrings generation was requested by @shayancoin. * #126 (comment) The following files were modified: * `tools/perf/check-canary-metrics.py` Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Summary
Testing
https://chatgpt.com/codex/tasks/task_e_68f1362524008330adf8b11006937151
Summary by CodeRabbit
New Features
Documentation
Chores