Add CI performance budgets and observability checks#84
Conversation
Note: Other AI code review bot(s) detected. CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough: Adds CI jobs for performance and observability budgets, a Node.js perf-budget runner, a Python observability-budget checker, config files for both budgets, frontend package script/dependency changes, and release-checklist documentation and navigation.

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    autonumber
    actor CI
    participant DockerStack as Docker Stack
    participant Frontend as Frontend Runner
    participant Backend as API Backend
    participant Prom as Prometheus
    participant Tempo as Tempo
    participant Pushgw as Pushgateway
    participant Artifacts as Artifact Store
    CI->>DockerStack: build & start frontend/backend
    DockerStack->>Backend: wait for API readiness
    DockerStack->>Frontend: wait for frontend readiness
    CI->>Frontend: run perf:budget (run-perf-budget.js)
    Frontend->>Frontend: load perf-budget.yml, launch headless browser
    Frontend->>Backend: navigate & interact (homepage → configurator)
    Frontend->>Frontend: collect Web Vitals & navigation metrics
    Frontend->>Pushgw: (optional) push metrics
    Frontend->>Artifacts: upload perf-budget JUnit XML
    CI->>Prom: query API latency / error rate
    Prom-->>CI: return metric values
    CI->>Tempo: query trace P95
    Tempo-->>CI: return metric values
    CI->>CI: compare current vs baseline, evaluate regression allowance
    CI->>Artifacts: output observability summary
    alt All budgets pass
        CI-->>CI: CI gates pass
    else Any budget fails
        CI-->>CI: CI gates fail (non-zero exit)
    end
```
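The "compare current vs baseline" step in the diagram boils down to a threshold check. A minimal shell sketch of that comparison, assuming the (hypothetical) semantics that a budget passes when the current value does not exceed the baseline by more than the configured regression allowance — the real rules live in `observability-budgets.yml`:

```shell
# Hypothetical budget check: pass (exit 0) when current <= baseline * (1 + allowance).
within_budget() {
  awk -v cur="$1" -v base="$2" -v allow="$3" \
    'BEGIN { exit !(cur <= base * (1 + allow)) }'
}

within_budget 210 200 0.10 && echo "pass"   # 210ms vs 200ms baseline, 10% allowance
within_budget 230 200 0.10 || echo "fail"   # 230ms exceeds the 220ms ceiling
```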
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

Pre-merge checks: ❌ 1 failed (warning) · ✅ 2 passed
💡 Codex Review
Here are some automated review suggestions for this pull request.
`.github/workflows/ci.yml` (outdated)

```yaml
      - name: Start application stack
        run: |
          docker compose --env-file .env.development -f docker-compose.dev.yml up -d --build

      - name: Wait for API
        run: |
          for i in {1..60}; do curl -sf http://localhost:8000/healthcheck && break || sleep 2; done

      - name: Wait for Frontend
        run: |
          for i in {1..60}; do curl -sf http://localhost:3000/models/manifest.json && break || sleep 2; done

      - name: Run performance budget checks
        working-directory: ./frontend
        env:
          PERF_BUDGET_OUTPUT_DIR: ../test-results/perf
        run: npm run perf:budget

      - name: Upload performance budget report
        if: always()
        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02
        with:
          name: perf-budget
          path: test-results/perf

      - name: Shutdown stack
        if: always()
        run: |
          docker compose --env-file .env.development -f docker-compose.dev.yml down
```
Avoid referencing a missing .env.development in CI performance job
The new performance-budget job starts the stack with `docker compose --env-file .env.development …` and later tears it down with the same flag, but the repository contains no `.env.development` file and no CI step that generates one. On GitHub runners the command exits with `open .env.development: no such file or directory`, so the job fails before the performance checks run and blocks the entire workflow. Either commit a template env file or drop the `--env-file` flag so Compose uses its defaults.
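A third defensive option, sketched below (this is an illustration, not the repo's actual fix): pass the `--env-file` flag only when the file exists, so the compose commands degrade to Compose defaults on runners that lack it.

```shell
# Build the --env-file argument only if the file is present; otherwise print
# nothing and let docker compose fall back to its defaults.
compose_env_args() {
  if [ -f "$1" ]; then
    printf -- '--env-file %s' "$1"
  fi
}

# Usage in the workflow step would look like:
#   docker compose $(compose_env_args .env.development) -f docker-compose.dev.yml up -d --build
compose_env_args .env.development
```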
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (1)
`frontend/package-lock.json` is excluded by `!**/package-lock.json`
📒 Files selected for processing (8)
- `.github/workflows/ci.yml` (1 hunks)
- `docs/release-checklist.md` (1 hunks)
- `frontend/package.json` (2 hunks)
- `frontend/tools/perf/run-perf-budget.js` (1 hunks)
- `mkdocs.yml` (1 hunks)
- `observability-budgets.yml` (1 hunks)
- `perf-budget.yml` (1 hunks)
- `tools/ci/check_observability_budgets.py` (1 hunks)
🧰 Additional context used
🪛 GitHub Actions: CI
tools/ci/check_observability_budgets.py
[error] 15-15: Library stubs not installed for 'yaml' (typing/import issue).
🪛 LanguageTool
docs/release-checklist.md
[grammar] ~1-~1: Use correct spacing
Context: # Release Checklist This checklist ties together continuous ...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~3-~3: Make sure to use plural and singular nouns correctly
Context: ...st ties together continuous integration signal, Grafana alerting, and the on-call rota...
(QB_NEW_EN_OTHER_ERROR_IDS_10)
[grammar] ~3-~3: Use correct spacing
Context: ...thy performance and reliability metrics. ## 1. Verify CI Observability Gates - Chec...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~5-~5: Use correct spacing
Context: ...cs. ## 1. Verify CI Observability Gates - Check the performance-budget job in ...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~8-~8: There might be a mistake here.
Context: ...passed. It queries Prometheus and Tempo spanmetrics using observability-budgets.yml and f...
(QB_NEW_EN_OTHER)
[grammar] ~8-~8: There might be a mistake here.
Context: ...e exceeded compared to the previous day. - Export any new failure signatures into t...
(QB_NEW_EN)
[grammar] ~9-~9: Use correct spacing
Context: ...ure signatures into the on-call runbook. ## 2. Review Grafana Dashboards - Open the...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~11-~11: Use correct spacing
Context: ...unbook. ## 2. Review Grafana Dashboards - Open the "Configurator Experience" dashb...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~13-~13: There might be a mistake here.
Context: ...ience" dashboard and confirm the panels for: - ci_perf_budget_value vs ci_perf_budget_threshold (pushed fr...
(QB_NEW_EN_OTHER)
[grammar] ~14-~14: There might be a mistake here.
Context: ...(pushed from the Playwright budget run). - Prometheus latency and error-rate panels...
(QB_NEW_EN)
[grammar] ~16-~16: Use correct spacing
Context: ...ntime metrics cross the defined budgets. ## 3. Coordinate On-call Notifications - T...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~18-~18: Use correct spacing
Context: ... ## 3. Coordinate On-call Notifications - Tag the current on-call engineer in the ...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~22-~22: Use correct spacing
Context: ...e acknowledgement in the release ticket. ## 4. Gate Preview Environments - Do not p...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~24-~24: Use correct spacing
Context: ...ticket. ## 4. Gate Preview Environments - Do not promote a preview environment unt...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~26-~26: There might be a mistake here.
Context: ... - Do not promote a preview environment until: - All CI jobs, including performance-budget...
(QB_NEW_EN_OTHER)
[grammar] ~27-~27: There might be a mistake here.
Context: ...dgetandobservability-budgets`, pass. - Grafana dashboards show no active alerts...
(QB_NEW_EN)
[grammar] ~30-~30: Use correct spacing
Context: ...n incident in the on-call tracking tool. ## 5. Final Release Sign-off - Update the ...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~32-~32: Use correct spacing
Context: ...king tool. ## 5. Final Release Sign-off - Update the release ticket with links to:...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
🪛 Ruff (0.14.0)
tools/ci/check_observability_budgets.py
12-12: Import from collections.abc instead: Iterable
Import from collections.abc
(UP035)
12-12: typing.Dict is deprecated, use dict instead
(UP035)
12-12: typing.List is deprecated, use list instead
(UP035)
43-43: Avoid specifying long messages outside the exception class
(TRY003)
49-49: Unnecessary mode argument
Remove mode argument
(UP015)
64-64: Unnecessary mode argument
Remove mode argument
(UP015)
78-78: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
80-80: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
82-82: Unused noqa directive (unused: BLE001)
Remove unused noqa directive
(RUF100)
83-83: Avoid specifying long messages outside the exception class
(TRY003)
86-86: Avoid specifying long messages outside the exception class
(TRY003)
97-97: Unused noqa directive (non-enabled: PERF203)
Remove unused noqa directive
(RUF100)
98-98: Avoid specifying long messages outside the exception class
(TRY003)
104-104: Unused function argument: window_seconds
(ARG001)
110-110: Avoid specifying long messages outside the exception class
(TRY003)
```js
if (typeof throttling.downloadThroughputKbps === 'number' || typeof throttling.uploadThroughputKbps === 'number' || typeof throttling.requestLatencyMs === 'number') {
  await client.send('Network.emulateNetworkConditions', {
    offline: false,
    latency: throttling.requestLatencyMs ?? 0,
    downloadThroughput: toBytesPerSecond(throttling.downloadThroughputKbps ?? 0),
    uploadThroughput: toBytesPerSecond(throttling.uploadThroughputKbps ?? 0),
  });
}
```
Prevent zero-bandwidth throttling when only latency is configured.
If a config sets just `requestLatencyMs` (or only one of the throughput fields), the current `?? 0` fallback pushes `0` to `Network.emulateNetworkConditions`, which effectively blocks traffic and makes every run hang. Throughput overrides should only be sent when values are provided, using the CDP sentinel (`-1`) otherwise.
```diff
-      if (typeof throttling.downloadThroughputKbps === 'number' || typeof throttling.uploadThroughputKbps === 'number' || typeof throttling.requestLatencyMs === 'number') {
-        await client.send('Network.emulateNetworkConditions', {
-          offline: false,
-          latency: throttling.requestLatencyMs ?? 0,
-          downloadThroughput: toBytesPerSecond(throttling.downloadThroughputKbps ?? 0),
-          uploadThroughput: toBytesPerSecond(throttling.uploadThroughputKbps ?? 0),
-        });
-      }
+      if (
+        typeof throttling.downloadThroughputKbps === 'number' ||
+        typeof throttling.uploadThroughputKbps === 'number' ||
+        typeof throttling.requestLatencyMs === 'number'
+      ) {
+        await client.send('Network.emulateNetworkConditions', {
+          offline: false,
+          latency: throttling.requestLatencyMs ?? 0,
+          downloadThroughput:
+            typeof throttling.downloadThroughputKbps === 'number'
+              ? toBytesPerSecond(throttling.downloadThroughputKbps)
+              : -1,
+          uploadThroughput:
+            typeof throttling.uploadThroughputKbps === 'number'
+              ? toBytesPerSecond(throttling.uploadThroughputKbps)
+              : -1,
+        });
+      }
```
🤖 Prompt for AI Agents
In frontend/tools/perf/run-perf-budget.js around lines 46 to 52, the code
currently uses ?? 0 for throughput fallbacks which sends 0 to
Network.emulateNetworkConditions and can block traffic when only latency is set;
change the payload so downloadThroughput and uploadThroughput are set to
toBytesPerSecond(value) only if the corresponding throttling field is a number,
otherwise set them to the CDP sentinel -1 (and keep latency as
throttling.requestLatencyMs ?? 0); ensure the condition that triggers
client.send remains the same (any of the three is a number) and that types
passed are numbers.
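The unit conversion behind `toBytesPerSecond` and the `-1` sentinel the suggestion relies on can be illustrated in isolation. A shell sketch (the kbps-to-bytes math is an assumption about the runner's helper, not verified against it):

```shell
# Convert kilobits/s to bytes/s; an unset value maps to -1, the
# Network.emulateNetworkConditions sentinel for "do not throttle".
to_bytes_per_second() {
  if [ -z "$1" ]; then
    echo -1
  else
    echo $(( $1 * 1000 / 8 ))
  fi
}

to_bytes_per_second 1600   # 1.6 Mbps -> 200000 bytes/s
to_bytes_per_second        # unset   -> -1
```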
```python
from typing import Dict, Iterable, List, Optional
from urllib import parse, request

import yaml
```
CI blocked by missing PyYAML type stubs
The type-checking stage now fails with “Library stubs not installed for 'yaml'” at this import. Please add types-PyYAML to the dev dependencies that CI installs (or explicitly silence the import with type: ignore[import]) so the pipeline can go green again.
Based on pipeline failures
🤖 Prompt for AI Agents
In tools/ci/check_observability_budgets.py around line 15 the import of yaml
triggers type-check failures because PyYAML stubs are missing; either add
types-PyYAML to the project's dev dependencies that CI installs (pip install
--dev or add to requirements-dev/pyproject[tool.poetry].dev-dependencies) so
Mypy/Pyright can find stubs, or silence the import by appending " # type:
ignore[import]" to the import line; update CI dependency installation to include
types-PyYAML if choosing the dev-deps route.
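If the dev-dependency route is chosen, an idempotent helper along these lines could make sure the stub package is listed before CI installs (the file name `requirements-dev.txt` is an assumption; the repo may pin its dev dependencies elsewhere):

```shell
# Append a dependency to a requirements file only if it is not already listed.
ensure_dep() {
  req=$1; dep=$2
  grep -qx "$dep" "$req" 2>/dev/null || echo "$dep" >> "$req"
}

ensure_dep requirements-dev.txt types-PyYAML
ensure_dep requirements-dev.txt types-PyYAML   # second call is a no-op
```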
```yaml
          for i in {1..60}; do curl -sf http://localhost:8000/healthcheck && break || sleep 2; done

      - name: Wait for Frontend
        run: |
          for i in {1..60}; do curl -sf http://localhost:3000/models/manifest.json && break || sleep 2; done

      - name: Run performance budget checks
        working-directory: ./frontend
        env:
          PERF_BUDGET_OUTPUT_DIR: ../test-results/perf
        run: npm run perf:budget

      - name: Upload performance budget report
        if: always()
        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02
        with:
          name: perf-budget
          path: test-results/perf

      - name: Shutdown stack
        if: always()
        run: |
          docker compose -f docker-compose.dev.yml down
```
Make readiness waits fail when services never come up
Both wait loops (Wait for API and Wait for Frontend) end with `curl … && break || sleep 2`. If the service never becomes ready, the final `sleep` returns exit code 0, so the whole step passes and the pipeline moves on with an unhealthy stack. Fail fast once the retries are exhausted.
```diff
       - name: Wait for API
         run: |
-          for i in {1..60}; do curl -sf http://localhost:8000/healthcheck && break || sleep 2; done
+          for i in {1..60}; do
+            if curl -sf http://localhost:8000/healthcheck; then
+              exit 0
+            fi
+            sleep 2
+          done
+          echo "API did not become ready in time" >&2
+          exit 1
@@
       - name: Wait for Frontend
         run: |
-          for i in {1..60}; do curl -sf http://localhost:3000/models/manifest.json && break || sleep 2; done
+          for i in {1..60}; do
+            if curl -sf http://localhost:3000/models/manifest.json; then
+              exit 0
+            fi
+            sleep 2
+          done
+          echo "Frontend did not become ready in time" >&2
+          exit 1
```
🤖 Prompt for AI Agents
.github/workflows/ci.yml lines 132-154: the readiness loops for API and Frontend
use `curl … && break || sleep 2` which can return 0 even when the service never
became healthy; change the loops so they return non-zero when retries are
exhausted — implement a retry loop that attempts curl up to N times and if still
failing after the final attempt explicitly exit with a non-zero status (or check
the final curl result after the loop and `exit 1` on failure) so the CI step
fails when services never come up.
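The same fail-fast idea can be factored into one reusable helper instead of duplicating the loop per service. A sketch (the retry count and delay defaults are illustrative, not the workflow's actual values):

```shell
# Poll a URL until it responds, returning non-zero once retries are exhausted
# so the CI step fails instead of continuing with an unhealthy stack.
wait_for() {
  url=$1; attempts=${2:-60}; delay=${3:-2}
  for _ in $(seq "$attempts"); do
    if curl -sf "$url" >/dev/null; then
      return 0
    fi
    sleep "$delay"
  done
  echo "timed out waiting for $url" >&2
  return 1
}

# Usage in the workflow steps would then be:
#   wait_for http://localhost:8000/healthcheck
#   wait_for http://localhost:3000/models/manifest.json
```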
Codex task: https://chatgpt.com/codex/tasks/task_e_68f12adc55a48330813c85bc509e1eb7