
Add CI performance budgets and observability checks #84

Merged
shayancoin merged 3 commits into main from codex/extend-ci-pipeline-for-performance-budgets
Oct 17, 2025

Conversation


@shayancoin shayancoin commented Oct 16, 2025

Summary

  • extend the performance budget config with the homepage-to-configurator navigation thresholds
  • add a Playwright-driven runner that publishes JUnit (and optional Pushgateway) metrics and hook it into CI
  • add Prometheus/Tempo observability budget validation plus release checklist documentation for alert gating

Testing

  • python tools/ci/check_observability_budgets.py --config observability-budgets.yml

https://chatgpt.com/codex/tasks/task_e_68f12adc55a48330813c85bc509e1eb7

Summary by CodeRabbit

  • New Features

    • CI now runs automated performance budgets with headless browser runs, throttling, interaction flows, metric collection, JUnit reporting, and optional metric push.
    • CI includes observability budget checks against Prometheus/Tempo to validate latency and error-rate budgets.
  • Documentation

    • Added a Release Checklist with step-by-step CI gate, dashboard review, on-call coordination, preview gating, and release sign-off instructions.
    • Added checklist entry to site navigation.
  • Chores

    • Updated CI to orchestrate performance and observability budget jobs and added a perf-budget npm script.
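
For orientation, the perf run is driven by the new perf-budget.yml covering the homepage → configurator flow; a config of roughly this shape would express such a budget (all key names below are illustrative assumptions, not the file's actual schema):

```yaml
# Hypothetical sketch only; see perf-budget.yml in the repo for the real schema.
throttling:
  requestLatencyMs: 40          # added round-trip latency
  downloadThroughputKbps: 10240 # roughly 10 Mbit/s down
pages:
  - name: homepage-to-configurator
    url: http://localhost:3000/
    interactions:
      - click: "[data-nav='configurator']"  # assumed selector
    budgets:
      largest_contentful_paint_ms: 2500
      navigation_to_configurator_ms: 4000
```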


coderabbitai bot commented Oct 16, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds CI jobs for performance and observability budgets, a Node.js perf-budget runner, a Python observability-budget checker, config files for both budgets, frontend package script/deps changes, and release checklist documentation and navigation.

Changes

Cohort / File(s) Summary
GitHub Actions Workflow
.github/workflows/ci.yml
Adds performance-budget job (builds frontend/backend via Docker, waits for readiness, runs frontend perf budget, uploads artifact) and observability-budgets job (Python 3.12 setup, runs observability budget checks using secrets).
Release Documentation
docs/release-checklist.md, mkdocs.yml
Adds a release checklist document and adds the checklist to site navigation.
Frontend package and scripts
frontend/package.json, frontend/tools/perf/run-perf-budget.js, perf-budget.yml
Adds perf:budget script and devDeps (js-yaml, xmlbuilder2); removes gltfpack, meshoptimizer, and @playwright/test. Introduces run-perf-budget.js for browser-based perf runs (throttling, interactions, metrics aggregation, JUnit XML output, optional Pushgateway). Adds perf-budget.yml page config for homepage→configurator flow and metric thresholds.
Observability config & validator
observability-budgets.yml, tools/ci/check_observability_budgets.py
Adds observability budgets config (window, baseline offset, allowed regression, Prometheus/Tempo metric definitions) and a Python script that queries providers, compares current vs baseline values, evaluates thresholds/regressions, and fails CI on violations.
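
The threshold-plus-baseline comparison described above can be sketched in a few lines (an illustrative reduction, not the actual implementation in tools/ci/check_observability_budgets.py; the percentage-based regression allowance is an assumption):

```python
def evaluate(current: float, baseline: float, threshold: float,
             allowed_regression_pct: float) -> bool:
    """Pass only if the metric is under its absolute threshold AND has
    not regressed past the allowed percentage vs. the baseline window."""
    if current > threshold:
        return False  # hard budget violated
    if baseline > 0:
        regression_pct = (current - baseline) / baseline * 100
        if regression_pct > allowed_regression_pct:
            return False  # regressed too far vs. the previous day
    return True
```

A CI gate would call this once per configured metric and exit non-zero on any False, matching the "fails CI on violations" behavior described above.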

Sequence Diagram

sequenceDiagram
    autonumber
    actor CI
    participant DockerStack as Docker Stack
    participant Frontend as Frontend Runner
    participant Backend as API Backend
    participant Prom as Prometheus
    participant Tempo as Tempo
    participant Pushgw as Pushgateway
    participant Artifacts as Artifact Store

    CI->>DockerStack: build & start frontend/backend
    DockerStack->>Backend: wait for API readiness
    DockerStack->>Frontend: wait for frontend readiness

    CI->>Frontend: run perf:budget (run-perf-budget.js)
    Frontend->>Frontend: load perf-budget.yml, launch headless browser
    Frontend->>Backend: navigate & interact (homepage → configurator)
    Frontend->>Frontend: collect Web Vitals & navigation metrics
    Frontend->>Pushgw: (optional) push metrics
    Frontend->>Artifacts: upload perf-budget JUnit XML

    CI->>Prom: query API latency / error rate
    Prom-->>CI: return metric values
    CI->>Tempo: query trace P95
    Tempo-->>CI: return metric values

    CI->>CI: compare current vs baseline, evaluate regression allowance
    CI->>Artifacts: output observability summary
    alt All budgets pass
        CI-->>CI: CI gates pass
    else Any budget fails
        CI-->>CI: CI gates fail (non-zero exit)
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰
I hopped through pages, timed each beat,
Measured paints and pulses fleet;
Prometheus hummed a careful tune,
Tempo traced beneath the moon;
CI gates closed — the rabbits cheer!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

  • Description Check (⚠️ Warning): The pull request description does not follow the repository's required template: it lacks the mandatory "PR Type," "Short Description," and "Tests Added" sections and uses different section headings. Resolution: update the description to use the repository's template, adding a "PR Type" field, a brief "Short Description," and a "Tests Added" section to document the nature of the changes and any new test coverage.

✅ Passed checks (2 passed)

  • Title Check (✅ Passed): The title concisely summarizes the primary purpose of the changeset, adding CI performance and observability checks, which matches the workflow, script, and configuration changes in this pull request.
  • Docstring Coverage (✅ Passed): No functions found in the changes; docstring coverage check skipped.

Comment @coderabbitai help to get the list of available commands and usage tips.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.


Comment on lines +126 to +154
      - name: Start application stack
        run: |
          docker compose --env-file .env.development -f docker-compose.dev.yml up -d --build

      - name: Wait for API
        run: |
          for i in {1..60}; do curl -sf http://localhost:8000/healthcheck && break || sleep 2; done

      - name: Wait for Frontend
        run: |
          for i in {1..60}; do curl -sf http://localhost:3000/models/manifest.json && break || sleep 2; done

      - name: Run performance budget checks
        working-directory: ./frontend
        env:
          PERF_BUDGET_OUTPUT_DIR: ../test-results/perf
        run: npm run perf:budget

      - name: Upload performance budget report
        if: always()
        uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02
        with:
          name: perf-budget
          path: test-results/perf

      - name: Shutdown stack
        if: always()
        run: |
          docker compose --env-file .env.development -f docker-compose.dev.yml down


P1: Avoid referencing a missing .env.development in CI performance job

The new performance-budget job starts the stack with docker compose --env-file .env.development … and later tears it down with the same flag, but the repository does not include a .env.development file nor a step that generates one in CI. On GitHub runners this command exits with open .env.development: no such file or directory, so the job will fail before the performance checks run and will block the entire workflow. Either commit a template env file or drop the --env-file flag so Compose uses defaults.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d2a35c4 and 1cd8537.

⛔ Files ignored due to path filters (1)
  • frontend/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (8)
  • .github/workflows/ci.yml (1 hunks)
  • docs/release-checklist.md (1 hunks)
  • frontend/package.json (2 hunks)
  • frontend/tools/perf/run-perf-budget.js (1 hunks)
  • mkdocs.yml (1 hunks)
  • observability-budgets.yml (1 hunks)
  • perf-budget.yml (1 hunks)
  • tools/ci/check_observability_budgets.py (1 hunks)
🧰 Additional context used
🪛 GitHub Actions: CI
tools/ci/check_observability_budgets.py

[error] 15-15: Library stubs not installed for 'yaml' (typing/import issue).

🪛 LanguageTool
docs/release-checklist.md

  • ~1, ~3, ~5, ~9, ~11, ~16, ~18, ~22, ~24, ~30, ~32: Use correct spacing (QB_NEW_EN_OTHER_ERROR_IDS_5)
  • ~3: Make sure to use plural and singular nouns correctly ("…continuous integration signal, Grafana alerting…") (QB_NEW_EN_OTHER_ERROR_IDS_10)
  • ~8, ~13, ~26: There might be a mistake here (QB_NEW_EN_OTHER)
  • ~8, ~14, ~27: There might be a mistake here (QB_NEW_EN)

🪛 Ruff (0.14.0)
tools/ci/check_observability_budgets.py

  • 12: Import Iterable from collections.abc; typing.Dict and typing.List are deprecated, use dict and list (UP035)
  • 43, 83, 86, 98, 110: Avoid specifying long messages outside the exception class (TRY003)
  • 49, 64: Unnecessary mode argument; remove it (UP015)
  • 78, 80: Audit URL open for permitted schemes; allowing file: or custom schemes is often unexpected (S310)
  • 82: Unused noqa directive (unused: BLE001); remove it (RUF100)
  • 97: Unused noqa directive (non-enabled: PERF203); remove it (RUF100)
  • 104: Unused function argument: window_seconds (ARG001)

Comment on lines +46 to +52
  if (typeof throttling.downloadThroughputKbps === 'number' || typeof throttling.uploadThroughputKbps === 'number' || typeof throttling.requestLatencyMs === 'number') {
    await client.send('Network.emulateNetworkConditions', {
      offline: false,
      latency: throttling.requestLatencyMs ?? 0,
      downloadThroughput: toBytesPerSecond(throttling.downloadThroughputKbps ?? 0),
      uploadThroughput: toBytesPerSecond(throttling.uploadThroughputKbps ?? 0),
    });

⚠️ Potential issue | 🟠 Major

Prevent zero-bandwidth throttling when only latency is configured.

If a config sets just requestLatencyMs (or only one of the throughput fields), the current ?? 0 fallback pushes 0 to Network.emulateNetworkConditions, which effectively blocks traffic and makes every run hang. We should only send throughput overrides when values are provided, and use the CDP sentinel (-1) otherwise.

-  if (typeof throttling.downloadThroughputKbps === 'number' || typeof throttling.uploadThroughputKbps === 'number' || typeof throttling.requestLatencyMs === 'number') {
-    await client.send('Network.emulateNetworkConditions', {
-      offline: false,
-      latency: throttling.requestLatencyMs ?? 0,
-      downloadThroughput: toBytesPerSecond(throttling.downloadThroughputKbps ?? 0),
-      uploadThroughput: toBytesPerSecond(throttling.uploadThroughputKbps ?? 0),
-    });
-  }
+  if (
+    typeof throttling.downloadThroughputKbps === 'number' ||
+    typeof throttling.uploadThroughputKbps === 'number' ||
+    typeof throttling.requestLatencyMs === 'number'
+  ) {
+    await client.send('Network.emulateNetworkConditions', {
+      offline: false,
+      latency: throttling.requestLatencyMs ?? 0,
+      downloadThroughput:
+        typeof throttling.downloadThroughputKbps === 'number'
+          ? toBytesPerSecond(throttling.downloadThroughputKbps)
+          : -1,
+      uploadThroughput:
+        typeof throttling.uploadThroughputKbps === 'number'
+          ? toBytesPerSecond(throttling.uploadThroughputKbps)
+          : -1,
+    });
+  }
🤖 Prompt for AI Agents
In frontend/tools/perf/run-perf-budget.js around lines 46 to 52, the code
currently uses ?? 0 for throughput fallbacks which sends 0 to
Network.emulateNetworkConditions and can block traffic when only latency is set;
change the payload so downloadThroughput and uploadThroughput are set to
toBytesPerSecond(value) only if the corresponding throttling field is a number,
otherwise set them to the CDP sentinel -1 (and keep latency as
throttling.requestLatencyMs ?? 0); ensure the condition that triggers
client.send remains the same (any of the three is a number) and that types
passed are numbers.

from typing import Dict, Iterable, List, Optional
from urllib import parse, request

import yaml

⚠️ Potential issue | 🔴 Critical

CI blocked by missing PyYAML type stubs

The type-checking stage now fails with “Library stubs not installed for 'yaml'” at this import. Please add types-PyYAML to the dev dependencies that CI installs (or explicitly silence the import with type: ignore[import]) so the pipeline can go green again.

Based on pipeline failures


🤖 Prompt for AI Agents
In tools/ci/check_observability_budgets.py around line 15 the import of yaml
triggers type-check failures because PyYAML stubs are missing; either add
types-PyYAML to the project's dev dependencies that CI installs (pip install
--dev or add to requirements-dev/pyproject[tool.poetry].dev-dependencies) so
Mypy/Pyright can find stubs, or silence the import by appending "  # type:
ignore[import]" to the import line; update CI dependency installation to include
types-PyYAML if choosing the dev-deps route.
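
Either fix is a one-liner. For the dev-dependency route, the requirements file that CI installs would gain a line like the following (the requirements-dev.txt file name is an assumption about this repo's layout):

```text
# requirements-dev.txt (assumed location): stubs so mypy can type-check `import yaml`
types-PyYAML
```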


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1cd8537 and a93d911.

📒 Files selected for processing (1)
  • .github/workflows/ci.yml (1 hunks)

Comment on lines +132 to +154
for i in {1..60}; do curl -sf http://localhost:8000/healthcheck && break || sleep 2; done

- name: Wait for Frontend
run: |
for i in {1..60}; do curl -sf http://localhost:3000/models/manifest.json && break || sleep 2; done

- name: Run performance budget checks
working-directory: ./frontend
env:
PERF_BUDGET_OUTPUT_DIR: ../test-results/perf
run: npm run perf:budget

- name: Upload performance budget report
if: always()
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02
with:
name: perf-budget
path: test-results/perf

- name: Shutdown stack
if: always()
run: |
docker compose -f docker-compose.dev.yml down

⚠️ Potential issue | 🟠 Major

Make readiness waits fail when services never come up

Both wait loops (Wait for API and Wait for Frontend) end with curl … && break || sleep 2. If the service never becomes ready, the final sleep returns exit code 0, so the whole step passes and the pipeline moves on with an unhealthy stack. Please fail fast once the retries are exhausted.

       - name: Wait for API
         run: |
-          for i in {1..60}; do curl -sf http://localhost:8000/healthcheck && break || sleep 2; done
+          for i in {1..60}; do
+            if curl -sf http://localhost:8000/healthcheck; then
+              exit 0
+            fi
+            sleep 2
+          done
+          echo "API did not become ready in time" >&2
+          exit 1
@@
       - name: Wait for Frontend
         run: |
-          for i in {1..60}; do curl -sf http://localhost:3000/models/manifest.json && break || sleep 2; done
+          for i in {1..60}; do
+            if curl -sf http://localhost:3000/models/manifest.json; then
+              exit 0
+            fi
+            sleep 2
+          done
+          echo "Frontend did not become ready in time" >&2
+          exit 1
🤖 Prompt for AI Agents
.github/workflows/ci.yml lines 132-154: the readiness loops for API and Frontend
use `curl … && break || sleep 2` which can return 0 even when the service never
became healthy; change the loops so they return non-zero when retries are
exhausted — implement a retry loop that attempts curl up to N times and if still
failing after the final attempt explicitly exit with a non-zero status (or check
the final curl result after the loop and `exit 1` on failure) so the CI step
fails when services never come up.
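
The masked exit code is easy to reproduce in isolation: in a `cmd && break || sleep 2` loop, the final `sleep` supplies the loop's exit status, so a probe that never succeeds still exits 0. A minimal sketch, with `false` standing in for a curl that never becomes healthy:

```shell
# The loop's status is that of the last command it ran: `sleep`,
# which exits 0, so the repeated failures of `false` are masked.
for i in 1 2 3; do
  false && break || sleep 0
done
echo "loop exit: $?"   # prints: loop exit: 0
```

This is why the suggested fix checks the probe explicitly and runs `exit 1` once the retries are exhausted.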

@shayancoin shayancoin merged commit cc87e69 into main Oct 17, 2025
0 of 5 checks passed