feat(grafana): add performance and API monitoring dashboards #17
Merged
lgahdl merged 10 commits into jefferson/cow-614-cow-591-phase-2-extended-prometheus-metrics on Feb 25, 2026
Add two Grafana dashboards for monitoring performance tests:
- Overview dashboard: test progress, order rates, latency distributions
- API Performance dashboard: response times, throughput, error rates
Configure dashboard provisioning via docker-compose volume mount and add explicit UID to Prometheus datasource for dashboard compatibility.
Add upload_app_data_with_retry() and get_open_order_count() methods that were missing from the instrumented wrapper, causing AttributeError when used in place of the underlying OrderbookClient.
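The fix described above can be pictured with a minimal sketch. Only the two method names come from this PR; the stub client, the wrapper class name, the parameters (max_attempts, base_delay), and the choice of ConnectionError as the retryable error are assumptions for illustration, and the real instrumented client may well be asynchronous.

```python
from __future__ import annotations

import time
from typing import Any, Callable


class StubOrderbookClient:
    """Hypothetical stand-in for the real OrderbookClient (its actual API may differ)."""

    def __init__(self, failures_before_success: int = 2) -> None:
        self._remaining_failures = failures_before_success

    def upload_app_data(self, app_data: dict[str, Any]) -> str:
        # Fail a fixed number of times, then succeed, to exercise the retry path.
        if self._remaining_failures > 0:
            self._remaining_failures -= 1
            raise ConnectionError("transient upload failure")
        return "ok"

    def get_open_order_count(self) -> int:
        return 3


class InstrumentedClientSketch:
    """Illustrative wrapper exposing the two methods this PR adds (signatures assumed)."""

    def __init__(
        self,
        inner: StubOrderbookClient,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._inner = inner
        self._sleep = sleep  # injectable so callers can avoid real waiting

    def upload_app_data_with_retry(
        self,
        app_data: dict[str, Any],
        max_attempts: int = 4,
        base_delay: float = 0.5,
    ) -> str:
        if max_attempts < 1:
            raise ValueError("max_attempts must be at least 1")
        for attempt in range(max_attempts):
            try:
                return self._inner.upload_app_data(app_data)
            except ConnectionError:
                if attempt == max_attempts - 1:
                    raise  # out of attempts: surface the last error
                # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
                self._sleep(base_delay * 2**attempt)
        raise AssertionError("unreachable")

    def get_open_order_count(self) -> int:
        # Plain delegation, so the wrapper can replace the underlying client.
        return self._inner.get_open_order_count()
```

Without the delegation method, code that received the wrapper in place of the underlying client would raise AttributeError on the first call to get_open_order_count(), which is the failure this commit reports.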
Add three new dashboards completing the Grafana visualization suite:
- Resources dashboard: CPU, memory, network monitoring per container
- Comparison dashboard: baseline vs current with regression indicators
- Trader Activity dashboard: per-trader statistics and activity patterns
Update existing dashboards with cross-navigation links to all 5 dashboards.
… COW-593 Document Prometheus exporter phases and Grafana dashboard implementation plans to track progress on metrics infrastructure work.
- Add prometheus_port config field with default 9091
- CLI uses config default, --prometheus-port 0 to disable
- Enhance order timeout logging with status, age, token pair, lifecycle
- Improve monitoring output with status breakdown counts
- Show all terminal states in final summary (filled/expired/failed/cancelled)
- Update README and CLI docs with monitoring instructions
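The port behaviour in the first two bullets might be resolved roughly as follows. This is a sketch under assumptions: only the prometheus_port field name, the 9091 default, and the --prometheus-port flag come from the commit message; the MonitoringConfig class, the use of argparse, and the resolve_prometheus_port helper are hypothetical.

```python
from __future__ import annotations

import argparse
from dataclasses import dataclass
from typing import Optional


@dataclass
class MonitoringConfig:
    # Hypothetical config holder; the field name and default are from the commit.
    prometheus_port: int = 9091


def build_parser() -> argparse.ArgumentParser:
    # Assumes an argparse-based CLI; the project's real CLI framework may differ.
    parser = argparse.ArgumentParser(prog="perf-test-sketch")
    parser.add_argument(
        "--prometheus-port",
        type=int,
        default=None,  # None means "not given on the command line"
        help="Port for the Prometheus exporter; 0 disables it.",
    )
    return parser


def resolve_prometheus_port(
    cli_port: Optional[int], config: MonitoringConfig
) -> Optional[int]:
    """Return the port to expose metrics on, or None when the exporter is disabled."""
    port = config.prometheus_port if cli_port is None else cli_port
    return None if port == 0 else port
```

Keeping the CLI default as None rather than 9091 lets the config value win whenever the flag is omitted, while an explicit 0 still disables the exporter.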
Add concurrent Prometheus metrics update loop that exports test progress and throughput metrics every second during performance test runs. This fixes "No Data" panels in the Overview dashboard. Remove redundant P50 delta panels from the comparison dashboard and adjust grid positions for cleaner layout.
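A concurrent update loop of this kind could look like the sketch below. It assumes an asyncio-based test runner, which the commit message does not confirm. The RunProgress class, the metric names, and the publish callback are hypothetical; a plain dict stands in for Prometheus gauges so that the example needs only the standard library.

```python
from __future__ import annotations

import asyncio
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class RunProgress:
    # Hypothetical shared state that the test runner mutates as orders go out.
    orders_submitted: int = 0


async def metrics_update_loop(
    progress: RunProgress,
    publish: Callable[[str, float], None],
    stop: asyncio.Event,
    interval: float = 1.0,
) -> None:
    """Publish progress and throughput every `interval` seconds until stopped."""
    last_count = progress.orders_submitted
    last_time = time.monotonic()
    while True:
        try:
            # Wake up early if the test finishes before the interval elapses.
            await asyncio.wait_for(stop.wait(), timeout=interval)
        except asyncio.TimeoutError:
            pass
        now = time.monotonic()
        elapsed = max(now - last_time, 1e-9)  # guard against division by zero
        count = progress.orders_submitted
        publish("orders_submitted", float(count))
        publish("orders_per_second", (count - last_count) / elapsed)
        last_count, last_time = count, now
        if stop.is_set():
            break  # the final values were published just above


async def run_test_with_metrics(
    total_orders: int, interval: float = 1.0
) -> dict[str, float]:
    progress = RunProgress()
    published: dict[str, float] = {}
    stop = asyncio.Event()
    updater = asyncio.create_task(
        metrics_update_loop(progress, published.__setitem__, stop, interval)
    )
    for _ in range(total_orders):
        progress.orders_submitted += 1  # stand-in for submitting one order
        await asyncio.sleep(0)  # yield so the updater can run concurrently
    stop.set()
    await updater
    return published
```

In a real setup the publish callback would set Prometheus gauge values instead of writing to a dict. Dashboard panels that query those gauges show "No Data" until something updates them during the run, which is the gap this commit closes.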
- Create 7 core alerting rules (latency, error rate, throughput, resources, test execution)
- Enable rule_files in Prometheus configuration
- Add alerts volume mount in Docker Compose
- Add Grafana annotations to show firing alerts on dashboard
- Add container_memory_percent metric for CriticalMemoryUsage alert
- Add implementation plan: thoughts/plans/2026-02-13-cow-598-alerting-rules.md
- Add implementation notes to ticket file documenting scope decisions
- Update INDEX.md with plan entry and document cluster reference
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…aining-dashboards-resources-comparison COW-593 task 2 remaining dashboards resources comparison
5b5298a
into
jefferson/cow-614-cow-591-phase-2-extended-prometheus-metrics
10 checks passed
Summary
Add Grafana dashboards for real-time monitoring of performance tests, completing Task 1 of COW-593. This includes an Overview dashboard for test progress and order metrics, plus an API Performance dashboard for detailed endpoint analysis.
Changes
Grafana Dashboards
Overview Dashboard (configs/dashboards/performance.json):
API Performance Dashboard (configs/dashboards/api-performance.json):
Infrastructure
- configs/grafana-datasource.yml: Added explicit uid: prometheus for dashboard compatibility
- docker-compose.yml: Added volume mount for dashboard provisioning
Bug Fix
src/cow_performance/api/instrumented_client.py:
- upload_app_data_with_retry() method with exponential backoff
- get_open_order_count() delegation method
How to Test
Start the Docker services:
Access Grafana at http://localhost:3000 (admin/admin)
Navigate to Dashboards → Browse and verify:
Run a performance test to generate metrics and verify panels populate
Checklist
- poetry run pytest
- poetry run ruff check .
- poetry run mypy .
Breaking Changes
None
Related Issues