Add recover_status parser lustrefs_exporter#118
Codecov Report
❌ Patch coverage is
Additional details and impacted files

Coverage Diff

| | spoutn1k/EHT-1348-history-in-the-making | #118 | +/- |
|---|---|---|---|
| Coverage | 89.60% | 89.34% | -0.26% |
| Files | 44 | 44 | |
| Lines | 5375 | 5470 | +95 |
| Branches | 5375 | 5470 | +95 |
| Hits | 4816 | 4887 | +71 |
| Misses | 484 | 509 | +25 |
| Partials | 75 | 74 | -1 |
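The coverage percentages in the diff are consistent with hits divided by lines. A minimal sketch of that arithmetic, assuming this is how the figure is derived (the function name `coverage_percent` is hypothetical and not part of Codecov or this repository):

```rust
/// Hypothetical helper: coverage as a percentage of covered lines.
/// Assumes coverage = hits / lines, which matches the figures in the report.
pub fn coverage_percent(hits: u64, lines: u64) -> f64 {
    if lines == 0 {
        // Avoid dividing by zero for an empty code base.
        return 0.0;
    }
    hits as f64 / lines as f64 * 100.0
}

/// Hypothetical helper: change in coverage between base and head,
/// expressed in percentage points.
pub fn coverage_delta(base: (u64, u64), head: (u64, u64)) -> f64 {
    coverage_percent(head.0, head.1) - coverage_percent(base.0, base.1)
}
```

With the base values (4816 hits, 5375 lines) and the head values (4887 hits, 5470 lines), this gives roughly 89.60%, 89.34%, and a change of about -0.26 points.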
| Branch | breuhan/parsing_full_recovery_status |
| Testbed | ci-runner |
| Benchmark | Latency: Benchmark Result, nanoseconds (ns) (Result Δ%) | Lower Boundary, nanoseconds (ns) (Limit %) | Upper Boundary, nanoseconds (ns) (Limit %) |
|---|---|---|---|
| parse_benchmarks/combine_performance | 124,620,000.00 ns (-57.53%), Baseline: 293,399,285.71 ns | -764,455,415.43 ns (-613.43%) | 1,351,253,986.86 ns (9.22%) |
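The Result Δ% figure for the latency benchmark is consistent with the relative change of the result against its baseline. A minimal sketch of that calculation, assuming this interpretation (the function name `percent_change` is hypothetical and not part of Bencher):

```rust
/// Hypothetical helper: relative change of `result` against `baseline`,
/// in percent. Returns None when the baseline is zero, because the
/// ratio is undefined in that case.
pub fn percent_change(result: f64, baseline: f64) -> Option<f64> {
    if baseline == 0.0 {
        return None;
    }
    Some((result - baseline) / baseline * 100.0)
}
```

For a result of 124,620,000.00 ns against a baseline of 293,399,285.71 ns, this yields about -57.53%, matching the table.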
| Branch | breuhan/parsing_full_recovery_status |
| Testbed | ci-runner |
⚠️ WARNING: No Threshold found! Without a Threshold, no Alerts will ever be generated.
- LL Hits (hits)
- Estimated Cycles (cycles)
- I1mr (misses (reads))
- LL Hit Rate (hits (%))
- Total read+write (reads/writes)
- Dr (reads)
- RAM Hit Rate (hits (%))
- D1mw (misses (writes))
- L1 Hits (hits)
- Dw (writes)
- I1 Miss Rate (misses (%))
- DLmw (misses (writes))
- DLmr (misses (reads))
- ILmr (misses (reads))
- L1 Hit Rate (hits (%))
- D1mr (misses (reads))
- LL Miss Rate (misses (%))
- LLd Miss Rate (misses (%))
- LLi Miss Rate (misses (%))
- RAM Hits (hits)
- D1 Miss Rate (misses (%))
To only post results if a Threshold exists, set the `--ci-only-thresholds` flag.
Benchmark: lustre_metrics::memory_benches::bench_encode_lustre_metrics with_setup:generate_records()

| Measure | Units | Value |
|---|---|---|
| D1 Miss Rate | misses (%) | 0.93 % |
| D1mr | misses (reads) x 1e3 | 25.00 |
| D1mw | misses (writes) x 1e3 | 9.21 |
| DLmr | misses (reads) | 117.00 |
| DLmw | misses (writes) x 1e3 | 6.45 |
| Dr | reads x 1e6 | 2.47 |
| Dw | writes x 1e6 | 1.23 |
| Estimated Cycles | cycles x 1e6 | 14.81 |
| I1 Miss Rate | misses (%) | 0.01 % |
| I1mr | misses (reads) x 1e3 | 1.06 |
| ILmr | misses (reads) | 890.00 |
| Instructions: Benchmark Result (Result Δ%) | instructions x 1e6 | 10.75 (-23.14%), Baseline: 13.99 |
| Instructions: Lower Boundary (Limit %) | instructions x 1e6 | 2.43 (22.65%) |
| Instructions: Upper Boundary (Limit %) | instructions x 1e6 | 25.54 (42.09%) |
| L1 Hit Rate | hits (%) | 99.76 % |
| L1 Hits | hits x 1e6 | 14.41 |
| LL Hit Rate | hits (%) | 0.19 % |
| LL Hits | hits x 1e3 | 27.83 |
| LL Miss Rate | misses (%) | 0.05 % |
| LLd Miss Rate | misses (%) | 0.18 % |
| LLi Miss Rate | misses (%) | 0.01 % |
| RAM Hit Rate | hits (%) | 0.05 % |
| RAM Hits | hits x 1e3 | 7.46 |
| Total read+write | reads/writes x 1e6 | 14.44 |
9334a78 to 1478c07
1478c07 to 4daf4d8
Pull Request Overview
This PR enhances the recovery status parser to collect and export four additional metrics for Lustre filesystem recovery monitoring: completed clients, duration, time remaining, and total clients involved in recovery operations.
- Added support for parsing `recovery_duration`, `time_remaining`, and `total_clients` fields from recovery status output
- Introduced new Prometheus metrics for recovery duration, time remaining, and total client counts
- Updated test fixtures and snapshots to validate the new metrics extraction
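As an illustration of the field extraction described above, here is a minimal sketch using only the standard library. It assumes the recovery status output consists of `key: value` lines with integer values for the three fields; the struct `RecoveryFields` and the function `parse_recovery_fields` are hypothetical names and do not reflect the actual parser in `recovery_status_parser.rs`.

```rust
/// Hypothetical container for the three fields named in this PR.
#[derive(Debug, Default, PartialEq)]
pub struct RecoveryFields {
    pub recovery_duration: Option<u64>,
    pub time_remaining: Option<u64>,
    pub total_clients: Option<u64>,
}

/// Scan `key: value` lines and keep only the three fields of interest.
/// Assumption: each field appears on its own line with an integer value.
/// Lines without a colon and unknown keys are skipped; a value that does
/// not parse as an integer leaves the field as None.
pub fn parse_recovery_fields(input: &str) -> RecoveryFields {
    let mut out = RecoveryFields::default();
    for line in input.lines() {
        let Some((key, value)) = line.split_once(':') else {
            continue;
        };
        let value = value.trim().parse::<u64>().ok();
        match key.trim() {
            "recovery_duration" => out.recovery_duration = value,
            "time_remaining" => out.time_remaining = value,
            "total_clients" => out.total_clients = value,
            _ => {}
        }
    }
    out
}
```

Each field is an `Option` because a given recovery status report may not contain all three keys at once.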
Reviewed Changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| lustre-collector/src/recovery_status_parser.rs | Extended parser to extract duration, time remaining, and total client metrics from recovery status |
| lustre-collector/src/types.rs | Added new TargetStats variants for the additional recovery metrics |
| lustre-collector/src/parser.rs | Integrated recovery status parser into main parsing flow |
| lustrefs-exporter/src/brw_stats.rs | Added Prometheus metric families and registration for the new recovery metrics |
| lustrefs-exporter/src/lib.rs | Added new metric names to the validation list and improved error handling |
| Test fixtures and snapshots | Updated test data and expected outputs to validate new metric extraction |
4daf4d8 to bcfff9a
enhance recovery status parser to include additional metrics:
- RecoveryDuration
- RecoveryTimeRemaining
- RecoveryTotalClients
bcfff9a to e42468a
...efs_exporter__tests__valid_fixture_lustre-2.14.0_ddn212__2.14.0_ddn212_recovery.txt.histsnap
johnsonw left a comment
A few comments. Also, since this PR updates lustre-collector as well, I believe we will need to update the version in EMF once this lands.
0240dbd to 7268720
a789529 to dbc3217
7268720 to d2519eb
Added capture of `recovery_status` to `lustrefs_exporter` and added these new metrics:
- RecoveryDuration
- RecoveryTimeRemaining
- RecoveryTotalClients
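To show how metrics like these might appear to a Prometheus scraper, here is a minimal sketch that renders one sample in the Prometheus text exposition format. The enum mirrors the three variant names listed above, but the metric names, help strings, and the `target` label are hypothetical and are not taken from the exporter's code.

```rust
/// Mirrors the three metric names from this PR; payloads are assumed
/// to be plain integers.
pub enum RecoveryMetric {
    RecoveryDuration(u64),
    RecoveryTimeRemaining(u64),
    RecoveryTotalClients(u64),
}

/// Render one metric as a gauge in the Prometheus text exposition format.
/// The metric names and help text below are illustrative assumptions.
/// Note: `target` is inserted verbatim, so it must not contain quotes,
/// backslashes, or newlines in this simplified sketch.
pub fn render(target: &str, metric: &RecoveryMetric) -> String {
    let (name, help, value) = match metric {
        RecoveryMetric::RecoveryDuration(v) => (
            "example_recovery_duration_seconds",
            "Duration of the last recovery",
            *v,
        ),
        RecoveryMetric::RecoveryTimeRemaining(v) => (
            "example_recovery_time_remaining_seconds",
            "Time remaining in the current recovery",
            *v,
        ),
        RecoveryMetric::RecoveryTotalClients(v) => (
            "example_recovery_total_clients",
            "Total clients involved in recovery",
            *v,
        ),
    };
    format!(
        "# HELP {name} {help}\n# TYPE {name} gauge\n{name}{{target=\"{target}\"}} {value}\n"
    )
}
```

A gauge is used here because all three values can go up or down between scrapes; the exporter itself may register its metric families differently.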
Demo:
Input:
Output: