chore: Add data from auto-collector pipeline 46210829 (h100_sxm_sglang_0.5.9) #597
dynamo-ops wants to merge 1 commit into main
Signed-off-by: dynamo-ops <170655669+dynamo-ops@users.noreply.github.com>
Walkthrough
Seven new Git LFS pointer files are added to track large binary performance metric assets in the H100 SXM SGLang 0.5.9 directory. Each pointer contains standard Git LFS metadata (version, oid, size) without introducing code or logic changes.
Estimated code review effort
🎯 1 (Trivial) | ⏱️ ~3 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/aiconfigurator/systems/data/h100_sxm/sglang/0.5.9/gemm_perf.txt`:
- Around line 1-3: Block ingestion of the gemm_perf.txt artifact until collector
errors are resolved: stop persisting
src/aiconfigurator/systems/data/h100_sxm/sglang/0.5.9/gemm_perf.txt when the run
metadata reports errors (currently 486 GEMM module errors out of 487); require
either a successful rerun with zero collector/GEMM errors or attach a validation
report that proves coverage and data integrity for gemm_perf.txt before allowing
ingestion; update the ingestion gating logic (the collector/GEMM validation
step) to check the run metadata error count and reject artifacts with non-zero
GEMM/collector errors, and surface a clear error message referencing the
offending gemm_perf.txt artifact when rejecting.
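The gating rule requested above could be sketched roughly as follows. This is only an illustration under stated assumptions: `RunMetadata`, `ArtifactRejected`, `check_artifact`, and the `validated` flag are hypothetical names, and the real run metadata format of the auto-collector is not shown in this PR.

```python
from dataclasses import dataclass


class ArtifactRejected(Exception):
    """Raised when a collected artifact must not be ingested."""


@dataclass(frozen=True)
class RunMetadata:
    # Hypothetical shape: error counts keyed by collector module name.
    errors_by_module: dict


def check_artifact(path: str, module: str, meta: RunMetadata,
                   validated: bool = False) -> None:
    """Reject an artifact whose collector module reported errors.

    `validated` stands in for an attached validation report that
    vouches for coverage and data integrity (an assumption here).
    """
    errors = meta.errors_by_module.get(module, 0)
    if errors > 0 and not validated:
        raise ArtifactRejected(
            f"{path}: {errors} {module} collector error(s) reported; "
            "rerun the collector or attach a validation report"
        )


# Example using the error count quoted in the review comment.
meta = RunMetadata(errors_by_module={"gemm": 486})
try:
    check_artifact("gemm_perf.txt", "gemm", meta)
except ArtifactRejected as exc:
    print(f"rejected: {exc}")
```

Keying the check on the module name lets the other six artifacts from the same run pass independently, which may or may not be the desired policy.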
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 730bf8a2-96d7-4519-9807-a0b6c976df22
📒 Files selected for processing (7)
src/aiconfigurator/systems/data/h100_sxm/sglang/0.5.9/context_attention_perf.txt
src/aiconfigurator/systems/data/h100_sxm/sglang/0.5.9/context_mla_perf.txt
src/aiconfigurator/systems/data/h100_sxm/sglang/0.5.9/gemm_perf.txt
src/aiconfigurator/systems/data/h100_sxm/sglang/0.5.9/generation_attention_perf.txt
src/aiconfigurator/systems/data/h100_sxm/sglang/0.5.9/generation_mla_perf.txt
src/aiconfigurator/systems/data/h100_sxm/sglang/0.5.9/mla_bmm_perf.txt
src/aiconfigurator/systems/data/h100_sxm/sglang/0.5.9/moe_perf.txt
version https://git-lfs.github.com/spec/v1
oid sha256:5ef3d903bda0116a2dbb99c3394e7a5763301e6197dcfd6f5d3af33c55f64517
size 9096207
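For reference, the three lines above are a complete Git LFS pointer: a spec version, a SHA-256 object id, and the size of the real file in bytes. A minimal sketch of reading and sanity-checking such a pointer is shown below; `parse_lfs_pointer` is a hypothetical helper, not part of this repository or of Git LFS itself.

```python
import re

# A sha256 oid is the algorithm name followed by 64 lowercase hex digits.
_OID_RE = re.compile(r"sha256:[0-9a-f]{64}")
_SPEC_V1 = "https://git-lfs.github.com/spec/v1"


def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into a dict of its fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>", separated by a single space.
        key, _, value = line.partition(" ")
        fields[key] = value
    if fields.get("version") != _SPEC_V1:
        raise ValueError("not a Git LFS v1 pointer")
    if not _OID_RE.fullmatch(fields.get("oid", "")):
        raise ValueError("missing or malformed sha256 oid")
    if not fields.get("size", "").isdigit():
        raise ValueError("missing or malformed size")
    fields["size"] = int(fields["size"])  # size is in bytes
    return fields


pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:5ef3d903bda0116a2dbb99c3394e7a5763301e6197dcfd6f5d3af33c55f64517
size 9096207
"""
info = parse_lfs_pointer(pointer)
print(info["size"])  # 9096207
```

A well-formed pointer only shows that the reference is valid; it says nothing about whether the roughly 9 MB file behind it contains complete data.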
Block ingest of this GEMM artifact until collector errors are resolved.
The pointer itself is valid, but this PR’s run metadata reports 486 GEMM module errors (out of 487 total). Shipping gemm_perf.txt from that run risks persisting incomplete/corrupted performance data. Please gate this update on a successful rerun (or attach a validation report proving coverage/quality for this artifact).
Do we actually expect most of the GEMM test cases to fail?
Error Summary for Auto-Collector Run
Collection summary for h100_sxm sglang:0.5.9
Error summary