Fix gpu test mcore inference utility for Mcore 0.14+ #401
base: main
Walkthrough
Introduces a StaticInferenceContext derived from InferenceWrapperConfig and routes inference via a context-aware GPTInferenceWrapper. Adjusts prep_model_for_inference call signature, sets materialize_only_last_token_logits=False, and keeps downstream steps unchanged. Separately, removes a version-based skip and unused imports in a GPU quantization test to run a test unconditionally.
Changes
Sequence Diagram(s)
sequenceDiagram
autonumber
actor T as Test
participant C as StaticInferenceContext
participant W as GPTInferenceWrapper
participant M as Model
Note over T,C: Build inference context from InferenceWrapperConfig
T->>C: create from InferenceWrapperConfig
C-->>T: inference_context
Note over T,W: Initialize wrapper with context
T->>W: init(inference_context)
Note over W,M: Preparation phase
T->>W: prep_model_for_inference()
W->>M: configure(materialize_only_last_token_logits=false)
Note over W,M: Inference flow (unchanged steps, context-aware)
T->>W: prep_inference_input(...)
W->>W: get_batch_for_context_window(...)
W->>M: run_one_forward_step(...)
M-->>W: logits/output
W-->>T: broadcast_from_last_pipeline_stage(...)
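The flow in the diagram above can be sketched with small stand-in classes. This is only an illustration under stated assumptions: the class names mirror those in the walkthrough, but the fields, constructor signatures, and `ToyModel` are hypothetical and are not the Megatron-Core API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InferenceWrapperConfig:
    """Stand-in config; the two fields are hypothetical."""
    max_batch_size: int
    max_sequence_length: int


@dataclass
class StaticInferenceContext:
    """Stand-in context holding static limits and the logits flag."""
    max_batch_size: int
    max_sequence_length: int
    materialize_only_last_token_logits: bool = True

    @classmethod
    def from_config(cls, config: InferenceWrapperConfig) -> "StaticInferenceContext":
        # Build the context from the wrapper config, as in the first diagram step.
        return cls(config.max_batch_size, config.max_sequence_length)


class ToyModel:
    """Hypothetical model: one fake logit per input token."""

    def forward(self, tokens: List[int]) -> List[float]:
        return [0.5 * t for t in tokens]


class GPTInferenceWrapper:
    """Stand-in wrapper that reads its settings from the context."""

    def __init__(self, model: ToyModel, inference_context: StaticInferenceContext):
        self.model = model
        self.inference_context = inference_context

    def prep_model_for_inference(self) -> None:
        # Takes no arguments in this sketch; state lives in the context.
        pass

    def run_one_forward_step(self, tokens: List[int]) -> List[float]:
        if len(tokens) > self.inference_context.max_sequence_length:
            raise ValueError("sequence exceeds the context's static limit")
        logits = self.model.forward(tokens)
        if self.inference_context.materialize_only_last_token_logits:
            return logits[-1:]  # keep only the final position
        return logits  # keep logits for every position


config = InferenceWrapperConfig(max_batch_size=1, max_sequence_length=8)
context = StaticInferenceContext.from_config(config)
wrapper = GPTInferenceWrapper(ToyModel(), context)
wrapper.prep_model_for_inference()

# The test utility wants logits for every token, so the flag is turned off.
context.materialize_only_last_token_logits = False
all_logits = wrapper.run_one_forward_step([2, 4, 6])

context.materialize_only_last_token_logits = True
last_logits = wrapper.run_one_forward_step([2, 4, 6])
```

In this sketch, turning the flag off yields one logit per input token, which is presumably why a test utility that compares full outputs would set materialize_only_last_token_logits=False.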
Estimated code review effort
🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches
✅ Passed checks (3 passed)
📜 Recent review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
💤 Files with no reviewable changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
🔇 Additional comments (3)
Signed-off-by: Keval Morabia <[email protected]>
2f2e9a8 to 1f0080a
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #401      +/-   ##
==========================================
- Coverage   73.79%   73.79%   -0.01%
==========================================
  Files         171      171
  Lines       17591    17591
==========================================
- Hits        12982    12981       -1
- Misses       4609     4610       +1
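The report shows the same 73.79% on both sides yet a -0.01% delta. A small arithmetic check of the numbers above suggests how that can happen, assuming the displayed percentages are truncated to two decimals while the delta is rounded. That display rule is a guess made for this illustration, not something the report states.

```python
import math


def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage of covered lines."""
    return 100.0 * hits / lines


def truncate(value: float, places: int = 2) -> float:
    """Drop digits beyond `places` decimals (assumed display rule)."""
    factor = 10 ** places
    return math.floor(value * factor) / factor


lines = 17591
base_hits, base_misses = 12982, 4609  # main
head_hits, head_misses = 12981, 4610  # #401

base = coverage_percent(base_hits, lines)
head = coverage_percent(head_hits, lines)
delta = head - base

print(f"base={truncate(base)} head={truncate(head)} delta={round(delta, 2)}")
```

One line moving from hit to miss changes coverage by roughly 0.006 percentage points, which is too small to show in the truncated totals but enough to round to -0.01.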
What does this PR do?
Type of change: Mcore Test utility fix
Testing
Summary by CodeRabbit
Tests
Chores