Example: add offline eagle training commands to README #366
Conversation
Signed-off-by: h-guo18 <[email protected]>
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here.
Walkthrough

The README for examples/speculative_decoding was reorganized: “Complete Workflow” was replaced by separate online/offline draft-model training workflows, with new commands for training, validation, export, and deployment. It adds expanded contents, concrete Accelerate examples, hidden-state dumping for offline use, and deployment notes for TRT-LLM and SGLang.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    actor User
    participant DraftTrainer as Draft Trainer
    participant BaseModel as Base Model (Online)
    participant Validator as Validator (MT-Bench)
    participant Exporter as Checkpoint Export
    rect rgba(200,235,255,0.25)
    note over User,DraftTrainer: Online Training Flow
    User->>DraftTrainer: Launch accelerate training
    DraftTrainer->>BaseModel: Forward pass for hidden states
    DraftTrainer-->>User: Trained draft model checkpoint
    end
    rect rgba(220,255,220,0.25)
    note over User,Validator: Validation
    User->>Validator: Evaluate trained checkpoint
    Validator-->>User: MT-Bench metrics
    end
    rect rgba(255,240,200,0.25)
    note over User,Exporter: Export
    User->>Exporter: Export trained checkpoint
    Exporter-->>User: Deployment-ready artifacts
    end
```
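The online flow above boils down to a single Accelerate launch. A minimal sketch, not the PR's exact commands — the script name (`main.py`) and every flag except the dataset and launch mechanism are assumptions:

```bash
# Hypothetical sketch of the online training launch; script and flag names are assumed.
# The base model runs forward passes during training to supply hidden states on the fly.
accelerate launch main.py \
  --model $BASE_MODEL \                 # assumed: path to the base model
  --data Daring-Anteater/train.jsonl \  # dataset named in the review comments below
  --output_dir $DRAFT_CKPT_DIR          # trained draft-model checkpoint lands here
```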
```mermaid
sequenceDiagram
    autonumber
    actor User
    participant Dumper as Hidden State Dumper
    participant Storage as Offline Store
    participant DraftTrainer as Draft Trainer (Offline)
    participant Validator as Validator
    participant Exporter as Export
    rect rgba(200,235,255,0.25)
    note over User,Dumper: Offline Preparation
    User->>Dumper: Run base model to dump hidden states
    Dumper->>Storage: Save hidden states
    end
    rect rgba(220,255,220,0.25)
    note over User,DraftTrainer: Offline Training Flow
    User->>DraftTrainer: Train with --offline-data
    DraftTrainer->>Storage: Load hidden states
    DraftTrainer-->>User: Trained draft model checkpoint
    end
    rect rgba(255,240,200,0.25)
    note over User,Validator: Validation and Export
    User->>Validator: Evaluate checkpoint
    Validator-->>User: Results
    User->>Exporter: Export checkpoint
    Exporter-->>User: Artifacts for TRT-LLM / SGLang
    end
```
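The offline flow splits training into a dump step and a train step. A minimal sketch under stated assumptions — the dumper script name is hypothetical; only the `--offline-data`/`--data` flags and the dataset name come from this review:

```bash
# Step 1 (hypothetical script name): run the base model once and cache hidden states.
python dump_hidden_states.py \
  --model $BASE_MODEL \
  --data Daring-Anteater/train.jsonl \
  --output-dir $HIDDEN_STATES_DIR

# Step 2: train the draft model from the cache instead of live forward passes.
# --data must name the same dataset used for dumping (see the nitpick below);
# a mismatch trips ordering/length assertions when pairing samples with activations.
accelerate launch main.py \
  --data Daring-Anteater/train.jsonl \
  --offline-data $HIDDEN_STATES_DIR \
  --output_dir $DRAFT_CKPT_DIR
```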
Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Actionable comments posted: 0
🧹 Nitpick comments (1)
examples/speculative_decoding/README.md (1)
Lines 98-109: Clarify dataset alignment when using cached hidden states

When we pass cached hidden states, the trainer still streams the original JSONL so it can pair each example with the corresponding hidden-state file. If `--data` points at a different file (or even a differently ordered clone), training dies with shape/length assertions. Adding a one-line reminder here will save readers a painful debugging roundtrip.

Apply this diff to add the clarification:
```diff
 Then, train draft model with `--offline-data` argument:
 @@
     --offline-data $HIDDEN_STATES_DIR
+> Note: Ensure the path supplied via `--data` matches the dataset you used when dumping hidden states (for example, `Daring-Anteater/train.jsonl`). A mismatch will cause ordering/length errors when the trainer pairs samples with cached activations.
```

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits: Reviewing files that changed from the base of the PR and between 26c203abde6ea430dbb84e1f13e5673cd86a15bd and 4eda3e240cceae20356093ad4462491f1d102a45.

📒 Files selected for processing (1)

* examples/speculative_decoding/README.md (6 hunks)

⏰ Context from checks skipped due to timeout of 90000ms (2). You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms).

* GitHub Check: linux
* GitHub Check: code-quality
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

```
@@           Coverage Diff           @@
##             main     #366   +/-   ##
=======================================
  Coverage   73.46%   73.46%
=======================================
  Files         172      172
  Lines       17640    17640
=======================================
  Hits        12959    12959
  Misses       4681     4681
```

☔ View full report in Codecov by Sentry.
Signed-off-by: h-guo18 <[email protected]>
Signed-off-by: Ye Yu <[email protected]>
What does this PR do?
Type of change: new example
Overview:
Usage
# Add a code snippet demonstrating how to use this
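As a hedged stand-in for the template placeholder above, here is a sketch of the offline command this PR documents in the README — the `--offline-data`/`--data` flags appear in the review comments; the script name is an assumption:

```bash
# Offline EAGLE training (sketch; script name assumed): train the draft model
# against previously dumped hidden states rather than a live base model.
accelerate launch main.py \
  --data Daring-Anteater/train.jsonl \
  --offline-data $HIDDEN_STATES_DIR
```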
Testing
Before your PR is "Ready for review"
Additional Information
Summary by CodeRabbit