ci/gpu: setting oldest dependencies #20939
⚡ Required checks status: All passing 🟢
Groups summary
🟢 pytorch_lightning: Tests workflow
🟢 pytorch_lightning: Azure GPU
🟢 pytorch_lightning: Benchmarks
🟢 fabric: Docs
🟢 pytorch_lightning: Docs
🟢 pytorch_lightning: Docker
🟢 lightning_fabric: CPU workflow
🟢 lightning_fabric: Azure GPU
🟢 mypy
🟢 install
Thank you for your contribution! 💜
This reverts commit c362ab6.
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@           Coverage Diff           @@
##           master   #20939   +/-   ##
=======================================
- Coverage      87%      86%    -1%
=======================================
  Files         268      268
  Lines       23453    23453
=======================================
- Hits        20404    20273   -131
- Misses       3049     3180   +131
This reverts commit fb3640c.
  # note: is a bug around 0.10 with `MPS_Accelerator must implement all abstract methods`
  # shall be resolved by https://github.com/microsoft/DeepSpeed/issues/4372
- deepspeed >=0.8.2, <=0.9.3; platform_system != "Windows" and platform_system != "Darwin" # strict
+ deepspeed >=0.9.3, <=0.9.3; platform_system != "Windows" and platform_system != "Darwin" # strict
Dropping the 0.8 lower bound, since that version would need to be compiled from source.
Also noting that we are quite far behind the latest release, 0.17 🤔
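To make the effect of the new bounds concrete, here is a minimal sketch that checks which versions each range admits. It uses only the standard library and plain numeric version strings; the helper names (`parse_version`, `in_range`, `marker_applies`) and the candidate list are illustrative assumptions, not part of pip or of this repository.

```python
import platform


def parse_version(text):
    """Turn '0.9.3' into (0, 9, 3); handles plain numeric releases only."""
    return tuple(int(part) for part in text.split("."))


def in_range(version, lower, upper):
    """True if lower <= version <= upper, both bounds inclusive."""
    return parse_version(lower) <= parse_version(version) <= parse_version(upper)


def marker_applies(system=None):
    """Mirror `platform_system != "Windows" and platform_system != "Darwin"`."""
    system = platform.system() if system is None else system
    return system not in ("Windows", "Darwin")


# Hypothetical candidate versions, chosen only to exercise both ranges.
candidates = ["0.8.2", "0.9.0", "0.9.3", "0.10.0"]

old_pin = [v for v in candidates if in_range(v, "0.8.2", "0.9.3")]
new_pin = [v for v in candidates if in_range(v, "0.9.3", "0.9.3")]

print("old range admits:", old_pin)
print("new range admits:", new_pin)
print("requirement active on this platform:", marker_applies())
```

With equal lower and upper bounds the range admits exactly one version, so `>=0.9.3, <=0.9.3` behaves like `==0.9.3`, and the requirement is skipped entirely on Windows and macOS.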
Why Upgrade?
Upgrading to v0.17 delivers significant performance, stability, and integration benefits, which matter when training larger models efficiently and reliably.
- ZeRO Optimizations:
  • v0.9: Early experiments in partitioning model states for memory savings.
  • v0.17: Advanced refinements (ZeRO-Offload and improved stage 3) enable training massively scaled models.
- Performance Enhancements:
  • Upgraded distributed communication and fused operations.
  • Better mixed precision (fp16/bf16) support for faster training and efficient hardware usage.
- Stability & API Maturation:
  • Streamlined configuration, enhanced documentation, and robust testing.
  • Fewer bugs and smoother integration with frameworks like HuggingFace Transformers.
- Inference Improvements:
  • Expanded inference API with support for quantization.
  • Optimized runtime strategies for production deployment.
- Ecosystem Integration:
  • Broader compatibility with modern AI tools and libraries.
  • Simplifies building and deploying complex deep learning workflows.
* ci/gpu: setting oldest dependencies
* pip install "cython<3.0"
* deepspeed ==0.9.3
* typing-extensions >=4.5.0
* PyYAML >5.4
* torchmetrics >0.7.0
* lightning-utilities >=0.10.0
(cherry picked from commit a651975)
What does this PR do?
Brings the GPU testing environment in line with the CPU one, so that GPU CI also runs against the oldest supported dependency versions.
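As a rough illustration of what "setting oldest dependencies" can mean in practice, the sketch below rewrites a requirement line so that its inclusive lower bound becomes an exact pin. This is an assumption about the general technique, not the script this repository's CI actually runs; `pin_oldest` and `LOWER_BOUND` are hypothetical names.

```python
import re

# Matches an inclusive lower bound such as ">=4.5.0" (hypothetical helper pattern).
LOWER_BOUND = re.compile(r">=\s*([0-9][0-9A-Za-z.]*)")


def pin_oldest(line):
    """Rewrite 'pkg >=X, <Y' as 'pkg ==X', keeping any marker and comment.

    Lines without an inclusive lower bound are returned unchanged, because
    an exclusive bound such as '>5.4' does not name an installable version.
    """
    body, hash_mark, comment = line.partition("#")
    spec, semicolon, marker = body.partition(";")
    match = LOWER_BOUND.search(spec)
    if match is None:
        return line
    name = re.split(r"[<>=!~]", spec, maxsplit=1)[0].strip()
    pinned = f"{name} =={match.group(1)}"
    if semicolon:
        pinned += "; " + marker.strip()
    if hash_mark:
        pinned += "  # " + comment.strip()
    return pinned


requirements = [
    "typing-extensions >=4.5.0",
    "lightning-utilities >=0.10.0",
    "PyYAML >5.4",
    'deepspeed >=0.9.3, <=0.9.3; platform_system != "Windows" and platform_system != "Darwin" # strict',
]

for requirement in requirements:
    print(pin_oldest(requirement))
```

Installing from the rewritten lines exercises the lowest versions the project claims to support, which is the point of running the same setup on both CPU and GPU jobs.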
Before submitting
PR review
Anyone in the community is welcome to review the PR.
Before you start reviewing, make sure you have read the review guidelines. In short, see the following bullet-list:
Reviewer checklist