MLP storage advice #145
preview available: https://docs.tds.cscs.ch/145
This page provides information about file systems that is generic to all.
This information is MLP-specific (or ML-workflow-specific), so it doesn't quite fit in here.
I would suggest creating an "ML Storage" guide under the Guides section.
You can then link to it from the storage docs for MLP: https://docs.cscs.ch/platforms/mlp/#file-systems-and-storage
Could you explain in more detail why you think this does not fit? This is MLP-specific advice on the MLP page. The previous version of this page already gave some information about the scratch spaces and their usage; this change simply updates that and provides a clearer description specific to the typical workflows of our users.
The proposed updates directly discuss best practices for where to store checkpoints and inputs for LLM training. Specifically, they also discuss Iopsstore, which is currently only for the use of MLP; suggesting that all users on the system should start using it would affect LLM training runs. More generally, the documentation on this page aims to explain what Scratch/Store/Home are, and the policies that apply to them. We need additional best-practices guides for storage. For the MLP you have such practices, and I think these can go in a page under the MLP, where they are easily accessible to MLP users. A more generic guide that covers topics like Lustre striping can go under the Guides section.
bcumming
left a comment
Sorry, I completely missed that these updates were made to the MLP storage docs.
Please ignore my earlier comments.
No description provided.