-
Notifications
You must be signed in to change notification settings - Fork 91
Support SkyPilot Storage configurations in file_mounts for automatic cloud sync
#335
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Hi @andylizf, thank you so much for your contribution. Could you fix the formatting check and add some unit tests for your change? Should be good to merge after that. |
- Modified SkypilotExecutor to handle both string paths and dict configs in file_mounts - Dictionary configs are automatically converted to sky.Storage objects - Enables automatic cloud storage mounting (GCS, S3, etc.) for outputs This change allows users to specify cloud storage backends directly in file_mounts, enabling automatic synchronization of training outputs to cloud storage without manual rsync operations. Signed-off-by: Andy Lee <[email protected]>
Signed-off-by: Andy Lee <[email protected]>
|
@hemildesai Formatted. Can you take a look again? |
|
Formatting looks good, can you add unit tests to https://github.com/NVIDIA-NeMo/Run/blob/main/test/core/execution/test_skypilot.py? |
- Test storage_mounts parameter initialization - Test to_task() method with storage_mounts configurations - Test combined file_mounts and storage_mounts usage - Verify Storage.from_yaml_config() integration - Ensure backward compatibility when storage_mounts is None Signed-off-by: Andy Lee <[email protected]>
Added. PTAL! |
Signed-off-by: Hemil Desai <[email protected]>
…c cloud sync (NVIDIA-NeMo#335) * fix: support for SkyPilot Storage configurations in file_mounts - Modified SkypilotExecutor to handle both string paths and dict configs in file_mounts - Dictionary configs are automatically converted to sky.Storage objects - Enables automatic cloud storage mounting (GCS, S3, etc.) for outputs This change allows users to specify cloud storage backends directly in file_mounts, enabling automatic synchronization of training outputs to cloud storage without manual rsync operations. Signed-off-by: Andy Lee <[email protected]> * refactor: Separate storage_mounts from file_mounts for cleaner API Signed-off-by: Andy Lee <[email protected]> * test: Add unit tests for storage_mounts functionality - Test storage_mounts parameter initialization - Test to_task() method with storage_mounts configurations - Test combined file_mounts and storage_mounts usage - Verify Storage.from_yaml_config() integration - Ensure backward compatibility when storage_mounts is None Signed-off-by: Andy Lee <[email protected]> * fix tests Signed-off-by: Hemil Desai <[email protected]> --------- Signed-off-by: Andy Lee <[email protected]> Signed-off-by: Hemil Desai <[email protected]> Co-authored-by: Hemil Desai <[email protected]> Signed-off-by: Zoey Zhang <[email protected]>
Problem
Currently,
SkypilotExecutoronly supports simple string-to-string mappings infile_mounts, which limits users to local file syncing. However, SkyPilot's YAML specification supports dictionary configurations for cloud storage mounts, enabling automatic synchronization with GCS, S3, and other cloud storage backends.Users cannot leverage SkyPilot's powerful storage mounting features when using NeMo-Run, forcing them to manually sync outputs after training completion.
Solution
This PR enhances
SkypilotExecutorto handle both string paths and dictionary configurations infile_mounts, matching SkyPilot's YAML capabilities:sky.Storageobjects usingStorage.from_yaml_config()task.set_file_mounts()for strings andtask.set_storage_mounts()for Storage objectsChanges
SkypilotExecutor.to_task()method to handle mixed file_mount typesfile_mountstype hint fromdict[str, str]todict[str, Any]to support both formatsUsage Example
Benefits
Testing