Deprecate prepare_data_per_node and auto detect shared file-system #12159

@rohitgr7

Proposed refactor

This was proposed by @tchaton on a live stream. The right value for prepare_data_per_node can be inferred by writing a file on rank 0 and checking whether every other rank can see that file. On a multi-node system, all ranks will see the file if the file system is shared; otherwise, ranks on other nodes will not.

Motivation

This way users won't have to take care of setting this flag manually.

Pitch

Possible pseudocode (feel free to update this or comment):

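filename = None
if global_rank == 0:
    filename = f'something_{uuid()}.txt'
    dump_file(filename)

# every rank has to take part in the broadcast, not only rank 0
filename = broadcast(filename, src=0)
barrier()

# every rank checks whether it can see the file written by rank 0
is_file_present = does_file_exist(filename)
is_file_present = all_gather(is_file_present)

# file visible on all ranks -> shared file system -> no per-node preparation needed
# (all_gather already leaves the same result on every rank, so no final broadcast)
prepare_data_per_node = not is_file_present.all()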

Any other alternatives?
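
For discussion, here is a minimal runnable sketch of the probe described above. It is not an existing Lightning API: the function name `needs_prepare_data_per_node`, the `probe_dir` argument, and the three injected callables are all hypothetical. The collectives are passed in so the sketch stays independent of any backend; in a real run they could be thin wrappers around `torch.distributed.broadcast_object_list`, `torch.distributed.all_gather_object`, and `torch.distributed.barrier`. The sketch assumes `probe_dir` resolves to the same path on every node.

```python
import os
import uuid
from typing import Callable, List, Optional


def needs_prepare_data_per_node(
    global_rank: int,
    probe_dir: str,
    broadcast: Callable[[Optional[str]], str],
    all_gather: Callable[[bool], List[bool]],
    barrier: Callable[[], None],
) -> bool:
    """Return True if at least one rank cannot see a file written by rank 0.

    All arguments are hypothetical stand-ins:
      broadcast(obj)    -> the object held by rank 0, on every rank
      all_gather(value) -> list with one value per rank, on every rank
      barrier()         -> blocks until every rank has reached it
    """
    path = None
    if global_rank == 0:
        # A unique name avoids clashes with earlier or concurrent runs.
        path = os.path.join(probe_dir, f".fs_probe_{uuid.uuid4().hex}")
        with open(path, "w") as f:
            f.write("probe")

    # Every rank calls broadcast, so all of them learn the probe path.
    path = broadcast(path)
    # Make sure rank 0 has finished writing before anyone checks.
    barrier()

    # Each rank reports whether the probe file is visible from its node.
    visible = all_gather(os.path.exists(path))

    # all_gather only returns once every rank has contributed its check,
    # so rank 0 can safely clean up here.
    if global_rank == 0:
        os.remove(path)

    # Visible everywhere -> shared file system -> prepare data only once.
    return not all(visible)
```

In a single-process run, identity collectives such as `lambda p: p`, `lambda v: [v]`, and `lambda: None` are enough to exercise the function. Whether the result should override a value the user set explicitly is a separate design question that this sketch leaves open.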

cc @Borda @justusschock @kaushikb11 @awaelchli @ananthsub @ninginthecloud @jjenniferdai @rohitgr7 @akihironitta

Labels: feature, trainer