Investigate an Armada integration with TorchX #198

@sync-by-unito

An Armada integration with TorchX would enable features such as elastic training.

TorchX looks like a very useful framework for distributed deep learning, and it currently has integrations with Slurm, K8s, Ray, etc.

It would be worth investigating whether an Armada integration is feasible and how much work it would involve.

https://pytorch.org/torchx/latest/

┆Issue is synchronized with this Jira Task by Unito
