This repository was archived by the owner on Feb 6, 2020. It is now read-only.

Another approach: use spot instances for persistent training on AWS #89

@xiuliren

Description

  • Create a spot fleet or a persistent spot request; either gives a persistent spot instance managed by AWS. Define the training command in the user_data of the spot instance.
  • Read all data and configuration files from S3, and save the network to S3. This can be implemented with boto3, which removes the complex StarCluster dependency.
  • Whenever the saved network file is updated, plot the learning curve online using Plotly.
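
The first two points could be sketched roughly as below. This is only an illustration under stated assumptions, not a tested deployment: the helper names (`build_user_data`, `build_spot_request`, `submit_spot_request`, `save_network`), the `/opt/training` directory, and the bucket layout are all hypothetical, and the AMI is assumed to ship the AWS CLI and to run with an IAM role that can read and write the bucket. The boto3 calls used are `request_spot_instances` and `upload_file`.

```python
import base64


def build_user_data(train_cmd, bucket, prefix):
    """Boot script: pull data and config from S3, then start training.

    The /opt/training path and the bucket layout are assumptions.
    """
    lines = [
        "#!/bin/bash",
        f"aws s3 sync s3://{bucket}/{prefix}/ /opt/training/",
        f"cd /opt/training && {train_cmd}",
    ]
    return "\n".join(lines) + "\n"


def build_spot_request(ami_id, instance_type, user_data, max_price=None):
    """Build keyword arguments for EC2 request_spot_instances.

    Type="persistent" keeps the request open, so it can be fulfilled
    again after the instance is interrupted.
    """
    params = {
        "InstanceCount": 1,
        "Type": "persistent",
        "LaunchSpecification": {
            "ImageId": ami_id,
            "InstanceType": instance_type,
            # The launch specification expects base64-encoded user data.
            "UserData": base64.b64encode(user_data.encode()).decode(),
        },
    }
    if max_price is not None:
        params["SpotPrice"] = str(max_price)
    return params


def submit_spot_request(params, region):
    """Send the request to AWS (needs boto3 and valid credentials)."""
    import boto3  # imported lazily so the builders above work without it

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.request_spot_instances(**params)


def save_network(local_path, bucket, key):
    """Upload a saved network file to S3 from inside the training loop."""
    import boto3

    boto3.client("s3").upload_file(local_path, bucket, key)
```

Because the request builders are kept separate from the AWS calls, the parameters can be inspected locally before anything is launched. Since a persistent request may restart the instance at any time, the training command would also need to resume from the latest network file in S3.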
