
Hypervisor multi process scheduling support AutoFreeze #319

@Code2Life


Summary

In hands-on lab scenarios, developers use the GPU only sporadically, but their programs hold GPU memory constantly. This makes VRAM usage inefficient, so idle processes should be cooled down to host memory or disk.

Motivation

Reduce VRAM consumption in low-frequency GPU usage cases, to increase the oversubscription ratio and reduce costs.

Tech Design

  1. TF Controller injects the hypervisor AutoFreezeAndResume configs into an env var.

  2. Hypervisor deserializes the config and passes it to the multi-process scheduling module.

  3. Hypervisor maintains a timer for each active process. If a process meets the autoFreeze criteria, the hypervisor calls the TF worker's ABI to trigger a suspend and moves the process to a queue that will not be woken.
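
The flow above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual hypervisor code: the env var name `TF_AUTO_FREEZE_CONFIG`, the JSON fields `enabled` and `idleSeconds`, the class `AutoFreezeScheduler`, and the `suspend` callback (a stand-in for the TF worker call) are all hypothetical, and the only freeze criterion modeled is idle time. Resume is not covered.

```python
import json
import time
from dataclasses import dataclass

# Hypothetical env var name; the issue does not specify the real one.
ENV_VAR = "TF_AUTO_FREEZE_CONFIG"


@dataclass
class AutoFreezeConfig:
    # Both fields are assumptions made for this sketch.
    enabled: bool = False
    idle_seconds: float = 300.0


def load_config(environ) -> AutoFreezeConfig:
    """Deserialize the config injected by the controller (assumed to be JSON)."""
    raw = environ.get(ENV_VAR)
    if not raw:
        return AutoFreezeConfig()
    data = json.loads(raw)
    return AutoFreezeConfig(
        enabled=bool(data.get("enabled", False)),
        idle_seconds=float(data.get("idleSeconds", 300.0)),
    )


class AutoFreezeScheduler:
    """Tracks last activity per process and freezes the ones idle for too long."""

    def __init__(self, config, suspend, clock=time.monotonic):
        self.config = config
        self.suspend = suspend   # stand-in for the worker call that triggers suspend
        self.clock = clock       # injectable so the logic can be tested without sleeping
        self.last_active = {}    # pid -> timestamp of last observed GPU activity
        self.frozen = set()      # processes the scheduler will no longer wake

    def record_activity(self, pid):
        if pid in self.frozen:
            return  # resuming a frozen process is out of scope here
        self.last_active[pid] = self.clock()

    def tick(self):
        """Freeze every active process whose idle time meets the threshold."""
        if not self.config.enabled:
            return []
        now = self.clock()
        idle = [
            pid for pid, ts in self.last_active.items()
            if now - ts >= self.config.idle_seconds
        ]
        for pid in idle:
            self.suspend(pid)
            del self.last_active[pid]
            self.frozen.add(pid)
        return idle
```

In this sketch, the scheduling loop would call `record_activity(pid)` whenever a process submits GPU work and call `tick()` periodically. A single last-activity timestamp per process stands in for the per-process timer, which avoids managing one OS timer per process.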
