Hypervisor multi-process scheduling support AutoFreeze #319

## Summary
In hands-on lab scenarios, developers use the GPU only sporadically, but their programs keep occupying it the whole time. This leads to inefficient VRAM usage, so idle workloads should be cooled down to host memory or disk.

## Motivation
Reduce VRAM consumption in low-frequency GPU usage cases, in order to increase the oversubscription ratio and lower costs.

## Tech Design
1. The TF Controller injects the hypervisor AutoFreezeAndResume config into an environment variable.

<img width="1211" height="539" alt="Image" src="https://github.com/user-attachments/assets/b612eb92-1044-4e45-a5ab-901181cd2497" />

2. The hypervisor deserializes the config and passes it to the multi-process scheduling module.

3. The hypervisor maintains a timer for each active process. If a process meets the autoFreeze criteria, the hypervisor calls the TF worker's ABI to trigger a suspend and moves the process to a queue that will not be woken up.
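
The three steps above could be sketched roughly as follows. This is a minimal illustration, not the actual hypervisor code. The environment variable name `TF_AUTO_FREEZE_CONFIG`, the JSON schema (`enabled`, `idle_seconds`), the idle-time criterion, and the `suspend` callback standing in for the TF worker ABI are all assumptions made for the example; the issue does not specify any of them.

```python
import json
import os
from dataclasses import dataclass, field
from typing import Callable, Dict, Set

# Hypothetical env var name and JSON schema; the issue does not define them.
CONFIG_ENV_VAR = "TF_AUTO_FREEZE_CONFIG"


@dataclass
class AutoFreezeConfig:
    enabled: bool = False
    idle_seconds: float = 300.0  # assumed criterion: idle time before freezing

    @classmethod
    def from_env(cls, environ=os.environ) -> "AutoFreezeConfig":
        """Step 2: deserialize the config injected by the controller."""
        raw = environ.get(CONFIG_ENV_VAR)
        if not raw:
            return cls()  # nothing injected: auto-freeze stays disabled
        data = json.loads(raw)
        return cls(
            enabled=bool(data.get("enabled", False)),
            idle_seconds=float(data.get("idle_seconds", 300.0)),
        )


@dataclass
class Scheduler:
    config: AutoFreezeConfig
    suspend: Callable[[int], None]  # stand-in for the TF worker suspend ABI
    last_active: Dict[int, float] = field(default_factory=dict)
    frozen: Set[int] = field(default_factory=set)  # queue that is not woken

    def record_activity(self, pid: int, now: float) -> None:
        """Reset the idle timer of an active (non-frozen) process."""
        if pid not in self.frozen:
            self.last_active[pid] = now

    def tick(self, now: float) -> None:
        """Step 3: freeze every active process that has been idle too long."""
        if not self.config.enabled:
            return
        for pid, last in list(self.last_active.items()):
            if now - last >= self.config.idle_seconds:
                self.suspend(pid)
                del self.last_active[pid]
                self.frozen.add(pid)


if __name__ == "__main__":
    # Step 1 is simulated by passing the injected variable as a plain dict.
    env = {CONFIG_ENV_VAR: '{"enabled": true, "idle_seconds": 60}'}
    suspended = []
    sched = Scheduler(AutoFreezeConfig.from_env(env), suspended.append)
    sched.record_activity(101, now=0.0)
    sched.record_activity(102, now=50.0)
    sched.tick(now=70.0)
    print(suspended, sorted(sched.frozen))  # [101] [101]
```

Timestamps are passed in explicitly so the freeze decision is easy to test. Resume is left out of the sketch because the issue does not describe what triggers it.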



