OpenNebula FabricManager - Automatically recover last configured partitions (Stateful appliance) #7385

@josepselga

Description

This feature aims to make the OpenNebula FabricManager Appliance stateful with respect to GPU partitioning. On initialization or recovery, the FabricManager must automatically detect and re-apply the last successfully configured set of GPU partitions across the NVSwitches. This ensures workload continuity and removes the need for manual intervention to restore the correct topology after a power cycle or crash.
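One possible shape for the persistence logic, as a minimal sketch rather than a proposed implementation: the appliance records the partition IDs after each successful activation and replays them at startup. Everything named here is an assumption made for illustration: the JSON state file layout, the function names, and the `activate` callback, which stands in for whatever mechanism the appliance actually uses to activate a partition. No real Fabric Manager API is called.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Callable, Iterable, List


def save_partitions(state_file: Path, partition_ids: Iterable[int]) -> None:
    """Atomically record the partition IDs that were last activated successfully."""
    state_file.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file in the same directory, then rename it over the
    # target, so a crash mid-write never leaves a truncated state file behind.
    fd, tmp_name = tempfile.mkstemp(dir=state_file.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as tmp:
        json.dump({"partitions": sorted(set(partition_ids))}, tmp)
        tmp.flush()
        os.fsync(tmp.fileno())
    os.replace(tmp_name, state_file)


def load_partitions(state_file: Path) -> List[int]:
    """Return the saved partition IDs, or an empty list if the state is missing or unreadable."""
    try:
        data = json.loads(state_file.read_text())
        return [int(p) for p in data["partitions"]]
    except (FileNotFoundError, json.JSONDecodeError, KeyError, TypeError, ValueError):
        return []


def restore_partitions(state_file: Path, activate: Callable[[int], None]) -> List[int]:
    """Re-apply every saved partition and return the IDs that were restored.

    `activate` is a hypothetical callback supplied by the caller; it should
    raise an exception if the partition cannot be activated.
    """
    restored = []
    for partition_id in load_partitions(state_file):
        try:
            activate(partition_id)
        except Exception as exc:
            # One failing partition should not block the remaining ones.
            print(f"could not restore partition {partition_id}: {exc}")
            continue
        restored.append(partition_id)
    return restored
```

In this sketch, `save_partitions` would be called after every successful partition change and `restore_partitions` once during appliance initialization. Writing only after success is what keeps the file equal to the "last successfully configured" set described above.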
