Requires a way to control segment setting for NVL72 system #1421

@youngeunkwon0405

Description

Is your feature request related to a problem? Please describe.

In a non-colocated setup on an NVL72 system, we might want each virtual cluster's GPUs to be assigned to the same rack. We can pass --segment in the Slurm command, but that only applies to the global GPU allocation. Within that global allocation, we still need a way to control which GPUs are assigned to Training and which to Generation, taking segment boundaries into account.
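
A minimal sketch of the kind of segment-aware split this request is asking for, not an existing API. It assumes the allocated node list is already ordered so that each consecutive chunk of `segment_size` nodes sits in the same segment (rack). The function name `split_by_segment` and its parameters are hypothetical.

```python
from typing import Dict, List


def split_by_segment(
    nodes: List[str], segment_size: int, train_segments: int
) -> Dict[str, List[str]]:
    """Split an ordered node list into training/generation groups on segment boundaries.

    Assumption (hypothetical helper): consecutive chunks of `segment_size`
    nodes belong to the same segment, e.g. the same NVL72 rack.
    """
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    if len(nodes) % segment_size != 0:
        raise ValueError("node count must be a multiple of segment_size")

    # Chunk the allocation into whole segments.
    segments = [
        nodes[i:i + segment_size] for i in range(0, len(nodes), segment_size)
    ]
    if not 0 < train_segments < len(segments):
        raise ValueError("train_segments must leave at least one segment per group")

    # Whole segments go to one group or the other; no segment is shared.
    training = [n for seg in segments[:train_segments] for n in seg]
    generation = [n for seg in segments[train_segments:] for n in seg]
    return {"training": training, "generation": generation}


# Illustrative node names only; real hostnames would come from the Slurm allocation.
nodes = [f"node{i:02d}" for i in range(8)]
groups = split_by_segment(nodes, segment_size=4, train_segments=1)
```

With this split, neither virtual cluster straddles a segment boundary. Whether the node ordering assumption actually holds depends on how the scheduler reports the allocation.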

Metadata

Labels

Performance (related to improving performance), enhancement (new feature or request)
