[Colocation Enhancement] Support configurable node monitor parameters #4911

@JesseStutler

Description

What is the problem you're trying to solve

In the Volcano colocation scenario, when the usage of a resource (such as CPU) exceeds its threshold, BE pods are evicted only after the threshold has been exceeded in six consecutive checks, with one check every 10 seconds. Both values are currently hard-coded; we need to make them configurable for users.
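To make the behaviour described above concrete, here is a minimal sketch of a consecutive-overuse check in which the count limit is a field rather than a constant. The type `overuseDetector`, its fields, and the 80% threshold are hypothetical names and values chosen for illustration; this is not the code in node_monitor.go.

```go
package main

import "fmt"

// overuseDetector is a hypothetical helper, not Volcano's actual type.
type overuseDetector struct {
	threshold int // usage percentage above which a sample counts as high (assumed value)
	limit     int // consecutive high samples required before eviction
	highCount int // current run of consecutive high samples
}

// observe records one usage sample and reports whether eviction should be triggered.
func (d *overuseDetector) observe(usage int) bool {
	if usage <= d.threshold {
		d.highCount = 0 // any sample at or below the threshold breaks the run
		return false
	}
	d.highCount++
	if d.highCount < d.limit {
		return false
	}
	d.highCount = 0 // start a new window after triggering
	return true
}

func main() {
	d := &overuseDetector{threshold: 80, limit: 6}
	// The third sample (70) resets the run, so only the last sample triggers.
	for i, usage := range []int{85, 90, 70, 85, 88, 91, 95, 86, 99} {
		fmt.Printf("sample %d usage=%d%% evict=%v\n", i, usage, d.observe(usage))
	}
}
```

With a 10-second check interval and a limit of 6, eviction under this sketch starts no earlier than about 50 seconds after the first high sample, which is why both values matter to users tuning responsiveness.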

The highUsageCountLimit constant is currently hard-coded:
https://github.com/volcano-sh/volcano/blob/7b14346ab7a7c46814da3951a21ee3cad9ccf5c0/pkg/agent/events/probes/nodemonitor/node_monitor.go#L201C47-L201C66
The intervals of the utilization monitoring and detect loops are hard-coded as well:

go wait.Until(m.utilizationMonitoring, 10*time.Second, stop)
go wait.Until(m.detect, 10*time.Second, stop)

Describe the solution you'd like

These values should be configurable. The current values can remain the defaults: a 10-second interval and a highUsageCountLimit of 6.

Additional context

No response

Documentation Updates

  • This feature requires design or user documentation changes.
  • If documentation changes are required, I will ensure the relevant documents are updated and published to the Volcano official website (https://volcano.sh) via the volcano-sh/website repository.

Metadata

Labels

  • area/colocation: issues or PRs related to colocation features
  • good first issue: Denotes an issue ready for a new contributor, according to the "help wanted" guidelines.
  • help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
  • kind/feature: Categorizes issue or PR as related to a new feature.
