Skip to content

Conversation

@irar2
Copy link
Collaborator

@irar2 irar2 commented Aug 26, 2025

fixes #145

This PRs changes the way the metrics of number of running and waiting requests are updated. The current way of atomic increments and a separate Prometheus set is causing the metrics to be inconsistent. Therefore, a use of channels is added.

Also, this PR adds metrics tests, and reorganizes simulator tests a bit.

@irar2 irar2 requested a review from mayabar August 26, 2025 05:36

// reportWaitingRequests sets information about waiting completion requests
func (s *VllmSimulator) reportWaitingRequests() {
func (s *VllmSimulator) reportWaitingRequests(nWaitingReqs int64) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to pass the nWaitingReqs, the s.nWaitingReqs has an updated value


// reportRunningRequests sets information about running completion requests
func (s *VllmSimulator) reportRunningRequests() {
func (s *VllmSimulator) reportRunningRequests(nRunningReqs int64) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to pass the nRunningReqs, the s. nRunningReqs has an updated value

// waitingReqChan is a channel to update nWaitingReqs
waitingReqChan chan int64
// loraInfo is prometheus gauge
loraInfo *prometheus.GaugeVec
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we need a channel for loraInfos? please create an issue for this

kvcacheHelper: nil, // kvcache helper will be created only if required after reading configuration
namespace: os.Getenv(podNsEnv),
pod: os.Getenv(podNameEnv),
runReqChan: make(chan int64, 1000),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we can define a constant for metrics channels size or even expose it in the configuration?

case inc := <-s.waitingReqChan:
s.nWaitingReqs += inc
s.reportWaitingRequests(s.nWaitingReqs)
case inc := <-s.runReqChan:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please create a separate loop for each metric

Signed-off-by: Ira <[email protected]>
@mayabar
Copy link
Collaborator

mayabar commented Aug 26, 2025

/lgtm

/approve

@github-actions github-actions bot added the lgtm label Aug 26, 2025
@mayabar
Copy link
Collaborator

mayabar commented Aug 26, 2025

/lgtm
/approve

@irar2 irar2 merged commit 57657bf into llm-d:main Aug 26, 2025
4 checks passed
@irar2 irar2 deleted the metrics branch August 26, 2025 08:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add metrics tests

2 participants