vllm - Add initial set of metrics #7285
Open: rzabarazesh wants to merge 4 commits into pytorch:main from rzabarazesh:vllm-ci-metrics
+2,262
−117
Changes from 1 commit
16 changes: 16 additions & 0 deletions
torchci/clickhouse_queries/vllm/ci_run_duration/params.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
{ | ||
"params": { | ||
"repo": "String", | ||
"pipelineName": "String", | ||
"startTime": "DateTime64(3)", | ||
"stopTime": "DateTime64(3)" | ||
}, | ||
"tests": [ | ||
{ | ||
"repo": "vllm-project/vllm", | ||
"pipelineName": "CI", | ||
"startTime": "2025-09-26T00:00:00.000", | ||
"stopTime": "2025-10-03T00:00:00.000" | ||
} | ||
] | ||
} |
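Reviewer note: a minimal Python sketch, not part of this PR, of a consistency check one could run against a params.json like the one above. It assumes (the diff does not state this) that every entry under `tests` should supply exactly the parameters declared under `params`, and that `DateTime64(3)` values are written as ISO-8601 strings with millisecond precision. The names `PARAMS_JSON` and `check_params` are hypothetical.

```python
import json
from datetime import datetime

# Copy of the params.json shown in the diff above.
PARAMS_JSON = """
{
  "params": {
    "repo": "String",
    "pipelineName": "String",
    "startTime": "DateTime64(3)",
    "stopTime": "DateTime64(3)"
  },
  "tests": [
    {
      "repo": "vllm-project/vllm",
      "pipelineName": "CI",
      "startTime": "2025-09-26T00:00:00.000",
      "stopTime": "2025-10-03T00:00:00.000"
    }
  ]
}
"""


def check_params(doc):
    """Return a list of problems; an empty list means the file is consistent."""
    problems = []
    declared = doc["params"]
    for i, case in enumerate(doc["tests"]):
        # Assumption: test cases must match the declared parameter names exactly.
        missing = sorted(set(declared) - set(case))
        extra = sorted(set(case) - set(declared))
        if missing:
            problems.append(f"tests[{i}]: missing {missing}")
        if extra:
            problems.append(f"tests[{i}]: undeclared {extra}")

        # Parse every DateTime-typed value that is present.
        parsed = {}
        for name, ch_type in declared.items():
            if name in case and ch_type.startswith("DateTime"):
                try:
                    parsed[name] = datetime.fromisoformat(case[name])
                except ValueError:
                    problems.append(f"tests[{i}]: {name} is not an ISO timestamp")

        # The queries filter on a half-open window, so start must precede stop.
        if "startTime" in parsed and "stopTime" in parsed:
            if parsed["startTime"] >= parsed["stopTime"]:
                problems.append(f"tests[{i}]: startTime is not before stopTime")
    return problems
```

Calling `check_params(json.loads(PARAMS_JSON))` on the file as committed returns an empty list.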
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
-- vLLM CI run durations (Buildkite builds) | ||
-- Lists per-build durations based on build.started_at and build.finished_at | ||
|
||
WITH b AS ( | ||
SELECT | ||
tupleElement(pipeline, 'repository') AS repository, | ||
tupleElement(pipeline, 'name') AS pipeline_name, | ||
toUInt32(tupleElement(build, 'number')) AS build_number, | ||
tupleElement(build, 'started_at') AS build_started_at, | ||
tupleElement(build, 'finished_at') AS build_finished_at, | ||
tupleElement(build, 'state') AS build_state | ||
FROM vllm.vllm_buildkite_jobs | ||
WHERE | ||
tupleElement(pipeline, 'repository') = {repo: String } | ||
AND tupleElement(pipeline, 'name') = {pipelineName: String } | ||
AND tupleElement(build, 'started_at') IS NOT NULL | ||
AND tupleElement(build, 'finished_at') IS NOT NULL | ||
AND tupleElement(build, 'started_at') >= {startTime: DateTime64(3) } | ||
AND tupleElement(build, 'started_at') < {stopTime: DateTime64(3) } | ||
) | ||
|
||
SELECT | ||
pipeline_name, | ||
build_number, | ||
max(build_started_at) AS started_at, | ||
max(build_finished_at) AS finished_at, | ||
any(build_state) AS build_state, | ||
dateDiff('second', started_at, finished_at) AS duration_seconds, | ||
round(duration_seconds / 3600.0, 3) AS duration_hours | ||
FROM b | ||
GROUP BY pipeline_name, build_number | ||
ORDER BY started_at ASC |
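Reviewer note: the query above reads one row per Buildkite job, with build-level fields repeated on each row, and collapses them to one row per build. A rough Python equivalent, as a sketch only: the row dictionaries and the function name `build_durations` are hypothetical, and the subtraction is assumed to match `dateDiff('second', ...)` for whole-second timestamps.

```python
from collections import defaultdict
from datetime import datetime


def build_durations(job_rows):
    """Collapse per-job rows into per-build durations, ordered by start time."""
    groups = defaultdict(list)
    for row in job_rows:
        # Mirrors the IS NOT NULL filters in the WHERE clause.
        if row["started_at"] is None or row["finished_at"] is None:
            continue
        groups[(row["pipeline_name"], row["build_number"])].append(row)

    results = []
    for (pipeline_name, build_number), rows in groups.items():
        started_at = max(r["started_at"] for r in rows)
        finished_at = max(r["finished_at"] for r in rows)
        seconds = int((finished_at - started_at).total_seconds())
        results.append(
            {
                "pipeline_name": pipeline_name,
                "build_number": build_number,
                "started_at": started_at,
                "finished_at": finished_at,
                # any() in SQL picks an arbitrary row; the first one is used here.
                "build_state": rows[0]["build_state"],
                "duration_seconds": seconds,
                "duration_hours": round(seconds / 3600.0, 3),
            }
        )
    results.sort(key=lambda r: r["started_at"])
    return results


# Hypothetical sample: two jobs of build 101, one job of build 100.
SAMPLE_ROWS = [
    {"pipeline_name": "CI", "build_number": 101, "build_state": "passed",
     "started_at": datetime(2025, 9, 27, 10, 0, 0),
     "finished_at": datetime(2025, 9, 27, 11, 30, 0)},
    {"pipeline_name": "CI", "build_number": 101, "build_state": "passed",
     "started_at": datetime(2025, 9, 27, 10, 0, 0),
     "finished_at": datetime(2025, 9, 27, 11, 30, 0)},
    {"pipeline_name": "CI", "build_number": 100, "build_state": "failed",
     "started_at": datetime(2025, 9, 26, 8, 0, 0),
     "finished_at": datetime(2025, 9, 26, 9, 0, 0)},
]
```

With `SAMPLE_ROWS`, build 100 sorts first with a duration of 1.0 hours, and build 101 follows with 1.5 hours.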
14 changes: 14 additions & 0 deletions
torchci/clickhouse_queries/vllm/pr_cycle_time_breakdown/params.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
{ | ||
"params": { | ||
"repo": "String", | ||
"startTime": "DateTime64(3)", | ||
"stopTime": "DateTime64(3)" | ||
}, | ||
"tests": [ | ||
{ | ||
"repo": "vllm-project/vllm", | ||
"startTime": "2025-09-22T00:00:00.000", | ||
"stopTime": "2025-09-29T00:00:00.000" | ||
} | ||
] | ||
} |
182 changes: 182 additions & 0 deletions
torchci/clickhouse_queries/vllm/pr_cycle_time_breakdown/query.sql
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,182 @@ | ||
-- vLLM PR cycle time breakdown | ||
-- Computes P50 and P90 (hours) for: | ||
-- 1) Time to first (human) review: PR ready -> first human review | ||
-- 2) Time to approval: first human review -> first approval | ||
-- 3) Time in merge queue: first approval -> merge time | ||
-- Notes: | ||
-- - "Ready" is derived from the first time the 'ready' label was applied. | ||
-- - Reviews are excluded if state = 'DISMISSED' or if the reviewer looks like a bot. | ||
-- - Human review is approximated via author_association in an allowed set and reviewer != PR author. | ||
-- - Metrics consider PRs closed within the window [startTime, stopTime); closed_at is used as the merge time, so PRs closed without merging are also included. | ||
|
||
WITH prs AS ( | ||
SELECT | ||
number AS pr_number, | ||
user.login AS author, | ||
parseDateTimeBestEffort(created_at) AS created_at_ts, | ||
parseDateTimeBestEffort(closed_at) AS merged_at_ts | ||
FROM default.pull_request | ||
WHERE | ||
dynamoKey LIKE concat({repo: String }, '%') | ||
AND state = 'closed' | ||
AND closed_at != '' | ||
AND parseDateTimeBestEffort(closed_at) >= {startTime: DateTime64(3) } | ||
AND parseDateTimeBestEffort(closed_at) < {stopTime: DateTime64(3) } | ||
), | ||
|
||
ready_events AS ( | ||
SELECT | ||
ple.pr_number, | ||
minIf( | ||
ple.event_time, | ||
lowerUTF8(ple.label_name) = 'ready' AND ple.action = 'labeled' | ||
) AS first_ready_ts | ||
FROM default.pull_label_event ple | ||
WHERE | ||
ple.repo_name = {repo: String } | ||
GROUP BY ple.pr_number | ||
), | ||
|
||
reviews_raw AS ( | ||
SELECT | ||
toUInt32( | ||
extractGroups(review.'pull_request_url', 'pulls/([0-9]+)')[1] | ||
) AS pr_number, | ||
review.'user'.'login' AS reviewer, | ||
review.'state' AS state, | ||
review.'author_association' AS author_association, | ||
review.'submitted_at' AS submitted_at_ts | ||
FROM default.pull_request_review | ||
WHERE | ||
dynamoKey LIKE concat({repo: String }, '%') | ||
AND review.'submitted_at' IS NOT NULL | ||
), | ||
|
||
-- Filter to human reviews and exclude dismissed ones and bot reviewers | ||
human_reviews AS ( | ||
SELECT | ||
r.pr_number, | ||
r.reviewer, | ||
r.state, | ||
r.author_association, | ||
r.submitted_at_ts | ||
FROM reviews_raw r | ||
WHERE | ||
lowerUTF8(r.state) != 'dismissed' | ||
AND r.author_association IN ( | ||
'MEMBER', 'OWNER', 'COLLABORATOR', 'CONTRIBUTOR' | ||
) | ||
AND r.reviewer NOT LIKE '%[bot]' | ||
AND lowerUTF8(r.reviewer) NOT LIKE '%bot%' | ||
), | ||
|
||
first_human_review AS ( | ||
SELECT | ||
pr.pr_number, | ||
-- Define "first review" as the first non-approval human review (commented/changes_requested) by someone other than the PR author | ||
minIf( | ||
hr.submitted_at_ts, | ||
hr.reviewer != pr.author | ||
AND lowerUTF8(hr.state) IN ('commented', 'changes_requested') | ||
) AS first_review_ts | ||
FROM prs pr | ||
LEFT JOIN human_reviews hr ON pr.pr_number = hr.pr_number | ||
GROUP BY pr.pr_number | ||
), | ||
|
||
first_approval AS ( | ||
SELECT | ||
pr.pr_number, | ||
-- Only count approvals from maintainers (exclude contributor approvals) | ||
minIf( | ||
hr.submitted_at_ts, | ||
lowerUTF8(hr.state) = 'approved' | ||
AND hr.reviewer != pr.author | ||
AND hr.author_association IN ('MEMBER', 'OWNER', 'COLLABORATOR') | ||
) AS first_approval_ts | ||
FROM prs pr | ||
LEFT JOIN human_reviews hr ON pr.pr_number = hr.pr_number | ||
GROUP BY pr.pr_number | ||
), | ||
|
||
durations AS ( | ||
SELECT | ||
pr.pr_number, | ||
coalesce(re.first_ready_ts, pr.created_at_ts) AS ready_ts, | ||
fr.first_review_ts, | ||
fa.first_approval_ts, | ||
pr.merged_at_ts, | ||
-- Durations in hours | ||
if( | ||
fr.first_review_ts IS NULL | ||
OR fr.first_review_ts | ||
< coalesce(re.first_ready_ts, pr.created_at_ts), | ||
NULL, | ||
dateDiff( | ||
'second', | ||
coalesce(re.first_ready_ts, pr.created_at_ts), | ||
fr.first_review_ts | ||
) | ||
/ 3600.0 | ||
) AS time_to_first_review_hours, | ||
|
||
if( | ||
fa.first_approval_ts IS NULL | ||
OR fr.first_review_ts IS NULL | ||
OR fa.first_approval_ts < fr.first_review_ts, | ||
NULL, | ||
dateDiff('second', fr.first_review_ts, fa.first_approval_ts) | ||
/ 3600.0 | ||
) AS time_to_approval_hours, | ||
|
||
if( | ||
fa.first_approval_ts IS NULL | ||
OR pr.merged_at_ts < fa.first_approval_ts, | ||
NULL, | ||
dateDiff('second', fa.first_approval_ts, pr.merged_at_ts) / 3600.0 | ||
) AS time_in_merge_queue_hours | ||
FROM prs pr | ||
LEFT JOIN ready_events re ON pr.pr_number = re.pr_number | ||
LEFT JOIN first_human_review fr ON pr.pr_number = fr.pr_number | ||
LEFT JOIN first_approval fa ON pr.pr_number = fa.pr_number | ||
), | ||
|
||
filtered AS ( | ||
SELECT * | ||
FROM durations | ||
WHERE | ||
( | ||
time_to_first_review_hours IS NULL | ||
OR ( | ||
time_to_first_review_hours >= 0 | ||
AND time_to_first_review_hours < 24 * 30 | ||
) | ||
) | ||
AND ( | ||
time_to_approval_hours IS NULL | ||
OR ( | ||
time_to_approval_hours >= 0 AND time_to_approval_hours < 24 * 30 | ||
) | ||
) | ||
AND ( | ||
time_in_merge_queue_hours IS NULL | ||
OR ( | ||
time_in_merge_queue_hours >= 0 | ||
AND time_in_merge_queue_hours < 24 * 30 | ||
) | ||
) | ||
) | ||
|
||
SELECT | ||
round(quantile(0.5) (time_to_first_review_hours), 2) | ||
AS time_to_first_review_p50, | ||
round(quantile(0.9) (time_to_first_review_hours), 2) | ||
AS time_to_first_review_p90, | ||
round(quantile(0.5) (time_to_approval_hours), 2) AS time_to_approval_p50, | ||
round(quantile(0.9) (time_to_approval_hours), 2) AS time_to_approval_p90, | ||
round(quantile(0.5) (time_in_merge_queue_hours), 2) | ||
AS time_in_merge_queue_p50, | ||
round(quantile(0.9) (time_in_merge_queue_hours), 2) | ||
AS time_in_merge_queue_p90 | ||
FROM filtered | ||
-- Quantiles ignore NULLs implicitly; if a column is entirely NULL in window, result will be NULL |
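Reviewer note: the `durations`, `filtered`, and final SELECT steps above can be sketched in Python as follows. This is an illustration under assumptions, not the PR's implementation: `None` stands in for SQL NULL, the function names are hypothetical, and `percentile` uses plain linear interpolation, whereas ClickHouse's `quantile` is an approximate aggregate and may return slightly different values on large inputs.

```python
from datetime import datetime, timedelta

MAX_HOURS = 24 * 30  # same 30-day cap as the `filtered` CTE


def hours_between(start, end):
    """Hours from start to end, or None if either is missing or out of order."""
    if start is None or end is None or end < start:
        return None
    return (end - start).total_seconds() / 3600.0


def stage_durations(ready, first_review, first_approval, merged):
    """Three stage durations for one PR, mirroring the `durations` CTE."""
    return {
        "time_to_first_review_hours": hours_between(ready, first_review),
        "time_to_approval_hours": hours_between(first_review, first_approval),
        "time_in_merge_queue_hours": hours_between(first_approval, merged),
    }


def keep(durations):
    """Mirror `filtered`: drop the whole PR if any stage is outside [0, 30 days)."""
    return all(v is None or 0 <= v < MAX_HOURS for v in durations.values())


def percentile(values, q):
    """Linear-interpolation percentile that skips None; None if nothing is left."""
    xs = sorted(v for v in values if v is not None)
    if not xs:
        return None
    pos = q * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def summarize(prs):
    """P50 and P90 per stage over the PRs that survive the filter."""
    kept = [d for d in (stage_durations(*pr) for pr in prs) if keep(d)]
    summary = {}
    for stage in ("time_to_first_review", "time_to_approval", "time_in_merge_queue"):
        column = [d[stage + "_hours"] for d in kept]
        for label, q in (("p50", 0.5), ("p90", 0.9)):
            value = percentile(column, q)
            summary[f"{stage}_{label}"] = None if value is None else round(value, 2)
    return summary
```

A PR that is approved before it receives any non-approval review gets `None` for the first two stages but still contributes to the merge-queue stage, which matches the `if(...)` guards in the query.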