
add random concurrent workload template for inference-perf #635

Merged
namasl merged 2 commits into llm-d:main from huaxig:template on Feb 2, 2026

Conversation

@huaxig (Contributor) commented Jan 30, 2026

Introduces a new inference-perf workload profile for sanity testing with concurrent users.

The sanity_random_concurrent profile is designed to test the stability and performance of the llm-d stack under increasing load. It consists of 4 stages, starting with 1 concurrent user and scaling up to 8.
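For reference, a profile of this shape describes four load stages that roughly double the concurrency at each step. A minimal sketch of what such a template might contain, assuming inference-perf's stage-based load configuration; the field names and durations below are illustrative, not copied from the merged file:

```yaml
# Hypothetical sketch of a staged concurrent-user workload profile.
# Field names follow inference-perf's config style but are illustrative.
load:
  type: concurrent
  stages:
    - num_users: 1    # stage 1: single concurrent user
      duration: 60
    - num_users: 2    # stage 2
      duration: 60
    - num_users: 4    # stage 3
      duration: 60
    - num_users: 8    # stage 4: peak load for the sanity run
      duration: 60
data:
  type: random        # random prompts, per the profile name
```

Each stage runs for a fixed duration before the next one starts, so the run exercises the llm-d stack under progressively heavier concurrent load.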

@namasl (Collaborator) left a comment


Profiles prefixed with sanity_ tend to be small benchmarks that run quickly (sanity checks, used for tests and CI). Drop the prefix, then it looks good to me.

@maugustosilva (Collaborator)

+1 on @namasl comment

@huaxig (Contributor, Author) commented Feb 2, 2026

SG, renamed it to random_concurrent.yaml.in

@namasl namasl merged commit 2efad69 into llm-d:main Feb 2, 2026
8 checks passed
