Conversation

@maugustosilva
Collaborator

This configmap, `llm-d-benchmark-standup-parameters`, is accessible from the harness pod.

Additionally, take a list of all running pods at the end of a benchmark run (the goal is to detect any restarts of pods serving models during the benchmark).
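A minimal sketch of what such a restart check could look like, using canned data in place of a live `oc get pods` call (the pod names and `/tmp` paths here are hypothetical, purely for illustration):

```shell
# Hypothetical sketch: capture "pod restartCount" pairs before and after the
# benchmark, then report pods whose line changed to a non-zero restart count.
before='decode-abc 0
prefill-xyz 0'
after='decode-abc 1
prefill-xyz 0'

printf '%s\n' "$before" > /tmp/pods_before.txt
printf '%s\n' "$after"  > /tmp/pods_after.txt

# Keep lines present only in the "after" snapshot, then print the pod name
# for any of those lines whose restart count is non-zero.
restarted=$(grep -Fxv -f /tmp/pods_before.txt /tmp/pods_after.txt | awk '$2 != "0" {print $1}')
echo "$restarted"
```

On the canned data above this prints `decode-abc`, the one pod whose restart count changed.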

Finally, fix a bug in the (automatic) detection of admin privileges.

@maugustosilva maugustosilva force-pushed the cicd_and_pod_status_capture branch from f3bb68c to 4f274ed Compare January 28, 2026 21:57
@maugustosilva maugustosilva marked this pull request as ready for review January 28, 2026 21:57
@namasl
Collaborator

namasl commented Jan 28, 2026

This is useful, but some bits remain unresolved. For example, getting the args used to execute vllm is not straightforward:

```
$ oc get cm llm-d-benchmark-standup-parameters -o yaml | grep -i args
    base64_args: -w0
    vllm_modelservice_decode_extra_args: /tmp/tmp.yjfWhrk4pN
    vllm_modelservice_prefill_extra_args: /tmp/tmp.WYtBTiGjJt
    vllm_standalone_args: REPLACE_ENV_LLMDBENCH_VLLM_STANDALONE_PREPROCESS____;____vllm____serve____REPLACE_ENV_LLMDBENCH_DEPLOY_CURRENT_MODEL____--no-enable-prefix-caching____--load-format____REPLACE_ENV_LLMDBENCH_VLLM_COMMON_VLLM_LOAD_FORMAT____--port____REPLACE_ENV_LLMDBENCH_VLLM_COMMON_INFERENCE_PORT____--max-model-len____REPLACE_ENV_LLMDBENCH_VLLM_COMMON_MAX_MODEL_LEN____--disable-log-requests____--gpu-memory-utilization____REPLACE_ENV_LLMDBENCH_VLLM_COMMON_ACCELERATOR_MEM_UTIL____--tensor-parallel-size____REPLACE_ENV_LLMDBENCH_VLLM_COMMON_TENSOR_PARALLELISM____--model-loader-extra-config____"$LLMDBENCH_VLLM_COMMON_MODEL_LOADER_EXTRA_CONFIG"
    vllm_standalone_launcher_args: REPLACE_ENV_LLMDBENCH_VLLM_STANDALONE_PREPROCESS____;____uvicorn____launcher:app____--host____0.0.0.0____--log-level____info____--port____REPLACE_ENV_LLMDBENCH_VLLM_STANDALONE_LAUNCHER_PORT
```
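The `____`-delimited values above can at least be made readable with a little text substitution. A minimal sketch, where `raw` is a shortened excerpt of one value and rendering each `REPLACE_ENV_FOO` placeholder as `$FOO` is an assumption about the intended substitution:

```shell
# Shortened excerpt of a vllm_standalone_args-style value (not the full string).
raw='vllm____serve____REPLACE_ENV_LLMDBENCH_DEPLOY_CURRENT_MODEL____--port____REPLACE_ENV_LLMDBENCH_VLLM_COMMON_INFERENCE_PORT'

# Replace the "____" word separators with spaces, and show each
# REPLACE_ENV_FOO placeholder as the environment variable it stands for.
decoded=$(printf '%s' "$raw" | sed -e 's/____/ /g' -e 's/REPLACE_ENV_/$/g')
echo "$decoded"
```

On the excerpt above this prints `vllm serve $LLMDBENCH_DEPLOY_CURRENT_MODEL --port $LLMDBENCH_VLLM_COMMON_INFERENCE_PORT`; recovering the actual values still requires resolving those variables against the environment used at standup time.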

namasl previously approved these changes Jan 29, 2026

@namasl left a comment

Some unresolved parts remain, but those can be addressed in later PRs.

This configmap, `llm-d-benchmark-standup-parameters`, is accessible from
the harness pod.

Additionally, take a list of all running pods at the end of a benchmark
run (the goal is to detect any restarts of pods serving models during
the benchmark).

Updated the list of models.

Ensure precise-prefix-cache-aware is still operational.

Finally, fix a bug in the (automatic) detection of admin privileges.

Signed-off-by: maugustosilva <maugusto.silva@gmail.com>
@maugustosilva maugustosilva merged commit cfc21a0 into llm-d:main Jan 30, 2026
8 checks passed