[Run] Capture the logs for all pods (from llm-d stack) at the end (#638)
Whenever a run is invoked, try to capture the logs from the gaie, decode,
prefill and inference-gateway pods at the end of the run.
The llm-d-benchmark executable now has independent try loops for the
harness and the analyzer.
Finally, a few improvements to `preprocess/set_llmdbench_environment.py`.
Signed-off-by: maugustosilva <maugusto.silva@gmail.com>
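
A note on the "independent try loops": assuming this refers to isolating failures per phase, the intent would be that a harness failure no longer prevents the analyzer, or the final log capture, from running. A minimal bash sketch of that control flow; the step names `run_harness` and `run_analyzer` and the exit handling are hypothetical, not taken from this commit:

```bash
# Hypothetical control flow only; the actual executable's structure may differ.
harness_rc=0
run_harness "${model}" || harness_rc=$?      # harness failure is recorded, not fatal

analyzer_rc=0
run_analyzer "${model}" || analyzer_rc=$?    # analyzer still runs after a harness failure

capture_pod_logs "${model}" "${results_dir}" # pod logs are captured in any case

# Propagate the first non-zero return code.
if (( harness_rc != 0 )); then exit "${harness_rc}"; fi
exit "${analyzer_rc}"
```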
 announce "ℹ️ Harness was started with LLMDBENCH_HARNESS_WAIT_TIMEOUT=0. Will NOT wait for pod \"${LLMDBENCH_HARNESS_POD_LABEL}\" for model \"$model\" to be in \"Completed\" state. The pod can be accessed through \"${LLMDBENCH_CONTROL_KCMD} --namespace ${LLMDBENCH_HARNESS_NAMESPACE} exec -it pod/<POD_NAME> -- bash\""
 announce "ℹ️ To list pod names \"${LLMDBENCH_CONTROL_KCMD} --namespace ${LLMDBENCH_HARNESS_NAMESPACE} get pods -l app=${LLMDBENCH_HARNESS_POD_LABEL}\""
@@ -536,6 +527,39 @@ function deploy_harness_config {
 }
 export -f deploy_harness_config
+
+function capture_pod_logs {
+  local model=$1
+  local local_results_dir=$2
+
+  local modelid_label=$(model_attribute $model modelid_label)
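
Only the opening lines of the new function are visible in this hunk. As a rough sketch of the kind of end-of-run log capture the commit message describes; the label selector, file layout and loop below are assumptions, not the code from this commit (`LLMDBENCH_CONTROL_KCMD` and `LLMDBENCH_HARNESS_NAMESPACE` are the variables already used above):

```bash
# Sketch only: dump logs for the llm-d stack pods into the results directory.
function capture_pod_logs_sketch {
  local model=$1
  local local_results_dir=$2

  mkdir -p "${local_results_dir}/logs"

  # The component label key and values here are assumptions.
  for component in gaie decode prefill inference-gateway; do
    for pod in $(${LLMDBENCH_CONTROL_KCMD} --namespace "${LLMDBENCH_HARNESS_NAMESPACE}" \
                   get pods -o name -l "app.kubernetes.io/component=${component}" 2>/dev/null); do
      ${LLMDBENCH_CONTROL_KCMD} --namespace "${LLMDBENCH_HARNESS_NAMESPACE}" \
        logs "${pod}" --all-containers=true \
        > "${local_results_dir}/logs/${component}_$(basename "${pod}").log" 2>&1 || true
    done
  done
}
```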