Adding ci_calibration_smoke_tests.sh into v0.16.0 #1042
Signed-off-by: Iryna Boiko <iboiko@habana.ai>
Pull request overview
Adds a new CI-oriented smoke-test script to validate that the FP8 calibration workflow can run end-to-end for a couple of representative models.
Changes:
- Introduces tests/calibration_tests/ci_calibration_smoke_tests.sh to run lightweight calibration runs (batch=1, limit=1).
- Adds helper functions for per-test temp output cleanup and a simple function-dispatch entrypoint.
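As a rough illustration of what a function-dispatch entrypoint of this kind can look like, here is a minimal sketch. It is not the script from this PR: `run_model_a`, `run_model_b`, and `dispatch` are hypothetical names, and only `launch_all_tests` and the `run_` prefix come from the review discussion.

```shell
#!/usr/bin/env bash
# Minimal sketch of a function-dispatch entrypoint (hypothetical names).

run_model_a() { echo "ran model_a"; }   # stand-in for one smoke test
run_model_b() { echo "ran model_b"; }   # stand-in for another smoke test

# Runs every test sequentially; used when no function name is given.
launch_all_tests() {
  run_model_a
  run_model_b
}

# Call the function named by the first argument, defaulting to launch_all_tests.
dispatch() {
  local target="${1:-launch_all_tests}"
  if declare -F "${target}" > /dev/null; then
    "${target}"
  else
    echo "Unknown function: ${target}" >&2
    return 1
  fi
}

default_output="$(dispatch)"
single_output="$(dispatch run_model_b)"
echo "${default_output}"
echo "${single_output}"
```

Checking the name with `declare -F` before calling it keeps a typo from being executed as an arbitrary command.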
        rm -rf "${CALIBRATION_OUTPUT_DIR}"
    fi
}
On failure, set -e will exit before any later cleanup runs, which can leave tmp-calibration-output behind. Consider registering cleanup_calibration_output via a trap (e.g., on EXIT) so artifacts are removed even when a calibration step fails.
Suggested change:
    # Ensure calibration output is cleaned on any script exit (including failures)
    trap cleanup_calibration_output EXIT
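To see the effect of the suggested trap in isolation, the following sketch runs a small script in a child bash with set -e, lets a step fail, and then checks whether the output directory survived. The temp directory and the `false` command standing in for a failing calibration step are assumptions made for the demo; only `cleanup_calibration_output`, `CALIBRATION_OUTPUT_DIR`, and the `tmp-calibration-output` name come from the review.

```shell
#!/usr/bin/env bash
# Sketch: an EXIT trap removes temp output even when a step fails under `set -e`.

WORK_DIR="$(mktemp -d)"   # hypothetical scratch location for this demo
CALIBRATION_OUTPUT_DIR="${WORK_DIR}/tmp-calibration-output"

# Run the test logic in a child bash so its early exit does not end this demo.
child_status=0
bash -c '
  set -e
  CALIBRATION_OUTPUT_DIR="$1"
  cleanup_calibration_output() {
    if [ -d "${CALIBRATION_OUTPUT_DIR}" ]; then
      rm -rf "${CALIBRATION_OUTPUT_DIR}"
    fi
  }
  trap cleanup_calibration_output EXIT
  mkdir -p "${CALIBRATION_OUTPUT_DIR}"
  false   # stand-in for a failing calibration step; set -e exits here
  echo "not reached"
' _ "${CALIBRATION_OUTPUT_DIR}" || child_status=$?

if [ -d "${CALIBRATION_OUTPUT_DIR}" ]; then
  leftover="yes"
else
  leftover="no"
fi
echo "child exit status: ${child_status}, leftover output dir: ${leftover}"
rm -rf "${WORK_DIR}"
```

The child still exits non-zero, so CI would see the failure, but the directory is gone by the time control returns.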
    echo "If no function_name is provided, all tests will be run."
    echo ""
    echo "Available functions:"
    declare -F | awk '{print " - " $3}' | grep --color=never "run_"
usage() only lists functions matching run_, but users can also invoke launch_all_tests, which is the default. Consider including launch_all_tests in the output (or updating the help text) so the usage info matches actual behavior.
Suggested change:
    declare -F | awk '{print " - " $3}' | grep --color=never "run_"
    echo " - launch_all_tests (default: runs all tests sequentially)"
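A self-contained version of the suggested usage() might look like the sketch below. The `run_model_a` and `run_model_b` functions and the `Usage:` line are hypothetical placeholders; the listing pipeline and the extra `launch_all_tests` line follow the suggestion above.

```shell
#!/usr/bin/env bash
# Sketch: usage() that lists run_* functions plus the default entrypoint.

run_model_a() { :; }        # hypothetical test function
run_model_b() { :; }        # hypothetical test function
launch_all_tests() { :; }   # default entrypoint named in the review

usage() {
  echo "Usage: $0 [function_name]"   # assumed wording, not from the PR
  echo "If no function_name is provided, all tests will be run."
  echo ""
  echo "Available functions:"
  # `declare -F` prints "declare -f NAME"; the third field is the function name.
  declare -F | awk '{print " - " $3}' | grep --color=never "run_"
  echo " - launch_all_tests (default: runs all tests sequentially)"
}

usage_output="$(usage)"
echo "${usage_output}"
```

Printing the default explicitly keeps the `run_` filter simple while making the help text match what the entrypoint accepts.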
    -d "${CALIBRATION_DATASET}" \
    -o "${CALIBRATION_OUTPUT_DIR}" \
    -b ${BATCH_SIZE} \
    -l ${LIMIT} \
    -t 1
calibrate_model.sh resolves -o via realpath, which fails if the output directory doesn’t exist. Since cleanup_calibration_output deletes the directory, recreate it (e.g., mkdir -p) before invoking calibration so this smoke test doesn’t fail immediately.
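The ordering the comment asks for (clean up, recreate, then calibrate) can be sketched as follows. `fake_calibrate` is a hypothetical stand-in that simply refuses to run when its output directory is missing; it does not reproduce what calibrate_model.sh actually does, and the temp paths and `result.txt` file are assumptions for the demo.

```shell
#!/usr/bin/env bash
# Sketch: recreate the output directory after cleanup, before calibration runs.

WORK_DIR="$(mktemp -d)"   # hypothetical scratch location for this demo
CALIBRATION_OUTPUT_DIR="${WORK_DIR}/tmp-calibration-output"

cleanup_calibration_output() {
  if [ -d "${CALIBRATION_OUTPUT_DIR}" ]; then
    rm -rf "${CALIBRATION_OUTPUT_DIR}"
  fi
}

# Hypothetical stand-in for calibrate_model.sh: needs an existing output dir.
fake_calibrate() {
  local out_dir="$1"
  if [ ! -d "${out_dir}" ]; then
    echo "Error: ${out_dir} does not exist" >&2
    return 1
  fi
  touch "${out_dir}/result.txt"
}

cleanup_calibration_output             # start from a clean state
mkdir -p "${CALIBRATION_OUTPUT_DIR}"   # recreate before the tool needs it

if fake_calibrate "${CALIBRATION_OUTPUT_DIR}"; then
  result="ok"
else
  result="failed"
fi
echo "calibration stand-in: ${result}"
rm -rf "${WORK_DIR}"
```

`mkdir -p` is idempotent, so the call is safe whether or not an earlier step already created the directory.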
    if [ $? -ne 0 ]; then
        echo "Error: Calibration failed for ibm-granite/granite-3.3-2b-instruct" >&2
        exit 1
    fi
This script uses set -e, so if calibrate_model.sh fails the script will exit immediately and this $? check is dead code. If you want a custom error message, wrap the command in if ! ...; then ...; fi; otherwise remove the explicit status check.
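The `if ! ...; then ...; fi` form mentioned here works under set -e because a command tested by `if` does not trigger the early exit. The sketch below shows the pattern in a child bash so the demo can inspect the result; `fake_calibrate` and the `example/model` name are hypothetical placeholders for the real calibration call.

```shell
#!/usr/bin/env bash
# Sketch: custom error message under `set -e` using `if ! cmd; then ...; fi`.

status=0
output="$(bash -c '
  set -e
  fake_calibrate() { return 3; }   # hypothetical stand-in for calibrate_model.sh
  # A command tested by `if` is exempt from set -e, so the branch below runs.
  if ! fake_calibrate; then
    echo "Error: Calibration failed for example/model" >&2
    exit 1
  fi
  echo "calibration passed"
' 2>&1)" || status=$?

echo "status=${status}"
echo "output=${output}"
```

With a separate `$?` check on the following line instead, the script would already have exited at the failing command and the message would never be printed.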
    if [ $? -ne 0 ]; then
        echo "Error: Calibration failed for Qwen/Qwen2.5-0.5B-Instruct" >&2
        exit 1
    fi
Same issue as above: with set -e enabled, this $? check will never run on failure. Prefer if ! ...; then ...; fi (for a custom error message) or drop the check.
Merged commit 988ecd2 into vllm-project:releases/v0.16.0
No description provided.