This folder contains `launch_eval.sh`, a script that:

- downloads model weights (optional),
- ensures a suitable Dynamo container image (based on `config.env`),
- launches etcd + NATS,
- enters the container and runs `aiconfigurator eval`.

Run:

```bash
./aiconfigurator/tools/automation/launch_eval.sh /path/to/config.env
```
## Assumptions

- etcd and NATS are accessible.
- The container already has `aiconfigurator` and `dynamo` installed.
- The `dynamo` repo is mounted at `/workspace/`.
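These assumptions can be sanity-checked before launching. The sketch below assumes etcd and NATS listen on their default ports (2379 and 4222); `check_port` is a hypothetical helper, not part of the script:

```bash
#!/usr/bin/env bash
# Pre-flight check: verify a TCP endpoint is reachable using bash's /dev/tcp.
check_port() {
  local host="$1" port="$2" name="$3"
  if timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
    echo "$name ($host:$port): reachable"
  else
    echo "$name ($host:$port): not reachable"
  fi
}

# Defaults: etcd on 2379, NATS on 4222 (adjust for your deployment).
check_port 127.0.0.1 2379 etcd
check_port 127.0.0.1 4222 NATS
```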
## How to use

- Create a `config.env` with `IN_CONTAINER=true` (you can use the existing `config.env` in the current directory).
- Make sure the model path in the config exists inside the container. If it does not exist, the script will fetch the corresponding model.
- Run the script; it will directly call `aiconfigurator eval`.
Please make sure you have properly set the following parameters in `config.env`:

```bash
# --- Optional: already inside the image ---
IN_CONTAINER=true

# --- Model paths ---
MODEL_LOCAL_DIR=/raid/hub/qwen3-32b-fp8
MODEL_HF_REPO=Qwen/Qwen3-32B-FP8

# --- Deployment knobs ---
SYSTEM=h200_sxm
MODEL=QWEN3_32B
VERSION=1.0.0rc3
GENERATED_CONFIG_VERSION=1.0.0rc6
VENV_PATH=/workspace/aic
ISL=5000
OSL=1000
TTFT=1000
TPOT=10
TOTAL_GPUS=8
HEAD_NODE_IP=0.0.0.0
PREFILL_FREE_GPU_MEM_FRAC=0.9
FREE_GPU_MEM_FRAC=0.7
DECODE_FREE_GPU_MEM_FRAC=0.5
PORT=8000
MODE=disagg

# --- Service naming ---
SERVED_MODEL_NAME=Qwen3/Qwen3-32B-FP8

# --- Benchmarking settings ---
BENCHMARK_CONCURRENCY=auto  # "auto" or a concurrency list, e.g. BENCHMARK_CONCURRENCY="1 4 8 12 16 20"

# --- Optional: download model if not present ---
ENABLE_MODEL_DOWNLOAD=true
```

With `IN_CONTAINER=true`, the script will skip image build and compose, and directly run:

```bash
aiconfigurator eval ...
```
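As a quick guard before launching, the required variables can be checked after sourcing the file. This is a minimal sketch, not the script's actual validation; `validate_config` and the chosen required-variable list are assumptions:

```bash
#!/usr/bin/env bash
# Sketch: source a config.env and report any unset required variables.
validate_config() {
  local cfg="$1" var missing=0
  # shellcheck disable=SC1090
  source "$cfg"
  for var in MODEL_LOCAL_DIR SYSTEM MODEL ISL OSL TTFT TPOT TOTAL_GPUS PORT MODE; do
    if [ -z "${!var:-}" ]; then   # ${!var} = bash indirect expansion
      echo "config.env is missing: $var"
      missing=1
    fi
  done
  return "$missing"
}

# Example:
# validate_config /path/to/config.env || echo "fix config.env before launching"
```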
## What the script does

- Checks whether the target Dynamo container image exists.
- If not, it pulls the container.
- If the container cannot be pulled, it clones/updates the `dynamo` repo and builds the image:

  ```bash
  ./container/build.sh --framework TRTLLM \
      --tensorrtllm-pip-wheel <TRTLLM_PIP> \
      --tag <DYNAMO_IMAGE>
  ```

- Launches etcd and NATS using:

  ```bash
  docker compose -f deploy/docker-compose.yml up -d
  ```

- Starts the container with host networking and GPUs.
- Inside the container:
  - If `aiconfigurator` is missing, it mounts the project source to `/opt/aiconfigurator-src` and runs:

    ```bash
    python3 -m pip install -e /opt/aiconfigurator-src
    ```

  - Then it executes `aiconfigurator eval`.
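The pull-or-build fallback in the steps above can be sketched as follows. `ensure_dynamo_image` is a hypothetical helper name and the flags mirror the build command shown above; the script's real logic may differ:

```bash
#!/usr/bin/env bash
# Sketch of the image fallback chain: local image -> pull -> build from source.
ensure_dynamo_image() {
  local image="$1" wheel="$2"
  if docker image inspect "$image" >/dev/null 2>&1; then
    echo "using local image: $image"
  elif docker pull "$image" >/dev/null 2>&1; then
    echo "pulled image: $image"
  else
    echo "building image: $image"
    ./container/build.sh --framework TRTLLM \
      --tensorrtllm-pip-wheel "$wheel" \
      --tag "$image"
  fi
}

# Example:
# ensure_dynamo_image "$DYNAMO_IMAGE" "$TRTLLM_PIP"
```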
For this full flow, make sure `IN_CONTAINER` is set to `false` in `config.env`:

```bash
# --- Optional: already inside the image ---
IN_CONTAINER=false
```

## Outputs

- Service logs (from `aiconfigurator eval`'s service launcher): `/<save_dir>/log/<run_name>_<mode>_p<port>.log` (inside the container mount).
- Evaluation results: `/<save_dir>/eval_runs/<run_name>/...` (bench JSON/CSV, Pareto plot, GPU stats, etc.).
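To grab the newest results without typing out the run name, a small helper like this can be used (an illustration, not part of the script; `latest_run` is a hypothetical name):

```bash
#!/usr/bin/env bash
# Print the most recently modified run directory under <save_dir>/eval_runs.
latest_run() {
  local save_dir="$1"
  ls -1t "$save_dir/eval_runs" | head -n 1
}

# Example:
# ls "/workspace/aiconf_save/eval_runs/$(latest_run /workspace/aiconf_save)"
```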
## Notes

- If you use Hugging Face models, export `HF_TOKEN` before running the script so it can download weights.
- The script mounts:
  - the model directory to `/workspace/model_hub/<name>`,
  - the save directory to `/workspace/aiconf_save`,
  - the project root to `/opt/aiconfigurator-src` for an editable install (only if needed).