45 changes: 44 additions & 1 deletion qa/L0_orca/test.sh
@@ -36,7 +36,7 @@ MODEL_NAME="gpt2_tensorrt_llm"
NAME="tensorrt_llm_benchmarking_test"
MODEL_REPOSITORY="$(pwd)/triton_model_repo"
TENSORRTLLM_BACKEND_DIR="/workspace/tensorrtllm_backend"
-GPT_DIR="$TENSORRTLLM_BACKEND_DIR/tensorrt_llm/examples/gpt"
+GPT_DIR="$TENSORRTLLM_BACKEND_DIR/tensorrt_llm/examples/models/core/gpt"
TOKENIZER_DIR="$GPT_DIR/gpt2"
ENGINES_DIR="${BASE_DIR}/engines/inflight_batcher_llm/${NUM_GPUS}-gpu"
TRITON_DIR=${TRITON_DIR:="/opt/tritonserver"}
@@ -48,6 +48,13 @@ CLIENT_PY=${BASE_DIR}/orca_http_test.py
CLIENT_LOG="${NAME}_orca_http_test.log"
source ../common/util.sh

function replace_config_tags {
tag_to_replace="${1}"
new_value="${2}"
config_file_path="${3}"
sed -i "s|${tag_to_replace}|${new_value}|g" ${config_file_path}
}
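The replace_config_tags helper above substitutes placeholder tags in a config file in place with sed. A minimal, self-contained sketch of the same pattern (the template file, tag name, and path below are hypothetical, not taken from the actual model repository):

```shell
# Hypothetical template config with a placeholder tag; the helper
# rewrites it in place, as replace_config_tags above does.
cat > /tmp/demo_config.pbtxt <<'EOF'
parameters: { key: "engine_dir" value: { string_value: "${engine_dir}" } }
EOF

replace_config_tags() {
    tag_to_replace="${1}"
    new_value="${2}"
    config_file_path="${3}"
    # Using '|' as the sed delimiter lets the replacement contain '/' (paths)
    sed -i "s|${tag_to_replace}|${new_value}|g" "${config_file_path}"
}

# Single quotes keep the shell from expanding ${engine_dir} before sed sees it
replace_config_tags '${engine_dir}' '/opt/engines/1-gpu' /tmp/demo_config.pbtxt
cat /tmp/demo_config.pbtxt
# parameters: { key: "engine_dir" value: { string_value: "/opt/engines/1-gpu" } }
```

Note that the tag is treated as a sed regex, not a fixed string; that is fine for tags like ${engine_dir}, where $ and {} are literal in mid-pattern BRE, but tags containing | or & would need escaping.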

function prepare_model_repository {
rm -rf ${MODEL_REPOSITORY} && mkdir ${MODEL_REPOSITORY}
cp -r ${TENSORRTLLM_BACKEND_DIR}/all_models/inflight_batcher_llm/* ${MODEL_REPOSITORY}
@@ -138,6 +145,42 @@ function kill_server {
done
}

function clone_tensorrt_llm_backend_repo {
Contributor:
In the previous PR, I suggested moving these functions to qa/common/util.sh but somehow they were deleted instead of moved.
#8009 (comment)

rm -rf $TENSORRTLLM_BACKEND_DIR && mkdir $TENSORRTLLM_BACKEND_DIR
apt-get update && apt-get install git-lfs -y --no-install-recommends
git clone --single-branch --depth=1 -b ${TENSORRTLLM_BACKEND_REPO_TAG} ${TRITON_REPO_ORG}/tensorrtllm_backend.git $TENSORRTLLM_BACKEND_DIR
cd $TENSORRTLLM_BACKEND_DIR && git lfs install && git submodule update --init --recursive
}

function build_gpt2_base_model {
# Download weights from HuggingFace Transformers
cd ${GPT_DIR} && rm -rf gpt2 && git clone https://huggingface.co/gpt2-medium gpt2 && cd gpt2
rm pytorch_model.bin model.safetensors
if ! wget -q https://huggingface.co/gpt2-medium/resolve/main/pytorch_model.bin; then
echo "Downloading pytorch_model.bin failed."
exit 1
fi
cd ${GPT_DIR}

# Convert weights from HF Transformers to FT format
python3 convert_checkpoint.py --model_dir gpt2 --dtype float16 --tp_size ${NUM_GPUS} --output_dir "./c-model/gpt2/${NUM_GPUS}-gpu/"
cd ${BASE_DIR}
}

function build_gpt2_tensorrt_engine {
# Build TensorRT engines
cd ${GPT_DIR}
trtllm-build --checkpoint_dir "./c-model/gpt2/${NUM_GPUS}-gpu/" \
--gpt_attention_plugin float16 \
--remove_input_padding enable \
--paged_kv_cache enable \
--gemm_plugin float16 \
--workers "${NUM_GPUS}" \
--output_dir "${ENGINES_DIR}"

cd ${BASE_DIR}
}

clone_tensorrt_llm_backend_repo
build_gpt2_base_model
build_gpt2_tensorrt_engine
Contributor:
How did this call to build_gpt2_tensorrt_engine previously work if the function didn't exist until this PR?

@indrajit96 (Contributor, Author), May 7, 2025:
I tried finding an answer to this, but was unable to.
Checked with @mc-nv whether anything changed in the way CI runs or mounts. No luck.
Double-checked that the test did actually run as expected before it broke (it did).
The test started breaking when we merged the 24.04 release branch into main.
The functions were originally ONLY defined in L0_perf_tensorrt_llm/test.sh, and previous tests picked up the definitions from there.

@rmccorm4 (Contributor), May 7, 2025:
> They were originally ONLY defined in L0_perf_tensorrt_llm/test.sh and prev tests used to pick the defn from there

Do we do a source L0_perf_tensorrt_llm/test.sh somewhere when running this test? I didn't see it in this bash script. Maybe on the GitLab job side?
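On the sourcing question: functions defined in one bash script are visible to another only if the second script sources the first; the file merely existing on disk is not enough. A minimal sketch (file and function names here are illustrative, not the actual CI wiring):

```shell
# Write a standalone script that only defines a function.
cat > /tmp/util_demo.sh <<'EOF'
build_demo_engine() {
    echo "building engine"
}
EOF

# A fresh shell that does NOT source the file cannot find the function.
bash -c 'build_demo_engine' 2>/dev/null || echo "not defined without source"

# After sourcing, the function is available in the current shell.
source /tmp/util_demo.sh
build_demo_engine
# building engine
```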

@yinggeh (Contributor), May 7, 2025:
Maybe because there is no set -e, bash just ignores the error and keeps going.
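To illustrate that point: by default, bash reports a call to an undefined function as "command not found" (status 127) but keeps executing the rest of the script, so a missing definition can go unnoticed; with set -e, the script aborts at that line. A minimal sketch (script paths are hypothetical):

```shell
# Without `set -e`, the undefined call fails but execution continues.
cat > /tmp/no_errexit_demo.sh <<'EOF'
some_undefined_function    # fails with status 127, script keeps going
echo "reached the end"
EOF
bash /tmp/no_errexit_demo.sh 2>/dev/null
# reached the end

# With `set -e`, the same failure aborts the script immediately.
cat > /tmp/errexit_demo.sh <<'EOF'
set -e
some_undefined_function
echo "reached the end"
EOF
bash /tmp/errexit_demo.sh 2>/dev/null || echo "aborted with status $?"
# aborted with status 127
```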

4 changes: 2 additions & 2 deletions qa/L0_perf_tensorrt_llm/test.sh
@@ -1,5 +1,5 @@
#!/bin/bash
-# Copyright 2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# Copyright 2024-2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
@@ -35,7 +35,7 @@ MODEL_NAME="gpt2_tensorrt_llm"
NAME="tensorrt_llm_benchmarking_test"
MODEL_REPOSITORY="$(pwd)/triton_model_repo"
TENSORRTLLM_BACKEND_DIR="/workspace/tensorrtllm_backend"
-GPT_DIR="$TENSORRTLLM_BACKEND_DIR/tensorrt_llm/examples/gpt"
+GPT_DIR="$TENSORRTLLM_BACKEND_DIR/tensorrt_llm/examples/models/core/gpt"
TOKENIZER_DIR="$GPT_DIR/gpt2"
ENGINES_DIR="${BASE_DIR}/engines/inflight_batcher_llm/${NUM_GPUS}-gpu"
TRITON_DIR=${TRITON_DIR:="/opt/tritonserver"}