Description
I want to benchmark NVIDIA NIM (gpt-oss-120b) with the following script:
`#!/bin/bash
set -e
MODEL="openai/gpt-oss-120b"
TOKENIZER="xxxx/.cache/nim/ngc/hub/models--nim--openai--gpt-oss-120b/snapshots/xxxx"
URL="http://localhost:8000/v1"
CONCURRENCIES=(100)
INPUT_OUTPUT_PAIRS=(
"500 2500"
"500 10000"
"10000 2500"
"10000 5000"
"10000 10000"
)
BASE_OUTDIR="artifacts_gpt_oss_120b"
mkdir -p "${BASE_OUTDIR}"
for CONCURRENCY in "${CONCURRENCIES[@]}"; do
  REQUEST_COUNT=$((CONCURRENCY * 10))
  for PAIR in "${INPUT_OUTPUT_PAIRS[@]}"; do
    read INPUT_TOKENS OUTPUT_TOKENS <<< "${PAIR}"
    RUN_NAME="c${CONCURRENCY}_in${INPUT_TOKENS}_out${OUTPUT_TOKENS}"
    OUTDIR="${BASE_OUTDIR}/${RUN_NAME}"
    PREFIX="profile_${RUN_NAME}"
    echo "========================================"
    echo "Running benchmark: ${RUN_NAME}"
    echo "Concurrency : ${CONCURRENCY}"
    echo "Request count : ${REQUEST_COUNT}"
    echo "========================================"
    mkdir -p "${OUTDIR}"
    aiperf profile \
      --model "${MODEL}" \
      --tokenizer "${TOKENIZER}" \
      --url "${URL}" \
      --endpoint-type chat \
      --streaming \
      --concurrency "${CONCURRENCY}" \
      --request-rate 200 \
      --prompt-input-tokens-mean "${INPUT_TOKENS}" \
      --prompt-output-tokens-mean "${OUTPUT_TOKENS}" \
      --request-count "${REQUEST_COUNT}" \
      --warmup-request-count 1 \
      --output-artifact-dir "${OUTDIR}" \
      --profile-export-prefix "${PREFIX}"
  done
done
echo "All benchmarks completed."
`
Most of the requests are failing. Do you know what the cause might be?
