Negative values for tokens per sec for Mixtral benchmarks #1856
Unanswered
prasad-nair-amd asked this question in Q&A
Replies: 0 comments
I am using the NVIDIA-provided benchmark scripts to run the Mixtral 8x7B model on an H100. Why is it reporting a negative value for token throughput?
[BENCHMARK] num_samples 1
[BENCHMARK] num_error_samples 0
[BENCHMARK] num_samples 1
[BENCHMARK] total_latency(ms) 4610.37
[BENCHMARK] seq_throughput(seq/sec) 0.22
[BENCHMARK] token_throughput(token/sec) -2385.71
[BENCHMARK] avg_sequence_latency(ms) 4608.56
[BENCHMARK] max_sequence_latency(ms) 4608.56
[BENCHMARK] min_sequence_latency(ms) 4608.56
[BENCHMARK] p99_sequence_latency(ms) 4608.56
[BENCHMARK] p90_sequence_latency(ms) 4608.56
[BENCHMARK] p50_sequence_latency(ms) 4608.56
[BENCHMARK] avg_time_to_first_token(ms) 173.11
[BENCHMARK] max_time_to_first_token(ms) 173.11
[BENCHMARK] min_time_to_first_token(ms) 173.11
[BENCHMARK] p99_time_to_first_token(ms) 173.11
[BENCHMARK] p90_time_to_first_token(ms) 173.11
[BENCHMARK] p50_time_to_first_token(ms) 173.11
[BENCHMARK] avg_inter_token_latency(ms) 0.00
[BENCHMARK] max_inter_token_latency(ms) 0.00
[BENCHMARK] min_inter_token_latency(ms) 0.00
[BENCHMARK] p99_inter_token_latency(ms) 0.00
[BENCHMARK] p90_inter_token_latency(ms) 0.00
[BENCHMARK] p50_inter_token_latency(ms) 0.00
[TensorRT-LLM][INFO] Terminate signal received, worker thread exiting.
[TensorRT-LLM][INFO] Terminate signal received, worker thread exiting.
[TensorRT-LLM][INFO] Terminate signal received, worker thread exiting.
[TensorRT-LLM][INFO] Terminate signal received, worker thread exiting.
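As a rough sanity check on the numbers above, here is a minimal sketch that back-computes the token count implied by the reported throughput. It assumes that seq_throughput is num_samples divided by total latency and that token_throughput is a token count divided by total latency; both formulas are assumptions and have not been checked against the benchmark source. The variable names are illustrative only.

```python
# Values copied from the benchmark log above.
total_latency_ms = 4610.37
num_samples = 1
reported_token_throughput = -2385.71  # token/sec, as printed

total_latency_s = total_latency_ms / 1000.0

# Assumption: seq_throughput = num_samples / total latency (seconds).
seq_throughput = num_samples / total_latency_s

# Assumption: token_throughput = counted tokens / total latency (seconds),
# so multiplying back recovers the token count the benchmark used.
implied_tokens = reported_token_throughput * total_latency_s

print(f"seq_throughput      = {seq_throughput:.2f} seq/sec")
print(f"implied token count = {implied_tokens:.0f}")
```

Under these assumptions the sequence throughput matches the reported 0.22 seq/sec, while the implied token count comes out at roughly -10999. Since every latency value in the log is positive, that would suggest the negative sign comes from how tokens are counted rather than from the timing.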