Commit 99b1c89

Merge pull request #773 from swethmandava/master
removie trt, fix queuing delay typo in triton readme for bert
2 parents 66667f1 + 33ea90e commit 99b1c89

4 files changed: +2 -195 lines changed

TensorFlow/LanguageModeling/BERT/triton/README.md

Lines changed: 2 additions & 2 deletions
@@ -187,7 +187,7 @@ Performance numbers for BERT Large, sequence length=384 are obtained from [exper

 ![](../data/images/bert_trt_throughput_vs_latency.png?raw=true)

-The plot above shows that throughput gains taper off from increasing batch size above 12. There is minimal gain in throughput going from batch size 12 to 128. However, running inference with a single large batch might be faster than running several small inference requests. Therefore, we choose to maximize batch size for Dynamic Batching with a maximum acceptable queuing delay of 50ms and maximum acceptable inference latency of 100ms.
+The plot above shows that throughput gains taper off from increasing batch size above 12. There is minimal gain in throughput going from batch size 12 to 128. However, running inference with a single large batch might be faster than running several small inference requests. Therefore, we choose to maximize batch size for Dynamic Batching with a maximum acceptable queuing delay of 1ms and maximum acceptable inference latency of 100ms.

 ### Dynamic Batching Support

@@ -232,4 +232,4 @@ April 2020
 TRTIS -> TRITON

 October 2019
-Initial release
+Initial release
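
The corrected README text gives the queuing delay in milliseconds, while Triton's model configuration expresses it in microseconds through the `max_queue_delay_microseconds` field of the `dynamic_batching` block. The following is a minimal sketch of that conversion, not the configuration shipped in this repository. The helper name `dynamic_batching_stanza` is hypothetical, and the preferred batch size passed to it is an illustrative assumption.

```python
from typing import Sequence


def dynamic_batching_stanza(max_queue_delay_ms: float,
                            preferred_batch_sizes: Sequence[int]) -> str:
    """Render a Triton `dynamic_batching` config block as text.

    Hypothetical helper for illustration only. The delay is given in
    milliseconds and converted to the microseconds Triton expects.
    """
    delay_us = int(round(max_queue_delay_ms * 1000))  # 1 ms = 1000 us
    sizes = ", ".join(str(size) for size in preferred_batch_sizes)
    return (
        "dynamic_batching {\n"
        f"  preferred_batch_size: [ {sizes} ]\n"
        f"  max_queue_delay_microseconds: {delay_us}\n"
        "}\n"
    )


if __name__ == "__main__":
    # 1 ms is the queuing delay stated in the corrected README line;
    # the batch size of 128 is an assumed value, not taken from the repo.
    print(dynamic_batching_stanza(1, [128]))
```

With the pre-fix value of 50ms the same helper would emit a delay of 50000 microseconds, which shows the size of the discrepancy this commit corrects in the README.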

TensorFlow/LanguageModeling/BERT/trt/helpers/calibrator.py

Lines changed: 0 additions & 95 deletions
This file was deleted.

TensorFlow/LanguageModeling/BERT/trt/squad/dev-v1.1.json

Lines changed: 0 additions & 1 deletion
This file was deleted.

TensorFlow/LanguageModeling/BERT/trt/squad/evaluate-v1.1.py

Lines changed: 0 additions & 97 deletions
This file was deleted.
