Commit 8fbfd00

chore: update v0.4.0 benchmarks (#481)
1 parent b3e10bc commit 8fbfd00

3 files changed: +54 -52 lines changed

docs/versioned_docs/version-0.4.x/benchmarks/inference-time.md

Lines changed: 35 additions & 33 deletions
@@ -8,50 +8,52 @@ Times presented in the tables are measured as consecutive runs of the model. Ini
 
 ## Classification
 
-| Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
+| Model | iPhone 17 Pro (Core ML) [ms] | iPhone 16 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
 | ----------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: |
-| EFFICIENTNET_V2_S | 100 | 120 | 130 | 180 | 170 |
+| EFFICIENTNET_V2_S | 150 | 161 | 227 | 196 | 214 |
 
 ## Object Detection
 
-| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 13 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
+| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
 | ------------------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: |
-| SSDLITE_320_MOBILENET_V3_LARGE | 190 | 260 | 280 | 100 | 90 |
+| SSDLITE_320_MOBILENET_V3_LARGE | 261 | 279 | 414 | 125 | 115 |
 
 ## Style Transfer
 
-| Model | iPhone 16 Pro (Core ML) [ms] | iPhone 13 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
+| Model | iPhone 17 Pro (Core ML) [ms] | iPhone 16 Pro (Core ML) [ms] | iPhone SE 3 (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
 | ---------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: |
-| STYLE_TRANSFER_CANDY | 450 | 600 | 750 | 1650 | 1800 |
-| STYLE_TRANSFER_MOSAIC | 450 | 600 | 750 | 1650 | 1800 |
-| STYLE_TRANSFER_UDNIE | 450 | 600 | 750 | 1650 | 1800 |
-| STYLE_TRANSFER_RAIN_PRINCESS | 450 | 600 | 750 | 1650 | 1800 |
+| STYLE_TRANSFER_CANDY | 1565 | 1675 | 2325 | 1750 | 1620 |
+| STYLE_TRANSFER_MOSAIC | 1565 | 1675 | 2325 | 1750 | 1620 |
+| STYLE_TRANSFER_UDNIE | 1565 | 1675 | 2325 | 1750 | 1620 |
+| STYLE_TRANSFER_RAIN_PRINCESS | 1565 | 1675 | 2325 | 1750 | 1620 |
 
 ## OCR
 
-| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | Samsung Galaxy S21 (XNNPACK) [ms] |
-| --------------------- | :--------------------------: | :------------------------------: | :------------------------: | :-------------------------------: | :-------------------------------: |
-| Detector (CRAFT_800) | 2099 | 2227 | ❌ | 2245 | 7108 |
-| Recognizer (CRNN_512) | 70 | 252 | ❌ | 54 | 151 |
-| Recognizer (CRNN_256) | 39 | 123 | ❌ | 24 | 78 |
-| Recognizer (CRNN_128) | 17 | 83 | ❌ | 14 | 39 |
+Notice that the recognizer models were executed between 3 and 7 times during a single recognition.
+The values below represent the averages across all runs for the benchmark image.
 
-❌ - Insufficient RAM.
+| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
+| ------------------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: |
+| Detector (CRAFT_800_QUANTIZED) | 779 | 897 | 1276 | 553 | 586 |
+| Recognizer (CRNN_512) | 77 | 74 | 244 | 56 | 57 |
+| Recognizer (CRNN_256) | 35 | 37 | 120 | 28 | 30 |
+| Recognizer (CRNN_128) | 18 | 19 | 60 | 14 | 16 |
 
 ## Vertical OCR
 
-| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | Samsung Galaxy S21 (XNNPACK) [ms] |
-| --------------------- | :--------------------------: | :------------------------------: | :------------------------: | :-------------------------------: | :-------------------------------: |
-| Detector (CRAFT_1280) | 5457 | 5833 | ❌ | 6296 | 14053 |
-| Detector (CRAFT_320) | 1351 | 1460 | ❌ | 1485 | 3101 |
-| Recognizer (CRNN_512) | 39 | 123 | ❌ | 24 | 78 |
-| Recognizer (CRNN_64) | 10 | 33 | ❌ | 7 | 18 |
+Notice that the recognizer models, as well as detector CRAFT_320 model, were executed between 4 and 21 times during a single recognition.
+The values below represent the averages across all runs for the benchmark image.
 
-❌ - Insufficient RAM.
+| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
+| ------------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: |
+| Detector (CRAFT_1280_QUANTIZED) | 1918 | 2304 | 3371 | 1391 | 1445 |
+| Detector (CRAFT_320_QUANTIZED) | 473 | 563 | 813 | 361 | 382 |
+| Recognizer (CRNN_512) | 78 | 83 | 310 | 59 | 57 |
+| Recognizer (CRNN_64) | 9 | 9 | 38 | 8 | 7 |
 
 ## LLMs
 
-| Model | iPhone 16 Pro (XNNPACK) [tokens/s] | iPhone 13 Pro (XNNPACK) [tokens/s] | iPhone SE 3 (XNNPACK) [tokens/s] | Samsung Galaxy S24 (XNNPACK) [tokens/s] | OnePlus 12 (XNNPACK) [tokens/s] |
+| Model | iPhone 17 Pro (XNNPACK) [tokens/s] | iPhone 16 Pro (XNNPACK) [tokens/s] | iPhone SE 3 (XNNPACK) [tokens/s] | Samsung Galaxy S24 (XNNPACK) [tokens/s] | OnePlus 12 (XNNPACK) [tokens/s] |
 | --------------------- | :--------------------------------: | :--------------------------------: | :------------------------------: | :-------------------------------------: | :-----------------------------: |
 | LLAMA3_2_1B | 16.1 | 11.4 | ❌ | 15.6 | 19.3 |
 | LLAMA3_2_1B_SPINQUANT | 40.6 | 16.7 | 16.5 | 40.3 | 48.2 |
@@ -68,7 +70,7 @@ Times presented in the tables are measured as consecutive runs of the model. Ini
 
 Notice than for `Whisper` model which has to take as an input 30 seconds audio chunks (for shorter audio it is automatically padded with silence to 30 seconds) `fast` mode has the lowest latency (time from starting transcription to first token returned, caused by streaming algorithm), but the slowest speed. That's why for the lowest latency and the fastest transcription we suggest using `Moonshine` model, if you still want to proceed with `Whisper` use preferably the `balanced` mode.
 
-| Model (mode) | iPhone 16 Pro (XNNPACK) [latency \| tokens/s] | iPhone 14 Pro (XNNPACK) [latency \| tokens/s] | iPhone SE 3 (XNNPACK) [latency \| tokens/s] | Samsung Galaxy S24 (XNNPACK) [latency \| tokens/s] | OnePlus 12 (XNNPACK) [latency \| tokens/s] |
+| Model (mode) | iPhone 17 Pro (XNNPACK) [latency \| tokens/s] | iPhone 16 Pro (XNNPACK) [latency \| tokens/s] | iPhone SE 3 (XNNPACK) [latency \| tokens/s] | Samsung Galaxy S24 (XNNPACK) [latency \| tokens/s] | OnePlus 12 (XNNPACK) [latency \| tokens/s] |
 | ------------------------- | :-------------------------------------------: | :-------------------------------------------: | :-----------------------------------------: | :------------------------------------------------: | :----------------------------------------: |
 | Moonshine-tiny (fast) | 0.8s \| 19.0t/s | 1.5s \| 11.3t/s | 1.5s \| 10.4t/s | 2.0s \| 8.8t/s | 1.6s \| 12.5t/s |
 | Moonshine-tiny (balanced) | 2.0s \| 20.0t/s | 3.2s \| 12.4t/s | 3.7s \| 10.4t/s | 4.6s \| 11.2t/s | 3.4s \| 14.6t/s |
@@ -81,7 +83,7 @@ Notice than for `Whisper` model which has to take as an input 30 seconds audio c
 
 Average time for encoding audio of given length over 10 runs. For `Whisper` model we only list 30 sec audio chunks since `Whisper` does not accept other lengths (for shorter audio the audio needs to be padded to 30sec with silence).
 
-| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
+| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
 | -------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: |
 | Moonshine-tiny (5s) | 99 | 95 | 115 | 284 | 277 |
 | Moonshine-tiny (10s) | 178 | 177 | 204 | 555 | 528 |
@@ -92,7 +94,7 @@ Average time for encoding audio of given length over 10 runs. For `Whisper` mode
 
 Average time for decoding one token in sequence of 100 tokens, with encoding context is obtained from audio of noted length.
 
-| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
+| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
 | -------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: |
 | Moonshine-tiny (5s) | 48.98 | 47.98 | 46.86 | 36.70 | 29.03 |
 | Moonshine-tiny (10s) | 54.24 | 51.74 | 55.07 | 46.31 | 32.41 |
@@ -101,9 +103,9 @@ Average time for decoding one token in sequence of 100 tokens, with encoding con
 
 ## Text Embeddings
 
-| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) | OnePlus 12 (XNNPACK) [ms] |
-| -------------------------- | :--------------------------: | :------------------------------: | :------------------------: | :--------------------------: | :-----------------------: |
-| ALL_MINILM_L6_V2 | 53 | 69 | 78 | 60 | 65 |
-| ALL_MPNET_BASE_V2 | 352 | 423 | 478 | 521 | 527 |
-| MULTI_QA_MINILM_L6_COS_V1 | 135 | 166 | 180 | 158 | 165 |
-| MULTI_QA_MPNET_BASE_DOT_V1 | 503 | 598 | 680 | 694 | 743 |
+| Model | iPhone 17 Pro (XNNPACK) [ms] | iPhone 16 Pro (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
+| -------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: |
+| ALL_MINILM_L6_V2 | 50 | 58 | 84 | 58 | 58 |
+| ALL_MPNET_BASE_V2 | 352 | 428 | 879 | 483 | 517 |
+| MULTI_QA_MINILM_L6_COS_V1 | 133 | 161 | 269 | 151 | 155 |
+| MULTI_QA_MPNET_BASE_DOT_V1 | 502 | 796 | 1216 | 915 | 713 |

docs/versioned_docs/version-0.4.x/benchmarks/memory-usage.md

Lines changed: 7 additions & 7 deletions
@@ -25,16 +25,16 @@ title: Memory Usage
 
 ## OCR
 
-| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] |
-| -------------------------------------------------------------------------------------------- | :--------------------: | :----------------: |
-| Detector (CRAFT_800) + Recognizer (CRNN_512) + Recognizer (CRNN_256) + Recognizer (CRNN_128) | 2100 | 1782 |
+| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] |
+| ------------------------------------------------------------------------------------------------------ | :--------------------: | :----------------: |
+| Detector (CRAFT_800_QUANTIZED) + Recognizer (CRNN_512) + Recognizer (CRNN_256) + Recognizer (CRNN_128) | 1400 | 1320 |
 
 ## Vertical OCR
 
-| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] |
-| -------------------------------------------------------------------- | :--------------------: | :----------------: |
-| Detector (CRAFT_1280) + Detector (CRAFT_320) + Recognizer (CRNN_512) | 2770 | 3720 |
-| Detector(CRAFT_1280) + Detector(CRAFT_320) + Recognizer (CRNN_64) | 1770 | 2740 |
+| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] |
+| ---------------------------------------------------------------------------------------- | :--------------------: | :----------------: |
+| Detector (CRAFT_1280_QUANTIZED) + Detector (CRAFT_320_QUANTIZED) + Recognizer (CRNN_512) | 1540 | 1470 |
+| Detector(CRAFT_1280) + Detector(CRAFT_320) + Recognizer (CRNN_64) | 1070 | 1000 |
 
 ## LLMs
 
docs/versioned_docs/version-0.4.x/benchmarks/model-size.md

Lines changed: 12 additions & 12 deletions
@@ -25,23 +25,23 @@ title: Model Size
 
 ## OCR
 
-| Model | XNNPACK [MB] |
-| --------------------- | :----------: |
-| Detector (CRAFT_800) | 83.1 |
-| Recognizer (CRNN_512) | 15 - 18\* |
-| Recognizer (CRNN_256) | 16 - 18\* |
-| Recognizer (CRNN_128) | 17 - 19\* |
+| Model | XNNPACK [MB] |
+| ------------------------------ | :----------: |
+| Detector (CRAFT_800_QUANTIZED) | 19.8 |
+| Recognizer (CRNN_512) | 15 - 18\* |
+| Recognizer (CRNN_256) | 16 - 18\* |
+| Recognizer (CRNN_128) | 17 - 19\* |
 
 \* - The model weights vary depending on the language.
 
 ## Vertical OCR
 
-| Model | XNNPACK [MB] |
-| ------------------------ | :----------: |
-| Detector (CRAFT_1280) | 83.1 |
-| Detector (CRAFT_320) | 83.1 |
-| Recognizer (CRNN_EN_512) | 15 - 18\* |
-| Recognizer (CRNN_EN_64) | 15 - 16\* |
+| Model | XNNPACK [MB] |
+| ------------------------------- | :----------: |
+| Detector (CRAFT_1280_QUANTIZED) | 19.8 |
+| Detector (CRAFT_320_QUANTIZED) | 19.8 |
+| Recognizer (CRNN_EN_512) | 15 - 18\* |
+| Recognizer (CRNN_EN_64) | 15 - 16\* |
 
 \* - The model weights vary depending on the language.
 