@@ -8,50 +8,52 @@ Times presented in the tables are measured as consecutive runs of the model. Ini
88
99## Classification
1010
11- | Model | iPhone 16 Pro (Core ML) [ ms] | iPhone 13 Pro (Core ML) [ ms] | iPhone SE 3 (Core ML) [ ms] | Samsung Galaxy S24 (XNNPACK) [ ms] | OnePlus 12 (XNNPACK) [ ms] |
11+ | Model | iPhone 17 Pro (Core ML) [ ms] | iPhone 16 Pro (Core ML) [ ms] | iPhone SE 3 (Core ML) [ ms] | Samsung Galaxy S24 (XNNPACK) [ ms] | OnePlus 12 (XNNPACK) [ ms] |
1212| ----------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: |
13- | EFFICIENTNET_V2_S | 100 | 120 | 130 | 180 | 170 |
13+ | EFFICIENTNET_V2_S | 150 | 161 | 227 | 196 | 214 |
1414
1515## Object Detection
1616
17- | Model | iPhone 16 Pro (XNNPACK) [ ms] | iPhone 13 Pro (XNNPACK) [ ms] | iPhone SE 3 (XNNPACK) [ ms] | Samsung Galaxy S24 (XNNPACK) [ ms] | OnePlus 12 (XNNPACK) [ ms] |
17+ | Model | iPhone 17 Pro (XNNPACK) [ ms] | iPhone 16 Pro (XNNPACK) [ ms] | iPhone SE 3 (XNNPACK) [ ms] | Samsung Galaxy S24 (XNNPACK) [ ms] | OnePlus 12 (XNNPACK) [ ms] |
1818| ------------------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: |
19- | SSDLITE_320_MOBILENET_V3_LARGE | 190 | 260 | 280 | 100 | 90 |
19+ | SSDLITE_320_MOBILENET_V3_LARGE | 261 | 279 | 414 | 125 | 115 |
2020
2121## Style Transfer
2222
23- | Model | iPhone 16 Pro (Core ML) [ ms] | iPhone 13 Pro (Core ML) [ ms] | iPhone SE 3 (Core ML) [ ms] | Samsung Galaxy S24 (XNNPACK) [ ms] | OnePlus 12 (XNNPACK) [ ms] |
23+ | Model | iPhone 17 Pro (Core ML) [ ms] | iPhone 16 Pro (Core ML) [ ms] | iPhone SE 3 (Core ML) [ ms] | Samsung Galaxy S24 (XNNPACK) [ ms] | OnePlus 12 (XNNPACK) [ ms] |
2424| ---------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: |
25- | STYLE_TRANSFER_CANDY | 450 | 600 | 750 | 1650 | 1800 |
26- | STYLE_TRANSFER_MOSAIC | 450 | 600 | 750 | 1650 | 1800 |
27- | STYLE_TRANSFER_UDNIE | 450 | 600 | 750 | 1650 | 1800 |
28- | STYLE_TRANSFER_RAIN_PRINCESS | 450 | 600 | 750 | 1650 | 1800 |
25+ | STYLE_TRANSFER_CANDY | 1565 | 1675 | 2325 | 1750 | 1620 |
26+ | STYLE_TRANSFER_MOSAIC | 1565 | 1675 | 2325 | 1750 | 1620 |
27+ | STYLE_TRANSFER_UDNIE | 1565 | 1675 | 2325 | 1750 | 1620 |
28+ | STYLE_TRANSFER_RAIN_PRINCESS | 1565 | 1675 | 2325 | 1750 | 1620 |
2929
3030## OCR
3131
32- | Model | iPhone 16 Pro (XNNPACK) [ ms] | iPhone 14 Pro Max (XNNPACK) [ ms] | iPhone SE 3 (XNNPACK) [ ms] | Samsung Galaxy S24 (XNNPACK) [ ms] | Samsung Galaxy S21 (XNNPACK) [ ms] |
33- | --------------------- | :--------------------------: | :------------------------------: | :------------------------: | :-------------------------------: | :-------------------------------: |
34- | Detector (CRAFT_800) | 2099 | 2227 | ❌ | 2245 | 7108 |
35- | Recognizer (CRNN_512) | 70 | 252 | ❌ | 54 | 151 |
36- | Recognizer (CRNN_256) | 39 | 123 | ❌ | 24 | 78 |
37- | Recognizer (CRNN_128) | 17 | 83 | ❌ | 14 | 39 |
32+ Note that the recognizer models were executed between 3 and 7 times during a single recognition.
33+ The values below are averages across all runs for the benchmark image.
3834
39- ❌ - Insufficient RAM.
35+ | Model | iPhone 17 Pro (XNNPACK) [ ms] | iPhone 16 Pro (XNNPACK) [ ms] | iPhone SE 3 (XNNPACK) [ ms] | Samsung Galaxy S24 (XNNPACK) [ ms] | OnePlus 12 (XNNPACK) [ ms] |
36+ | ------------------------------ | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: |
37+ | Detector (CRAFT_800_QUANTIZED) | 779 | 897 | 1276 | 553 | 586 |
38+ | Recognizer (CRNN_512) | 77 | 74 | 244 | 56 | 57 |
39+ | Recognizer (CRNN_256) | 35 | 37 | 120 | 28 | 30 |
40+ | Recognizer (CRNN_128) | 18 | 19 | 60 | 14 | 16 |
4041
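The averaging mentioned in the note above the table can be sketched as follows. This is a minimal illustration, assuming the reported value is the mean over every recognizer invocation pooled across all runs; the function name `average_latency_ms`, the nested-list layout, and the timings are hypothetical and not taken from the benchmark code.

```python
from statistics import mean


def average_latency_ms(runs):
    """Average per-invocation latency pooled across all benchmark runs.

    `runs` is a list of runs; each run is a list of per-invocation
    timings in milliseconds (a hypothetical layout, assumed here).
    """
    # Flatten so every invocation contributes equally to the average.
    samples = [t for run in runs for t in run]
    if not samples:
        raise ValueError("no timing samples")
    return mean(samples)


# Made-up timings: one run with 3 invocations, one with 4.
example_runs = [[10.0, 12.0, 14.0], [11.0, 13.0, 15.0, 9.0]]
print(average_latency_ms(example_runs))  # 12.0
```

Pooling invocations weights longer recognitions more heavily than averaging per-run means would; which of the two the benchmark used is not stated in the source.
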
4142## Vertical OCR
4243
43- | Model | iPhone 16 Pro (XNNPACK) [ ms] | iPhone 14 Pro Max (XNNPACK) [ ms] | iPhone SE 3 (XNNPACK) [ ms] | Samsung Galaxy S24 (XNNPACK) [ ms] | Samsung Galaxy S21 (XNNPACK) [ ms] |
44- | --------------------- | :--------------------------: | :------------------------------: | :------------------------: | :-------------------------------: | :-------------------------------: |
45- | Detector (CRAFT_1280) | 5457 | 5833 | ❌ | 6296 | 14053 |
46- | Detector (CRAFT_320) | 1351 | 1460 | ❌ | 1485 | 3101 |
47- | Recognizer (CRNN_512) | 39 | 123 | ❌ | 24 | 78 |
48- | Recognizer (CRNN_64) | 10 | 33 | ❌ | 7 | 18 |
44+ Note that the recognizer models, as well as the CRAFT_320 detector model, were executed between 4 and 21 times during a single recognition.
45+ The values below are averages across all runs for the benchmark image.
4946
50- ❌ - Insufficient RAM.
47+ | Model | iPhone 17 Pro (XNNPACK) [ ms] | iPhone 16 Pro (XNNPACK) [ ms] | iPhone SE 3 (XNNPACK) [ ms] | Samsung Galaxy S24 (XNNPACK) [ ms] | OnePlus 12 (XNNPACK) [ ms] |
48+ | ------------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: |
49+ | Detector (CRAFT_1280_QUANTIZED) | 1918 | 2304 | 3371 | 1391 | 1445 |
50+ | Detector (CRAFT_320_QUANTIZED) | 473 | 563 | 813 | 361 | 382 |
51+ | Recognizer (CRNN_512) | 78 | 83 | 310 | 59 | 57 |
52+ | Recognizer (CRNN_64) | 9 | 9 | 38 | 8 | 7 |
5153
5254## LLMs
5355
54- | Model | iPhone 16 Pro (XNNPACK) [ tokens/s] | iPhone 13 Pro (XNNPACK) [ tokens/s] | iPhone SE 3 (XNNPACK) [ tokens/s] | Samsung Galaxy S24 (XNNPACK) [ tokens/s] | OnePlus 12 (XNNPACK) [ tokens/s] |
56+ | Model | iPhone 17 Pro (XNNPACK) [ tokens/s] | iPhone 16 Pro (XNNPACK) [ tokens/s] | iPhone SE 3 (XNNPACK) [ tokens/s] | Samsung Galaxy S24 (XNNPACK) [ tokens/s] | OnePlus 12 (XNNPACK) [ tokens/s] |
5557| --------------------- | :--------------------------------: | :--------------------------------: | :------------------------------: | :-------------------------------------: | :-----------------------------: |
5658| LLAMA3_2_1B | 16.1 | 11.4 | ❌ | 15.6 | 19.3 |
5759| LLAMA3_2_1B_SPINQUANT | 40.6 | 16.7 | 16.5 | 40.3 | 48.2 |
@@ -68,7 +70,7 @@ Times presented in the tables are measured as consecutive runs of the model. Ini
6870
6971Note that the `Whisper` model requires 30-second audio chunks as input (shorter audio is automatically padded with silence to 30 seconds). Its `fast` mode has the lowest latency (the time from starting transcription to the first returned token, which is introduced by the streaming algorithm) but the slowest speed. For the lowest latency and the fastest transcription, we therefore suggest the `Moonshine` model; if you still want to use `Whisper`, prefer the `balanced` mode.
7072
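The two quantities reported per cell in this table, latency and tokens/s, can be derived from token arrival timestamps. The sketch below assumes that speed is the number of tokens divided by the time from the start of transcription to the last token; the source does not define the exact formula, and `streaming_metrics` and the timestamps are hypothetical.

```python
def streaming_metrics(start_s, token_times_s):
    """Return (latency_s, tokens_per_s) for one streamed transcription.

    latency_s: time from starting transcription to the first token.
    tokens_per_s: tokens emitted divided by the time from start to the
    last token (an assumed definition, not confirmed by the source).
    """
    if not token_times_s:
        raise ValueError("no tokens were emitted")
    latency_s = token_times_s[0] - start_s
    total_s = token_times_s[-1] - start_s
    if total_s <= 0:
        raise ValueError("token timestamps must be after the start time")
    return latency_s, len(token_times_s) / total_s


# Made-up timestamps (seconds): four tokens, the first after 0.5 s.
latency, speed = streaming_metrics(0.0, [0.5, 1.0, 1.5, 2.0])
print(f"{latency:.1f}s | {speed:.1f}t/s")  # 0.5s | 2.0t/s
```
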
71- | Model (mode) | iPhone 16 Pro (XNNPACK) [ latency \| tokens/s] | iPhone 14 Pro (XNNPACK) [ latency \| tokens/s] | iPhone SE 3 (XNNPACK) [ latency \| tokens/s] | Samsung Galaxy S24 (XNNPACK) [ latency \| tokens/s] | OnePlus 12 (XNNPACK) [ latency \| tokens/s] |
73+ | Model (mode) | iPhone 17 Pro (XNNPACK) [ latency \| tokens/s] | iPhone 16 Pro (XNNPACK) [ latency \| tokens/s] | iPhone SE 3 (XNNPACK) [ latency \| tokens/s] | Samsung Galaxy S24 (XNNPACK) [ latency \| tokens/s] | OnePlus 12 (XNNPACK) [ latency \| tokens/s] |
7274| ------------------------- | :-------------------------------------------: | :-------------------------------------------: | :-----------------------------------------: | :------------------------------------------------: | :----------------------------------------: |
7375| Moonshine-tiny (fast) | 0.8s \| 19.0t/s | 1.5s \| 11.3t/s | 1.5s \| 10.4t/s | 2.0s \| 8.8t/s | 1.6s \| 12.5t/s |
7476| Moonshine-tiny (balanced) | 2.0s \| 20.0t/s | 3.2s \| 12.4t/s | 3.7s \| 10.4t/s | 4.6s \| 11.2t/s | 3.4s \| 14.6t/s |
@@ -81,7 +83,7 @@ Notice than for `Whisper` model which has to take as an input 30 seconds audio c
8183
8284Average time for encoding audio of a given length, measured over 10 runs. For the `Whisper` model, only 30-second audio chunks are listed, since `Whisper` does not accept other lengths (shorter audio must be padded to 30 seconds with silence).
8385
84- | Model | iPhone 16 Pro (XNNPACK) [ ms] | iPhone 14 Pro (XNNPACK) [ ms] | iPhone SE 3 (XNNPACK) [ ms] | Samsung Galaxy S24 (XNNPACK) [ ms] | OnePlus 12 (XNNPACK) [ ms] |
86+ | Model | iPhone 17 Pro (XNNPACK) [ ms] | iPhone 16 Pro (XNNPACK) [ ms] | iPhone SE 3 (XNNPACK) [ ms] | Samsung Galaxy S24 (XNNPACK) [ ms] | OnePlus 12 (XNNPACK) [ ms] |
8587| -------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: |
8688| Moonshine-tiny (5s) | 99 | 95 | 115 | 284 | 277 |
8789| Moonshine-tiny (10s) | 178 | 177 | 204 | 555 | 528 |
@@ -92,7 +94,7 @@ Average time for encoding audio of given length over 10 runs. For `Whisper` mode
9294
9395Average time for decoding one token in a sequence of 100 tokens, with the encoding context obtained from audio of the noted length.
9496
95- | Model | iPhone 16 Pro (XNNPACK) [ ms] | iPhone 14 Pro (XNNPACK) [ ms] | iPhone SE 3 (XNNPACK) [ ms] | Samsung Galaxy S24 (XNNPACK) [ ms] | OnePlus 12 (XNNPACK) [ ms] |
97+ | Model | iPhone 17 Pro (XNNPACK) [ ms] | iPhone 16 Pro (XNNPACK) [ ms] | iPhone SE 3 (XNNPACK) [ ms] | Samsung Galaxy S24 (XNNPACK) [ ms] | OnePlus 12 (XNNPACK) [ ms] |
9698| -------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: |
9799| Moonshine-tiny (5s) | 48.98 | 47.98 | 46.86 | 36.70 | 29.03 |
98100| Moonshine-tiny (10s) | 54.24 | 51.74 | 55.07 | 46.31 | 32.41 |
@@ -101,9 +103,9 @@ Average time for decoding one token in sequence of 100 tokens, with encoding con
101103
102104## Text Embeddings
103105
104- | Model | iPhone 16 Pro (XNNPACK) [ ms] | iPhone 14 Pro Max (XNNPACK) [ ms] | iPhone SE 3 (XNNPACK) [ ms] | Samsung Galaxy S24 (XNNPACK) | OnePlus 12 (XNNPACK) [ ms] |
105- | -------------------------- | :--------------------------: | :------------------------------ : | :------------------------: | :--------------------------: | :-----------------------: |
106- | ALL_MINILM_L6_V2 | 53 | 69 | 78 | 60 | 65 |
107- | ALL_MPNET_BASE_V2 | 352 | 423 | 478 | 521 | 527 |
108- | MULTI_QA_MINILM_L6_COS_V1 | 135 | 166 | 180 | 158 | 165 |
109- | MULTI_QA_MPNET_BASE_DOT_V1 | 503 | 598 | 680 | 694 | 743 |
106+ | Model | iPhone 17 Pro (XNNPACK) [ ms] | iPhone 16 Pro (XNNPACK) [ ms] | iPhone SE 3 (XNNPACK) [ ms] | Samsung Galaxy S24 (XNNPACK) [ ms] | OnePlus 12 (XNNPACK) [ ms] |
107+ | -------------------------- | :--------------------------: | :--------------------------: | :------------------------: | :-------------------------------: | :-----------------------: |
108+ | ALL_MINILM_L6_V2 | 50 | 58 | 84 | 58 | 58 |
109+ | ALL_MPNET_BASE_V2 | 352 | 428 | 879 | 483 | 517 |
110+ | MULTI_QA_MINILM_L6_COS_V1 | 133 | 161 | 269 | 151 | 155 |
111+ | MULTI_QA_MPNET_BASE_DOT_V1 | 502 | 796 | 1216 | 915 | 713 |