Commit 1141a86

JakubGonera, mlodyjesienin, NorbertKlockiewicz, and msluszniak authored
feat: port OCR to C++ (#389)
## Description

Port the native implementation to C++

### Type of change

- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] Documentation update (improves or adds clarity to existing documentation)

### Tested on

- [x] iOS
- [ ] Android

### Related issues

#259

### Checklist

- [x] I have performed a self-review of my code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have updated the documentation accordingly
- [ ] My changes generate no new warnings

---------

Co-authored-by: mlodyjesienin <[email protected]>
Co-authored-by: Norbert Klockiewicz <[email protected]>
Co-authored-by: Filip Zieliński <[email protected]>
Co-authored-by: Mateusz Sluszniak <[email protected]>
1 parent bffcd9f commit 1141a86

94 files changed: +2716 -4619 lines changed

apps/computer-vision/app/object_detection/index.tsx

Lines changed: 0 additions & 1 deletion

@@ -43,7 +43,6 @@ export default function ObjectDetectionScreen() {
     if (imageUri) {
       try {
         const output = await ssdLite.forward(imageUri);
-        console.log(output);
         setResults(output);
       } catch (e) {
         console.error(e);

apps/computer-vision/app/ocr/index.tsx

Lines changed: 2 additions & 3 deletions

@@ -38,7 +38,6 @@ export default function OCRScreen() {
     try {
       const output = await model.forward(imageUri);
       setResults(output);
-      console.log(output);
     } catch (e) {
       console.error(e);
     }
@@ -78,8 +77,8 @@ export default function OCRScreen() {
           <View style={styles.results}>
             <Text style={styles.resultHeader}>Results</Text>
             <ScrollView style={styles.resultsList}>
-              {results.map(({ text, score }) => (
-                <View key={text} style={styles.resultRecord}>
+              {results.map(({ text, score }, index) => (
+                <View key={index} style={styles.resultRecord}>
                   <Text style={styles.resultLabel}>{text}</Text>
                   <Text>{score.toFixed(3)}</Text>
                 </View>
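
The second hunk replaces `key={text}` with `key={index}`. The commit message does not give a reason; one plausible motivation (an assumption, not stated in the source) is that OCR can return the same string for several detected regions, which would produce duplicate React keys. A minimal sketch of that situation, using made-up sample data and a hypothetical `findDuplicateKeys` helper:

```typescript
// Hypothetical shape of one OCR result; only `text` and `score` appear in the diff.
interface OcrResult {
  text: string;
  score: number;
}

// Hypothetical helper: returns every key that occurs more than once.
function findDuplicateKeys<K extends string | number>(keys: K[]): K[] {
  const duplicates: K[] = [];
  keys.forEach((key, position) => {
    const seenEarlier = keys.indexOf(key) !== position;
    if (seenEarlier && duplicates.indexOf(key) === -1) {
      duplicates.push(key);
    }
  });
  return duplicates;
}

// Made-up sample data: the same word detected in two separate regions.
const sampleResults: OcrResult[] = [
  { text: 'SALE', score: 0.98 },
  { text: 'SALE', score: 0.91 },
  { text: '50%', score: 0.87 },
];

// Keys as they would be derived before and after the change.
const textKeys = sampleResults.map(({ text }) => text);
const indexKeys = sampleResults.map((_, index) => index);

console.log(findDuplicateKeys(textKeys)); // the repeated 'SALE' key
console.log(findDuplicateKeys(indexKeys)); // no duplicates
```

Index keys are unique by construction, which suits a list that is rendered once per inference and never reordered; for lists that are reordered or edited in place, a stable identifier is usually the better choice.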

apps/computer-vision/app/ocr_vertical/index.tsx

Lines changed: 2 additions & 3 deletions

@@ -40,7 +40,6 @@ export default function VerticalOCRScree() {
     try {
       const output = await model.forward(imageUri);
       setResults(output);
-      console.log(output);
     } catch (e) {
       console.error(e);
     }
@@ -80,8 +79,8 @@ export default function VerticalOCRScree() {
           <View style={styles.results}>
             <Text style={styles.resultHeader}>Results</Text>
             <ScrollView style={styles.resultsList}>
-              {results.map(({ text, score }) => (
-                <View key={text} style={styles.resultRecord}>
+              {results.map(({ text, score }, index) => (
+                <View key={index} style={styles.resultRecord}>
                   <Text style={styles.resultLabel}>{text}</Text>
                   <Text>{score.toFixed(3)}</Text>
                 </View>

apps/computer-vision/ios/Podfile.lock

Lines changed: 1 addition & 1 deletion

@@ -2454,7 +2454,7 @@ SPEC CHECKSUMS:
   React-logger: 8edfcedc100544791cd82692ca5a574240a16219
   React-Mapbuffer: c3f4b608e4a59dd2f6a416ef4d47a14400194468
   React-microtasksnativemodule: 054f34e9b82f02bd40f09cebd4083828b5b2beb6
-  react-native-executorch: 98a2d5c0fc2290d473db87f2d6f3bf9dc7b77ab1
+  react-native-executorch: d06ae11e5411f0cb798316c4e69cf7d8678da297
   react-native-image-picker: 8a3f16000e794f5381a7fe47bb48fd8d06741e47
   react-native-safe-area-context: 562163222d999b79a51577eda2ea8ad2c32b4d06
   react-native-skia: b6cb66e99a953dae6880348c92cfb20a76d90b4f

docs/docs/02-hooks/02-computer-vision/useOCR.md

Lines changed: 22 additions & 7 deletions

@@ -301,19 +301,34 @@ You need to make sure the recognizer models you pass in `recognizerSources` matc

 | Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] |
 | -------------------------------------------------------------------------------------------- | :--------------------: | :----------------: |
-| Detector (CRAFT_800) + Recognizer (CRNN_512) + Recognizer (CRNN_256) + Recognizer (CRNN_128) | 2100 | 1782 |
+| Detector (CRAFT_800) + Recognizer (CRNN_512) + Recognizer (CRNN_256) + Recognizer (CRNN_128) | 1600 | 1700 |

 ### Inference time

+**Image Used for Benchmarking:**
+
+| ![Alt text](../../../static/img/harvard.png) | ![Alt text](../../../static/img/harvard-boxes.png) |
+| -------------------------------------------- | -------------------------------------------------- |
+| Original Image | Image with detected Text Boxes |
+
 :::warning warning
 Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization.
 :::

-| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | Samsung Galaxy S21 (XNNPACK) [ms] |
-| --------------------- | :--------------------------: | :------------------------------: | :------------------------: | :-------------------------------: | :-------------------------------: |
-| Detector (CRAFT_800) | 2099 | 2227 | ❌ | 2245 | 7108 |
-| Recognizer (CRNN_512) | 70 | 252 | ❌ | 54 | 151 |
-| Recognizer (CRNN_256) | 39 | 123 | ❌ | 24 | 78 |
-| Recognizer (CRNN_128) | 17 | 83 | ❌ | 14 | 39 |
+**Time measurements:**
+
+| Metric | iPhone 14 Pro Max <br /> [ms] | iPhone 16 Pro <br /> [ms] | iPhone SE 3 | Samsung Galaxy S24 <br /> [ms] | OnePlus 12 <br /> [ms] |
+| ------------------------- | ----------------------------- | ------------------------- | ----------- | ------------------------------ | ---------------------- |
+| **Total Inference Time** | 4330 | 2537 | ❌ | 6648 | 5993 |
+| **Detector (CRAFT_800)** | 1945 | 1809 | ❌ | 2080 | 1961 |
+| **Recognizer (CRNN_512)** | | | | | |
+| ├─ Average Time | 273 | 76 | ❌ | 289 | 252 |
+| ├─ Total Time (3 runs) | 820 | 229 | ❌ | 867 | 756 |
+| **Recognizer (CRNN_256)** | | | | | |
+| ├─ Average Time | 137 | 39 | ❌ | 260 | 229 |
+| ├─ Total Time (7 runs) | 958 | 271 | ❌ | 1818 | 1601 |
+| **Recognizer (CRNN_128)** | | | | | |
+| ├─ Average Time | 68 | 18 | ❌ | 239 | 214 |
+| ├─ Total Time (7 runs) | 478 | 124 | ❌ | 1673 | 1498 |

 ❌ - Insufficient RAM.
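
In the new table, each recognizer's Average Time and Total Time rows are linked by the run count shown in parentheses. The sketch below checks that relationship for the iPhone 14 Pro Max column. The helper `averageMs` and the rounding to the nearest millisecond are assumptions made for illustration. The label "unaccounted" for the remainder is also an assumption, since the source does not say what the Total Inference Time includes beyond the model runs.

```typescript
// Average time per run, rounded to the nearest millisecond (rounding is an assumption).
function averageMs(totalMs: number, runs: number): number {
  return Math.round(totalMs / runs);
}

// iPhone 14 Pro Max figures copied from the useOCR.md table in this diff.
const totalInferenceMs = 4330;
const detectorMs = 1945;
const recognizers = [
  { model: 'CRNN_512', totalMs: 820, runs: 3, reportedAverageMs: 273 },
  { model: 'CRNN_256', totalMs: 958, runs: 7, reportedAverageMs: 137 },
  { model: 'CRNN_128', totalMs: 478, runs: 7, reportedAverageMs: 68 },
];

// Compare the computed average with the one reported in the table.
recognizers.forEach(({ model, totalMs, runs, reportedAverageMs }) => {
  const computed = averageMs(totalMs, runs);
  console.log(`${model}: computed ${computed} ms, reported ${reportedAverageMs} ms`);
});

// Time in the total that the per-model rows do not cover.
const modelMs = recognizers.reduce((sum, r) => sum + r.totalMs, detectorMs);
const unaccountedMs = totalInferenceMs - modelMs;
console.log(`Model time: ${modelMs} ms, unaccounted: ${unaccountedMs} ms`);
```

For this column the computed averages match the reported ones, and the per-model rows sum to slightly less than the Total Inference Time.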

docs/docs/02-hooks/02-computer-vision/useVerticalOCR.md

Lines changed: 23 additions & 8 deletions

@@ -316,20 +316,35 @@ You need to make sure the recognizer models you pass in `recognizerSources` matc

 | Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] |
 | -------------------------------------------------------------------- | :--------------------: | :----------------: |
-| Detector (CRAFT_1280) + Detector (CRAFT_320) + Recognizer (CRNN_512) | 2770 | 3720 |
-| Detector(CRAFT_1280) + Detector(CRAFT_320) + Recognizer (CRNN_64) | 1770 | 2740 |
+| Detector (CRAFT_1280) + Detector (CRAFT_320) + Recognizer (CRNN_512) | 2172 | 2214 |
+| Detector(CRAFT_1280) + Detector(CRAFT_320) + Recognizer (CRNN_64) | 1774 | 1705 |

 ### Inference time

+**Image Used for Benchmarking:**
+
+| ![Alt text](../../../static/img/sales-vertical.jpeg) | ![Alt text](../../../static/img/sales-vertical-boxes.png) |
+| ---------------------------------------------------- | --------------------------------------------------------- |
+| Original Image | Image with detected Text Boxes |
+
 :::warning warning
 Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization.
 :::

-| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | Samsung Galaxy S21 (XNNPACK) [ms] |
-| --------------------- | :--------------------------: | :------------------------------: | :------------------------: | :-------------------------------: | :-------------------------------: |
-| Detector (CRAFT_1280) | 5457 | 5833 | ❌ | 6296 | 14053 |
-| Detector (CRAFT_320) | 1351 | 1460 | ❌ | 1485 | 3101 |
-| Recognizer (CRNN_512) | 39 | 123 | ❌ | 24 | 78 |
-| Recognizer (CRNN_64) | 10 | 33 | ❌ | 7 | 18 |
+**Time measurements:**
+
+| Metric | iPhone 14 Pro Max <br /> [ms] | iPhone 16 Pro <br /> [ms] | iPhone SE 3 | Samsung Galaxy S24 <br /> [ms] | OnePlus 12 <br /> [ms] |
+| -------------------------------------------------------------------------- | ----------------------------- | ------------------------- | ----------- | ------------------------------ | ---------------------- |
+| **Total Inference Time** | 9350 / 9620 | 8572 / 8621 | ❌ | 13737 / 10570 | 13436 / 9848 |
+| **Detector (CRAFT_1250)** | 4895 | 4756 | ❌ | 5574 | 5016 |
+| **Detector (CRAFT_320)** | | | | | |
+| ├─ Average Time | 1247 | 1206 | ❌ | 1350 | 1356 |
+| ├─ Total Time (3 runs) | 3741 | 3617 | ❌ | 4050 | 4069 |
+| **Recognizer (CRNN_64)** <br /> (_With Flag `independentChars == true`_) | | | | | |
+| ├─ Average Time | 31 | 9 | ❌ | 195 | 207 |
+| ├─ Total Time (21 runs) | 649 | 191 | ❌ | 4092 | 4339 |
+| **Recognizer (CRNN_512)** <br /> (_With Flag `independentChars == false`_) | | | | | |
+| ├─ Average Time | 306 | 80 | ❌ | 308 | 250 |
+| ├─ Total Time (3 runs) | 919 | 240 | ❌ | 925 | 751 |

 ❌ - Insufficient RAM.
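
The warning in the docs says the times come from consecutive runs and that the first run may take up to 2x longer because of model loading and initialization. The source does not show how the measurements were taken. The sketch below is one way to collect comparable numbers: discard a warm-up run, then time the runs that follow. The helper `measureConsecutiveRuns` and its parameters are hypothetical, and `run` stands in for any async inference call.

```typescript
// Hypothetical result shape for the benchmark helper below.
interface RunTimings {
  averageMs: number;
  totalMs: number;
}

// Hypothetical helper: times `measuredRuns` consecutive calls of `run`
// after `warmupRuns` untimed calls that absorb loading and initialization.
async function measureConsecutiveRuns(
  run: () => Promise<unknown>,
  measuredRuns: number,
  warmupRuns: number = 1
): Promise<RunTimings> {
  // Warm-up calls are executed but not timed.
  for (let i = 0; i < warmupRuns; i++) {
    await run();
  }

  // Timed calls, executed back to back.
  let totalMs = 0;
  for (let i = 0; i < measuredRuns; i++) {
    const start = Date.now();
    await run();
    totalMs += Date.now() - start;
  }

  return { averageMs: totalMs / measuredRuns, totalMs };
}
```

With the example apps in this commit, a call might look like `measureConsecutiveRuns(() => model.forward(imageUri), 3)`, where `model.forward(imageUri)` is the call shown in the diff above.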
