@benITo47
Contributor

Description

This PR refactors the OCR and VerticalOCR to improve efficiency.

Previously, our implementation relied on multiple instances of the same detector and recognizer models, each handling a different input size (3x Detector, 4x Recognizer instances). This approach was resource-intensive.

This update introduces a more streamlined approach by using a single detector and a single recognizer model, each with multiple forward_ methods (e.g., forward_800, forward_320). These methods handle different input widths within the same model instance, significantly reducing the number of loaded models and simplifying the API.
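As a rough illustration of the naming scheme described above, the sketch below maps an input width to a `forward_<width>` method name and rejects widths the model does not expose. The helper `forwardMethodFor` and the `EXAMPLE_WIDTHS` list are hypothetical and only mirror the two method names quoted in this description; the actual models may expose a different set of widths.

```typescript
// Illustrative only: these widths come from the two method names quoted in
// the description (forward_800, forward_320). The real set may differ.
const EXAMPLE_WIDTHS: readonly number[] = [320, 800];

// Hypothetical helper, not part of the library's API.
// Returns the name of the inference method for a given input width,
// or throws if the width is not in the supported list.
function forwardMethodFor(
  width: number,
  supported: readonly number[] = EXAMPLE_WIDTHS
): string {
  if (!supported.includes(width)) {
    throw new Error(
      `Unsupported input width ${width}; expected one of ${supported.join(', ')}`
    );
  }
  return `forward_${width}`;
}
```

With this sketch, `forwardMethodFor(800)` yields `forward_800`, while a width outside the list fails early instead of reaching the model.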

This is a breaking change: it modifies the arguments of useOCR, useVerticalOCR, OCRModule, and VerticalOCRModule.

Introduces a breaking change?

  • Yes
  • No

Type of change

  • Bug fix (change which fixes an issue)
  • New feature (change which adds functionality)
  • Documentation update (improves or adds clarity to existing documentation)
  • Other (chores, tests, code style improvements etc.)

Tested on

  • iOS
  • Android

Testing instructions

Manual sanity checks.

Screenshots

Related issues

#692

Checklist

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have updated the documentation accordingly
  • My changes generate no new warnings

Additional notes

Refactor of TypeScript interfaces and hooks for OCR and VerticalOCR to support models that expose multiple inference methods for different input sizes.
This commit simplifies the current setup by allowing a single detector and recognizer source, rather than requiring separate entries for different input sizes.
…er model

	Adapts the C++ Recognition controllers to handle a single recognizer file that contains multiple inference methods.
	This commit adapts the C++ OCR and VerticalOCR controllers to handle a single detector model with multiple inference methods.
@msluszniak msluszniak added the feature PRs that implement a new feature label Jan 14, 2026
@msluszniak msluszniak linked an issue Jan 14, 2026 that may be closed by this pull request
@msluszniak
Member

The C++ code and docs look good; someone needs to review the rest and test it.

Contributor

@IgorSwat IgorSwat left a comment

Overall, good code.

@benITo47 benITo47 force-pushed the @bo/ocr_single_weights branch from 4e3b744 to 093ecf3 Compare January 14, 2026 12:42
…rtical_ocr/VerticalDetector.h

Co-authored-by: Jakub Chmura <[email protected]>
@benITo47 benITo47 requested review from IgorSwat and chmjkb January 15, 2026 12:58
@benITo47 benITo47 force-pushed the @bo/ocr_single_weights branch from e592b72 to d935acb Compare January 15, 2026 14:10
@msluszniak
Member

I tested OCR on demo apps on your branch, and I don't know if the models are just off or something is wrong:
image
image

@benITo47
Contributor Author

image
@msluszniak I checked your example against the main branch. It's not caused by my changes.

@msluszniak
Member

OK, the fields showing 0.000 were especially concerning. I don't think it should show any box in which no characters are recognized 😅 Maybe we should fix the demo app so it doesn't show them?
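A minimal sketch of what such a filter in the demo app could look like. The `Detection` shape and the `visibleDetections` helper are assumptions made for illustration, not the library's actual types; the idea is only to drop entries with empty text or a zero score before drawing boxes.

```typescript
// Hypothetical shape of one OCR result; the real type may carry more fields
// (for example, bounding-box coordinates) or use different names.
interface Detection {
  text: string;
  score: number;
}

// Hypothetical helper: keeps only detections that recognized some text
// and whose score is strictly above the threshold (0 by default).
function visibleDetections(
  detections: Detection[],
  minScore: number = 0
): Detection[] {
  return detections.filter(
    (d) => d.text.trim().length > 0 && d.score > minScore
  );
}
```

Filtering at render time leaves the model output untouched, so the raw results stay available for debugging.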

@benITo47 benITo47 force-pushed the @bo/ocr_single_weights branch from d935acb to 9b8d376 Compare January 15, 2026 16:06
	- Changed error messages in Detector classes
	- Added input width check in generate functions
	- Reverted 'export' keyword from modelUrls.ts
@benITo47 benITo47 force-pushed the @bo/ocr_single_weights branch from 9b8d376 to cbf7e36 Compare January 15, 2026 16:28
@benITo47 benITo47 merged commit 49fcc59 into main Jan 16, 2026
3 of 5 checks passed
@benITo47 benITo47 deleted the @bo/ocr_single_weights branch January 16, 2026 07:40
@msluszniak
Member

I guess we released code which does not compile:

image
It was not visible because CI had already failed earlier due to cache bloating. Please fix this, @benITo47.

@benITo47
Contributor Author

Rebasing error. On it

Labels

feature PRs that implement a new feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Re-export OCR to use single weights

5 participants