I'm still traveling and only have computer access on Tuesday.
-
I'm having a problem with the alignment of redactions when using AWS Rekognition (and occasionally Textract), and I wonder whether I need to change how I compute page locations for placing the OCR text (and therefore the redactions). With normal-sized text everything is fine, and the word alignment is good. The problem appears at larger font sizes, which matters because we are required to redact things like license plate numbers and other oddly sized text.
The code to set up for OCR using this is as follows:
Once I'm iterating over the returned OCR results, I'm using the following to compute dimensions of the bbox for the invisible text.
(fontSize = bbox.width / textLen was also tried, but occasionally causes near full-page redactions)
But when the results appear, they're offset in some way -- usually like one of these examples:
This must be related to how coordinates are computed for non-standard font sizes, but I'm not sure how to overcome it, since we have no way of knowing the font size of the text detected in an image. Has anyone tried this and had success? Any hints or ideas appreciated.
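
For anyone comparing notes, here is a minimal sketch of the mapping being discussed, not the code from my project. It assumes the Rekognition/Textract `BoundingBox` values (`Left`, `Top`, `Width`, `Height`) are ratios of the image size with a top-left origin, that the image covers the whole page with no rotation or cropping, and that the target page uses a bottom-left origin (as PDF user space does by default). The names `PageRect`, `bbox_to_page_rect`, `estimate_font_size`, and the `height_to_em` factor are hypothetical and only for illustration. The font size is derived from the box height rather than `bbox.width / textLen`, since the width-based value depends heavily on the character count and glyph widths.

```python
from dataclasses import dataclass


@dataclass
class PageRect:
    """Rectangle in page units; (x0, y0) is the lower-left corner."""
    x0: float
    y0: float
    x1: float
    y1: float


def bbox_to_page_rect(bbox, page_width, page_height, origin_bottom_left=True):
    """Scale a normalized, top-left-origin bounding box to page units.

    `bbox` is a dict with Left/Top/Width/Height given as ratios (0..1)
    of the image dimensions. Assumes the image maps onto the full page.
    """
    left = bbox["Left"] * page_width
    top = bbox["Top"] * page_height
    width = bbox["Width"] * page_width
    height = bbox["Height"] * page_height

    if origin_bottom_left:
        # Flip the vertical axis: distance from the top becomes
        # distance from the bottom.
        y1 = page_height - top
        y0 = y1 - height
    else:
        y0 = top
        y1 = top + height
    return PageRect(left, y0, left + width, y1)


def estimate_font_size(rect, height_to_em=1.0):
    """Estimate a font size from the box height.

    `height_to_em` is an assumed tuning factor (box height divided by
    em size); 1.0 treats the box height as the font size.
    """
    return abs(rect.y1 - rect.y0) / height_to_em


# Example: a detection on a US Letter page measured in points.
bbox = {"Left": 0.25, "Top": 0.10, "Width": 0.50, "Height": 0.05}
rect = bbox_to_page_rect(bbox, page_width=612.0, page_height=792.0)
font_size = estimate_font_size(rect)
```

If the page is rotated or the image was cropped or scaled before OCR, the same scaling has to be applied to the cropped region instead of the full page, and rotated text is better handled through the `Polygon` points that come back alongside `BoundingBox`.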