@PeterStaar-IBM
Contributor

No description provided.

Signed-off-by: Peter Staar <[email protected]>
PeterStaar-IBM self-assigned this Oct 29, 2025
@mergify

mergify bot commented Oct 29, 2025

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:
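A title can be checked against this rule locally before pushing. The sketch below is a minimal illustration, assuming the pattern behaves the same under Python's `re` module as it does in the merge-protection engine; the helper name `is_conventional` is hypothetical and not part of any tool mentioned here.

```python
import re

# Pattern copied verbatim from the merge-protection rule above.
CONVENTIONAL_TITLE = re.compile(
    r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:"
)


def is_conventional(title: str) -> bool:
    """Return True if the title starts with a conventional-commit prefix.

    Hypothetical helper for illustration; matching is case-sensitive,
    exactly as the pattern is written.
    """
    return CONVENTIONAL_TITLE.search(title) is not None
```

With this sketch, a title such as `fix: handle glyph placeholders` or `feat(parser)!: new decoding` would pass, while `Update README` or `Fix: typo` (capitalised type) would not.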

@github-actions
Contributor

DCO Check Passed

Thanks @PeterStaar-IBM, all your commits are properly signed off. 🎉

@simonschoe

simonschoe commented Nov 8, 2025

Hi @PeterStaar-IBM, are you still actively working on this? Every once in a while we encounter documents where the parsed text looks somewhat like this:

�GLYPH<c=3,font=/FONT_NAME> GLYPH<c=3,font=/FONT_NAME> %HVRQGHUHGLYPH<c=3,font=/FONT_NAME>%HGLQJXQJHQGLYPH<c=3,font=/FONT_NAME>I�UGLYPH<c=3,font=/FONT_NAME>%HUHFKWLJWHGLYPH<c=3,font=/FONT_NAME>GHVGLYPH<c=3,font=/FONT_NAME>7 DULIVGLYPH<c=3,font=/FONT_NAME> GLYPH<c=3,font=/FONT_NAME> ÄgIIHQWOLFKHUGLYPH<c=3,font=/FONT_NAME> ...

It is particularly prevalent in documents that are created with corporate fonts. Would love to see a general fix for this issue (if possible) as it may tackle a variety of different parsing issues.
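Until a general fix exists, affected output can at least be flagged. The following is a rough diagnostic sketch, not a fix and not part of this PR: it assumes the placeholders always follow the `GLYPH<c=N,font=...>` shape seen in the sample above, and the function names and the 0.2 threshold are arbitrary, hypothetical choices.

```python
import re

# Assumed placeholder shape, inferred only from the sample output above.
GLYPH_PLACEHOLDER = re.compile(r"GLYPH<c=\d+,font=[^>]*>")


def count_glyph_placeholders(text: str) -> int:
    """Count GLYPH<...> placeholders in a piece of parsed text."""
    return len(GLYPH_PLACEHOLDER.findall(text))


def glyph_placeholder_ratio(text: str) -> float:
    """Fraction of characters in the text that belong to placeholders."""
    if not text:
        return 0.0
    covered = sum(len(m.group(0)) for m in GLYPH_PLACEHOLDER.finditer(text))
    return covered / len(text)


def looks_garbled(text: str, threshold: float = 0.2) -> bool:
    """Flag text whose placeholder share reaches the (arbitrary) threshold."""
    return glyph_placeholder_ratio(text) >= threshold
```

Running `looks_garbled` over each parsed page or text block would give a quick way to list the documents that hit this problem, for example to collect reproducible samples for the issue.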
