The translation converted the original page into an image and then wrote the translated text on top of it.
So the easiest way is probably to just remove that image:

import fitz  # PyMuPDF

doc = fitz.open("page_4_tr.pdf")
page = doc[0]
print(page.get_images())
# prints: [(13, 0, 1681, 2378, 8, 'DeviceRGB', '', 'FXX1', 'DCTDecode')]
page.delete_image(13)  # 13 is the image's xref, the first item of the tuple above
doc.save("english.pdf", garbage=3, deflate=True)

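This gives a page with only the English text and nothing else. Note the save options: garbage=3 runs garbage collection so objects that are no longer referenced are dropped from the file, and deflate=True compresses the streams that are written.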
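If you would rather not hard-code the xref, the same idea can be wrapped in a small helper. This is only a sketch: `image_xrefs` and `remove_page_images` are names made up for this example, not part of PyMuPDF, and it assumes that every image on the page should go, which was the case for this file but may not be for yours. As far as I know, `delete_image` acts on the image object itself, so an image shared with other pages would disappear there as well.

```python
def image_xrefs(image_list):
    """Return the unique xrefs from a Page.get_images() result, in order."""
    seen = []
    for item in image_list:
        xref = item[0]  # the first tuple element is the image's xref
        if xref not in seen:
            seen.append(xref)
    return seen


def remove_page_images(src, dst, page_number=0):
    """Delete every image referenced by one page and save a cleaned copy.

    Hypothetical helper for illustration; returns the xrefs it deleted.
    """
    import fitz  # PyMuPDF; imported here so image_xrefs() works without it

    doc = fitz.open(src)
    page = doc[page_number]
    xrefs = image_xrefs(page.get_images())
    for xref in xrefs:
        page.delete_image(xref)
    # Same save options as in the answer above.
    doc.save(dst, garbage=3, deflate=True)
    doc.close()
    return xrefs
```

Calling remove_page_images("page_4_tr.pdf", "english.pdf") should then do the same as the snippet above for this particular file.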

Answer selected by JorjMcKie