Replies: 1 comment
-
Here is the method I use. One caveat: I'm not sure a 'transform' operator always comes before 'paintImageXObject', although so far that has been the case with the PDFs I use. Basically, you iterate through the operator list, remember the most recent 'transform', and use it when you reach the image.

    type Transform = [number, number, number, number, number, number]

    // One user-space unit is userUnit / 72 inch, so this many units make one inch.
    const unitsPerInch = 72 / page.userUnit
    const opList = await page.getOperatorList()
    let lastTransform: Transform | null = null
    for (let i = 0; i < opList.fnArray.length; i++) {
      const fn = opList.fnArray[i]
      if (fn === pdfjsLib.OPS.transform) {
        // Remember the most recent transform seen before an image is painted.
        lastTransform = opList.argsArray[i] as Transform
      } else if (fn === pdfjsLib.OPS.paintImageXObject && lastTransform) {
        // Use the transform to get the size; you can also use it to get the
        // position (elements 4 and 5) and the polygon.
        // Elements 0 and 3 give the size only if the image is not rotated or skewed.
        const imageWidth = Math.abs(lastTransform[0])
        const imageHeight = Math.abs(lastTransform[3])
        const imageWidthInch = imageWidth / unitsPerInch
        const imageHeightInch = imageHeight / unitsPerInch
      }
    }
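The reply mentions that the same transform can also give the position and polygon of the image. Below is a minimal sketch of that idea, not part of the original answer. It assumes the captured six-number array is the complete matrix mapping the image's unit square into page space, which may not hold if several transforms are nested. The helper names (`applyTransform`, `imageQuad`, `imageSize`, `effectiveDpi`) are hypothetical and not part of pdf.js.

```typescript
type Transform = [number, number, number, number, number, number];
type Point = { x: number; y: number };

// Map a point through the matrix [a, b, c, d, e, f]:
// x' = a*x + c*y + e, y' = b*x + d*y + f.
function applyTransform(m: Transform, x: number, y: number): Point {
  const [a, b, c, d, e, f] = m;
  return { x: a * x + c * y + e, y: b * x + d * y + f };
}

// An image is painted into the unit square, so its polygon on the page
// is the four transformed corners of that square.
function imageQuad(m: Transform): Point[] {
  return [
    applyTransform(m, 0, 0),
    applyTransform(m, 1, 0),
    applyTransform(m, 1, 1),
    applyTransform(m, 0, 1),
  ];
}

// Side lengths of the polygon in user-space units. Using the vector
// lengths instead of m[0] and m[3] also covers rotated images.
function imageSize(m: Transform): { width: number; height: number } {
  const [a, b, c, d] = m;
  return { width: Math.hypot(a, b), height: Math.hypot(c, d) };
}

// Assumption: pixelWidth is the intrinsic pixel width of the image data.
// One user-space unit is userUnit / 72 inch.
function effectiveDpi(pixelWidth: number, widthInUnits: number, userUnit = 1): number {
  const widthInInches = (widthInUnits * userUnit) / 72;
  return pixelWidth / widthInInches;
}
```

Under these assumptions, an image whose pixel dimensions exceed the page size in points is not necessarily drawn larger than the page: the pixel count describes the image data, while the transform describes the area it occupies on the page.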
-
Attach (recommended) or Link to PDF file here:
https://arxiv.org/pdf/2404.01370.pdf
Configuration:
Steps to reproduce the problem:
1. Get the viewport with page.getViewport({ scale: 1 })
2. Get the image with page.objs.get(id)
What went wrong?
Why are all the image sizes greater than the page size?

Link to a viewer (if hosted on a site other than mozilla.github.io/pdf.js or as Firefox/Chrome extension):