@shubham-wysa You will not be able to obtain bounding-box or page information for .docx files, because .docx files do not track this information internally; they are simply a tree of text elements. Page information only results from rendering a .docx file through a viewer (e.g., MS Word). If you require this information, you should convert the file to PDF before ingestion.
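
For the conversion step, here is a minimal sketch of one possible approach. It assumes LibreOffice is installed and its `soffice` binary is on your PATH; that tool choice is my assumption, not something this thread prescribes, and any renderer that produces a PDF should work just as well. The helper names `build_convert_command` and `convert_docx_to_pdf` are hypothetical and made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(docx_path, out_dir, soffice="soffice"):
    """Build the LibreOffice headless command that renders a .docx to PDF.

    Assumption: LibreOffice is the renderer. The thread only says
    "convert to pdf" and does not name a tool.
    """
    return [
        soffice,
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(out_dir),
        str(docx_path),
    ]


def convert_docx_to_pdf(docx_path, out_dir):
    """Render docx_path to PDF and return the expected output path."""
    if shutil.which("soffice") is None:
        raise RuntimeError("LibreOffice ('soffice') was not found on PATH")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Rendering is what produces pages and layout; the .docx itself has neither.
    subprocess.run(build_convert_command(docx_path, out_dir), check=True)
    # LibreOffice names the output after the input file's stem.
    return out_dir / (Path(docx_path).stem + ".pdf")
```

You would then ingest the returned PDF path instead of the original .docx. Keep in mind that page breaks and coordinates depend on the renderer and the fonts available on the machine, so they may not match exactly what MS Word shows.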

Answer selected by cau-git
Category
Q&A
3 participants