How to extract text with respect to the heading? #2192
Hi, thank you in advance.
Replies: 1 comment
The problem is that nothing in the PDF assigns text to categories the way HTML does. Your HTML extraction simply contains everything on that page as text (albeit with various different properties, such as font and size).
So the HTML element "h1" simply is not there!
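Since there is no "h1" to look for, one common workaround is to guess headings from the text properties that are present, such as font size. The following is only a rough sketch of that idea, not a PyMuPDF feature: it assumes the page has been extracted with `page.get_text("dict")`, that headings are set noticeably larger than the body text, and that a size ratio of 1.2 is a reasonable cut-off. The function names `iter_lines` and `split_by_heading` and the `ratio` parameter are made up for illustration.

```python
from collections import Counter


def iter_lines(page_dict):
    """Yield (largest_font_size, text) for every non-empty text line."""
    for block in page_dict.get("blocks", []):
        if block.get("type", 0) != 0:  # type 0 = text block; skip images
            continue
        for line in block.get("lines", []):
            spans = line.get("spans", [])
            text = "".join(span["text"] for span in spans).strip()
            if text:
                yield max(span["size"] for span in spans), text


def split_by_heading(page_dict, ratio=1.2):
    """Group lines under the most recent 'heading' line.

    Heuristic (an assumption, not something the PDF guarantees): the font
    size that covers the most characters is the body size, and any line at
    least `ratio` times larger is treated as a heading.
    Returns a list of (heading_or_None, body_text) tuples.
    """
    lines = list(iter_lines(page_dict))
    if not lines:
        return []

    # Weight each font size by the number of characters set in it.
    chars_per_size = Counter()
    for size, text in lines:
        chars_per_size[round(size, 1)] += len(text)
    body_size = chars_per_size.most_common(1)[0][0]

    sections = []
    heading, body = None, []
    for size, text in lines:
        if size >= body_size * ratio:
            # A new heading starts; close the section collected so far.
            if heading is not None or body:
                sections.append((heading, " ".join(body)))
            heading, body = text, []
        else:
            body.append(text)
    sections.append((heading, " ".join(body)))
    return sections


# Possible usage with PyMuPDF ("input.pdf" is a placeholder file name):
#   import fitz
#   doc = fitz.open("input.pdf")
#   for heading, text in split_by_heading(doc[0].get_text("dict")):
#       print(heading, "->", text[:60])
```

This will misfire on documents where headings differ from body text only by weight or color; in that case the span's `flags` or `font` values could be compared instead of `size`.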