table data break between pages #2035
sachinkaoor2055
started this conversation in
General
Replies: 1 comment
-
Well, it seems that "15-10-2024" and "Confidential", which are probably page headers or footers, are breaking up your table. Depending on how general you want your solution to be, you can clean these out. The table just seems to continue on the next page, so docling should normally be able to handle that correctly once the header/footer is no longer breaking things up.
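One way to do that clean-up is as a post-processing step on the exported Markdown. The sketch below is not a docling feature; it is plain string handling, and the helper names (`strip_page_furniture`, `merge_split_tables`) and the two patterns are assumptions based only on the sample output in this thread. It drops lines that look like the page header/footer, then joins two Markdown tables that end up separated only by blank lines when they have the same number of columns.

```python
import re

# Assumed header/footer patterns, taken from the sample output in this thread.
FURNITURE_PATTERNS = [
    re.compile(r"^\d{2}-\d{2}-\d{4}$"),            # dates such as 15-10-2024
    re.compile(r"^confidential$", re.IGNORECASE),  # classification footer
]


def is_table_row(line: str) -> bool:
    s = line.strip()
    return len(s) > 1 and s.startswith("|") and s.endswith("|")


def is_separator_row(line: str) -> bool:
    s = line.strip()
    return is_table_row(s) and "-" in s and re.fullmatch(r"[|\-: ]+", s) is not None


def column_count(line: str) -> int:
    # Naive count: ignores escaped pipes inside cells.
    return line.strip()[1:-1].count("|") + 1


def strip_page_furniture(markdown: str) -> str:
    """Remove lines that consist only of an assumed page header/footer."""
    kept = [
        line for line in markdown.splitlines()
        if not any(p.match(line.strip()) for p in FURNITURE_PATTERNS)
    ]
    return "\n".join(kept)


def merge_split_tables(markdown: str) -> str:
    """Join a table with a following table separated only by blank lines."""
    lines = markdown.splitlines()
    out = []
    i = 0
    while i < len(lines):
        line = lines[i]
        if line.strip() == "" and out and is_table_row(out[-1]):
            # Skip ahead over the blank gap to the next non-blank line.
            j = i
            while j < len(lines) and lines[j].strip() == "":
                j += 1
            if (
                j + 1 < len(lines)
                and is_table_row(lines[j])
                and is_separator_row(lines[j + 1])
                and column_count(lines[j]) == column_count(out[-1])
            ):
                # Assumption: the continuation's first row is data, not a
                # repeated header, so keep it and drop only its separator.
                out.append(lines[j])
                i = j + 2
                continue
        out.append(line)
        i += 1
    return "\n".join(out)


sample = (
    "| A | B |\n|---|---|\n| 1 | 2 |\n\n15-10-2024\n\nConfidential\n\n"
    "| 3 | 4 |\n|---|---|\n| 5 | 6 |\n\nAfter"
)
cleaned = merge_split_tables(strip_page_furniture(sample))
print(cleaned)
```

In the original script this would be applied to `markdown_content` before writing the file. If the PDF repeats the table header on every page, the merged table will contain that header again as a data row, so the heuristic would need adjusting for such documents.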
-
I have a native PDF document that I convert to Markdown format using this code:
from docling.document_converter import DocumentConverter

source = r"sample.pdf"  # Local PDF file
converter = DocumentConverter()
result = converter.convert(source)

# Extract Markdown content
markdown_content = result.document.export_to_markdown()

# Save to .md file
output_path = "Wavence_GAD_Converted.md"
with open(output_path, "w", encoding="utf-8") as f:
    f.write(markdown_content)

print(f"Markdown saved to: {output_path}")
But on a few pages a table extends onto the next page, and I get Markdown like this:
The UBT can be connected to a dedicated plug-in (EAC, EAC-10G, CAHD or EASv2 cards), directly to the CorEvo card or to MSS-E-HE-XE, using optical or electrical GE cables according to the specific UBT.
The active part is a true wide band radio: it is not sub-band dependent, meaning that a UBT at a given frequency can support all sub-bands and all shifters specified for that frequency, according to the following table.
15-10-2024
Confidential
I want the table without a break. How can I do that? Any ideas?