Commit 9bc256e

fix: add export to xml and html (#17)
* added the XML export
  Signed-off-by: Peter Staar <[email protected]>
* reformatted all
  Signed-off-by: Peter Staar <[email protected]>
* fixed tests
  Signed-off-by: Peter Staar <[email protected]>
* added the DocumentTokens class
  Signed-off-by: Peter Staar <[email protected]>
* updating the to-xml method
  Signed-off-by: Peter Staar <[email protected]>
* updating the to-xml method
  Signed-off-by: Peter Staar <[email protected]>
* fixed the to-md method
  Signed-off-by: Peter Staar <[email protected]>
* added the strict-text in the to-md method
  Signed-off-by: Peter Staar <[email protected]>
* added page-tokens
  Signed-off-by: Peter Staar <[email protected]>
* updated the location/page tokens
  Signed-off-by: Peter Staar <[email protected]>
* small fix to have correct special document-tokens
  Signed-off-by: Peter Staar <[email protected]>
* reformatted the code
  Signed-off-by: Peter Staar <[email protected]>

---------

Signed-off-by: Peter Staar <[email protected]>
1 parent 2f55d92 commit 9bc256e

1 file changed: 4 additions, 4 deletions

docling_core/types/doc/document.py

Lines changed: 4 additions & 4 deletions
@@ -410,21 +410,21 @@ def get_special_tokens(
         special_tokens = [token.value for token in cls]

         # Adding dynamically generated row and col tokens
-        for i in range(0, max_rows):
+        for i in range(0, max_rows + 1):
             special_tokens += [f"<row_{i}>", f"</row_{i}>"]

-        for i in range(0, max_cols):
+        for i in range(0, max_cols + 1):
             special_tokens += [f"<col_{i}>", f"</col_{i}>"]

         for i in range(6):
             special_tokens += [f"<section-header-{i}>", f"</section-header-{i}>"]

         # Adding dynamically generated page-tokens
-        for i in range(0, max_pages):
+        for i in range(0, max_pages + 1):
             special_tokens.append(f"<page_{i}>")

         # Adding dynamically generated location-tokens
-        for i in range(0, max(page_dimension[0], page_dimension[1])):
+        for i in range(0, max(page_dimension[0] + 1, page_dimension[1] + 1)):
             special_tokens.append(f"<loc_{i}>")

         return special_tokens
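
The four changed lines all address the same off-by-one: Python's range(0, n) stops at n - 1, so the old loops never emitted the token for the maximum index itself. The sketch below mirrors the patched loops in a stand-alone function to show the effect. The function name dynamic_tokens and the small argument values are hypothetical, chosen only for illustration. It is not the library's API, and it leaves out the enum values and the section-header loop, which this change does not touch.

```python
from typing import List, Tuple


def dynamic_tokens(
    max_rows: int,
    max_cols: int,
    max_pages: int,
    page_dimension: Tuple[int, int],
) -> List[str]:
    """Hypothetical stand-alone mirror of the patched loops (illustration only)."""
    tokens: List[str] = []

    # range(0, n + 1) yields 0..n inclusive, so <row_n> and <col_n> now exist.
    for i in range(0, max_rows + 1):
        tokens += [f"<row_{i}>", f"</row_{i}>"]
    for i in range(0, max_cols + 1):
        tokens += [f"<col_{i}>", f"</col_{i}>"]

    # Page tokens: the page with index max_pages is now covered as well.
    for i in range(0, max_pages + 1):
        tokens.append(f"<page_{i}>")

    # Location tokens: the upper bound equals max(width, height) + 1.
    for i in range(0, max(page_dimension[0] + 1, page_dimension[1] + 1)):
        tokens.append(f"<loc_{i}>")

    return tokens


tokens = dynamic_tokens(max_rows=2, max_cols=1, max_pages=1, page_dimension=(3, 2))
print(tokens)
```

With these illustrative values the result includes <row_2>, <col_1>, <page_1> and <loc_3>, each of which the pre-patch bounds would have omitted. Note that max(a + 1, b + 1) is equivalent to max(a, b) + 1, so the location loop could be written either way.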
