Commit 26f639d

perf: accelerate span deduplication (#248)
Signed-off-by: Panos Vagenas <[email protected]>
1 parent 587e67f commit 26f639d

1 file changed, +3 -1 lines changed

docling_core/experimental/serializer/common.py

Lines changed: 3 additions & 1 deletion
@@ -70,9 +70,11 @@ def create_ser_result(
     else:
         results: list[SerializationResult] = span_source
         spans = []
+        span_ids: set[str] = set()
         for ser_res in results:
             for span in ser_res.spans:
-                if span not in spans:
+                if (span_id := span.item.self_ref) not in span_ids:
+                    span_ids.add(span_id)
                     spans.append(span)
     return SerializationResult(
         text=text,
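
The speedup in this change likely comes from replacing a list membership test (`span not in spans`, which scans the list on every check) with a set lookup keyed on `span.item.self_ref`. The following is a minimal sketch of the two strategies side by side, not the library's code: `Item`, `Span`, `dedup_by_list`, and `dedup_by_ref` are hypothetical stand-ins that model only the `item.self_ref` attribute visible in the diff.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical stand-in for a document item; only `self_ref` is modeled."""
    self_ref: str


@dataclass
class Span:
    """Hypothetical stand-in for the serializer's span type."""
    item: Item


def dedup_by_list(spans: list[Span]) -> list[Span]:
    """Previous strategy: `span not in out` scans the list, O(n^2) overall."""
    out: list[Span] = []
    for span in spans:
        if span not in out:
            out.append(span)
    return out


def dedup_by_ref(spans: list[Span]) -> list[Span]:
    """New strategy: remember seen `self_ref` strings in a set, O(n) on average."""
    out: list[Span] = []
    seen: set[str] = set()
    for span in spans:
        if (ref := span.item.self_ref) not in seen:
            seen.add(ref)
            out.append(span)
    return out


# Six spans that reference only three distinct items.
spans = [Span(Item(f"#/texts/{i % 3}")) for i in range(6)]
assert dedup_by_list(spans) == dedup_by_ref(spans)
```

Both functions keep the first occurrence of each span and preserve order. They agree here because equality of the stand-in `Span` depends only on `self_ref`; if two spans could share a `self_ref` while differing in other fields, the set-based version would keep only the first one.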
