This repository was archived by the owner on Apr 11, 2025. It is now read-only.

Commit f120302

[REF] remove_unconnected_edges
1. **Using a Separate List for Removal**: Instead of deleting elements from `text_alignment.textlines` while iterating over it, we first collect the indices of elements to remove in a separate list called `to_remove`. This prevents issues that can arise from modifying a list while iterating over it.
2. **Reverse Index Removal**: When removing items, we iterate over `to_remove` in reverse order. This ensures that the indices of elements to be removed do not affect the remaining elements in the list as we delete them.
3. **Clearer Condition for Removal**: The condition checking whether a text line is a singleton remains the same, but moving the deletion logic outside of the inner loop improves clarity and avoids potential errors that could lead to infinite loops.
4. **Termination Flag**: The loop continues as long as `removed_singletons` is `True`, which indicates that at least one text line has been removed in the last iteration. If no text lines are removed in a complete pass through the alignments, the loop will exit.
5. **Recomputing Alignments**: The line `self._textline_to_alignments = {}` is retained to clear the cache of alignments before recomputing them. This ensures that the latest state is reflected after modifications.
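
The two-phase removal described in points 1 and 2 can be sketched in isolation. This is a minimal illustration, not the camelot code itself: `remove_marked` and `should_remove` are hypothetical names introduced here, and a plain list of integers stands in for `text_alignment.textlines`.

```python
def remove_marked(items, should_remove):
    """Delete, in place, every element of `items` for which `should_remove` is true.

    Hypothetical helper for illustration only; it is not part of camelot.
    """
    # Phase 1: record indices only. The list is not modified while scanning.
    to_remove = [i for i, item in enumerate(items) if should_remove(item)]

    # Phase 2: delete from the highest index down, so each deletion leaves
    # the positions of the still-pending (lower) indices unchanged.
    for index in reversed(to_remove):
        del items[index]

    return len(to_remove)


values = [1, 2, 3, 4, 5, 6]
removed = remove_marked(values, lambda v: v % 2 == 0)
print(values, removed)  # [1, 3, 5] 3
```

Deleting in ascending order instead would shift every later element one slot to the left after each `del`, so the stored indices would point at the wrong items. Note that the replaced loop already walked the indices from last to first, which is also a safe pattern for in-place deletion; the two-phase version mainly separates the check from the mutation.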
1 parent a66923f commit f120302

1 file changed: +18 -4 lines changed


camelot/parsers/network.py

Lines changed: 18 additions & 4 deletions
@@ -387,24 +387,38 @@ def remove_unconnected_edges(self):
         Elements should be connected to others both vertically
         and horizontally.
         """
+        # Initialize a flag to indicate if any singletons were removed
         removed_singletons = True
+
         while removed_singletons:
             removed_singletons = False
+
             for text_alignments in self._text_alignments.values():
                 # For each alignment edge, remove items if they are singletons
                 # either horizontally or vertically
                 for text_alignment in text_alignments:
-                    for i in range(len(text_alignment.textlines) - 1, -1, -1):
+                    # Create a list to hold textlines to be removed
+                    to_remove = []
+
+                    for i in range(len(text_alignment.textlines)):
                         textline = text_alignment.textlines[i]
                         alignments = self._textline_to_alignments[textline]
+
+                        # Check if the textline is a singleton in either direction
                         if (
                             alignments.max_h_count() <= 1
                             or alignments.max_v_count() <= 1
                         ):
-                            del text_alignment.textlines[i]
-                            removed_singletons = True
+                            to_remove.append(i) # Mark for removal
+
+                    # Remove items after iterating to avoid modifying the list during iteration
+                    for index in reversed(to_remove):
+                        del text_alignment.textlines[index]
+                        removed_singletons = True
+
+            # Clear the alignment cache
             self._textline_to_alignments = {}
-            self._compute_alignment_counts()
+            self._compute_alignment_counts() # Recompute alignment counts after removals
 
     def most_connected_textline(self):
         """Retrieve the textline that is most connected."""
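
The outer `while removed_singletons` loop in this hunk is a repeat-until-stable pass: removing one singleton can turn a neighbour into a singleton, so the counts are recomputed and the scan is repeated until a pass removes nothing. Below is a minimal sketch of that idea under simplifying assumptions: `prune_unconnected` is a hypothetical function, `(x, y)` tuples stand in for textlines, and `Counter` tallies stand in for camelot's alignment counts.

```python
from collections import Counter


def prune_unconnected(points):
    """Repeatedly drop points that share their column or row with no other point.

    Hypothetical stand-in for the loop in the diff; not camelot's API.
    """
    points = list(points)
    removed_any = True
    while removed_any:
        removed_any = False
        # Recompute the counts from the current state on every pass,
        # mirroring the cache reset followed by a recount in the diff.
        col_counts = Counter(x for x, _ in points)
        row_counts = Counter(y for _, y in points)

        # Collect first, then delete in reverse index order.
        to_remove = [
            i
            for i, (x, y) in enumerate(points)
            if col_counts[x] <= 1 or row_counts[y] <= 1
        ]
        for index in reversed(to_remove):
            del points[index]
            removed_any = True
    return points


grid = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1), (2, 5)]
print(prune_unconnected(grid))  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```

In this example the first pass removes only `(2, 5)`, which is alone in its row. That leaves `(2, 1)` alone in its column, so it is removed on the second pass, and the third pass finds nothing to remove and ends the loop. A single pass with stale counts would have kept `(2, 1)`.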
