Skip to content

Commit 0ad1c4b

Browse files
committed
Remove incorrect duplicate batch skipping in ReorgAwareStream
The _is_duplicate_batch check was incorrectly skipping batches based only on comparing ranges to the previous batch. This is wrong because: 1. Proper duplicate detection is handled by state_store.is_processed() in load_stream_continuous, which checks against ALL processed batches 2. The range-only comparison does not check hashes, so batches with same ranges but different data (after reorg) would be incorrectly skipped 3. After crash/restart, prev_ranges_by_network is empty anyway
1 parent 8ce25f1 commit 0ad1c4b

File tree

1 file changed

+0
-7
lines changed

1 file changed

+0
-7
lines changed

src/amp/streaming/reorg.py

Lines changed: 0 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -49,13 +49,6 @@ def __next__(self) -> ResponseBatch:
4949
# Get next batch from underlying stream
5050
batch = next(self.stream_iterator)
5151

52-
# Note: ranges_complete flag is handled by CheckpointStore in load_stream_continuous
53-
# Check if this batch contains only duplicate ranges
54-
if self._is_duplicate_batch(batch.metadata.ranges):
55-
self.logger.debug(f'Skipping duplicate batch with ranges: {batch.metadata.ranges}')
56-
# Recursively call to get the next non-duplicate batch
57-
return self.__next__()
58-
5952
# Detect reorgs by comparing with previous ranges
6053
invalidation_ranges = self._detect_reorg(batch.metadata.ranges)
6154

0 commit comments

Comments
 (0)