Commit 1588701

Temporary fix for filtering on empty batches (#1901)
Potential fix for #1804. Might want to write a test, but not sure yet how to reproduce the issue without using Glue. Closes #1804.
1 parent 76d02ad commit 1588701

1 file changed: +6 -2 lines changed

pyiceberg/io/pyarrow.py

Lines changed: 6 additions & 2 deletions
@@ -1441,11 +1441,15 @@ def _task_to_record_batches(
 
             # Apply the user filter
             if pyarrow_filter is not None:
-                current_batch = current_batch.filter(pyarrow_filter)
+                # Temporary fix until PyArrow 21 is released ( https://github.com/apache/arrow/pull/46057 )
+                table = pa.Table.from_batches([current_batch])
+                table = table.filter(pyarrow_filter)
                 # skip empty batches
-                if current_batch.num_rows == 0:
+                if table.num_rows == 0:
                     continue
 
+                current_batch = table.combine_chunks().to_batches()[0]
+
             result_batch = _to_requested_schema(
                 projected_schema,
                 file_project_schema,
