fix(table): AddFiles: close file after usage and parallelize by starpact · Pull Request #799 · apache/iceberg-go

starpact · 2026-03-18T07:51:23Z

Previously, files were closed only after the whole iteration finished:

iceberg-go/table/arrow_utils.go

Line 1372 in 55bdfbf

defer rdr.Close()

This leaks resources within a single filesToDataFiles invocation, because each open file holds an open S3 read stream:

iceberg-go/io/gocloud/blob.go

Line 39 in 55bdfbf

*blob.Reader
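To illustrate the difference, here is a minimal sketch of closing each file before the next one is opened. It is not the code from this PR: `trackedFile`, `openFake`, and `processAll` are hypothetical names, and the in-memory reader stands in for the S3-backed `*blob.Reader`. A `defer` placed directly in a loop body only runs when the enclosing function returns, so wrapping the body in a function literal scopes the `defer` to one iteration.

```go
package main

import (
	"io"
	"strings"
)

// trackedFile is a hypothetical stand-in for an opened data file.
type trackedFile struct {
	io.Reader
	closed bool
}

// Close marks the file as closed; a real reader would release its stream here.
func (f *trackedFile) Close() error {
	f.closed = true
	return nil
}

// openFake is a hypothetical opener that serves the path itself as content.
func openFake(path string) (*trackedFile, error) {
	return &trackedFile{Reader: strings.NewReader(path)}, nil
}

// processAll reads every file and closes it before opening the next one.
// It returns the maximum number of files that were open at the same time.
func processAll(paths []string, open func(string) (*trackedFile, error)) (int, error) {
	openCount, maxOpen := 0, 0
	for _, p := range paths {
		// The function literal scopes the defer to a single iteration
		// instead of to the whole loop.
		err := func() error {
			f, err := open(p)
			if err != nil {
				return err
			}
			openCount++
			if openCount > maxOpen {
				maxOpen = openCount
			}
			defer func() {
				f.Close() // the Close error is ignored in this sketch
				openCount--
			}()
			_, err = io.Copy(io.Discard, f)
			return err
		}()
		if err != nil {
			return maxOpen, err
		}
	}
	return maxOpen, nil
}
```

With this shape, at most one file is open at any time regardless of how many paths are processed sequentially.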

A sample trace can demonstrate it clearly:

In addition, I added parallelism to AddFiles.
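As a rough illustration of what bounded parallelism over a list of files can look like, here is a standard-library-only sketch. It is an assumption about the general shape, not the implementation in this PR: `mapParallel` and its `workers` parameter are hypothetical names, and unlike `golang.org/x/sync/errgroup` it does not cancel remaining work after the first error.

```go
package main

import "sync"

// mapParallel applies fn to every input with at most `workers` calls running
// concurrently and returns the results in input order. If any call fails,
// the error with the lowest input index is returned.
func mapParallel[T, R any](inputs []T, workers int, fn func(T) (R, error)) ([]R, error) {
	if workers < 1 {
		workers = 1
	}
	results := make([]R, len(inputs))
	errs := make([]error, len(inputs))
	sem := make(chan struct{}, workers) // counting semaphore
	var wg sync.WaitGroup
	for i, in := range inputs {
		wg.Add(1)
		sem <- struct{}{} // blocks while `workers` calls are in flight
		go func(i int, in T) {
			defer wg.Done()
			defer func() { <-sem }()
			// Each goroutine writes only to its own index, so no lock is needed.
			results[i], errs[i] = fn(in)
		}(i, in)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return results, nil
}
```

Bounding the number of in-flight calls matters here because each call holds an open read stream; unbounded fan-out would reintroduce the resource pressure the per-file close is meant to remove.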

Future improvements:
Parallelism partially "fixes" the performance issue of AddFiles, but a more fundamental improvement is related to IO. Currently, to extract Parquet metadata, the IO layer needs to:

- Do one GetObject and hold the read stream. The size is used by Seek, but the reader is never used (yet it still consumes quite some memory due to underlying buffering).
- Do one range GetObject for the last 8 bytes.
- Do one range GetObject for the footer.

A common trick is to optimistically prefetch the tail of the file (e.g. 512KB) so that the 3 requests above coalesce into one. However, this requires leaking the IO abstraction, e.g. by determining the file format from the path and returning different implementations accordingly, which may require more discussion.

fix(table): AddFiles: close file after usage and parallelize

4993d22

starpact force-pushed the main branch from 492810d to 4993d22 on March 20, 2026 05:31

starpact wants to merge 1 commit into apache:main from starpact:main
