Fix FileStream scanning_total to include sync next-file open time #20627

Open
RatulDawar wants to merge 1 commit into apache:main from RatulDawar:fix-20571-file-stream-scanning-timer

Conversation

@RatulDawar

Summary

  • Include the synchronous start_next_file() / FileOpener::open() setup time in time_elapsed_scanning_total.
  • Keep the lifecycle of the existing time_opening and scanning timers intact.
  • Avoid timer overlap by ending the temporary scoped timer before time_scanning_total.start() is called.

Details

In FileStreamState::Open, start_next_file() is invoked before time_scanning_total.start(). Any synchronous work that open() performs before returning its future was therefore not counted in time_elapsed_scanning_total.

This change wraps the start_next_file() call in a scoped timer on the same time_scanning_total metric so the missing segment is recorded.
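The scoping idea can be sketched in isolation. The block below is a minimal illustration, not the code in this PR: `Time`, `ScopedTimer`, `StartableTime`, `start_next_file`, and `open_then_scan` are hypothetical stand-ins written against the standard library only, and the sleeps merely simulate synchronous open work and scanning. It assumes the metric is a shared counter that a guard adds to on drop, and that the startable timer must not already be running when `start()` is called.

```rust
use std::cell::Cell;
use std::rc::Rc;
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a cumulative time metric (shared counter of nanoseconds).
#[derive(Clone, Default)]
struct Time {
    nanos: Rc<Cell<u64>>,
}

impl Time {
    fn add(&self, d: Duration) {
        self.nanos.set(self.nanos.get() + d.as_nanos() as u64);
    }

    fn value(&self) -> Duration {
        Duration::from_nanos(self.nanos.get())
    }

    /// Returns a guard that adds its lifetime to this metric when dropped.
    fn timer(&self) -> ScopedTimer {
        ScopedTimer { metric: self.clone(), start: Instant::now() }
    }
}

/// Guard that records the elapsed time on drop.
struct ScopedTimer {
    metric: Time,
    start: Instant,
}

impl Drop for ScopedTimer {
    fn drop(&mut self) {
        self.metric.add(self.start.elapsed());
    }
}

/// Hypothetical start/stop timer layered over the same metric.
struct StartableTime {
    metric: Time,
    start: Option<Instant>,
}

impl StartableTime {
    fn start(&mut self) {
        assert!(self.start.is_none(), "timer already running");
        self.start = Some(Instant::now());
    }

    fn stop(&mut self) {
        if let Some(started) = self.start.take() {
            self.metric.add(started.elapsed());
        }
    }
}

/// Simulates synchronous setup work done before a future would be returned.
fn start_next_file(sync_setup: Duration) {
    sleep(sync_setup);
}

/// Records the synchronous open segment, then the scanning segment, on one metric.
fn open_then_scan(scanning_total: &mut StartableTime, sync_setup: Duration, scan: Duration) {
    {
        // Temporary guard on the same metric covers only the synchronous open.
        let _timer = scanning_total.metric.timer();
        start_next_file(sync_setup);
    } // guard dropped here, so the two measurements never overlap

    scanning_total.start();
    sleep(scan); // stand-in for polling the opened stream
    scanning_total.stop();
}
```

Calling `open_then_scan` with a fresh `StartableTime` leaves a total that covers both segments. Because the guard is dropped at the end of the inner block, the open segment is added exactly once and the startable timer is idle when `start()` runs.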

Fixes #20571

Validation

  • cargo test -p datafusion-datasource with_limit_at_middle_of_batch -- --nocapture

@github-actions bot added the datasource label (Changes to the datasource crate) on Mar 1, 2026

Development

Successfully merging this pull request may close these issues.

FileStream start_next_file() is not measured in compute time.