Commit 6da3517
committed
Fix sequential metadata fetching in ListingTable causing high latency
When scanning an exact list of remote Parquet files, the ListingTable was fetching file metadata (via head calls) sequentially. This was due to using `stream::iter(file_list).flatten()`, which processes each one-item stream in order. For remote blob stores, where each head call can take tens to hundreds of milliseconds, this sequential behavior significantly increased the time to create the physical plan.
This commit replaces the sequential flattening with concurrent merging using `futures::stream::select_all(file_list)`. With this change, the `head` requests are executed in parallel (up to the configured `meta_fetch_concurrency` limit), reducing latency when creating the physical plan.
Additionally, tests have been updated to ensure that metadata fetching occurs concurrently.1 parent e9284cf commit 6da3517
1 file changed
+2
-2
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
55 | 55 | | |
56 | 56 | | |
57 | 57 | | |
58 | | - | |
| 58 | + | |
59 | 59 | | |
60 | 60 | | |
61 | 61 | | |
| |||
1112 | 1112 | | |
1113 | 1113 | | |
1114 | 1114 | | |
1115 | | - | |
| 1115 | + | |
1116 | 1116 | | |
1117 | 1117 | | |
1118 | 1118 | | |
| |||
0 commit comments