-
Notifications
You must be signed in to change notification settings - Fork 2k
Add FileScanConfigBuilder
#15352
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add FileScanConfigBuilder
#15352
Changes from 3 commits
b7ae6ed
93002ea
dc53de6
c936329
c42c906
e13bfa9
3c216b8
1fc4b3d
ae32a01
4501798
f0b33d9
7e3521c
d13dfc5
e1a056d
748a335
958f81e
45b6175
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -27,7 +27,7 @@ use datafusion::common::{ | |
| internal_datafusion_err, DFSchema, DataFusionError, Result, ScalarValue, | ||
| }; | ||
| use datafusion::datasource::listing::PartitionedFile; | ||
| use datafusion::datasource::physical_plan::{FileScanConfig, ParquetSource}; | ||
| use datafusion::datasource::physical_plan::{FileScanConfigBuilder, ParquetSource}; | ||
| use datafusion::datasource::TableProvider; | ||
| use datafusion::execution::object_store::ObjectStoreUrl; | ||
| use datafusion::logical_expr::{ | ||
|
|
@@ -244,9 +244,10 @@ impl TableProvider for IndexTableProvider { | |
| let source = | ||
| Arc::new(ParquetSource::default().with_predicate(self.schema(), predicate)); | ||
| let mut file_scan_config = | ||
| FileScanConfig::new(object_store_url, self.schema(), source) | ||
| FileScanConfigBuilder::new(object_store_url, self.schema(), source) | ||
| .with_projection(projection.cloned()) | ||
| .with_limit(limit); | ||
| .with_limit(limit) | ||
| .build(); | ||
|
||
|
|
||
| // Transform to the format needed to pass to DataSourceExec | ||
| // Create one file group per file (default to scanning them all in parallel) | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it possible to have this return as a function? Is it because of import cycles?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you mean as a
DataSourceExecfunction? It also looks a bit verbose to me, but the inner Arc is needed for dynamic dispatch, and the outer one makes the return type more explicit. Happy to make itDataSourceExec::new_arcif you want, but i don't think we use that a lot in datafusionThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant just a cosmetic change to not re-writing the whole
Arc::new(DataSourceExec::new(Arc::new(file_scan)))Maybe it can be something likeDataSourceExec::from_file_source(file_scan) -> Arc<DataSourceExec>What would you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you like 748a335 ? Looks cleaner