Use return_stats option to collect column statistics #108
Merged
Changes from 45 commits
46 commits
bbc924d WIP: Use return_stats option to collect column statistics (sfc-gh-agedemenli)
92593f3 Duckdb patch return_stats (sfc-gh-abozkurt)
068455c Fix null schema (sfc-gh-agedemenli)
a572d54 Null check for min/max values (sfc-gh-agedemenli)
0f16944 Skip statistics for some types (sfc-gh-agedemenli)
9f206da Add schema==NULL check for column stats (sfc-gh-agedemenli)
85bdfb8 Fall back to previous mechanism if stats are null (sfc-gh-agedemenli)
8b12b80 Fix: Make list from file stats (sfc-gh-agedemenli)
19185f3 Skip tests for nested fields (sfc-gh-agedemenli)
36c2048 Do not use enable stats guc (sfc-gh-agedemenli)
393f6ea fixup (sfc-gh-agedemenli)
f95d58e parse return_stats output to map type (sfc-gh-abozkurt)
135672f Add map type to parse duckdb result (sfc-gh-agedemenli)
adc97a6 Remove unnecessary deepcopy for stats (sfc-gh-agedemenli)
94cb814 Add comments (sfc-gh-agedemenli)
d1f5a89 Minor (sfc-gh-agedemenli)
a93d356 Use names from file stats instead of ListRemoteFileNames (sfc-gh-agedemenli)
dec8b2d Minor improvements (sfc-gh-agedemenli)
1d8ad4c Rename FindGeneratedDataFiles to GetNewFileOpsFromFileStats (sfc-gh-agedemenli)
4436d09 Add struct ColumnStatsCollector (sfc-gh-agedemenli)
96e2162 Rewrite GetDataFileColumnStatsList, add helpers and logs (sfc-gh-agedemenli)
6fde32d Use ColumnStatsCollector in PerformDeleteFromParquet (sfc-gh-agedemenli)
1b1bc05 Minor rename variable (sfc-gh-agedemenli)
8c0e688 Move FindLeafField to engine (sfc-gh-agedemenli)
994cc33 Move ShouldSkipStatistics to engine (sfc-gh-agedemenli)
78c536a Add leaf_field.c (sfc-gh-agedemenli)
40ec785 Comment (sfc-gh-agedemenli)
1810013 fix reference stats list (sfc-gh-agedemenli)
2721262 fixup (sfc-gh-agedemenli)
076e761 Comment (sfc-gh-agedemenli)
0db1368 Get rid of redundant string duplication (sfc-gh-agedemenli)
5276ba3 Use the collector as return type (sfc-gh-agedemenli)
4c8a83e fixup (sfc-gh-agedemenli)
3b9a2df fixup (sfc-gh-agedemenli)
fad3ea5 Move CreateDataFileStatsForTable (sfc-gh-agedemenli)
f7b993a Add ExecuteCopyCommandOnPGDuckConnection (sfc-gh-agedemenli)
cfb770c Handle modification in case stats is empty (sfc-gh-agedemenli)
5b297b1 Move stats related logic to new file: data_file_stats.c (sfc-gh-agedemenli)
887ac71 Move field&leaf field functions (sfc-gh-agedemenli)
2db85e2 Remove unnecessary includes and whitespaces (sfc-gh-agedemenli)
5647b47 Use returned stats for deleted files (sfc-gh-agedemenli)
e3b51e7 Rename ColumnStatsCollector to StatsCollector (sfc-gh-agedemenli)
fa8ae41 Reindent (sfc-gh-agedemenli)
699eb3f generate stats for all files at WriteQueryResultTo (sfc-gh-abozkurt)
5037045 add assertion (sfc-gh-abozkurt)
f1cf990 minor improvements (sfc-gh-agedemenli)
@@ -33,3 +33,9 @@ CREATE FUNCTION __lake__internal__nsp__.from_hex(text)
  LANGUAGE C
  IMMUTABLE PARALLEL SAFE STRICT
  AS 'MODULE_PATHNAME', $function$pg_lake_internal_dummy_function$function$;
+
+-- Register map types, will be used for parsing DuckDB maps for COPY .. (return_stats)
+-- we prefer to create in the extension script to avoid concurrent attempts to create
+-- the same map, which may throw errors
+SELECT map_type.create('TEXT','TEXT');
+SELECT map_type.create('TEXT','map_type.key_text_val_text');
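The two map types registered above nest as text -> (text -> text), which suggests a per-column map of statistic names to text values. The following is only a sketch of how such a nested map could be consulted with the null checks and fallback that the commit titles mention; it is not the code from this PR. The struct names, the `get_min_max` helper, and the `"min"`/`"max"` keys are all hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Inner map entry: statistic name -> text value (value may be NULL). */
typedef struct StatEntry
{
	const char *key;
	const char *value;
} StatEntry;

/* Outer map entry: column name -> inner map of statistics. */
typedef struct ColumnStatsEntry
{
	const char *column;
	const StatEntry *stats;
	size_t		nstats;
} ColumnStatsEntry;

/* Linear lookup in the inner map; returns NULL if the key is absent. */
static const char *
find_stat(const ColumnStatsEntry *col, const char *key)
{
	for (size_t i = 0; i < col->nstats; i++)
	{
		if (strcmp(col->stats[i].key, key) == 0)
			return col->stats[i].value;
	}
	return NULL;
}

/*
 * Hypothetical helper: returns 1 and fills both outputs only when the
 * column exists and both bounds are non-NULL. Returns 0 otherwise, so
 * the caller can fall back to another way of computing statistics.
 * The "min" and "max" key names are assumptions.
 */
int
get_min_max(const ColumnStatsEntry *cols, size_t ncols, const char *column,
			const char **min_out, const char **max_out)
{
	assert(column != NULL && min_out != NULL && max_out != NULL);

	if (cols == NULL)
		return 0;				/* no stats at all: caller falls back */

	for (size_t i = 0; i < ncols; i++)
	{
		if (strcmp(cols[i].column, column) != 0)
			continue;

		const char *lo = find_stat(&cols[i], "min");
		const char *hi = find_stat(&cols[i], "max");

		if (lo == NULL || hi == NULL)
			return 0;			/* incomplete bounds: caller falls back */

		*min_out = lo;
		*max_out = hi;
		return 1;
	}
	return 0;					/* column not present in the stats */
}
```

A caller would treat a return value of 0 as the signal to use whatever statistics mechanism existed before, and only trust `*min_out` and `*max_out` when the function returns 1.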