Breaking changes!
get_add_actions now returns an Arro3Table instead of an Arro3RecordBatch
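Code that treated the result of get_add_actions as a single record batch may need a small adjustment. The following is a minimal sketch of a compatibility helper, not an official migration path. It assumes the new table-like return value exposes a `to_batches()` method (as arro3's Table does) and that the old record batch does not. The helper name `add_action_batches` and the `FakeBatch`/`FakeTable` stand-ins are hypothetical and exist only so the sketch runs without deltalake installed.

```python
def add_action_batches(actions):
    """Return add actions as a list of record batches.

    Assumption: the table-like value returned from 1.5.0 on exposes
    `to_batches()`, while the single batch returned before does not.
    """
    to_batches = getattr(actions, "to_batches", None)
    if callable(to_batches):
        # Table-like result: may hold more than one batch.
        return list(to_batches())
    # Older result: already a single record batch.
    return [actions]


# Hypothetical stand-ins for the two return types (illustration only).
class FakeBatch:
    def __init__(self, num_rows):
        self.num_rows = num_rows


class FakeTable:
    def __init__(self, batches):
        self._batches = batches

    def to_batches(self):
        return self._batches


old_style = FakeBatch(3)
new_style = FakeTable([FakeBatch(2), FakeBatch(1)])

# Both shapes can now be consumed the same way.
total_old = sum(b.num_rows for b in add_action_batches(old_style))
total_new = sum(b.num_rows for b in add_action_batches(new_style))
```

With a real table, `actions` would be the value returned by `DeltaTable.get_add_actions(...)`. If pyarrow is available, passing that value to `pyarrow.table(...)` is likely a simpler route, assuming a pyarrow version that accepts Arrow PyCapsule stream objects.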
Main changes
- perf: parallel partition writers via per-stream JoinSet by @fvaleye in #4193
- refactor(python): get add action return arrow table by @vsmanish1772 in #4204
- feat: added disk spilling for merge by @thomasfrederikhoeck in #4219
- feat: log compaction by @ion-elgreco in #4210
- feat: vacuum lite mode to avoid storage listing by @khalidmammadov in #4227
- feat: implement batched deletion in delete_dir function by @nabobery in #4244
- chore: python datafusion 52 upgrade by @ethan-tyler in #4226
What's Changed
- chore: bump version from 1.4.1 to 1.4.2 by @ion-elgreco in #4182
- refactor: use FileSelection for matched file scans by @ethan-tyler in #4188
- feat: add DeltaScan insert_into with runtime log_store by @ethan-tyler in #4187
- fix: make session the first argument of update_datafusion_session by @pauldouane in #4192
- fix: preserve generated column metadata during schema merge by @ethan-tyler in #4191
- refactor: use BatchAdapterFactory for scan adaptation by @ethan-tyler in #4195
- fix: clarify vacuum command documentation for DeltaTable by @khalidmammadov in #4196
- fix(delete): use Add metadata for partition only DELETE by @ethan-tyler in #4150
- chore: harden scan adapter caching and DV mask edge cases by @ethan-tyler in #4199
- fix(datafusion): avoid overflow when scanning add actions by @vsmanish1772 in #4197
- chore: drop unused ReceiverStreamBuilder spawn by @ethan-tyler in #4201
- fix: propagate session config through Delta factory path by @ethan-tyler in #4202
- fix: enforce file-id filter semantics in scan planning by @ethan-tyler in #4206
- chore: set compression for partition optimization as well by @rtyler in #4208
- chore: enable snappy compression on checkpoints by @rtyler in #4209
- chore: change the versions for the next "majorish" release of 🦀 by @rtyler in #4207
- fix: code block indenting and fencing by @plaindocs in #4212
- fix: delete partition fallback batching and add action coalescing by @ethan-tyler in #4211
- fix: pad DV keep masks to numRecords by @ethan-tyler in #4236
- feat: route DeltaDataSink through shared write_streams by @ethan-tyler in #4194
- fix: generated column expr with SchemaMode::Merge handles missing columns by @veeceey in #4223
- docs: minor updates to the readme and contributing by @plaindocs in #4238
- feat: improve rust code samples in docs by @khalidmammadov in #4242
- docs: fix several nit issues in docs by @anshulbaliga7 in #4245
- feat(python): add post_commithook_properties to alter metadata apis by @vsmanish1772 in #4249
- docs: home page refactor by @plaindocs in #4234
- fix: move table builder local path check guard to open_table by @khalidmammadov in #4248
- chore: allow easier running of Azure integration tests in Python by @rtyler in #4255
- fix: create_write_transaction works again, now with 100% more coverage by @rtyler in #4260
- fix(warnings): change const to static for extension planners and reduce warnings by @khalidmammadov in #4259
- refactor: add insert_into and file selection write path to DeltaScan by @ethan-tyler in #4250
- fix: coerce decimal literals in target subset filters by @ethan-tyler in #4267
- fix: add a central arrow delta type normalization by @fvaleye in #4254
- fix: change visibility of add_action method to public by @lizardoluis in #4232
- refactor: reduce clippy warnings in core and in LogicalPlanBuilder and DeltaScanStream by @khalidmammadov in #4270
- docs: update contributing docs to include DCO more explicitly by @rtyler in #4271
- feat: consolidate target_file_size and allow unbounded writes by @abhiaagarwal in #4257
- fix: warn on lossy nanosecond timestamp truncation during normalization by @fvaleye in #4272
- refactor: migrate merge target scan to DeltaScanNext by @ethan-tyler in #4266
- fix: remove unsupported create_add from public API by @ethan-tyler in #4274
- refactor: reduce clippy warnings in core create by @khalidmammadov in #4275
New Contributors
- @pauldouane made their first contribution in #4192
- @plaindocs made their first contribution in #4212
- @veeceey made their first contribution in #4223
- @nabobery made their first contribution in #4244
- @anshulbaliga7 made their first contribution in #4245
- @lizardoluis made their first contribution in #4232
Full Changelog: python-v1.4.2...python-v1.5.0