v0.16.0 (2025-09-19)
- New expression variants: UnaryExpression and ToJson expression (#1192)
- New SnapshotBuilder API: Snapshot::try_new(...) replaced with Snapshot::builder(...) and its associated methods. TLDR, you make a builder and call build to construct a Snapshot. (#1189)
- Simplify the Expr::Transform API, add FFI support:
  - Reworks the pub members of Transform used by Expr::Transform and introduces a new FieldTransform struct. Also reworks Transform::new (constructor) and Transform::with_input_path (method) into a pair of constructors, new_top_level and new_nested.
  - Adds two new members to the FFI EngineExpressionVisitor struct -- visit_transform_expression and visit_field_transform, which also changes the ordering of existing fields. (#1243)
- Add numRecords to ADD_FILES_SCHEMA (#1235)
- New EngineData trait required method: try_append_columns (#1190)
- Make ColumnType private (#1258)
- Add row tracking writer feature: updates ADD_FILES_SCHEMA (see PR for details) (#1239)
- Migrate Snapshot::try_new_from into SnapshotBuilder::new_from (#1289)
- (FFI) Add CDvInfo struct: The CScanCallback now takes a &CDvInfo and not a &DvInfo. (#1286)
- (FFI) Add explicit numbers for each KernelError enum variant. (see PR for details) (#1313)
- (more) new expression variants: Expression::Variadic and Coalesce expressions (#1198)
- All new/modified StructType constructors, see PR for details (#1278)
- Introduce metadata column API: StructType has new private field (#1266)
- (FFI) engine_data::get_engine_data now takes an AllocateErrorFn instead of an engine. (#1325)
- StructType::into_fields returns DoubleEndedIterator + FusedIterator (#1327)
- (catalog-managed) Add log_tail to list_log_files (#1194)
- CommitInfo sets a txnId (#1262)
- Allow LargeUTF8 -> String and LargeBinary -> Binary in arrow conversion (#1294)
- Implement log compaction (#1234)
- Disallow equal version in log compaction (#1309)
- Add Iterable to StructType (#1287)
- ParsedLogPath for staged commits (#1305)
- Default expression eval supports nested transforms (#1247)
- Introduce row index metadata column (#1272)
- Update README.md to enhance FFI documentation (#1237)
- Make checkpoint visitor more efficient using short circuiting (#1203)
- Factor out a method for LastCheckpointHint path generation (#1228)
- Do not guess Vec size for checkpoints (#1263)
- Introduce current_time_ms() helper (#1256)
- Retention calculation into a new trait (#1264)
- Minor Refactoring in Log Compaction (#1301)
- Rename SnapshotBuilder::new to new_for (#1306)
- Move log replay into the action reconciliation module (#1295)
- Introduce SnapshotRef type alias (#1299)
- Row tracking write cleanup (#1291)
- Update invalid-handle tests for rustc 1.90 (#1321)
- Create expression benchmark for default engine (#1220)
- Update changelog for 0.15.1 release (#1227)
- Sync changelog for 0.15.2 (#1251)
- Update data types test to validate full Arrow error message (#1259)
- Add better panic message when not OK (#1293)
- Add test for empty commits and clean up test error types (#1252)
- Update contributing.md (#1206)
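The SnapshotBuilder entry above describes a builder flow: create a builder, optionally configure it, then call build. A minimal sketch of that shape follows. The types DemoSnapshot and DemoSnapshotBuilder and the at_version method are hypothetical stand-ins written for illustration; they are not the delta_kernel types, and the real method names and arguments should be taken from #1189 and the crate docs.

```rust
// Hypothetical stand-in types: NOT the delta_kernel API.
#[derive(Debug, PartialEq)]
struct DemoSnapshot {
    table_root: String,
    version: Option<u64>,
}

struct DemoSnapshotBuilder {
    table_root: String,
    version: Option<u64>,
}

impl DemoSnapshot {
    // Entry point: returns a builder instead of constructing the snapshot directly.
    fn builder(table_root: impl Into<String>) -> DemoSnapshotBuilder {
        DemoSnapshotBuilder { table_root: table_root.into(), version: None }
    }
}

impl DemoSnapshotBuilder {
    // Optional configuration step (hypothetical name); consumes and returns the builder.
    fn at_version(mut self, version: u64) -> Self {
        self.version = Some(version);
        self
    }

    // Final step: validate the inputs and construct the snapshot.
    fn build(self) -> Result<DemoSnapshot, String> {
        if self.table_root.is_empty() {
            return Err("table root must not be empty".to_string());
        }
        Ok(DemoSnapshot { table_root: self.table_root, version: self.version })
    }
}

fn main() {
    let snapshot = DemoSnapshot::builder("memory:///demo/")
        .at_version(3)
        .build()
        .expect("valid demo inputs");
    println!("{:?}", snapshot);
}
```

The point of the pattern is that optional inputs become chained calls rather than extra constructor arguments, so adding a new option later need not break existing callers.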
v0.15.2 (2025-09-03)
- pin comfy-table at 7.1.4 to restore kernel MSRV (#1231)
- Arrow json decoder fix for breakage on long json string (#1244)
v0.15.1 (2025-08-28)
- Make ListedLogFiles::try_new internal-api (again) (#1226)
v0.15.0 (2025-08-28)
- Rename default-engine feature to default-engine-native-tls (#1100)
- Add arrow 56 support, drop arrow 54 (#1141)
- Add catalogManaged (and catalogOwned-preview) table features + catalog-managed experimental feature flag (#1165)
- ExpressionRef instead of owned Expression for transforms (#1171): Expression::Struct now takes a Vec<ExpressionRef> instead of Vec<Expression>
- Add support for Column Mapping Id Mode (#1056): significantly changes the semantics (Engine trait requirements) of the parquet handler in column mapping id mode. See ParquetHandler::read_parquet_files docs for details.
- StructField.physical_name is no longer public (internal-api) (#1186)
- Add support for sparse transform expressions (#1199): adds a new Expression::Transform variant.
- Expression evaluators take ExpressionRef as input (#1221):
  - EvaluationHandler::new_expression_evaluator and EvaluationHandler::new_predicate_evaluator take Arc instead of owned expression/predicate.
  - scan::state::transform_to_logical takes owned Option<ExpressionRef> instead of a borrowed reference.
  - transaction::WriteContext::logical_to_physical returns an Arc instead of a borrowed reference
- Impl IntoEngineData for Protocol action (#1136)
- Add txnId to commit info (#1148)
- (catalog-managed) Experimental uc client (#1164)
- Implement IntoEngineData for DomainMetadata (#1169)
- Add example for table writes (#1119)
- (ffi) Add visit_expression_literal_date (#1096)
- Match arrow versions in examples (#1166)
- Support arrow views in ensure_data_types (#1028)
- Make ListedLogFiles internal-api again (#1209)
- Provide accurate error when evaluating a different type in LiteralExpressionTransform (#1207)
- Fix failing test and improve indentation test error message (#1135)
- Contiguous commit file checking inside ListedLogFiles::try_new() (#1107)
- New listed_log_files module (#1150)
- Move LastCheckpointHint to separate module (#1154)
- (catalog-managed) Push down _last_checkpoint read into LogSegment (#1204)
- Add metadata-only regression test (#1183)
- Parameterize column mapping tests to check different modes (#1176)
- Add apply_schema mismatch test (#1210)
- Appease clippy in rustc 1.89 (#1151)
- Bump MSRV to 1.84 (#1142)
- Remove object store versioning (#1161)
- Remove unused deps from examples (#1175)
- Update deps (#1181)
v0.14.0 (2025-08-01)
- Removed Table APIs: instead use Snapshot and Transaction directly. (#976)
- Add support for Variant type and the variantType table feature (new DataType::Variant enum variant and new variantType-preview and variantShredding Reader/Writer features) (#1015)
- Expose post commit stats. Now, in Transaction::commit the Committed variant of the enum includes a post_commit_stats field with info about the commits since checkpoint and log compaction. (#1079)
- Replace Transaction::with_commit_info() API with with_engine_info() API (#997)
- Removed DataType::decimal_unchecked API (#1087)
- make_physical takes column mapping and sets parquet field ids. breaking: (1) StructField::make_physical is now an internal_api instead of a public function. Its signature has also changed. And (2) If ColumnMappingMode is None, then the physical schema's name is the logical name. Previously, kernel would unconditionally use the column mapping physical name, even if column mapping mode is none. (#1082)
- (ffi) Added default-engine-rustls feature and extern "C" for .h file (#1023)
- Add log segment constructor for timestamp to version conversion (#895)
- Expose unshredded variant type as DataType::unshredded_variant() (#1086)
- New ffi API for get_domain_metadata() (#1041)
- Add append functions to ffi (#962)
- Add try_new and IntoEngineData for Metadata action (#1122)
- Rename object_store PutMultipartOpts (#1071, #1090)
- Use object_store >= 0.12.3 for arrow 55 feature (#1117)
- VARIANT follow-ups for SchemaTransform etc (#1106)
- Downgrade stale _last_checkpoint log from warn! to info! (#777)
- Exclude tests/data from release (#1092)
- Deny panics in prod code (#1113)
- Add derive macro tests (#514)
- Add unshredded variant read test (#1088)
- (ffi) AllocateErrorFn should be able to allocate a nullptr (#1105)
- Assert tests on error message instead of is_err() (#1110)
- Expose Snapshot and ListedLogFiles constructors behind internal api flag (#1076)
- Only semver check released crates (#1101)
v0.13.0 (2025-07-11)
- Add support for opaque engine expressions. Includes a number of changes: new ExpressionTypes (OpaqueExpression, OpaquePredicate, Unknown) and Expression/Predicate variants (Opaque, Unknown), and visitors, transforms, and evaluators changed to support opaque/unknown expressions/predicate. (#686)
- Rename Transaction::add_write_metadata to Transaction::add_files (#1019)
- add ability to only retain SetTransaction actions <= SetTransactionRetentionDuration (#1013)
- (ffi) Add timetravel by version number (#1044)
- Introduce a crate for args that are common between examples (#1046)
- Support reordering structs that are inside maps in default parquet reader (#1060)
- Add default engine support for arrow eval of opaque expressions (#980)
- Expose descriptive fields on Metadata action (#1051)
- Clippy fmt cleanup (#1042)
- Examples: move logic into the thread::scope call so examples don't hang (#1040)
- Remove panic from read_last_checkpoint (#1022)
- Always write _last_checkpoint with parts = None (#1053)
- Don't release common crate (used only by example programs) (#1065)
- Move various test util functions to test-utils crate (#985)
- Define and use a cow helper for transforms (#1057)
- Expand capability and usage of Cow helper for transforms (#1061)
v0.12.1 (2025-06-05)
- Remove azure suffix range request (#1006)
v0.12.0 (2025-06-04)
- Remove GlobalScanState: instead use new Scan APIs directly (logical_schema, physical_schema, etc.) (#947)
- table feature enums are now internal_api (not public, unless internal-api flag is set) (#998)
- Use compacted log files in log-replay (#950)
- New #[derive(IntoEngineData)] proc macro (#830)
- Add support for kernel default expression evaluation (#979)
- New: panic in debug builds if ListedLogFiles breaks invariants (#986)
- Create visitor for getting In-commit Timestamp (#897)
- Binary searching utility function for timestamp to version conversion (#896)
- Enable "TimestampWithoutTimezone" table feature and add protocol validation for it (#988)
- add missing reader/writer features (variantType/clustered) (#998)
- Disable timestamp column's maxValues for data skipping (#1003)
- Make KernelPredicateEvaluator trait dyn-compatible (#994)
v0.11.0 (2025-05-27)
- Add in-commit timestamp table feature (#894)
- Make Error non_exhaustive (will reduce future breaking changes!) (#913)
- Scalar::Map support (#881)
  - New Scalar::Map(MapData) variant and MapData struct to describe Scalar maps.
  - New visit_literal_map FFI
- Split out predicates as different from expressions (#775): pervasive change which moves some expressions to new predicate type.
- Bump MSRV from 1.81 to 1.82 (#942)
- DataSkippingPredicateEvaluator's associated types TypedStat and IntStat combined into one ColumnStat type (#939)
- Code movement in FFI crate (#940):
  - Rename ffi::expressions::engine mod as kernel_visitor
  - Rename ffi::expressions::kernel mod as engine_visitor
  - Move the free_kernel_[expression|predicate] functions to the expressions mod
  - Move the EnginePredicate struct to the ffi::scan module
- Fix timestamp ntz in physical to logical cdf (#948): now TableChangesScan::execute returns a schema with _commit_timestamp of type Timestamp (UTC) instead of TimestampNtz.
- Add TryIntoKernel/Arrow traits (#946): Removes old From/Into implementations for kernel schema types, replaces with TryFromKernel/TryIntoKernel/TryFromArrow/TryIntoArrow. Migration should be as simple as changing a .try_into() to a .try_into_kernel() or .try_into_arrow().
- Remove SyncEngine (now test-only), use DefaultEngine everywhere else (#957)
- Add Snapshot::checkpoint() & Table::checkpoint() API (#797)
- Add CRC ParsedLogPath (#889)
- Use arrow array builders in Scalar::to_array (#905)
- Add domainMetadata read support (#875)
- Support maps and arrays in literal_expression_transform (#882)
- Add CheckpointWriter::finalize() API (#851)
- DataSkippingPredicate dyn compatible (#939): finish_eval_pred_junction now takes &dyn Iterator
- Store compacted log files in LogSegment (#936)
- Add CRC, FileSizeHistogram, and DeletedRecordCountsHistogram schemas (#917)
- Scan from previous result (#829)
- Include latest CRC in LogSegment (#964)
- CRC protocol+metadata visitor (#972)
- Make several types/function pub and fix their doc comments (#977)
  - KernelPredicateEvaluator and KernelPredicateEvaluatorDefaults are now pub.
  - DataSkippingPredicateEvaluator is now pub.
  - add new type aliases DirectDataSkippingPredicateEvaluator and IndirectDataSkippingPredicateEvaluator
  - Arrow engine evaluate_expression and evaluate_predicate are now pub.
  - Expression::predicate renamed to Expression::from_pred
- Fix incorrect results for Scalar::Array::to_array (#905)
- Use object_store::Path::from_url_path when appropriate (#924)
- Don't include modules via a macro (#935)
- Rustc 1.87 clippy fixes (#955)
- Allow CheckpointDataIterator to be used across await (#961)
- Remove target-cpu=native rustflags (#960)
- Rename drop_null_container_values to allow_null_container_values (#965)
- Make ActionsBatch fields pub for internal-api (#983)
- Add readme badges (#904)
- Combine actions counts in CheckpointVisitor (#883)
- Simplify Display for Expression and Predicate (#938)
- Macro traits cleanup (#967)
- Remove redundant binary predicate operations (#949)
- Make arrow predicate eval directly invertible (#956)
- Add ActionsBatch (#974)
- Remove abs_diff since we have rust 1.81 (#909)
- Conditional compilation instead of suppressing clippy warnings (#945)
- Expose some more arrow utils via internal-api (#971)
- Use consistent naming of kernel data type in arrow eval tests (#978)
- Cargo doc workspace + all-features (#981)
v0.10.0 (2025-04-28)
- Updated dependencies, breaking updates: itertools 0.14, thiserror 2, and strum 0.27 (#814)
- Rename developer-visibility feature flag to internal-api (#834)
- Tidy up AND/OR/NOT API and usage (#842)
- Rename VariadicExpression to JunctionExpression (#841)
- Enforce precision/scale correctness of Decimal types and values (#857)
- Expression system refactors
  - Make literal expressions more strict (removed Into trait impl) (#867)
  - Remove nearly-unused expression lt_eq/gt_eq overloads (#871)
  - Move expression transforms (ExpressionTransform and ExpressionDepthChecker) to own module (#878)
  - Code movement in expression-related code (Reordered variants of the BinaryExpressionOp enum) (#879)
- Introduce the ability for consumers to add ObjectStore url handlers (#873)
- Update to arrow 55, drop arrow 53 support (#885, #903)
- Add CheckpointVisitor in new checkpoint mod (#738)
- Add CheckpointLogReplayProcessor in new checkpoints mod (#744)
- Add transaction.with_transaction_id() API (#824)
- Add snapshot.get_app_id_version(app_id, engine) (#862)
- Overwrite logic in write_json_file for default & sync engine (#849)
- default engine: Sort list results based on URL scheme (#820)
- impl AllocateError for T: ExternEngine (#856)
- Disable predicate pushdown in Scan::execute (#861)
- Correct docstring for DefaultEngine::new (#821)
- Remove acceptance from rust-analyzer.cargo.features in README (#858)
- Rename predicates mod to kernel_predicates (#822)
- Code movement to tidy up ffi (#840)
- Grab bag of cosmetic tweaks and comment updates (#848)
- New #[internal_api] macro instead of visibility crate (#835)
- Expression transforms use new recurse_into_children helper (#869)
- Minor test improvements (#872)
- Remove unused dependencies (#863)
- Test code uses Expr shorthand for Expression (#866)
- Arrow DefaultExpressionEvaluator need not box its inner expression (#868)
v0.9.0 (2025-04-08)
- Change MetadataValue::Number(i32) to MetadataValue::Number(i64) (#733)
- Get prefix from offset path: DefaultEngine::new no longer requires a table_root parameter and list_from consistently returns keys greater than the offset (#699)
- Make snapshot.schema() return a SchemaRef (#751)
- Make visit_expression_internal private, and unwrap_kernel_expression pub(crate) (#767)
- Make actions types pub(crate) instead of pub (#405)
- New null_row ExpressionHandler API (#662)
- Rename enums ReaderFeatures -> ReaderFeature and WriterFeatures -> WriterFeature (#802)
- Remove get_ prefix from engine getters (#804)
- Rename FileSystemClient to StorageHandler (#805)
- Adopt types for table features (New ReaderFeature::Unknown(String) and WriterFeature::Unknown(String)) (#684)
- Renamed ScanData to ScanMetadata (#817)
  - rename ScanData to ScanMetadata
  - rename Scan::scan_data() to Scan::scan_metadata()
  - (ffi) rename free_kernel_scan_data() to free_scan_metadata_iter()
  - (ffi) rename kernel_scan_data_next() to scan_metadata_next()
  - (ffi) rename visit_scan_data() to visit_scan_metadata()
  - (ffi) rename kernel_scan_data_init() to scan_metadata_iter_init()
  - (ffi) rename KernelScanDataIterator to ScanMetadataIterator
  - (ffi) rename SharedScanDataIterator to SharedScanMetadataIterator
- ScanMetadata is now a struct (instead of tuple) with new FilteredEngineData type (#768)
- (v2Checkpoint) Extract & insert sidecar batches in replay's action iterator (#679)
- Support the v2Checkpoint reader/writer feature (#685)
- Add check for whether appendOnly table feature is supported or enabled (#664)
- Add basic partition pruning support (#713)
- Add DeletionVectors to supported writer features (#735)
- Add writer version 2/invariant table feature support (#734)
- Improved pre-signed URL checks (#760)
- Add CheckpointMetadata action (#781)
- Add classic and uuid parquet checkpoint path generation (#782)
- New Snapshot::try_new_from() API (#549)
- Return Error::unsupported instead of panic in Scalar::to_array(MapType) (#757)
- Remove 'default-members' in workspace, default to all crates (#752)
- Update compilation error and clippy lints for rustc 1.86 (#800)
- Split up arrow_expression module (#750)
- Flatten deeply nested match statement (#756)
- Simplify predicate evaluation by supporting inversion (#761)
- Rename LogSegment::replay to LogSegment::read_actions (#766)
- Extract deduplication logic from AddRemoveDedupVisitor into embeddable FileActionsDeduplicator (#769)
- Move testing helper function to test_utils mod (#794)
- Rename _last_checkpoint from CheckpointMetadata to LastCheckpointHint (#789)
- Use ExpressionTransform instead of adhoc expression traversals (#803)
- Extract log replay processing structure into LogReplayProcessor trait (#774)
- Add V2 checkpoint read support integration tests (#690)
- Use maintained action to setup rust toolchain (#585)
v0.8.0 (2025-03-04)
- ffi: get_partition_column_count and get_partition_columns now take a Snapshot instead of a Scan (#697)
- ffi: expression visitor callback visit_literal_decimal now takes i64 for the upper half of a 128-bit int value (#724)
- DefaultJsonHandler::with_readahead() renamed to DefaultJsonHandler::with_buffer_size() (#711)
- DefaultJsonHandler's defaults changed:
  - default buffer size: 10 => 1000 requests/files
  - default batch size: 1024 => 1000 rows
- Bump MSRV to rustc 1.81 (#725)
- Pin chrono version to fix arrow compilation failure (#719)
- Replace default engine JSON reader's FileStream with concurrent futures (#711)
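The visit_literal_decimal entry above says the callback now receives an i64 for the upper half of a 128-bit integer. The arithmetic behind that kind of split can be sketched as follows. The function names are hypothetical, and passing the lower half as a u64 is an assumption made for this illustration, so check #724 for the actual callback signature.

```rust
// Hypothetical helpers; the u64 type for the low half is an assumption.

/// Split a 128-bit signed value into a signed upper half and an unsigned lower half.
fn split_i128(value: i128) -> (i64, u64) {
    let hi = (value >> 64) as i64; // arithmetic shift keeps the sign in the upper half
    let lo = value as u64; // truncating cast keeps the low 64 bits
    (hi, lo)
}

/// Recombine the two halves into the original 128-bit value.
fn join_i128(hi: i64, lo: u64) -> i128 {
    ((hi as i128) << 64) | (lo as i128) // u64 -> i128 zero-extends, so the OR cannot clobber hi
}

fn main() {
    for value in [0_i128, 12345, -1, i128::MAX, i128::MIN] {
        let (hi, lo) = split_i128(value);
        assert_eq!(join_i128(hi, lo), value);
    }
    println!("round trip ok");
}
```

Keeping the sign only in the upper half means negative decimals round-trip without any special casing on the receiving side.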
v0.7.0 (2025-02-24)
- Read transforms are now communicated via expressions (#607, #612, #613, #614). This includes:
  - ScanData now includes a third tuple field: a row-indexed vector of transforms to apply to the EngineData.
  - Adds a new scan::state::transform_to_logical function that encapsulates the boilerplate of applying the transform expression
  - Removes scan_action_iter API and logical_to_physical API
  - Removes column_mapping_mode from GlobalScanState
  - ffi: exposes methods to get an expression evaluator and evaluate an expression from c
  - read-table example: Removes add_partition_columns in arrow.c
  - read-table example: adds an apply_transform function in arrow.c
- ffi: support field nullability in schema visitor (#656)
- ffi: expose metadata in SchemaEngineVisitor ffi api (#659)
- ffi: new visit_schema FFI now operates on a Schema instead of a Snapshot (#683, #709)
- Introduced feature flags (arrow_54 and arrow_53) to select major arrow versions (#654, #708, #717)
- Read partition_values in RemoveVisitor and remove break in RowVisitor for RemoveVisitor (#633)
- Add the in-commit timestamp field to CommitInfo (#581)
- Support NOT and column expressions in eval_sql_where (#653)
- Add check for schema read compatibility (#554)
- Introduce TableConfiguration to jointly manage metadata, protocol, and table properties (#644)
- Add visitor SidecarVisitor and Sidecar action struct (#673)
- Add in-commit timestamps table properties (#558)
- Support writing to writer version 1 (#693)
- ffi: new logical_schema FFI to get the logical schema of a snapshot (#709)
- Incomplete multi-part checkpoint handling when no hint is provided (#641)
- Consistent PartialEq for Scalar (#677)
- Cargo fmt does not handle mods defined in macros (#676)
- Ensure properly nested null masks for parquet reads (#692)
- Handle predicates on non-nullable columns without stats (#700)
- Update readme to reflect tracing feature is needed for read-table (#619)
- Clarify JsonHandler semantics on EngineData ordering (#635)
- Make [non] nullable struct fields easier to create (#646)
- Make eval_sql_where available to DefaultPredicateEvaluator (#627)
- Port cdf tests from delta-spark to kernel (#611)
v0.6.1 (2025-01-10)
- New feature flag default-engine-rustls (#572)
- Allow partition value timestamp to be ISO8601 formatted string (#622)
- Fix stderr output for handle tests (#630)
v0.6.0 (2024-12-17)
API Changes
Breaking
- Scan::execute takes an Arc<dyn EngineData> now (#553)
- StructField::physical_name no longer takes a ColumnMapping argument (#543)
- removed ColumnMappingMode Default implementation (#562)
- Remove lifetime requirement on Scan::execute (#588)
- scan::Scan::predicate renamed as physical_predicate to eliminate ambiguity (#512)
- scan::log_replay::scan_action_iter now takes fewer (and different) params. (#512)
- Expression::Unary, Expression::Binary, and Expression::Variadic now wrap a struct of the same name containing their fields (#530)
- Moved delta_kernel::engine::parquet_stats_skipping module to delta_kernel::predicate::parquet_stats_skipping (#602)
- New Error variants Error::ChangeDataFeedIncompatibleSchema and Error::InvalidCheckpoint (#593)
Additions
- Ability to read a table's change data feed with new TableChanges API! See new table_changes module as well as the 'read-table-changes' example (#597). Changes include:
  - Implement Log Replay for Change Data Feed (#540)
  - ScanFile expression and visitor for CDF (#546)
  - Resolve deletion vectors to find inserted and removed rows for CDF (#568)
  - Helper methods for CDF Physical to Logical Transformation (#579)
  - TableChangesScan::execute and end to end testing for CDF (#580)
  - TableChangesScan::schema method to get logical schema (#589)
- Enable relaying log events via FFI (#542)
Implemented enhancements:
- Define an ExpressionTransform trait (#530)
- [chore] appease clippy in rustc 1.83 (#557)
- Simplify column mapping mode handling (#543)
- Adding some more miri tests (#503)
- Data skipping correctly handles nested columns and column mapping (#512)
- Engines now return FileMeta with correct millisecond timestamps (#565)
Fixed bugs:
- don't use std abs_diff, put it in test_utils instead, run tests with msrv in action (#596)
- (CDF) Add fix for sv extension (#591)
- minimal CI fixes in arrow integration test and semver check (#548)
v0.5.0 (2024-11-26)
API Changes
Breaking
- Expression::Column(String) is now Expression::Column(ColumnName) #400
- delta_kernel_ffi::expressions moved into two modules: delta_kernel_ffi::expressions::engine and delta_kernel_ffi::expressions::kernel #363
- FFI: removed (hazardous) impl From for KernelStringSlice and added unsafe constructor instead #441
- Moved LogSegment into its own module (log_segment::LogSegment) #438
- Renamed EngineData::length as EngineData::len #471
- New AsAny trait: AsAny: Any + Send + Sync required bound on all engine traits #450
- Rename mod features to mod table_features #454
- LogSegment fields renamed: commit_files -> ascending_commit_files and checkpoint_files -> checkpoint_parts #495
- Added minimum-supported rust version: currently rust 1.80 #504
- Improved row visitor API: renamed EngineData::extract as EngineData::visit_rows, and DataVisitor trait renamed as RowVisitor #481
- FFI: New mod engine_data and mod error (moved Error to error::Error) #537
- new error types: InvalidProtocol, InvalidCommitInfo, MissingCommitInfo, FileAlreadyExists, Unsupported, ParseIntervalError, ChangeDataFeedUnsupported
Additions
- New ColumnName, column_name!, column_expr! for structured column name parsing. #400 #467
- New Engine API write_json_file() for atomically writing JSON #370
- New Transaction API for creating transactions, adding commit info and write metadata, and committing the transaction to the table. Includes Table.new_transaction(), Transaction.write_context(), Transaction.with_commit_info, Transaction.with_operation(), Transaction.with_write_metadata(), and Transaction.commit() #370 #393
- FFI: Visitor for converting kernel expressions to engine expressions. See the new example at ffi/examples/visit-expression/ #363
- FFI: New TryFromStringSlice trait and kernel_string_slice macro #441
- New DefaultEngine engine implementation for writing parquet: write_parquet_file() #393
- Added support for parsing comma-separated column name lists: ColumnName::parse_column_name_list() #458
- New VacuumProtocolCheck table feature #454
- DvInfo now implements Clone, PartialEq, and Eq #468
- Stats now implements Debug, Clone, PartialEq, and Eq #468
- Added Cdc action support #506
- (early CDF read support) New TableChanges type to read CDF from a table between versions #505
- (early CDF read support) Builder for scans on TableChanges #521
- New TableProperties struct which can parse tables' metadata.configuration #453 #536
Implemented enhancements:
- FFI examples now use AddressSanitizer #447
- ColumnName now tracks a path of field names instead of a simple string #445
- use ParsedLogPaths for files in LogSegment #472
- FFI: added Miri support for tests #470
- check table URI has trailing slash #432
- build cargo docs in CI #479
- new test-utils crate #477
- added proper protocol validation (both parsing correctness and semantic correctness) #454 #493
- harmonize predicate evaluation between delta stats and parquet footer stats #420
- more log path tests #485
- ensure_read_supported and ensure_write_supported APIs #518
- include NOTICE and LICENSE in published crates #520
- FFI: factored out read_table kernel utils into kernel_utils.h/c #539
- simplified log replay visitor and avoid materializing Add/Remove actions #494
- simplified schema transform API #531
- support arrow view types in conversion from ArrowDataType to kernel's DataType #533
Fixed bugs:
- Disabled missing-column row group skipping: The optimization to treat a physically missing column as all-null is unsound, if the schema was not already verified to prove that the table's logical schema actually includes the missing column. We disable it until we can add the necessary validation. #435
- fixed leaks in read_table FFI example #449
- fixed read_table compilation on windows #455
- fixed various predicate eval bugs #420
v0.4.1 (2024-10-28)
API Changes
None.
Fixed bugs:
- Disabled missing-column row group skipping: The optimization to treat a physically missing column as all-null is unsound, if the schema was not already verified to prove that the table's logical schema actually includes the missing column. We disable it until we can add the necessary validation. #435
v0.4.0 (2024-10-23)
API Changes
Breaking
- pub ScanResult.mask field made private and only accessible as ScanResult.raw_mask() method #374
- new ReaderFeatures enum variant: TypeWidening and TypeWideningPreview #335
- new WriterFeatures enum variant: TypeWidening and TypeWideningPreview #335
- new Error enum variant: InvalidLogPath when kernel is unable to parse the name of a log path #347
- Module moved: mod delta_kernel::transaction -> mod delta_kernel::actions::set_transaction #386
- change default-feature to be none (removed sync-engine by default. If downstream users relied on this, turn on sync-engine feature or specific arrow-related feature flags to pull in the pieces needed) #339
- Scan's execute(..) method now returns a lazy iterator instead of materializing a Vec<ScanResult>. You can trivially migrate to the new API (and force eager materialization by using .collect() or the like on the returned iterator) #340
- schema and expression FFI moved to their own mod delta_kernel_ffi::schema and mod delta_kernel_ffi::expressions #360
- Parquet and JSON readers in Engine trait now take Arc<Expression> (aliased to ExpressionRef) instead of Expression #364
- StructType::new(..) now takes an impl IntoIterator<Item = StructField> instead of Vec<StructField> #385
- DataType::struct_type(..) now takes an impl IntoIterator<Item = StructField> instead of Vec<StructField> #385
- removed DataType::array_type(..) API: there is already an impl From<ArrayType> for DataType #385
- Expression::struct_expr(..) renamed to Expression::struct_from(..) #399
- lots of expressions take impl Into<Self> or impl Into<Expression> instead of just Self/Expression now #399
- remove log_replay_iter and process_batch APIs in scan::log_replay #402
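The Scan execute(..) entry above notes that a lazy iterator can be forced back into an eager collection with .collect(). A minimal sketch of that migration using only the standard library follows. DemoScanResult, execute_lazy and collect_eagerly are hypothetical stand-ins, and the Result item type is an assumption, so consult #340 for the real signature.

```rust
// Hypothetical stand-ins for illustration; NOT the delta_kernel types.
#[derive(Debug)]
struct DemoScanResult {
    rows: usize,
}

// Lazy: nothing is produced until the caller pulls items from the iterator.
fn execute_lazy(batch_sizes: Vec<usize>) -> impl Iterator<Item = Result<DemoScanResult, String>> {
    batch_sizes.into_iter().map(|rows| {
        if rows == 0 {
            Err("empty batch".to_string())
        } else {
            Ok(DemoScanResult { rows })
        }
    })
}

// Eager: collecting into Result<Vec<_>, _> materializes every item
// and stops at the first error.
fn collect_eagerly(batch_sizes: Vec<usize>) -> Result<Vec<DemoScanResult>, String> {
    execute_lazy(batch_sizes).collect()
}

fn main() {
    let results = collect_eagerly(vec![10, 20, 30]).expect("all batches are non-empty");
    let total: usize = results.iter().map(|r| r.rows).sum();
    println!("{} batches, {} rows", results.len(), total);
}
```

Callers that only need to stream results can skip the collect step entirely and loop over the iterator, which avoids holding every batch in memory at once.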
Additions
- remove feature flag requirement for impl GetData on () #334
- new full_mask() method on ScanResult #374
- StructType::try_new(fields: impl IntoIterator<Item = StructField>) #385
- DataType::try_struct_type(fields: impl IntoIterator<Item = StructField>) #385
- StructField.metadata_with_string_values(&self) -> HashMap<String, String> to materialize and return our metadata into a hashmap #331
Implemented enhancements:
- support reading tables with type widening in default engine #335
- add predicate to protocol and metadata log replay for pushdown #336 and #343
- support annotation (macro) for nullable values in a container (for #[derive(Schema)]) #342
- new ParsedLogPath type for better log path parsing #347
- implemented row group skipping for default engine parquet readers and new utility trait for stats-based skipping logic #357, #362, #381
- depend on wider arrow versions and add arrow integration testing #366 and #413
- added semver testing to CI #369, #383, #384
- new SchemaTransform trait and usage in column mapping and data skipping #395 and #398
- arrow expression evaluation improvements #401
- replace panics with to_compiler_error in macros #409
Fixed bugs:
- output of arrow expression evaluation now applies/validates output schema in default arrow expression handler #331
- add arrow-buffer to arrow-expression feature #332
- fix bug with out-of-date last checkpoint #354
- fixed broken sync engine json parsing and harmonized sync/async json parsing #373
- filesystem client now always returns a sorted list #344
v0.3.1 (2024-09-10)
API Changes
Additions
- Two new binary expressions: In and NotIn, as well as a new Scalar::Array variant to represent arrays in the expression framework #270 NOTE: exact API for these expressions is still evolving.
Implemented enhancements:
- Enabled more golden table tests #301
Fixed bugs:
- Allow kernel to read tables with invalid _last_checkpoint #311
- List log files with checkpoint hint when constructing latest snapshot (when version requested is None) #312
- Fix incorrect offset value when computing list offsets #327
- Fix metadata string conversion in default engine arrow conversion #328
v0.3.0 (2024-08-07)
API Changes
Breaking
- delta_kernel::column_mapping module moved to delta_kernel::features::column_mapping #222
Additions
- New deletion vector API row_indexes (and accompanying FFI) to get row indexes instead of selection vector of deleted rows. This can be more efficient for sparse DVs. #215
- Typed table features: ReaderFeatures, WriterFeatures enums and has_reader_feature/has_writer_feature API #222
Implemented enhancements:
- Add --limit option to example read-table-multi-threaded #297
- FFI now built with cmake. Move to using the read-test example as an ffi-test. And building on macos. #288
- Golden table tests migrated from delta-spark/delta-kernel java #295
- Code coverage implemented via cargo-llvm-cov and reported with codecov #287
- All tests enabled to run in CI #284
- Updated DAT to 0.3 #290
Fixed bugs:
- Evaluate timestamps as "UTC" instead of "+00:00" for timezone #295
- Make Map arrow type field naming consistent with parquet field naming #299
v0.2.0 (2024-07-17)
API Changes
Breaking
- The scan callback if using visit_scan_files now takes an extra Option<Stats> argument, holding top level stats for associated scan file. You will need to add this argument to your callback.
  Likewise, the callback in the ffi code also needs to take a new argument which is a pointer to a Stats struct, and which can be null if no stats are present.
Additions
- You can call scan_builder() directly on a snapshot, for more convenience.
- You can pass a URL starting with "hdfs" or "viewfs" to the default client to read using hdfs_native_store
Implemented enhancements:
- Handle nested structs in schemaString (allows reading iceberg compat tables) #257
- Expose top level stats in scans #227
- Hugely expanded C-FFI example #203
- Add scan_builder function to Snapshot #273
- Add hdfs_native_store support #273
- Proper reading of Parquet files, including only reading requested leaves, type casting, and reordering #271
- Allow building the package if you are behind an https proxy #282
Fixed bugs:
- Don't error if more fields exist than expected in a struct expression #267
- Handle cases where the deletion vector length is less than the total number of rows in the chunk #276
- Fix partition map indexing if column mapping is in effect #278
v0.1.1 (2024-06-03)
Implemented enhancements:
- Support unary NOT and IsNull for data skipping #231
- Add unary visitors to c ffi #247
- Minor other QOL improvements
v0.1.0 (2024-06-12)
Initial public release