v0.20.0 (2026-02-26)
- Remove DefaultEngine::new (#1583)
- Use DefaultEngineBuilder instead, like: DefaultEngineBuilder::new(store).build()
- Add ParseJson expression (#1586)
- Implementors of the ExpressionHandler trait now need to handle this expression
- Change CommitResponse::Committed to return a FileMeta (#1599)
- Committer implementations must now return a FileMeta of the written file after each commit, instead of only returning the committed version
- Add stats_columns to ParquetHandler (#1668)
- Add stat_columns to write_parquet_file engine implementation, which specifies the columns to collect Delta stats on
- Add StatisticsCollector core with numRecords (#1662)
- Renames _stat_columns above to stat_columns
- Return updated Snapshot from Snapshot::publish (#1694)
- Snapshot::publish now takes self: Arc and returns DeltaResult instead of ()
- Pass engine to Snapshot::transaction() for domain metadata access (#1707)
- Snapshot::transaction() now requires an engine: &dyn Engine parameter to read domain metadata
- Add tracing instrumentation to transaction and snapshot operations (#1772)
- snapshot and transaction have both stopped implementing auto traits UnwindSafe and RefUnwindSafe due to storing new instrumentation span fields
- Use physical stats column names in WriteContext (#1836)
- WriteContext.stats_columns now uses physical column names per column mapping. Ref: https://github.com/delta-io/delta/blob/master/PROTOCOL.md#column-mapping
- Generate physical_schema in WriteContext w.r.t column mapping and materializePartitionColumns (#1837)
- WriteContext.physical_schema now respects column mapping, and retains partition columns when materializePartitionColumns is enabled. Ref: https://github.com/delta-io/delta/blob/master/PROTOCOL.md#column-mapping
- Fix get_app_id_version to take &self (#1770)
- If you are calling get_app_id, pass a reference to the Snapshot, not Arc<Snapshot>
- Add ability to 'enter' the runtime to the default engine (#1847)
- Implementors of the TaskExecutor trait now need to support this
- Add doctests for IntoEngineData derive macro (#1580)
- Create DefaultEngineBuilder to build DefaultEngine (#1582)
- Implement Scalar::From<HashMap<K, V>> (#1541)
- Add logSegment.new_with_commit_appended API (#1602)
- snapshot.new_post_commit (#1604)
- Creates a new Snapshot reflecting a just-committed transaction without re-reading the log
- Enable Arrow to convert nullable StructArray to RecordBatch (#1635)
- Add snapshot.checkpoint() for all-in-one checkpointing (#1600)
- Add a tracing statement to print table configuration for each version (#1634)
- Add CheckpointDeduplicator for checkpoint phase of distributed log replay (#1538)
- Add CreateTable API with simplified single-stage flow (#1629)
- Add with_table_properties method to CreateTableTransactionBuilder (#1649)
- Add post-commit Snapshot to txn (#1633)
- Add CDF tracing for Phase 1 of Change Data feed (#1654)
- Make Sequential phase schema only contain add and remove actions (#1679)
- Add executor for distributed log replay (#1539)
- Transaction stats API (#1658)
- Snapshot::publish API with e2e in-memory UC test (#1628)
- Expose a Snapshot::get_domain_metadata_internal API, guarded by internal-api feature flag (#1692)
- Add nullCount support to StatisticsCollector (#1663)
- Add minValues and maxValues support to StatisticsCollector (#1664)
- Enable NullCount collection for complex data types (#1706)
- Implement schema diffing for flat schemas (2/5) (#1478)
- Add API on Scan to perform 2-phase log replay (#1547)
- Enable distributed log replay serde serialization for serializable scan state (#1549)
- Add InCommitTimestamp support to ChangeDataFeed (#1670)
- Add include_stats_columns API and output_stats_schema field (#1728)
- Add write support for clustered tables behind feature flag (#1704)
- Add snapshot load instrumentation (#1750)
- Create table builder and domain metadata handling (#1762)
- Add crc module with schema, visitor, reader, and lazy loader (#1780)
- Add clustering support for CREATE TABLE (#1763)
- Support owned runtime in TokioMultiThreadExecutor (#1719)
- (transaction) Support blind append commit metadata (#1783)
- Adds set_is_blind_append() API to Transaction, includes isBlindAppend in generated CommitInfo, and validates blind-append semantics (add-only, no removals/DV updates, dataChange must be true) before commit.
- Add stats transform module for checkpoint stats population (#1646)
- Refactor data skipping to use stats_parsed directly (#1715)
- Support using stats_columns and predicate together in scans (#1691)
- Support creation of DefaultEngine with TokioMultiThreadExecutor in FFI (#1755)
- Add column mapping support for CREATE TABLE (#1764)
- Write parsed stats in checkpoints (#1643)
- Implement ReadConfig for Benchmark Framework (#1758)
- Implement TableInfo Deserialization for Benchmark Framework (#1759)
- Implement Read Spec Deserialization for Benchmark Framework (#1760)
- Allow visitors to visit REE Arrow columns. (#1829)
- (committer) Add tracing instrumentation to FileSystemCommitter::commit (#1811)
- Try and cache brew packages to speed up CI (#1909)
- Extend GetData with float, double, date, timestamp, decimal types (#1901)
- Define and use constants for protocol (3,7) (#1917)
- Generate transform in WriteContext w.r.t column mapping (#1862)
- Support v2 checkpoints in create_table API (#1864)
- Expand add files schema to include all stats fields (#1748)
- Support write with both partition columns and column mapping in DefaultEngine (#1870)
- Feat: support scanning for multiple specific domains in domain metadata replay (#1881)
- Allows callers to request multiple domain names in a single metadata replay pass, with early termination once all requested domains are found. Includes optimized skip of domain metadata fields when a domain has already been seen in a newer commit.
- Allow ffi for uc_catalog stuff (#1711)
- Support column mapping on writes (#1863)
- Coerce parquet read nullability to match table schema (#1903)
- Relax clustering column constraints to align with Delta protocol (#1913)
- Auto-enable variantType feature during CREATE TABLE (#1922) (#1949)
- Add type validation for evaluate_expression (#1575)
- Use ReaderBuilder::with_coerce_primitive when parsing JSON (#1651)
- Allow to change tracing level and callback more than once (#1111)
- Simplify checkpoint-table with Snapshot::checkpoint (#1813)
- Add size metadata to the CdfScanFile (#1935)
- Add deletion vector APIs to transaction (#1430)
- Include max known published commit version inside of LogSegment (#1587)
- Use CRC for In-Commit-Timestamp reading (#1806)
- Refactor ListedLogFiles::try_new to be more extensible and with default values by using builder pattern (#1585)
- Implement the read metadata workload runner (#1919)
- Provide expected stats schema (#1592)
- Add checkpoint schema discovery for stats_parsed detection (#1550)
- Add function to check if schema supports parsed stats (#1573)
- Read parsed-stats from checkpoint (#1638)
- feat: add get clustering columns in transactions (#1693)
- Change expected_stats_schema to return logical schema + physical schema (#1749)
- Add support for outputting parsed file statistics to scan batches (#1720)
- Checkpoint and sidecar row group skipping via stats_parsed (#1853)
- Add serialization/deserialization support for Predicates and Expressions (#1543)
- Distributed Log Replay serialization/deserialization (#1503)
- Introduce Deduplicator trait to unify mutable and immutable deduplication (#1537)
- Add ffi api to perform a checkpoint (#1619)
- Make parquet read actually use the executor (#1596)
- Deadlock for TokioMultiThreadExecutor (#1606)
- Remove breaking-change tag after semver passes (#1621)
- Enable arrow conversion from Int96 (#1653)
- Preserve null bitmap in nested transform expressions (#1645)
- Include domain metadata in checkpoints (#1718)
- Domain metadata was not being written to checkpoint files, causing it to be lost after checkpoints
- Propagate struct-level nulls when computing nested column stats (#1745)
- Express One Zone URLs do not support lexicographical ordering (#1753)
- Preserve non-commit files (CRC, checkpoints, compactions) at log tail versions (#1817)
- Fixes list_log_files to no longer discard CRC, checkpoint, and compaction files at the log tail boundary, ensuring these auxiliary files are preserved alongside their commit files.
- Fix Miri CI failure by cleaning stale Miri artifacts before test run (#1845)
- Strip parquet field IDs from physical stats schema for checkpoint reading (#1839)
- Unify v2 checkpoint batch schemas (#1833)
- Improve performance and correctness of EngineMap implementation in default engine (#1785)
- Parquet footer skipping cannot trust nullcount=0 stat (#1914)
- Column extraction for visitors should not rely on schema order (#1818)
- Ensure consistent usage of parquet.field.id and conversion to PARQUET:field_id in kernel/default engine (#1850)
- Make log segment merging in Snapshot::try_new_from deduplicate compaction files (#1954)
- Pre-allocate Vecs and HashSets when size is known (#1676)
- Add skip_stats option to skip reading file statistics (#1738)
- Use CRC in Protocol + Metadata log replay (#1790)
- Move doctest into mods (#1574)
- Deny panics in ffi crate (#1576)
- Extract shared HTTP utilities to http.rs (#1590)
- Rename Snapshot.checkpoint (#1608)
- Extract stats from ActionReconciliationIterator (#1618)
- Cleanup repeated schema definitions in kernel/tests/write.rs (#1637)
- Split committer.rs into multiple files (#1622)
- Consolidate nullable stat transforms (#1636)
- Add Expression::coalesce helper method (#1648)
- Add checkpoint info to ScanLogReplayProcessor (#1752)
- Extract protocol & metadata replay into log_segment submodule (#1782)
- Define constants for table property keys (#1797)
- Replaces scattered string literals for Delta table property keys (e.g. delta.appendOnly, delta.enableChangeDataFeed) with named constants, improving maintainability and preventing typos.
- Update metadata schema to be a SchemaRef and add appropriate Arcs (#1802)
- Rename set_is_blind_append to with_blind_append, returning Self (#1838)
- Adopts builder-style API for the blind append flag, allowing method chaining (e.g. txn.with_blind_append(true).commit(...)).
- Extract clustering tests into sub-module (#1828)
- Split UCCommitsClient into UCCommitClient and UCGetCommitsClient (#1854)
- Separates the Unity Catalog commits client into two focused traits, one for committing and one for reading commits, enabling cleaner dependency boundaries and testability.
- Use type-state pattern for CreateTableTransaction compile-time API safety (#1842)
- Encodes the create-table workflow states (building → ready → committed) in the type system, so invalid transitions (e.g. committing before setting schema) are caught at compile time. Reorganizes create-table code and moves tests to integration tests.
- Simplify table feature parsing (#1878)
- Define and use new TableConfiguration methods (#1905)
- Improve Protocol::try_new and make tests call it reliably (#1907)
- Simplify GetData impls with bool::then() (#1918)
- Split transaction module into mod.rs and update.rs (#1877)
- Breaks the growing transaction module into separate files: core transaction logic in mod.rs and update/DV-related logic in update.rs, improving navigability.
- Rename FeatureType::Writer as WriterOnly (#1934)
- Clean up TableConfiguration validation and unit tests (#1947)
- StructType modification method and stat_transform schema boilerplate code refactor. (#1872)
- In-Memory UC-Commits-Client (#1644)
- Add test for post_commit_snapshot with create table API (#1680)
- Add rs-test support (#1708)
- Add test validating collect_stats() output against Spark (#1778)
- Add test for parquet id when CM enabled (#1946)
- [Test Only] Minor refactor to log_segment tests (#1581)
- Add file size to the unit test of Engine's ParquetReader (#1921)
- Remove unnecessary spaces in PR description (#1598)
- Upgrade to reqwest 0.13 and rustls as default (#1588)
- Stats-schema improvements (#1642)
- Add Rust caching to build and test jobs (#1672)
- Use cargo-nextest for parallel test execution (#1673)
- ~19x faster locally via per-test process isolation
- Fix ffi_test cache miss by using consistent toolchain action (#1702)
- Add caching and optimize tool installation across all jobs (#1674)
- Remove unused remove metadata (#1732)
- Prefer append_value_n over append_value (#1868)
- Pin native-tls to 0.2.16 due to upstream breakage (#1880)
- Fix unit tests with bad protocol versions (#1879)
- Add nextest support for miri tests (#1685)
- Unpin Miri nightly toolchain (#1900)
- Bring 0.19.1 changes into main (#1632)
- Remove comfy-table dependency declaration (#1860)
- Update review policy in CONTRIBUTING.md (#1945)
- Revert "chore: pin native-tls to 0.2.16 due to upstream breakage" (#1915)
- Remove comments and text from pull_request_template.md (#1589)
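The type-state entry for CreateTableTransaction (#1842) in this release describes a general Rust technique. The sketch below is a minimal, self-contained illustration of that technique only: the names CreateTable, Building, Ready, with_schema and commit are hypothetical stand-ins chosen for this example, not the kernel's actual types or signatures.

```rust
use std::marker::PhantomData;

// Hypothetical marker types for the workflow states; not the kernel's real types.
struct Building;
struct Ready;

// A toy create-table builder whose current state is tracked in the type parameter.
struct CreateTable<State> {
    path: String,
    columns: Vec<String>,
    _state: PhantomData<State>,
}

impl CreateTable<Building> {
    fn new(path: &str) -> Self {
        CreateTable {
            path: path.to_string(),
            columns: Vec::new(),
            _state: PhantomData,
        }
    }

    // Supplying a schema is the only way to move from `Building` to `Ready`.
    fn with_schema(self, columns: &[&str]) -> CreateTable<Ready> {
        CreateTable {
            path: self.path,
            columns: columns.iter().map(|c| c.to_string()).collect(),
            _state: PhantomData,
        }
    }
}

impl CreateTable<Ready> {
    // `commit` exists only on `Ready`, so committing without a schema does not compile.
    fn commit(self) -> String {
        format!("created {} with columns [{}]", self.path, self.columns.join(", "))
    }
}

fn main() {
    let summary = CreateTable::<Building>::new("/tmp/table")
        .with_schema(&["id", "name"])
        .commit();
    println!("{summary}");
    // CreateTable::<Building>::new("/tmp/table").commit(); // rejected by the compiler
}
```

Because each transition consumes self and returns a value of a different type, an out-of-order call is a type error rather than a runtime check.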
v0.19.1 (2026-01-20)
v0.19.0 (2025-12-19)
- Error on surplus columns in output schema (#1528)
- Remove arrow-55 support (upgrade to arrow 56 or 57 required) (#1507)
- Add a new read_parquet_schema function to the ParquetHandler trait (#1498)
- Add a new write_parquet_file function to the ParquetHandler trait (#1392)
- Make PartialEq for Scalar a physical comparison (#1554)
Caution
Note this is a breaking behavior change. Code that previously relied on PartialEq as a
logical comparison will still compile, but its runtime behavior will silently change to perform
structural comparisons.
This change moves the current definition of PartialEq for Scalar to a new Scalar::logical_eq
method, and derives PartialEq (= physical comparison).
We also remove PartialOrd for Scalar because it, too, would become physical (required to match PartialEq), and the result would be largely nonsensical. The logical comparison moves to Scalar::logical_partial_cmp instead.
These changes are needed because today there's no reliable way to physically compare two scalars, and most comparisons are physical in practice. Only predicate evaluation needs logical comparisons, and that code already has a narrow waist.
- Expose mod time in scan metadata callbacks: users must change the scan callback function to take a struct which has all the previous arguments as members (and the mod time). See an example of the needed change here. For FFI code, your callback function needs an extra argument. See an example of the change needed here. (#1565)
- Initial Metrics implementation (#1448)
- Build TableConfiguration for each version of change data feed (#1531)
- Add ability for engines to specify a scan schema (#1463)
- Add bidirectional expression round-trip test with visitor functions (#1467)
- Add support for the materializePartitionColumns writer feature (#1476)
- Allow comfy-table 7.2.x (#1545)
- Rustls for uc-client (#1533)
- Add file name metadata column to parquet reading. (#1512)
- Add checkpoint example (#1544)
- Commit Reader for processing commit actions (#1499)
- Add CheckpointManifestReader to process sidecar files (#1500)
- Distributed Log Replay Sequential Phase (#1502)
- Passing schema from C, plus example/tests in C (#1535)
- Support sidecar in inspect-table (#1566)
- short-circuit coalesce evaluation when array has no nulls (#1568)
- Force usage of ListedLogFiles::try_new() (#1562)
- Improve parse_json performance by removing line-by-line parsing (#1561)
- Move ensure_read_support/ensure_write_support to operation entry points (#1518)
- Migrate custom feature functions to generic is_feature_enabled/is_feature_supported (#1519)
- Separate async handler logic from sync bridge logic (#1435)
- Migrated protocol validation tests to table_configuration (#1517)
- Move scan/mod.rs to scan/tests.rs and scan/test_utils.rs (#1485)
- Remove macOS metadata from test data tarballs (#1534)
- Make tests async if they rely on async (#1438)
- Cleanup scalar eq workaround (#1560)
- Remove architecture.md from readme (#1551)
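The Scalar PartialEq change in this release (#1554) separates physical (structural) equality from logical equality. The sketch below illustrates the distinction with a deliberately simplified, hypothetical Scalar enum; the real type has many more variants, and the NULL handling shown for logical_eq is an assumption based on SQL-style semantics, not a copy of the kernel's implementation.

```rust
// A simplified stand-in for the kernel's Scalar (hypothetical, two variants only).
// Deriving PartialEq gives physical comparison: same variant, same payload.
#[derive(Debug, Clone, PartialEq)]
enum Scalar {
    Integer(i32),
    Null,
}

impl Scalar {
    // Logical equality under the assumed SQL-style rule:
    // any comparison that involves NULL is never true.
    fn logical_eq(&self, other: &Scalar) -> bool {
        match (self, other) {
            (Scalar::Null, _) | (_, Scalar::Null) => false,
            _ => self == other,
        }
    }
}

fn main() {
    let a = Scalar::Null;
    let b = Scalar::Null;
    // Physically the two values are identical, logically they do not compare equal.
    println!("physical: {}", a == b);
    println!("logical:  {}", a.logical_eq(&b));
    println!("ints:     {}", Scalar::Integer(1).logical_eq(&Scalar::Integer(1)));
}
```

Code that used == to mean "these scalars compare equal in a predicate" is the kind of call site that would need to switch to the logical method after this change.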
v0.18.2 (2025-12-03)
- Address column mapping edge case in protocol validation (#1513)
- Remove arrow error message dependency from test (#1529)
v0.18.1 (2025-11-24)
- Scan::execute no longer requires lifetime bound (#1515)
- Migrate protocol validation to table_configuration (#1411)
- Add Display for StructType, StructField, and MetadataColumnSpec (#1494)
- Add EngineDataArrowExt and use it everywhere (#1516)
- Implement builder for StructType (#1492)
- Enable CDF for column-mapped tables (#1510)
- Extract File Action tests (#1365)
v0.18.0 (2025-11-19)
- New Engine StorageHandler head API (#1465)
- Engine API implementers must add the head API to StorageHandler which fetches metadata about a file in storage
- Add remove_files API (#1353)
- The schema for scan rows (from Scan::scan_metadata) has been updated to include two new fields: fileConstantValues.tags and fileConstantValues.defaultRowCommitVersion.
- Add parser for iceberg compat properties (#1466)
- Pass ColumnMappingMode to physical_name (#1403)
- Allow visiting entire domain metadata (#1384)
- Add Table Feature Info (#1462)
- (FFI) Snapshot log tail FFI (#1379)
- Add generic is_feature_supported and is_feature_enabled methods to TableConfiguration (#1405)
- Un-deprecate ArrayData.array_elements() (#1493)
- Allow writes to CDF tables for add-only, remove-only, and non-data-change transactions (#1490)
- (catalog-managed) UCCommitter (#1418)
- Eliminate endless busy looping in read_json_files on failed read (#1489)
- Handle array/map types in ffi schema example and test (#1497)
- Fix docs for rustc 1.92+ (#1470)
- Harmonize checkpoint and log compaction iterators (#1436)
- Avoid overly complex itertools methods in log listing code (#1434)
- Simplify creation of default engine in tests (#1437)
- Add tests for StructField.physical_name (#1469)
v0.17.1 (2025-11-13)
- Fix docs for rustc 1.92+ (#1470)
v0.17.0 (2025-11-10)
- (catalog-managed): New copy_atomic StorageHandler method (#1400)
- StorageHandler implementers must implement the copy_atomic method.
- Make expression and predicate evaluator constructors fallible (#1452)
- Predicate and expression evaluator constructors return DeltaResult.
- (catalog-managed): add log_tail to SnapshotBuilder (#1290)
- into_scan_builder() no longer exists on Snapshot. Must create an Arc<Snapshot>
- Arrow 57, MSRV 1.85+ (#1424)
- The Minimum Required Rust Version to use kernel-rs is now 1.85.
- Add ffi for idempotent write primitives (#1191)
- get_transform_for_row now returns new FFI-safe OptionalValue instead of Option
- Rearchitect CommitResult (#1343)
- CommitResult is now an enum containing CommittedTransaction, ConflictedTransaction, and RetryableTransaction
- Add with_data_change to transaction (#1281)
- Engines must use with_data_change on the transaction level instead of passing it to the method. add_files_schema is moved to be scoped on the transaction.
- (catalog-managed) Introduce Committer (with FileSystemCommitter) (#1349)
- Constructing a transaction now requires a committer. Ex: FileSystemCommitter
- Switch scan.execute to return pre-filtered data (#1429)
- Connectors no longer need to filter data that is returned from scan.execute()
- Add visit_string_map to the ffi (#1342)
- Add tags field to LastCheckpointHint (#1455)
- Support writing domain metadata (1/2) (#1274)
- Change input to write_json_file to be FilteredEngineData (#1312)
- Convert DV storage_type to enum (#1366)
- Add latest_commit_file field to LogSegment (#1364)
- No staged commits in checkpoint/compaction (#1374)
- Generate In Commit Timestamp on write (#1314)
- (catalog-managed) Add uc-catalog crate with load_table (#1324)
- Snapshot should not expose delta implementation details (#1339)
- (catalog-managed) Uc-client commit API (#1399)
- Add row tracking support (#1375)
- Support writing domain metadata (2/2) (#1275)
- Add parser for enableTypeWidening table property (#1456)
- Implement From trait EngineData into FilteredEngineData (#1397)
- Unify TableFeatures followups (#1404)
- Accept nullable values in "tags" HashMap in Add action (#1395)
- Enable writes to CDF enabled tables only if append only is supported (#1449)
- Add deletion vector file writer (#1425)
- Allow converting bytes::Bytes into a Binary Scalar (#1373)
- CDF API for FFI (#1335)
- Add optional stats field to remove action (#1390)
- Modify read_actions to not require callers to know details about checkpoints. (#1407)
- Add Accessor for Binary data (#1383)
- Change InCommitTimestamp enablement getter function (#1357)
- Be adaptive to the log schema changing in inspect-table (#1368)
- Typo on variable name for ScanTransformFieldClassifierieldClassifier (#1394)
- Pin cbindgen to 0.29.0 (#1412)
- Unpin cbindgen (#1414)
- Don't return errors from ParsedLogPath::try_from (#1433)
- Doc issue, stray ' (#1445)
- Replace todo!() with proper error handling in deletion vector (#1447)
- Fix scan_metadata docs (#1450)
- Pull out transform spec utils and definitions (#1326)
- Use expression transforms in change data feed (#1330)
- Remove raw pointer indexing and add unit tests for RowIndexBuilder (#1334)
- Make Metadata fields private (#1347)
- Remove storing UUID in LogPathFileType::UuidCheckpoint (#1317)
- Consolidate physical/logical info into StateInfo (#1350)
- Consolidate regular scan and CDF scan field handling (#1359)
- Make get_cdf_transform_expr return Option (#1401)
- Separate domain metadata additions and removals (#1421)
- Unify Reader/WriterFeature into a single TableFeature (#1345)
- Put DataFileMetadata::as_record_batch under #[internal_api] (#1409)
- Create static variables for magic values in deletion vector (#1446)
- E2e test for log compaction (#1308)
- Tombstone expiration e2e test for log compaction (#1341)
- Add memory tests (via DHAT) (#1009)
- One liner to skip read_table_version_hdfs (#1428)
- Add CI for examples (#1393)
- Small typos in log_segment.rs (#1396)
- Reduce log verbosity when encountering non-standard files in _delta_log (#1416)
- Follow up on TODO in log_replay.rs (#1408)
- Remove a stray comment in the kernel visitor (#1457)
- Allow passing more on the command line for all the cli examples (#1352)
- add back arrow-55 support (#1458)
- Rename log_schema to commit_schema (#1419)
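The CommitResult rework in this release (#1343) turns the commit outcome into a three-way enum. The sketch below shows how a connector might branch on such a result. It is a toy model only: the variant names reuse the names from the changelog entry, while the payload fields and the describe helper are hypothetical and do not reflect the kernel's actual definitions.

```rust
// Toy stand-in for the reworked result type. Variant names follow the changelog
// entry; the payload fields are hypothetical, chosen only for illustration.
enum CommitResult {
    CommittedTransaction { version: u64 },
    ConflictedTransaction { conflict_version: u64 },
    RetryableTransaction { reason: String },
}

// Hypothetical helper: map each outcome to the action a connector would take.
fn describe(result: &CommitResult) -> String {
    match result {
        CommitResult::CommittedTransaction { version } => {
            format!("committed at version {version}")
        }
        CommitResult::ConflictedTransaction { conflict_version } => {
            format!("conflict at version {conflict_version}; rebase and try again")
        }
        CommitResult::RetryableTransaction { reason } => {
            format!("transient failure ({reason}); retry the same transaction")
        }
    }
}

fn main() {
    let outcomes = [
        CommitResult::CommittedTransaction { version: 7 },
        CommitResult::ConflictedTransaction { conflict_version: 7 },
        CommitResult::RetryableTransaction { reason: "timeout".to_string() },
    ];
    for outcome in &outcomes {
        println!("{}", describe(outcome));
    }
}
```

An exhaustive match means that adding a further outcome later forces every call site to decide how to handle it.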
v0.16.0 (2025-09-19)
- New expression variants: UnaryExpression and ToJson expression (#1192)
- New SnapshotBuilder API: Snapshot::try_new(...) replaced with Snapshot::builder(...) and its associated methods. TLDR, you make a builder and call build to construct a Snapshot. (#1189)
- Simplify the Expr::Transform API, add FFI support:
- Reworks the pub members of Transform used by Expr::Transform and introduces a new FieldTransform struct. Also reworks Transform::new (constructor) and Transform::with_input_path (method) into a pair of constructors, new_top_level and new_nested.
- Adds two new members to the FFI EngineExpressionVisitor struct, visit_transform_expression and visit_field_transform, which also changes the ordering of existing fields. (#1243)
- Add numRecords to ADD_FILES_SCHEMA (#1235)
- New EngineData trait required method: try_append_columns (#1190)
- Make ColumnType private (#1258)
- Add row tracking writer feature: updates ADD_FILES_SCHEMA (see PR for details) (#1239)
- Migrate Snapshot::try_new_from into SnapshotBuilder::new_from (#1289)
- (FFI) Add CDvInfo struct: The CScanCallback now takes a &CDvInfo and not a &DvInfo. (#1286)
- (FFI) Add explicit numbers for each KernelError enum variant. (see PR for details) (#1313)
- (more) new expression variants: Expression::Variadic and Coalesce expressions (#1198)
- All new/modified StructType constructors, see PR for details (#1278)
- Introduce metadata column API: StructType has new private field (#1266)
- (FFI) engine_data::get_engine_data now takes an AllocateErrorFn instead of an engine. (#1325)
- StructType::into_fields returns DoubleEndedIterator + FusedIterator (#1327)
- (catalog-managed) Add log_tail to list_log_files (#1194)
- CommitInfo sets a txnId (#1262)
- Allow LargeUTF8 -> String and LargeBinary -> Binary in arrow conversion (#1294)
- Implement log compaction (#1234)
- Disallow equal version in log compaction (#1309)
- Add Iterable to StructType (#1287)
- ParsedLogPath for staged commits (#1305)
- Default expression eval supports nested transforms (#1247)
- Introduce row index metadata column (#1272)
- Update README.md to enhance FFI documentation (#1237)
- Make checkpoint visitor more efficient using short circuiting (#1203)
- Factor out a method for LastCheckpointHint path generation (#1228)
- Do not guess Vec size for checkpoints (#1263)
- Introduce current_time_ms() helper (#1256)
- Retention calculation into a new trait (#1264)
- Minor Refactoring in Log Compaction (#1301)
- Rename SnapshotBuilder::new to new_for (#1306)
- Move log replay into the action reconciliation module (#1295)
- Introduce SnapshotRef type alias (#1299)
- Row tracking write cleanup (#1291)
- Update invalid-handle tests for rustc 1.90 (#1321)
- Create expression benchmark for default engine (#1220)
- Update changelog for 0.15.1 release (#1227)
- Sync changelog for 0.15.2 (#1251)
- Update data types test to validate full Arrow error message (#1259)
- Add better panic message when not OK (#1293)
- Add test for empty commits and clean up test error types (#1252)
- Update contributing.md (#1206)
v0.15.2 (2025-09-03)
- Pin comfy-table at 7.1.4 to restore kernel MSRV (#1231)
- Arrow json decoder fix for breakage on long json string (#1244)
v0.15.1 (2025-08-28)
- Make ListedLogFiles::try_new internal-api (again) (#1226)
v0.15.0 (2025-08-28)
- Rename default-engine feature to default-engine-native-tls (#1100)
- Add arrow 56 support, drop arrow 54 (#1141)
- Add catalogManaged (and catalogOwned-preview) table features + catalog-managed experimental feature flag (#1165)
- ExpressionRef instead of owned Expression for transforms (#1171): Expression::Struct now takes a Vec<ExpressionRef> instead of Vec<Expression>
- Add support for Column Mapping Id Mode (#1056): significantly changes the semantics (Engine trait requirements) of the parquet handler in column mapping id mode. See ParquetHandler::read_parquet_files docs for details.
- StructField.physical_name is no longer public (internal-api) (#1186)
- Add support for sparse transform expressions (#1199): adds a new Expression::Transform variant.
- Expression evaluators take ExpressionRef as input (#1221): EvaluationHandler::new_expression_evaluator and EvaluationHandler::new_predicate_evaluator take Arc instead of owned expression/predicate. scan::state::transform_to_logical takes owned Option<ExpressionRef> instead of a borrowed reference. transaction::WriteContext::logical_to_physical returns an Arc instead of a borrowed reference.
- Impl IntoEngineData for Protocol action (#1136)
- Add txnId to commit info (#1148)
- (catalog-managed) Experimental uc client (#1164)
- Implement IntoEngineData for DomainMetadata (#1169)
- Add example for table writes (#1119)
- (ffi) Add visit_expression_literal_date (#1096)
- Match arrow versions in examples (#1166)
- Support arrow views in ensure_data_types (#1028)
- Make ListedLogFiles internal-api again (#1209)
- Provide accurate error when evaluating a different type in LiteralExpressionTransform (#1207)
- Fix failing test and improve indentation test error message (#1135)
- Contiguous commit file checking inside ListedLogFiles::try_new() (#1107)
- New listed_log_files module (#1150)
- Move LastCheckpointHint to separate module (#1154)
- (catalog-managed) Push down _last_checkpoint read into LogSegment (#1204)
- Add metadata-only regression test (#1183)
- Parameterize column mapping tests to check different modes (#1176)
- Add apply_schema mismatch test (#1210)
- Appease clippy in rustc 1.89 (#1151)
- Bump MSRV to 1.84 (#1142)
- Remove object store versioning (#1161)
- Remove unused deps from examples (#1175)
- Update deps (#1181)
v0.14.0 (2025-08-01)
- Removed Table APIs: instead use Snapshot and Transaction directly. (#976)
- Add support for Variant type and the variantType table feature (new DataType::Variant enum variant and new variantType-preview and variantShredding Reader/Writer features) (#1015)
- Expose post commit stats. Now, in Transaction::commit the Committed variant of the enum includes a post_commit_stats field with info about the commits since checkpoint and log compaction. (#1079)
- Replace Transaction::with_commit_info() API with with_engine_info() API (#997)
- Removed DataType::decimal_unchecked API (#1087)
- make_physical takes column mapping and sets parquet field ids. Breaking: (1) StructField::make_physical is now an internal_api instead of a public function. Its signature has also changed. (2) If ColumnMappingMode is None, then the physical schema's name is the logical name. Previously, kernel would unconditionally use the column mapping physical name, even if column mapping mode is none. (#1082)
- (ffi) Added default-engine-rustls feature and extern "C" for .h file (#1023)
- Add log segment constructor for timestamp to version conversion (#895)
- Expose unshredded variant type as DataType::unshredded_variant() (#1086)
- New ffi API for get_domain_metadata() (#1041)
- Add append functions to ffi (#962)
- Add try_new and IntoEngineData for Metadata action (#1122)
- Rename object_store PutMultipartOpts (#1071, #1090)
- Use object_store >= 0.12.3 for arrow 55 feature (#1117)
- VARIANT follow-ups for SchemaTransform etc (#1106)
- Downgrade stale _last_checkpoint log from warn! to info! (#777)
- Exclude tests/data from release (#1092)
- Deny panics in prod code (#1113)
- Add derive macro tests (#514)
- Add unshredded variant read test (#1088)
- (ffi) AllocateErrorFn should be able to allocate a nullptr (#1105)
- Assert tests on error message instead of is_err() (#1110)
- Expose Snapshot and ListedLogFiles constructors behind internal api flag (#1076)
- Only semver check released crates (#1101)
v0.13.0 (2025-07-11)
- Add support for opaque engine expressions. Includes a number of changes: new ExpressionTypes (OpaqueExpression, OpaquePredicate, Unknown) and Expression/Predicate variants (Opaque, Unknown), and visitors, transforms, and evaluators changed to support opaque/unknown expressions/predicate. (#686)
- Rename Transaction::add_write_metadata to Transaction::add_files (#1019)
- add ability to only retain SetTransaction actions <= SetTransactionRetentionDuration (#1013)
- (ffi) Add timetravel by version number (#1044)
- Introduce a crate for args that are common between examples (#1046)
- Support reordering structs that are inside maps in default parquet reader (#1060)
- Add default engine support for arrow eval of opaque expressions (#980)
- Expose descriptive fields on Metadata action (#1051)
- Clippy fmt cleanup (#1042)
- Examples: move logic into the thread::scope call so examples don't hang (#1040)
- Remove panic from read_last_checkpoint (#1022)
- Always write _last_checkpoint with parts = None (#1053)
- Don't release common crate (used only by example programs) (#1065)
- Move various test util functions to test-utils crate (#985)
- Define and use a cow helper for transforms (#1057)
- Expand capability and usage of Cow helper for transforms (#1061)
v0.12.1 (2025-06-05)
- Remove azure suffix range request (#1006)
v0.12.0 (2025-06-04)
- Remove GlobalScanState: instead use new Scan APIs directly (logical_schema, physical_schema, etc.) (#947)
- Table feature enums are now internal_api (not public, unless internal-api flag is set) (#998)
- Use compacted log files in log-replay (#950)
- New #[derive(IntoEngineData)] proc macro (#830)
- Add support for kernel default expression evaluation (#979)
- New: panic in debug builds if ListedLogFiles breaks invariants (#986)
- Create visitor for getting In-commit Timestamp (#897)
- Binary searching utility function for timestamp to version conversion (#896)
- Enable "TimestampWithoutTimezone" table feature and add protocol validation for it (#988)
- add missing reader/writer features (variantType/clustered) (#998)
- Disable timestamp column's maxValues for data skipping (#1003)
- Make KernelPredicateEvaluator trait dyn-compatible (#994)
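
The timestamp-to-version conversion listed above (#896) can be pictured as a binary search over commits ordered by version. The sketch below is only an illustration, assuming commit timestamps never decrease as the version increases. The function name `latest_version_at_or_before` and the `(version, timestamp)` slice are hypothetical and are not the kernel's API.

```rust
/// Hypothetical helper (not the kernel's API): `commits` holds
/// (version, timestamp_ms) pairs sorted by version, with non-decreasing
/// timestamps. Returns the latest version whose timestamp is <= `ts`,
/// or None if every commit is newer than `ts`.
fn latest_version_at_or_before(commits: &[(u64, i64)], ts: i64) -> Option<u64> {
    // partition_point binary-searches for the first commit with timestamp > ts.
    let idx = commits.partition_point(|&(_, commit_ts)| commit_ts <= ts);
    // The commit just before that point (if any) is the answer.
    idx.checked_sub(1).map(|i| commits[i].0)
}
```

When several commits share a timestamp, this sketch resolves to the highest such version; a real implementation may choose differently.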
v0.11.0 (2025-05-27)
- Add in-commit timestamp table feature (#894)
- Make Error non_exhaustive (will reduce future breaking changes!) (#913)
- Scalar::Map support (#881)
  - New Scalar::Map(MapData) variant and MapData struct to describe Scalar maps.
  - New visit_literal_map FFI
- Split out predicates as different from expressions (#775): pervasive change which moves some expressions to new predicate type.
- Bump MSRV from 1.81 to 1.82 (#942)
- DataSkippingPredicateEvaluator's associated types TypedStat and IntStat combined into one ColumnStat type (#939)
- Code movement in FFI crate (#940):
  - Rename ffi::expressions::engine mod as kernel_visitor
  - Rename ffi::expressions::kernel mod as engine_visitor
  - Move the free_kernel_[expression|predicate] functions to the expressions mod
  - Move the EnginePredicate struct to the ffi::scan module
- Fix timestamp ntz in physical to logical cdf (#948): now TableChangesScan::execute returns a schema with _commit_timestamp of type Timestamp (UTC) instead of TimestampNtz.
- Add TryIntoKernel/Arrow traits (#946): Removes old From/Into implementations for kernel schema types, replaces with TryFromKernel/TryIntoKernel/TryFromArrow/TryIntoArrow. Migration should be as simple as changing a .try_into() to a .try_into_kernel() or .try_into_arrow().
- Remove SyncEngine (now test-only), use DefaultEngine everywhere else (#957)
- Add Snapshot::checkpoint() & Table::checkpoint() API (#797)
- Add CRC ParsedLogPath (#889)
- Use arrow array builders in Scalar::to_array (#905)
- Add domainMetadata read support (#875)
- Support maps and arrays in literal_expression_transform (#882)
- Add CheckpointWriter::finalize() API (#851)
- DataSkippingPredicate dyn compatible (#939): finish_eval_pred_junction now takes &dyn Iterator
- Store compacted log files in LogSegment (#936)
- Add CRC, FileSizeHistogram, and DeletedRecordCountsHistogram schemas (#917)
- Scan from previous result (#829)
- Include latest CRC in LogSegment (#964)
- CRC protocol+metadata visitor (#972)
- Make several types/function pub and fix their doc comments (#977)
  - KernelPredicateEvaluator and KernelPredicateEvaluatorDefaults are now pub.
  - DataSkippingPredicateEvaluator is now pub.
  - add new type aliases DirectDataSkippingPredicateEvaluator and IndirectDataSkippingPredicateEvaluator
  - Arrow engine evaluate_expression and evaluate_predicate are now pub.
  - Expression::predicate renamed to Expression::from_pred
- Fix incorrect results for Scalar::Array::to_array (#905)
- Use object_store::Path::from_url_path when appropriate (#924)
- Don't include modules via a macro (#935)
- Rustc 1.87 clippy fixes (#955)
- Allow CheckpointDataIterator to be used across await (#961)
- Remove target-cpu=native rustflags (#960)
- Rename drop_null_container_values to allow_null_container_values (#965)
- Make ActionsBatch fields pub for internal-api (#983)
- Add readme badges (#904)
- Combine actions counts in CheckpointVisitor (#883)
- Simplify Display for Expression and Predicate (#938)
- Macro traits cleanup (#967)
- Remove redundant binary predicate operations (#949)
- Make arrow predicate eval directly invertible (#956)
- Add ActionsBatch (#974)
- Remove abs_diff since we have rust 1.81 (#909)
- Conditional compilation instead of suppressing clippy warnings (#945)
- Expose some more arrow utils via internal-api (#971)
- Use consistent naming of kernel data type in arrow eval tests (#978)
- Cargo doc workspace + all-features (#981)
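
Marking Error as non_exhaustive (#913) means downstream matches need a catch-all arm. The sketch below shows the general Rust pattern with a stand-in enum. `LibError` and its variants are hypothetical and do not mirror the kernel's actual Error type.

```rust
/// Stand-in for a library error type (hypothetical, not the kernel's Error).
#[non_exhaustive]
#[derive(Debug)]
pub enum LibError {
    FileNotFound(String),
    Unsupported(String),
}

/// Downstream code keeps a wildcard arm so that variants added in later
/// releases do not break the match.
// Inside the defining crate the wildcard is redundant, hence the allow;
// in a dependent crate the compiler requires it.
#[allow(unreachable_patterns)]
pub fn describe(err: &LibError) -> String {
    match err {
        LibError::FileNotFound(path) => format!("missing file: {path}"),
        LibError::Unsupported(what) => format!("unsupported: {what}"),
        _ => "other error".to_string(),
    }
}
```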
v0.10.0 (2025-04-28)
- Updated dependencies, breaking updates: itertools 0.14, thiserror 2, and strum 0.27 (#814)
- Rename developer-visibility feature flag to internal-api (#834)
- Tidy up AND/OR/NOT API and usage (#842)
- Rename VariadicExpression to JunctionExpression (#841)
- Enforce precision/scale correctness of Decimal types and values (#857)
- Expression system refactors
  - Make literal expressions more strict (removed Into trait impl) (#867)
  - Remove nearly-unused expression lt_eq/gt_eq overloads (#871)
  - Move expression transforms (ExpressionTransform and ExpressionDepthChecker) to own module (#878)
  - Code movement in expression-related code (Reordered variants of the BinaryExpressionOp enum) (#879)
- Introduce the ability for consumers to add ObjectStore url handlers (#873)
- Update to arrow 55, drop arrow 53 support (#885, #903)
- Add CheckpointVisitor in new checkpoint mod (#738)
- Add CheckpointLogReplayProcessor in new checkpoints mod (#744)
- Add transaction.with_transaction_id() API (#824)
- Add snapshot.get_app_id_version(app_id, engine) (#862)
- Overwrite logic in write_json_file for default & sync engine (#849)
- default engine: Sort list results based on URL scheme (#820)
- impl AllocateError for T: ExternEngine (#856)
- Disable predicate pushdown in Scan::execute (#861)
- Correct docstring for DefaultEngine::new (#821)
- Remove acceptance from rust-analyzer.cargo.features in README (#858)
- Rename predicates mod to kernel_predicates (#822)
- Code movement to tidy up ffi (#840)
- Grab bag of cosmetic tweaks and comment updates (#848)
- New #[internal_api] macro instead of visibility crate (#835)
- Expression transforms use new recurse_into_children helper (#869)
- Minor test improvements (#872)
- Remove unused dependencies (#863)
- Test code uses Expr shorthand for Expression (#866)
- Arrow DefaultExpressionEvaluator need not box its inner expression (#868)
v0.9.0 (2025-04-08)
- Change MetadataValue::Number(i32) to MetadataValue::Number(i64) (#733)
- Get prefix from offset path: DefaultEngine::new no longer requires a table_root parameter and list_from consistently returns keys greater than the offset (#699)
- Make snapshot.schema() return a SchemaRef (#751)
- Make visit_expression_internal private, and unwrap_kernel_expression pub(crate) (#767)
- Make actions types pub(crate) instead of pub (#405)
- New null_row ExpressionHandler API (#662)
- Rename enums ReaderFeatures -> ReaderFeature and WriterFeatures -> WriterFeature (#802)
- Remove get_ prefix from engine getters (#804)
- Rename FileSystemClient to StorageHandler (#805)
- Adopt types for table features (New ReaderFeature::Unknown(String) and WriterFeature::Unknown(String)) (#684)
- Renamed ScanData to ScanMetadata (#817)
  - rename ScanData to ScanMetadata
  - rename Scan::scan_data() to Scan::scan_metadata()
  - (ffi) rename free_kernel_scan_data() to free_scan_metadata_iter()
  - (ffi) rename kernel_scan_data_next() to scan_metadata_next()
  - (ffi) rename visit_scan_data() to visit_scan_metadata()
  - (ffi) rename kernel_scan_data_init() to scan_metadata_iter_init()
  - (ffi) rename KernelScanDataIterator to ScanMetadataIterator
  - (ffi) rename SharedScanDataIterator to SharedScanMetadataIterator
- ScanMetadata is now a struct (instead of tuple) with new FilteredEngineData type (#768)
- (v2Checkpoint) Extract & insert sidecar batches in replay's action iterator (#679)
- Support the v2Checkpoint reader/writer feature (#685)
- Add check for whether appendOnly table feature is supported or enabled (#664)
- Add basic partition pruning support (#713)
- Add DeletionVectors to supported writer features (#735)
- Add writer version 2/invariant table feature support (#734)
- Improved pre-signed URL checks (#760)
- Add CheckpointMetadata action (#781)
- Add classic and uuid parquet checkpoint path generation (#782)
- New Snapshot::try_new_from() API (#549)
- Return Error::unsupported instead of panic in Scalar::to_array(MapType) (#757)
- Remove 'default-members' in workspace, default to all crates (#752)
- Update compilation error and clippy lints for rustc 1.86 (#800)
- Split up arrow_expression module (#750)
- Flatten deeply nested match statement (#756)
- Simplify predicate evaluation by supporting inversion (#761)
- Rename LogSegment::replay to LogSegment::read_actions (#766)
- Extract deduplication logic from AddRemoveDedupVisitor into embeddable FileActionsDeduplicator (#769)
- Move testing helper function to test_utils mod (#794)
- Rename _last_checkpoint from CheckpointMetadata to LastCheckpointHint (#789)
- Use ExpressionTransform instead of adhoc expression traversals (#803)
- Extract log replay processing structure into LogReplayProcessor trait (#774)
- Add V2 checkpoint read support integration tests (#690)
- Use maintained action to setup rust toolchain (#585)
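
The list_from change in this release (#699) states that listing returns keys greater than the offset. A minimal in-memory sketch of that contract follows, assuming "greater" means strictly greater in lexicographic order. The free function `list_from` and the BTreeSet backing store here are hypothetical stand-ins, not the default engine's implementation.

```rust
use std::collections::BTreeSet;
use std::ops::Bound;

/// Hypothetical in-memory stand-in for a storage listing: returns the keys
/// that sort strictly after `offset`, in ascending order.
fn list_from<'a>(keys: &'a BTreeSet<String>, offset: &str) -> Vec<&'a str> {
    keys.range::<str, _>((Bound::Excluded(offset), Bound::Unbounded))
        .map(String::as_str)
        .collect()
}
```

Because Delta log file names are zero-padded, lexicographic order matches version order, which is why an offset path works as a starting point for listing commits.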
v0.8.0 (2025-03-04)
- ffi: get_partition_column_count and get_partition_columns now take a Snapshot instead of a Scan (#697)
- ffi: expression visitor callback visit_literal_decimal now takes i64 for the upper half of a 128-bit int value (#724)
- DefaultJsonHandler::with_readahead() renamed to DefaultJsonHandler::with_buffer_size() (#711)
- DefaultJsonHandler's defaults changed:
  - default buffer size: 10 => 1000 requests/files
  - default batch size: 1024 => 1000 rows
- Bump MSRV to rustc 1.81 (#725)
- Pin chrono version to fix arrow compilation failure (#719)
- Replace default engine JSON reader's FileStream with concurrent futures (#711)
v0.7.0 (2025-02-24)
- Read transforms are now communicated via expressions (#607, #612, #613, #614). This includes:
  - ScanData now includes a third tuple field: a row-indexed vector of transforms to apply to the EngineData.
  - Adds a new scan::state::transform_to_logical function that encapsulates the boilerplate of applying the transform expression
  - Removes scan_action_iter API and logical_to_physical API
  - Removes column_mapping_mode from GlobalScanState
  - ffi: exposes methods to get an expression evaluator and evaluate an expression from c
  - read-table example: Removes add_partition_columns in arrow.c
  - read-table example: adds an apply_transform function in arrow.c
- ffi: support field nullability in schema visitor (#656)
- ffi: expose metadata in SchemaEngineVisitor ffi api (#659)
- ffi: new visit_schema FFI now operates on a Schema instead of a Snapshot (#683, #709)
- Introduced feature flags (arrow_54 and arrow_53) to select major arrow versions (#654, #708, #717)
- Read partition_values in RemoveVisitor and remove break in RowVisitor for RemoveVisitor (#633)
- Add the in-commit timestamp field to CommitInfo (#581)
- Support NOT and column expressions in eval_sql_where (#653)
- Add check for schema read compatibility (#554)
- Introduce TableConfiguration to jointly manage metadata, protocol, and table properties (#644)
- Add visitor SidecarVisitor and Sidecar action struct (#673)
- Add in-commit timestamps table properties (#558)
- Support writing to writer version 1 (#693)
- ffi: new logical_schema FFI to get the logical schema of a snapshot (#709)
- Incomplete multi-part checkpoint handling when no hint is provided (#641)
- Consistent PartialEq for Scalar (#677)
- Cargo fmt does not handle mods defined in macros (#676)
- Ensure properly nested null masks for parquet reads (#692)
- Handle predicates on non-nullable columns without stats (#700)
- Update readme to reflect tracing feature is needed for read-table (#619)
- Clarify JsonHandler semantics on EngineData ordering (#635)
- Make [non] nullable struct fields easier to create (#646)
- Make eval_sql_where available to DefaultPredicateEvaluator (#627)
- Port cdf tests from delta-spark to kernel (#611)
v0.6.1 (2025-01-10)
- New feature flag default-engine-rustls (#572)
- Allow partition value timestamp to be ISO8601 formatted string (#622)
- Fix stderr output for handle tests (#630)
v0.6.0 (2024-12-17)
API Changes
Breaking
- Scan::execute takes an Arc<dyn EngineData> now (#553)
- StructField::physical_name no longer takes a ColumnMapping argument (#543)
- removed ColumnMappingMode Default implementation (#562)
- Remove lifetime requirement on Scan::execute (#588)
- scan::Scan::predicate renamed as physical_predicate to eliminate ambiguity (#512)
- scan::log_replay::scan_action_iter now takes fewer (and different) params. (#512)
- Expression::Unary, Expression::Binary, and Expression::Variadic now wrap a struct of the same name containing their fields (#530)
- Moved delta_kernel::engine::parquet_stats_skipping module to delta_kernel::predicate::parquet_stats_skipping (#602)
- New Error variants Error::ChangeDataFeedIncompatibleSchema and Error::InvalidCheckpoint (#593)
Additions
- Ability to read a table's change data feed with new TableChanges API! See new table_changes module as well as the 'read-table-changes' example (#597). Changes include:
  - Implement Log Replay for Change Data Feed (#540)
  - ScanFile expression and visitor for CDF (#546)
  - Resolve deletion vectors to find inserted and removed rows for CDF (#568)
  - Helper methods for CDF Physical to Logical Transformation (#579)
  - TableChangesScan::execute and end to end testing for CDF (#580)
  - TableChangesScan::schema method to get logical schema (#589)
- Enable relaying log events via FFI (#542)
Implemented enhancements:
- Define an ExpressionTransform trait (#530)
- [chore] appease clippy in rustc 1.83 (#557)
- Simplify column mapping mode handling (#543)
- Adding some more miri tests (#503)
- Data skipping correctly handles nested columns and column mapping (#512)
- Engines now return FileMeta with correct millisecond timestamps (#565)
Fixed bugs:
- don't use std abs_diff, put it in test_utils instead, run tests with msrv in action (#596)
- (CDF) Add fix for sv extension (#591)
- minimal CI fixes in arrow integration test and semver check (#548)
v0.5.0 (2024-11-26)
API Changes
Breaking
- Expression::Column(String) is now Expression::Column(ColumnName) #400
- delta_kernel_ffi::expressions moved into two modules: delta_kernel_ffi::expressions::engine and delta_kernel_ffi::expressions::kernel #363
- FFI: removed (hazardous) impl From for KernelStringSlice and added unsafe constructor instead #441
- Moved LogSegment into its own module (log_segment::LogSegment) #438
- Renamed EngineData::length as EngineData::len #471
- New AsAny trait: AsAny: Any + Send + Sync required bound on all engine traits #450
- Rename mod features to mod table_features #454
- LogSegment fields renamed: commit_files -> ascending_commit_files and checkpoint_files -> checkpoint_parts #495
- Added minimum-supported rust version: currently rust 1.80 #504
- Improved row visitor API: renamed EngineData::extract as EngineData::visit_rows, and DataVisitor trait renamed as RowVisitor #481
- FFI: New mod engine_data and mod error (moved Error to error::Error) #537
- new error types: InvalidProtocol, InvalidCommitInfo, MissingCommitInfo, FileAlreadyExists, Unsupported, ParseIntervalError, ChangeDataFeedUnsupported
Additions
- New ColumnName, column_name!, column_expr! for structured column name parsing. #400 #467
- New Engine API write_json_file() for atomically writing JSON #370
- New Transaction API for creating transactions, adding commit info and write metadata, and committing the transaction to the table. Includes Table.new_transaction(), Transaction.write_context(), Transaction.with_commit_info, Transaction.with_operation(), Transaction.with_write_metadata(), and Transaction.commit() #370 #393
- FFI: Visitor for converting kernel expressions to engine expressions. See the new example at ffi/examples/visit-expression/ #363
- FFI: New TryFromStringSlice trait and kernel_string_slice macro #441
- New DefaultEngine engine implementation for writing parquet: write_parquet_file() #393
- Added support for parsing comma-separated column name lists: ColumnName::parse_column_name_list() #458
- New VacuumProtocolCheck table feature #454
- DvInfo now implements Clone, PartialEq, and Eq #468
- Stats now implements Debug, Clone, PartialEq, and Eq #468
- Added Cdc action support #506
- (early CDF read support) New TableChanges type to read CDF from a table between versions #505
- (early CDF read support) Builder for scans on TableChanges #521
- New TableProperties struct which can parse tables' metadata.configuration #453 #536
Implemented enhancements:
- FFI examples now use AddressSanitizer #447
- ColumnName now tracks a path of field names instead of a simple string #445
- use ParsedLogPaths for files in LogSegment #472
- FFI: added Miri support for tests #470
- check table URI has trailing slash #432
- build cargo docs in CI #479
- new test-utils crate #477
- added proper protocol validation (both parsing correctness and semantic correctness) #454 #493
- harmonize predicate evaluation between delta stats and parquet footer stats #420
- more log path tests #485
- ensure_read_supported and ensure_write_supported APIs #518
- include NOTICE and LICENSE in published crates #520
- FFI: factored out read_table kernel utils into kernel_utils.h/c #539
- simplified log replay visitor and avoid materializing Add/Remove actions #494
- simplified schema transform API #531
- support arrow view types in conversion from ArrowDataType to kernel's DataType #533
Fixed bugs:
- Disabled missing-column row group skipping: The optimization to treat a physically missing column as all-null is unsound, if the schema was not already verified to prove that the table's logical schema actually includes the missing column. We disable it until we can add the necessary validation. #435
- fixed leaks in read_table FFI example #449
- fixed read_table compilation on windows #455
- fixed various predicate eval bugs #420
v0.4.1 (2024-10-28)
API Changes
None.
Fixed bugs:
- Disabled missing-column row group skipping: The optimization to treat a physically missing column as all-null is unsound, if the schema was not already verified to prove that the table's logical schema actually includes the missing column. We disable it until we can add the necessary validation. #435
v0.4.0 (2024-10-23)
API Changes
Breaking
- pub ScanResult.mask field made private and only accessible as ScanResult.raw_mask() method #374
- new ReaderFeatures enum variant: TypeWidening and TypeWideningPreview #335
- new WriterFeatures enum variant: TypeWidening and TypeWideningPreview #335
- new Error enum variant: InvalidLogPath when kernel is unable to parse the name of a log path #347
- Module moved: mod delta_kernel::transaction -> mod delta_kernel::actions::set_transaction #386
- change default-feature to be none (removed sync-engine by default. If downstream users relied on this, turn on sync-engine feature or specific arrow-related feature flags to pull in the pieces needed) #339
- Scan's execute(..) method now returns a lazy iterator instead of materializing a Vec<ScanResult>. You can trivially migrate to the new API (and force eager materialization by using .collect() or the like on the returned iterator) #340
- schema and expression FFI moved to their own mod delta_kernel_ffi::schema and mod delta_kernel_ffi::expressions #360
- Parquet and JSON readers in Engine trait now take Arc<Expression> (aliased to ExpressionRef) instead of Expression #364
- StructType::new(..) now takes an impl IntoIterator<Item = StructField> instead of Vec<StructField> #385
- DataType::struct_type(..) now takes an impl IntoIterator<Item = StructField> instead of Vec<StructField> #385
- removed DataType::array_type(..) API: there is already an impl From<ArrayType> for DataType #385
- Expression::struct_expr(..) renamed to Expression::struct_from(..) #399
- lots of expressions take impl Into<Self> or impl Into<Expression> instead of just Self/Expression now #399
- remove log_replay_iter and process_batch APIs in scan::log_replay #402
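
The move of Scan's execute(..) to a lazy iterator (#340) follows a common Rust pattern, and the .collect() migration mentioned above can be sketched with stand-in types. `Batch`, `execute_lazy`, and `execute_eager` below are hypothetical names and do not reflect the kernel's real signatures.

```rust
/// Stand-in for a scan result (hypothetical, not the kernel's ScanResult).
#[derive(Debug, PartialEq)]
struct Batch(u32);

/// Hypothetical lazy API: yields results one at a time, each of which may fail.
fn execute_lazy(n: u32) -> impl Iterator<Item = Result<Batch, String>> {
    (0..n).map(|i| Ok(Batch(i)))
}

/// Eager migration path: collect into a Vec, stopping at the first error.
fn execute_eager(n: u32) -> Result<Vec<Batch>, String> {
    execute_lazy(n).collect()
}
```

Collecting into `Result<Vec<_>, _>` restores the old all-at-once behavior, while callers that only need a prefix can use adapters such as `take` and avoid producing the rest.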
Additions
- remove feature flag requirement for impl GetData on () #334
- new full_mask() method on ScanResult #374
- StructType::try_new(fields: impl IntoIterator<Item = StructField>) #385
- DataType::try_struct_type(fields: impl IntoIterator<Item = StructField>) #385
- StructField.metadata_with_string_values(&self) -> HashMap<String, String> to materialize and return our metadata into a hashmap #331
Implemented enhancements:
- support reading tables with type widening in default engine #335
- add predicate to protocol and metadata log replay for pushdown #336 and #343
- support annotation (macro) for nullable values in a container (for #[derive(Schema)]) #342
- new ParsedLogPath type for better log path parsing #347
- implemented row group skipping for default engine parquet readers and new utility trait for stats-based skipping logic #357, #362, #381
- depend on wider arrow versions and add arrow integration testing #366 and #413
- added semver testing to CI #369, #383, #384
- new SchemaTransform trait and usage in column mapping and data skipping #395 and #398
- arrow expression evaluation improvements #401
- replace panics with to_compiler_error in macros #409
Fixed bugs:
- output of arrow expression evaluation now applies/validates output schema in default arrow expression handler #331
- add arrow-buffer to arrow-expression feature #332
- fix bug with out-of-date last checkpoint #354
- fixed broken sync engine json parsing and harmonized sync/async json parsing #373
- filesystem client now always returns a sorted list #344
v0.3.1 (2024-09-10)
API Changes
Additions
- Two new binary expressions: In and NotIn, as well as a new Scalar::Array variant to represent arrays in the expression framework #270. NOTE: exact API for these expressions is still evolving.
Implemented enhancements:
- Enabled more golden table tests #301
Fixed bugs:
- Allow kernel to read tables with invalid _last_checkpoint #311
- List log files with checkpoint hint when constructing latest snapshot (when version requested is None) #312
- Fix incorrect offset value when computing list offsets #327
- Fix metadata string conversion in default engine arrow conversion #328
v0.3.0 (2024-08-07)
API Changes
Breaking
- delta_kernel::column_mapping module moved to delta_kernel::features::column_mapping #222
Additions
- New deletion vector API row_indexes (and accompanying FFI) to get row indexes instead of selection vector of deleted rows. This can be more efficient for sparse DVs. #215
- Typed table features: ReaderFeatures, WriterFeatures enums and has_reader_feature/has_writer_feature API #222
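
To illustrate why row indexes can be cheaper than a selection vector for sparse deletion vectors (#215), the sketch below converts between the two representations. Both helper functions are hypothetical, and the convention that `true` means "row is kept" is an assumption made for this example.

```rust
/// Convert a per-row selection vector (assumed: true = row is kept) into the
/// indexes of the deleted rows.
fn deleted_row_indexes(selection: &[bool]) -> Vec<u64> {
    selection
        .iter()
        .enumerate()
        .filter(|(_, keep)| !**keep)
        .map(|(i, _)| i as u64)
        .collect()
}

/// Expand deleted-row indexes back into a selection vector of `num_rows` entries.
/// Indexes at or beyond `num_rows` are ignored.
fn selection_vector(deleted: &[u64], num_rows: usize) -> Vec<bool> {
    let mut selection = vec![true; num_rows];
    for &idx in deleted {
        if let Some(slot) = selection.get_mut(idx as usize) {
            *slot = false;
        }
    }
    selection
}
```

With a handful of deletions in a large file, the index list holds only those few entries, whereas the selection vector always needs one entry per row.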
Implemented enhancements:
- Add --limit option to example read-table-multi-threaded #297
- FFI now built with cmake. Move to using the read-test example as an ffi-test. And building on macos. #288
- Golden table tests migrated from delta-spark/delta-kernel java #295
- Code coverage implemented via cargo-llvm-cov and reported with codecov #287
- All tests enabled to run in CI #284
- Updated DAT to 0.3 #290
Fixed bugs:
- Evaluate timestamps as "UTC" instead of "+00:00" for timezone #295
- Make Map arrow type field naming consistent with parquet field naming #299
v0.2.0 (2024-07-17)
API Changes
Breaking
- The scan callback if using visit_scan_files now takes an extra Option<Stats> argument, holding top level stats for associated scan file. You will need to add this argument to your callback.
  Likewise, the callback in the ffi code also needs to take a new argument which is a pointer to a Stats struct, and which can be null if no stats are present.
Additions
- You can call scan_builder() directly on a snapshot, for more convenience.
- You can pass a URL starting with "hdfs" or "viewfs" to the default client to read using hdfs_native_store
Implemented enhancements:
- Handle nested structs in schemaString (allows reading iceberg compat tables) #257
- Expose top level stats in scans #227
- Hugely expanded C-FFI example #203
- Add scan_builder function to Snapshot #273
- Add hdfs_native_store support #273
- Proper reading of Parquet files, including only reading requested leaves, type casting, and reordering #271
- Allow building the package if you are behind an https proxy #282
Fixed bugs:
- Don't error if more fields exist than expected in a struct expression #267
- Handle cases where the deletion vector length is less than the total number of rows in the chunk #276
- Fix partition map indexing if column mapping is in effect #278
v0.1.1 (2024-06-03)
Implemented enhancements:
- Support unary NOT and IsNull for data skipping #231
- Add unary visitors to c ffi #247
- Minor other QOL improvements
v0.1.0 (2024-06-12)
Initial public release