Commit 7263d40
amend! Read metadata and protocol information from Delta checksum files
Read metadata and protocol information from Delta checksum files
Compliant Delta writers may emit optional checksum files alongside
commits containing metadata and protocol information. Instead of
loading the latest checkpoint and replaying intervening commits (which
can be expensive, especially for large v1 checkpoints), Trino can read
the latest commit’s checksum file to obtain this information with a
single listing and a small JSON read. See
https://github.com/delta-io/delta/blob/master/PROTOCOL.md#version-checksum-file
If the checksum file is missing or does not contain both metadata and
protocol, we fall back to the existing Delta log scanning approach.
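
The fallback described above can be sketched roughly as follows. This is an
illustrative Python sketch, not the Trino implementation: the function names
and the `scan_log` callback are hypothetical, the `<20-digit version>.crc`
file name and the `metadata` / `protocol` keys are taken from my reading of
the linked protocol section, and treating unparseable JSON like a missing
file is an assumption the commit message does not state.

```python
import json
from pathlib import Path
from typing import Callable, Optional, Tuple

MetadataAndProtocol = Tuple[dict, dict]


def checksum_path(log_dir: Path, version: int) -> Path:
    # Assumption: checksum files sit next to commits in _delta_log and are
    # named with the zero-padded 20-digit version plus a ".crc" suffix.
    return log_dir / f"{version:020d}.crc"


def read_from_checksum(log_dir: Path, version: int) -> Optional[MetadataAndProtocol]:
    """Return (metadata, protocol) from the checksum file, or None if unusable."""
    try:
        content = json.loads(checksum_path(log_dir, version).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        # Missing file; malformed JSON is handled the same way (assumption).
        return None
    if not isinstance(content, dict):
        return None
    metadata = content.get("metadata")
    protocol = content.get("protocol")
    if metadata is None or protocol is None:
        # Both entries are needed; otherwise the caller must fall back.
        return None
    return metadata, protocol


def load_metadata_and_protocol(
    log_dir: Path,
    version: int,
    scan_log: Callable[[Path, int], MetadataAndProtocol],
) -> MetadataAndProtocol:
    """Prefer the checksum file; otherwise use the existing log scan.

    `scan_log` is a hypothetical stand-in for loading the latest checkpoint
    and replaying the intervening commits.
    """
    from_checksum = read_from_checksum(log_dir, version)
    if from_checksum is not None:
        return from_checksum
    return scan_log(log_dir, version)
```

Passing the log scan in as a callback keeps the sketch self-contained; in
practice the fast path and the fallback would share one snapshot-loading
code path.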
This behavior is gated by the session property
load_metadata_from_checksum_file (which defaults to the config property
delta.load_metadata_from_checksum_file, itself defaulting to false).
In internal testing, analysis time for large v1-checkpoint tables
dropped from ~10s to <500ms.
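
The gating can be pictured with a small precedence sketch. The two property
names come from this commit message; the helper functions and the
dictionary-based inputs are hypothetical and only illustrate the order
"session property, else config property, else false".

```python
from typing import Mapping

CONFIG_KEY = "delta.load_metadata_from_checksum_file"
SESSION_KEY = "load_metadata_from_checksum_file"


def parse_bool(text: str) -> bool:
    # Hypothetical helper: accept only "true"/"false", case-insensitively.
    lowered = text.strip().lower()
    if lowered not in ("true", "false"):
        raise ValueError(f"not a boolean: {text!r}")
    return lowered == "true"


def checksum_loading_enabled(
    catalog_config: Mapping[str, str],
    session_properties: Mapping[str, bool],
) -> bool:
    """Resolve the effective setting for one query."""
    # The config property supplies the default and is itself false if unset.
    config_default = parse_bool(catalog_config.get(CONFIG_KEY, "false"))
    # An explicitly set session property overrides the config default.
    return session_properties.get(SESSION_KEY, config_default)
```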
Co-authored-by: Eric Hwang <eh@openai.com>
Co-authored-by: Fred Liu <fredliu@openai.com>
1 parent 6965882 commit 7263d40