Commit a11b92d

[PROTOCOL Change Request] Return latestTableVersion in the CommitStore.getCommits API (delta-io#2712)
## Protocol Change Request

### Description of the protocol change

This change puts added responsibility on CommitStore to return the latestTableVersion in the CommitStore.getCommits API along with the list of Commits.

Protocol RFC issue: delta-io#2598

### Willingness to contribute

The Delta Lake Community encourages protocol innovations. Would you or another member of your organization be willing to contribute this feature to the Delta Lake code base?

- [x] Yes. I can contribute.
- [ ] Yes. I would be willing to contribute with guidance from the Delta Lake community.
- [ ] No. I cannot contribute at this time.
1 parent a584fe1 commit a11b92d

1 file changed, +16 -6 lines changed

protocol_rfcs/managed-commits.md

Lines changed: 16 additions & 6 deletions
@@ -143,7 +143,7 @@ are responsible to define the commit atomicity and backfill protocols which the
 
 At a high level, the `commit-owner` needs to provide:
 - API to atomically commit a version `x` with given set of `actions`. This is explained in detail in the [commit protocol](#commit-protocol) section.
-- API to retrieve information about the recent commits on the table. This is explained in detail in the [getting un-backfilled commits from commit-owner](#getting-un-backfilled-commits-from-commit-owner) section.
+- API to retrieve information about the recent commits and the latest ratified version on the table. This is explained in detail in the [getting un-backfilled commits from commit-owner](#getting-un-backfilled-commits-from-commit-owner) section.
 
 ### Commit Protocol
 
@@ -161,7 +161,14 @@ Even after a commit succeeds, Delta clients can only discover the commit through
 have no way to determine which file in `_delta_log/_commits` directory corresponds to the actual commit `v`.
 
 The commit-owner is responsible to implement an API (defined by the Delta client) that Delta clients can use to retrieve information about un-backfilled commits maintained
-by the commit-owner. Delta clients who are unaware of the commit-owner (or unwilling to talk to it), may not see recent un-backfilled commits and thus may encounter stale reads.
+by the commit-owner. The API must also return the latest version of the table ratified by the commit-owner (if any).
+Providing the latest ratified table version helps address potential race conditions between listing commits and contacting the commit-owner.
+For example, if a client performs a listing before a recently ratified commit is backfilled, and then contacts the commit-owner after the backfill completes,
+the commit-owner may return an empty list of un-backfilled commits. Without knowing the latest ratified version, the client might incorrectly assume their listing was complete
+and read a stale snapshot.
+
+Delta clients who are unaware of the commit-owner (or unwilling to talk to it), may not see recent un-backfilled commits and thus may encounter stale reads.
+
 
 ## Sample Commit Owner API
 
@@ -176,7 +183,7 @@ interface CommitStore {
    * @param version The version we want to commit.
    * @param actions Actions that need to be committed.
    *
-   * returns CommitResponse which has details around the new committed delta file.
+   * @return CommitResponse which has details around the new committed delta file.
    */
   def commit(
       version: Long,
@@ -191,13 +198,16 @@ interface CommitStore {
    * Note that the first version returned by this API may not be equal to the `startVersion`. This
    * happens when few versions starting from `startVersion` are already backfilled and so
    * CommitStore may have stopped tracking them.
+   * The returned latestTableVersion is the maximum commit version ratified by the Commit-Owner.
+   * Note that returning latestTableVersion as -1 is acceptable only if the commit-owner never
+   * ratified any version i.e. it never accepted any un-backfilled commit.
    *
-   * @return a list of `Commit` which are tracked by commit-owner.
-   *
+   * @return GetCommitsResponse which contains a list of `Commit`s and the latestTableVersion
+   * tracked by the commit-owner.
    */
   def getCommits(
       startVersion: Long,
-      endVersion: Long): Seq[Commit]
+      endVersion: Long): GetCommitsResponse
 
   /**
    * API to ask the commit-owner to backfill all commits <= given `version`.
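
The race described in the RFC text of this change (a listing taken before a backfill, followed by a `getCommits` call after it) can be illustrated with a small client-side check. This is only a sketch under stated assumptions: the diff names `Commit` and `GetCommitsResponse` but does not show their fields, so the `version`, `commits`, and `latestTableVersion` fields below are hypothetical, as are the `StaleListingCheck` object and its `needsRelist` helper.

```scala
// Hypothetical shapes: the field names are assumptions, not taken from the RFC.
final case class Commit(version: Long)
final case class GetCommitsResponse(commits: Seq[Commit], latestTableVersion: Long)

object StaleListingCheck {
  /**
   * Returns true when the commit-owner has ratified a version newer than
   * anything the client has seen, so the `_delta_log` listing should be repeated.
   *
   * @param maxListedVersion highest backfilled version found by the listing, or -1 if none
   * @param response         the result of a getCommits call made after the listing
   */
  def needsRelist(maxListedVersion: Long, response: GetCommitsResponse): Boolean = {
    // Highest un-backfilled version reported by the commit-owner, or -1 if the list is empty.
    val maxUnbackfilled =
      if (response.commits.isEmpty) -1L else response.commits.map(_.version).max
    val maxKnown = math.max(maxListedVersion, maxUnbackfilled)
    // A latestTableVersion of -1 means the commit-owner never ratified a version,
    // so this comparison is false and the listing is taken as-is.
    response.latestTableVersion > maxKnown
  }
}
```

With a listing that stopped at version 5, an empty commit list, and `latestTableVersion` equal to 6, `needsRelist` returns true, which is the stale-snapshot case the added paragraph describes. The sketch compares only the maximum versions and does not verify that the versions seen are contiguous.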
