feat: Distributed Log Replay Driver Phase #1502

OussamaSaoudi · 2025-11-19T05:35:25Z

🥞 Stacked PR

Use this link to review incremental changes.

stack/dlr_commit [Files changed]
- stack/dlr_manifest [Files changed]
  - stack/dlr_leaf [Files changed]
    - stack/dlr_driver [Files changed]
      - stack/dlr_serde [Files changed]
        
        stack/dlr_executor

What changes are proposed in this pull request?

How was this change tested?

scovich · 2025-11-20T16:16:06Z

kernel/src/distributed/driver.rs

+use crate::log_segment::LogSegment;
+use crate::{DeltaResult, Engine, Error, FileMeta};
+
+/// Driver-side log replay (Phase 1) for distributed execution.


"Driver" and "executor" are Spark jargon; do we want to intentionally adopt that in kernel?
Or should we use e.g. PreparePhase and DistributedPhase?

scovich · 2025-11-20T16:20:18Z

kernel/src/distributed/driver.rs

+/// ```
+pub(crate) struct DriverPhase<P> {
+    processor: P,
+    state: Option<DriverState>,


We have a small nomenclature clash here -- here we use "state" as in "state of a state machine" but other parts of kernel use "state" as in "the information tracked by a stateful operation". Maybe we call this one a step instead (as in, a phase contains multiple steps)?

scovich · 2025-11-20T16:22:23Z

kernel/src/distributed/driver.rs

+}
+
+/// Result of driver phase processing.
+pub(crate) enum DriverPhaseResult<P> {


another small clash: "result" usually means something related to std::result::Result (success/failure), but here it just means "[successful] outcome" which must then be wrapped as Result<DriverPhaseResult>?

scovich · 2025-11-20T16:49:02Z

kernel/src/distributed/driver.rs

+/// ```
+pub(crate) struct DriverPhase<P> {
+    processor: P,
+    state: Option<DriverState>,


A couple questions:

Do we want DriverPhase to impl Send (assuming P does), but purposefully not impl Sync (maybe by a PhantomData<Cell<()>> member)?

Send + !Sync + !Clone could be an API contract that tells both humans and compiler how that phase should be used. And then ExecutorPhase should be Send + Sync + Clone (assuming P is), again because that tells the compiler what we intend.

However, I notice that types like Vec<T> are Send + Sync, presumably because the borrow checker enforces the necessary mutual exclusion (only one mutable reference can exist at a time even within a single thread, let alone across multiple threads). So maybe we don't need to do anything special here?
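For reference, a minimal sketch of the marker-field idea raised above. The struct here is a hypothetical stand-in for the PR's `DriverPhase<P>`, and `into_processor` and `len_on_worker_thread` are illustrative names that do not come from the PR. It relies only on the standard-library facts that `Cell<()>` is `Send` but not `Sync`, and that `PhantomData<T>` inherits both auto traits from `T`.

```rust
use std::cell::Cell;
use std::marker::PhantomData;
use std::thread;

/// Hypothetical stand-in for the PR's `DriverPhase<P>`; fields are illustrative.
pub struct DriverPhase<P> {
    processor: P,
    // `Cell<()>` is `Send` but not `Sync`, so this zero-sized marker makes the
    // struct `!Sync` while it stays `Send` whenever `P: Send`.
    _not_sync: PhantomData<Cell<()>>,
}

impl<P> DriverPhase<P> {
    pub fn new(processor: P) -> Self {
        Self { processor, _not_sync: PhantomData }
    }

    /// Consume the phase and hand back the processor (illustrative helper).
    pub fn into_processor(self) -> P {
        self.processor
    }
}

/// Compiles only if `T: Send`; used purely as a compile-time check.
pub fn assert_send<T: Send>() {}

/// Move a phase onto another thread, which is allowed because it is `Send`.
pub fn len_on_worker_thread(data: Vec<u8>) -> usize {
    assert_send::<DriverPhase<Vec<u8>>>();
    let phase = DriverPhase::new(data);
    thread::spawn(move || phase.into_processor().len())
        .join()
        .expect("worker thread panicked")
}

// An equivalent `assert_sync::<DriverPhase<Vec<u8>>>()` check would be rejected
// by the compiler, because the marker field removes `Sync`.
```

Not deriving `Clone` gives the `!Clone` part of the contract without any extra code. Whether the marker is worth having when the borrow checker already enforces exclusive access through `&mut self` is the open question in this comment.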

Should types that impl Processor have type states to transition them from (mutable) driver phase to (immutable or at least different) executor phase? Or should we leave it up to the implementations to sort out things like that?

The driver state is ~always mutable (building "seen" set, collecting P&M that fly past, etc)

The executor state for map-type operations like scan tends to be immutable (probe the "seen" set)

The executor state for fold-type operations like P&M query tends to be mutable (still looking for P&M that the driver phase never saw)

The executor state for checkpoint operations is both map (actions to be written) and fold (collecting stats for the checkpoint manifest and/or _last_checkpoint file).

So actually... the driver phase state/processor only needs to work as a sequential left fold, where in general f(x, y) != f(y, x) and f(x, f(y, z)) != f(f(x, y), z) (order matters, can't parallelize); meanwhile, the executor phase state/processor must be fully associative and commutative: f(x, y) = f(y, x) and f(x, f(y, z)) = f(f(x, y), z). That guarantees the results (whatever they are) can be partially aggregated by executors, with final aggregation of those partial results on the driver.
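A small sketch of the executor-side requirement, assuming a made-up `PartialStats` type (nothing here comes from the PR). Because `merge` is component-wise addition, it is both associative and commutative, so partial results can be combined in whatever order the executors finish.

```rust
/// Hypothetical partial result produced by one executor.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct PartialStats {
    pub num_files: u64,
    pub total_bytes: u64,
}

impl PartialStats {
    /// Component-wise addition: associative and commutative by construction.
    pub fn merge(self, other: Self) -> Self {
        Self {
            num_files: self.num_files + other.num_files,
            total_bytes: self.total_bytes + other.total_bytes,
        }
    }
}

/// Fold one executor's share of file sizes into a partial result.
pub fn fold_partition(sizes: &[u64]) -> PartialStats {
    sizes.iter().fold(PartialStats::default(), |acc, &bytes| {
        acc.merge(PartialStats { num_files: 1, total_bytes: bytes })
    })
}

/// Final aggregation on the driver; the arrival order of partials is irrelevant.
pub fn aggregate(partials: impl IntoIterator<Item = PartialStats>) -> PartialStats {
    partials
        .into_iter()
        .fold(PartialStats::default(), PartialStats::merge)
}
```

A driver-side processor that builds a "seen" set has no such freedom: which action wins for a given file depends on the order in which batches are replayed, so it has to stay a single sequential fold.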

scovich · 2025-11-20T17:35:25Z

kernel/src/distributed/driver.rs

+        engine: Arc<dyn Engine>,
+    ) -> DeltaResult<Self> {
+        let commit_schema = get_commit_schema();
+        let commit = CommitReader::try_new(engine.as_ref(), &log_segment, commit_schema.clone())?;


There's no guarantee the log segment contains any commits. It might just have a checkpoint. And it might be a (v1) multi-part checkpoint as well.

So this phase could have any of four possible sets of steps: [commits], [commits, manifest], [manifest], or []. Should we just track a (Option<CommitReader>, Option<ManifestReader>) pair and be done with it? The choreography is the same either way: commits first (if present), followed by manifest (if present).

scovich · 2025-11-20T18:04:39Z

kernel/src/distributed/driver.rs

+            Ok(DriverState::Manifest(manifest))
+        } else {
+            // Multi-part checkpoint: all parts are leaf files
+            let files: Vec<_> = log_segment


Suggested change

-            let files: Vec<_> = log_segment
+            let files = log_segment

scovich · 2025-11-20T18:17:56Z

kernel/src/distributed/driver.rs

+        loop {
+            // Try to get item from current phase
+            let batch_result = match self.state.as_mut()? {


If we just tracked a pair of states, I think this code simplifies a lot:

    if let Some(ref mut commits) = self.commits {
        match commits.next() {
            None => self.commits = None, // exhausted!
            next => return next,
        }
    }
    if let Some(ref mut manifest) = self.manifest {
        match manifest.next() {
            None => self.manifest = None, // exhausted!
            next => return next,
        }
    }
    None

We would no longer need self.is_finished because self.commits.is_none() && self.manifest.is_none() should suffice.
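A self-contained sketch of that suggestion, using plain iterators in place of the PR's `CommitReader` and manifest reader. `TwoStepPhase` and its generic parameters are hypothetical; in kernel the items would presumably be `DeltaResult` batches rather than arbitrary values.

```rust
/// Hypothetical stand-in for the driver phase: two optional steps, drained in order.
pub struct TwoStepPhase<C, M> {
    commits: Option<C>,
    manifest: Option<M>,
}

impl<C, M> TwoStepPhase<C, M> {
    pub fn new(commits: Option<C>, manifest: Option<M>) -> Self {
        Self { commits, manifest }
    }

    /// Replaces a separate `is_finished` flag: both steps are gone once exhausted.
    pub fn is_finished(&self) -> bool {
        self.commits.is_none() && self.manifest.is_none()
    }
}

impl<C, M> Iterator for TwoStepPhase<C, M>
where
    C: Iterator,
    M: Iterator<Item = C::Item>,
{
    type Item = C::Item;

    fn next(&mut self) -> Option<Self::Item> {
        // Commits first, if present; drop the reader once it is exhausted.
        if let Some(commits) = self.commits.as_mut() {
            match commits.next() {
                None => self.commits = None,
                next => return next,
            }
        }
        // Then the manifest, if present, handled the same way.
        if let Some(manifest) = self.manifest.as_mut() {
            match manifest.next() {
                None => self.manifest = None,
                next => return next,
            }
        }
        None
    }
}
```

All four step sets from the earlier comment ([commits], [commits, manifest], [manifest], and []) fall out of the two `Option`s without a dedicated state enum.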

OussamaSaoudi added 9 commits November 18, 2025 14:57

commit reader

a84798b

improve commit

07824d3

manifest v1

23bf703

manifest

8abf164

allow_unused

e129ce0

improve test

92737a4

add leaf reader

6b81919

implement driver

6c1aeea

driver

3df290c

This was referenced Nov 19, 2025

feat: Distributed Log Replay serialization/deserialization #1503

Draft

feat: Commit Reader #1499

Draft

feat: Leaf Checkpoint Reader #1501

Draft

github-actions bot assigned OussamaSaoudi Nov 19, 2025

OussamaSaoudi mentioned this pull request Nov 19, 2025

feat: Manifest File Reader #1500

Draft

OussamaSaoudi changed the title ~~implement driver~~ feat: Distributed Log Replay Driver Phase Nov 19, 2025

scovich reviewed Nov 20, 2025

2 participants
