docs(rfc): stream alignment #2027
base: main
Deploying hydroflow with

| Latest commit: | 34f0c80 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://2c7d1fe0.hydroflow.pages.dev |
| Branch Preview URL: | https://pr2027.hydroflow.pages.dev |
cb86b05 to 19cd28f
I agree both that we need the ability to say that things happen atomically, and that developers using this hinders our ability to optimize, so we want to discourage it as much as possible. We talk about the analogy of developers making their systems correct by putting locks everywhere. This could end up being used in a similar way if we don't direct people towards the Hydraulic way of building with this primitive.

I think the guarantee this offers by default should be mappable to a classic consistency level, or it should be obvious how a developer specifies what consistency level they want it to offer. If we are joining things together across clients, we need an answer to what consistency that joining offers.

Ensuring this atomicity cross-node will be expensive, but we do need a story for it. Should the cross-node version be the same keyword? An `align_point_networked()` keyword? Would it feel natural to program with both the `align_point` and `align_point_networked` keywords?
Great points @conor-23. I've updated the RFC to use Also agreed that eventually we will likely want a very similar API for cross-node atomicity. I'm thinking along the same lines of a |
> [!NOTE]
> **More Explicit Design:**
>
> Then, we need to declare _which_ upstream values will be aligned at this point. In our case, we want alignment for the increment stream, so we call `aligned_at` to declare that this stream will be aligned when accessed at that point.
I keep going back and forth on whether I like this more explicit approach. On one hand, it helps developers think even more carefully about what is being aligned instead of relying on a semi-opaque tracing approach. On the other hand, it can lead to frustration if you accidentally forget to add an `aligned_at` annotation on something you do want to be aligned.
In a world where developers are building everything using the simulator, I'd hope that this concern is mitigated since the simulator would quickly identify misalignment failures. But I am not sure how likely that world is.
The API itself seems clunky, which is maybe fine. Something like a closure or block feels more natural. I wonder how the actual implementation can be decoupled from ticks in the future; I assume that is desirable.
Update: added to FAQ in the RFC doc
I think the tendency towards a closure / block arises from a bit of a historical mental model. We've been thinking of this in terms of atomic compute sections that execute in a single tick on a single machine, with the effect being that all downstreams will see the same version of data. The new API is more like providing consistency guarantees when asking for a version of some asynchronously updated data. The necessary atomic region is actually inferred (in fact, there may be methods other than atomic regions for achieving the same result!). As a result, it's not clear where the beginning of a closure / block would need to be.

The other reason for the non-block approach is that it's common to have semantically separate pieces of logic that need to share a consistent view of some shared data, for example different request handlers that need to read the same key. Ideally, these separate components can live in different Rust functions, but that means we can't have a "block" around the portions of each function that deal with the shared data. Instead, the token approach lets us pass in a "token that points to some consistent but non-deterministic version of the data".

For co-incidence and loops, the latter situation is far less common; the entirety of a loop is almost always in a single function. So there I think we will want to move towards a block / closure approach (but that is for a separate RFC).
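To make the token idea concrete, here is a minimal standalone sketch in plain Rust. It is not the Hydro API: `Store`, `AlignToken`, `align`, `handle_get`, and `handle_audit` are hypothetical names invented for illustration, and "pick some version" is modelled as simply taking the latest one. It only shows how two handlers in separate functions can share one consistent view by receiving the same token.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical: one immutable version of some asynchronously updated data.
type Version = Arc<HashMap<String, i64>>;

/// Hypothetical store that publishes a fresh version on every update.
struct Store {
    current: Version,
}

/// Hypothetical token: points at one consistent version of the data.
#[derive(Clone)]
struct AlignToken {
    version: Version,
}

impl Store {
    fn new() -> Self {
        Store { current: Arc::new(HashMap::new()) }
    }

    /// Copy-on-write update: readers holding an older version are unaffected.
    fn apply(&mut self, key: &str, delta: i64) {
        let mut next = (*self.current).clone();
        *next.entry(key.to_string()).or_insert(0) += delta;
        self.current = Arc::new(next);
    }

    /// Pick a version to align on. A real system may choose non-deterministically;
    /// this sketch always takes the latest published version.
    fn align(&self) -> AlignToken {
        AlignToken { version: Arc::clone(&self.current) }
    }
}

/// First handler, living in its own function.
fn handle_get(token: &AlignToken, key: &str) -> i64 {
    token.version.get(key).copied().unwrap_or(0)
}

/// Second handler, also separate, reading through the same token.
fn handle_audit(token: &AlignToken) -> i64 {
    token.version.values().sum()
}

/// Returns (value seen by get, total seen by audit, value seen via a later token).
fn run_example() -> (i64, i64, i64) {
    let mut store = Store::new();
    store.apply("counter", 1);
    let token = store.align();
    store.apply("counter", 1); // arrives after the token was taken
    let seen_by_get = handle_get(&token, "counter");
    let seen_by_audit = handle_audit(&token);
    let seen_later = handle_get(&store.align(), "counter");
    (seen_by_get, seen_by_audit, seen_later)
}
```

Both handlers agree on the version behind the token even though an update lands in between, and neither needs to sit inside a shared block.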
2a7d3b6 to 34f0c80
Is it a goal to remove ticks from the API, using alignment as constraints for reordering and scheduling? If so, aligning with cuts in the stream helps with atomicity and grouping, but can one specify ordering constraints without referring to ticks?

My only experience with this is batched incremental join, so grains of salt and all that, but in this example a batch from two streams is aligned with a 1-tick delayed snapshot of these streams. Can alignment include before/after the cut point in the stream, so we don't need to refer to ticks to say which "side" of the consistent view we're referring to? Returning to the join, it uses the same cut point in

```rust
let r_idx = r_stream.local_cut(nondet!(/** ... */));
let r = r_stream.clone()
    .map(q!(...))
    .into_keyed()
    .fold_commutative(...)
    .snapshot_before(&r_idx);
let r_x_ds = r_stream.clone()
    .map(q!(...))
    .batch_after(&r_idx);
```

So there are no messages between

If it's an ambition to capture causal or sequential consistency, cuts at scopes other than
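The before/after-the-cut idea can be modelled without any dataflow library. The sketch below is plain Rust and purely illustrative: `Cut`, `snapshot_before`, `batch_after`, and `join_at_cut` are hypothetical helpers over an in-memory slice, assuming a cut is just a position in arrival order and that the fold is a per-key sum.

```rust
use std::collections::BTreeMap;

/// Hypothetical cut point: a position in a stream's arrival order.
#[derive(Clone, Copy)]
struct Cut(usize);

/// Fold everything that arrived strictly before the cut into per-key sums.
fn snapshot_before(stream: &[(u32, i64)], cut: Cut) -> BTreeMap<u32, i64> {
    let mut state = BTreeMap::new();
    for &(key, value) in &stream[..cut.0] {
        *state.entry(key).or_insert(0) += value;
    }
    state
}

/// Everything from the cut up to (but excluding) the next cut.
fn batch_after(stream: &[(u32, i64)], cut: Cut, next: Cut) -> &[(u32, i64)] {
    &stream[cut.0..next.0]
}

/// Join the new batch against the state accumulated before the same cut.
/// Each output row is (key, sum before the cut, new value).
fn join_at_cut(stream: &[(u32, i64)], cut: Cut, next: Cut) -> Vec<(u32, i64, i64)> {
    let snapshot = snapshot_before(stream, cut);
    batch_after(stream, cut, next)
        .iter()
        .filter_map(|&(key, value)| snapshot.get(&key).map(|&sum| (key, sum, value)))
        .collect()
}
```

Because both sides are defined relative to the same cut, every element falls on exactly one side of it: nothing is skipped and nothing is counted twice.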
I think ticks will remain in some form (perhaps under a different name), but the goal is that ticks should only be used when there is meaningful iterative / logical-time-dependent code. So of the three current uses of ticks, only the first should remain:
So in this design, when we batch a stream, we still provide a tick in addition to the token that ensures it is consistent with respect to other batches / snapshots. So in your example, I think we would do something like this with the proposed API:

```rust
let r_stream_align = process.local_align();
let r = r_stream.clone()
    .map(q!(...))
    .into_keyed()
    .fold_commutative(...)
    .snapshot_aligned(&process.tick(), &r_stream_align)
    .defer_tick(); // get the snapshot from the previous tick
let r_x_ds = r_stream.clone()
    .map(q!(...))
    .batch_aligned(&process.tick(), &r_stream_align);
```

I think it's a bit hard to know (for me) how much better / worse this is compared to the existing API for your use case, but the idea is that ticks should be involved whenever there is any logic dealing with consecutive steps of time.
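As a rough illustration of what pairing a batch with a one-tick-deferred snapshot computes, here is a plain-Rust model. It does not use the proposed API: `batch_with_deferred_snapshot` is a hypothetical function, ticks are modelled as entries of a slice, and the snapshot is assumed to be a running sum.

```rust
/// Hypothetical model: `batches[t]` holds what arrived during tick `t`.
/// For each tick, returns the batch paired with the running sum as of the
/// previous tick, i.e. the snapshot delayed by one tick.
fn batch_with_deferred_snapshot(batches: &[Vec<i64>]) -> Vec<(Vec<i64>, i64)> {
    let mut out = Vec::new();
    let mut previous_snapshot: i64 = 0; // nothing has been folded before tick 0
    for batch in batches {
        out.push((batch.clone(), previous_snapshot));
        // This tick's snapshot includes this batch; it becomes visible next tick.
        previous_snapshot += batch.iter().sum::<i64>();
    }
    out
}
```

In this model the deferred snapshot never contains the batch it is paired with, which is the property an incremental join relies on.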
Yeah, I think this is similar to what @conor-23 brought up. I think the ambition is to have a similar, but separate, API for slices / alignment that span several machines. It would have to be a different API both because the guarantees will be different and because we will want to let developers configure the mechanism / consistency level they want. But the hope is that it will be a conceptual sibling to local alignment.
Got it. This RFC definitely improves on the existing atomic API and I don't want to slow it down. Thinking in ticks instead of dependencies may just be something to get used to. Would cross-machine alignment be tied to ticks? Or are ticks exclusively local, and there's a different synchronization/heartbeat... token across processes? Alignment is passing something more like a scope to the stream, but even using the same tick it's not aligning across streams? In the incremental join example, it could buffer This could be read as at each process, create a batch/snapshot that is non-deterministically and independently chosen from the stream whereas attempting a batch/snapshot at a different scope (cluster?), it would require proof that barrier was meaningful? So something like |
jhellerstein
left a comment
nits
Deploying hydro with

| Latest commit: | f56e168 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://8d0b0af2.hydroflow.pages.dev |
| Branch Preview URL: | https://pr2027.hydroflow.pages.dev |