feat(hydro_lang): make network channels configurable with a generic Stream::send API (#2400)
This eliminates the hardcoding of networking APIs to specific
serialization formats, transport protocols, etc. Includes similar
refactors across `demux` / `round_robin` / etc APIs.
docs/docs/hydro/learn/quickstart/partitioned-counter.mdx
+5 -5 (5 additions & 5 deletions)
@@ -68,17 +68,17 @@ The `prefix_key` method adds a new key to the front of the keyed stream. Here, y
:::
-Now you'll use `demux_bincode` to send data from the leader to the cluster. This is where Hydro fundamentally differs from traditional distributed systems frameworks. In most frameworks, you write separate programs for each service (one for the leader, one for the shards), then configure external message brokers or RPC systems to connect them. You have to manually serialize messages, manage network connections, and coordinate deployment.
+Now you'll use `demux` to send data from the leader to the cluster using the configured network protocol and serialization format. This is where Hydro fundamentally differs from traditional distributed systems frameworks. In most frameworks, you write separate programs for each service (one for the leader, one for the shards), then configure external message brokers or RPC systems to connect them. You have to manually serialize messages, manage network connections, and coordinate deployment.
-In Hydro, you write a **single Rust function** that describes the entire distributed service. When you call `demux_bincode`, you're performing network communication right in the middle of your function - but it feels like ordinary Rust code. The Hydro compiler automatically generates the network code, handles serialization, and deploys the right code to each machine. You can reason about your entire distributed system in one place, with full type safety and IDE support.
+In Hydro, you write a **single Rust function** that describes the entire distributed service. When you call `demux`, you're performing network communication right in the middle of your function - but it feels like ordinary Rust code. The Hydro compiler automatically generates the network code, handles serialization, and deploys the right code to each machine. You can reason about your entire distributed system in one place, with full type safety and IDE support.
-The `demux_bincode` method sends each element to the cluster member specified by the first component of the key:
+The `demux` method sends each element to the cluster member specified by the first component of the key:
-After `demux_bincode`, the stream is now located at the cluster. The stream returned by a demux operator preserves the remaining keys after the `MemberId` is consumed for routing. In this case, we are left with a `KeyedStream<u32, String, Cluster<'a, CounterShard>>` - a stream keyed by client ID and key name, located on the cluster.
+After `demux`, the stream is now located at the cluster. The stream returned by a demux operator preserves the remaining keys after the `MemberId` is consumed for routing. In this case, we are left with a `KeyedStream<u32, String, Cluster<'a, CounterShard>>` - a stream keyed by client ID and key name, located on the cluster.
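The routing rule described in this hunk can be illustrated with a small single-machine model in plain Rust. This is only a sketch, not Hydro code: `MemberId` is a hypothetical `u32` alias and `demux_model` is an illustrative name. It shows the first key component selecting the destination and being consumed, while the remaining key and value are preserved.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a cluster member identifier (not Hydro's `MemberId`).
type MemberId = u32;

/// Models demux on an in-memory batch: the leading `MemberId` picks the
/// destination member and is dropped; the remaining (client ID, key name)
/// pair is delivered to that member unchanged.
fn demux_model(input: Vec<(MemberId, (u32, String))>) -> BTreeMap<MemberId, Vec<(u32, String)>> {
    let mut per_member: BTreeMap<MemberId, Vec<(u32, String)>> = BTreeMap::new();
    for (member, rest) in input {
        // Each element lands only in the inbox of the member named by its key prefix.
        per_member.entry(member).or_default().push(rest);
    }
    per_member
}
```

In the real operator the per-member collections live on different machines; the map here only stands in for "what each member would receive".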
## Running the Counter on the Cluster
@@ -96,7 +96,7 @@ Finally, you need to send the responses back to the leader process:
-The `send_bincode` method sends data from the cluster to the leader process. When data moves from a cluster to a process, it arrives as a keyed stream with `MemberId` as the first key. The `drop_key_prefix` method removes this `MemberId` key, leaving just the original keys (client ID and response data).
+The `send` method sends data from the cluster to the leader process. When data moves from a cluster to a process, it arrives as a keyed stream with `MemberId` as the first key. The `drop_key_prefix` method removes this `MemberId` key, leaving just the original keys (client ID and response data).
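As a rough illustration of the return path, the following plain-Rust sketch models what arrives at the leader and what dropping the routing key leaves behind. The function names and the `MemberId` alias are hypothetical and chosen for this example; the real stream is live and its cross-member interleaving is not deterministic, whereas this model simply concatenates in member order.

```rust
/// Hypothetical stand-in for a cluster member identifier (not Hydro's `MemberId`).
type MemberId = u32;

/// Models receiving from a cluster: every element is tagged with the member
/// that sent it, so the receiver sees `MemberId` as the first key.
fn receive_from_cluster(per_member: Vec<(MemberId, Vec<(u32, String)>)>) -> Vec<(MemberId, (u32, String))> {
    let mut tagged = Vec::new();
    for (member, items) in per_member {
        for item in items {
            tagged.push((member, item));
        }
    }
    tagged
}

/// Models dropping the key prefix: the sender's `MemberId` is discarded and
/// only the original (client ID, response) pairs remain.
fn drop_member_prefix(tagged: Vec<(MemberId, (u32, String))>) -> Vec<(u32, String)> {
    tagged.into_iter().map(|(_member, rest)| rest).collect()
}
```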
This is a standard Hydro pattern for building a partitioned service: prefix a key for routing, demux to cluster, process, send back, drop the routing key. You will find similar code in key-value stores and even consensus protocols.
let on_p2: Stream<_, Process<_>, Unbounded, NoOrder> =
-    numbers.send_bincode(&process).values();
+    numbers.send(&process, TCP.bincode()).values();
```
The ordering of a stream determines which APIs are available on it. For example, `map` and `filter` are available on all streams, but `last` is only available on streams with `TotalOrder`. This ensures that even when the network introduces non-determinism, the program will not compile if it tries to use an API that requires a deterministic order.
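The idea of an ordering guarantee gating which methods exist can be sketched with marker types in plain Rust. This is a toy model under stated assumptions, not Hydro's implementation: `ToyStream` and `over_network` are hypothetical names, and the marker types merely reuse the names `TotalOrder` and `NoOrder` from the text.

```rust
use std::marker::PhantomData;

/// Marker types standing in for ordering guarantees (hypothetical, for illustration).
struct TotalOrder;
struct NoOrder;

/// A toy in-memory "stream" whose ordering guarantee is tracked in the type `O`.
struct ToyStream<T, O> {
    items: Vec<T>,
    _order: PhantomData<O>,
}

impl<T, O> ToyStream<T, O> {
    fn new(items: Vec<T>) -> Self {
        ToyStream { items, _order: PhantomData }
    }

    /// Available for every ordering: mapping each element does not depend on order.
    fn map<U>(self, f: impl Fn(T) -> U) -> ToyStream<U, O> {
        ToyStream::new(self.items.into_iter().map(f).collect())
    }

    /// Models a network hop that forgets the ordering guarantee.
    fn over_network(self) -> ToyStream<T, NoOrder> {
        ToyStream::new(self.items)
    }
}

impl<T> ToyStream<T, TotalOrder> {
    /// Only defined for totally ordered streams, where "last" is well defined.
    fn last(self) -> Option<T> {
        self.items.into_iter().last()
    }
}
```

With this shape, calling `last()` on the result of `over_network()` is rejected by the compiler because no such method exists for `ToyStream<T, NoOrder>`.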
@@ -82,7 +82,7 @@ let process: Process<()> = flow.process::<()>();
let all_words: Stream<_, Process<_>, Unbounded, NoOrder> = workers
    .source_iter(q!(vec!["hello", "world"]))
    .map(q!(|x| x.to_string()))
-    .send_bincode(&process)
+    .send(&process, TCP.bincode())
    .values();
let words_concat = all_words
@@ -92,7 +92,7 @@ let words_concat = all_words
:::tip
-We use `values()` here to drop the member IDs which are included in `send_bincode`. See [Clusters](../locations/clusters.md) for more details.
+We use `values()` here to drop the member IDs which are included in `send`. See [Clusters](../locations/clusters.md) for more details.
Running an aggregation (`fold`, `reduce`) converts a `Stream` into a `Singleton`, as we see in the type signature here. The `Singleton` type is still "live" in the sense of a [Live Collection](./index.md), so updates to the `Stream` input cause updates to the `Singleton` output. See [Singletons and Optionals](./singletons-optionals.md) for more information.
@@ -109,7 +109,7 @@ To perform an aggregation with an unordered stream, you must use [`fold_commutat
# let all_words: Stream<_, Process<_>, _, hydro_lang::live_collections::stream::NoOrder> = workers
-When sending a live collection from a cluster to another location, **each** member of the cluster will send its local collection. On the receiver side, these collections will be joined together into a **keyed stream** with `ID` keys and groups of `Data` values where the ID uniquely identifies which member of the cluster the data came from. For example, you can send a stream from the worker cluster to another process using the `send_bincode` method:
+When sending a live collection from a cluster to another location, **each** member of the cluster will send its local collection. On the receiver side, these collections will be joined together into a **keyed stream** with `ID` keys and groups of `Data` values where the ID uniquely identifies which member of the cluster the data came from. For example, you can send a stream from the worker cluster to another process using the `send` method:
-In the reverse direction, when sending a stream _to_ a cluster, the sender must prepare `(ID, Data)` tuples, where the ID uniquely identifies which member of the cluster the data is intended for. Then, we can send a stream from a process to the worker cluster using the `demux_bincode` method:
+In the reverse direction, when sending a stream _to_ a cluster, the sender must prepare `(ID, Data)` tuples, where the ID uniquely identifies which member of the cluster the data is intended for. Then, we can send a stream from a process to the worker cluster using the `demux` method:
```rust
# use hydro_lang::prelude::*;
@@ -84,8 +84,8 @@ In the reverse direction, when sending a stream _to_ a cluster, the sender must
-A common pattern in distributed systems is to broadcast data to all members of a cluster. In Hydro, this can be achieved using `broadcast_bincode`, which takes in a stream of **only data elements** and broadcasts them to all members of the cluster. For example, we can broadcast a stream of integers to the worker cluster:
+A common pattern in distributed systems is to broadcast data to all members of a cluster. In Hydro, this can be achieved using `broadcast`, which takes in a stream of **only data elements** and broadcasts them to all members of the cluster. For example, we can broadcast a stream of integers to the worker cluster:
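Conceptually, broadcasting pairs every data element with every known member and then routes by member, which can be modeled in a few lines of plain Rust. This is a sketch under assumptions, not the Hydro operator: `broadcast_model` and the `MemberId` alias are hypothetical, and the member list is passed in explicitly rather than supplied by a deployment system.

```rust
/// Hypothetical stand-in for a cluster member identifier (not Hydro's `MemberId`).
type MemberId = u32;

/// Models broadcast on an in-memory batch: each data element is copied once
/// per member, producing the `(ID, Data)` pairs that a demux would route.
fn broadcast_model<T: Clone>(data: &[T], members: &[MemberId]) -> Vec<(MemberId, T)> {
    let mut out = Vec::with_capacity(data.len() * members.len());
    for item in data {
        for member in members {
            out.push((*member, item.clone()));
        }
    }
    out
}
```

Note that the input carries only data elements; the member IDs are attached by the broadcast step itself.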
```rust
# use hydro_lang::prelude::*;
@@ -109,8 +109,8 @@ A common pattern in distributed systems is to broadcast data to all members of a
-This API requires a [non-determinism guard](../live-collections/determinism.md#unsafe-operations-in-hydro), because the set of cluster members may asynchronously change over time. Depending on when we are notified of membership changes, we will broadcast to different members. Under the hood, the `broadcast_bincode` API uses a list of members of the cluster provided by the deployment system. To manually access this list, you can use the `source_cluster_members` method to get a stream of membership events (cluster members joining or leaving):
+This API requires a [non-determinism guard](../live-collections/determinism.md#unsafe-operations-in-hydro), because the set of cluster members may asynchronously change over time. Depending on when we are notified of membership changes, we will broadcast to different members. Under the hood, the `broadcast` API uses a list of members of the cluster provided by the deployment system. To manually access this list, you can use the `source_cluster_members` method to get a stream of membership events (cluster members joining or leaving):
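To see why the timing of membership notifications matters, consider this plain-Rust model that folds a sequence of join/leave events into the current member set. The `MembershipEvent` enum and `current_members` function are hypothetical and written for this illustration; Hydro's actual event type may look different.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a cluster member identifier (not Hydro's `MemberId`).
type MemberId = u32;

/// Hypothetical membership event, assumed here to be either a join or a leave.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MembershipEvent {
    Joined(MemberId),
    Left(MemberId),
}

/// Folds the events observed so far into the set of members a broadcast
/// issued at this moment would target.
fn current_members(events: &[MembershipEvent]) -> BTreeSet<MemberId> {
    let mut members = BTreeSet::new();
    for event in events {
        match *event {
            MembershipEvent::Joined(id) => {
                members.insert(id);
            }
            MembershipEvent::Left(id) => {
                members.remove(&id);
            }
        }
    }
    members
}
```

Calling `current_members` on a shorter prefix of the same event sequence can yield a different set, which is the source of non-determinism the guard makes explicit.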
```rust
# use hydro_lang::prelude::*;
@@ -135,7 +135,7 @@ let workers: Cluster<()> = flow.cluster::<()>();
docs/docs/hydro/reference/locations/index.md
+1 -1 (1 addition & 1 deletion)
@@ -3,7 +3,7 @@ Hydro is a **global**, **distributed** programming model. This means that the da
Each [live collection](pathname:///rustdoc/hydro_lang/live_collections/) has a type parameter `L` which will always be a type that implements the `Location` trait (e.g. [`Process`](./processes.md) and [`Cluster`](./clusters.md), documented in this section). Computation has to happen at a single place, so Hydro APIs that consume multiple live collections will require all inputs to have the same location type. Moreover, most Hydro APIs that transform live collections will emit a new live collection output with the same location type as the input.
-To create distributed programs, Hydro provides a variety of APIs to _move_ live collections between locations via network send/receive. For example, `Stream`s can be sent from one process to another process using `.send_bincode(&loc2)` (which uses [bincode](https://docs.rs/bincode/latest/bincode/) as a serialization format). The sections for each location type ([`Process`](./processes.md), [`Cluster`](./clusters.md)) discuss the networking APIs in further detail.
+To create distributed programs, Hydro provides a variety of APIs to _move_ live collections between locations via network send/receive. For example, `Stream`s can be sent from one process to another process using `.send(&loc2, ...)`. The sections for each location type ([`Process`](./processes.md), [`Cluster`](./clusters.md)) discuss the networking APIs in further detail.
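The move from a format-specific method to a `send` that takes a configuration argument can be pictured with a small trait-based model in plain Rust. Everything here is an assumption made for illustration: the `Codec` trait, the `BigEndian` and `DecimalText` formats, and `send_model` are hypothetical and do not reflect how Hydro defines `TCP.bincode()` or its networking traits.

```rust
/// Hypothetical serialization-format abstraction (not a Hydro trait).
trait Codec<T> {
    fn encode(&self, value: &T) -> Vec<u8>;
    fn decode(&self, bytes: &[u8]) -> Option<T>;
}

/// Fixed-width binary format: four big-endian bytes per `u32`.
struct BigEndian;
/// Human-readable format: the decimal digits as UTF-8 text.
struct DecimalText;

impl Codec<u32> for BigEndian {
    fn encode(&self, value: &u32) -> Vec<u8> {
        value.to_be_bytes().to_vec()
    }
    fn decode(&self, bytes: &[u8]) -> Option<u32> {
        if bytes.len() != 4 {
            return None;
        }
        Some(u32::from_be_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]))
    }
}

impl Codec<u32> for DecimalText {
    fn encode(&self, value: &u32) -> Vec<u8> {
        value.to_string().into_bytes()
    }
    fn decode(&self, bytes: &[u8]) -> Option<u32> {
        std::str::from_utf8(bytes).ok()?.parse().ok()
    }
}

/// Models a send whose wire format is chosen by the caller: each item is
/// encoded, "transmitted" in memory, and decoded on the other side.
fn send_model<T, C: Codec<T>>(items: &[T], codec: &C) -> Vec<T> {
    items
        .iter()
        .map(|item| codec.encode(item))
        .filter_map(|bytes| codec.decode(&bytes))
        .collect()
}
```

The call site stays the same regardless of format, for example `send_model(&[1u32, 258], &BigEndian)` versus `send_model(&[1u32, 258], &DecimalText)`; only the second argument changes.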
## Creating Locations
Locations can be created by calling the appropriate method on the global `FlowBuilder` (e.g. `flow.process()` or `flow.cluster()`). These methods will return a handle to the location that can be used to create live collections and run computations.
-Because a process represents a single machine, it is straightforward to send data to and from a process. For example, we can send a stream of integers from the leader process to another process using the `send_bincode` method (which uses [bincode](https://docs.rs/bincode/latest/bincode/) as a serialization format). This automatically sets up network senders and receivers on the two processes.
+Because a process represents a single machine, it is straightforward to send data to and from a process. For example, we can send a stream of integers from the leader process to another process using the `send` method (which can be configured to use a particular network protocol and serialization format). This automatically sets up network senders and receivers on the two processes.
```rust,no_run
# use hydro_lang::prelude::*;
@@ -39,5 +39,5 @@ Because a process represents a single machine, it is straightforward to send dat
# let leader: Process<Leader> = flow.process::<Leader>();
let numbers = leader.source_iter(q!(vec![1, 2, 3, 4]));
let process2: Process<()> = flow.process::<()>();
-let on_p2: Stream<_, Process<()>, _> = numbers.send_bincode(&process2);
+let on_p2: Stream<_, Process<()>, _> = numbers.send(&process2, TCP.bincode());