```rust
use timely::dataflow::operators::{ToStream, Inspect};

fn main() {
    timely::example(|scope| {
        (0..10).to_stream(scope)
               .inspect(|x| println!("seen: {:?}", x));
    });
}
```
This program gives us a bit of a flavor for what a timely dataflow program might look like, including a bit of what Rust looks like, without getting too bogged down in weird stream processing details. Not to worry; we will do that in just a moment!

If we run the program up above, we see it print out the numbers zero through nine.
```ignore
Echidnatron% cargo run --example simple
seen: 0
seen: 1
seen: 2
seen: 3
seen: 4
seen: 5
seen: 6
seen: 7
seen: 8
seen: 9
Echidnatron%
```
Why would we want to make our life so complicated? The main reason is that we can make our program *reactive*, so that we can run it without knowing ahead of time the data we will use, and it will respond as we produce new data.

Timely dataflow means to capture a large number of idioms, so it is a bit tricky to wrap together one example that shows off all of its features, but let's look at something that shows off some core functionality to give a taste.
The following complete program initializes a timely dataflow computation, in which participants can supply a stream of numbers which are exchanged between the workers based on their value. Workers print to the screen when they see numbers. You can also find this as [`examples/hello.rs`](https://github.com/TimelyDataflow/timely-dataflow/blob/master/examples/hello.rs) in the [timely dataflow repository](https://github.com/TimelyDataflow/timely-dataflow/tree/master/examples).
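The program itself looks roughly like the following sketch; this is a reconstruction in the spirit of `examples/hello.rs`, so consult the repository for the authoritative version.

```rust
use timely::dataflow::{InputHandle, ProbeHandle};
use timely::dataflow::operators::{Input, Exchange, Inspect, Probe};

fn main() {
    // Initializes and runs a timely dataflow computation from command-line arguments.
    timely::execute_from_args(std::env::args(), |worker| {
        let index = worker.index();
        let mut input = InputHandle::<u64, u64>::new();
        let mut probe = ProbeHandle::new();

        // Create a new input, exchange records between workers by value, and inspect the output.
        worker.dataflow(|scope| {
            scope.input_from(&mut input)
                 .exchange(|x| *x)
                 .inspect(move |x| println!("worker {}:\thello {}", index, x))
                 .probe_with(&mut probe);
        });

        // Introduce data, and step the worker until each round is complete.
        for round in 0..10u64 {
            if index == 0 {
                input.send(round);
            }
            input.advance_to(round + 1);
            while probe.less_than(input.time()) {
                worker.step();
            }
        }
    }).unwrap();
}
```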
We can run this program in a variety of configurations: with just a single worker thread, with one process and multiple worker threads, and with multiple processes each with multiple worker threads.
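As a sketch, those configurations correspond to command-line arguments along these lines; the `-w`, `-n`, and `-p` flags are how I recall timely's argument parsing, and the repository documents the authoritative set.

```ignore
Echidnatron% cargo run --example hello                   # one process, one worker thread
Echidnatron% cargo run --example hello -- -w2            # one process, two worker threads
Echidnatron% cargo run --example hello -- -n2 -p0 -w2    # process 0 of 2, two workers each
Echidnatron% cargo run --example hello -- -n2 -p1 -w2    # process 1 of 2, two workers each
```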
To try this out yourself, first clone the timely dataflow repository using `git`:
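```ignore
Echidnatron% git clone https://github.com/TimelyDataflow/timely-dataflow
Echidnatron% cd timely-dataflow
```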
We can check out the examples `examples/capture_send.rs` and `examples/capture_recv.rs` to see a paired use of capture and receive demonstrating the generality.
The `capture_send` example creates a new TCP connection for each worker, which it wraps and uses as an `EventPusher`. Timely dataflow takes care of all the serialization and stuff like that (warning: it uses abomonation, so this is not great for long-term storage).
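The sending side looks roughly like the following sketch; the port scheme (`8000 + worker.index()`) and the type annotations are assumptions on my part, so see `examples/capture_send.rs` for the real thing.

```rust
use std::net::TcpStream;
use timely::dataflow::operators::ToStream;
use timely::dataflow::operators::capture::{Capture, EventWriter};

fn main() {
    timely::execute_from_args(std::env::args(), |worker| {
        // Each worker opens its own TCP connection to the receiving process.
        let addr = format!("127.0.0.1:{}", 8000 + worker.index());
        let send = TcpStream::connect(addr).expect("connection failed");

        // Capture the stream of numbers into the connection, wrapped as an EventPusher.
        worker.dataflow::<u64, _, _>(|scope| {
            (0..10u64)
                .to_stream(scope)
                .capture_into(EventWriter::new(send))
        });
    }).unwrap();
}
```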
The `capture_recv` example is more complicated, because we may have a different number of workers replaying the stream than initially captured it.
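For reference, the receiving side looks roughly like this sketch; again, the ports, type parameters, and argument handling are assumptions rather than a verbatim copy of `examples/capture_recv.rs`.

```rust
use std::net::TcpListener;
use timely::dataflow::operators::Inspect;
use timely::dataflow::operators::capture::{EventReader, Replay};

fn main() {
    timely::execute_from_args(std::env::args(), |worker| {
        // The number of workers that captured the stream, supplied as an argument.
        let source_peers: usize = std::env::args().nth(1)
            .expect("must supply source peers").parse().expect("invalid number");

        // Each worker takes responsibility for a disjoint subset of the captured
        // streams, listens for the corresponding connections, and wraps each
        // accepted stream as an EventReader.
        let replayers = (0..source_peers)
            .filter(|i| i % worker.peers() == worker.index())
            .map(|i| TcpListener::bind(format!("127.0.0.1:{}", 8000 + i)).unwrap())
            .collect::<Vec<_>>()
            .into_iter()
            .map(|l| l.incoming().next().unwrap().unwrap())
            .map(|r| EventReader::<u64, u64, _>::new(r))
            .collect::<Vec<_>>();

        // Replay the captured streams into a new dataflow and print what arrives.
        worker.dataflow::<u64, _, _>(|scope| {
            replayers
                .replay_into(scope)
                .inspect(|x| println!("replayed: {:?}", x));
        });
    }).unwrap();
}
```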
Almost all of the code up above is assigning responsibility for the replaying between the workers we have (from `worker.peers()`). We partition responsibility for `0 .. source_peers` among the workers, create `TcpListener`s to handle the connection requests, wrap them in `EventReader`s, and then collect them up as a vector. The workers have collectively partitioned the incoming captured streams between themselves.
Finally, each worker just uses the list of `EventReader`s as the argument to `replay_into`, and we get the stream magically transported into a new dataflow, in a different process, with a potentially different number of workers.
If you want to try it out, make sure to start up the `capture_recv` example first (otherwise the connections will be refused for `capture_send`) and specify the expected number of source workers, modifying the number of receive workers if you like. Here we are expecting five source workers, and distributing them among three receive workers (to make life complicated):
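```ignore
# A sketch of the two invocations, assuming capture_recv takes the expected
# number of source workers as its first argument (as in the sketch above).
Echidnatron% cargo run --example capture_recv -- 5 -w3    # in one shell
Echidnatron% cargo run --example capture_send -- -w5      # in another shell
```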
The `ExchangeData` trait is more complicated, and is established in the `communication/` module. The trait is a synonym for
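```rust
// An approximation from memory; the exact set of supertraits differs between
// timely versions, and `Data` here is timely's own data trait.
pub trait ExchangeData: Data + serde::Serialize + for<'a> serde::Deserialize<'a> + 'static { }
impl<T> ExchangeData for T
    where T: Data + serde::Serialize + for<'a> serde::Deserialize<'a> + 'static { }
```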
where `serde` is Rust's most popular serialization and deserialization crate. A great many types implement these traits. If your type does not, you should add these decorators to its definition:
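```rust
// The serde derive macros; `MyType` is just a placeholder for your own type.
#[derive(Serialize, Deserialize)]
struct MyType {
    name: String,
    value: u64,
}
```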
You must include the `serde` crate and, if you are not on Rust 2018, the `serde_derive` crate as well.
The downside to this is that deserialization will always involve a clone of the data, which has the potential to adversely impact performance. For example, if you have structures that contain lots of strings, timely dataflow will create allocations for each string even if you do not plan to use all of them.
Let's imagine you would like to play around with a tree data structure as something you might send around in timely dataflow. I've written the following candidate example:
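A sketch of what that candidate might look like (the original listing may differ in details):

```rust
use timely::dataflow::operators::{ToStream, Map, Inspect};

// A candidate type to send around: a tree with some data at each node.
struct TreeNode<D> {
    data: D,
    children: Vec<TreeNode<D>>,
}

impl<D> TreeNode<D> {
    fn new(data: D) -> Self {
        Self { data, children: Vec::new() }
    }
}

fn main() {
    timely::example(|scope| {
        (0..10u64).to_stream(scope)
                  .map(|x| TreeNode::new(x))
                  .inspect(|x| println!("seen: {:?}", x));
    });
}
```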
This doesn't work. You'll probably get two errors, that `TreeNode` doesn't implement `Clone`, nor does it implement `Debug`. Timely data types need to implement `Clone`, and our attempt to print out the trees requires an implementation of `Debug`. We can create these implementations by decorating the `struct` declaration like so:
```rust
#[derive(Debug, Clone)]
struct TreeNode<D> {
    data: D,
    children: Vec<TreeNode<D>>,
}

impl<D> TreeNode<D> {
    fn new(data: D) -> Self {
        Self { data, children: Vec::new() }
    }
}
```
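With the derives in place, suppose we next try to route the trees between workers by calling `exchange` on the stream, along the lines of this sketch (the `exchange` call and its key function are illustrative additions, not the book's exact listing):

```rust
use timely::dataflow::operators::{ToStream, Map, Inspect, Exchange};

fn main() {
    // TreeNode is the derived type defined above.
    timely::example(|scope| {
        (0..10u64).to_stream(scope)
                  .map(|x| TreeNode::new(x))
                  // route each tree to a worker chosen by its root data
                  .exchange(|x| x.data)
                  .inspect(|x| println!("seen: {:?}", x));
    });
}
```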
We get a new error. A not especially helpful error. It says that it cannot find an `exchange` method, or more specifically that one exists but it doesn't apply to our type at hand. This is because the data need to satisfy the `ExchangeData` trait but do not. It would be better if this were clearer in the error messages, I agree.
Communication in timely dataflow starts from the `timely_communication` crate. This crate includes not only communication, but is actually where we start up the various worker threads and establish their identities. As in timely dataflow, everything starts by providing a per-worker closure, but this time we are given only a channel allocator as an argument.
Before continuing, I want to remind you that this is the *internals* section; you could write your code against this crate if you really want, but one of the nice features of timely dataflow is that you don't have to. You can use a nice higher level layer, as discussed previously in the document.
That being said, let's take a look at the example from the `timely_communication` documentation, which is not brief but shouldn't be wildly surprising either.
There is only a limited amount of configuration you can currently do in a timely dataflow computation, and it all lives in the `initialize::Configuration` type. This type is a simple enumeration of three ways a timely computation could run:
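Roughly, the variants look like this; the field names and exact shapes here are recalled from memory and may differ between timely versions:

```rust
// Approximate shape of the enumeration; treat the field names and exact
// types as assumptions rather than the precise definition.
pub enum Configuration {
    /// One worker thread in this process, no communication.
    Thread,
    /// The given number of worker threads in this single process.
    Process(usize),
    /// Multiple processes, each with some number of worker threads.
    Cluster {
        threads: usize,          // worker threads per process
        process: usize,          // index of this process
        addresses: Vec<String>,  // network addresses of all processes
        report: bool,            // whether to report connection progress
    },
}
```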
A dataflow graph hosts some number of operators. For progress tracking, these operators are simply identified by their index. Each operator has some number of *input ports*, and some number of *output ports*. The dataflow operators are connected by connecting each input port to a single output port (typically of another operator). Each output port may be connected to multiple distinct input ports (a message produced at an output port is to be delivered to all attached input ports).
In timely dataflow progress tracking, we identify output ports by the type `Source` and input ports by the type `Target`, as from the progress coordinator's point of view, an operator's output port is a *source* of timestamped data, and an operator's input port is a *target* of timestamped data. Each source and target can be described by their operator index and then an operator-local index of the corresponding port. The use of distinct types helps us avoid mistaking input and output ports.
The structure of the dataflow graph can be described by a list of all of the connections in the graph, a `Vec<(Source, Target)>`. From this, we could infer the number of operators and their numbers of input and output ports, as well as enumerate all of the connections themselves.
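As a small illustration, using placeholder definitions (the real `Source` and `Target` types live in timely's `progress` module and may name their fields differently):

```rust
// Placeholder definitions: an output port (Source) and an input port (Target),
// each named by an operator index and an operator-local port index.
#[derive(Clone, Copy, Debug)]
pub struct Source { pub operator: usize, pub port: usize }
#[derive(Clone, Copy, Debug)]
pub struct Target { pub operator: usize, pub port: usize }

// A three-operator chain, described edge by edge: operator 0's output port 0
// feeds operator 1's input port 0, and operator 1's output port 0 feeds
// operator 2's input port 0.
fn example_edges() -> Vec<(Source, Target)> {
    vec![
        (Source { operator: 0, port: 0 }, Target { operator: 1, port: 0 }),
        (Source { operator: 1, port: 0 }, Target { operator: 2, port: 0 }),
    ]
}
```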
At this point we have the structure of a dataflow graph. We can draw a circle for each operator, a stub for each input and output port, and edges connecting the output ports to their destination input ports. Importantly, we have names for every location in the dataflow graph, which will either be a `Source` or a `Target`.