-
Notifications
You must be signed in to change notification settings - Fork 291
Progress tracking
The main feature timely dataflow provides, in addition to moving data along dataflow edges, is the indication of progress through the stream of data.
The system alerts each recipient of timestamped data once it can determine that some previously active timestamp will never be seen again. This allows the operator to react with final messages and actions, for example sending aggregates, flushing state, or releasing held resources. The same mechanisms also alert the user as output data emerge from the dataflow graph.
We first need to develop timely dataflow's progress vocabulary.
A timely dataflow graph consists of operators, whose inputs and outputs are connected by channels. Each operator input has one associated channel, but an operator output may have many channels leading from it: multiple other operators may want to see the data it produces. Each group of records transmitted in timely dataflow have an associated timestamp.