Skip to content

Conversation

@antiguru
Copy link
Owner

No description provided.

antiguru and others added 30 commits December 15, 2021 16:25
* Timely container abstraction

Introduces a container abstraction to enable generic data on dataflow
edges. A container describes its contents, has a length and capacity,
can be cleared. Containers implementing the PushPartitioned trait are
suitable for exchanging across Timely workers.

Signed-off-by: Moritz Hoffmann <[email protected]>

* Make Timely generic over a container type

Add *Core variants of core Timely infrastructure that is generic in the
data passed along dataflow edges.

Signed-off-by: Moritz Hoffmann <[email protected]>

* Introduce TimelyStack

This is mostly a copy of ColumnStack plus a few implementations. We'd like
to keep the columnar library mostly untouched and hence have our own
variant within Timely.

Signed-off-by: Moritz Hoffmann <[email protected]>
This allows for e.g. indexing differential Descriptions by their lower
to stitch them together.
* exchange: remove one layer of boxing

We were previously forced to use a boxed trait object because we were
unable to name the type of the constructed closure.

This patch fixes it by introducing a custom `ParallelizationHasher`
trait that the `ParallelizationContractCore` expects with a blanket
implementation for all closures taking two arguments. This is to
maintain backwards compatibility with any code that passes a closure
directly to `ExchangeCore::new`.

Then a wrapper type `DataHasher<F>` is introduced that allows us to both
specialize the `ParallelizationHasher` implementation for single
argument closures and at the same time name the type which is what we
need to remove one layer of boxed trait objects.

Signed-off-by: Petros Angelatos <[email protected]>

* exchange: disallow exchanging based on time

Signed-off-by: Petros Angelatos <[email protected]>
* Add option conversions for totally ordered antichains

* Add FromIterator implementations
* Avoid spinning when workers have no dataflows

* Improve worker API to be more clear
Instead of only supporting vector-backed streams, permit TimelyStack and
Rc streams.

Signed-off-by: Moritz Hoffmann <[email protected]>
Add the `shared` function to streams, turning the contents of the stream
into `Rc` of the container. Downstream operators will see the same copy
of the data and the `Rc` takes care of cleaning up the shared reference.

Signed-off-by: Moritz Hoffmann <[email protected]>
Currently, Timely fails to build in CI. Use the opportunity to upgrade
the CI integration to current versions of all dependencies.

Signed-off-by: Moritz Hoffmann <[email protected]>

Signed-off-by: Moritz Hoffmann <[email protected]>
Signed-off-by: Moritz Hoffmann <[email protected]>

Signed-off-by: Moritz Hoffmann <[email protected]>
As the title says, this changes the BranchWhen operator to work on streams
of arbitrary containers.

Signed-off-by: Moritz Hoffmann <[email protected]>

Signed-off-by: Moritz Hoffmann <[email protected]>
The reclock operator delays batches of data based on a clock input. With
this change, it can not only hold back vector-based container but any type
of container.

Signed-off-by: Moritz Hoffmann <[email protected]>

Signed-off-by: Moritz Hoffmann <[email protected]>
Convert the current vector-based implementation for the Exchange operator
into a container-invariant one. The PushPartitioned trait enables the
implementation to be generic over all containers that support it.

Signed-off-by: Moritz Hoffmann <[email protected]>

Signed-off-by: Moritz Hoffmann <[email protected]>
…low#429)

CapabilityRefs are valid to exist as long as the data in the input are
not marked as consumed. This change makes sure that this is the case by
including an extra drop guard in the capability ref.

Signed-off-by: Petros Angelatos <[email protected]>

Signed-off-by: Petros Angelatos <[email protected]>
To allow downstream crates to sniff out these panics and downgrade them,
if appropriate.
frankmcsherry and others added 30 commits January 9, 2023 18:17
Tests for the kafkaesque example started failing because it depends on
`clap` which now requires rustc `1.64`. We could change the deps of the
example but updating rustc seems like a good anything anyway.

Signed-off-by: Petros Angelatos <[email protected]>

Signed-off-by: Petros Angelatos <[email protected]>
In certain situations, a handle survives longer than we would like to wait
for a flush to happen. In this cases, an explicit call to cease can help
to indicate to the rest of the system that no more data follows
immediately, which is equivalent to dropping the handle.

Specifically, in async code the handle can be long-lived and survive await
points, which makes it more important to signal momentary completion to the
system.

Signed-off-by: Moritz Hoffmann <[email protected]>

Signed-off-by: Moritz Hoffmann <[email protected]>
Since the merge of
TimelyDataflow#429,
`CapabilityRef`s have been made safe to hold onto across operator
invocations because that PR made sure that they only decremented their
progress counts on `Drop`. While this allowed `async`/`await` based
operators to freely hold on to them, it was still very difficult for
synchronous based operators to do the same thing, due to the lifetime
attached to the `CapabilityRef`.

We can observe that the lifetime no longer provides any benefits, which
means it can be removed and turn `CapabilityRef`s into fully owned
values. This allows any style of operator to easily hold on to them. The
benefit of that isn't just performance (by avoiding the `retain()`
dance), but also about deferring the decision of the output port a given
input should flow to to a later time.

After making this change, the name `CapabilityRef` felt wrong, since
there is no reference to anything anymore. Instead, the main distinction
between `CapabilityRef`s and `Capabilities` are that the former is
associated with an input port and the latter is associated with an
output port.

As such, I have renamed `CapabilityRef` to `InputCapability` to signal
to users that holding onto one of them represents holding onto a
timestamp at the input for which we have not yet determined the output
port that it should flow to. This nicely ties up the semantics of the
`InputCapability::retain_for_output` and
`InputCapability::delayed_for_output` methods, which make it clear by
their name and signature that this is what "transfers" the capability
from input ports to output ports.

Signed-off-by: Petros Angelatos <[email protected]>
This fixes a compilation problem when deploying Timely.

Signed-off-by: Moritz Hoffmann <[email protected]>
Signed-off-by: Moritz Hoffmann <[email protected]>
This fixes a regression introduced by TimelyDataflow#426 where BufferCore::give now
lacked the inline annotation. In pingpong, this changes the runtime as
follows:

Without inline:
target/release/examples/pingpong 1 500000000 -w1  2.45s user 0.00s system 99% cpu 2.446 total

With inline:
target/release/examples/pingpong 1 500000000 -w1  1.75s user 0.00s system 99% cpu 1.755 total

Signed-off-by: Moritz Hoffmann <[email protected]>
This commit adds `MessageEvent` logging for messages traveling over
`enter`/`leave` channels.
This fixes TimelyDataflow#523. The reason for the error is that guarded messages are
only emitted when the message is non-empty. In this case however, the
message was empty, recording that a message was sent, but never recording
that the message was received. The fix makes sure to only send within
`give_container` when the message is non-empty. This solves this particular
problem, but the requirements around empty messages are still somewhat
implicit.

Signed-off-by: Moritz Hoffmann <[email protected]>
The current code tried to be helpful by ignoring any messages with zero
length, but that made it impossible for the caller to differentiate
between "There was an empty container message" vs "There are no more
messages".

In the presence of empty containers in the channels this can lead
operators to prematurely yield back to timely without fully processing
their inputs, which in turn can lead to the whole dataflow not making
progress since timely won't re-activate operators that have leftover
events in their channels.

This PR removes this check and always forwards the containers to the
operators, exactly as they were sent.

Fixes TimelyDataflow#523

Signed-off-by: Petros Angelatos <[email protected]>
This prevents building Timely in CI because recent versions of clap require
a newer version of Rust than what we use in CI.

Signed-off-by: Moritz Hoffmann <[email protected]>
This is useful if all the caller has is a reference.

Signed-off-by: Moritz Hoffmann <[email protected]>
The derived `default` for Antichain includes a bound `T: Default`, which is
unnecessary. This change replaces the derived implementation with a custom
one that does not have the constrain.

Signed-off-by: Moritz Hoffmann <[email protected]>
It's not used and not maintained.

Signed-off-by: Moritz Hoffmann <[email protected]>
* Activate only by channel ID

Signed-off-by: Moritz Hoffmann <[email protected]>

* Remove Event

Signed-off-by: Moritz Hoffmann <[email protected]>

* Remove comment

Signed-off-by: Moritz Hoffmann <[email protected]>

---------

Signed-off-by: Moritz Hoffmann <[email protected]>
This adds initial support for release-plz and configures it to only produce
a changelog for the timely crate.

Signed-off-by: Moritz Hoffmann <[email protected]>
Signed-off-by: Moritz Hoffmann <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.