Log action can distinguish data from flush #619

antiguru · 2025-01-12T19:12:36Z

Previously, the log action would receive an empty container when we wanted
to report progress but had no new data. We'd only send the empty container
when we didn't have any other data to send. This makes it hard for users to
distinguish a flush from a regular data update. The distinction can matter:
after a flush the logger might not be called again for a while, whereas
after a data update it is very likely to receive more data or see a flush
soon.

This change alters the signature of the action to accept a &mut Option<C>
(where C is CB::Container), and we pass Some(container) on data, and
None on flush.

Clients using the logging API need to change their implementation, as both
vectors and Option offer an iter function, but with different results.

Signed-off-by: Moritz Hoffmann [email protected]
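
A minimal sketch of what an action with the new shape might look like, and of the `iter` pitfall mentioned above. Everything here is illustrative: `Observed`, `action`, and `iter_counts` are hypothetical names, the container is assumed to be a `Vec`, and the extra `seen` parameter exists only to make the example self-contained. It is not the signature used in the logging crate.

```rust
use std::time::Duration;

/// What a hypothetical logger observed; names are illustrative only.
#[derive(Debug, Default, PartialEq)]
pub struct Observed {
    pub events: Vec<String>,
    pub flushes: usize,
}

/// Stand-in for a log action: `Some` carries data, `None` signals a flush.
pub fn action(
    _elapsed: &Duration,
    data: &mut Option<Vec<(Duration, String)>>,
    seen: &mut Observed,
) {
    match data {
        // Data update: more data or a flush is likely to follow.
        Some(container) => seen.events.extend(container.drain(..).map(|(_, e)| e)),
        // Flush: we might not be called again for a while.
        None => seen.flushes += 1,
    }
}

/// Shows why code written against a vector needs to change.
pub fn iter_counts(data: &Option<Vec<u32>>) -> (usize, usize) {
    // `Option::iter` yields the container itself (zero or one item) ...
    let outer = data.iter().count();
    // ... so reaching the elements takes an explicit `flatten`.
    let inner = data.iter().flatten().count();
    (outer, inner)
}
```

With a vector, `iter` walks the elements; with an `Option` wrapped around it, `iter` yields at most one item, the container. Matching on the option, as in `action`, makes the data and flush cases explicit.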


frankmcsherry

This all looks good to me. I left a few comments, but they are roughly nits, and can be followed up on later if we conclude they are worth it!

frankmcsherry · 2025-01-16T00:53:54Z

logging/src/lib.rs

+                let mut c = Some(std::mem::take(container));
+                (self.action)(&elapsed, &mut c);
+                if let Some(mut c) = c {
+                    c.clear();


No strong opinion, but is this clearing new behavior? Does it e.g. prevent passing back owned data, into which the logger can write? Again, really no strong opinion, but just checking whether the force-clear is new, and whether it is intentional.

It was purely defensive. Thinking about it, it should be the container builder's responsibility to enforce what needs to be true about a container after extracting or finishing it, so it doesn't make sense to have the clear call here.
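
To illustrate the point about passing back owned data, here is a hedged sketch of a caller that does not force-clear. `publish` is a hypothetical helper, the container is assumed to be a `Vec<u32>`, and taking back whatever the action leaves in the option is an assumption made for illustration, not a description of what `logging/src/lib.rs` does.

```rust
/// Hypothetical caller side: hand the container to the action, then reuse
/// whatever the action left in the option, without calling `clear` on it.
pub fn publish<A>(container: &mut Vec<u32>, mut action: A)
where
    A: FnMut(&mut Option<Vec<u32>>),
{
    // Move the contents out, leaving an empty vector behind.
    let mut c = Some(std::mem::take(container));
    action(&mut c);
    // If the action handed something back, keep it as-is.
    if let Some(returned) = c {
        *container = returned;
    }
}

/// Hypothetical action that consumes the elements but keeps the allocation.
pub fn draining_action(sink: &mut Vec<u32>) -> impl FnMut(&mut Option<Vec<u32>>) + '_ {
    move |data| {
        if let Some(container) = data {
            sink.extend(container.drain(..));
        }
    }
}
```

In this sketch the action decides what state the returned container is in; the caller only moves it back.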

frankmcsherry · 2025-01-16T00:55:49Z

logging/src/lib.rs


-        self.dirty = false;
+        // Send no container to indicate flush.
+        (self.action)(&elapsed, &mut None);


A nit, but one of the reasons to take a &mut Option<_> rather than an Option<&mut _> is to allow the None call to pass back resources, and at least with Push the intent is that you keep calling this as long as you get a non-None back. We don't have to do that here, and I'm probably massively over-thinking this, but I wanted to call out the gap.

I see your point, but I'm wondering if this is actually true! Or put differently, is it something we'd like to be true, or is it something that we had at some point and forgot about?

The reason I'm asking is because most (all?) places where we call Push::done, we don't have a loop to drain resources. If I recall correctly, the only place where we loop is in Differential merge batchers to drain the stash of allocations once we're done merging chains.

We could change Push's done function to look like this instead:

    fn done(&mut self) {
        let mut container = None;
        loop {
            self.push(&mut container);
            if container.is_none() {
                break;
            }
        }
    }
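
A self-contained sketch of the draining idea, using a simplified stand-in trait rather than the actual timely `Push` trait. `Stash` is a hypothetical pusher. One assumption differs from the snippet above: the option is reset on every round, so that a returned resource is dropped instead of being pushed again as if it were data.

```rust
/// Simplified stand-in for a push interface; not the actual timely trait.
pub trait Push<T> {
    /// `Some` carries data; `None` signals a flush, and the callee may
    /// fill the option with a spare resource to hand back.
    fn push(&mut self, element: &mut Option<T>);

    /// Flushes, then keeps asking until no more resources come back.
    fn done(&mut self) {
        loop {
            let mut container = None;
            self.push(&mut container);
            if container.is_none() {
                break;
            }
            // A returned resource is dropped at the end of this iteration.
        }
    }
}

/// Hypothetical pusher holding a stash of spare allocations.
pub struct Stash {
    pub spare: Vec<Vec<u32>>,
    pub received: Vec<u32>,
}

impl Push<Vec<u32>> for Stash {
    fn push(&mut self, element: &mut Option<Vec<u32>>) {
        match element.take() {
            // Data: absorb the elements and keep the emptied allocation.
            Some(mut data) => {
                self.received.append(&mut data);
                self.spare.push(data);
            }
            // Flush: hand back one spare allocation, if any remain.
            None => *element = self.spare.pop(),
        }
    }
}
```

Calling `done` on a `Stash` releases one spare allocation per round and stops on the first round that returns `None`.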


antiguru force-pushed the logger_flush branch from 545cc31 to 56ec0d7 Compare January 13, 2025 08:57

antiguru added 3 commits January 15, 2025 13:13

Do not track dirty state in logger

b128b93

We don't regularly drop the inner logger, so one additional flush doesn't justify the added complexity. Signed-off-by: Moritz Hoffmann <[email protected]>

Align batch logger API closer to logging

f4ebc44

`publish_batch` accepts a mutable reference to an option instead of a mutable reference to a container. Signed-off-by: Moritz Hoffmann <[email protected]>

antiguru force-pushed the logger_flush branch from 56ec0d7 to f4ebc44 Compare January 15, 2025 12:13

frankmcsherry approved these changes Jan 16, 2025


Don't clear containers

44abb5e

Signed-off-by: Moritz Hoffmann <[email protected]>

antiguru merged commit 291de98 into TimelyDataflow:master Jan 16, 2025
7 checks passed

antiguru deleted the logger_flush branch January 16, 2025 10:44

github-actions bot mentioned this pull request Jan 15, 2025

chore: release #618

Merged
