Conversation

@ggevay
Contributor

@ggevay ggevay commented Nov 30, 2025

This implements the largest outstanding item from the frontend peek sequencing work stream (https://github.com/MaterializeInc/database-issues/issues/9593): Adding statement logging.

I had to add a number of new Coordinator Commands. When statement logging is on, this hurts performance by tens of percent (see the Nightly benchmarks) with the current CI default sampling/throttling settings (which I believe match the cloud defaults; for self-managed, statement logging is completely turned off by default).

However, when statement logging is either turned off or heavily throttled, then it has a smaller impact. (If I disable statement logging, then all the Nightly benchmarks are green.) In that case, the only thing slowing us down is the RegisterFrontendPeek Command, which is sent for every fast-path peek, regardless of whether the statement is being logged or not. This is because it's needed not just for statement logging, but also for query cancellation.

This PR also addresses a large part of the "Query cancellation" todo item of https://github.com/MaterializeInc/database-issues/issues/9593, because the code overlaps a lot with statement logging, see RegisterFrontendPeek.

(The PR doesn't address the old "execute context ... dropped without being properly retired" bug https://github.com/MaterializeInc/database-issues/issues/7304, which is planned as a follow-up. We'll probably need to solve it before widely rolling out the frontend peek sequencing, because the problem will likely become more common: the frontend task now awaits more, leaving more time for a drop to arrive at the wrong moment. However, I haven't seen evidence of this in CI. I did have some CI failures during development that looked like they could be this issue, but they turned out to be due to other bugs in the frontend peek sequencing work, which I have since fixed. Edit: there was a related CI failure in the meantime, see below.)

I did several Nightly runs with near-final states of the code, and they looked clean.

As mentioned above, we'll just have to accept the benchmark regressions, because statement logging is inherently more work. If a user needs very high QPS, they should configure stronger sampling/throttling. We might also decide to revisit the default throttling in cloud.

(Before merging, I'll remove the commit that turns on frontend peek sequencing in CI, so that we keep testing mainly the old peek sequencing for now, e.g. that I have not broken statement logging there.)

@ggevay ggevay added the A-ADAPTER Topics related to the ADAPTER layer label Nov 30, 2025
@ggevay ggevay force-pushed the frontend-peek-statement-logging branch 4 times, most recently from 502d232 to 5790f83 Compare December 14, 2025 20:08
@ggevay ggevay force-pushed the frontend-peek-statement-logging branch 7 times, most recently from bb49ead to a12d26b Compare December 16, 2025 15:30
- /// executing) the state necessary to perform this work is bundled in
- /// the `ExecuteContextExtra` object (today, it is simply empty).
+ /// executing). The state necessary to perform this work is bundled in
+ /// the `ExecuteContextExtra` object.
@ggevay ggevay Dec 16, 2025

(The functionality that this comment covers is not actually being modified. The comment is being modified because it was extremely outdated.)

- ) {
-     let ws_id = self.controller.install_compute_watch_set(objects, t);
+ ) -> Result<(), CollectionLookupError> {
+     let ws_id = self.controller.install_compute_watch_set(objects, t)?;
Contributor Author

(These had to be made fallible for the same reason that many of the things the frontend peek sequencing calls have to be fallible: the query's dependencies might disappear at any moment during sequencing. This was not the case in the old peek sequencing, where the sequencing itself occupied the main coordinator task.)

Contributor Author

I refactored large parts of statement logging into a more modular form, so that the old and the new peek sequencing can share a lot of code. I moved the shared code mostly to the other statement_logging.rs, which is outside coord, to make it clear that it doesn't need a Coordinator self.

Contributor Author

As noted on the other statement_logging.rs (which is under coord), much of the code in this statement_logging.rs is code that got factored out from the original statement logging code, and will now be shared between the frontend's statement logging and the old statement logging.

      compute_instance: ComputeInstanceId,
- ) -> Result<&mut mz_compute_client::controller::instance::Client<Timestamp>, InstanceMissing>
- {
+ ) -> Result<mz_compute_client::controller::instance::Client<Timestamp>, InstanceMissing> {
@ggevay ggevay Dec 16, 2025

(This was changed to not return a &mut, so that call_coordinator can be called between ensure_compute_instance_client and client.peek in implement_fast_path_peek_plan.)

Contributor Author

(We had very little test coverage for statement logging metrics, so I'm adding some here.)

@ggevay ggevay force-pushed the frontend-peek-statement-logging branch from a12d26b to 50ebb84 Compare December 16, 2025 17:15
@ggevay ggevay marked this pull request as ready for review December 16, 2025 17:17
@ggevay ggevay requested review from a team as code owners December 16, 2025 17:17
@ggevay ggevay requested review from SangJunBak and aljoscha and removed request for SangJunBak December 16, 2025 17:17
@ggevay
Contributor Author

ggevay commented Dec 16, 2025

There is a "dropped without being properly retired" assertion failure in https://buildkite.com/materialize/nightly/builds/14525. I need to look into whether that's a new bug, or just https://github.com/MaterializeInc/database-issues/issues/7304 triggered by something like a connection drop at the wrong moment (for which we have more opportunities in the frontend peek sequencing, due to awaiting more on the frontend task).

Edit: It's just https://github.com/MaterializeInc/database-issues/issues/7304, see below.

@ggevay
Contributor Author

ggevay commented Dec 17, 2025

Ok, the stack trace looks very much like it happened from a connection drop, so then it's just https://github.com/MaterializeInc/database-issues/issues/7304. I'm planning to attend to that problem today, in a separate PR.

pg-cdc-old-syntax-mz_1-1               | 2025-12-16T17:39:45.518464Z  thread 'tokio:work-0' panicked at src/adapter/src/coord.rs:1346:13:
pg-cdc-old-syntax-mz_1-1               | execute context for statement StatementLoggingId(019b283f-24e9-792b-9de0-5b3dbe8b8a97) dropped without being properly retired.
pg-cdc-old-syntax-mz_1-1               |    5: core::panicking::panic_fmt
pg-cdc-old-syntax-mz_1-1               |    6: <mz_adapter::coord::ExecuteContextExtra as core::ops::drop::Drop>::drop
pg-cdc-old-syntax-mz_1-1               |    7: <mz_adapter::client::SessionClient>::execute::<<mz_pgwire::codec::FramedConn<mz_server_core::Connection>>::wait_closed::{closure#0}>::{closure#0}::{closure#0}
pg-cdc-old-syntax-mz_1-1               |    8: <mz_adapter::client::SessionClient>::execute::<<mz_pgwire::codec::FramedConn<mz_server_core::Connection>>::wait_closed::{closure#0}>::{closure#0}
pg-cdc-old-syntax-mz_1-1               |    9: <tracing::instrument::Instrumented<<mz_pgwire::protocol::StateMachine<mz_server_core::Connection, tokio_metrics::task::TaskIntervals>>::execute::{closure#0}> as core::future::future::Future>::poll
pg-cdc-old-syntax-mz_1-1               |   10: <mz_pgwire::protocol::StateMachine<mz_server_core::Connection, tokio_metrics::task::TaskIntervals>>::send_execute_response::{closure#0}::{closure#0}
pg-cdc-old-syntax-mz_1-1               |   11: <mz_pgwire::protocol::StateMachine<mz_server_core::Connection, tokio_metrics::task::TaskIntervals>>::send_execute_response::{closure#0}
pg-cdc-old-syntax-mz_1-1               |   12: <tracing::instrument::Instrumented<<mz_pgwire::protocol::StateMachine<mz_server_core::Connection, tokio_metrics::task::TaskIntervals>>::execute::{closure#0}> as core::future::future::Future>::poll
pg-cdc-old-syntax-mz_1-1               |   13: <tracing::instrument::Instrumented<core::pin::Pin<alloc::boxed::Box<dyn core::future::future::Future<Output = core::result::Result<mz_pgwire::protocol::State, std::io::error::Error>> + core::marker::Send>>> as core::future::future::Future>::poll
pg-cdc-old-syntax-mz_1-1               |   14: <mz_pgwire::protocol::StateMachine<mz_server_core::Connection, tokio_metrics::task::TaskIntervals>>::advance_ready::{closure#0}::{closure#0}
pg-cdc-old-syntax-mz_1-1               |   15: <core::future::poll_fn::PollFn<mz_pgwire::protocol::run<mz_server_core::Connection, tokio_metrics::task::TaskIntervals>::{closure#0}::{closure#0}::{closure#4}> as core::future::future::Future>::poll
pg-cdc-old-syntax-mz_1-1               |   16: mz_pgwire::protocol::run::<mz_server_core::Connection, tokio_metrics::task::TaskIntervals>::{closure#0}::{closure#0}
pg-cdc-old-syntax-mz_1-1               |   17: <mz_pgwire::server::Server>::handle_connection::<tokio_metrics::task::TaskIntervals>::{closure#0}::{closure#0}::{closure#0}::{closure#0}
pg-cdc-old-syntax-mz_1-1               |   18: <mz_pgwire::server::Server>::handle_connection::<tokio_metrics::task::TaskIntervals>::{closure#0}
pg-cdc-old-syntax-mz_1-1               |   19: <tokio_metrics::task::Instrumented<mz_server_core::serve<mz_pgwire::server::Server, core::pin::Pin<alloc::boxed::Box<dyn mz_server_core::ConnectionStream<Item = core::result::Result<tokio::net::tcp::stream::TcpStream, std::io::error::Error>>>>>::{closure#0}::{closure#3}> as core::future::future::Future>::poll
pg-cdc-old-syntax-mz_1-1               |   20: <tracing::instrument::Instrumented<tokio_metrics::task::Instrumented<mz_server_core::serve<mz_pgwire::server::Server, core::pin::Pin<alloc::boxed::Box<dyn mz_server_core::ConnectionStream<Item = core::result::Result<tokio::net::tcp::stream::TcpStream, std::io::error::Error>>>>>::{closure#0}::{closure#3}>> as core::future::future::Future>::poll
pg-cdc-old-syntax-mz_1-1               |   21: <tokio::runtime::task::core::Core<tracing::instrument::Instrumented<tokio_metrics::task::Instrumented<mz_server_core::serve<mz_pgwire::server::Server, core::pin::Pin<alloc::boxed::Box<dyn mz_server_core::ConnectionStream<Item = core::result::Result<tokio::net::tcp::stream::TcpStream, std::io::error::Error>>>>>::{closure#0}::{closure#3}>>, alloc::sync::Arc<tokio::runtime::scheduler::multi_thread::handle::Handle>>>::poll
pg-cdc-old-syntax-mz_1-1               |   22: <tokio::runtime::task::harness::Harness<tracing::instrument::Instrumented<tokio_metrics::task::Instrumented<mz_server_core::serve<mz_pgwire::server::Server, core::pin::Pin<alloc::boxed::Box<dyn mz_server_core::ConnectionStream<Item = core::result::Result<tokio::net::tcp::stream::TcpStream, std::io::error::Error>>>>>::{closure#0}::{closure#3}>>, alloc::sync::Arc<tokio::runtime::scheduler::multi_thread::handle::Handle>>>::poll
pg-cdc-old-syntax-mz_1-1               |   23: <tokio::runtime::scheduler::multi_thread::worker::Context>::run_task
pg-cdc-old-syntax-mz_1-1               |   24: <tokio::runtime::scheduler::multi_thread::worker::Context>::run
pg-cdc-old-syntax-mz_1-1               |   25: <tokio::runtime::context::scoped::Scoped<tokio::runtime::scheduler::Context>>::set::<tokio::runtime::scheduler::multi_thread::worker::run::{closure#0}::{closure#0}, ()>
pg-cdc-old-syntax-mz_1-1               |   26: tokio::runtime::context::runtime::enter_runtime::<tokio::runtime::scheduler::multi_thread::worker::run::{closure#0}, ()>
pg-cdc-old-syntax-mz_1-1               |   27: tokio::runtime::scheduler::multi_thread::worker::run
pg-cdc-old-syntax-mz_1-1               |   28: <tracing::instrument::Instrumented<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}>> as core::future::future::Future>::poll
pg-cdc-old-syntax-mz_1-1               |   29: <tokio::runtime::task::core::Core<tracing::instrument::Instrumented<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}>>, tokio::runtime::blocking::schedule::BlockingSchedule>>::poll
pg-cdc-old-syntax-mz_1-1               |   30: <tokio::runtime::task::harness::Harness<tracing::instrument::Instrumented<tokio::runtime::blocking::task::BlockingTask<<tokio::runtime::scheduler::multi_thread::worker::Launch>::launch::{closure#0}>>, tokio::runtime::blocking::schedule::BlockingSchedule>>::poll
pg-cdc-old-syntax-mz_1-1               |   31: <tokio::runtime::blocking::pool::Inner>::run

@ggevay ggevay force-pushed the frontend-peek-statement-logging branch from 50ebb84 to 907caa9 Compare December 17, 2025 12:53
@ggevay ggevay requested a review from a team as a code owner December 17, 2025 12:53
@ggevay ggevay force-pushed the frontend-peek-statement-logging branch 2 times, most recently from 80730c5 to c856b54 Compare December 18, 2025 15:17
@ggevay ggevay enabled auto-merge December 18, 2025 15:17
@ggevay ggevay disabled auto-merge December 18, 2025 15:29
@ggevay ggevay force-pushed the frontend-peek-statement-logging branch from c856b54 to fbc92be Compare December 18, 2025 17:34
Contributor

@SangJunBak SangJunBak left a comment

To check my understanding: for the initial peek query, we record that attempt in the statement log, and if we fall back, we record the fallback attempt as well? This makes sense to me.

Code looks good!

);
return Ok(None);
}
// This must happen BEFORE statement logging setup to avoid orphaned execution records.
Contributor

From this comment, it's not clear why doing this before statement logging setup avoids orphaned execution records.

Contributor Author

The intention of the new frontend peek sequencing code is to only begin logging a statement once we are sure that we won't fall back to the old sequencing (i.e., that we have a peek; the new sequencing supports only peeks).

This is not yet true as of this PR, but there are several follow-up PRs, which make this true. Those PRs have also been approved, and I'm about to merge them, right after merging this one.

So, this comment sits on top of the code block that performs the only allowed bailouts (i.e., when the statement is not a peek). After that, we start the statement logging, and then we definitely run the frontend peek sequencing all the way, without falling back to the old sequencing. Does this extra context, together with the follow-up PRs, make the comment clear?

Contributor

Ah that makes sense to me!

@ggevay ggevay merged commit 79c1fad into MaterializeInc:main Dec 18, 2025
131 checks passed