Commit be38639

Document guidelines for physical operator yielding
Start a policy for the behavior that physical operator streams should have, and drive improvements in this area to allow for timely cancellation. Connects to apache#14036 and is related to pull requests such as apache#14028.
1 parent 3dc212c commit be38639

1 file changed

+17
-0
lines changed


datafusion/physical-plan/src/execution_plan.rs

Lines changed: 17 additions & 0 deletions
@@ -260,13 +260,30 @@ pub trait ExecutionPlan: Debug + DisplayAs + Send + Sync {
 /// used.
 /// Thus, [`spawn`] is disallowed, and instead use [`SpawnedTask`].
 ///
+/// To enable timely cancellation, the [`Stream`] that is returned must not
+/// pin the CPU and must yield back to the tokio runtime regularly. This can
+/// be achieved by manually returning [`Poll::Pending`] in regular intervals,
+/// or the use of [`tokio::task::yield_now()`]. Cooperative scheduling may also
+/// be a way to achieve this goal, as [tokio support for it improves][coop].
+/// Determination for "regularly" may be made using a timer (being careful with
+/// the overhead-heavy syscall needed to take the time) or by counting rows or
+/// batches.
+///
+/// The goal is for `datafusion`-provided operator implementation to
+/// strive for [the guideline of not spending a long time without reaching
+/// an `await`/yield point][async-guideline]. Progress towards this goal
+/// is tracked partially by the cancellation benchmark.
+///
 /// For more details see [`SpawnedTask`], [`JoinSet`] and [`RecordBatchReceiverStreamBuilder`]
 /// for structures to help ensure all background tasks are cancelled.
 ///
 /// [`spawn`]: tokio::task::spawn
 /// [`JoinSet`]: tokio::task::JoinSet
 /// [`SpawnedTask`]: datafusion_common_runtime::SpawnedTask
 /// [`RecordBatchReceiverStreamBuilder`]: crate::stream::RecordBatchReceiverStreamBuilder
+/// [`Poll::Pending`]: std::task::Poll::Pending
+/// [coop]: https://github.com/tokio-rs/tokio/pull/7116
+/// [async-guideline]: https://ryhl.io/blog/async-what-is-blocking/
 ///
 /// # Implementation Examples
 ///
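
The "count batches, then return `Poll::Pending`" approach described in the added doc comment could look roughly like the sketch below. It is not code from this commit or from DataFusion: `YieldingBatches`, `CountingWaker`, and `drive` are hypothetical names, the "batches" are plain integers, and `poll_next` is written as an inherent method with the same shape as `Stream::poll_next` so that the example needs only the standard library. The small polling loop stands in for the tokio runtime.

```rust
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical operator state: emits `total` batches (plain integers here)
/// and yields to the scheduler after every `yield_every` batches.
struct YieldingBatches {
    next: usize,
    total: usize,
    yield_every: usize,
    since_yield: usize,
}

impl YieldingBatches {
    fn new(total: usize, yield_every: usize) -> Self {
        Self { next: 0, total, yield_every, since_yield: 0 }
    }

    /// Same shape as `Stream::poll_next` (an assumption made to stay std-only).
    fn poll_next(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<usize>> {
        if self.next >= self.total {
            return Poll::Ready(None);
        }
        if self.since_yield >= self.yield_every {
            // Budget used up: reset the counter, ask to be polled again, and
            // hand control back so the runtime can observe cancellation.
            self.since_yield = 0;
            cx.waker().wake_by_ref();
            return Poll::Pending;
        }
        let batch = self.next;
        self.next += 1;
        self.since_yield += 1;
        Poll::Ready(Some(batch))
    }
}

/// Waker that only counts wake-ups; a stand-in for a real runtime's waker.
struct CountingWaker(AtomicUsize);

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Polls the operator to completion.
/// Returns the emitted batches and the number of `Poll::Pending` yields.
fn drive(total: usize, yield_every: usize) -> (Vec<usize>, usize) {
    let counter = Arc::new(CountingWaker(AtomicUsize::new(0)));
    let waker = Waker::from(counter.clone());
    let mut cx = Context::from_waker(&waker);
    let mut op = YieldingBatches::new(total, yield_every);
    let (mut batches, mut yields) = (Vec::new(), 0);
    loop {
        match Pin::new(&mut op).poll_next(&mut cx) {
            Poll::Ready(Some(batch)) => batches.push(batch),
            Poll::Ready(None) => break,
            Poll::Pending => yields += 1,
        }
    }
    // Every yield must have scheduled a wake-up, or a real runtime would stall.
    assert_eq!(yields, counter.0.load(Ordering::SeqCst));
    (batches, yields)
}
```

Calling `drive(10, 4)` emits all ten batches and yields after the fourth and the eighth. The call to `wake_by_ref()` before returning `Poll::Pending` matters: without it a real runtime would have no reason to poll the stream again. A timer-based budget would replace the counter with an elapsed-time check, at the cost of the clock reads that the doc comment warns about.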
