Skip to content

Stage and Source Scheduler and Grouped Execution

Wenlei Xie edited this page Jul 27, 2019 · 9 revisions

Background

When Presto executes a query, it does so by breaking up the execution into a hierarchy of stages. Stage models a particular part of a distributed query plan. Each stage reads from data source and writes to an output buffer, and there are two types of sources:

  • Table scan source (sometimes referred to as partitioned source in code)
  • Remote source (sometimes referred to as unpartitioned source in code)

The following figure shows part of a query plan with two stages:

Two Stages

See more details about stage at https://prestodb.github.io/docs/current/overview/concepts.html#stage

Stage Scheduler

StageScheduler is responsible for the following jobs:

  • Create tasks for this stage
  • Schedule table splits to tasks

This section will discuss about stage scheduler before grouped execution is introduced. ScaledWriterScheduler is not discussed here.

The StageScheduler interface is quite concise (https://github.com/prestodb/presto/blob/b430a562679aab0a04df37b7a1e77cfd5d941c81/presto-main/src/main/java/com/facebook/presto/execution/scheduler/StageScheduler.java):

public interface StageScheduler
        extends Closeable
{
    /**
     * Schedules as much work as possible without blocking.
     * The schedule results is a hint to the query scheduler if and
     * when the stage scheduler should be invoked again.  It is
     * important to note that this is only a hint and the query
     * scheduler may call the schedule method at any time.
     */
    ScheduleResult schedule();

    @Override
    default void close() {}
}

To Be Continued....

Clone this wiki locally