SQL Optimization

Query optimization attempts to improve query performance and efficiency by altering the execution plan. This is a deep and complex field, and we can only scratch the surface here.

toyDB's query optimizer is very basic -- it only has a handful of rudimentary heuristic optimizations to illustrate how the process works. Real-world optimizers use much more sophisticated methods, including statistical analysis, cost estimation, adaptive execution, etc.

The optimizers live in the sql::planner::optimizer module. An optimizer implements the sql::planner::Optimizer trait: it takes a plan node sql::planner::Node (the root node in the plan) and returns an optimized node:

/// A node optimizer, which recursively transforms a plan node to make plan
/// execution more efficient where possible.
pub trait Optimizer: Debug + Send + Sync {
/// Optimizes a node, returning the optimized node.
fn optimize(&self, node: Node) -> Result<Node>;
}

Optimizations are always implemented as recursive node transformations. To help with this, Node has the helper methods Node::transform and Node::transform_expressions which recurse into a node or expression tree and call a given transformation closure on each node, as either pre-order or post-order transforms:

/// Recursively transforms query nodes depth-first by applying the given
/// closures before and after descending.
pub fn transform(
mut self,
before: &impl Fn(Self) -> Result<Self>,
after: &impl Fn(Self) -> Result<Self>,
) -> Result<Self> {
// Helper for transforming boxed nodes.
let xform = |mut node: Box<Node>| -> Result<Box<Node>> {
*node = node.transform(before, after)?;
Ok(node)
};
self = before(self)?;
self = match self {
Self::Aggregate { source, group_by, aggregates } => {
Self::Aggregate { source: xform(source)?, group_by, aggregates }
}
Self::Filter { source, predicate } => {
Self::Filter { source: xform(source)?, predicate }
}
Self::HashJoin { left, left_column, right, right_column, outer } => Self::HashJoin {
left: xform(left)?,
left_column,
right: xform(right)?,
right_column,
outer,
},
Self::Limit { source, limit } => Self::Limit { source: xform(source)?, limit },
Self::NestedLoopJoin { left, right, predicate, outer } => {
Self::NestedLoopJoin { left: xform(left)?, right: xform(right)?, predicate, outer }
}
Self::Offset { source, offset } => Self::Offset { source: xform(source)?, offset },
Self::Order { source, key } => Self::Order { source: xform(source)?, key },
Self::Projection { source, expressions, aliases } => {
Self::Projection { source: xform(source)?, expressions, aliases }
}
Self::Remap { source, targets } => Self::Remap { source: xform(source)?, targets },
Self::IndexLookup { .. }
| Self::KeyLookup { .. }
| Self::Nothing { .. }
| Self::Scan { .. }
| Self::Values { .. } => self,
};
self = after(self)?;
Ok(self)
}
/// Recursively transforms all node expressions by calling the given
/// closures on them before and after descending.
pub fn transform_expressions(
self,
before: &impl Fn(Expression) -> Result<Expression>,
after: &impl Fn(Expression) -> Result<Expression>,
) -> Result<Self> {
Ok(match self {
Self::Filter { source, mut predicate } => {
predicate = predicate.transform(before, after)?;
Self::Filter { source, predicate }
}
Self::NestedLoopJoin { left, right, predicate: Some(predicate), outer } => {
let predicate = Some(predicate.transform(before, after)?);
Self::NestedLoopJoin { left, right, predicate, outer }
}
Self::Order { source, mut key } => {
key = key
.into_iter()
.map(|(expr, dir)| expr.transform(before, after).map(|expr| (expr, dir)))
.try_collect()?;
Self::Order { source, key }
}
Self::Projection { source, mut expressions, aliases } => {
expressions = expressions
.into_iter()
.map(|expr| expr.transform(before, after))
.try_collect()?;
Self::Projection { source, expressions, aliases }
}
Self::Scan { table, alias, filter: Some(filter) } => {
let filter = Some(filter.transform(before, after)?);
Self::Scan { table, alias, filter }
}
Self::Values { mut rows } => {
rows = rows
.into_iter()
.map(|row| row.into_iter().map(|expr| expr.transform(before, after)).collect())
.try_collect()?;
Self::Values { rows }
}
Self::Aggregate { .. }
| Self::HashJoin { .. }
| Self::IndexLookup { .. }
| Self::KeyLookup { .. }
| Self::Limit { .. }
| Self::NestedLoopJoin { predicate: None, .. }
| Self::Nothing { .. }
| Self::Offset { .. }
| Self::Remap { .. }
| Self::Scan { filter: None, .. } => self,
})
}
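
The pre/post-order distinction matters: a pre-order closure sees each node before its children have been transformed, while a post-order closure sees it afterwards. A minimal sketch of this shape, using a hypothetical toy tree rather than toyDB's actual Node type:

```rust
// A hypothetical toy tree (not toyDB's Node) illustrating the
// before/after closures of a depth-first transform.
#[derive(Debug, Clone, PartialEq)]
enum Tree {
    Leaf(i64),
    Add(Box<Tree>, Box<Tree>),
}

impl Tree {
    /// Applies `before` on the way down and `after` on the way back up,
    /// mirroring the shape of Node::transform.
    fn transform(self, before: &impl Fn(Tree) -> Tree, after: &impl Fn(Tree) -> Tree) -> Tree {
        let node = match before(self) {
            Tree::Add(l, r) => Tree::Add(
                Box::new(l.transform(before, after)),
                Box::new(r.transform(before, after)),
            ),
            leaf => leaf,
        };
        after(node)
    }
}

/// A rewrite that folds an Add of two leaves into a single leaf. Applied
/// post-order, children are folded before their parent, so nested sums
/// collapse bottom-up.
fn fold_adds(tree: Tree) -> Tree {
    match tree {
        Tree::Add(l, r) => match (*l, *r) {
            (Tree::Leaf(a), Tree::Leaf(b)) => Tree::Leaf(a + b),
            (l, r) => Tree::Add(Box::new(l), Box::new(r)),
        },
        t => t,
    }
}
```

Running `fold_adds` as the post-order `after` closure fully folds `(1 + 2) + 3` into a single leaf; running it as the pre-order `before` closure only folds the inner sum, because the root is visited before its children have been folded.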

A technique that's often useful during optimization is to convert expressions into conjunctive normal form, i.e. "an AND of ORs". For example, the two following expressions are equivalent, but the latter is in conjunctive normal form (it's a chain of ANDs):

(a AND b) OR (c AND d)  →  (a OR c) AND (a OR d) AND (b OR c) AND (b OR d)
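
We can sanity-check this equivalence by brute force over all truth assignments (an illustrative check, not toyDB code):

```rust
/// Checks that (a AND b) OR (c AND d) equals its CNF expansion
/// (a OR c) AND (a OR d) AND (b OR c) AND (b OR d) for all 16
/// possible truth assignments.
fn cnf_equivalent() -> bool {
    (0..16u8).all(|bits| {
        let (a, b, c, d) = (bits & 1 != 0, bits & 2 != 0, bits & 4 != 0, bits & 8 != 0);
        let original = (a && b) || (c && d);
        let cnf = (a || c) && (a || d) && (b || c) && (b || d);
        original == cnf
    })
}
```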

This is useful because we can often move each AND operand independently around in the plan tree and still get the same result -- we'll see this in action later. Expressions are converted into conjunctive normal form via Expression::into_cnf, which is implemented using De Morgan's laws:

/// Converts the expression into conjunctive normal form, i.e. an AND of
/// ORs, useful during plan optimization. This is done by converting to
/// negation normal form and then applying De Morgan's distributive law.
pub fn into_cnf(self) -> Self {
use Expression::{And, Or};
let xform = |expr| {
// Can't use a single match; needs deref patterns.
let Or(lhs, rhs) = expr else {
return expr;
};
match (*lhs, *rhs) {
// (x AND y) OR z → (x OR z) AND (y OR z)
(And(l, r), rhs) => And(Or(l, rhs.clone().into()).into(), Or(r, rhs.into()).into()),
// x OR (y AND z) → (x OR y) AND (x OR z)
(lhs, And(l, r)) => And(Or(lhs.clone().into(), l).into(), Or(lhs.into(), r).into()),
// Otherwise, do nothing.
(lhs, rhs) => Or(lhs.into(), rhs.into()),
}
};
self.into_nnf().transform(&|e| Ok(xform(e)), &Ok).unwrap() // infallible
}
/// Converts the expression into conjunctive normal form as a vector of
/// ANDed expressions (instead of nested ANDs).
pub fn into_cnf_vec(self) -> Vec<Self> {
let mut cnf = Vec::new();
let mut stack = vec![self.into_cnf()];
while let Some(expr) = stack.pop() {
if let Self::And(lhs, rhs) = expr {
stack.extend([*rhs, *lhs]); // push lhs last to pop it first
} else {
cnf.push(expr);
}
}
cnf
}
/// Converts the expression into negation normal form. This pushes NOT
/// operators into the tree using De Morgan's laws, such that they're always
/// below other logical operators. It is a useful intermediate form for
/// applying other logical normalizations.
pub fn into_nnf(self) -> Self {
use Expression::{And, Not, Or};
let xform = |expr| {
// Can't use a single match; needs deref patterns.
let Not(inner) = expr else {
return expr;
};
match *inner {
// NOT (x AND y) → (NOT x) OR (NOT y)
And(lhs, rhs) => Or(Not(lhs).into(), Not(rhs).into()),
// NOT (x OR y) → (NOT x) AND (NOT y)
Or(lhs, rhs) => And(Not(lhs).into(), Not(rhs).into()),
// NOT NOT x → x
Not(inner) => *inner,
// Otherwise, do nothing.
expr => Not(expr.into()),
}
};
self.transform(&|e| Ok(xform(e)), &Ok).unwrap() // infallible
}
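
The stack-based flattening of into_cnf_vec can be sketched on a hypothetical toy enum (much simpler than toyDB's Expression):

```rust
// Hypothetical toy expression enum illustrating the AND-flattening of
// into_cnf_vec.
#[derive(Debug, PartialEq)]
enum Expr {
    Var(char),
    And(Box<Expr>, Box<Expr>),
}

/// Flattens nested ANDs into a left-to-right vector of operands, using
/// an explicit stack instead of recursion.
fn into_and_vec(expr: Expr) -> Vec<Expr> {
    let mut out = Vec::new();
    let mut stack = vec![expr];
    while let Some(expr) = stack.pop() {
        if let Expr::And(lhs, rhs) = expr {
            stack.extend([*rhs, *lhs]); // push lhs last to pop it first
        } else {
            out.push(expr);
        }
    }
    out
}
```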

We'll have a brief look at all of toyDB's optimizers, which are listed here in the order they're applied:

/// The set of optimizers, and the order in which they are applied.
pub static OPTIMIZERS: LazyLock<Vec<Box<dyn Optimizer>>> = LazyLock::new(|| {
vec![
Box::new(ConstantFolding),
Box::new(FilterPushdown),
Box::new(IndexLookup),
Box::new(HashJoin),
Box::new(ShortCircuit),
]
});

Test scripts for the optimizers are in src/sql/testscripts/optimizers, and show how query plans evolve as each optimizer is applied.

Constant Folding

The ConstantFolding optimizer performs constant folding. This pre-evaluates constant expressions in the plan during planning, instead of evaluating them for every row during execution.

/// Folds constant expressions by pre-evaluating them once now, instead of
/// re-evaluating them for every row during execution.
#[derive(Debug)]
pub struct ConstantFolding;

For example, consider the query SELECT 1 + 2 * 3 - foo FROM bar. There is no point in re-evaluating 1 + 2 * 3 for every row in bar, because the result is always the same, so we can just evaluate this once during planning, transforming the expression into 7 - foo.

Concretely, this plan:

Select
└─ Projection: 1 + 2 * 3 - bar.foo
   └─ Scan: bar

Should be transformed into this plan:

Select
└─ Projection: 7 - bar.foo
   └─ Scan: bar

To do this, ConstantFolding simply checks whether an Expression tree contains an Expression::Column node -- if it doesn't, then it must be a constant expression (since a column reference is the only dynamic value in an expression), and we can evaluate it with a None input row and replace the original expression node with an Expression::Constant node.

This is done recursively for each plan node, and recursively for each expression node, so it applies to SELECT, WHERE, ORDER BY, and all other parts of the query. Notably, it uses a post-order expression transform: it starts at the expression leaf nodes and transforms each node on the way back up the tree, which allows it to iteratively fold the constant parts of each branch as far as possible.

impl Optimizer for ConstantFolding {
fn optimize(&self, node: Node) -> Result<Node> {
// Recursively transform expressions in the node tree. Post-order to
// partially fold child expressions as far as possible, and avoid
// quadratic costs.
node.transform(&|node| node.transform_expressions(&Ok, &Self::fold), &Ok)
}
}
impl ConstantFolding {
/// Folds constant expressions in a node.
pub fn fold(mut expr: Expression) -> Result<Expression> {
use Expression::*;
use Value::*;
// If the expression is constant, evaluate it.
//
// This is a very simple approach, which doesn't handle more complex
// cases such as 1 + a - 2 (which would require rearranging the
// expression as 1 - 2 + a to evaluate the 1 - 2 branch).
//
// TODO: consider doing something better.
if !expr.contains(&|expr| matches!(expr, Column(_))) {
return expr.evaluate(None).map(Constant);
}

ConstantFolding also short-circuits logical expressions. For example, the expression foo AND FALSE is always FALSE regardless of what foo is, so we can replace it with FALSE:

// If the expression is a logical operator, and one of the sides is
// constant, we may be able to evaluate it even if it has a column
// reference. For example, a AND FALSE is always FALSE, regardless of
// what a is.
expr = match expr {
And(lhs, rhs) => match (*lhs, *rhs) {
// If either side of an AND is false, the AND is false.
(Constant(Boolean(false)), _) | (_, Constant(Boolean(false))) => {
Constant(Boolean(false))
}
// If either side of an AND is true, the AND is redundant.
(Constant(Boolean(true)), expr) | (expr, Constant(Boolean(true))) => expr,
(lhs, rhs) => And(lhs.into(), rhs.into()),
},
Or(lhs, rhs) => match (*lhs, *rhs) {
// If either side of an OR is true, the OR is true.
(Constant(Boolean(true)), _) | (_, Constant(Boolean(true))) => {
Constant(Boolean(true))
}
// If either side of an OR is false, the OR is redundant.
(Constant(Boolean(false)), expr) | (expr, Constant(Boolean(false))) => expr,
(lhs, rhs) => Or(lhs.into(), rhs.into()),
},
expr => expr,
};

As the code comment mentions though, this doesn't fold optimally: it doesn't attempt to rearrange expressions, which would require knowledge of precedence rules. For example, (1 + foo) - 2 could be folded into foo - 1 by first rearranging it as foo + (1 - 2), but we don't do this currently.
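
The core folding rule (post-order, evaluate any subtree without column references) can be sketched with a hypothetical toy expression type:

```rust
// A hypothetical toy expression type (much simpler than toyDB's
// Expression) showing the core constant-folding rule.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Const(i64),
    Column(usize),
    Add(Box<Expr>, Box<Expr>),
}

/// Checks whether the expression references any columns.
fn contains_column(expr: &Expr) -> bool {
    match expr {
        Expr::Column(_) => true,
        Expr::Add(l, r) => contains_column(l) || contains_column(r),
        Expr::Const(_) => false,
    }
}

/// Evaluates a constant expression. Only called when there are no columns.
fn eval_const(expr: &Expr) -> i64 {
    match expr {
        Expr::Const(v) => *v,
        Expr::Add(l, r) => eval_const(l) + eval_const(r),
        Expr::Column(_) => unreachable!("no column references"),
    }
}

/// Post-order fold: transform children first, then replace the node with
/// a constant if the whole subtree is constant.
fn fold(expr: Expr) -> Expr {
    let expr = match expr {
        Expr::Add(l, r) => Expr::Add(Box::new(fold(*l)), Box::new(fold(*r))),
        e => e,
    };
    if !contains_column(&expr) {
        return Expr::Const(eval_const(&expr));
    }
    expr
}
```

As in the 1 + 2 * 3 - foo example, the constant branch folds into a single constant while the column reference is left alone.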

Filter Pushdown

The FilterPushdown optimizer attempts to push filter predicates as far down into the plan as possible, to reduce the number of rows each node has to process.

/// Pushes filter predicates down into child nodes where possible. In
/// particular, this can perform filtering during storage scans (below Raft),
/// instead of reading and transmitting all rows across the network before
/// filtering, by pushing a predicate from a Filter node down into a Scan node.
#[derive(Debug)]
pub struct FilterPushdown;

Recall the movies query plan from the planning section:

Select
└─ Order: movies.released desc
   └─ Projection: movies.title, movies.released, genres.name as genre
      └─ Filter: movies.released >= 2000
         └─ NestedLoopJoin: inner on movies.genre_id = genres.id
            ├─ Scan: movies
            └─ Scan: genres

Even though we're filtering on movies.released >= 2000, the movies Scan node still has to read all rows from disk and send them via Raft, and the NestedLoopJoin node still has to join all of them. It would be nice if we could push this filtering down into the NestedLoopJoin and Scan nodes to avoid this extra work, and that is exactly what FilterPushdown does.

The only plan nodes that have predicates that can be pushed down are Filter nodes and NestedLoopJoin nodes, so we recurse through the plan tree and look for these nodes, attempting to push down.

impl Optimizer for FilterPushdown {
fn optimize(&self, node: Node) -> Result<Node> {
// Push down before descending, so we can keep recursively pushing down.
node.transform(&|node| Ok(Self::push_filters(node)), &Ok)
}
}
impl FilterPushdown {
/// Pushes filter predicates down into child nodes where possible.
fn push_filters(mut node: Node) -> Node {
node = Self::maybe_push_filter(node);
node = Self::maybe_push_join(node);
node
}

When it encounters a Filter node, it extracts the predicate and attempts to push it down into the filter's source node:

/// Pushes a filter node predicate down into its source, if possible.
fn maybe_push_filter(node: Node) -> Node {
let Node::Filter { mut source, predicate } = node else {
return node;
};
// Attempt to push the filter into the source, or return the original.
if let Some(predicate) = Self::push_into(predicate, &mut source) {
return Node::Filter { source, predicate };
}
// Push succeeded, return the source that was pushed into. When we
// replace this filter node with the source node, Node.transform() will
// skip the source node since it now takes the place of the original
// filter node. Transform the source manually.
Self::push_filters(*source)
}

If the source node is a Filter, NestedLoopJoin, or Scan node, then we can push the predicate down into it by ANDing it with the existing predicate (if any).

/// Pushes an expression into a node if possible. Otherwise, returns the
/// unpushed expression.
fn push_into(expr: Expression, target: &mut Node) -> Option<Expression> {
match target {
Node::Filter { predicate, .. } => {
// Temporarily replace the predicate to take ownership.
let rhs = std::mem::replace(predicate, Expression::Constant(Value::Null));
*predicate = Expression::And(expr.into(), rhs.into());
}
Node::NestedLoopJoin { predicate, .. } => {
*predicate = match predicate.take() {
Some(predicate) => Some(Expression::And(expr.into(), predicate.into())),
None => Some(expr),
};
}
Node::Scan { filter, .. } => {
*filter = match filter.take() {
Some(filter) => Some(Expression::And(expr.into(), filter.into())),
None => Some(expr),
};
}
// Unable to push down, just return the original expression.
_ => return Some(expr),
}
None
}

In our case, we were able to push the Filter into the NestedLoopJoin, and our plan now looks like this:

Select
└─ Order: movies.released desc
   └─ Projection: movies.title, movies.released, genres.name as genre
      └─ NestedLoopJoin: inner on movies.genre_id = genres.id AND movies.released >= 2000
         ├─ Scan: movies
         └─ Scan: genres

But we're still not done: we'd like to push movies.released >= 2000 all the way down into the movies Scan node. Pushdown for join nodes is a little trickier, because we can only push down the parts of the predicate that reference just one of the two sources.

We first have to convert the expression into conjunctive normal form, i.e. an AND of ORs, as we've discussed previously. This allows us to examine and push down each AND operand in isolation, because it has the same effect regardless of whether it is evaluated in the NestedLoopJoin node or one of the source nodes. Our expression is already in conjunctive normal form, though.

We then look at each AND operand and check which sides of the join its column references belong to. If it only references one side, the operand can be pushed down into that source. We also make some effort here to copy primary/foreign key constants across to both sides, but we'll gloss over that.

// Pushes down parts of a join predicate into the left or right sources
// where possible.
fn maybe_push_join(node: Node) -> Node {
let Node::NestedLoopJoin { mut left, mut right, predicate: Some(predicate), outer } = node
else {
return node;
};
// Convert the predicate into conjunctive normal form (an AND vector).
let cnf = predicate.into_cnf_vec();
// Push down expressions that don't reference both sources. Constant
// expressions can be pushed down into both.
let (mut push_left, mut push_right, mut predicate) = (Vec::new(), Vec::new(), Vec::new());
for expr in cnf {
let (mut ref_left, mut ref_right) = (false, false);
expr.walk(&mut |expr| {
if let Expression::Column(index) = expr {
ref_left = ref_left || *index < left.columns();
ref_right = ref_right || *index >= left.columns();
}
!(ref_left && ref_right) // exit once both are referenced
});
match (ref_left, ref_right) {
(true, true) => predicate.push(expr),
(true, false) => push_left.push(expr),
(false, true) => push_right.push(expr),
(false, false) => {
push_left.push(expr.clone());
push_right.push(expr);
}
}
}
// In the remaining cross-source expressions, look for equijoins where
// one side also has constant value lookups. In this case we can copy
// the constant lookups to the other side, to allow index lookups. This
// commonly happens when joining a foreign key (which is indexed) on a
// primary key, and we want to make use of the foreign key index, e.g.:
//
// SELECT m.name, g.name FROM movies m JOIN genres g ON m.genre_id = g.id AND g.id = 7;
let left_lookups: HashMap<usize, usize> = push_left // column → push_left index
.iter()
.enumerate()
.filter_map(|(i, expr)| expr.is_column_lookup().map(|column| (column, i)))
.collect();
let right_lookups: HashMap<usize, usize> = push_right // column → push_right index
.iter()
.enumerate()
.filter_map(|(i, expr)| expr.is_column_lookup().map(|column| (column, i)))
.collect();
for expr in &predicate {
// Find equijoins.
let Expression::Equal(lhs, rhs) = expr else { continue };
let Expression::Column(mut l) = **lhs else { continue };
let Expression::Column(mut r) = **rhs else { continue };
// The lhs may be a reference to the right source; swap them.
if l > r {
(l, r) = (r, l)
}
// Check if either side is a column lookup, and copy it over.
if let Some(expr) = left_lookups.get(&l).map(|i| push_left[*i].clone()) {
push_right.push(expr.replace_column(l, r));
}
if let Some(expr) = right_lookups.get(&r).map(|i| push_right[*i].clone()) {
push_left.push(expr.replace_column(r, l));
}
}
// Push predicates down into the sources if possible.
if let Some(expr) = Expression::and_vec(push_left) {
if let Some(expr) = Self::push_into(expr, &mut left) {
// Pushdown failed, put it back into the join predicate.
predicate.push(expr)
}
}
if let Some(mut expr) = Expression::and_vec(push_right) {
// Right columns have indexes in the joined row; shift them left.
expr = expr.shift_column(-(left.columns() as isize));
if let Some(mut expr) = Self::push_into(expr, &mut right) {
// Pushdown failed, undo the column index shift.
expr = expr.shift_column(left.columns() as isize);
predicate.push(expr)
}
}
// Leave any remaining predicates in the join node.
let predicate = Expression::and_vec(predicate);
Node::NestedLoopJoin { left, right, predicate, outer }
}

This allows us to push down the movies.released >= 2000 predicate into the corresponding Scan node, significantly reducing the amount of data transferred across Raft:

Select
└─ Order: movies.released desc
   └─ Projection: movies.title, movies.released, genres.name as genre
      └─ NestedLoopJoin: inner on movies.genre_id = genres.id
         ├─ Scan: movies (released >= 2000)
         └─ Scan: genres
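
The per-conjunct classification step can be sketched in isolation. Here each conjunct is represented just by the column indexes it references (a hypothetical simplification; the real code walks Expression trees):

```rust
/// Which join side a conjunct's column references fall on. Columns with
/// index < left_columns belong to the left source, the rest to the right.
#[derive(Debug, PartialEq)]
enum Side {
    Left,    // push into the left source
    Right,   // push into the right source
    Both,    // must stay in the join predicate
    Neither, // constant: push into both sides
}

/// Classifies a conjunct by the join sides its column references touch.
fn classify(column_refs: &[usize], left_columns: usize) -> Side {
    let left = column_refs.iter().any(|&c| c < left_columns);
    let right = column_refs.iter().any(|&c| c >= left_columns);
    match (left, right) {
        (true, true) => Side::Both,
        (true, false) => Side::Left,
        (false, true) => Side::Right,
        (false, false) => Side::Neither,
    }
}
```

For a 3-column left source, a conjunct referencing only column 4 goes right, one referencing columns 0 and 4 (like an equijoin condition) must stay in the join.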

Index Lookups

The IndexLookup optimizer uses primary key or secondary index lookups instead of full table scans where possible.

/// Uses a primary key or secondary index lookup where possible.
#[derive(Debug)]
pub struct IndexLookup;

The optimizer itself is fairly straightforward. It assumes that FilterPushdown has already pushed predicates down into Scan nodes, so it only needs to examine these. It converts the predicate into conjunctive normal form, and looks for any parts that are direct column lookups -- i.e. column = value (possibly a long OR chain of these).

If it finds any, and the column is either a primary key or secondary index column, then we convert the Scan node into either a KeyLookup or IndexLookup node respectively. If there are any further AND predicates remaining, we add a parent Filter node to keep these predicates.

For example, the following plan:

Select
└─ Scan: movies ((id = 1 OR id = 7 OR id = 3) AND released >= 2000)

Will be transformed into one that does individual key lookups rather than a full table scan:

Select
└─ Filter: movies.released >= 2000
   └─ KeyLookup: movies (1, 3, 7)

The code is as outlined above:

impl Optimizer for IndexLookup {
fn optimize(&self, node: Node) -> Result<Node> {
// Recursively transform nodes in the plan tree, rewriting filtered
// scans into key or index lookups where possible.
node.transform(&|node| Ok(Self::index_lookup(node)), &Ok)
}
}
impl IndexLookup {
/// Rewrites a filtered scan node into a key or index lookup if possible.
fn index_lookup(mut node: Node) -> Node {
// Only handle scan filters. Assume FilterPushdown has pushed filters
// into scan nodes first.
let Node::Scan { table, alias, filter: Some(filter) } = node else {
return node;
};
// Convert the filter into conjunctive normal form (a list of ANDs).
let mut cnf = filter.clone().into_cnf_vec();
// Find the first expression that's either a primary key or secondary
// index lookup. We could be more clever here, but this is fine.
let Some((i, column)) = cnf.iter().enumerate().find_map(|(i, expr)| {
expr.is_column_lookup()
.filter(|&c| c == table.primary_key || table.columns[c].index)
.map(|column| (i, column))
}) else {
// No index lookups found, return the original node.
return Node::Scan { table, alias, filter: Some(filter) };
};
// Extract the lookup values and expression from the cnf vector.
let values = cnf.remove(i).into_column_values(column);
// Build the primary key or secondary index lookup node.
if column == table.primary_key {
node = Node::KeyLookup { table, keys: values, alias };
} else {
node = Node::IndexLookup { table, column, values, alias };
}
// If there's any remaining CNF expressions, add a filter node for them.
if let Some(predicate) = Expression::and_vec(cnf) {
node = Node::Filter { source: Box::new(node), predicate };
}
node
}
}

This is helped by Expression::is_column_lookup() and Expression::into_column_values():

/// Checks if an expression is a single column lookup (i.e. a disjunction of
/// = or IS NULL/NAN for a single column), returning the column index.
pub fn is_column_lookup(&self) -> Option<usize> {
use Expression::*;
match &self {
// Column/constant equality can use index lookups. NULL and NaN are
// handled in into_column_values().
Equal(lhs, rhs) => match (lhs.as_ref(), rhs.as_ref()) {
(Column(c), Constant(_)) | (Constant(_), Column(c)) => Some(*c),
_ => None,
},
// IS NULL and IS NAN can use index lookups.
Is(expr, _) => match expr.as_ref() {
Column(c) => Some(*c),
_ => None,
},
// All OR branches must be lookups on the same column:
// id = 1 OR id = 2 OR id = 3.
Or(lhs, rhs) => match (lhs.is_column_lookup(), rhs.is_column_lookup()) {
(Some(l), Some(r)) if l == r => Some(l),
_ => None,
},
_ => None,
}
}
/// Extracts column lookup values for the given column. Panics if the
/// expression isn't a lookup of the given column, i.e. is_column_lookup()
/// must return true for the expression.
pub fn into_column_values(self, index: usize) -> Vec<Value> {
use Expression::*;
match self {
Equal(lhs, rhs) => match (*lhs, *rhs) {
(Column(column), Constant(value)) | (Constant(value), Column(column)) => {
assert_eq!(column, index, "unexpected column");
// NULL and NAN index lookups are for IS NULL and IS NAN.
// Equality shouldn't match anything, return empty vec.
if value.is_undefined() { Vec::new() } else { vec![value] }
}
(lhs, rhs) => panic!("unexpected expression {:?}", Equal(lhs.into(), rhs.into())),
},
// IS NULL and IS NAN can use index lookups.
Is(expr, value) => match *expr {
Column(column) => {
assert_eq!(column, index, "unexpected column");
vec![value]
}
expr => panic!("unexpected expression {expr:?}"),
},
Or(lhs, rhs) => {
let mut values = lhs.into_column_values(index);
values.extend(rhs.into_column_values(index));
values
}
expr => panic!("unexpected expression {expr:?}"),
}
}
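
The recursive lookup detection can be sketched on a hypothetical toy enum where equality is already known to be column = constant:

```rust
// Hypothetical toy expression enum for sketching column lookup detection.
#[derive(Debug)]
enum Expr {
    Equal(usize, i64), // column = constant
    Or(Box<Expr>, Box<Expr>),
    Other, // anything else (ranges, function calls, ...)
}

/// Returns the column index if the expression is a disjunction of
/// equality lookups on a single column, mirroring is_column_lookup.
fn column_lookup(expr: &Expr) -> Option<usize> {
    match expr {
        Expr::Equal(column, _) => Some(*column),
        // All OR branches must look up the same column.
        Expr::Or(lhs, rhs) => match (column_lookup(lhs), column_lookup(rhs)) {
            (Some(l), Some(r)) if l == r => Some(l),
            _ => None,
        },
        Expr::Other => None,
    }
}
```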

Hash Join

The HashJoin optimizer will replace a NestedLoopJoin with a HashJoin where possible.

/// Uses a hash join instead of a nested loop join for single-column equijoins.
#[derive(Debug)]
pub struct HashJoin;

A nested loop join is a very inefficient O(n²) algorithm, which iterates over all rows in the right source for each row in the left source to see if they match. However, it is completely general, and can join on arbitrarily complex predicates.

In the common case where the join predicate is an equality comparison such as movies.genre_id = genres.id (i.e. an equijoin), we can instead use a hash join. This scans the right source once and builds an in-memory hash table from it, then for each left row looks up matching right rows in the hash table. This is a much more efficient O(n) algorithm.
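
The algorithm itself fits in a few lines. This is an illustrative sketch on plain vectors, not toyDB's executor (which also handles outer joins, NULLs, and errors):

```rust
use std::collections::HashMap;

/// A minimal inner hash join on single-column i64 keys. Builds a hash
/// table over the right rows, then probes it once per left row:
/// O(n) instead of the nested loop join's O(n²).
fn hash_join(left: &[(i64, &str)], right: &[(i64, &str)]) -> Vec<(String, String)> {
    // Build phase: index right rows by join key.
    let mut table: HashMap<i64, Vec<&str>> = HashMap::new();
    for &(key, value) in right {
        table.entry(key).or_default().push(value);
    }
    // Probe phase: emit a joined row for each match.
    let mut out = Vec::new();
    for &(key, value) in left {
        for matched in table.get(&key).into_iter().flatten() {
            out.push((value.to_string(), matched.to_string()));
        }
    }
    out
}
```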

In our previous movie example, we are in fact doing an equijoin:

Select
└─ Order: movies.released desc
   └─ Projection: movies.title, movies.released, genres.name as genre
      └─ NestedLoopJoin: inner on movies.genre_id = genres.id
         ├─ Scan: movies (released >= 2000)
         └─ Scan: genres

And so our NestedLoopJoin can be replaced by a HashJoin:

Select
└─ Order: movies.released desc
   └─ Projection: movies.title, movies.released, genres.name as genre
      └─ HashJoin: inner on movies.genre_id = genres.id
         ├─ Scan: movies (released >= 2000)
         └─ Scan: genres

The HashJoin optimizer is extremely simple: if the join predicate is an equijoin, use a hash join. This isn't always a good idea (the right source can be huge and we can run out of memory for the hash table), but we keep it simple.

impl Optimizer for HashJoin {
fn optimize(&self, node: Node) -> Result<Node> {
node.transform(&|node| Ok(Self::hash_join(node)), &Ok)
}
}
impl HashJoin {
/// Rewrites a nested loop join into a hash join if possible.
pub fn hash_join(node: Node) -> Node {
let Node::NestedLoopJoin {
left,
right,
predicate: Some(Expression::Equal(lhs, rhs)),
outer,
} = node
else {
return node;
};
match (*lhs, *rhs) {
// If this is a single-column equijoin, use a hash join.
(Expression::Column(mut left_column), Expression::Column(mut right_column)) => {
// The LHS column may be a column in the right table; swap them.
if right_column < left_column {
(left_column, right_column) = (right_column, left_column);
}
// The NestedLoopJoin predicate uses column indexes in the
// joined row, while the HashJoin uses column indexes in each
// individual table. Adjust the RHS column reference.
right_column -= left.columns();
Node::HashJoin { left, left_column, right, right_column, outer }
}
// Otherwise, retain the nested loop join.
(lhs, rhs) => {
let predicate = Some(Expression::Equal(lhs.into(), rhs.into()));
Node::NestedLoopJoin { left, right, predicate, outer }
}
}
}
}

Of course there are many other join algorithms out there, and one of the harder problems in SQL optimization is how to efficiently perform large N-way multijoins. We don't attempt to tackle these problems here -- the HashJoin optimizer is just a very simple example of such join optimization.

Short Circuiting

The ShortCircuit optimizer looks for nodes that can't possibly do any useful work, and either removes them from the plan or replaces them with trivial nodes that don't do anything. It is similar in spirit to the ConstantFolding optimizer, but works on plan nodes rather than expression nodes.

/// Short-circuits useless nodes and expressions (for example a Filter node that
/// always evaluates to false), by removing them and/or replacing them with
/// Nothing nodes that yield no rows.
#[derive(Debug)]
pub struct ShortCircuit;

For example, Filter nodes with a TRUE predicate won't actually filter anything:

Select
└─ Filter: true
   └─ Scan: movies

So we can just remove them:

Select
└─ Scan: movies

Similarly, Filter nodes with a FALSE predicate will never emit anything:

Select
└─ Filter: false
   └─ Scan: movies

There's no point doing a scan in this case, so we can just replace it with a Nothing node that does no work and doesn't emit anything:

Select
└─ Nothing

The optimizer tries to find a bunch of such patterns. This can also tidy up query plans a fair bit by removing unnecessary cruft.

impl Optimizer for ShortCircuit {
fn optimize(&self, node: Node) -> Result<Node> {
// Post-order transform, to pull Nothing nodes upwards in the tree.
node.transform(&Ok, &|node| Ok(Self::short_circuit(node)))
}
}
impl ShortCircuit {
/// Short-circuits useless nodes. Assumes the node has already been
/// optimized by ConstantFolding.
fn short_circuit(mut node: Node) -> Node {
use Expression::*;
use Value::*;
node = match node {
// Filter nodes that always yield true are unnecessary: remove them.
Node::Filter { source, predicate: Constant(Boolean(true)) } => *source,
// Predicates that always yield true are unnecessary: remove them.
Node::Scan { table, filter: Some(Constant(Boolean(true))), alias } => {
Node::Scan { table, filter: None, alias }
}
Node::NestedLoopJoin {
left,
right,
predicate: Some(Constant(Boolean(true))),
outer,
} => Node::NestedLoopJoin { left, right, predicate: None, outer },
// Remove noop projections that simply pass through the source columns.
Node::Projection { source, expressions, aliases }
if source.columns() == expressions.len()
&& aliases.iter().all(|alias| *alias == Label::None)
&& expressions
.iter()
.enumerate()
.all(|(i, expr)| *expr == Expression::Column(i)) =>
{
*source
}
node => node,
};
// Short-circuit nodes that don't produce anything by replacing them
// with a Nothing node.
let is_empty = match &node {
Node::Filter { predicate: Constant(Boolean(false) | Null), .. } => true,
Node::IndexLookup { values, .. } if values.is_empty() => true,
Node::KeyLookup { keys, .. } if keys.is_empty() => true,
Node::Limit { limit: 0, .. } => true,
Node::NestedLoopJoin { predicate: Some(Constant(Boolean(false) | Null)), .. } => true,
Node::Scan { filter: Some(Constant(Boolean(false) | Null)), .. } => true,
Node::Values { rows } if rows.is_empty() => true,
// Nodes that pull from a Nothing node can't produce anything.
//
// NB: does not short-circuit aggregation, since an aggregation over 0
// rows should produce a result.
Node::Filter { source, .. }
| Node::HashJoin { left: source, .. }
| Node::HashJoin { right: source, .. }
| Node::NestedLoopJoin { left: source, .. }
| Node::NestedLoopJoin { right: source, .. }
| Node::Offset { source, .. }
| Node::Order { source, .. }
| Node::Projection { source, .. }
if matches!(**source, Node::Nothing { .. }) =>
{
true
}
_ => false,
};
if is_empty {
let columns = (0..node.columns()).map(|i| node.column_label(i)).collect();
return Node::Nothing { columns };
}
node
}
}
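
The two Filter rules can be sketched with a hypothetical toy plan type, assuming ConstantFolding has already folded constant predicates:

```rust
// Hypothetical toy plan type illustrating the two Filter rules above.
#[derive(Debug, PartialEq)]
enum Plan {
    Scan,
    Nothing,
    Filter { source: Box<Plan>, predicate: Pred },
}

#[derive(Debug, PartialEq)]
enum Pred {
    Const(bool), // already folded by constant folding
    Dynamic,     // references row columns, must stay
}

/// A constant-true filter is a no-op: remove it. A constant-false filter
/// never emits anything: replace it (and its source) with Nothing.
fn short_circuit(node: Plan) -> Plan {
    match node {
        Plan::Filter { source, predicate: Pred::Const(true) } => *source,
        Plan::Filter { predicate: Pred::Const(false), .. } => Plan::Nothing,
        node => node,
    }
}
```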


SQL Planning   |   SQL Execution