@@ -709,19 +709,34 @@ pub trait ScalarUDFImpl: Debug + DynEq + DynHash + Send + Sync {
709709 Ok(ExprSimplifyResult::Original(args))
710710 }
711711
712- /// Returns the [preimage] for this function and the specified scalar value, if any.
712+ /// Returns the preimage for this function and the specified scalar
713+ /// expression, if any.
713714 ///
714- /// A preimage is a single contiguous [`Interval`] of values where the function
715- /// will always return `lit_value`
715+ /// # Return Value
716716 ///
717717 /// Implementations should return intervals with an inclusive lower bound and
718718 /// exclusive upper bound.
719719 ///
720- /// This rewrite is described in the [ClickHouse Paper] and is particularly
721- /// useful for simplifying expressions `date_part` or equivalent functions. The
722- /// idea is that if you have an expression like `date_part(YEAR, k) = 2024` and you
723- /// can find a [preimage] for `date_part(YEAR, k)`, which is the range of dates
724- /// covering the entire year of 2024. Thus, you can rewrite the expression to `k
720+ /// # Background
721+ ///
722+ /// A [preimage] is a single contiguous [`Interval`] of the function's
723+ /// argument where the function will return a single literal (constant)
724+ /// value. This can also be thought of as a form of interval containment.
725+ ///
726+ /// Using a preimage to rewrite predicates is described in the [ClickHouse
727+ /// Paper]:
728+ ///
729+ /// > some functions can compute the preimage of a given function result.
730+ /// > This is used to replace comparisons of constants with function calls
731+ /// > on the key columns by comparing the key column value with the preimage.
732+ /// > For example, `toYear(k) = 2024` can be replaced by
733+ /// > `k >= 2024-01-01 && k < 2025-01-01`
734+ ///
735+ /// As mentioned above, this rewrite is particularly useful for simplifying
736+ /// expressions such as `date_part` or equivalent functions. For an
737+ /// expression like `date_part(YEAR, k) = 2024`, if there is a
738+ /// [preimage] for `date_part(YEAR, k)` (the range of dates covering
739+ /// the entire year of 2024), you can rewrite the expression to
725740 /// `k >= '2024-01-01' AND k < '2025-01-01'`, which is often more optimizable.
726741 ///
727742 /// [ClickHouse Paper]: https://www.vldb.org/pvldb/vol17/p3731-schulze.pdf
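
To make the documented rewrite concrete, here is a minimal standalone sketch of a preimage for a year-extraction function. It does not use DataFusion's `Interval` or the `ScalarUDFImpl` trait: `Date`, `DateRange`, `year_of`, `year_preimage`, and `contains` are hypothetical names introduced only for illustration, and dates are modelled as `(year, month, day)` tuples. It follows the convention stated in the doc comment: inclusive lower bound, exclusive upper bound.

```rust
/// Hypothetical date encoding: (year, month, day). Tuples compare
/// lexicographically, which matches chronological order here.
type Date = (i32, u32, u32);

/// Hypothetical half-open range [lower, upper); a stand-in for an interval
/// type, not DataFusion's `Interval`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct DateRange {
    lower: Date, // inclusive
    upper: Date, // exclusive
}

/// Stand-in for `date_part(YEAR, k)`.
fn year_of(k: Date) -> i32 {
    k.0
}

/// Preimage of `year_of` for the literal `year`: the range of dates `k`
/// for which `year_of(k) == year`. Returns `None` when the exclusive upper
/// bound cannot be represented (year overflow), i.e. "no preimage".
fn year_preimage(year: i32) -> Option<DateRange> {
    let next = year.checked_add(1)?;
    Some(DateRange {
        lower: (year, 1, 1),
        upper: (next, 1, 1),
    })
}

/// The rewritten predicate: `k >= lower AND k < upper`.
fn contains(range: &DateRange, k: Date) -> bool {
    range.lower <= k && k < range.upper
}
```

With these definitions, `year_of(k) == 2024` and `contains(&year_preimage(2024).unwrap(), k)` agree for every date `k`, which is what allows the predicate on the function result to be replaced by a range predicate directly on the column.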