Commit 87b77bb

alamb and findepi authored
Improve Signature and comparison_coercion documentation (#13840)
* Improve Signature documentation more
* Apply suggestions from code review

Co-authored-by: Piotr Findeisen <[email protected]>

---------

Co-authored-by: Piotr Findeisen <[email protected]>
1 parent 31acf45 commit 87b77bb

2 files changed: +42 -13 lines changed

datafusion/expr-common/src/signature.rs

Lines changed: 21 additions & 12 deletions
@@ -103,9 +103,13 @@ pub enum TypeSignature {
     /// A function such as `concat` is `Variadic(vec![DataType::Utf8,
     /// DataType::LargeUtf8])`
     Variadic(Vec<DataType>),
-    /// The acceptable signature and coercions rules to coerce arguments to this
-    /// signature are special for this function. If this signature is specified,
-    /// DataFusion will call `ScalarUDFImpl::coerce_types` to prepare argument types.
+    /// The acceptable signature and coercions rules are special for this
+    /// function.
+    ///
+    /// If this signature is specified,
+    /// DataFusion will call [`ScalarUDFImpl::coerce_types`] to prepare argument types.
+    ///
+    /// [`ScalarUDFImpl::coerce_types`]: https://docs.rs/datafusion/latest/datafusion/logical_expr/trait.ScalarUDFImpl.html#method.coerce_types
     UserDefined,
     /// One or more arguments with arbitrary types
     VariadicAny,
@@ -123,24 +127,29 @@ pub enum TypeSignature {
     /// One or more arguments belonging to the [`TypeSignatureClass`], in order.
     ///
     /// For example, `Coercible(vec![logical_float64()])` accepts
-    /// arguments like `vec![DataType::Int32]` or `vec![DataType::Float32]`
+    /// arguments like `vec![Int32]` or `vec![Float32]`
     /// since i32 and f32 can be cast to f64
     ///
     /// For functions that take no arguments (e.g. `random()`) see [`TypeSignature::Nullary`].
     Coercible(Vec<TypeSignatureClass>),
-    /// One or more arguments that can be "compared"
+    /// One or more arguments coercible to a single, comparable type.
+    ///
+    /// Each argument will be coerced to a single type using the
+    /// coercion rules described in [`comparison_coercion_numeric`].
+    ///
+    /// # Examples
+    ///
+    /// If the `nullif(1, 2)` function is called with `i32` and `i64` arguments
+    /// the types will both be coerced to `i64` before the function is invoked.
     ///
-    /// Each argument will be coerced to a single type based on comparison rules.
-    /// For example a function called with `i32` and `i64` has coerced type `Int64` so
-    /// each argument will be coerced to `Int64` before the function is invoked.
+    /// If the `nullif('1', 2)` function is called with `Utf8` and `i64` arguments
+    /// the types will both be coerced to `Utf8` before the function is invoked.
     ///
     /// Note:
-    /// - If compares with numeric and string, numeric is preferred for numeric string cases. For example, `nullif('2', 1)` has coerced types `Int64`.
-    /// - If the result is Null, it will be coerced to String (Utf8View).
-    /// - See [`comparison_coercion`] for more details.
     /// - For functions that take no arguments (e.g. `random()` see [`TypeSignature::Nullary`]).
+    /// - If all arguments have type [`DataType::Null`], they are coerced to `Utf8`
     ///
-    /// [`comparison_coercion`]: crate::type_coercion::binary::comparison_coercion
+    /// [`comparison_coercion_numeric`]: crate::type_coercion::binary::comparison_coercion_numeric
     Comparable(usize),
     /// One or more arguments of arbitrary types.
     ///

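The new `Comparable` documentation above says every argument is coerced to one shared type, and that an all-`Null` argument list ends up as `Utf8`. The following is only a minimal sketch of that idea, not DataFusion's implementation: `ToyType`, `toy_pair`, and `toy_comparable` are hypothetical names invented here, the pairwise rule covers only integer widening and `Null`, and string/numeric pairs are deliberately left out.

```rust
// Toy stand-in for Arrow's `DataType`. This is NOT the real DataFusion type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ToyType {
    Null,
    Int32,
    Int64,
    Utf8,
}

// Hypothetical pairwise rule (an assumption for illustration):
// identical types pass through, `Null` adopts the other side's type,
// and `Int32` widens to `Int64`. Anything else has no common type.
fn toy_pair(lhs: ToyType, rhs: ToyType) -> Option<ToyType> {
    match (lhs, rhs) {
        (a, b) if a == b => Some(a),
        (ToyType::Null, other) | (other, ToyType::Null) => Some(other),
        (ToyType::Int32, ToyType::Int64) | (ToyType::Int64, ToyType::Int32) => {
            Some(ToyType::Int64)
        }
        _ => None,
    }
}

// Fold the pairwise rule over all arguments to find one shared type,
// then report that type once per argument. Returns `None` for an empty
// argument list or when some pair has no common type.
fn toy_comparable(args: &[ToyType]) -> Option<Vec<ToyType>> {
    let (first, rest) = args.split_first()?;
    let mut common = *first;
    for ty in rest {
        common = toy_pair(common, *ty)?;
    }
    // Mirrors the documented note: all-`Null` arguments become `Utf8`.
    if common == ToyType::Null {
        common = ToyType::Utf8;
    }
    Some(vec![common; args.len()])
}
```

Under these assumptions, `toy_comparable(&[ToyType::Int32, ToyType::Int64])` yields two `Int64` entries, matching the `nullif(1, 2)` example in the doc comment.
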
datafusion/expr-common/src/type_coercion/binary.rs

Lines changed: 21 additions & 1 deletion
@@ -625,6 +625,19 @@ pub fn try_type_union_resolution_with_struct(
 /// data type. However, users can write queries where the two arguments are
 /// different data types. In such cases, the data types are automatically cast
 /// (coerced) to a single data type to pass to the kernels.
+///
+/// # Numeric comparisons
+///
+/// When comparing numeric values, the lower precision type is coerced to the
+/// higher precision type to avoid losing data. For example when comparing
+/// `Int32` to `Int64` the coerced type is `Int64` so the `Int32` argument will
+/// be cast.
+///
+/// # Numeric / String comparisons
+///
+/// When comparing numeric values and strings, both values will be coerced to
+/// strings. For example when comparing `'2' > 1`, the arguments will be
+/// coerced to `Utf8` for comparison
 pub fn comparison_coercion(lhs_type: &DataType, rhs_type: &DataType) -> Option<DataType> {
     if lhs_type == rhs_type {
         // same type => equality is possible
@@ -642,7 +655,14 @@ pub fn comparison_coercion(lhs_type: &DataType, rhs_type: &DataType) -> Option<D
         .or_else(|| struct_coercion(lhs_type, rhs_type))
 }
 
-/// Similar to [`comparison_coercion`] but prefer numeric if compares with numeric and string
+/// Similar to [`comparison_coercion`] but prefers numeric if compares with
+/// numeric and string
+///
+/// # Numeric comparisons
+///
+/// When comparing numeric values and strings, the values will be coerced to the
+/// numeric type. For example, `'2' > 1` if `1` is an `Int32`, the arguments
+/// will be coerced to `Int32`.
 pub fn comparison_coercion_numeric(
     lhs_type: &DataType,
     rhs_type: &DataType,

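The doc comments added above contrast two policies for a string/numeric pair: `comparison_coercion` coerces both sides to a string, while `comparison_coercion_numeric` coerces both sides to the numeric type. A small self-contained sketch of that contrast follows. It is a simplified model under stated assumptions, not the DataFusion code: `SketchType` and every function below are hypothetical names, and only `Int32`, `Int64`, and `Utf8` are modelled.

```rust
// Simplified stand-in for Arrow's `DataType` (hypothetical, three types only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SketchType {
    Int32,
    Int64,
    Utf8,
}

fn is_numeric(ty: SketchType) -> bool {
    matches!(ty, SketchType::Int32 | SketchType::Int64)
}

// Numeric widening: the lower precision type is coerced to the higher one.
fn widen_numeric(lhs: SketchType, rhs: SketchType) -> Option<SketchType> {
    if !is_numeric(lhs) || !is_numeric(rhs) {
        return None;
    }
    if lhs == SketchType::Int64 || rhs == SketchType::Int64 {
        Some(SketchType::Int64)
    } else {
        Some(SketchType::Int32)
    }
}

// If exactly one side is `Utf8` and the other is numeric, return the
// numeric member of the pair. Otherwise return `None`.
fn string_numeric_pair(lhs: SketchType, rhs: SketchType) -> Option<SketchType> {
    match (lhs, rhs) {
        (SketchType::Utf8, n) | (n, SketchType::Utf8) if is_numeric(n) => Some(n),
        _ => None,
    }
}

// Models the documented `comparison_coercion` policy: string wins.
fn sketch_comparison(lhs: SketchType, rhs: SketchType) -> Option<SketchType> {
    if lhs == rhs {
        return Some(lhs);
    }
    widen_numeric(lhs, rhs)
        .or_else(|| string_numeric_pair(lhs, rhs).map(|_| SketchType::Utf8))
}

// Models the documented `comparison_coercion_numeric` policy: numeric wins.
fn sketch_comparison_numeric(lhs: SketchType, rhs: SketchType) -> Option<SketchType> {
    if lhs == rhs {
        return Some(lhs);
    }
    widen_numeric(lhs, rhs).or_else(|| string_numeric_pair(lhs, rhs))
}
```

In this model the two functions agree on purely numeric inputs and differ only in the fallback for a `Utf8`/numeric pair, which is the distinction the new doc comments draw for `'2' > 1`.
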