Commit 6424b14

rxin authored and gatorsmile committed
[MINOR] Update docs for functions.scala to make it clear not all the built-in functions are defined there
The title summarizes the change. Author: Reynold Xin <[email protected]> Closes apache#21318 from rxin/functions.
1 parent 34ebcc6 commit 6424b14

1 file changed: +15 -1 lines changed


sql/core/src/main/scala/org/apache/spark/sql/functions.scala

Lines changed: 15 additions & 1 deletion
@@ -39,7 +39,21 @@ import org.apache.spark.util.Utils
 
 
 /**
- * Functions available for DataFrame operations.
+ * Commonly used functions available for DataFrame operations. Using functions defined here provides
+ * a little bit more compile-time safety to make sure the function exists.
+ *
+ * Spark also includes more built-in functions that are less common and are not defined here.
+ * You can still access them (and all the functions defined here) using the `functions.expr()` API
+ * and calling them through a SQL expression string. You can find the entire list of functions for
+ * the latest version of Spark at https://spark.apache.org/docs/latest/api/sql/index.html.
+ *
+ * As an example, `isnan` is a function that is defined here. You can use `isnan(col("myCol"))`
+ * to invoke the `isnan` function. This way the programming language's compiler ensures `isnan`
+ * exists and is of the proper form. You can also use `expr("isnan(myCol)")` function to invoke the
+ * same function. In this case, Spark itself will ensure `isnan` exists when it analyzes the query.
+ *
+ * `regr_count` is an example of a function that is built-in but not defined here, because it is
+ * less commonly used. To invoke it, use `expr("regr_count(yCol, xCol)")`.
  *
  * @groupname udf_funcs UDF functions
  * @groupname agg_funcs Aggregate functions
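
The two invocation styles described in the new doc comment can be compared side by side. The following is a minimal sketch, not part of the commit: the object name `FunctionsExprSketch`, the local session settings, and the sample rows are hypothetical, and it assumes `spark-sql` is on the classpath. Whether `regr_count` resolves depends on the Spark version in use (check the SQL function list for that version), so that call is wrapped in `Try`.

```scala
import scala.util.Try

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, expr, isnan}

// Hypothetical driver object, for illustration only.
object FunctionsExprSketch {
  def main(args: Array[String]): Unit = {
    // Local single-threaded session (assumed setup, not from the commit).
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("functions-expr-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data; column names mirror the doc comment.
    val df = Seq((1.0, 2.0, Double.NaN), (2.0, 4.0, 3.0))
      .toDF("xCol", "yCol", "myCol")

    // Style 1: the Scala wrapper. A misspelled `isnan` would not compile.
    val viaWrapper = df.filter(isnan(col("myCol"))).count()

    // Style 2: a SQL expression string. Spark resolves `isnan` when it
    // analyzes the query, so a misspelling only fails at that point.
    val viaExpr = df.filter(expr("isnan(myCol)")).count()

    // Both styles invoke the same built-in function.
    assert(viaWrapper == viaExpr)

    // A built-in that the doc comment says has no wrapper in functions.scala.
    // Assumption: the running Spark version ships `regr_count`; if it does
    // not, analysis fails and the Try captures the exception.
    val regr = Try(df.select(expr("regr_count(yCol, xCol)").as("n")).first())
    println(s"regr_count via expr: $regr")

    spark.stop()
  }
}
```

The wrapper style trades a smaller function catalog for compile-time checking, while `expr` reaches every registered SQL function at the cost of deferring errors to query analysis.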
