Commit 54da3bb

yeshengm authored and gatorsmile committed
[SPARK-28127][SQL] Micro optimization on TreeNode's mapChildren method
## What changes were proposed in this pull request?

The `mapChildren` method in the `TreeNode` class is used throughout the Spark SQL codebase. It contains an `if` statement that checks whether `children` is non-empty. However, the cached lazy val `containsChild` can serve the same check without unnecessary computation, since `containsChild` is used in other methods and is therefore constructed anyway.

Benchmarks showed that this optimization improves overall TPC-DS planning time by 6.8%, with no regression on any TPC-DS query.

## How was this patch tested?

Existing UTs.

Closes apache#24925 from yeshengm/treenode-children.

Authored-by: Yesheng Ma <[email protected]>
Signed-off-by: gatorsmile <[email protected]>
1 parent 47f54b1 commit 54da3bb

1 file changed: +1 -1 lines changed

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala

Lines changed: 1 addition & 1 deletion
@@ -319,7 +319,7 @@ abstract class TreeNode[BaseType <: TreeNode[BaseType]] extends Product {
    * Returns a copy of this node where `f` has been applied to all the nodes in `children`.
    */
   def mapChildren(f: BaseType => BaseType): BaseType = {
-    if (children.nonEmpty) {
+    if (containsChild.nonEmpty) {
       mapChildren(f, forceCopy = false)
     } else {
       this
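
To illustrate the idea behind this one-line change, here is a minimal, self-contained sketch. It is not Spark's actual `TreeNode`: the `Node`, `Leaf`, and `Add` classes and the `withNewChildren` helper are hypothetical stand-ins, and the sketch assumes that `children` may be rebuilt on each call while `containsChild` is computed once and cached.

```scala
// Hypothetical, simplified stand-in for Catalyst's TreeNode (not the real class).
abstract class Node {
  // May allocate a fresh Seq on every call, depending on the subclass.
  def children: Seq[Node]

  // Computed at most once per node and then cached.
  lazy val containsChild: Set[Node] = children.toSet

  // Hypothetical helper that rebuilds this node with new children.
  def withNewChildren(newChildren: Seq[Node]): Node

  // Mirrors the patched check: consult the cached set instead of `children`.
  def mapChildren(f: Node => Node): Node =
    if (containsChild.nonEmpty) withNewChildren(children.map(f)) else this
}

final case class Leaf(value: Int) extends Node {
  val children: Seq[Node] = Nil
  def withNewChildren(newChildren: Seq[Node]): Node = this
}

final case class Add(left: Node, right: Node) extends Node {
  // Builds a new Seq each time it is called.
  def children: Seq[Node] = Seq(left, right)
  def withNewChildren(newChildren: Seq[Node]): Node =
    Add(newChildren(0), newChildren(1))
}
```

Because a set built from a non-empty sequence is itself non-empty, `containsChild.nonEmpty` and `children.nonEmpty` always agree, so the behavior of `mapChildren` is unchanged; only the cost of the emptiness check differs.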
