@BohuTANG BohuTANG commented Aug 23, 2025

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

Fix stack overflow in CTE processing that caused segmentation faults when processing complex queries with many UNION operations.

🐛 Problem Description

PR #18577 introduced a regression that causes segmentation faults when processing complex CTE (Common Table Expression) queries with many UNION operations. The issue manifests as a stack overflow caused by deep recursive calls during query binding and AST traversal.

Root Cause

The m_cte_to_temp_table() function performs double recursion:

  1. TableNameReplacer recursively traverses AST nodes via drive_mut()
  2. Query binding recursively processes the modified query via bind_query()

For queries with 150+ UNION operations, the two recursive passes together create ~300 levels of recursive calls, which exceeds the stack limit.
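To make the depth estimate concrete, here is a minimal sketch of how recursion depth grows with the number of UNIONs. It assumes chained UNIONs parse into a left-deep tree; the `SetExpr` type and both functions are hypothetical stand-ins for illustration, not Databend's actual AST or binder code.

```rust
/// Toy stand-in for a set-expression AST (hypothetical, not Databend's type).
enum SetExpr {
    Select,
    Union(Box<SetExpr>, Box<SetExpr>),
}

/// Builds `SELECT UNION SELECT UNION ...` as a left-deep tree
/// containing `unions` UNION nodes (assumed parse shape).
fn left_deep_unions(unions: usize) -> SetExpr {
    let mut expr = SetExpr::Select;
    for _ in 0..unions {
        expr = SetExpr::Union(Box::new(expr), Box::new(SetExpr::Select));
    }
    expr
}

/// Returns the deepest call level a naive recursive visitor reaches:
/// one level per node on the longest root-to-leaf path.
fn max_recursion_depth(expr: &SetExpr) -> usize {
    match expr {
        SetExpr::Select => 1,
        SetExpr::Union(left, right) => {
            1 + max_recursion_depth(left).max(max_recursion_depth(right))
        }
    }
}
```

Under these assumptions, `max_recursion_depth(&left_deep_unions(150))` gives 151, so a single recursive pass over 150 UNIONs is already about 150 frames deep. Two such passes (AST traversal plus binding) account for the ~300 levels cited above.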

🔧 Solution

Following the pattern established in PR #18268, this fix adds #[recursive::recursive] annotations to key CTE processing functions. The recursive crate automatically grows the stack when an annotated function is about to run out of space, preventing overflow.
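The underlying idea is to give deep recursion more stack than the default. The sketch below is not what this PR does (the PR relies on the #[recursive::recursive] attribute); it is a std-only illustration of the same principle that runs a recursive walk on a thread with an explicitly sized stack. `SetExpr`, `build_union_chain`, `count_selects`, and `count_on_big_stack` are hypothetical names, and the left-deep tree shape is an assumption.

```rust
use std::thread;

/// Toy stand-in for a set-expression AST (hypothetical, not Databend's type).
enum SetExpr {
    Select,
    Union(Box<SetExpr>, Box<SetExpr>),
}

/// Builds a left-deep chain with `unions` UNION nodes.
fn build_union_chain(unions: usize) -> SetExpr {
    let mut expr = SetExpr::Select;
    for _ in 0..unions {
        expr = SetExpr::Union(Box::new(expr), Box::new(SetExpr::Select));
    }
    expr
}

/// Plain recursive walk: one stack frame per level of nesting.
fn count_selects(expr: &SetExpr) -> usize {
    match expr {
        SetExpr::Select => 1,
        SetExpr::Union(left, right) => count_selects(left) + count_selects(right),
    }
}

/// Runs the recursive walk on a worker thread whose stack is `stack_bytes` large.
/// The tree is built and dropped inside the thread, so the recursive drop of the
/// nested boxes also runs on the large stack.
fn count_on_big_stack(unions: usize, stack_bytes: usize) -> usize {
    thread::Builder::new()
        .stack_size(stack_bytes)
        .spawn(move || {
            let expr = build_union_chain(unions);
            count_selects(&expr)
        })
        .expect("failed to spawn worker thread")
        .join()
        .expect("worker thread panicked")
}
```

For example, `count_on_big_stack(10_000, 64 * 1024 * 1024)` walks a chain of 10,000 UNIONs on a 64 MiB stack. A fixed stack size has to be chosen up front, whereas the attribute-based approach described above grows the stack only when needed, which is presumably why it was preferred here.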

Modified Functions

  • bind_query() - Main query binding entry point
  • m_cte_to_temp_table() - CTE temporary table creation
  • compute_cte_ref_count() - CTE reference counting
  • bind_cte_definition() - CTE definition binding
  • TableNameReplacer visitor methods - AST traversal for name replacement

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

…E processing

Adds #[recursive::recursive] annotations to key CTE binding functions to prevent
stack overflow when processing complex queries with many UNION operations.

This follows the pattern established in PR #18268 and addresses the
segmentation fault issue introduced in PR #18577.

Fixes:
- bind_query() - Main query binding entry point
- m_cte_to_temp_table() - CTE temporary table creation
- compute_cte_ref_count() - CTE reference counting
- bind_cte_definition() - CTE definition binding
- TableNameReplacer visitor methods - AST traversal for name replacement

The recursive crate automatically grows the stack when needed,
preventing stack overflow on deeply nested or complex query structures.
@github-actions github-actions bot added the pr-bugfix this PR patches a bug in codebase label Aug 23, 2025
@BohuTANG BohuTANG requested a review from SkyFan2002 August 23, 2025 12:24
@BohuTANG BohuTANG merged commit 7cca5f5 into main Aug 23, 2025
97 of 102 checks passed
@BohuTANG BohuTANG deleted the fix/cte-recursive-stack-overflow branch August 23, 2025 12:55
@BohuTANG
Member Author

Not working :/
