Commit 2ac2ff1

kyleconroy and claude authored
Add kql() function transformation to view() in EXPLAIN AST output (#115)
* Add kql() function transformation to view() in EXPLAIN AST output

Implements KQL (Kusto Query Language) parsing for the kql() table function. The kql() function is transformed to view() with the KQL content parsed into equivalent SQL.

Supports:
- Table names as first pipe segment
- project operator for column selection
- filter operator for WHERE conditions with comparison operators

Fixes 02366_kql_create_table test (3 statements: stmt5, stmt7, stmt11).

* Support dotted identifiers in ALTER TABLE RENAME COLUMN

Add parseDottedIdentifier helper to handle nested column names like n.x in RENAME COLUMN statements. Previously only single-part identifiers were captured, causing statements like "RENAME COLUMN n.x TO n.renamed_x" to lose the dot-separated suffix.

Fixes tests:
- 01213_alter_table_rename_nested (3 statements)
- 01213_alter_rename_nested (2 statements)
- 01278_alter_rename_combination (2 statements)
- 03526_columns_substreams_in_wide_parts (2 statements)

* Add support for multi-word data types in parser

Handle SQL standard multi-word type names:
- DOUBLE PRECISION
- INT/INTEGER/TINYINT/SMALLINT/BIGINT/INT1 with SIGNED/UNSIGNED
- CHAR/CHARACTER/NCHAR with VARYING or LARGE OBJECT
- BINARY with VARYING or LARGE OBJECT
- NATIONAL CHAR/CHARACTER with optional VARYING or LARGE OBJECT

Fixes 01144_multiword_data_types test (3 statements).

* Escape single quotes in alias names in EXPLAIN output

When an alias contains single quotes (e.g., "'String'" as an alias), they need to be escaped as \' in the EXPLAIN AST output to match ClickHouse behavior.

Fixes tests:
- 01101_literal_column_clash (3 statements)
- 01950_aliases_bad_cast (1 statement)

* Fix handling of INT64_MIN and very large negative integers in CAST

Two fixes:
1. In formatExprAsString, avoid overflow when formatting INT64_MIN by checking if the value is already negative before trying to negate it
2. In parseUnaryMinus, properly handle negative numbers larger than int64 can hold (like -9223372036854775809 for Int128) by storing them as strings when strconv.ParseInt fails

Fixes 02887_byteswap test (3 statements: stmt27, stmt29, stmt31).

* Add tuple expansion and regex pattern EXCEPT support

- Handle expression.* syntax for tuple expansion in parseDotAccess()
- Add Pattern field to ColumnTransformer for regex-based EXCEPT
- Parse string patterns in EXCEPT clauses (e.g., EXCEPT('hello|world'))
- Output pattern-based EXCEPT as ColumnsExceptTransformer with String node

Fixes test 03101_analyzer_identifiers_4 (stmt7, stmt9, stmt14)
Also fixes 01470_columns_transformers2 (stmt4)

* Add INDEX and SETTINGS support for ATTACH TABLE statements

- Add Indexes field to AttachQuery struct in ast.go
- Add Settings field to AttachQuery struct in ast.go
- Parse INDEX definitions in ATTACH TABLE column lists
- Parse SETTINGS clause in ATTACH TABLE statements
- Update explainAttachQuery to output indexes and settings correctly
- Handle engine parentheses with empty argument list

Fixes test 01249_bad_arguments_for_bloom_filter (stmt10, stmt13, stmt16)
Also fixes 01601_detach_permanently and 02990_rmt_replica_path_uuid

* Handle COMMENT in MODIFY COLUMN without data type

When parsing MODIFY COLUMN col_name COMMENT 'comment', the COMMENT keyword should not be parsed as a data type. Add COMMENT to the list of tokens that indicate the type is omitted in parseColumnDeclaration.

Fixes test 00725_comment_columns_long (stmt9, stmt19, stmt21)

* Fix ANY/ALL keyword conflict with any()/all() function calls in expressions

The ANY/ALL subquery modifier check in parseBinaryExpression was incorrectly triggering for all binary operators. When parsing expressions like `any(x) >= 1 AND any(y) >= 2`, the parser would see the `any` keyword after the AND operator and attempt to parse it as `expr >= ANY(subquery)` pattern, causing incorrect AST structure.

This fix restricts the ANY/ALL check to only comparison operators (=, ==, !=, <>, <, <=, >, >=) where this pattern is valid, preventing conflicts with any()/all() function calls in AND/OR expressions.

Also flatten both sides of AND/OR chains in collectLogicalOperands for correct EXPLAIN output matching ClickHouse format.

* Remove incorrect ln->log function name normalization

ClickHouse's EXPLAIN AST outputs 'ln' for the natural logarithm function, not 'log'. The previous incorrect mapping was causing test failures.

* Handle boolean literals correctly in CAST expressions

Boolean literals in :: cast syntax should output as Bool_1/Bool_0 format instead of string 'true'/'false' to match ClickHouse EXPLAIN.

* Parse column definitions after TO target in MATERIALIZED VIEW

For MATERIALIZED VIEW ... TO target (columns) AS SELECT syntax, column definitions can appear after the TO clause. Added parsing support for this variant.

* Fix IN expression to include :: cast on right side without parentheses

When parsing `expr IN value::Type` (without parentheses around the IN list), the :: cast was being applied to the entire IN expression instead of just the value. Changed precedence from CALL to MUL_PREC to ensure :: is consumed as part of the right-hand expression.

* Add IF NOT EXISTS support for ATTACH TABLE statement

The parser was treating IF as the table name instead of handling IF NOT EXISTS as a modifier. Added IfNotExists field to AttachQuery and parsing logic to handle the IF NOT EXISTS clause.

* Add implicit NULL for caseWithExpression without ELSE clause

When a CASE x WHEN form has no ELSE clause, ClickHouse implicitly uses NULL as the else value. The explain output was missing this implicit NULL for caseWithExpression (it was already correct for multiIf form).

* Add BACKUP and RESTORE statement support

Implement parsing and explain output for BACKUP and RESTORE statements:
- Add BACKUP and RESTORE tokens
- Add BackupQuery and RestoreQuery AST types
- Add parseBackup() and parseRestore() parser functions
- Add explain handlers for both query types

Fixes tests: 03286_backup_to_null and 03593_backup_with_broken_projection

* Distinguish EXCEPT set operation from column exclusion

When parsing expressions, check if EXCEPT is followed by SELECT to determine if it's a set operation (SELECT (*) EXCEPT SELECT 1) vs column exclusion (SELECT * EXCEPT (col1, col2)).

Fixes test: 03457_inconsistent_formatting_except

* Include function arguments in BACKUP/RESTORE explain output

Update explainBackupQuery and explainRestoreQuery to output function arguments (e.g., Memory('b1') shows the ExpressionList with 'b1').

Fixes tests: 03286_backup_to_memory, 03276_database_backup_merge_tree_table_file_engine, 03278_database_backup_merge_tree_table_disk_engine, 03279_database_backup_database_disk_engine

* Allow keywords as CTE names in WITH clause

Support using keywords like 'table' as CTE names (e.g., WITH table AS (SELECT 1 AS key)). Exclude NULL/TRUE/FALSE since they have special literal meanings.

Fixes test: 03518_left_to_cross_incorrect

* Support WITH TIES modifier after TOP clause

Handle `SELECT TOP n WITH TIES *` syntax by consuming the WITH TIES tokens after parsing the TOP expression.

Fixes test: 03725_empty_tuple_some_limit_with_ties_distinct

* Support SHOW TABLE and SHOW DATABASE as aliases

Treat `SHOW TABLE tablename` as equivalent to `SHOW CREATE TABLE tablename` and `SHOW DATABASE dbname` as equivalent to `SHOW CREATE DATABASE dbname`.

Fixes tests: 02710_show_table, 03663_parameterized_views_formatting_of_substitutions_excessive_backticks

* Strip session/global prefix from MySQL system variables

For @@session.varname or @@global.varname syntax, strip the session/global scope qualifier since ClickHouse treats them as just @@varname in EXPLAIN.

Fixes test: 01337_mysql_global_variables

* Add \e escape sequence support for PHP/MySQL style strings

Handle the \e escape sequence (escape character, ASCII 27) in string literals for MySQL/PHP compatibility.

Fixes test: 01284_escape_sequences_php_mysql_style

* Fix OFFSET ROW parsing to accept both singular and plural forms

The SQL standard OFFSET...FETCH syntax uses singular "ROW" (e.g., "OFFSET 1 ROW") but the parser only checked for "ROWS" (plural). This caused ROW to be incorrectly parsed as a subquery alias. Fixed by checking for both "ROW" and "ROWS" when consuming the optional keyword after the OFFSET expression.

* Support empty USING () clause in JOINs

When parsing USING (), distinguish between "no USING clause" (nil) and "empty USING clause" (empty non-nil slice). This ensures the explain output correctly shows the ExpressionList node even when empty.

* Fix SYSTEM command parsing for TTL MERGES table names

When parsing SYSTEM STOP/START TTL MERGES commands, the table name was being consumed as part of the command because isSystemCommandKeyword() uses case-insensitive matching. A table named 'ttl' would match 'TTL'. Added check to break command parsing after certain complete command suffixes (MERGES, MOVES, FETCHES, SENDS, MUTATIONS) so the next token is correctly parsed as the table name.

* Allow EXISTS keyword as column identifier when not followed by (

In queries like 'WHERE exists' where 'exists' is a column name, the parser was treating EXISTS as the start of an EXISTS(subquery) expression and failing when no ( was found. Now if EXISTS is not followed by (, it's treated as an identifier (column name) instead of the subquery existence operator.

* Fix table alias parsing order - alias before FINAL

In ClickHouse syntax, table aliases come BEFORE the FINAL keyword:
  FROM table_name t FINAL WHERE ...

The parser was checking for FINAL before alias, which meant FINAL wasn't being consumed when an alias was present. This caused the subsequent UNION or WHERE clause to be missed. Reordered to parse alias first, then FINAL, then SAMPLE.

* Fix ADD CONSTRAINT explain output to show expression

For ALTER TABLE ADD CONSTRAINT, the explain output was showing just the constraint name as an Identifier. Changed to show the constraint's expression properly:
  Constraint (children 1)
    Subquery/Function/etc (the expression)

This fixes subquery constraints like CHECK (SELECT 1).

* Fix tuple literal expansion in IN expressions and explain output

- In IN expressions, only expand tuple literals when all elements are parenthesized primitives (e.g., `1 IN (((1), (2)))` expands to Function tuple with 2 elements)
- Tuples with non-parenthesized elements or nested tuples stay as Literal Tuple_ (e.g., `(1, '') IN ((1, ''))` renders as Literal Tuple_(...))
- Update explainLiteral to check for parenthesized elements when deciding between Function tuple and Literal Tuple_ format

This fixes test 02370_analyzer_in_function and also enables stmt8 in 03552_inconsistent_formatting_operator_as_table_function.

* Propagate WITH clause to subsequent SELECTs in UNION queries

In ClickHouse's EXPLAIN AST output, the WITH clause from the first SELECT in a UNION ALL/UNION query is propagated to subsequent SELECTs. The inherited WITH clause is output at the END of children for those subsequent SELECT queries.

This fix applies the same WITH clause propagation logic that was already implemented for INTERSECT/EXCEPT queries to plain UNION queries.

Fixes tests:
- 01515_with_global_and_with_propagation (stmt5, stmt11)
- 03671_pk_in_subquery_context_expired (stmt7)
- 03611_uniqExact_bug (stmt2)
- 03033_analyzer_resolve_from_parent_scope (stmt4)
- 01236_graphite_mt (stmt4)

* Fix database-qualified dictionary names in DETACH/ATTACH statements

The parser was not handling database-qualified dictionary names like `db.dict` in DETACH DICTIONARY and ATTACH DICTIONARY statements.

Parser changes:
- parseDetach: Allow qualified names for DICTIONARY (database.dict)
- parseAttach: Allow qualified names for DICTIONARY (database.dict)

Explain changes:
- explainDetachQuery: Handle Database + Dictionary case
- explainAttachQuery: Handle Database + Dictionary case

Fixes tests:
- 01110_dictionary_layout_without_arguments (stmt7, stmt8)
- 01575_disable_detach_table_of_dictionary (stmt7, stmt9)
- 01018_ddl_dictionaries_create (stmt17, stmt22)

* Accept keywords as index type names in ALTER ADD INDEX

The parser was only accepting identifiers (token.IDENT) for index type names like "set" in "ADD INDEX idx c TYPE set(0)". However, "set" is tokenized as a keyword (token.SET). This fix allows keywords to be used as index type names and AFTER index names, matching ClickHouse behavior.

Fixes tests:
- 01932_alter_index_with_order (stmt5, stmt6)
- 03629_storage_s3_disallow_index_alter (stmt3)
- 02131_skip_index_not_materialized (stmt4)

* Render arrays with parenthesized elements as Function array

Arrays containing parenthesized elements like [('a')] should be rendered as Function array with children, not as Literal Array_[...]. This matches ClickHouse's EXPLAIN AST behavior where parenthesized elements inside arrays require the expanded function format.

Fixes tests:
- 02354_tuple_element_with_default (stmt5, stmt14)
- 03552_inconsistent_formatting_operator_as_table_function (stmt5)

* Handle LIMIT offset, count syntax after LIMIT BY clause

When parsing LIMIT after LIMIT BY (e.g., LIMIT 1 BY x LIMIT 5, 5), the parser was only capturing the first value. This fix handles the comma syntax to correctly parse both offset and count values.

Fixes test:
- 02003_WithMergeableStateAfterAggregationAndLimit_LIMIT_BY_LIMIT_OFFSET (stmt2, stmt3)

* Support IN PARTITION clause in DELETE statements

Added Partition field to DeleteQuery AST and parsing for the IN PARTITION clause in lightweight DELETE statements. The syntax is:
  DELETE FROM table IN PARTITION partition_expr WHERE condition

Fixes test:
- 02352_lightweight_delete_in_partition (stmt11, stmt12)

* Support qualified identifiers starting with keywords

When a keyword like SYSTEM is used as the start of a qualified name (e.g., system.one.*), parseKeywordAsIdentifier was returning just the keyword as a single-part identifier. Now it continues to parse DOT sequences to build qualified identifiers and handle qualified asterisks.

Fixes tests:
- 00467_qualified_names (stmt19, stmt21)
- 00502_custom_partitioning_local (stmt17)

* Support TTL elements with WHERE conditions

Added TTLElement struct to store both the TTL expression and the optional WHERE condition. Updated parser to correctly parse multiple TTL elements separated by commas, including proper handling of SET clause comma separation (SET assignments vs new TTL elements).

Fixes tests:
- 01622_multiple_ttls (stmt3, stmt11)
- 03236_create_query_ttl_where (stmt2)
- 03636_empty_projection_block (stmt1)
- 03622_ttl_infos_where (stmt3)
- 02932_set_ttl_where (stmt2)

* Support PARTITION ID syntax in OPTIMIZE TABLE statements

Add PartitionByID field to OptimizeQuery AST to distinguish PARTITION ID 'value' from PARTITION expr. The parser now detects the ID keyword after PARTITION and sets this flag. The explain output renders Partition_ID with the inline literal format matching ClickHouse's EXPLAIN AST output.

* Fix ALTER ADD INDEX tuple expression parsing

Simplified the index expression parsing in ALTER ADD INDEX to let parseExpression handle parentheses naturally. This allows tuple expressions like (a, b, c) to be parsed correctly, matching how CREATE TABLE INDEX parsing works.

* Add support for KILL QUERY/MUTATION statements

Adds KillQuery AST type and parser for KILL QUERY/MUTATION statements. The explain output matches ClickHouse format with the WHERE expression operator in the header (e.g., Function_and) and SYNC/ASYNC mode.

* Enable duplicate output for RELOAD DICTIONARY in SYSTEM queries

Add RELOAD DICTIONARY to the list of SYSTEM commands that output database/table identifiers twice in EXPLAIN AST format.

* Handle large number overflow and preserve original source text

- For hex numbers that overflow uint64 (like 0x123456789ABCDEF01), convert to Float64
- For decimal numbers that overflow, try float64 parsing
- Preserve original source text in Literal.Source for formatting in CAST expressions
- Update explain for negated uint64 values that overflow int64 to output Float64
- Use Source field in formatElementAsString to preserve exact text in array/tuple casts

* Add alias support for ArrayAccess and BetweenExpr in WITH clauses

- Add explainBetweenExprWithAlias to support aliases on BETWEEN expressions
- Add ArrayAccess and BetweenExpr cases to explainWithElement
- Enables WITH expr AS name syntax for array subscripts and BETWEEN clauses

* Handle empty PRIMARY KEY () in CREATE TABLE explain output

- Add HasEmptyColumnsPrimaryKey flag to CreateTableQuery and AttachQuery
- Set flag in parser when PRIMARY KEY () has empty parentheses
- Update explain output to show Function tuple with empty ExpressionList

* Support TTL DELETE WHERE clause in ALTER TABLE MODIFY TTL

- Update ALTER MODIFY TTL parsing to use parseTTLElement
- Capture WHERE condition in TTLElement for conditional deletion
- Update explain code to output TTLElement with WHERE as child

* Trim whitespace in query parameter name and type parsing

Parameters like {a1: Int32} with spaces after colon now correctly parse as name=a1 type=Int32 without leading/trailing spaces

---------

Co-authored-by: Claude <[email protected]>
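
The kql() to view() item in the commit message lists three supported pieces: a table name as the first pipe segment, `project`, and `filter`. The sketch below shows one way that mapping could look. `kqlToSQL` is a hypothetical helper written for illustration, not the parser from this commit. It produces a SQL string rather than an AST, and its naive split on `|` would break on pipes inside string literals.

```go
package main

import (
	"fmt"
	"strings"
)

// kqlToSQL is a hypothetical illustration, not the code from this commit.
// It handles only a leading table name, `project`, and `filter`.
func kqlToSQL(kql string) (string, error) {
	// Naive split: assumes no '|' appears inside string literals.
	segments := strings.Split(kql, "|")
	table := strings.TrimSpace(segments[0])
	if table == "" {
		return "", fmt.Errorf("missing table name")
	}

	columns := "*"
	var conditions []string
	for _, seg := range segments[1:] {
		op, rest, _ := strings.Cut(strings.TrimSpace(seg), " ")
		rest = strings.TrimSpace(rest)
		switch strings.ToLower(op) {
		case "project":
			columns = rest // column selection
		case "filter":
			conditions = append(conditions, rest) // WHERE condition
		default:
			return "", fmt.Errorf("unsupported operator %q", op)
		}
	}

	sql := "SELECT " + columns + " FROM " + table
	if len(conditions) > 0 {
		sql += " WHERE " + strings.Join(conditions, " AND ")
	}
	return sql, nil
}

func main() {
	sql, err := kqlToSQL("Customers | filter Age > 30 | project Name, Age")
	if err != nil {
		panic(err)
	}
	fmt.Println(sql) // SELECT Name, Age FROM Customers WHERE Age > 30
}
```

Unknown operators are rejected rather than ignored, so unsupported KQL fails loudly in this sketch.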
1 parent 93d8182 commit 2ac2ff1

136 files changed: +1627 -861 lines changed

ast/ast.go

Lines changed: 83 additions & 16 deletions
@@ -282,7 +282,8 @@ type CreateQuery struct {
 	Indexes []*IndexDefinition `json:"indexes,omitempty"`
 	Projections []*Projection `json:"projections,omitempty"`
 	Constraints []*Constraint `json:"constraints,omitempty"`
-	ColumnsPrimaryKey []Expression `json:"columns_primary_key,omitempty"` // PRIMARY KEY in column list
+	ColumnsPrimaryKey         []Expression `json:"columns_primary_key,omitempty"`           // PRIMARY KEY in column list
+	HasEmptyColumnsPrimaryKey bool         `json:"has_empty_columns_primary_key,omitempty"` // TRUE if PRIMARY KEY () was seen with empty parens
 	Engine *EngineClause `json:"engine,omitempty"`
 	OrderBy []Expression `json:"order_by,omitempty"`
 	OrderByHasModifiers bool `json:"order_by_has_modifiers,omitempty"` // True if ORDER BY has ASC/DESC modifiers
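
One plausible reason for a separate `HasEmptyColumnsPrimaryKey` flag, offered as an assumption rather than the author's stated rationale: with `omitempty`, `encoding/json` drops an empty non-nil slice just as it drops a nil one. The serialized AST therefore cannot tell `PRIMARY KEY ()` from no clause without an extra field. The sketch uses a trimmed-down, hypothetical `createQuery` type with `[]string` in place of `[]Expression`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// createQuery is a hypothetical, trimmed-down stand-in for the CreateQuery
// struct in the diff; []string replaces []Expression to stay self-contained.
type createQuery struct {
	ColumnsPrimaryKey         []string `json:"columns_primary_key,omitempty"`
	HasEmptyColumnsPrimaryKey bool     `json:"has_empty_columns_primary_key,omitempty"`
}

// encode marshals q to JSON and panics on the (unexpected) error.
func encode(q createQuery) string {
	b, err := json.Marshal(q)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// No PRIMARY KEY clause: the slice is nil.
	fmt.Println(encode(createQuery{})) // {}

	// PRIMARY KEY (): an empty non-nil slice is also omitted by omitempty.
	fmt.Println(encode(createQuery{ColumnsPrimaryKey: []string{}})) // {}

	// The explicit flag survives serialization.
	fmt.Println(encode(createQuery{HasEmptyColumnsPrimaryKey: true}))
}
```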
@@ -496,11 +497,22 @@ type TTLClause struct {
 	Position    token.Position `json:"-"`
 	Expression  Expression     `json:"expression"`
 	Expressions []Expression   `json:"expressions,omitempty"` // Additional TTL expressions (for multiple TTL elements)
+	Elements    []*TTLElement  `json:"elements,omitempty"`    // TTL elements with WHERE conditions
 }
 
 func (t *TTLClause) Pos() token.Position { return t.Position }
 func (t *TTLClause) End() token.Position { return t.Position }
 
+// TTLElement represents a single TTL element with optional WHERE condition.
+type TTLElement struct {
+	Position token.Position `json:"-"`
+	Expr     Expression     `json:"expr"`
+	Where    Expression     `json:"where,omitempty"` // WHERE condition for DELETE
+}
+
+func (t *TTLElement) Pos() token.Position { return t.Position }
+func (t *TTLElement) End() token.Position { return t.Position }
+
 // DropQuery represents a DROP statement.
 type DropQuery struct {
 	Position token.Position `json:"-"`
@@ -707,11 +719,12 @@ func (t *TruncateQuery) statementNode() {}
 
 // DeleteQuery represents a lightweight DELETE statement.
 type DeleteQuery struct {
-	Position token.Position `json:"-"`
-	Database string         `json:"database,omitempty"`
-	Table    string         `json:"table"`
-	Where    Expression     `json:"where,omitempty"`
-	Settings []*SettingExpr `json:"settings,omitempty"`
+	Position  token.Position `json:"-"`
+	Database  string         `json:"database,omitempty"`
+	Table     string         `json:"table"`
+	Partition Expression     `json:"partition,omitempty"` // IN PARTITION clause
+	Where     Expression     `json:"where,omitempty"`
+	Settings  []*SettingExpr `json:"settings,omitempty"`
 }
 
 func (d *DeleteQuery) Pos() token.Position { return d.Position }
@@ -743,11 +756,14 @@ func (d *DetachQuery) statementNode() {}
 // AttachQuery represents an ATTACH statement.
 type AttachQuery struct {
 	Position token.Position `json:"-"`
+	IfNotExists bool `json:"if_not_exists,omitempty"`
 	Database string `json:"database,omitempty"`
 	Table string `json:"table,omitempty"`
 	Dictionary string `json:"dictionary,omitempty"`
 	Columns []*ColumnDeclaration `json:"columns,omitempty"`
-	ColumnsPrimaryKey []Expression `json:"columns_primary_key,omitempty"` // PRIMARY KEY in column list
+	ColumnsPrimaryKey         []Expression       `json:"columns_primary_key,omitempty"`           // PRIMARY KEY in column list
+	HasEmptyColumnsPrimaryKey bool               `json:"has_empty_columns_primary_key,omitempty"` // TRUE if PRIMARY KEY () was seen with empty parens
+	Indexes                   []*IndexDefinition `json:"indexes,omitempty"`                       // INDEX definitions in column list
 	Engine *EngineClause `json:"engine,omitempty"`
 	OrderBy []Expression `json:"order_by,omitempty"`
 	PrimaryKey []Expression `json:"primary_key,omitempty"`
@@ -756,12 +772,47 @@ type AttachQuery struct {
 	InnerUUID string `json:"inner_uuid,omitempty"` // TO INNER UUID clause
 	PartitionBy Expression `json:"partition_by,omitempty"`
 	SelectQuery Statement `json:"select_query,omitempty"` // AS SELECT clause
+	Settings []*SettingExpr `json:"settings,omitempty"` // SETTINGS clause
 }
 
 func (a *AttachQuery) Pos() token.Position { return a.Position }
 func (a *AttachQuery) End() token.Position { return a.Position }
 func (a *AttachQuery) statementNode() {}
 
+// BackupQuery represents a BACKUP statement.
+type BackupQuery struct {
+	Position   token.Position `json:"-"`
+	Database   string         `json:"database,omitempty"`
+	Table      string         `json:"table,omitempty"`
+	Dictionary string         `json:"dictionary,omitempty"`
+	All        bool           `json:"all,omitempty"` // BACKUP ALL
+	Temporary  bool           `json:"temporary,omitempty"`
+	Target     *FunctionCall  `json:"target,omitempty"` // Disk('path') or Null
+	Settings   []*SettingExpr `json:"settings,omitempty"`
+	Format     string         `json:"format,omitempty"`
+}
+
+func (b *BackupQuery) Pos() token.Position { return b.Position }
+func (b *BackupQuery) End() token.Position { return b.Position }
+func (b *BackupQuery) statementNode() {}
+
+// RestoreQuery represents a RESTORE statement.
+type RestoreQuery struct {
+	Position   token.Position `json:"-"`
+	Database   string         `json:"database,omitempty"`
+	Table      string         `json:"table,omitempty"`
+	Dictionary string         `json:"dictionary,omitempty"`
+	All        bool           `json:"all,omitempty"` // RESTORE ALL
+	Temporary  bool           `json:"temporary,omitempty"`
+	Source     *FunctionCall  `json:"source,omitempty"` // Disk('path') or Null
+	Settings   []*SettingExpr `json:"settings,omitempty"`
+	Format     string         `json:"format,omitempty"`
+}
+
+func (r *RestoreQuery) Pos() token.Position { return r.Position }
+func (r *RestoreQuery) End() token.Position { return r.Position }
+func (r *RestoreQuery) statementNode() {}
+
 // DescribeQuery represents a DESCRIBE statement.
 type DescribeQuery struct {
 	Position token.Position `json:"-"`
@@ -860,15 +911,16 @@ func (s *SetQuery) statementNode() {}
 
 // OptimizeQuery represents an OPTIMIZE statement.
 type OptimizeQuery struct {
-	Position  token.Position `json:"-"`
-	Database  string         `json:"database,omitempty"`
-	Table     string         `json:"table"`
-	Partition Expression     `json:"partition,omitempty"`
-	Final     bool           `json:"final,omitempty"`
-	Cleanup   bool           `json:"cleanup,omitempty"`
-	Dedupe    bool           `json:"dedupe,omitempty"`
-	OnCluster string         `json:"on_cluster,omitempty"`
-	Settings  []*SettingExpr `json:"settings,omitempty"`
+	Position      token.Position `json:"-"`
+	Database      string         `json:"database,omitempty"`
+	Table         string         `json:"table"`
+	Partition     Expression     `json:"partition,omitempty"`
+	PartitionByID bool           `json:"partition_by_id,omitempty"` // PARTITION ID vs PARTITION expr
+	Final         bool           `json:"final,omitempty"`
+	Cleanup       bool           `json:"cleanup,omitempty"`
+	Dedupe        bool           `json:"dedupe,omitempty"`
+	OnCluster     string         `json:"on_cluster,omitempty"`
+	Settings      []*SettingExpr `json:"settings,omitempty"`
 }
 
 func (o *OptimizeQuery) Pos() token.Position { return o.Position }
@@ -995,6 +1047,20 @@ func (s *ShowGrantsQuery) Pos() token.Position { return s.Position }
 func (s *ShowGrantsQuery) End() token.Position { return s.Position }
 func (s *ShowGrantsQuery) statementNode() {}
 
+// KillQuery represents a KILL QUERY/MUTATION statement.
+type KillQuery struct {
+	Position token.Position `json:"-"`
+	Type     string         `json:"type"`             // "QUERY" or "MUTATION"
+	Where    Expression     `json:"where,omitempty"`  // WHERE condition
+	Sync     bool           `json:"sync,omitempty"`   // SYNC mode (default false = ASYNC)
+	Test     bool           `json:"test,omitempty"`   // TEST mode
+	Format   string         `json:"format,omitempty"` // FORMAT clause
+}
+
+func (k *KillQuery) Pos() token.Position { return k.Position }
+func (k *KillQuery) End() token.Position { return k.Position }
+func (k *KillQuery) statementNode() {}
+
 // ShowPrivilegesQuery represents a SHOW PRIVILEGES statement.
 type ShowPrivilegesQuery struct {
 	Position token.Position `json:"-"`
@@ -1360,6 +1426,7 @@ type ColumnTransformer struct {
 	Apply string `json:"apply,omitempty"` // function name for APPLY
 	ApplyLambda Expression `json:"apply_lambda,omitempty"` // lambda expression for APPLY x -> expr
 	Except []string `json:"except,omitempty"` // column names for EXCEPT
+	Pattern string `json:"pattern,omitempty"` // regex pattern for EXCEPT('pattern')
 	Replaces []*ReplaceExpr `json:"replaces,omitempty"` // replacement expressions for REPLACE
 }
 

internal/explain/explain.go

Lines changed: 6 additions & 0 deletions
@@ -238,6 +238,10 @@ func Node(sb *strings.Builder, node interface{}, depth int) {
 		explainDetachQuery(sb, n, indent)
 	case *ast.AttachQuery:
 		explainAttachQuery(sb, n, indent, depth)
+	case *ast.BackupQuery:
+		explainBackupQuery(sb, n, indent)
+	case *ast.RestoreQuery:
+		explainRestoreQuery(sb, n, indent)
 	case *ast.AlterQuery:
 		explainAlterQuery(sb, n, indent, depth)
 	case *ast.OptimizeQuery:
@@ -254,6 +258,8 @@ func Node(sb *strings.Builder, node interface{}, depth int) {
 		explainUpdateQuery(sb, n, indent, depth)
 	case *ast.ParallelWithQuery:
 		explainParallelWithQuery(sb, n, indent, depth)
+	case *ast.KillQuery:
+		explainKillQuery(sb, n, indent, depth)
 
 	// Types
 	case *ast.DataType:

internal/explain/expressions.go

Lines changed: 57 additions & 22 deletions
@@ -8,9 +8,12 @@ import (
 	"github.com/sqlc-dev/doubleclick/ast"
 )
 
-// escapeAlias escapes backslashes in alias names for EXPLAIN output
+// escapeAlias escapes backslashes and single quotes in alias names for EXPLAIN output
 func escapeAlias(alias string) string {
-	return strings.ReplaceAll(alias, "\\", "\\\\")
+	// Escape backslashes first, then single quotes
+	result := strings.ReplaceAll(alias, "\\", "\\\\")
+	result = strings.ReplaceAll(result, "'", "\\'")
+	return result
 }
 
 func explainIdentifier(sb *strings.Builder, n *ast.Identifier, indent string) {
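
The comment "Escape backslashes first, then single quotes" matters because the second step introduces new backslashes. The sketch below copies the updated `escapeAlias` and compares it with a hypothetical `escapeAliasWrongOrder`, which is not in the commit and exists only to show what the swapped order would produce.

```go
package main

import (
	"fmt"
	"strings"
)

// escapeAlias mirrors the updated function in the diff above.
func escapeAlias(alias string) string {
	result := strings.ReplaceAll(alias, "\\", "\\\\")
	return strings.ReplaceAll(result, "'", "\\'")
}

// escapeAliasWrongOrder is a hypothetical variant with the two steps swapped.
// The backslashes added for the quotes are then doubled by the second pass.
func escapeAliasWrongOrder(alias string) string {
	result := strings.ReplaceAll(alias, "'", "\\'")
	return strings.ReplaceAll(result, "\\", "\\\\")
}

func main() {
	fmt.Println(escapeAlias("'String'"))           // \'String\'
	fmt.Println(escapeAliasWrongOrder("'String'")) // \\'String\\'
}
```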
@@ -53,19 +56,17 @@ func explainLiteral(sb *strings.Builder, n *ast.Literal, indent string, depth in
 			fmt.Fprintf(sb, "%s ExpressionList\n", indent)
 			return
 		}
-		// Single-element tuples (from trailing comma syntax like (1,)) always render as Function tuple
-		if len(exprs) == 1 {
-			fmt.Fprintf(sb, "%sFunction tuple (children %d)\n", indent, 1)
-			fmt.Fprintf(sb, "%s ExpressionList (children %d)\n", indent, len(exprs))
-			for _, e := range exprs {
-				Node(sb, e, depth+2)
-			}
-			return
-		}
+		// Check if any element is parenthesized (e.g., ((1), (2)) vs (1, 2))
+		// Parenthesized elements mean the tuple should render as Function tuple
+		hasParenthesizedElement := false
 		hasComplexExpr := false
 		for _, e := range exprs {
-			// Simple literals (numbers, strings, etc.) are OK
+			// Check for parenthesized literals
 			if lit, isLit := e.(*ast.Literal); isLit {
+				if lit.Parenthesized {
+					hasParenthesizedElement = true
+					break
+				}
 				// Nested tuples that contain only primitive literals are OK
 				if lit.Type == ast.LiteralTuple {
 					if !containsOnlyPrimitiveLiteralsWithUnary(lit) {
@@ -79,7 +80,6 @@ func explainLiteral(sb *strings.Builder, n *ast.Literal, indent string, depth in
 					hasComplexExpr = true
 					break
 				}
-				// Other literals are simple
 				continue
 			}
 			// Unary negation of numeric literals is also simple
@@ -94,8 +94,9 @@ func explainLiteral(sb *strings.Builder, n *ast.Literal, indent string, depth in
 			hasComplexExpr = true
 			break
 		}
-		if hasComplexExpr {
-			// Render as Function tuple instead of Literal
+		// Single-element tuples (from trailing comma syntax like (1,)) always render as Function tuple
+		// Tuples with complex expressions or parenthesized elements also render as Function tuple
+		if len(exprs) == 1 || hasComplexExpr || hasParenthesizedElement {
 			fmt.Fprintf(sb, "%sFunction tuple (children %d)\n", indent, 1)
 			fmt.Fprintf(sb, "%s ExpressionList (children %d)\n", indent, len(exprs))
 			for _, e := range exprs {
@@ -131,6 +132,10 @@ func explainLiteral(sb *strings.Builder, n *ast.Literal, indent string, depth in
 
 		for _, e := range exprs {
 			if lit, ok := e.(*ast.Literal); ok {
+				// Parenthesized elements require Function array format
+				if lit.Parenthesized {
+					shouldUseFunctionArray = true
+				}
 				if lit.Type == ast.LiteralArray {
 					hasNestedArrays = true
 					// Check if inner array needs Function array format:
@@ -395,8 +400,13 @@ func collectLogicalOperands(n *ast.BinaryExpr) []ast.Expression {
 		operands = append(operands, n.Left)
 	}
 
-	// Don't flatten right side - explicit parentheses would be on the left in left-associative parsing
-	operands = append(operands, n.Right)
+	// Also flatten right side if it's the same operator and not parenthesized
+	// This handles both left-associative and right-associative parsing
+	if right, ok := n.Right.(*ast.BinaryExpr); ok && right.Op == n.Op && !right.Parenthesized {
+		operands = append(operands, collectLogicalOperands(right)...)
+	} else {
+		operands = append(operands, n.Right)
+	}
 
 	return operands
 }
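
To make the effect of flattening both sides concrete, here is a minimal sketch. `Expr`, `Ident`, and `BinaryExpr` are simplified stand-ins for the `ast` package types. The left-side handling is not fully visible in the hunk, so the sketch assumes it mirrors the right side.

```go
package main

import "fmt"

// Expr, Ident and BinaryExpr are simplified, hypothetical stand-ins for the
// real ast types; only the fields used by the flattening logic are kept.
type Expr interface{}

type Ident struct{ Name string }

type BinaryExpr struct {
	Op            string
	Left, Right   Expr
	Parenthesized bool
}

// collectOperands flattens a chain of the same operator on both sides,
// stopping at explicitly parenthesized sub-expressions. The left-side branch
// is assumed to be symmetric with the right-side branch shown in the diff.
func collectOperands(n *BinaryExpr) []Expr {
	var operands []Expr
	if left, ok := n.Left.(*BinaryExpr); ok && left.Op == n.Op && !left.Parenthesized {
		operands = append(operands, collectOperands(left)...)
	} else {
		operands = append(operands, n.Left)
	}
	if right, ok := n.Right.(*BinaryExpr); ok && right.Op == n.Op && !right.Parenthesized {
		operands = append(operands, collectOperands(right)...)
	} else {
		operands = append(operands, n.Right)
	}
	return operands
}

func main() {
	a, b, c := &Ident{Name: "a"}, &Ident{Name: "b"}, &Ident{Name: "c"}

	// a AND b AND c, parsed right-nested: flattens to three operands.
	chain := &BinaryExpr{Op: "AND", Left: a, Right: &BinaryExpr{Op: "AND", Left: b, Right: c}}
	fmt.Println(len(collectOperands(chain))) // 3

	// a AND (b AND c): the parenthesized group stays a single operand.
	grouped := &BinaryExpr{Op: "AND", Left: a, Right: &BinaryExpr{Op: "AND", Left: b, Right: c, Parenthesized: true}}
	fmt.Println(len(collectOperands(grouped))) // 2
}
```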
@@ -425,8 +435,15 @@ func explainUnaryExpr(sb *strings.Builder, n *ast.UnaryExpr, indent string, dept
 			// ClickHouse normalizes -0 to UInt64_0
 			if val == 0 {
 				fmt.Fprintf(sb, "%sLiteral UInt64_0\n", indent)
-			} else {
+			} else if val <= 9223372036854775808 {
+				// Value fits in int64 when negated
+				// Note: -9223372036854775808 is int64 min, so 9223372036854775808 is included
 				fmt.Fprintf(sb, "%sLiteral Int64_-%d\n", indent, val)
+			} else {
+				// Value too large for int64 - output as Float64
+				f := -float64(val)
+				s := FormatFloat(f)
+				fmt.Fprintf(sb, "%sLiteral Float64_%s\n", indent, s)
 			}
 			return
 		}
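
The three-way branch for a negated `uint64` can be exercised in isolation. In this sketch `negatedLiteral` is a hypothetical wrapper around the same comparisons. `strconv.FormatFloat` stands in for the project's own `FormatFloat` helper, whose exact output format is not shown in the diff, so the Float64 text here may differ from the real EXPLAIN output.

```go
package main

import (
	"fmt"
	"strconv"
)

// negatedLiteral is a hypothetical helper reproducing the branch structure
// from the diff: zero stays UInt64_0, magnitudes up to 2^63 print as Int64,
// anything larger falls back to Float64.
func negatedLiteral(val uint64) string {
	switch {
	case val == 0:
		return "UInt64_0"
	case val <= 9223372036854775808:
		// Printing the unsigned magnitude after a literal '-' avoids
		// negating into int64, which would overflow for 2^63.
		return fmt.Sprintf("Int64_-%d", val)
	default:
		// Assumption: strconv.FormatFloat replaces the project's FormatFloat.
		return "Float64_" + strconv.FormatFloat(-float64(val), 'g', -1, 64)
	}
}

func main() {
	fmt.Println(negatedLiteral(0))                   // UInt64_0
	fmt.Println(negatedLiteral(9223372036854775808)) // Int64_-9223372036854775808
	fmt.Println(negatedLiteral(9223372036854775809)) // a Float64_ literal
}
```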
@@ -647,7 +664,16 @@ func explainAliasedExpr(sb *strings.Builder, n *ast.AliasedExpr, depth int) {
 					fmt.Fprintf(sb, "%sLiteral Int64_%d (alias %s)\n", indent, -val, escapeAlias(n.Alias))
 					return
 				case uint64:
-					fmt.Fprintf(sb, "%sLiteral Int64_-%d (alias %s)\n", indent, val, escapeAlias(n.Alias))
+					if val <= 9223372036854775808 {
+						// Value fits in int64 when negated
+						// Note: -9223372036854775808 is int64 min, so 9223372036854775808 is included
+						fmt.Fprintf(sb, "%sLiteral Int64_-%d (alias %s)\n", indent, val, escapeAlias(n.Alias))
+					} else {
+						// Value too large for int64 - output as Float64
+						f := -float64(val)
+						s := FormatFloat(f)
+						fmt.Fprintf(sb, "%sLiteral Float64_%s (alias %s)\n", indent, s, escapeAlias(n.Alias))
+					}
 					return
 				}
 			case ast.LiteralFloat:
@@ -789,9 +815,14 @@ func explainSingleTransformer(sb *strings.Builder, t *ast.ColumnTransformer, ind
 	case "apply":
 		fmt.Fprintf(sb, "%s ColumnsApplyTransformer\n", indent)
 	case "except":
-		fmt.Fprintf(sb, "%s ColumnsExceptTransformer (children %d)\n", indent, len(t.Except))
-		for _, col := range t.Except {
-			fmt.Fprintf(sb, "%s Identifier %s\n", indent, col)
+		// If it's a regex pattern, output without children
+		if t.Pattern != "" {
+			fmt.Fprintf(sb, "%s ColumnsExceptTransformer\n", indent)
+		} else {
+			fmt.Fprintf(sb, "%s ColumnsExceptTransformer (children %d)\n", indent, len(t.Except))
+			for _, col := range t.Except {
+				fmt.Fprintf(sb, "%s Identifier %s\n", indent, col)
+			}
 		}
 	case "replace":
 		fmt.Fprintf(sb, "%s ColumnsReplaceTransformer (children %d)\n", indent, len(t.Replaces))
@@ -1029,6 +1060,10 @@ func explainWithElement(sb *strings.Builder, n *ast.WithElement, indent string,
 		}
 	case *ast.CastExpr:
 		explainCastExprWithAlias(sb, e, n.Name, indent, depth)
+	case *ast.ArrayAccess:
+		explainArrayAccessWithAlias(sb, e, n.Name, indent, depth)
+	case *ast.BetweenExpr:
+		explainBetweenExprWithAlias(sb, e, n.Name, indent, depth)
 	default:
 		// For other types, just output the expression (alias may be lost)
 		Node(sb, n.Query, depth)
