feat(hogql): remaining duck/postgres syntax 2/2#51955
feat(hogql): remaining duck/postgres syntax 2/2#51955georgemunyoro wants to merge 30 commits intomasterfrom
Conversation
|
There was a problem hiding this comment.
Pull request overview
Expands HogQL’s “duck/postgres” dialect support to cover remaining DuckDB/Postgres-like SQL constructs, particularly around PIVOT, GROUP BY ALL, function-call ORDER BY, and richer type-cast type expressions.
Changes:
- Add PIVOT AST nodes + parsing, printing, and type-resolution support (postgres dialect only).
- Support
ORDER BYinside function calls and print/resolve it appropriately. - Extend parsing/printing for
GROUP BY ALLand additional CAST type syntaxes (arrays + complex types).
Reviewed changes
Copilot reviewed 23 out of 30 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| posthog/hogql/ast.py | Adds PivotExpr/PivotColumn nodes, extends Call with order_by, and updates join table union. |
| posthog/hogql/parser.py | Parses GROUP BY ALL, function-call ORDER BY, array/complex column types, and PIVOT table expressions. |
| posthog/hogql/visitor.py | Traversal + cloning support for new pivot nodes and Call.order_by. |
| posthog/hogql/resolver.py | Resolves PIVOT expressions (postgres dialect only) and integrates PIVOT into join resolution. |
| posthog/hogql/printer/base.py | Prints GROUP BY ALL, PIVOT, function-call ORDER BY, and prints ExpressionFieldType. |
| posthog/hogql/printer/postgres.py | Adds validation/printing support for function-call ORDER BY in postgres dialect. |
| posthog/hogql/printer/postgres_functions.py | Updates postgres passthrough function allowlist (adds year). |
| posthog/hogql/test/_test_parser.py | Parser coverage for function-call ORDER BY, GROUP BY ALL, PIVOT, and additional CAST type syntaxes. |
| posthog/hogql/test/test_resolver.py | Resolver coverage for PIVOT success cases + error cases, plus function-call ORDER BY. |
| posthog/hogql/printer/test/test_printer.py | Printer coverage for PIVOT and function-call ORDER BY. |
| posthog/hogql/grammar/HogQLParser.g4 | Grammar updates for PIVOT, GROUP BY ALL, function-call ORDER BY, and array/complex type expressions. |
| posthog/hogql/grammar/HogQLLexer.common.g4 | Adds PIVOT keyword token. |
| posthog/hogql/grammar/HogQLParserVisitor.py | Regenerates visitor stubs for new grammar nodes. |
| posthog/hogql/grammar/HogQLLexer.tokens | Regenerated lexer tokens including PIVOT. |
| posthog/hogql/grammar/HogQLParser.tokens | Regenerated parser tokens including PIVOT. |
| common/hogql_parser/parser_json.cpp | C++ JSON converter support for GROUP BY ALL, type arrays/complex, function-call ORDER BY, and PIVOT. |
| common/hogql_parser/HogQLParser.h | Regenerated C++ parser header for new grammar constructs. |
| common/hogql_parser/HogQLLexer.h | Regenerated C++ lexer header for new token set. |
| common/hogql_parser/HogQLParserVisitor.h | Regenerated C++ visitor interface additions. |
| common/hogql_parser/HogQLParserBaseVisitor.h | Regenerated C++ base visitor additions. |
| common/hogql_parser/HogQLParser.tokens | Regenerated C++ parser tokens including PIVOT. |
| common/hogql_parser/HogQLLexer.tokens | Regenerated C++ lexer tokens including PIVOT. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
i see this is a draft. is there a part 1 i missed or is it merged? |
@andrewjmcgehee Part 1 is already merged, will have this one merged by end of day #50353 |
- Added support for `JoinExprPositional` in the HogQL parser to handle positional joins. - Implemented new column expression types: `ColumnExprColumnsQualifiedExclude`, `ColumnExprColumnsQualifiedReplace`, `ColumnExprColumnsQualifiedExcludeReplace`, and `ColumnExprColumnsQualifiedAll` to extend the capabilities of column expressions. - Updated the `HogQLParseTreeConverter` to process the new join and column expressions correctly. - Enhanced the `HogQLPrinter` to include `ORDER BY` clause handling in the output for queries that specify ordering.
- Updated HogQLParser.tokens to include new tokens for IGNORE, INCLUDE, and other keywords. - Implemented visitJoinExprPivot method in HogQLParserVisitor to handle JOIN with PIVOT expressions. - Added visitColumnExprIgnoreNulls method in HogQLParseTreeConverter to process IGNORE NULLS expressions. - Enhanced visitJoinExprPivot method to construct the appropriate AST for PIVOT expressions. - Modified visitColumnExprList to ensure compatibility with new expression handling.
- Implemented visitJoinExprUnpivot method in HogQLParserVisitor to handle JOIN UNPIVOT parse tree. - Enhanced HogQLParseTreeConverter with visitJoinExprUnpivot to convert JOIN UNPIVOT expressions into AST.
Prompt To Fix All With AIThis is a comment left during a code review.
Path: posthog/hogql/parser.py
Line: 1293-1298
Comment:
**`visitColumnExprColumnsQualifiedAll` / `visitColumnExprColumnsQualifiedExclude` produce objects inconsistent with their tests**
`visitColumnExprColumnsQualifiedAll` (line 1293) returns `ast.ColumnsExpr(all_columns=True)`, identical to `visitColumnExprColumnsAll`. It discards the table qualifier entirely.
The parser test at `_test_parser.py:1208-1212` expects:
```python
ast.ColumnsExpr(columns=[ast.Field(chain=["events", "*"])])
```
…which has `all_columns=False, columns=[Field(["events", "*"])]`.
`visitColumnExprColumnsQualifiedExclude` (line 1296) returns `ast.ColumnsExpr(all_columns=True, exclude=exclude)`, but the test at `_test_parser.py:1214-1219` expects:
```python
ast.ColumnsExpr(columns=[ast.ColumnsExpr(all_columns=True, exclude=["event"])])
```
Because these are plain Python dataclasses, the equality comparisons would fail for both cases. The `COLUMNS(events.*)` and `COLUMNS(events.* EXCLUDE (...))` visitors need to either (a) preserve the qualifier in the returned node, or (b) the tests need to be updated to the simpler `all_columns=True` form if losing the qualifier is intentional.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: posthog/hogql/parser.py
Line: 410-411
Comment:
**`SelectSetQuery.order_by` appears to be dead code**
The grammar change adds `orderByClause?` to `selectSetStmt`, but ANTLR's LL(\*) parser greedily assigns any trailing `ORDER BY` to the last individual `selectStmt` inside `subsequentSelectSetClause`. As a result, `ctx.orderByClause()` here would always be `None` and `result.order_by` is never set.
This is confirmed by both new tests:
1. **Parser test** (`_test_parser.py:1868`): `test_select_set_order_by` places `ORDER BY` on the last individual `SelectQuery` within `subsequent_select_queries`, not on `SelectSetQuery`.
2. **Printer test** (`test_printer.py:346`): the expected string is `"SELECT 1 LIMIT 50000 UNION ALL SELECT 2 ORDER BY 1 ASC LIMIT 50000"`. `ORDER BY 1 ASC` appears *before* the final `LIMIT 50000`, which can only happen when the order-by belongs to the individual `SelectQuery`. If `SelectSetQuery.order_by` were populated, the printer (lines 144–149 of `base.py`) would append it *after* the individual LIMITs, producing `"…SELECT 2 LIMIT 50000 ORDER BY 1 ASC"`.
The infrastructure added to support `SelectSetQuery.order_by` (this parser line, `resolver.py:644`, `visitor.py:CloningVisitor`, and `printer/base.py:144-149`) is never exercised and violates the "no superfluous parts" simplicity rule.
If union-level ORDER BY is truly needed (e.g. for the `(SELECT 1) UNION ALL (SELECT 2) ORDER BY 1` parenthesised form), a dedicated test for that form is required to confirm the code path is actually triggered.
How can I resolve this? If you propose a fix, please make it concise.Reviews (2): Last reviewed commit: "bot review" | Re-trigger Greptile |
|
It looks like the code of |
|
It looks like the code of |
|
Size Change: 0 B Total Size: 110 MB ℹ️ View Unchanged
|
… into feat/hogql-final-quack-2
Problem
We should be able to fully parse DuckDB SQL
Changes
Add remaining syntax + grammar tweaks
How did you test this code?
Unit tests
Publish to changelog?
No