1 | | -# PostgreSQL 13->14 AST Transformer Notes |
2 | | - |
3 | 1 | ## Current Status |
4 | | -- **Pass Rate**: 125/258 tests (48.4%) |
5 | | -- **Baseline**: Improved from 124 to 125 with enum transformations |
6 | | -- **Branch**: devin/1750826349-v13-to-v14-transformer |
7 | | -- **Last Updated**: June 26, 2025 22:04 UTC |
8 | | - |
9 | | -## Primary Challenge: funcformat Field Transformation |
10 | | - |
11 | | -### Problem Description |
12 | | -The main blocker for improving beyond 124/258 is the `funcformat` field in `FuncCall` nodes. The current transformer adds `funcformat: "COERCE_EXPLICIT_CALL"` to all FuncCall nodes, but PG14's actual behavior is more nuanced: |
13 | | - |
14 | | -### Observed Patterns from Failing Tests |
15 | | - |
16 | | -#### 1. SQL Syntax Functions (should use COERCE_SQL_SYNTAX) |
17 | | -- **TRIM functions**: `TRIM(BOTH FROM ' text ')` → `funcformat: "COERCE_SQL_SYNTAX"` |
18 | | -- **String functions**: `SUBSTRING`, `POSITION`, `OVERLAY` |
19 | | -- **Date/time functions**: `EXTRACT`, `CURRENT_DATE`, `CURRENT_TIMESTAMP` |
20 | | - |
21 | | -**Example failure** (strings-41.sql): |
22 | | -``` |
23 | | -Expected: "funcformat": "COERCE_SQL_SYNTAX" |
24 | | -Received: "funcformat": "COERCE_EXPLICIT_CALL" |
25 | | -``` |
26 | | - |
27 | | -#### 2. Aggregate Functions in TypeCast (should have NO funcformat) |
28 | | -- **Aggregate + TypeCast**: `CAST(AVG(column) AS NUMERIC(10,3))` → no funcformat field |
29 | | -- **Mathematical functions in casts**: follow the same pattern (no `funcformat` field expected)
30 | | - |
31 | | -**Example failure** (aggregates-3.sql): |
32 | | -``` |
33 | | -Expected: (no funcformat field) |
34 | | -Received: "funcformat": "COERCE_EXPLICIT_CALL" |
35 | | -``` |
36 | | - |
37 | | -#### 3. Context-Specific Exclusions (already implemented) |
38 | | -Current exclusions working correctly: |
39 | | -- CHECK constraints |
40 | | -- COMMENT statements |
41 | | -- TypeCast contexts |
42 | | -- XmlExpr contexts |
43 | | -- INSERT statements |
44 | | -- RangeFunction contexts |
45 | | - |
46 | | -### Technical Implementation Challenges |
47 | | - |
48 | | -#### Current Approach |
49 | | -```typescript |
50 | | -// Current: One-size-fits-all |
51 | | -if (!this.shouldExcludeFuncformat(node, context)) { |
52 | | - result.funcformat = "COERCE_EXPLICIT_CALL"; |
53 | | -} |
54 | | -``` |
55 | | - |
56 | | -#### Needed Approach |
57 | | -```typescript |
58 | | -// Needed: Function-specific logic |
59 | | -if (!this.shouldExcludeFuncformat(node, context)) { |
60 | | - result.funcformat = this.getFuncformatValue(node, context); |
61 | | -} |
62 | | - |
63 | | -private getFuncformatValue(node: any, context: TransformerContext): string { |
64 | | - const funcname = this.getFunctionName(node); |
65 | | - |
66 | | - // SQL syntax functions |
67 | | - if (sqlSyntaxFunctions.includes(funcname.toLowerCase())) { |
68 | | - return 'COERCE_SQL_SYNTAX'; |
69 | | - } |
70 | | - |
71 | | - // Default to explicit call |
72 | | - return 'COERCE_EXPLICIT_CALL'; |
73 | | -} |
74 | | -``` |
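The function-specific lookup outlined above could be sketched as follows. This is a minimal, self-contained illustration, not the transformer's actual code: the `FuncCallNode` shape, the contents of `SQL_SYNTAX_FUNCTIONS`, and the standalone `getFunctionName` helper are assumptions, with the name list taken only from the functions mentioned in these notes. The names the PG13 parser actually emits for these constructs may differ, so a lookup by name alone may not be sufficient.

```typescript
// Hypothetical minimal node shapes; the real AST types carry many more fields.
interface StringNode {
  String: { str: string };
}

interface FuncCallNode {
  funcname?: StringNode[];
  funcformat?: string;
}

// Assumed set, built from the function names listed in these notes.
const SQL_SYNTAX_FUNCTIONS: ReadonlySet<string> = new Set([
  "trim",
  "substring",
  "position",
  "overlay",
  "extract",
]);

// Returns the last (unqualified) part of the function name, if present.
function getFunctionName(node: FuncCallNode): string | undefined {
  const parts = node.funcname ?? [];
  const last = parts[parts.length - 1];
  return last?.String?.str;
}

// Picks a funcformat value from the function name, defaulting to an explicit call.
function getFuncformatValue(node: FuncCallNode): string {
  const name = getFunctionName(node)?.toLowerCase();
  if (name !== undefined && SQL_SYNTAX_FUNCTIONS.has(name)) {
    return "COERCE_SQL_SYNTAX";
  }
  return "COERCE_EXPLICIT_CALL";
}
```

Using a `Set` keeps the lookup constant-time, and guarding against a missing name avoids a runtime error on nodes without `funcname`.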
75 | | - |
76 | | -### Analysis of Remaining 134 Failing Tests |
77 | | - |
78 | | -#### Test Categories with funcformat Issues: |
79 | | -1. **String manipulation**: TRIM, SUBSTRING, etc. (need COERCE_SQL_SYNTAX) |
80 | | -2. **Aggregates in TypeCast**: AVG, SUM, etc. in CAST expressions (need exclusion) |
81 | | -3. **Date/time functions**: EXTRACT, date arithmetic (need COERCE_SQL_SYNTAX) |
82 | | -4. **Array operations**: Array functions and operators |
83 | | -5. **Numeric operations**: Mathematical functions in various contexts |
84 | | - |
85 | | -#### Root Cause Analysis: |
86 | | -The 124/258 plateau suggests that: |
87 | | -- Context-specific exclusions are working (no regressions)
88 | | -- Function-specific `funcformat` values are the missing piece
89 | | -- The transformer needs to distinguish SQL-syntax functions from explicit-call functions
90 | | -- Aggregate-in-TypeCast patterns need better detection
91 | | - |
92 | | -### Next Steps to Break the Plateau |
93 | | - |
94 | | -1. **Implement function-specific funcformat logic** |
95 | | - - Create mapping of SQL syntax functions |
96 | | - - Add getFuncformatValue() method |
97 | | - - Test with TRIM/string function failures |
98 | | - |
99 | | -2. **Enhance TypeCast + Aggregate detection** |
100 | | - - Improve context detection for aggregates in casts |
101 | | - - May need parent node analysis beyond current path checking |
102 | | - |
103 | | -3. **Systematic testing approach** |
104 | | - - Target specific failing test categories |
105 | | - - Verify each improvement maintains baseline |
106 | | - - Focus on high-impact function types first |
107 | | - |
108 | | -### Key Insights |
109 | | -- The transformer architecture is sound (124/258 baseline is stable) |
110 | | -- Context-specific exclusions work correctly |
111 | | -- The remaining challenge is function-type-specific behavior |
112 | | -- PG14 parser behavior varies significantly by function category |
113 | | -- Need more granular funcformat assignment logic |
114 | | - |
115 | | -## Implementation Strategy |
116 | | -Focus on breaking the 125/258 plateau by implementing function-specific funcformat logic, starting with the most common failing patterns (TRIM, aggregates in TypeCast). |
117 | | - |
118 | | -## Recent Enum Transformations (June 26, 2025) |
119 | 2 |
|
120 | | -### Implemented Enum Mappings |
121 | | -Added systematic enum transformations to handle PG13->PG14 differences: |
| 3 | +13-14 |
| 4 | +Test Suites: 23 failed, 235 passed, 258 total |
122 | 5 |
|
123 | | -#### A_Expr_Kind Transformations |
124 | | -```typescript |
125 | | -private transformA_Expr_Kind(kind: string): string { |
126 | | - const pg13ToPg14Map: { [key: string]: string } = {
127 | | - 'AEXPR_OF': 'AEXPR_IN', // AEXPR_OF removed in PG14
128 | | - 'AEXPR_PAREN': 'AEXPR_OP', // AEXPR_PAREN removed in PG14
129 | | - // ... other values preserved
130 | | - };
131 | | - return pg13ToPg14Map[kind] || kind;
132 | | -} |
133 | | -``` |
| 6 | +14-15 |
| 7 | +Test Suites: 4 failed, 254 passed, 258 total |
134 | 8 |
|
135 | | -#### RoleSpecType Transformations |
136 | | -```typescript |
137 | | -private transformRoleSpecType(type: string): string { |
138 | | - // Handles addition of ROLESPEC_CURRENT_ROLE at position 1 in PG14 |
139 | | - // Maps existing PG13 values to correct PG14 positions |
140 | | -} |
141 | | -``` |
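The position shift described in the comments above can be pictured with a small sketch. It is not the transformer's implementation: it assumes the enum arrives as a numeric index rather than a string name, the helper name `shiftRoleSpecType` is hypothetical, and the member order is taken from the `RoleSpecType` enum in each version's `parsenodes.h`.

```typescript
// Member order per version; ROLESPEC_CURRENT_ROLE is new in PG14 at index 1.
const PG13_ROLESPEC: readonly string[] = [
  "ROLESPEC_CSTRING",
  "ROLESPEC_CURRENT_USER",
  "ROLESPEC_SESSION_USER",
  "ROLESPEC_PUBLIC",
];

const PG14_ROLESPEC: readonly string[] = [
  "ROLESPEC_CSTRING",
  "ROLESPEC_CURRENT_ROLE",
  "ROLESPEC_CURRENT_USER",
  "ROLESPEC_SESSION_USER",
  "ROLESPEC_PUBLIC",
];

// Hypothetical helper: map a PG13 numeric enum index to the PG14 index
// of the same member by going through the member name.
function shiftRoleSpecType(pg13Index: number): number {
  const name = PG13_ROLESPEC[pg13Index];
  if (name === undefined) {
    throw new RangeError(`unknown PG13 RoleSpecType index: ${pg13Index}`);
  }
  return PG14_ROLESPEC.indexOf(name);
}
```

Going through the member name rather than hard-coding `index + 1` keeps the mapping correct for `ROLESPEC_CSTRING`, which stays at index 0 in both versions.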
| 9 | +15-16 |
| 10 | +Test Suites: 77 failed, 181 passed, 258 total |
142 | 11 |
|
143 | | -### Integration Points |
144 | | -- **A_Expr method**: Now calls `this.transformA_Expr_Kind(node.kind)` for enum transformation |
145 | | -- **RoleSpec method**: Calls `this.transformRoleSpecType(node.roletype)` for role type mapping |
146 | | -- **Fixed duplicate functions**: Removed conflicting transformRoleSpecType implementations |
| 12 | +16-17 |
| 13 | +Test Suites: 3 failed, 255 passed, 258 total |
147 | 14 |
|
148 | | -### Results |
149 | | -- **Pass Rate**: Maintained 125/258 (no regression from enum changes) |
150 | | -- **Stability**: Enum transformations working correctly without breaking existing functionality |
151 | | -- **Foundation**: Prepared for additional enum transformations (TableLikeOption, SetQuantifier) |
152 | 15 |
|
153 | | -### Analysis Scripts Created |
154 | | -- `analyze_funcformat_failures.js`: Systematic funcformat failure analysis |
155 | | -- `test_extract_direct.js`: Direct PG13 vs PG14 parser comparison |
156 | | -- `test_date_part_transform.js`: Function name transformation testing |
157 | | -- `investigate_enums.js`: Enum value investigation across versions |