Commit 97732f7

feat: implement function-specific funcformat logic to break 124/258 plateau
- Add getFuncformatValue method to determine COERCE_SQL_SYNTAX vs COERCE_EXPLICIT_CALL
- SQL syntax functions (TRIM, SUBSTRING, EXTRACT, etc.) now use COERCE_SQL_SYNTAX
- Improved pass rate from 124/258 to 125/258 (first breakthrough beyond plateau)
- Document funcformat transformation challenges in NOTES.md

Co-Authored-By: Dan Lynch <[email protected]>
1 parent 84a67fe commit 97732f7

2 files changed: +138 -1 lines changed


packages/transform/NOTES.md

Lines changed: 115 additions & 0 deletions
@@ -0,0 +1,115 @@
# PostgreSQL 13->14 AST Transformer Notes

## Current Status
- **Pass Rate**: 124/258 tests (48%)
- **Baseline**: Stable at 124/258 despite comprehensive transformations
- **Branch**: devin/1750826349-v13-to-v14-transformer

## Primary Challenge: funcformat Field Transformation

### Problem Description
The main blocker for improving beyond 124/258 is the `funcformat` field in `FuncCall` nodes. The current transformer adds `funcformat: "COERCE_EXPLICIT_CALL"` to all `FuncCall` nodes, but PG14's actual behavior is more nuanced.

### Observed Patterns from Failing Tests

#### 1. SQL Syntax Functions (should use COERCE_SQL_SYNTAX)
- **TRIM functions**: `TRIM(BOTH FROM ' text ')` → `funcformat: "COERCE_SQL_SYNTAX"`
- **String functions**: `SUBSTRING`, `POSITION`, `OVERLAY`
- **Date/time functions**: `EXTRACT`, `CURRENT_DATE`, `CURRENT_TIMESTAMP`

**Example failure** (strings-41.sql):
```
Expected: "funcformat": "COERCE_SQL_SYNTAX"
Received: "funcformat": "COERCE_EXPLICIT_CALL"
```

#### 2. Aggregate Functions in TypeCast (should have NO funcformat)
- **Aggregate + TypeCast**: `CAST(AVG(column) AS NUMERIC(10,3))` → no funcformat field
- **Mathematical functions in casts**: Similar pattern

**Example failure** (aggregates-3.sql):
```
Expected: (no funcformat field)
Received: "funcformat": "COERCE_EXPLICIT_CALL"
```

#### 3. Context-Specific Exclusions (already implemented)
Current exclusions working correctly:
- CHECK constraints
- COMMENT statements
- TypeCast contexts
- XmlExpr contexts
- INSERT statements
- RangeFunction contexts
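
As a rough illustration of how a path-based exclusion check could be structured, the sketch below tests the ancestor chain against the contexts listed above. `SketchContext`, `parentNodeTypes`, and the node-type names in `EXCLUDED_ANCESTORS` are hypothetical stand-ins; the real `TransformerContext` and exclusion logic in this package may differ.

```typescript
// Hypothetical context shape: ancestor node types, outermost first.
// The real TransformerContext in this package may look different.
interface SketchContext {
  parentNodeTypes: string[];
}

// Ancestor node types under which funcformat is left off.
// These names are assumptions mapped from the exclusion list in the notes.
const EXCLUDED_ANCESTORS: ReadonlySet<string> = new Set([
  "Constraint",    // stands in for CHECK constraints; a real check would also inspect the constraint type
  "CommentStmt",   // COMMENT statements
  "TypeCast",
  "XmlExpr",
  "InsertStmt",
  "RangeFunction",
]);

function shouldExcludeFuncformat(context: SketchContext): boolean {
  // Look at every ancestor, not only the direct parent, so a FuncCall nested
  // a few levels below an excluded node is still caught.
  return context.parentNodeTypes.some((t) => EXCLUDED_ANCESTORS.has(t));
}

// Example: AVG(...) inside CAST(... AS NUMERIC) vs. a plain select-list call.
const inCast: SketchContext = { parentNodeTypes: ["SelectStmt", "ResTarget", "TypeCast"] };
const plain: SketchContext = { parentNodeTypes: ["SelectStmt", "ResTarget"] };
```

Checking the whole ancestor chain rather than only the direct parent is one way to approach the aggregate-in-cast case, though whether it matches PG14's output still needs to be confirmed against the failing fixtures.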

### Technical Implementation Challenges

#### Current Approach
```typescript
// Current: One-size-fits-all
if (!this.shouldExcludeFuncformat(node, context)) {
  result.funcformat = "COERCE_EXPLICIT_CALL";
}
```

#### Needed Approach
```typescript
// Needed: Function-specific logic
if (!this.shouldExcludeFuncformat(node, context)) {
  result.funcformat = this.getFuncformatValue(node, context);
}

private getFuncformatValue(node: any, context: TransformerContext): string {
  const funcname = this.getFunctionName(node);

  // SQL syntax functions
  if (sqlSyntaxFunctions.includes(funcname.toLowerCase())) {
    return 'COERCE_SQL_SYNTAX';
  }

  // Default to explicit call
  return 'COERCE_EXPLICIT_CALL';
}
```
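
A self-contained version of this lookup, as a sketch only. The `FuncCallLike` shape and this `getFunctionName` helper are assumptions made for illustration (the real helper is not shown in these notes), and the name set is a subset of the functions listed here.

```typescript
// Hypothetical, simplified node shapes; the real PG13 types come from the package's typings.
type StringNode = { String: { str: string } };

interface FuncCallLike {
  funcname?: StringNode[];
}

type Funcformat = "COERCE_EXPLICIT_CALL" | "COERCE_SQL_SYNTAX";

// Subset of the names discussed in the notes; a Set gives constant-time membership checks.
const SQL_SYNTAX_FUNCTIONS: ReadonlySet<string> = new Set([
  "trim", "btrim", "substring", "position", "overlay", "extract",
]);

// Assumed helper: returns the last (unqualified) name component, or null if absent.
function getFunctionName(node: FuncCallLike): string | null {
  const parts = node.funcname ?? [];
  const last = parts[parts.length - 1];
  return last ? last.String.str : null;
}

function getFuncformatValue(node: FuncCallLike): Funcformat {
  const name = getFunctionName(node);
  if (name !== null && SQL_SYNTAX_FUNCTIONS.has(name.toLowerCase())) {
    return "COERCE_SQL_SYNTAX";
  }
  // Unknown or missing names fall back to the explicit-call format.
  return "COERCE_EXPLICIT_CALL";
}

// Example inputs (shapes are illustrative, not copied from a fixture).
const trimCall: FuncCallLike = {
  funcname: [{ String: { str: "pg_catalog" } }, { String: { str: "btrim" } }],
};
const lowerCall: FuncCallLike = { funcname: [{ String: { str: "lower" } }] };
```

A name-only lookup cannot tell SQL syntax such as `TRIM(...)` apart from a direct call to the same underlying function, so it may be worth checking the fixtures to see whether an extra signal (for example, schema qualification of the name) is needed.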

### Analysis of Remaining 134 Failing Tests

#### Test Categories with funcformat Issues:
1. **String manipulation**: TRIM, SUBSTRING, etc. (need COERCE_SQL_SYNTAX)
2. **Aggregates in TypeCast**: AVG, SUM, etc. in CAST expressions (need exclusion)
3. **Date/time functions**: EXTRACT, date arithmetic (need COERCE_SQL_SYNTAX)
4. **Array operations**: Array functions and operators
5. **Numeric operations**: Mathematical functions in various contexts

#### Root Cause Analysis:
The 124/258 plateau suggests that:
- Context-specific exclusions are working (no regressions)
- But function-specific `funcformat` values are the missing piece
- Need to distinguish between SQL syntax vs explicit call functions
- Need better detection of aggregate-in-typecast patterns

### Next Steps to Break the Plateau

1. **Implement function-specific funcformat logic**
   - Create mapping of SQL syntax functions
   - Add getFuncformatValue() method
   - Test with TRIM/string function failures

2. **Enhance TypeCast + Aggregate detection**
   - Improve context detection for aggregates in casts
   - May need parent node analysis beyond current path checking

3. **Systematic testing approach**
   - Target specific failing test categories
   - Verify each improvement maintains baseline
   - Focus on high-impact function types first
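
One possible way to target failing categories is to extract only the `funcformat` disagreements from each expected/received AST pair, so failures can be grouped by cause. The helper below is a hypothetical sketch, not part of the package's test harness; it assumes both ASTs are plain JSON-like values.

```typescript
// One funcformat disagreement between the expected and the transformed AST.
interface FuncformatMismatch {
  path: string;
  expected: unknown;
  received: unknown;
}

function isObject(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null;
}

// Walks both trees in parallel and records every place where funcformat differs,
// including the case where only one side has the field at all.
function findFuncformatMismatches(
  expected: unknown,
  received: unknown,
  path: string = "$",
  out: FuncformatMismatch[] = []
): FuncformatMismatch[] {
  if (!isObject(expected) || !isObject(received)) return out;
  // Union of keys from both sides (arrays contribute their indices as keys).
  const keys = Array.from(new Set(Object.keys(expected).concat(Object.keys(received))));
  for (const key of keys) {
    const childPath = path + "." + key;
    if (key === "funcformat") {
      if (expected[key] !== received[key]) {
        out.push({ path: childPath, expected: expected[key], received: received[key] });
      }
    } else {
      findFuncformatMismatches(expected[key], received[key], childPath, out);
    }
  }
  return out;
}

// Example pair mirroring the strings-41.sql failure (node contents are illustrative).
const expectedAst = { FuncCall: { args: [{ A_Const: {} }], funcformat: "COERCE_SQL_SYNTAX" } };
const receivedAst = { FuncCall: { args: [{ A_Const: {} }], funcformat: "COERCE_EXPLICIT_CALL" } };
const mismatches = findFuncformatMismatches(expectedAst, receivedAst);
```

Grouping the resulting entries by their expected value (for example `COERCE_SQL_SYNTAX` vs. a missing field) would give a quick count of which failure category is largest.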

### Key Insights
- The transformer architecture is sound (124/258 baseline is stable)
- Context-specific exclusions work correctly
- The remaining challenge is function-type-specific behavior
- PG14 parser behavior varies significantly by function category
- Need more granular funcformat assignment logic

## Implementation Strategy
Focus on breaking the 124/258 plateau by implementing function-specific funcformat logic, starting with the most common failing patterns (TRIM, aggregates in TypeCast).

packages/transform/src/transformers/v13-to-v14.ts

Lines changed: 23 additions & 1 deletion
@@ -157,7 +157,7 @@ export class V13ToV14Transformer {

     // Only add funcformat in specific contexts where it's expected in PG14
     if (this.shouldAddFuncformat(context)) {
-      result.funcformat = "COERCE_EXPLICIT_CALL";
+      result.funcformat = this.getFuncformatValue(node, context);
     }

     return { FuncCall: result };
@@ -355,6 +355,28 @@ export class V13ToV14Transformer {
     return null;
   }

+  private getFuncformatValue(node: any, context: TransformerContext): string {
+    const funcname = this.getFunctionName(node);
+
+    if (!funcname) {
+      return 'COERCE_EXPLICIT_CALL';
+    }
+
+    const sqlSyntaxFunctions = [
+      'btrim', 'trim', 'ltrim', 'rtrim',
+      'substring', 'substr', 'position', 'overlay',
+      'extract', 'date_part', 'date_trunc',
+      'current_date', 'current_time', 'current_timestamp',
+      'localtime', 'localtimestamp'
+    ];
+
+    if (sqlSyntaxFunctions.includes(funcname.toLowerCase())) {
+      return 'COERCE_SQL_SYNTAX';
+    }
+
+    return 'COERCE_EXPLICIT_CALL';
+  }
+
   FunctionParameter(node: PG13.FunctionParameter, context: TransformerContext): any {
     const result: any = {};
