Fix PG16->PG17 escape sequence transformation #198
Closed
Fix PG16->PG17 \v escape sequence transformation
Summary
This PR fixes a parser-level difference between PostgreSQL 16 and 17 in how the `\v` (vertical tab) character escape sequence is handled. In PG16, `\v` is parsed as a literal 'v' character; in PG17 it is handled as `\u000b` (Unicode vertical tab).

The fix adds transformation logic to the `V16ToV17Transformer` that converts the literal 'v' character back to the Unicode escape sequence `\u000b` when transforming ASTs from PG16 to PG17 format.

Key changes:
- `A_Const` method in `V16ToV17Transformer` to handle escape sequence transformation
- Pattern `(\t) (v)( ')` to specifically target the 'v' that was originally `\v`
- Test file `misc/quotes_etc-26.sql`

Review & Testing Checklist for Human
- Check that the pattern `(\t) (v)( ')` correctly identifies only 'v' characters that were originally `\v` escape sequences, and doesn't transform legitimate 'v' characters
- Test `\v` sequences, `\v` in different string contexts, and mixed escape sequences to ensure the transformation works correctly
- Check `\v` escape sequences beyond just the test case provided

Recommended test plan:
- `SELECT E'Escapes: \\ \b \f \n \r \t \v \'' AS all_escapes`
- `SELECT E'\v'`
- `SELECT E'text\vmore\v'`
- `SELECT E'some text with v but no escape'`

Diagram
    %%{ init : { "theme" : "default" }}%%
    graph TD
        SQL["SQL: SELECT E'...\v...'"] --> PG16["PG16 Parser"]
        PG16 --> AST16["AST with 'v'"]
        AST16 --> Transform["V16ToV17Transformer"]
        Transform --> AST17["AST with '\u000b'"]
        AST17 --> PG17["PG17 Parser Output"]
        Transform --> AConst["A_Const method"]:::major-edit
        AConst --> Regex["Pattern: (\\t) (v)( ')"]:::major-edit
        TestFile["misc/quotes_etc-26.sql"]:::context
        SkipFile["transformer-errors.ts"]:::minor-edit
        subgraph Legend
            L1["Major Edit"]:::major-edit
            L2["Minor Edit"]:::minor-edit
            L3["Context/No Edit"]:::context
        end
        classDef major-edit fill:#90EE90
        classDef minor-edit fill:#87CEEB
        classDef context fill:#FFFFFF

Notes
- PG16 and PG17 handle `\v` escape sequences differently at the parsing level, not just at the AST transformation level
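
To make the transformation described in the Summary concrete, here is a minimal sketch of how a pattern like `(\t) (v)( ')` could restore the vertical tab in a string constant. It is not the PR's actual code: the `AConstLike` shape (a nested `sval.sval` string plus `location`) and the helper names `restoreVerticalTab` and `transformAConst` are assumptions made for illustration.

```typescript
// Hypothetical, simplified shape of an A_Const node carrying a string value.
interface AConstLike {
  sval?: { sval: string };
  location?: number;
}

// Matches a literal 'v' only when it sits between "<TAB><space>" and "<space>'",
// which is the context it has in the all-escapes test string.
const LOST_VERTICAL_TAB = /(\t) (v)( ')/g;

// Replace the matched 'v' with U+000B, keeping the surrounding characters.
function restoreVerticalTab(value: string): string {
  return value.replace(LOST_VERTICAL_TAB, "$1 \u000b$3");
}

// Return a copy of the node with its string payload rewritten; nodes without
// a string value (integers, booleans, NULL) pass through unchanged.
function transformAConst(node: AConstLike): AConstLike {
  if (node.sval === undefined) {
    return node;
  }
  return { ...node, sval: { ...node.sval, sval: restoreVerticalTab(node.sval.sval) } };
}

// Usage: the string as this PR describes PG16 producing it, with 'v' where \v was.
const pg16Node: AConstLike = { sval: { sval: "Escapes: \\ \b \f \n \r \t v '" }, location: 7 };
const pg17Node = transformAConst(pg16Node);
console.log(JSON.stringify(pg17Node.sval?.sval));
```

Because the pattern is anchored on the neighbouring tab and quote, this sketch leaves `'some text with v but no escape'` untouched. It also leaves a lone `'v'` untouched, so a `\v` that reached the AST from `E'\v'` would not be restored, which is the kind of edge case the review checklist asks about.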