@devin-ai-integration

Implement automatic E-prefix detection for escaped string literals in PostgreSQL

Problem

The deparser did not automatically add the E prefix to string literals containing backslash escape sequences. With standard_conforming_strings=on (the default since PostgreSQL 9.1), backslash escapes are only interpreted inside E'...' literals, so the prefix is required for them to take effect.

Solution

Implemented automatic E-prefix detection logic that:

  1. Detects escape sequences - Identifies strings containing backslash escape patterns that require E-prefix
  2. Adds E-prefix automatically - Enhances string literal generation to include E'...' syntax when needed
  3. Proper escaping - Correctly escapes backslashes and quotes within E-prefixed strings
  4. Avoids conflicts - Prevents E-prefix for hexadecimal bytea literals to avoid syntax conflicts

Technical Implementation

New needsEscapePrefix() method

needsEscapePrefix(value: string): boolean {
  // Don't add E-prefix to hexadecimal bytea literals (e.g., \xDeAdBeEf)
  if (/^\\x[0-9a-fA-F]+$/.test(value)) {
    return false;
  }
  
  // Don't add E-prefix to strings that look like bytea literals with mixed content
  if (/\\x[0-9a-fA-F]/.test(value) && !/\\[nrtbf\\']/.test(value)) {
    return false;
  }
  
  // Check for common backslash escape sequences that require E-prefix
  return /\\[nrtbf\\']/.test(value) ||             // Basic escapes: \n, \t, \r, \b, \f, \\, \'
         /\\u[0-9a-fA-F]{4}/.test(value) ||        // Unicode escapes: \u0041
         /\\U[0-9a-fA-F]{8}/.test(value) ||        // Extended unicode escapes: \U00000041
         /\\[0-7]{1,3}/.test(value);               // Octal escapes: \123
}
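
To see how the checks interact, the sketch below lifts the method above into a free function and runs it on a few sample values. The free-function form and the sample inputs are assumptions made for illustration; they are not part of the PR.

```typescript
// Standalone copy of the detection logic shown above, written as a free
// function so it can run outside the deparser class (illustration only).
function needsEscapePrefix(value: string): boolean {
  // Pure hexadecimal bytea literal such as \xDeAdBeEf: no E-prefix.
  if (/^\\x[0-9a-fA-F]+$/.test(value)) {
    return false;
  }
  // Contains a \x hex sequence but none of the basic escapes: no E-prefix.
  if (/\\x[0-9a-fA-F]/.test(value) && !/\\[nrtbf\\']/.test(value)) {
    return false;
  }
  return /\\[nrtbf\\']/.test(value) ||        // basic escapes
         /\\u[0-9a-fA-F]{4}/.test(value) ||   // \uXXXX
         /\\U[0-9a-fA-F]{8}/.test(value) ||   // \UXXXXXXXX
         /\\[0-7]{1,3}/.test(value);          // octal
}

// In the TypeScript literals below, "\\" denotes ONE backslash character.
const samples: Array<[string, boolean]> = [
  ["hello world", false],  // no backslash at all
  ["a\\bcd", true],        // backslash followed by b
  ["\\xDeAdBeEf", false],  // pure bytea hex literal
  ["\\u0041", true],       // unicode escape
  ["\\123", true],         // octal escape
  ["C:\\path", false],     // backslash followed by p: not a listed escape
];

// Collect every sample whose result differs from the expectation.
const mismatches = samples.filter(
  ([value, expected]) => needsEscapePrefix(value) !== expected
);
```

Note the last sample: a backslash followed by a character outside the listed patterns does not trigger the prefix under this logic.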

Enhanced string processing

  • A_Const method - Updated to detect escape sequences and apply E-prefix with proper escaping
  • String method - Enhanced to handle E-prefix detection in string literal contexts
  • Proper escaping - In E-prefixed strings, each backslash is doubled (\ becomes \\) and each single quote is doubled (' becomes '')

Examples

Input AST string values:

  • "a\bcd" → E'a\\bcd'
  • "a\b'cd" → E'a\\b''cd'
  • "ab\'cd" → E'ab\\''cd'
  • "\\" → E'\\\\'

Regular strings without escapes:

  • "hello world" → 'hello world' (no E-prefix needed)
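
The input/output pairs in this section can be reproduced with a small sketch. The helper names `hasBasicEscape`, `escapeEBody` and `formatLiteral` are hypothetical. The detector is reduced to the basic-escape pattern from `needsEscapePrefix()`, and the plain-string branch (doubling single quotes only) is an assumption about how unprefixed literals are emitted.

```typescript
// Hypothetical helpers: a simplified sketch, not the PR's actual code.

// Reduced detector: only the basic-escape pattern from needsEscapePrefix().
function hasBasicEscape(value: string): boolean {
  return /\\[nrtbf\\']/.test(value);
}

// Double every backslash, then double every single quote.
function escapeEBody(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/'/g, "''");
}

function formatLiteral(value: string): string {
  if (hasBasicEscape(value)) {
    return "E'" + escapeEBody(value) + "'";
  }
  // Assumption: plain literals only need single quotes doubled.
  return "'" + value.replace(/'/g, "''") + "'";
}

// String.raw keeps backslashes exactly as written, which matches the
// notation used for the example values in this section.
const cases: Array<[string, string]> = [
  [String.raw`a\bcd`, String.raw`E'a\\bcd'`],
  [String.raw`a\b'cd`, String.raw`E'a\\b''cd'`],
  [String.raw`ab\'cd`, String.raw`E'ab\\''cd'`],
  [String.raw`\\`, String.raw`E'\\\\'`],
  ["hello world", "'hello world'"],
];

// Collect every case whose formatted output differs from the listed output.
const failures = cases.filter(
  ([input, expected]) => formatLiteral(input) !== expected
);
```

Doubling backslashes before quotes keeps the two replacements independent, since neither step produces characters that the other one rewrites.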

Testing

  • ✅ All 251 test suites pass (264 total tests)
  • ✅ Verified E-prefix detection works correctly for various escape patterns
  • ✅ Confirmed no regressions in existing functionality
  • ✅ Tested reparsing of generated SQL to ensure validity
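
The PR does not show how the reparsing check is implemented. As a lightweight stand-in for a real parser, the sketch below decodes the body of an E'...' literal and checks that encoding followed by decoding returns the original value. `decodeEString` and `encodeEString` are hypothetical names. The decoder covers only single-character escapes and doubled quotes, and deliberately rejects octal, \x, \u and \U escapes.

```typescript
// Minimal decoder for an E'...' literal: a stand-in for a real parser.
// Octal, \x, \u and \U escapes are rejected in this sketch.
const SIMPLE_ESCAPES: Record<string, string> = {
  b: "\b", f: "\f", n: "\n", r: "\r", t: "\t",
};

function decodeEString(literal: string): string {
  if (!/^E'[\s\S]*'$/.test(literal)) {
    throw new Error("not an E-prefixed literal");
  }
  const body = literal.slice(2, -1);
  let out = "";
  for (let i = 0; i < body.length; i++) {
    const ch = body.charAt(i);
    const next = body.charAt(i + 1); // "" when ch is the last character
    if (ch === "\\") {
      if (next === "" || /[0-7xuU]/.test(next)) {
        throw new Error("unsupported escape in this sketch");
      }
      // Any other character after a backslash is taken literally.
      out += SIMPLE_ESCAPES[next] ?? next;
      i++;
    } else if (ch === "'") {
      if (next !== "'") throw new Error("unescaped quote in body");
      out += "'";
      i++;
    } else {
      out += ch;
    }
  }
  return out;
}

// Hypothetical encoder following the escaping rule described in this PR.
function encodeEString(value: string): string {
  return "E'" + value.replace(/\\/g, "\\\\").replace(/'/g, "''") + "'";
}

const roundTripInputs = [
  String.raw`a\bcd`, String.raw`ab\'cd`, String.raw`\\`, "it's",
];
const roundTripOk = roundTripInputs.every(
  (v) => decodeEString(encodeEString(v)) === v
);
```

A round trip of this kind only exercises the escaping rule; it does not replace reparsing the generated SQL with an actual PostgreSQL parser.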

PostgreSQL Compatibility

This implementation follows PostgreSQL's string literal syntax requirements:

  • E-prefix is added for strings with backslash escape sequences
  • Proper escaping prevents syntax errors
  • Compatible with standard_conforming_strings=on (default since PostgreSQL 9.1)
  • Avoids conflicts with bytea literal syntax

Link to Devin run: https://app.devin.ai/sessions/53c6e39590e9403ab99a25afe0dd5e99

Requested by: Dan Lynch ([email protected])

- Add needsEscapePrefix() method to detect backslash escape sequences
- Enhance A_Const and String methods to automatically add E-prefix when needed
- Support common escape patterns: \n, \t, \r, \b, \f, \\, \', \x, \u, \U, octal
- Properly escape backslashes and quotes in E-prefixed strings
- Avoid E-prefix for hexadecimal bytea literals to prevent conflicts
- All 251 test suites pass with 264 total tests

Co-Authored-By: Dan Lynch <[email protected]>

…handling

- Move needsEscapePrefix(), escapeEString(), and formatEString() methods to QuoteUtils
- Update A_Const and String methods to use QuoteUtils.formatEString()
- Update CommentStmt comment handling to use QuoteUtils.formatEString()
- Use non-capturing regex groups for cleaner E-prefix detection
- Fix backslash escaping to use proper double backslash replacement
- Centralize all E-prefix detection and formatting logic in utilities
- All 251 test suites pass with 264 total tests

Co-Authored-By: Dan Lynch <[email protected]>
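
The resulting utility code is not shown in this conversation. As a rough sketch of what a centralized helper with a single non-capturing pattern could look like: only the three method names come from the commit message, while the class body, the combined regex, and the plain-literal fallback are assumptions.

```typescript
// Sketch only: method names follow the commit message, bodies are assumed.
class QuoteUtils {
  // One alternation inside a non-capturing group (?:...) replaces the four
  // separate regex tests of the original needsEscapePrefix().
  private static readonly ESCAPE_PATTERN =
    /\\(?:[nrtbf\\']|u[0-9a-fA-F]{4}|U[0-9a-fA-F]{8}|[0-7]{1,3})/;

  static needsEscapePrefix(value: string): boolean {
    // Keep the bytea exceptions from the original method.
    if (/^\\x[0-9a-fA-F]+$/.test(value)) return false;
    if (/\\x[0-9a-fA-F]/.test(value) && !/\\[nrtbf\\']/.test(value)) return false;
    return QuoteUtils.ESCAPE_PATTERN.test(value);
  }

  // Double backslashes first, then single quotes.
  static escapeEString(value: string): string {
    return value.replace(/\\/g, "\\\\").replace(/'/g, "''");
  }

  static formatEString(value: string): string {
    return QuoteUtils.needsEscapePrefix(value)
      ? "E'" + QuoteUtils.escapeEString(value) + "'"
      : "'" + value.replace(/'/g, "''") + "'"; // assumed fallback
  }
}

// Example: a comment string containing a literal backslash followed by n.
const commentLiteral = QuoteUtils.formatEString("line one\\nline two");
```

The pattern carries no `g` flag, so repeated `test()` calls on the shared static regex do not depend on `lastIndex` state.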
@pyramation closed this on Jun 22, 2025