| 1 | +# Parser Predicate Recognition Bug Fix Guide |
| 2 | + |
| 3 | +This guide documents the specific pattern for fixing bugs where identifier-based predicates (like `REGEXP_LIKE`) are not properly recognized when wrapped in parentheses in boolean expressions. |
| 4 | + |
| 5 | +## Problem Description |
| 6 | + |
| 7 | +**Symptom**: Parentheses around identifier-based boolean predicates cause syntax errors. |
| 8 | +- Example: `SELECT 1 WHERE (REGEXP_LIKE('a', 'pattern'))` fails to parse |
| 9 | +- Works: `SELECT 1 WHERE REGEXP_LIKE('a', 'pattern')` (without parentheses) |
| 10 | + |
| 11 | +**Root Cause**: The `IsNextRuleBooleanParenthesis()` function in `TSql80ParserBaseInternal.cs` only recognizes: |
| 12 | +- Keyword-based predicates (tokens): `LIKE`, `BETWEEN`, `CONTAINS`, `EXISTS`, etc. |
| 13 | +- One identifier-based predicate: `IIF` |
| 14 | +- Newer identifier-based predicates such as `REGEXP_LIKE` are not on this list, so they go unrecognized |
| 15 | + |
| 16 | +## Understanding the Fix |
| 17 | + |
| 18 | +### The `IsNextRuleBooleanParenthesis()` Function |
| 19 | + |
| 20 | +This function determines whether a parenthesized group contains a boolean expression or a scalar expression. It scans forward from a `LeftParenthesis` token, looking for boolean operators or predicates. |
| 21 | + |
| 22 | +**Location**: `SqlScriptDom/Parser/TSql/TSql80ParserBaseInternal.cs` |
| 23 | + |
| 24 | +**Key Logic**: |
| 25 | +```csharp |
| 26 | +case TSql80ParserInternal.Identifier: |
| 27 | +    // if identifier is IIF |
| 28 | +    if(NextTokenMatches(CodeGenerationSupporter.IIf)) |
| 29 | +    { |
| 30 | +        ++insideIIf; |
| 31 | +    } |
| 32 | +    // ADD NEW IDENTIFIER-BASED PREDICATES HERE |
| 33 | +    break; |
| 34 | +``` |
| 35 | + |
| 36 | +### The Solution Pattern |
| 37 | + |
| 38 | +For identifier-based boolean predicates, add detection logic in the `Identifier` case: |
| 39 | + |
| 40 | +```csharp |
| 41 | +case TSql80ParserInternal.Identifier: |
| 42 | +    // if identifier is IIF |
| 43 | +    if(NextTokenMatches(CodeGenerationSupporter.IIf)) |
| 44 | +    { |
| 45 | +        ++insideIIf; |
| 46 | +    } |
| 47 | +    // if identifier is REGEXP_LIKE |
| 48 | +    else if(NextTokenMatches(CodeGenerationSupporter.RegexpLike)) |
| 49 | +    { |
| 50 | +        if (caseDepth == 0 && topmostSelect == 0 && insideIIf == 0) |
| 51 | +        { |
| 52 | +            matches = true; |
| 53 | +            loop = false; |
| 54 | +        } |
| 55 | +    } |
| 56 | +    break; |
| 57 | +``` |
| 58 | + |
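To see why the extra branch changes the outcome, the scan can be modeled in isolation. The following is a toy sketch, not the ScriptDom implementation: the `Kind` enum, the `LooksBoolean` method, and the stack used to track `IIF` nesting are all hypothetical stand-ins, and only the `CASE` and `IIF` guards are modeled. Passing an empty predicate set mimics the behavior before the fix; adding `REGEXP_LIKE` mimics the behavior after it.

```csharp
using System;
using System.Collections.Generic;

// Toy model of the lookahead scan; this is NOT the ScriptDom implementation.
public static class ParenLookahead
{
    // Hypothetical token kinds standing in for the parser's token-type constants.
    public enum Kind { LeftParen, RightParen, Identifier, Like, Case, End, Other }

    // Returns true when the group opened by the first token looks boolean.
    // identifierPredicates should be built with a case-insensitive comparer.
    public static bool LooksBoolean(
        IEnumerable<(Kind Kind, string Text)> tokens,
        ISet<string> identifierPredicates)
    {
        int depth = 0, caseDepth = 0;
        var iifDepths = new Stack<int>(); // paren depth at which each IIF began

        foreach (var (kind, text) in tokens)
        {
            // Mirrors the idea of the guard in the fix: not inside CASE or IIF.
            bool topLevel = caseDepth == 0 && iifDepths.Count == 0;
            switch (kind)
            {
                case Kind.LeftParen:
                    depth++;
                    break;
                case Kind.RightParen:
                    depth--;
                    // Leaving the argument list of the innermost IIF.
                    if (iifDepths.Count > 0 && iifDepths.Peek() == depth) iifDepths.Pop();
                    if (depth == 0) return false; // group closed, nothing boolean seen
                    break;
                case Kind.Case:
                    caseDepth++;
                    break;
                case Kind.End:
                    if (caseDepth > 0) caseDepth--;
                    break;
                case Kind.Like: // keyword predicate: has its own token kind
                    if (topLevel) return true;
                    break;
                case Kind.Identifier:
                    if (string.Equals(text, "IIF", StringComparison.OrdinalIgnoreCase))
                        iifDepths.Push(depth);
                    else if (topLevel && identifierPredicates.Contains(text))
                        return true; // identifier predicate: matched by text
                    break;
            }
        }
        return false;
    }

    public static void Main()
    {
        // Tokens for: ( REGEXP_LIKE ( 'a' , 'pattern' ) )
        var tokens = new[]
        {
            (Kind.LeftParen, "("), (Kind.Identifier, "REGEXP_LIKE"), (Kind.LeftParen, "("),
            (Kind.Other, "'a'"), (Kind.Other, ","), (Kind.Other, "'pattern'"),
            (Kind.RightParen, ")"), (Kind.RightParen, ")"),
        };
        var before = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        var after = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "REGEXP_LIKE" };

        Console.WriteLine(LooksBoolean(tokens, before)); // False
        Console.WriteLine(LooksBoolean(tokens, after));  // True
    }
}
```

In this model, a `REGEXP_LIKE` call nested inside `IIF(...)` still leaves the outer group classified as scalar, which is the purpose of the context guard.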
| 59 | +## Step-by-Step Fix Process |
| 60 | + |
| 61 | +### 1. Reproduce the Issue |
| 62 | +Create a test case to confirm the bug: |
| 63 | +```sql |
| 64 | +SELECT 1 WHERE (REGEXP_LIKE('a', 'pattern')); -- Should fail without fix |
| 65 | +``` |
| 66 | + |
| 67 | +### 2. Identify the Predicate Constant |
| 68 | +Find the predicate identifier in `CodeGenerationSupporter`: |
| 69 | +```csharp |
| 70 | +// In CodeGenerationSupporter.cs |
| 71 | +public const string RegexpLike = "REGEXP_LIKE"; |
| 72 | +``` |
| 73 | + |
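One way to picture how such a constant is consumed: identifier predicates are matched by their text, not by a dedicated token type. The sketch below is a stand-in, not ScriptDom code. The `PredicateNames` class and the `Matches` helper are hypothetical, and the case-insensitive comparison is an assumption based on T-SQL built-in function names being case-insensitive.

```csharp
using System;

// Hypothetical mirror of a name constant; not the real CodeGenerationSupporter class.
public static class PredicateNames
{
    public const string RegexpLike = "REGEXP_LIKE";

    // Stand-in for a NextTokenMatches-style check: compares the token text to
    // the constant, ignoring case (an assumption, not confirmed by the source).
    public static bool Matches(string tokenText, string name) =>
        string.Equals(tokenText, name, StringComparison.OrdinalIgnoreCase);
}
```

Under that assumption, `PredicateNames.Matches("regexp_like", PredicateNames.RegexpLike)` is true, while a different identifier such as `REGEXP_REPLACE` does not match.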
| 74 | +### 3. Apply the Fix |
| 75 | +Modify `TSql80ParserBaseInternal.cs` in the `IsNextRuleBooleanParenthesis()` method: |
| 76 | + |
| 77 | +**File**: `SqlScriptDom/Parser/TSql/TSql80ParserBaseInternal.cs` |
| 78 | +**Method**: `IsNextRuleBooleanParenthesis()` |
| 79 | +**Location**: Around line 808, in the `case TSql80ParserInternal.Identifier:` block |
| 80 | + |
| 81 | +Add the predicate detection logic following the pattern shown above. |
| 82 | + |
| 83 | +### 4. Update Test Cases |
| 84 | +Add test cases covering the parentheses scenario: |
| 85 | + |
| 86 | +**Test Script**: `Test/SqlDom/TestScripts/RegexpLikeTests170.sql` |
| 87 | +```sql |
| 88 | +SELECT 1 WHERE (REGEXP_LIKE('a', '%pattern%')); |
| 89 | +``` |
| 90 | + |
| 91 | +**Baseline**: `Test/SqlDom/Baselines170/RegexpLikeTests170.sql` |
| 92 | +```sql |
| 93 | +SELECT 1 |
| 94 | +WHERE (REGEXP_LIKE ('a', '%pattern%')); |
| 95 | +``` |
| 96 | + |
| 97 | +**Test Configuration**: Update error counts in `Only170SyntaxTests.cs` if the new test cases affect older parser versions. |
| 98 | + |
| 99 | +### 5. Build and Verify |
| 100 | +```bash |
| 101 | +# Build the parser |
| 102 | +dotnet build SqlScriptDom/Microsoft.SqlServer.TransactSql.ScriptDom.csproj -c Debug |
| 103 | + |
| 104 | +# Run the specific test |
| 105 | +dotnet test Test/SqlDom/UTSqlScriptDom.csproj --filter "FullyQualifiedName~SqlStudio.Tests.UTSqlScriptDom.SqlDomTests.TSql170SyntaxIn170ParserTest" -c Debug |
| 106 | +``` |
| 107 | + |
| 108 | +## When to Apply This Pattern |
| 109 | + |
| 110 | +This fix pattern applies when: |
| 111 | + |
| 112 | +1. **Identifier-based predicates**: The predicate is defined as an identifier (not a keyword token) |
| 113 | +2. **Boolean context**: The predicate returns a boolean value for use in WHERE clauses, CHECK constraints, etc. |
| 114 | +3. **Parentheses fail**: The predicate works without parentheses but fails with parentheses |
| 115 | +4. **Already implemented**: The predicate grammar and AST are already correctly implemented |
| 116 | + |
| 117 | +## Common Predicates That May Need This Fix |
| 118 | + |
| 119 | +- `REGEXP_LIKE` (✅ Fixed) |
| 120 | +- Future identifier-based boolean functions |
| 121 | +- Custom function predicates that return boolean values |
| 122 | + |
| 123 | +## Related Files Modified |
| 124 | + |
| 125 | +This type of fix typically involves: |
| 126 | + |
| 127 | +1. **Core Parser Logic**: |
| 128 | + - `SqlScriptDom/Parser/TSql/TSql80ParserBaseInternal.cs` - Main fix |
| 129 | + |
| 130 | +2. **Test Infrastructure**: |
| 131 | + - `Test/SqlDom/TestScripts/[TestName].sql` - Input test cases |
| 132 | + - `Test/SqlDom/Baselines[Version]/[TestName].sql` - Expected output |
| 133 | + - `Test/SqlDom/Only[Version]SyntaxTests.cs` - Test configuration |
| 134 | + |
| 135 | +3. **Potentially Affected**: |
| 136 | + - `Test/SqlDom/TestScripts/BooleanExpressionTests.sql` - May need additional test cases |
| 137 | + - `Test/SqlDom/BaselinesCommon/BooleanExpressionTests.sql` - Corresponding baselines |
| 138 | + |
| 139 | +## Verification Checklist |
| 140 | + |
| 141 | +- [ ] Parentheses syntax parses without errors |
| 142 | +- [ ] Non-parentheses syntax still works |
| 143 | +- [ ] Test suite passes for target SQL version |
| 144 | +- [ ] Older SQL versions have appropriate error counts |
| 145 | +- [ ] Related boolean expression tests still pass |
| 146 | + |
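The first two checklist items can be expressed as a small, parser-agnostic check. This is a sketch under stated assumptions: `ParenRegression`, `Outcome`, and `Check` are hypothetical names, and `countErrors` is a caller-supplied delegate that returns the number of parse errors for a SQL string. With ScriptDom, that delegate would presumably wrap a parser's `Parse` call and return the size of its error list; that wiring is left out here.

```csharp
using System;

public static class ParenRegression
{
    public enum Outcome { Fixed, BugPresent, PredicateUnsupported }

    // countErrors: hypothetical stand-in for the real parser; it returns the
    // number of parse errors reported for the given SQL text.
    public static Outcome Check(Func<string, int> countErrors, string predicateCall)
    {
        string bare = $"SELECT 1 WHERE {predicateCall};";
        string wrapped = $"SELECT 1 WHERE ({predicateCall});";

        // If the bare form fails, the predicate itself is not supported,
        // so this fix pattern does not apply.
        if (countErrors(bare) != 0) return Outcome.PredicateUnsupported;

        // The bare form parses; the wrapped form shows whether the bug is present.
        return countErrors(wrapped) == 0 ? Outcome.Fixed : Outcome.BugPresent;
    }
}
```

A call such as `ParenRegression.Check(countErrors, "REGEXP_LIKE('a', 'pattern')")` returns `BugPresent` when only the parenthesized form reports errors, matching the symptom described at the top of this guide.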
| 147 | +## Notes and Gotchas |
| 148 | + |
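| 149 | +- **IIF Special Handling**: `IIF` only increments a counter (`++insideIIf`) and never sets `matches`, because an `IIF` call is a scalar expression even though its first argument is a boolean condition |
| 150 | +- **Context Conditions**: The guard `caseDepth == 0 && topmostSelect == 0 && insideIIf == 0` means the predicate counts only when the scan is not inside a `CASE` expression, a nested `SELECT`, or an `IIF` call (reading the counter names at face value) |
| 151 | +- **Token vs Identifier**: Keyword predicates have their own token types, while identifier predicates all arrive as `Identifier` tokens and must be detected by comparing the token text |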
| 152 | +- **Cross-Version Impact**: Adding test cases may increase error counts for older SQL Server parsers |
| 153 | + |
| 154 | +With this pattern applied, identifier-based boolean predicates parse the same way with or without surrounding parentheses. |