Commit cd170a8

Fixing a bug with regex_Like function as Boolean exp (#166)
* Fixing a bug with regex_Like function as boolean exp
* Adding more test
1 parent 35b7ba5 commit cd170a8

File tree

9 files changed

+205
-3
lines changed

.github/BUG_FIXING_GUIDE.md

Lines changed: 11 additions & 0 deletions
@@ -40,3 +40,14 @@ The process of fixing a bug, especially one that involves adding new syntax, fol
 * **d. Re-run the Tests**: Run the same test command again. This time, the tests should pass, confirming that the generated script matches the new baseline.

 By following these steps, you can ensure that new syntax is correctly parsed, represented in the AST, generated back into a script, and fully validated by the testing framework.
+
+## Special Case: Parser Predicate Recognition Issues
+
+If you encounter a bug where:
+- An identifier-based predicate (like `REGEXP_LIKE`) works without parentheses: `WHERE REGEXP_LIKE('a', 'pattern')` ✅
+- But fails with parentheses: `WHERE (REGEXP_LIKE('a', 'pattern'))` ❌
+- The error is a syntax error near the closing parenthesis or semicolon
+
+This is likely a **parser predicate recognition issue**. The grammar and AST are correct, but the `IsNextRuleBooleanParenthesis()` function doesn't recognize the identifier-based predicate.
+
+**Solution**: Follow the [Parser Predicate Recognition Fix Guide](PARSER_PREDICATE_RECOGNITION_FIX.md) instead of the standard grammar modification workflow.

.github/PARSER_PREDICATE_RECOGNITION_FIX.md

Lines changed: 154 additions & 0 deletions
@@ -0,0 +1,154 @@
# Parser Predicate Recognition Bug Fix Guide

This guide documents the specific pattern for fixing bugs where identifier-based predicates (like `REGEXP_LIKE`) are not properly recognized when wrapped in parentheses in boolean expressions.

## Problem Description

**Symptom**: Parentheses around identifier-based boolean predicates cause syntax errors.
- Example: `SELECT 1 WHERE (REGEXP_LIKE('a', 'pattern'))` fails to parse
- Works: `SELECT 1 WHERE REGEXP_LIKE('a', 'pattern')` (without parentheses)

**Root Cause**: The `IsNextRuleBooleanParenthesis()` function in `TSql80ParserBaseInternal.cs` only recognizes:
- Keyword-based predicates (tokens): `LIKE`, `BETWEEN`, `CONTAINS`, `EXISTS`, etc.
- One identifier-based predicate: `IIF`

It does not recognize newer identifier-based predicates such as `REGEXP_LIKE`, so the parenthesized form is treated as a scalar expression and parsing fails.

## Understanding the Fix

### The `IsNextRuleBooleanParenthesis()` Function

This function determines whether parentheses contain a boolean expression vs. a scalar expression. It scans forward from a `LeftParenthesis` token looking for boolean operators or predicates.

**Location**: `SqlScriptDom/Parser/TSql/TSql80ParserBaseInternal.cs`

**Key Logic**:
```csharp
case TSql80ParserInternal.Identifier:
    // if identifier is IIF
    if(NextTokenMatches(CodeGenerationSupporter.IIf))
    {
        ++insideIIf;
    }
    // ADD NEW IDENTIFIER-BASED PREDICATES HERE
    break;
```

### The Solution Pattern

For identifier-based boolean predicates, add detection logic in the `Identifier` case:

```csharp
case TSql80ParserInternal.Identifier:
    // if identifier is IIF
    if(NextTokenMatches(CodeGenerationSupporter.IIf))
    {
        ++insideIIf;
    }
    // if identifier is REGEXP_LIKE
    else if(NextTokenMatches(CodeGenerationSupporter.RegexpLike))
    {
        if (caseDepth == 0 && topmostSelect == 0 && insideIIf == 0)
        {
            matches = true;
            loop = false;
        }
    }
    break;
```
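
To see why the `insideIIf` guard matters, the lookahead can be modelled in a few lines. The following is a hypothetical, simplified sketch and not the ScriptDom implementation: `LookaheadSketch`, its string-based tokens, and the reduced marker set are assumptions made for illustration, and it omits the `caseDepth` and `topmostSelect` tracking that the real method performs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toy model of a "boolean vs. scalar parentheses" lookahead.
// It is NOT the ScriptDom code; names and token handling are illustrative only.
public static class LookaheadSketch
{
    // Assumed, reduced set of tokens that mark a boolean expression.
    // "REGEXP_LIKE" plays the role of the newly recognized identifier.
    private static readonly HashSet<string> BooleanMarkers =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "LIKE", "BETWEEN", "EXISTS", "CONTAINS", "=", "<", ">",
            "REGEXP_LIKE"
        };

    // tokens must start with "(". Returns true if a boolean marker is seen
    // outside any IIF call before the opening parenthesis is closed.
    public static bool IsBooleanParenthesis(IReadOnlyList<string> tokens)
    {
        int depth = 0;
        // Parenthesis depth at which each enclosing IIF call started.
        var iifDepths = new Stack<int>();

        foreach (string token in tokens)
        {
            if (token == "(")
            {
                ++depth;
            }
            else if (token == ")")
            {
                --depth;
                // Leaving the argument list of the innermost IIF.
                if (iifDepths.Count > 0 && iifDepths.Peek() == depth)
                {
                    iifDepths.Pop();
                }
                // The outermost parenthesis closed without a boolean marker.
                if (depth == 0)
                {
                    return false;
                }
            }
            else if (string.Equals(token, "IIF", StringComparison.OrdinalIgnoreCase))
            {
                iifDepths.Push(depth);
            }
            else if (iifDepths.Count == 0 && BooleanMarkers.Contains(token))
            {
                return true;
            }
        }

        return false;
    }
}
```

In this model, the tokens of `(REGEXP_LIKE('a', '^a'))` are classified as boolean parentheses, while the tokens of `(IIF(REGEXP_LIKE('abc', '^a'), 'Match', 'No Match'))` are classified as scalar, because the predicate sits inside the `IIF` argument list. This mirrors the intent of the `insideIIf == 0` condition in the fix.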

## Step-by-Step Fix Process

### 1. Reproduce the Issue
Create a test case to confirm the bug:
```sql
SELECT 1 WHERE (REGEXP_LIKE('a', 'pattern')); -- Should fail without fix
```
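
The statement can also be checked programmatically. This is a minimal sketch, assuming a reference to a `Microsoft.SqlServer.TransactSql.ScriptDom` build that includes `TSql170Parser`; `ParenthesizedPredicateRepro` and its methods are hypothetical helper names, not part of the repository.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.SqlServer.TransactSql.ScriptDom;

// Hypothetical helper for reproducing the bug outside the test suite.
public static class ParenthesizedPredicateRepro
{
    // Parses the given T-SQL with the 170 parser and returns the parse errors.
    public static IList<ParseError> ParseErrors(string sql)
    {
        var parser = new TSql170Parser(true); // true = QUOTED_IDENTIFIER on
        using (var reader = new StringReader(sql))
        {
            parser.Parse(reader, out IList<ParseError> errors);
            return errors;
        }
    }

    // Prints each error with its position so the failing token is visible.
    public static void Report(string sql)
    {
        IList<ParseError> errors = ParseErrors(sql);
        Console.WriteLine($"{sql} -> {errors.Count} error(s)");
        foreach (ParseError error in errors)
        {
            Console.WriteLine($"  line {error.Line}, column {error.Column}: {error.Message}");
        }
    }
}
```

Calling `Report` once with the parenthesized statement and once with the unparenthesized one makes the difference between the two forms easy to compare before and after the fix.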

### 2. Identify the Predicate Constant
Find the predicate identifier in `CodeGenerationSupporter`:
```csharp
// In CodeGenerationSupporter.cs
public const string RegexpLike = "REGEXP_LIKE";
```

### 3. Apply the Fix
Modify `TSql80ParserBaseInternal.cs` in the `IsNextRuleBooleanParenthesis()` method:

**File**: `SqlScriptDom/Parser/TSql/TSql80ParserBaseInternal.cs`
**Method**: `IsNextRuleBooleanParenthesis()`
**Location**: Around line 808, in the `case TSql80ParserInternal.Identifier:` block

Add the predicate detection logic following the pattern shown above.

### 4. Update Test Cases
Add test cases covering the parentheses scenario:

**Test Script**: `Test/SqlDom/TestScripts/RegexpLikeTests170.sql`
```sql
SELECT 1 WHERE (REGEXP_LIKE('a', '%pattern%'));
```

**Baseline**: `Test/SqlDom/Baselines170/RegexpLikeTests170.sql`
```sql
SELECT 1
WHERE (REGEXP_LIKE ('a', '%pattern%'));
```
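
The baseline is the script as regenerated from the parsed AST, which is why its layout differs from the input. A rough way to preview that output is a parse-then-generate round trip. The sketch below assumes the public `TSql170Parser` and `Sql170ScriptGenerator` types with default generator options; `BaselinePreview` is a hypothetical helper, and the test harness may use different options, so the baseline workflow in the Bug Fixing Guide remains the authoritative source.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.SqlServer.TransactSql.ScriptDom;

// Hypothetical helper: parse a script and regenerate it from the AST.
public static class BaselinePreview
{
    // Returns the regenerated script, or null if the input does not parse.
    public static string RoundTrip(string sql, out IList<ParseError> errors)
    {
        var parser = new TSql170Parser(true);
        TSqlFragment fragment;
        using (var reader = new StringReader(sql))
        {
            fragment = parser.Parse(reader, out errors);
        }

        if (errors.Count > 0)
        {
            return null; // nothing reliable to regenerate
        }

        // Default options; the repository's tests may configure these differently.
        var generator = new Sql170ScriptGenerator();
        generator.GenerateScript(fragment, out string script);
        return script;
    }
}
```

If the round trip of the test script returns `null`, the parser fix is not yet effective for that statement; otherwise the returned text gives a preview to compare against the generated baseline.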

**Test Configuration**: Update error counts in `Only170SyntaxTests.cs` if the new test cases affect older parser versions.

### 5. Build and Verify
```bash
# Build the parser
dotnet build SqlScriptDom/Microsoft.SqlServer.TransactSql.ScriptDom.csproj -c Debug

# Run the specific test
dotnet test Test/SqlDom/UTSqlScriptDom.csproj --filter "FullyQualifiedName~SqlStudio.Tests.UTSqlScriptDom.SqlDomTests.TSql170SyntaxIn170ParserTest" -c Debug
```

## When to Apply This Pattern

This fix pattern applies when:

1. **Identifier-based predicates**: The predicate is defined as an identifier (not a keyword token)
2. **Boolean context**: The predicate returns a boolean value for use in WHERE clauses, CHECK constraints, etc.
3. **Parentheses fail**: The predicate works without parentheses but fails with parentheses
4. **Already implemented**: The predicate grammar and AST are already correctly implemented

## Common Predicates That May Need This Fix

- `REGEXP_LIKE` (✅ Fixed)
- Future identifier-based boolean functions
- Custom function predicates that return boolean values

## Related Files Modified

This type of fix typically involves:

1. **Core Parser Logic**:
   - `SqlScriptDom/Parser/TSql/TSql80ParserBaseInternal.cs` - Main fix

2. **Test Infrastructure**:
   - `Test/SqlDom/TestScripts/[TestName].sql` - Input test cases
   - `Test/SqlDom/Baselines[Version]/[TestName].sql` - Expected output
   - `Test/SqlDom/Only[Version]SyntaxTests.cs` - Test configuration

3. **Potentially Affected**:
   - `Test/SqlDom/TestScripts/BooleanExpressionTests.sql` - May need additional test cases
   - `Test/SqlDom/BaselinesCommon/BooleanExpressionTests.sql` - Corresponding baselines

## Verification Checklist

- [ ] Parentheses syntax parses without errors
- [ ] Non-parentheses syntax still works
- [ ] Test suite passes for target SQL version
- [ ] Older SQL versions have appropriate error counts
- [ ] Related boolean expression tests still pass

## Notes and Gotchas

- **IIF Special Handling**: `IIF` has special logic (`++insideIIf`) because it is a scalar function whose first argument is a boolean expression, not a boolean predicate itself
- **Context Conditions**: The fix is guarded by `caseDepth == 0 && topmostSelect == 0 && insideIIf == 0`, so the predicate marks the parentheses as boolean only when the scan is not nested inside a `CASE` expression, a subquery, or an `IIF` call
- **Token vs Identifier**: Keyword predicates are handled as tokens; identifier predicates need explicit detection in the `Identifier` case
- **Cross-Version Impact**: Adding test cases may increase error counts for older SQL Server parsers

With this pattern in place, identifier-based boolean predicates parse the same way with or without enclosing parentheses.

.github/copilot-instructions.md

Lines changed: 2 additions & 0 deletions
@@ -45,6 +45,8 @@ ScriptDom is a library for parsing and generating T-SQL scripts. It is primarily
 ## Bug Fixing and Baseline Generation
 For a practical guide on fixing bugs, including the detailed workflow for generating test baselines, see the [Bug Fixing Guide](BUG_FIXING_GUIDE.md).

+For specific parser predicate recognition issues (when identifier-based predicates like `REGEXP_LIKE` don't work with parentheses), see the [Parser Predicate Recognition Fix Guide](PARSER_PREDICATE_RECOGNITION_FIX.md).
+
 ## Editing generated outputs, debugging generation
 - Never edit generated files permanently (they live under `obj/...`/CsGenIntermediateOutputPath). Instead change:
 - `.g` grammar files

SqlScriptDom/Parser/TSql/TSql80ParserBaseInternal.cs

Lines changed: 9 additions & 0 deletions
@@ -808,6 +808,15 @@ protected bool IsNextRuleBooleanParenthesis()
     {
         ++insideIIf;
     }
+    // if identifier is REGEXP_LIKE
+    else if(NextTokenMatches(CodeGenerationSupporter.RegexpLike))
+    {
+        if (caseDepth == 0 && topmostSelect == 0 && insideIIf == 0)
+        {
+            matches = true;
+            loop = false;
+        }
+    }
     break;
 case TSql80ParserInternal.LeftParenthesis:
     ++openParens;

Test/SqlDom/Baselines170/RegexpLikeTests170.sql

Lines changed: 9 additions & 1 deletion
@@ -8,6 +8,8 @@ SELECT IIF (REGEXP_LIKE ('abc', '^a'), 1, 0) AS is_match;

 SELECT IIF (NOT REGEXP_LIKE ('abc', '^a'), 1, 0) AS is_match;

+SELECT (IIF (REGEXP_LIKE ('abc', '^a'), 'Match', 'No Match')) AS result;
+
 SELECT CASE WHEN REGEXP_LIKE ('abc', '^a') THEN 1 ELSE 0 END AS is_match;

 SELECT CASE WHEN NOT REGEXP_LIKE ('abc', '^a') THEN 1 ELSE 0 END AS is_match;
@@ -30,4 +32,10 @@ SELECT CASE WHEN NOT REGEXP_LIKE ('abc', '^a', NULL) THEN 1 ELSE 0 END AS is_mat

 SELECT CASE WHEN REGEXP_LIKE (NULL, '^a', 'c') THEN 1 ELSE 0 END AS is_match;

-SELECT IIF (NOT REGEXP_LIKE ('abc', NULL), 1, 0) AS is_match;
+SELECT IIF (NOT REGEXP_LIKE ('abc', NULL), 1, 0) AS is_match;
+
+SELECT 1
+WHERE REGEXP_LIKE ('a', '^a');
+
+SELECT 1
+WHERE (REGEXP_LIKE ('a', '%pattern%'));

Test/SqlDom/BaselinesCommon/BooleanExpressionTests.sql

Lines changed: 6 additions & 0 deletions
@@ -138,6 +138,12 @@ FROM Production.Document
 WHERE CONTAINS ((t1.c1), @a);


+
+GO
+SELECT Title
+FROM Production.Document
+WHERE (CONTAINS ((t1.c1), @a));
+
 GO
 SELECT Title
 FROM Production.Document

Test/SqlDom/Only170SyntaxTests.cs

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@ public partial class SqlDomTests
 new ParserTest170("VectorFunctionTests170.sql", nErrors80: 1, nErrors90: 1, nErrors100: 2, nErrors110: 2, nErrors120: 2, nErrors130: 2, nErrors140: 2, nErrors150: 2, nErrors160: 2),
 new ParserTest170("SecurityStatementExternalModelTests170.sql", nErrors80: 2, nErrors90: 17, nErrors100: 17, nErrors110: 17, nErrors120: 17, nErrors130: 17, nErrors140: 17, nErrors150: 17, nErrors160: 17),
 new ParserTest170("CreateEventSessionNotLikePredicate.sql", nErrors80: 2, nErrors90: 1, nErrors100: 1, nErrors110: 1, nErrors120: 1, nErrors130: 0, nErrors140: 0, nErrors150: 0, nErrors160: 0),
-new ParserTest170("RegexpLikeTests170.sql", nErrors80: 13, nErrors90: 13, nErrors100: 13, nErrors110: 15, nErrors120: 15, nErrors130: 15, nErrors140: 15, nErrors150: 15, nErrors160: 15)
+new ParserTest170("RegexpLikeTests170.sql", nErrors80: 15, nErrors90: 15, nErrors100: 15, nErrors110: 18, nErrors120: 18, nErrors130: 18, nErrors140: 18, nErrors150: 18, nErrors160: 18)
 };

 private static readonly ParserTest[] SqlAzure170_TestInfos =

Test/SqlDom/TestScripts/BooleanExpressionTests.sql

Lines changed: 5 additions & 0 deletions
@@ -128,6 +128,11 @@ FROM Production.Document
 WHERE Contains ( t1.c1, @a);
 GO

+SELECT Title
+FROM Production.Document
+WHERE (Contains ( t1.c1, @a));
+GO
+
 SELECT Title
 FROM Production.Document
 WHERE Freetext ( t2.*, N'abc');

Test/SqlDom/TestScripts/RegexpLikeTests170.sql

Lines changed: 8 additions & 1 deletion
@@ -8,6 +8,9 @@ SELECT IIF (REGEXP_LIKE ('abc', '^a'), 1, 0) AS is_match;

 SELECT IIF (NOT REGEXP_LIKE ('abc', '^a'), 1, 0) AS is_match;

+-- Test REGEXP_LIKE inside IIF with parentheses (should be scalar parentheses)
+SELECT (IIF (REGEXP_LIKE ('abc', '^a'), 'Match', 'No Match')) AS result;
+
 SELECT CASE WHEN REGEXP_LIKE ('abc', '^a') THEN 1 ELSE 0 END AS is_match;

 SELECT CASE WHEN NOT REGEXP_LIKE ('abc', '^a') THEN 1 ELSE 0 END AS is_match;
@@ -30,4 +33,8 @@ SELECT CASE WHEN NOT REGEXP_LIKE ('abc', '^a', NULL) THEN 1 ELSE 0 END AS is_mat

 SELECT CASE WHEN REGEXP_LIKE (NULL, '^a', 'c') THEN 1 ELSE 0 END AS is_match;

-SELECT IIF (NOT REGEXP_LIKE ('abc', NULL), 1, 0) AS is_match;
+SELECT IIF (NOT REGEXP_LIKE ('abc', NULL), 1, 0) AS is_match;
+
+SELECT 1 WHERE REGEXP_LIKE('a', '^a');
+
+SELECT 1 WHERE (REGEXP_LIKE('a', '%pattern%'));
