@phodal phodal commented Jan 23, 2026

Summary

This PR adds comprehensive syntax support for Python 3.10, 3.12, and 3.14 features to the Chapi Python AST parser.

Changes

Python 3.10 - Pattern Matching (PEP 634)

  • ✅ Complete match/case statement support
  • ✅ Literal patterns (numbers, strings, None, True, False)
  • ✅ Capture patterns (variable binding)
  • ✅ Wildcard patterns (_)
  • ✅ Class patterns with positional and keyword arguments
  • ✅ Sequence patterns (lists, tuples)
  • ✅ Mapping patterns (dictionaries)
  • ✅ Or patterns (|)
  • ✅ As patterns (as keyword)
  • ✅ Guard conditions (if clauses)

Python 3.12 - Type Parameter Syntax (PEP 695)

  • ✅ Generic class type parameters: class Stack[T]:
  • ✅ Generic function type parameters: def first[T]():
  • ✅ TypeVar support: T, K, V
  • ✅ TypeVarTuple support: *Ts
  • ✅ ParamSpec support: **P
  • ✅ Type bounds and constraints: T: int (upper bound), T: (int, str) (constraint tuple)
  • ✅ Type alias statement: type Vector = list[float]
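
A quick way to see which of these forms a given interpreter accepts is to hand them to CPython's own parser. The sketch below is an assumption-laden illustration, not part of the PR: `PEP_695_SOURCE` and `parses` are hypothetical names, and the snippet only checks syntax, which requires Python 3.12+ to succeed.

```python
import ast
import sys

# PEP 695 forms from the list above, kept in a string so that older
# interpreters can still import this file and report a clean result.
PEP_695_SOURCE = '''
class Stack[T]:
    def push(self, item: T) -> None: ...

def first[T](items: list[T]) -> T:
    return items[0]

def call[*Ts, **P](): ...

class Bounded[T: int]: ...
class Constrained[T: (int, str)]: ...

type Vector = list[float]
type Pair[K, V] = tuple[K, V]
'''


def parses(source: str) -> bool:
    """Return True if the running interpreter accepts the source syntactically."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


print(sys.version_info[:2], parses(PEP_695_SOURCE))
```

The same source string is a reasonable smoke-test input for any parser claiming PEP 695 support.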

Python 3.14 - New Features

  • ✅ Template strings (t-strings): t"...", tr"..."
  • ✅ Bracketless except (PEP 758): except ValueError, TypeError:
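
The two 3.14 forms can be checked against CPython's parser in the same spirit. This is a sketch under the assumption that the running interpreter is what decides acceptance; `T_STRING_SOURCE`, `BRACKETLESS_EXCEPT_SOURCE`, and `parses` are hypothetical names, and both snippets are rejected as syntax errors before Python 3.14.

```python
import ast
import sys

# Template strings (PEP 750), including the raw variant with the tr prefix.
T_STRING_SOURCE = r'''
name = "world"
greeting = t"Hello, {name}!"
pattern = tr"\d+{name}"
'''

# Bracketless except (PEP 758): several exception types, no parentheses, no `as`.
BRACKETLESS_EXCEPT_SOURCE = '''
try:
    pass
except ValueError, TypeError:
    pass
'''


def parses(source: str) -> bool:
    """Return True if the running interpreter accepts the source syntactically."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


print(parses(T_STRING_SOURCE), parses(BRACKETLESS_EXCEPT_SOURCE))
```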

Technical Details

  • Updated PythonLexer.g4 with new keywords and t-string support
  • Extended PythonParser.g4 with new grammar rules
  • Enhanced PythonVersion.java enum with version-specific checks
  • Added comprehensive test suite (Python314FeaturesTest.kt) with 25+ test cases

Testing

  • ✅ All existing tests pass (78 tests)
  • ✅ New test suite covers all new syntax features
  • ✅ Soft keyword compatibility verified (match, case, type can be used as identifiers)
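
Soft-keyword compatibility means code like the following must keep parsing after the grammar change. The function is a hypothetical example, not one of the PR's test inputs; it runs on any Python 3.

```python
import re


def soft_keywords_as_identifiers(text):
    match = re.match(r"\d+", text)          # `match` used as a variable name
    case = "digits" if match else "none"    # `case` used as a variable name
    type = "str"                            # `type` as a local name (shadows the built-in here)
    return (match.group(0) if match else None), case, type
```

The difficult case for a grammar is that `match` followed by an expression may be either the start of a match statement or an ordinary expression statement such as a call.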

Related PEPs

  • PEP 634: Structural Pattern Matching
  • PEP 695: Type Parameter Syntax
  • PEP 758: Allow except and except* expressions without brackets
  • PEP 750: Template Strings (t-strings)

Breaking Changes

None - all changes are additive and backward compatible.

Summary by CodeRabbit

  • New Features

    • Added support for Python 3.10+ pattern matching (match/case statements)
    • Added support for Python 3.12+ type parameters in classes and functions
    • Added support for Python 3.12+ type alias statements
    • Added support for Python 3.14 template strings (t-strings)
    • Added support for Python 3.14 bracketless except clauses
    • Enhanced version detection with feature-specific capability checks
  • Tests

    • Added comprehensive test suite for Python 3.10-3.14 syntax features

- Add Python 3.10 pattern matching (match/case statements)
  - Support literal, capture, wildcard, class, sequence, and mapping patterns
  - Support or patterns, as patterns, and guard conditions

- Add Python 3.12 type parameter syntax (PEP 695)
  - Generic class and function type parameters: class Stack[T], def first[T]()
  - TypeVar, TypeVarTuple (*Ts), ParamSpec (**P) support
  - Type bounds and constraints: T: int (upper bound), T: (int, str) (constraint tuple)
  - Type alias statement: type Vector = list[float]

- Add Python 3.14 features
  - Template strings (t-strings): t"...", tr"..."
  - Bracketless except (PEP 758): except ValueError, TypeError:

- Update PythonVersion enum with Python310, Python312, Python314
- Add comprehensive test suite for all new syntax features
- All existing tests pass
Copilot AI review requested due to automatic review settings January 23, 2026 00:13

coderabbitai bot commented Jan 23, 2026

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds comprehensive support for Python 3.10–3.14 language features to the ANTLR-based parser: pattern matching (3.10), type parameters and type aliases (3.12), template strings and bracketless except clauses (3.14). Updates lexer tokens, parser rules, version enum, and introduces extensive feature tests.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Lexer Grammar | chapi-ast-python/src/main/antlr/PythonLexer.g4 | Added soft keyword tokens MATCH, CASE, TYPE. Extended STRING token rule to recognize t-string prefixes ([tT]) with optional raw modifier ([rR]), supporting Python 3.14 template strings. |
| Parser Grammar | chapi-ast-python/src/main/antlr/PythonParser.g4 | Introduced pattern matching constructs (match_stmt, pattern, as_pattern, or_pattern, closed_pattern, literal_pattern, capture_pattern, wildcard_pattern, class_pattern, sequence_pattern, mapping_pattern, group_pattern). Added type parameter syntax (type_params, type_param_list, type_param) and type_alias_stmt for Python 3.12. Updated exception handling (except_clause, except_types) for Python 3.14. Extended classdef and funcdef to accept optional type parameters. Updated name rule to permit soft keywords as identifiers. |
| Version Support | chapi-ast-python/src/main/java/chapi/ast/antlr/PythonVersion.java | Added enum constants Python310, Python312, Python314 to represent specific Python versions. Introduced three capability-check methods: supportsPatternMatching(), supportsTypeParameters(), supportsPython314Features() with version-aware logic. |
| Comprehensive Feature Tests | chapi-ast-python/src/test/kotlin/chapi/ast/pythonast/Python314FeaturesTest.kt | Created extensive test suite with nested test groups covering pattern matching (various patterns, guards, edge cases), type parameters (bounds, multiple parameters), type aliases, template strings, bracketless except clauses, soft keyword identifiers, version capability checks, and nested pattern scenarios. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 A Python parser grows so bright,
With patterns, types, and templates in flight,
From match and case to t-strings so neat,
Version three-ten through fourteen complete!
Soft keywords hop, exceptions refine—
Grammar and tests align! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 7.32% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely describes the main objective of the pull request: adding Python 3.10-3.14 syntax support. |

phodal commented Jan 23, 2026

augment review

augmentcode bot commented Jan 23, 2026

🤖 Augment PR Summary

Summary: Extends Chapi’s Python grammar to parse newer Python 3.10–3.14 syntax.

Changes:

  • Added soft-keyword tokens for match/case (3.10) and type (3.12), and introduced a full match_stmt grammar for structural pattern matching (PEP 634).
  • Implemented PEP 695 constructs: type parameters on classes/functions and the type-alias statement.
  • Added grammar for PEP 758 “bracketless except” with multiple exception types.
  • Extended string literal lexing to recognize 3.14 template-string prefixes (t"", rt"", etc.).
  • Expanded PythonVersion with 3.10/3.12/3.14 entries and helper predicates.
  • Added a Kotlin test suite exercising parsing of the new syntax and soft-keyword identifier compatibility.
Technical Notes: Most changes are in the ANTLR lexer/parser grammars; tests validate that PythonAnalyser can parse representative snippets across versions.

@augmentcode augmentcode bot left a comment

Review completed. 4 suggestions posted.

;

named_expression
: name ASSIGN test

named_expression is defined as name ASSIGN test, but ASSIGN is = in the lexer, so valid walrus subjects (:=) in match statements won’t parse and invalid name = expr could be accepted in expression position.

;

wildcard_pattern
: NAME // Matches '_' specifically, handled semantically

wildcard_pattern : NAME overlaps with capture_pattern/pattern_capture_target (which also start from an identifier), so _ vs capture can become ambiguous and wildcard_pattern may never appear in the parse tree as intended.

// Python 3.14+ (PEP 758): EXCEPT test, test, test without parentheses (only WITHOUT AS clause)
except_clause
: EXCEPT (test ({CheckVersion(2)}? COMMA name {SetVersion(2);} | {CheckVersion(3)}? AS name {SetVersion(3);})?)? COLON suite
| EXCEPT except_types COLON suite // Python 3.14+ without parentheses (no AS clause allowed)

In Autodetect mode, CheckVersion(2) succeeds, so except ValueError, TypeError: is likely to match the Python 2 COMMA name branch (and call SetVersion(2)) instead of the new except_types alternative. That would misclassify Python 3.14 bracketless-except as Python 2 and change its meaning.
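
To make the change in meaning concrete: in Python 2, `except ValueError, e:` bound the exception to `e`, whereas under PEP 758 `except ValueError, TypeError:` catches either type and binds nothing. The sketch below uses the parenthesized spelling so that it runs on any Python 3; `parse_int` is a hypothetical helper.

```python
def parse_int(value):
    try:
        return int(value)
    # Python 3.14+ (PEP 758) also accepts `except ValueError, TypeError:` here,
    # with the same meaning as this parenthesized form: catch either type.
    except (ValueError, TypeError):
        return None
```

A parser that takes the Python 2 branch for the bracketless form would record `TypeError` as a bound variable name rather than as a second exception type.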

  Python2(2),
- Python3(3);
+ Python3(3),
+ Python310(310), // Pattern matching

This adds Python310/Python312/Python314, but the parser’s SetVersion(...) currently only maps 2 and 3, and PythonAnalyser constructs the parser in Autodetect without setting a newer version. As a result, these newer enum values (and the feature-check helpers) may not take effect during normal parsing unless something sets parser.Version externally.


codecov bot commented Jan 23, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 69.08%. Comparing base (948159b) to head (59fba90).
⚠️ Report is 2 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff              @@
##             master      #53      +/-   ##
============================================
+ Coverage     68.92%   69.08%   +0.16%     
- Complexity     1796     1810      +14     
============================================
  Files            88       88              
  Lines          7977     7983       +6     
  Branches       1624     1624              
============================================
+ Hits           5498     5515      +17     
+ Misses         1452     1443       -9     
+ Partials       1027     1025       -2     

Copilot AI left a comment

Pull request overview

This PR adds comprehensive syntax support for Python 3.10, 3.12, and 3.14 features to the Chapi Python AST parser, including pattern matching (match/case), type parameter syntax (PEP 695), type alias statements, template strings (t-strings), and bracketless except clauses (PEP 758).

Changes:

  • Extended Python grammar (PythonLexer.g4 and PythonParser.g4) to support new syntax features including match/case patterns, generic type parameters, type aliases, t-strings, and bracketless except clauses
  • Added new Python version enums (Python310, Python312, Python314) with feature support methods
  • Created comprehensive test suite with 25+ test cases covering all new syntax features

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 13 comments.

| File | Description |
|---|---|
| Python314FeaturesTest.kt | New test suite with 25+ tests for Python 3.10-3.14 features, organized into nested test classes for pattern matching, type parameters, type aliases, t-strings, bracketless except, and soft keyword compatibility |
| PythonVersion.java | Added three new version enums (Python310, Python312, Python314) and three feature support methods to check version compatibility |
| PythonParser.g4 | Added extensive grammar rules for match statements, patterns (literal, capture, wildcard, class, sequence, mapping, or, as), type parameters, type bounds, type alias statements, and bracketless except clause support |
| PythonLexer.g4 | Added soft keyword tokens (MATCH, CASE, TYPE) and extended STRING token to support t-string prefixes (t, tr, rt) |

;

wildcard_pattern
: NAME // Matches '_' specifically, handled semantically
Copilot AI Jan 23, 2026

The comment states this matches '_' specifically and is "handled semantically", but there's no semantic action or validation in the grammar rule itself. This could be misleading to future maintainers.

Either add a semantic predicate to actually check for '_' at parse time, or update the comment to clarify that this semantic check must be performed in the listener/visitor code that processes the AST. The comment should reference where this semantic validation occurs.

Suggested change
- : NAME // Matches '_' specifically, handled semantically
+ : NAME // Matches a NAME; treating '_' as the wildcard pattern is done in later semantic analysis (e.g. in listeners/visitors processing match patterns)

Comment on lines 98 to +246
: EXCEPT (test ({CheckVersion(2)}? COMMA name {SetVersion(2);} | {CheckVersion(3)}? AS name {SetVersion(3);})?)? COLON suite
| EXCEPT except_types COLON suite // Python 3.14+ without parentheses (no AS clause allowed)
Copilot AI Jan 23, 2026

The except_clause grammar has a potential ambiguity issue. Both alternatives start with EXCEPT test which could cause parsing conflicts:

  1. First alternative: EXCEPT (test ...)? COLON suite - allows zero or one test
  2. Second alternative: EXCEPT except_types COLON suite - requires test (COMMA test)+

When the parser sees EXCEPT test COMMA test COLON, it could match either:

  • The first alternative with the Python 2 path: EXCEPT test COMMA name (if test is followed by COMMA name)
  • The second alternative: EXCEPT except_types (if test is followed by COMMA test)

This ambiguity could lead to incorrect parsing. Consider reordering the alternatives or making the second alternative more specific to avoid conflicts. The more specific alternative (except_types requiring at least two exceptions) should typically come before the more general one.

Suggested change
- : EXCEPT (test ({CheckVersion(2)}? COMMA name {SetVersion(2);} | {CheckVersion(3)}? AS name {SetVersion(3);})?)? COLON suite
- | EXCEPT except_types COLON suite // Python 3.14+ without parentheses (no AS clause allowed)
+ : EXCEPT except_types COLON suite // Python 3.14+ without parentheses (no AS clause allowed)
+ | EXCEPT (test ({CheckVersion(2)}? COMMA name {SetVersion(2);} | {CheckVersion(3)}? AS name {SetVersion(3);})?)? COLON suite

;

wildcard_pattern
: NAME // Matches '_' specifically, handled semantically
Copilot AI Jan 23, 2026

The wildcard_pattern rule is defined as just NAME, with a comment stating it "Matches '_' specifically, handled semantically". This implementation is incorrect because it will match ANY name token, not just the underscore wildcard.

This means patterns like case foo: where foo is intended to be a capture pattern would incorrectly be parsed as a wildcard pattern. The wildcard pattern should either:

  1. Be checked semantically to ensure the NAME is specifically '_', or
  2. Use a more specific lexer token or parser predicate to match only '_'

The current implementation could lead to incorrect AST generation where any capture pattern is treated as a wildcard.

Suggested change
- : NAME // Matches '_' specifically, handled semantically
+ : {getCurrentToken().getText().equals("_")}? NAME // Matches '_' specifically

Comment on lines +124 to +127
pattern
    : or_pattern
    | as_pattern
    ;
Copilot AI Jan 23, 2026

The pattern matching grammar has a structural issue. The pattern rule includes both or_pattern and as_pattern, but as_pattern is defined as or_pattern AS pattern_capture_target. This means:

  1. pattern can be as_pattern
  2. as_pattern includes or_pattern
  3. or_pattern includes closed_pattern
  4. closed_pattern can be many things including group_pattern
  5. group_pattern is OPEN_PAREN pattern CLOSE_PAREN

However, looking at PEP 634, the as_pattern should be at the same level as or_pattern, not nested within pattern. The current structure is:

pattern: or_pattern | as_pattern
as_pattern: or_pattern AS pattern_capture_target

This is actually correct per PEP 634, but the rule should ensure that as_pattern takes precedence in the alternatives. Consider reordering to put as_pattern first to ensure it's matched before falling back to or_pattern.

Comment on lines +22 to +151
    @Test
    fun `should parse basic match statement with literal patterns`() {
        val code = """
            def http_status(status):
                match status:
                    case 200:
                        return "OK"
                    case 404:
                        return "Not Found"
                    case 500:
                        return "Internal Server Error"
        """.trimIndent()

        val codeFile = PythonAnalyser().analysis(code, "test.py")
        val func = codeFile.DataStructures[0].Functions[0]

        assertEquals("http_status", func.Name)
        assertEquals(1, func.Parameters.size)
    }

    @Test
    fun `should parse match with capture patterns`() {
        val code = """
            def describe_point(point):
                match point:
                    case (0, 0):
                        return "Origin"
                    case (x, 0):
                        return f"X-axis at {x}"
                    case (0, y):
                        return f"Y-axis at {y}"
                    case (x, y):
                        return f"Point at ({x}, {y})"
        """.trimIndent()

        val codeFile = PythonAnalyser().analysis(code, "test.py")
        val func = codeFile.DataStructures[0].Functions[0]

        assertEquals("describe_point", func.Name)
    }

    @Test
    fun `should parse match with wildcard pattern`() {
        val code = """
            def process_command(command):
                match command:
                    case "quit":
                        return "Goodbye"
                    case _:
                        return "Unknown command"
        """.trimIndent()

        val codeFile = PythonAnalyser().analysis(code, "test.py")
        val func = codeFile.DataStructures[0].Functions[0]

        assertEquals("process_command", func.Name)
    }

    @Test
    fun `should parse match with class patterns`() {
        val code = """
            def handle_event(event):
                match event:
                    case Click(x=x, y=y):
                        handle_click(x, y)
                    case KeyPress(key=k):
                        handle_key(k)
                    case Quit():
                        sys.exit()
        """.trimIndent()

        val codeFile = PythonAnalyser().analysis(code, "test.py")
        val func = codeFile.DataStructures[0].Functions[0]

        assertEquals("handle_event", func.Name)
    }

    @Test
    fun `should parse match with guard conditions`() {
        val code = """
            def categorize_number(n):
                match n:
                    case x if x < 0:
                        return "negative"
                    case 0:
                        return "zero"
                    case x if x > 0:
                        return "positive"
        """.trimIndent()

        val codeFile = PythonAnalyser().analysis(code, "test.py")
        val func = codeFile.DataStructures[0].Functions[0]

        assertEquals("categorize_number", func.Name)
    }

    @Test
    fun `should parse match with or patterns`() {
        val code = """
            def classify_char(char):
                match char:
                    case 'a' | 'e' | 'i' | 'o' | 'u':
                        return "vowel"
                    case _:
                        return "consonant"
        """.trimIndent()

        val codeFile = PythonAnalyser().analysis(code, "test.py")
        val func = codeFile.DataStructures[0].Functions[0]

        assertEquals("classify_char", func.Name)
    }

    @Test
    fun `should parse match with mapping patterns`() {
        val code = """
            def process_action(action):
                match action:
                    case {"type": "move", "x": x, "y": y}:
                        move_to(x, y)
                    case {"type": "attack", "target": t}:
                        attack(t)
        """.trimIndent()

        val codeFile = PythonAnalyser().analysis(code, "test.py")
        val func = codeFile.DataStructures[0].Functions[0]

        assertEquals("process_action", func.Name)
    }
}
Copilot AI Jan 23, 2026

The test assertions for pattern matching are very weak and only verify that parsing succeeds without checking whether the match/case constructs are actually captured in the AST.

Looking at other test files like PythonAdvancedFeaturesTest.kt (e.g., lines 63-67), tests verify specific properties like annotations, modifiers, or function calls. These pattern matching tests should verify:

  1. That match statements are captured in the function body
  2. That case blocks are identified
  3. That patterns (literal, capture, class, etc.) are correctly parsed
  4. That guard conditions are captured

Current tests only check basic function name parsing, which would pass even if pattern matching was completely ignored by the parser. Consider adding assertions that verify match/case constructs are actually present in the parsed AST structure.

Comment on lines +308 to +346
    @Test
    fun `should parse simple t-string`() {
        val code = """
            def greet(name):
                return t"Hello, {name}!"
        """.trimIndent()

        val codeFile = PythonAnalyser().analysis(code, "test.py")
        val func = codeFile.DataStructures[0].Functions[0]

        assertEquals("greet", func.Name)
    }

    @Test
    fun `should parse raw t-string`() {
        val code = """
            def make_pattern(value):
                return rt"\\d+{value}\\w+"
        """.trimIndent()

        val codeFile = PythonAnalyser().analysis(code, "test.py")
        val func = codeFile.DataStructures[0].Functions[0]

        assertEquals("make_pattern", func.Name)
    }

    @Test
    fun `should parse t-string with complex expressions`() {
        val code = """
            def format_data(user):
                return t"User: {user.name}, Age: {user.age}, Active: {user.is_active}"
        """.trimIndent()

        val codeFile = PythonAnalyser().analysis(code, "test.py")
        val func = codeFile.DataStructures[0].Functions[0]

        assertEquals("format_data", func.Name)
    }
}
Copilot AI Jan 23, 2026

The test assertions for template strings (t-strings) only verify that the function name is correctly parsed, not that t-strings are actually recognized as a distinct string type in the AST.

These tests should verify:

  1. That t-strings are captured with their proper prefix in the AST
  2. That raw t-strings (rt"...") are distinguished from regular t-strings (t"...")
  3. That the string content and interpolation expressions are properly extracted

The current assertions would pass even if 't' was completely ignored and the strings were parsed as regular strings. Consider adding assertions that verify t-string specific properties are preserved in the AST.

Comment on lines +353 to +407
    @Test
    fun `should parse except without parentheses for multiple exceptions`() {
        val code = """
            def safe_parse(data):
                try:
                    return json.loads(data)
                except ValueError, TypeError, KeyError:
                    return None
        """.trimIndent()

        val codeFile = PythonAnalyser().analysis(code, "test.py")
        val func = codeFile.DataStructures[0].Functions[0]

        assertEquals("safe_parse", func.Name)
    }

    @Test
    fun `should parse except with parentheses still works`() {
        val code = """
            def safe_convert(value):
                try:
                    return int(value)
                except (ValueError, TypeError) as e:
                    print(f"Error: {e}")
                    return 0
        """.trimIndent()

        val codeFile = PythonAnalyser().analysis(code, "test.py")
        val func = codeFile.DataStructures[0].Functions[0]

        assertEquals("safe_convert", func.Name)
    }

    @Test
    fun `should parse mixed except styles`() {
        // Note: Per PEP 758, bracketless except cannot use AS clause
        val code = """
            def process_file(path):
                try:
                    with open(path) as f:
                        return f.read()
                except FileNotFoundError:
                    return None
                except PermissionError, IOError:
                    raise
                except ValueError:
                    return ""
        """.trimIndent()

        val codeFile = PythonAnalyser().analysis(code, "test.py")
        val func = codeFile.DataStructures[0].Functions[0]

        assertEquals("process_file", func.Name)
    }
}
Copilot AI Jan 23, 2026

The test assertions for bracketless except only verify that the function name is correctly parsed, not that multiple exceptions are properly captured in the except clause.

These tests should verify:

  1. That multiple exception types in bracketless form are correctly identified
  2. That the exception types are properly separated and stored
  3. That mixing bracketless and parenthesized except clauses works correctly

The current assertions would pass even if the except clause was misparsed or if only the first exception type was captured. This is particularly important for line 359 which tests the new Python 3.14 syntax except ValueError, TypeError, KeyError: - the test should verify all three exception types are recognized.
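
For comparison, this is what "all three exception types are recognized" looks like when checked against CPython's `ast` module. It is an analogy only, not Chapi's API, and it uses the parenthesized spelling so that it parses on any Python 3; on 3.14+ the bracketless spelling should yield the same handler shape.

```python
import ast

SOURCE = '''
try:
    data = load()
except (ValueError, TypeError, KeyError):
    data = None
'''

# Find the single except handler and read the exception types it lists.
handler = next(
    node for node in ast.walk(ast.parse(SOURCE))
    if isinstance(node, ast.ExceptHandler)
)
exception_names = [elt.id for elt in handler.type.elts]
```

An equivalent assertion on the parser under test would catch the failure mode where only the first exception type survives.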

Comment on lines +137 to +145
closed_pattern
    : literal_pattern
    | capture_pattern
    | wildcard_pattern
    | class_pattern
    | sequence_pattern
    | mapping_pattern
    | group_pattern
    ;
Copilot AI Jan 23, 2026

There's an ambiguity between capture_pattern and wildcard_pattern in the closed_pattern alternatives. Both rules will match any NAME token:

  • capture_pattern expands to pattern_capture_target which is name which includes NAME
  • wildcard_pattern is directly NAME

When the parser encounters a NAME token in a pattern context, ANTLR won't know whether to treat it as a capture_pattern or wildcard_pattern. This ambiguity should be resolved by:

  1. Removing one of the alternatives (since wildcard is meant to only match '_', it should be handled differently)
  2. Using a semantic predicate to distinguish '_' from other names
  3. Restructuring to have a single 'name_pattern' that's semantically interpreted

This is related to the wildcard_pattern issue noted separately, but the ambiguity in closed_pattern makes it a more serious parsing concern.

Comment on lines 3 to +17
  public enum PythonVersion {
      Autodetect(0),
      Python2(2),
-     Python3(3);
+     Python3(3),
+     Python310(310), // Pattern matching
+     Python312(312), // Type parameter syntax
+     Python314(314); // Template strings, bracketless except
Copilot AI Jan 23, 2026

The version numbering scheme is inconsistent across enum values. Python2 uses value 2, Python3 uses value 3, but Python 3.10+ uses values like 310, 312, 314.

This inconsistency creates confusion in the support methods:

  • Line 33: value >= 310 || this == Python3 - this special case is needed because Python3(3) < 310
  • This means Python3.getValue() returns 3, which is less than 310, making version comparisons counterintuitive

Consider using a consistent scheme:

  • Option 1: Use full version numbers for all (e.g., Python2=200, Python3=300, Python310=310)
  • Option 2: Use a separate major/minor version structure
  • Option 3: Document this inconsistency clearly and why Python3 is treated specially

The current approach where Python3(3) is considered to support Python 3.10 features (line 33) suggests Python3 means "latest 3.x", which should be documented.
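
Option 1 can be sketched as follows. The sketch is written in Python for brevity, although the real enum is Java, and every name in it is hypothetical. It also assumes that Python3 means "3.0", which differs from the PR's current behaviour of treating Python3 as supporting pattern matching.

```python
from enum import IntEnum


class PythonVersion(IntEnum):
    """Hypothetical mirror of Option 1: every member is major * 100 + minor."""

    AUTODETECT = 0
    PYTHON_2 = 200
    PYTHON_3 = 300      # read as "3.0", not "latest 3.x" (an assumption of this sketch)
    PYTHON_310 = 310
    PYTHON_312 = 312
    PYTHON_314 = 314

    def at_least(self, minimum):
        # Autodetect accepts everything; otherwise one numeric comparison suffices.
        return self is PythonVersion.AUTODETECT or self >= minimum

    def supports_pattern_matching(self):
        return self.at_least(PythonVersion.PYTHON_310)

    def supports_type_parameters(self):
        return self.at_least(PythonVersion.PYTHON_312)

    def supports_python314_features(self):
        return self.at_least(PythonVersion.PYTHON_314)
```

With a uniform scale, none of the feature checks needs a special case for a particular member, so the three methods cannot drift apart.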

Comment on lines +40 to +47
        return this == Autodetect || value >= 312;
    }

    /**
     * Check if this version supports Python 3.14+ features (t-strings, bracketless except)
     */
    public boolean supportsPython314Features() {
        return this == Autodetect || value >= 314;
Copilot AI Jan 23, 2026

The supportsTypeParameters() method doesn't include || this == Python3 unlike supportsPatternMatching(). This means:

  • Python3 version will return false for type parameters support (since value=3 < 312)
  • Python3 version will return true for pattern matching support (due to special case on line 33)

This inconsistency is confusing. If Python3 is meant to represent "generic Python 3.x" that should support all features (as suggested by line 33), then this method should also include || this == Python3. If Python3 is meant to represent specifically Python 3.0-3.9, then line 33's special case for pattern matching is incorrect (as noted in a separate comment).

The behavior should be consistent across all feature support methods.

Suggested change
-         return this == Autodetect || value >= 312;
-     }
-     /**
-      * Check if this version supports Python 3.14+ features (t-strings, bracketless except)
-      */
-     public boolean supportsPython314Features() {
-         return this == Autodetect || value >= 314;
+         return this == Autodetect || value >= 312 || this == Python3;
+     }
+     /**
+      * Check if this version supports Python 3.14+ features (t-strings, bracketless except)
+      */
+     public boolean supportsPython314Features() {
+         return this == Autodetect || value >= 314 || this == Python3;

- Add comprehensive tests for PythonVersion methods:
  - supportsPatternMatching()
  - supportsTypeParameters()
  - supportsPython314Features()
  - getValue() for all versions

- Add edge case tests for:
  - Nested patterns in match statements
  - As patterns in match statements
  - Star patterns and double star patterns
  - Complex type parameters with multiple bounds
  - Type aliases with type parameters
  - T-strings with complex expressions

- All tests pass (40+ test cases total)
- Improves code coverage for PythonVersion.java

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@chapi-ast-python/src/main/antlr/PythonParser.g4`:
- Around line 241-252: The grammar allows the Python 3.14+ bracketless exception
form to match in older modes because the second alternative of except_clause
lacks a version predicate; add the version guard by prepending the predicate
{CheckVersion(314)}? to the alternative that references except_types so that
except_types is only used when CheckVersion(314) is true, keeping the existing
CheckVersion/SetVersion pattern used in the first except_clause alternative
(refer to except_clause, except_types, CheckVersion and SetVersion).
- Around line 137-165: The closed_pattern rule currently places capture_pattern
before wildcard_pattern causing NAME-based captures to shadow the '_' wildcard;
fix this by moving wildcard_pattern above capture_pattern in closed_pattern and
restrict wildcard_pattern to only match the underscore identifier by adding a
semantic predicate or explicit text check (e.g., in wildcard_pattern ensure the
token matched by NAME equals "_" instead of any NAME). Update the rules
referenced: closed_pattern, wildcard_pattern, capture_pattern (and its
components pattern_capture_target → name → NAME) so wildcard_pattern is
evaluated first and only produces a wildcard_pattern node when the NAME token is
"_" .

In `@chapi-ast-python/src/main/java/chapi/ast/antlr/PythonVersion.java`:
- Around line 29-48: The three feature-gate methods (supportsPatternMatching,
supportsTypeParameters, supportsPython314Features) are inconsistent about
treating the enum constant Python3 as “generic Python 3.x”; align their
semantics by either always treating Python3 as supporting modern 3.x features or
never doing so—pick one. Concretely, update the implementations in
PythonVersion: if you choose to treat Python3 as generic modern 3.x, add the
same check (this == Python3) to supportsTypeParameters() and
supportsPython314Features(); if you choose strict numeric checks, remove the
this == Python3 clause from supportsPatternMatching(). Keep the Autodetect
handling (this == Autodetect) unchanged and use the existing value field for
numeric comparisons.
🧹 Nitpick comments (1)
chapi-ast-python/src/test/kotlin/chapi/ast/pythonast/Python314FeaturesTest.kt (1)

19-151: Consider extending PythonAnalyser to accept an explicit version parameter for version-specific feature tests.

The tests exercise Python 3.10–3.14 syntax features (pattern matching, type parameters, etc.), but PythonAnalyser.analysis() currently accepts only code and filePath, relying on the parser's Autodetect mode. While the infrastructure exists (PythonVersion enum with Python310, Python312, Python314 variants), the public API doesn't expose version control. To reduce brittleness and improve test clarity, consider adding a version parameter to analysis() or creating a builder method that allows explicit version selection. Autodetect currently permits these features, but explicit pinning would make intent clearer and protect against future default changes.
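
One possible shape for that change, sketched in Java for illustration. Everything here is hypothetical: `AnalyserSketch`, `ParseRequest`, and the `Version` constants are stand-ins, and the real `PythonAnalyser` is a Kotlin class whose signature may differ. The point is only that the existing two-argument call can delegate to a new overload, so current callers keep autodetect behavior.

```java
// Hypothetical API shape only: every name below is illustrative and none of
// it is taken from the Chapi code base.
final class AnalyserSketch {
    enum Version { AUTODETECT, PYTHON_310, PYTHON_312, PYTHON_314 }

    // Stand-in for whatever the analyser would hand to the parser.
    static final class ParseRequest {
        final String code;
        final String filePath;
        final Version version;

        ParseRequest(String code, String filePath, Version version) {
            this.code = code;
            this.filePath = filePath;
            this.version = version;
        }
    }

    // Existing two-argument entry point keeps its behavior by delegating
    // with the autodetect default.
    static ParseRequest analysis(String code, String filePath) {
        return analysis(code, filePath, Version.AUTODETECT);
    }

    // Proposed overload: a test can pin the grammar version explicitly.
    static ParseRequest analysis(String code, String filePath, Version version) {
        return new ParseRequest(code, filePath, version);
    }
}
```

A version-specific test would then call `AnalyserSketch.analysis(code, "test.py", AnalyserSketch.Version.PYTHON_314)` instead of relying on the default.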

Comment on lines +137 to +165
closed_pattern
    : literal_pattern
    | capture_pattern
    | wildcard_pattern
    | class_pattern
    | sequence_pattern
    | mapping_pattern
    | group_pattern
    ;

literal_pattern
    : MINUS? number
    | STRING+
    | NONE
    | TRUE
    | FALSE
    ;

capture_pattern
    : pattern_capture_target
    ;

pattern_capture_target
    : name
    ;

wildcard_pattern
    : NAME // Matches '_' specifically, handled semantically
    ;

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n chapi-ast-python/src/main/antlr/PythonParser.g4 | sed -n '137,165p'

Repository: phodal/chapi

Length of output: 701


🏁 Script executed:

# Check for wildcard and capture pattern usage in visitors and analyzers
rg -n "wildcard_pattern|capture_pattern" -A 3 -B 1 --type java --type python

Repository: phodal/chapi

Length of output: 142


🏁 Script executed:

# Check if there's semantic handling for _ in the codebase
rg -n "equals.*_|_.*equals" --type java -A 2 -B 1 | head -50

Repository: phodal/chapi

Length of output: 1795


🏁 Script executed:

rg -n "^\s*name\s*:" chapi-ast-python/src/main/antlr/PythonParser.g4 -A 5

Repository: phodal/chapi

Length of output: 38


🏁 Script executed:

fd -t f "PythonParserBase\|PythonVisitor" chapi-ast-python/

Repository: phodal/chapi

Length of output: 38


🏁 Script executed:

fd -t f "\.g4" chapi-ast-python/src/main/antlr/

Repository: phodal/chapi

Length of output: 150


🏁 Script executed:

# Search for any semantic predicate or special handling of underscore in Python parser
rg -n "TEXT\|semantic" chapi-ast-python/src/main/antlr/PythonParser.g4 -i | head -20

Repository: phodal/chapi

Length of output: 38


🏁 Script executed:

# Search in entire PythonParser.g4 for name rule definition
rg -n "^name\s*:" chapi-ast-python/src/main/antlr/PythonParser.g4 -A 5

Repository: phodal/chapi

Length of output: 38


🏁 Script executed:

# Also search with whitespace variations
grep -n "^\s*name\s*:" chapi-ast-python/src/main/antlr/PythonParser.g4 -A 5

Repository: phodal/chapi

Length of output: 38


🏁 Script executed:

# Get entire PythonParser.g4 to understand structure
wc -l chapi-ast-python/src/main/antlr/PythonParser.g4

Repository: phodal/chapi

Length of output: 107


🏁 Script executed:

# Search for pattern_capture_target definition and what it actually calls
grep -n "pattern_capture_target\|NAME\|IDENTIFIER" chapi-ast-python/src/main/antlr/PythonParser.g4 | head -40

Repository: phodal/chapi

Length of output: 341


🏁 Script executed:

# Get lines around pattern definitions
cat -n chapi-ast-python/src/main/antlr/PythonParser.g4 | sed -n '155,170p'

Repository: phodal/chapi

Length of output: 427


🏁 Script executed:

# Get the name rule definition
cat -n chapi-ast-python/src/main/antlr/PythonParser.g4 | sed -n '488,495p'

Repository: phodal/chapi

Length of output: 232


🏁 Script executed:

# Check if there are any visitor classes generated and how they handle wildcard_pattern
find chapi-ast-python -name "*Visitor*.java" -o -name "*BaseVisitor*.java" | head -5

Repository: phodal/chapi

Length of output: 38


🏁 Script executed:

# Check for any Python files in the codebase that might handle AST traversal
find chapi-ast-python -type f -name "*.py" | head -10

Repository: phodal/chapi

Length of output: 646


🏁 Script executed:

# Search for wildcard_pattern references in Java source code
find chapi-ast-python/src -name "*.java" -type f -exec grep -l "wildcard_pattern" {} \;

Repository: phodal/chapi

Length of output: 38


🏁 Script executed:

# Search more broadly for any references to wildcard in the codebase
rg "wildcard" chapi-ast-python/src --type java -i

Repository: phodal/chapi

Length of output: 38


🏁 Script executed:

# Check if there's a base listener/visitor pattern in chapi for Python
find chapi-ast-python/src/main -type f -name "*.java" | head -20

Repository: phodal/chapi

Length of output: 258


🏁 Script executed:

cat chapi-ast-python/src/main/java/chapi/ast/antlr/PythonParserBase.java

Repository: phodal/chapi

Length of output: 741


🏁 Script executed:

# Search for any pattern handling in the entire module
rg "pattern" chapi-ast-python/src/main/java -i -l

Repository: phodal/chapi

Length of output: 121


Wildcard pattern is unreachable due to capture_pattern precedence.

Both capture_pattern and wildcard_pattern ultimately match the NAME token (capture_pattern via pattern_capture_target → name → NAME), and since capture_pattern appears first in the ordered choice on line 139, it will always match before wildcard_pattern gets evaluated. This prevents wildcard patterns (specifically _) from ever being parsed as wildcard_pattern nodes. Python 3.10+ match statements require proper wildcard pattern support, so downstream code expecting wildcard_pattern AST nodes will never receive them.

Reorder wildcard_pattern before capture_pattern in the closed_pattern rule and add a semantic predicate to wildcard_pattern to restrict it to the _ identifier:

Proposed fix
-closed_pattern
-    : literal_pattern
-    | capture_pattern
-    | wildcard_pattern
+closed_pattern
+    : literal_pattern
+    | wildcard_pattern
+    | capture_pattern
     | class_pattern
     | sequence_pattern
     | mapping_pattern
     | group_pattern
     ;

-wildcard_pattern
-    : NAME  // Matches '_' specifically, handled semantically
+wildcard_pattern
+    : name=NAME { $name.text.equals("_") }?
     ;
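
The grammar comment says the `_` case is "handled semantically". As a minimal sketch of what such a check amounts to, independent of where it ends up (a grammar predicate or a listener), the helper below classifies the text of a matched NAME token. `NamePatternSketch` and `classify` are hypothetical names and not part of Chapi.

```java
// Hypothetical helper, not part of Chapi: decides how a bare NAME inside a
// match pattern should be interpreted once the parser has matched it.
final class NamePatternSketch {
    enum Kind { WILDCARD, CAPTURE }

    // In Python's match statement only the single underscore is the wildcard;
    // every other identifier (including "__" or "_x") binds a name.
    static Kind classify(String nameText) {
        return "_".equals(nameText) ? Kind.WILDCARD : Kind.CAPTURE;
    }
}
```

With this check, `classify("_")` yields `WILDCARD` while `classify("x")` and `classify("__")` yield `CAPTURE`, which is the distinction the reordered grammar rules need to preserve.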

Comment on lines 241 to 252
// Python 2 : EXCEPT test COMMA name
// Python 3 : EXCEPT test AS name
// Python 3.14+ (PEP 758): EXCEPT test, test, test without parentheses (only WITHOUT AS clause)
except_clause
    : EXCEPT (test ({CheckVersion(2)}? COMMA name {SetVersion(2);} | {CheckVersion(3)}? AS name {SetVersion(3);})?)? COLON suite
    | EXCEPT except_types COLON suite // Python 3.14+ without parentheses (no AS clause allowed)
    ;

// Python 3.14+ allows multiple exception types without parentheses
except_types
    : test (COMMA test)+
    ;

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Search for version gating patterns in the grammar file
rg -n "CheckVersion|SetVersion|supportsPython314|PythonVersion" chapi-ast-python/src/main/antlr/PythonParser.g4 | head -50

Repository: phodal/chapi

Length of output: 704


🏁 Script executed:

# Get context around the except_clause and except_types rules
sed -n '230,260p' chapi-ast-python/src/main/antlr/PythonParser.g4

Repository: phodal/chapi

Length of output: 963


🏁 Script executed:

# Check if there are any lexer/semantic action files
find chapi-ast-python -name "*.java" -o -name "*.py" -o -name "*.g4" | grep -i "listener\|visitor\|action\|semantic" | head -20

Repository: phodal/chapi

Length of output: 38


🏁 Script executed:

# Search for CheckVersion implementation and how it's used
find chapi-ast-python -name "*.java" | head -20

Repository: phodal/chapi

Length of output: 258


🏁 Script executed:

# Look for PythonParser listener or base classes
find chapi-ast-python -name "*Listener*" -o -name "*Visitor*" -o -name "Python*" | grep -E "\.(java|py)$" | head -20

Repository: phodal/chapi

Length of output: 258


🏁 Script executed:

# Search for other Python 3.14+ specific rules to see if they have version guards
rg -n "3\.14|PEP 758|type_params" chapi-ast-python/src/main/antlr/PythonParser.g4 | head -30

Repository: phodal/chapi

Length of output: 574


🏁 Script executed:

# Check if there are any other rules with semantic actions but no CheckVersion
rg -n "EXCEPT|except_types" chapi-ast-python/src/main/antlr/PythonParser.g4 -A 3 -B 3

Repository: phodal/chapi

Length of output: 676


🏁 Script executed:

# Examine CheckVersion implementation in PythonParserBase
cat chapi-ast-python/src/main/java/chapi/ast/antlr/PythonParserBase.java

Repository: phodal/chapi

Length of output: 741


🏁 Script executed:

# Check if type_params rule has any version guards
rg -n "type_params" chapi-ast-python/src/main/antlr/PythonParser.g4 -B 2 -A 5 | head -40

Repository: phodal/chapi

Length of output: 730


🏁 Script executed:

# Check if there are other unguarded Python 3.12+ or 3.14+ features
rg -n "Python 3\.(12|13|14)\+|PEP (695|758)" chapi-ast-python/src/main/antlr/PythonParser.g4 | head -20

Repository: phodal/chapi

Length of output: 591


🏁 Script executed:

# Look for how type_params, type_alias_stmt are used and if they're version-guarded at call sites
rg -n "type_params|type_alias_stmt" chapi-ast-python/src/main/antlr/PythonParser.g4 -B 1 | head -40

Repository: phodal/chapi

Length of output: 698


🏁 Script executed:

# Check if there's any validation or version checking outside the grammar
rg -n "supportsPython|checkVersion|validateVersion" chapi-ast-python/src/main/java -i

Repository: phodal/chapi

Length of output: 295


🏁 Script executed:

# Check PythonVersion.java to see the version enum values
cat chapi-ast-python/src/main/java/chapi/ast/antlr/PythonVersion.java

Repository: phodal/chapi

Length of output: 1374


🏁 Script executed:

# Look at how the grammar file is structured and check for other rules without CheckVersion that should be version-specific
rg -n "EXCEPT except_types|except_types" chapi-ast-python/src/main/antlr/PythonParser.g4 -B 5 -A 5

Repository: phodal/chapi

Length of output: 729


Add version guard to except_types alternative.
The second except_clause alternative, which references the except_types rule (lines 250–252), has no CheckVersion predicate, so the Python 3.14+ bracketless exception syntax also parses in Python 2 and 3 modes. Add {CheckVersion(314)}? before that alternative to prevent unintended matches in earlier versions, consistent with the version-gating pattern used in the first alternative (line 245).

Comment on lines +29 to +48
/**
 * Check if this version supports Python 3.10+ features (pattern matching)
 */
public boolean supportsPatternMatching() {
    return this == Autodetect || value >= 310 || this == Python3;
}

/**
 * Check if this version supports Python 3.12+ features (type parameters)
 */
public boolean supportsTypeParameters() {
    return this == Autodetect || value >= 312;
}

/**
 * Check if this version supports Python 3.14+ features (t-strings, bracketless except)
 */
public boolean supportsPython314Features() {
    return this == Autodetect || value >= 314;
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for PythonVersion.Python3 usage and feature gate method calls
rg -n "PythonVersion\.Python3\b|supportsPatternMatching|supportsTypeParameters|supportsPython314Features" -A 2 -B 2

Repository: phodal/chapi

Length of output: 2213


🏁 Script executed:

#!/bin/bash
# Find and examine the PythonVersion enum definition
fd -e java | xargs rg -l "enum PythonVersion|class PythonVersion" | head -5

Repository: phodal/chapi

Length of output: 121


🏁 Script executed:

#!/bin/bash
# Read the PythonVersion.java file
cat -n chapi-ast-python/src/main/java/chapi/ast/antlr/PythonVersion.java

Repository: phodal/chapi

Length of output: 1717


Align Python3 semantics across feature gates to avoid inconsistent feature availability.

supportsPatternMatching() treats Python3 (value=3) as supporting 3.10+ features with an explicit check, but supportsTypeParameters() and supportsPython314Features() exclude it entirely. Since Python3 represents a generic/unknown Python 3.x version (as seen in PythonParserBase.java line 22), this creates an inconsistency: either all three methods should include Python3, or none should. Decide on the intended semantics—does generic Python 3 assume modern feature support, or should version checks be strict?—and align all three methods accordingly.
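
A minimal sketch of the "treat Python3 as generic modern 3.x" option, with the three gates routed through one private helper so they cannot drift apart again. `PythonVersionSketch` is a stand-in for the real enum; the constant list and every numeric value other than Python3 = 3 are assumptions made for this illustration.

```java
// Illustrative stand-in for PythonVersion. Constant values other than
// Python3 = 3 are assumptions for this sketch, not copied from the repository.
enum PythonVersionSketch {
    Autodetect(0), Python2(2), Python3(3), Python310(310), Python312(312), Python314(314);

    private final int value;

    PythonVersionSketch(int value) {
        this.value = value;
    }

    int getValue() {
        return value;
    }

    // Shared gate: Autodetect and generic Python3 accept every modern feature;
    // pinned versions are compared numerically against the minimum version.
    private boolean supports(int minimum) {
        return this == Autodetect || this == Python3 || value >= minimum;
    }

    boolean supportsPatternMatching() {
        return supports(310);
    }

    boolean supportsTypeParameters() {
        return supports(312);
    }

    boolean supportsPython314Features() {
        return supports(314);
    }
}
```

Under the strict alternative, the `this == Python3` term would be dropped from `supports`, and generic Python3 would then fail all three checks because its value of 3 is below every minimum.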

@phodal phodal merged commit 0b30268 into master Jan 23, 2026
6 checks passed