Remove backtracking from package-gale parser by gfx · Pull Request #711 · wado-lang/wado

gfx · 2026-03-27T22:47:33Z

https://claude.ai/code/session_014FPnMv8SbHtLtKwvuUs8gs

…n engine

Add Consume variant to PredictionNode that factors out shared fixed-token prefixes across alternatives. When all alternatives in a group start with the same TokenRef or Literal, the prediction engine now emits a Consume node that advances the parser past the shared prefix without spending lookahead depth budget, then recurses on the suffixes.

This eliminates backtracking in cases like the JSON obj/arr rules, where '{' pair (',' pair)* '}' and '{' '}' share the '{' prefix: the generated parser now consumes '{' and then dispatches on the next token.

For SQLite, backtracking sites are reduced from 122 to 109 (inline groups with shared keyword prefixes like INSERT OR REPLACE/ROLLBACK/etc).

https://claude.ai/code/session_014FPnMv8SbHtLtKwvuUs8gs
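The prefix factoring described in this commit might look roughly like the following Rust sketch. It is not the PR's code: the `Element` and `PredictionNode` shapes, the `build` function, and the flattening of `pair` to a leading `STRING` token are all assumptions made for illustration.

```rust
// Hypothetical, simplified types; the PR's real PredictionNode is not shown here.
#[derive(Clone, Debug, PartialEq)]
enum Element {
    Token(String),   // stands in for TokenRef / Literal
    RuleRef(String), // reference to another parser rule
}

#[derive(Debug, PartialEq)]
enum PredictionNode {
    /// Consume a fixed token shared by all alternatives, then predict on the suffixes.
    Consume(String, Box<PredictionNode>),
    /// Dispatch on the next token: (token, alternative index) pairs.
    Dispatch(Vec<(String, usize)>),
    /// Static prediction failed; the parser must try these alternatives in order.
    Backtrack(Vec<usize>),
}

/// Build a prediction node for `(alternative index, elements)` pairs.
fn build(alts: &[(usize, Vec<Element>)]) -> PredictionNode {
    // Shared prefix: every alternative starts with the same fixed token.
    if alts.len() > 1 {
        if let Some(Element::Token(t)) = alts[0].1.first() {
            let shared = alts
                .iter()
                .all(|(_, e)| matches!(e.first(), Some(Element::Token(x)) if x == t));
            if shared {
                // Drop the shared token from each alternative and recurse.
                let suffixes: Vec<(usize, Vec<Element>)> =
                    alts.iter().map(|(i, e)| (*i, e[1..].to_vec())).collect();
                return PredictionNode::Consume(t.clone(), Box::new(build(&suffixes)));
            }
        }
    }
    // Otherwise dispatch, but only if every alternative starts with a distinct token.
    let mut branches: Vec<(String, usize)> = Vec::new();
    for (i, elems) in alts {
        match elems.first() {
            Some(Element::Token(t)) if !branches.iter().any(|(b, _)| b == t) => {
                branches.push((t.clone(), *i))
            }
            _ => return PredictionNode::Backtrack(alts.iter().map(|(i, _)| *i).collect()),
        }
    }
    PredictionNode::Dispatch(branches)
}
```

For the two JSON object alternatives, this sketch yields a `Consume("{")` node wrapping a dispatch on `STRING` versus `}`. Because the suffixes get shorter on every recursive call, the factoring step always terminates.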

The prediction engine now detects when all alternatives in a group share the same leading RuleRef (e.g. parse_name), consumes it via a parse call, and then dispatches on the suffix. This eliminates backtracking for cases like SQLite's compound_operator rule where alternatives share a common RuleRef prefix. SQLite backtracking: 109 → 105 (further reduction from previous 122). https://claude.ai/code/session_014FPnMv8SbHtLtKwvuUs8gs

Implement rule_is_single_token() that checks whether a parser rule always consumes exactly one token (all alternatives have exactly one non-nullable terminal or single-token RuleRef). When compute_deeper_first encounters such a RuleRef, it safely advances past it (consuming 1 depth unit), enabling lookahead into what follows the rule reference.

This is the static equivalent of ANTLR4's SLL prediction: for rules like SQLite's `name` → `IDENTIFIER | keyword`, the prediction engine can now look past the rule reference to disambiguate alternatives (e.g., `function_name '('` vs `column_name` at depth 2).

No golden file changes because SQLite's `any_name` rule includes a recursive `'(' any_name ')'` alternative, making it non-single-token. The infrastructure is in place for grammars with pure single-token rules.

https://claude.ai/code/session_014FPnMv8SbHtLtKwvuUs8gs
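A check along these lines could be sketched in Rust as follows. This is a guess at the idea, not the PR's implementation: the `Grammar` map, the `visiting` cycle guard, the `sample_grammar` helper, and the keyword token names are illustrative assumptions.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical grammar model: rule name -> alternatives -> elements.
#[derive(Clone, Debug)]
enum Element {
    Token(String),
    RuleRef(String),
}

type Grammar = HashMap<String, Vec<Vec<Element>>>;

/// True if every alternative of `rule` consists of exactly one terminal,
/// or of exactly one reference to a rule that is itself single-token.
fn rule_is_single_token(g: &Grammar, rule: &str, visiting: &mut HashSet<String>) -> bool {
    // Reaching a rule that is already being checked means recursion:
    // conservatively treat it as multi-token.
    if !visiting.insert(rule.to_string()) {
        return false;
    }
    let ok = match g.get(rule) {
        Some(alts) => {
            !alts.is_empty()
                && alts.iter().all(|alt| match alt.as_slice() {
                    [Element::Token(_)] => true,
                    [Element::RuleRef(r)] => rule_is_single_token(g, r, visiting),
                    _ => false, // empty or multi-element alternative
                })
        }
        None => false, // unknown rule: assume the worst
    };
    visiting.remove(rule);
    ok
}

/// Illustrative grammar fragment (token names are assumptions).
fn sample_grammar() -> Grammar {
    let tok = |s: &str| Element::Token(s.to_string());
    let rule = |s: &str| Element::RuleRef(s.to_string());
    let mut g = Grammar::new();
    g.insert("keyword".into(), vec![vec![tok("K_ABORT")], vec![tok("K_ADD")]]);
    g.insert("name".into(), vec![vec![tok("IDENTIFIER")], vec![rule("keyword")]]);
    g.insert(
        "any_name".into(),
        vec![vec![tok("IDENTIFIER")], vec![tok("("), rule("any_name"), tok(")")]],
    );
    g
}
```

With this fragment, `name` qualifies as single-token, while `any_name` does not because of its parenthesized recursive alternative, which matches the reason given above for the unchanged golden files.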

Replace the old depth-based prediction engine with an SLL-style DFA that tracks element positions independently per alternative. Each alternative maintains its own cursor into its element sequence, enabling disambiguation when different alternatives are at different positions after consuming a shared token.

Key changes:
- SllConfig struct tracks (alt_index, elements, pos) per alternative
- build_sll_node simulates token consumption across all configs
- Single-token RuleRefs advance position; multi-token RuleRefs produce opaque configs that force Backtrack (safe approximation)
- strip_dead_consume removes Consume nodes that lead to Backtrack
- Opaque configs are correctly included in all dispatch branches

Results: SQLite backtracking 122 → 98 (-20%), S-expression 1 → 1, JSON/Calculator 0 → 0.

All 92 Gale unit tests pass. 2 SQLite integration tests fail (EXISTS subquery) due to backtrack trial order; this will be fixed in a follow-up.

Also fixes unrelated clippy warnings (if_same_then_else, ptr_arg).

https://claude.ai/code/session_014FPnMv8SbHtLtKwvuUs8gs
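The per-alternative cursor idea can be illustrated with a small Rust sketch. It is a simplified model and not the PR's code: the `opaque` flag on `SllConfig`, the `Node` enum, and the `depth` parameter are assumptions, every RuleRef is treated as multi-token, and single-token RuleRef advancement, Consume nodes, and strip_dead_consume are left out.

```rust
use std::collections::BTreeMap;

#[derive(Clone, Debug, PartialEq)]
enum Element {
    Token(String),
    RuleRef(String), // treated as multi-token (opaque) in this sketch
}

// One cursor per alternative. The `opaque` flag is an assumption of this sketch.
#[derive(Clone, Debug)]
struct SllConfig {
    alt_index: usize,
    elements: Vec<Element>,
    pos: usize,
    opaque: bool,
}

#[derive(Debug, PartialEq)]
enum Node {
    Alt(usize),                     // prediction resolved to one alternative
    Dispatch(BTreeMap<String, Node>), // branch on the next token
    Backtrack(Vec<usize>),          // give up statically; try these alternatives
}

fn build_sll_node(configs: &[SllConfig], depth: usize) -> Node {
    let mut alts: Vec<usize> = configs.iter().map(|c| c.alt_index).collect();
    alts.sort();
    alts.dedup();
    if alts.len() == 1 {
        return Node::Alt(alts[0]);
    }
    // An opaque config cannot be simulated further, so the group backtracks.
    if depth == 0 || configs.iter().any(|c| c.opaque) {
        return Node::Backtrack(alts);
    }
    let mut by_token: BTreeMap<String, Vec<SllConfig>> = BTreeMap::new();
    let mut opaque: Vec<SllConfig> = Vec::new();
    for c in configs {
        match c.elements.get(c.pos) {
            // Fixed token: advance only this alternative's cursor.
            Some(Element::Token(t)) => by_token
                .entry(t.clone())
                .or_default()
                .push(SllConfig { pos: c.pos + 1, ..c.clone() }),
            // RuleRef or end of sequence: mark the config opaque.
            _ => opaque.push(SllConfig { opaque: true, ..c.clone() }),
        }
    }
    if by_token.is_empty() {
        return Node::Backtrack(alts);
    }
    let mut branches = BTreeMap::new();
    for (tok, mut group) in by_token {
        // Opaque configs may start with any token, so they join every branch.
        group.extend(opaque.iter().cloned());
        branches.insert(tok, build_sll_node(&group, depth - 1));
    }
    // A real generator would also need a default branch for unlisted tokens
    // when opaque configs exist; that is omitted here.
    Node::Dispatch(branches)
}
```

Tracking `pos` per config rather than one global depth lets each alternative sit at a different point in its own element sequence within the same branch. Forcing Backtrack whenever a group contains an opaque config is the conservative choice: the result may be slower but never mispredicts.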

…iction

Fix a pre-existing bug where EXISTS/NOT EXISTS subqueries were parsed as function calls instead of subquery expressions. The root cause was twofold:

1. sll_closure skipped nullable elements (e.g., Optional(K_NOT)), causing their FIRST tokens to be lost from the prediction. This meant bt_21 ((NOT)? EXISTS '(' select_stmt ')') wasn't included in the EXISTS token dispatch branch.
2. Backtrack trial order: alternatives starting with fixed tokens (like EXISTS) should be tried before alternatives starting with RuleRefs (like function_name) since they are more specific.

Changes:
- Remove nullable skipping from sll_closure; use first_of_elements_from in sll_config_first to naturally include nullable tokens in FIRST sets
- Add explicit nullable handling in sll_advance_inner: when at a nullable element, try both matching it and skipping past all consecutive nullables (non-recursive to avoid explosion)
- Raise alt_sort_priority for multi-element alternatives starting with fixed tokens (priority 4) over those starting with RuleRefs (priority 2)

Also fixes unrelated clippy warnings (if_same_then_else, ptr_arg).

Results: 1283 tests pass (0 failures), SQLite backtracking 98 → 104 (slight increase due to nullable configs producing more opaque paths).

https://claude.ai/code/session_014FPnMv8SbHtLtKwvuUs8gs
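The two fixes can be illustrated with a short Rust sketch. The function names come from the commit message, but the bodies, the `Element` type, and the signatures are assumptions. In particular, the sketch does not expand rule references, it looks through a leading `Optional` when ranking (a guess), the fallback priority 0 is a placeholder, and it assumes a higher priority means an alternative is tried earlier.

```rust
use std::collections::BTreeSet;

// Hypothetical element model for illustration only.
#[derive(Clone, Debug)]
enum Element {
    Token(String),
    RuleRef(String),
    Optional(Box<Element>),
}

/// FIRST tokens of `elements[from..]`, walking through nullable elements
/// instead of skipping them. Also reports whether the whole suffix is nullable.
fn first_of_elements_from(elements: &[Element], from: usize) -> (BTreeSet<String>, bool) {
    let mut first = BTreeSet::new();
    for e in &elements[from.min(elements.len())..] {
        match e {
            Element::Token(t) => {
                first.insert(t.clone());
                return (first, false); // a fixed token ends the walk
            }
            // This sketch does not expand rule references; a real
            // implementation would add FIRST(rule) here.
            Element::RuleRef(_) => return (first, false),
            Element::Optional(inner) => {
                // Include the optional element's tokens, then keep walking.
                let (f, _) = first_of_elements_from(std::slice::from_ref(&**inner), 0);
                first.extend(f);
            }
        }
    }
    (first, true)
}

/// Trial priority for backtracking. The values 4 and 2 are from the commit
/// message; the fallback 0 is a placeholder.
fn alt_sort_priority(alt: &[Element]) -> u8 {
    // Assumption: a leading Optional is ranked by the element it wraps.
    fn head(e: &Element) -> &Element {
        match e {
            Element::Optional(inner) => head(inner),
            other => other,
        }
    }
    match alt.first().map(head) {
        Some(Element::Token(_)) if alt.len() > 1 => 4,
        Some(Element::RuleRef(_)) if alt.len() > 1 => 2,
        _ => 0,
    }
}

/// Order alternatives so that higher-priority ones are tried first (stable sort).
fn sort_for_backtracking(alts: &mut Vec<Vec<Element>>) {
    alts.sort_by_key(|a| std::cmp::Reverse(alt_sort_priority(a)));
}
```

For an alternative shaped like `(NOT)? EXISTS '(' select_stmt ')'`, the FIRST set in this sketch contains both `NOT` and `EXISTS`, and the alternative sorts ahead of one that begins with a `function_name` rule reference.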


claude added 6 commits March 27, 2026 13:33

merge origin/main (conflicts unresolved)

10b1917

gfx enabled auto-merge March 27, 2026 23:22

claude added 2 commits March 27, 2026 23:28

resolve merge conflicts

3ece7dd

Run format-wado on parser_gen.wado

20b3610

https://claude.ai/code/session_014FPnMv8SbHtLtKwvuUs8gs

gfx merged commit f332d0e into main Mar 27, 2026
9 of 10 checks passed

gfx deleted the claude/remove-parser-backtracking-ogyIs branch March 27, 2026 23:50

gfx merged 8 commits into main from claude/remove-parser-backtracking-ogyIs

2 participants
