Fix InexactError in peek_behind_pos when skipping nested trivia nodes #604

KristofferC · 2025-11-08T20:30:38Z

Explanation of bug written below by Claude :robot:. Fixes JuliaLang/julia#60084

When parsing certain incomplete keywords like "do", the parser would crash
with InexactError: trunc(UInt32, -1) instead of returning a ParseError.

Root cause:

The peek_behind_pos function walks backwards through the flat output array,
subtracting byte_spans to calculate byte positions. However, when a non-terminal
trivia node contains child nodes, both the parent and children exist in the
output array with overlapping byte_spans. The old code would subtract the
byte_span of both the parent and its children, causing byte_idx to go negative.

Example with input "do":

The parser creates this output structure:
[1] TOMBSTONE (byte_span=0)
[2] "do" token (byte_span=2, is_trivia=true, covers bytes 1-2)
[3] error node (byte_span=2, is_trivia=true, node_span=1, covers bytes 1-2)
└── contains node [2] as a child

When peek_behind_pos walks backwards from next_byte=3:
Old behavior:
- Start: byte_idx = 3, node_idx = 3
- Skip error node [3]: byte_idx = 3 - 2 = 1, node_idx = 2
- Skip "do" node [2]: byte_idx = 1 - 2 = -1 ❌ (tries to convert to UInt32)

The problem: Both nodes cover the same bytes (1-2), so subtracting both
spans double-counts the same 2 bytes.
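
The double-counting can be reproduced outside the parser with a small model. This is only a sketch: the `Node` struct, its field layout, and `old_walk` are hypothetical stand-ins for illustration and are not the actual JuliaSyntax types or the real `peek_behind_pos`. It assumes, as in the walkthrough above, that the walk stops at the first non-trivia node.

```julia
# Hypothetical, simplified stand-in for an entry in the parser's flat output array.
struct Node
    byte_span::Int    # bytes covered by this node (including its children)
    node_span::Int    # number of child entries stored before this node; 0 for tokens
    is_trivia::Bool
end

output = [
    Node(0, 0, false),  # [1] TOMBSTONE
    Node(2, 0, true),   # [2] "do" token
    Node(2, 1, true),   # [3] error node wrapping [2]
]

# Old walk (simplified): every skipped trivia entry subtracts its byte_span,
# even when it is a child whose bytes the parent already accounted for.
function old_walk(output, next_byte)
    byte_idx = next_byte
    node_idx = length(output)
    while node_idx > 0 && output[node_idx].is_trivia
        byte_idx -= output[node_idx].byte_span
        node_idx -= 1
    end
    return byte_idx, node_idx
end

byte_idx, node_idx = old_walk(output, 3)

# byte_idx has gone negative, so narrowing it to UInt32 throws.
threw = try
    UInt32(byte_idx)
    false
catch err
    err isa InexactError
end
```

In this model the walk ends with a negative `byte_idx`, and the `UInt32` conversion is the step that fails.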

Solution:

When skipping a non-terminal node, also skip all its children without
subtracting their byte_spans, since the parent's byte_span already includes
them:

if is_non_terminal(node)
    node_idx -= (1 + node.node_span)  # Skip the node + all its children
else
    node_idx -= 1
end

New behavior:

- Start: byte_idx = 3, node_idx = 3
- Skip error node [3]: byte_idx = 3 - 2 = 1, node_idx = 3 - (1 + 1) = 1
- Stop at TOMBSTONE [1] (not trivia)
- Return: byte_idx = 1, node_idx = 1 ✓
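
A minimal, self-contained sketch of the corrected walk on the same three-entry example follows. The `Node` struct, `new_walk`, and this definition of `is_non_terminal` (treating any entry with `node_span > 0` as non-terminal) are assumptions made for illustration; the real function in the parser has more conditions and a different signature.

```julia
# Hypothetical, simplified stand-in for an entry in the parser's flat output array.
struct Node
    byte_span::Int    # bytes covered by this node (including its children)
    node_span::Int    # number of child entries stored before this node; 0 for tokens
    is_trivia::Bool
end

# Assumption for this sketch: an entry with children is a non-terminal.
is_non_terminal(node::Node) = node.node_span > 0

output = [
    Node(0, 0, false),  # [1] TOMBSTONE
    Node(2, 0, true),   # [2] "do" token
    Node(2, 1, true),   # [3] error node wrapping [2]
]

# Fixed walk (simplified): subtract the parent's byte_span once, then jump
# over the parent and all of its children in a single step.
function new_walk(output, next_byte)
    byte_idx = next_byte
    node_idx = length(output)
    while node_idx > 0 && output[node_idx].is_trivia
        node = output[node_idx]
        byte_idx -= node.byte_span
        if is_non_terminal(node)
            node_idx -= (1 + node.node_span)  # skip the node + all its children
        else
            node_idx -= 1
        end
    end
    return UInt32(byte_idx), node_idx
end

result = new_walk(output, 3)
```

Because the children are skipped without touching `byte_idx`, the value stays non-negative and the `UInt32` conversion succeeds.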

Why "do" was unique:

The "do" keyword is the only Julia keyword that:

1. Gets marked as TRIVIA when appearing standalone (invalid syntax)
2. Has an error node emitted that wraps it (also marked as TRIVIA)
3. Continues parsing afterward, eventually calling peek_behind

Other incomplete keywords either parse successfully or fail earlier before
reaching the code path that calls peek_behind.

Keno

The fix seems non-controversial, although I'm not sure that non-terminal trivia are that good an idea, but that's a bit of a separate question. Given that we have them right now, this is the right fix.

KristofferC requested a review from Keno November 8, 2025 20:30

KristofferC mentioned this pull request Nov 8, 2025

typing do in the repl causes Error: Error in the keymap JuliaLang/julia#60084

Closed

Keno approved these changes Nov 9, 2025

KristofferC merged commit e78a222 into main Nov 9, 2025
36 checks passed

KristofferC deleted the kc/do_parse branch November 9, 2025 19:45
