Commit ab92d47

cscheid and claude authored
Bugfix for pipe table delimiters (#48)
* Fix pipe table parsing with code spans containing pipes (#29)

  Added code span recognition to pipe table cell parsing. The block grammar now properly handles backtick code spans like `|` within table cells by repeating the inline grammar's code span parsing logic in the block context.

  Changes:
  - Added CODE_SPAN_START and CODE_SPAN_CLOSE external tokens to block grammar
  - Implemented parse_code_span() in block scanner with lookahead for matching delimiters
  - Added _pipe_table_code_span rule to parse code spans within table cells
  - Updated scanner state serialization to track code span delimiter length

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <[email protected]>

* Add tests for pipe tables with code spans containing pipes (issue #29)

  These tests verify that the parser correctly handles code spans containing pipe characters within pipe tables. The fix in the block parser now properly parses code spans to avoid treating pipes inside backticks as table delimiters.

  Test cases cover:
  - Simple code span with single pipe
  - Multiple code spans with pipes in different cells
  - Mixed backtick delimiters (double and triple backticks)

  All tests pass and match Pandoc's output.

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <[email protected]>

* Fix pipe table parsing with latex spans containing pipes

  This commit applies the same technique used for code spans (issue #29) to fix latex spans (inline math with $ delimiters) in pipe tables.

  Changes:
  - Added LATEX_SPAN_START and LATEX_SPAN_CLOSE external tokens
  - Implemented parse_latex_span() in scanner.c to handle dollar sign delimiters within pipe table cells
  - Added _pipe_table_latex_span grammar rule
  - Fixed token check ordering in scan() to prioritize latex span parsing over display math state tracking when inside pipe table cells
  - Added tree-sitter test case for latex spans with pipes
  - Added end-to-end tests in quarto-markdown-pandoc crate

  The fix ensures that pipes inside latex spans (e.g., $|$) are not treated as table cell delimiters, matching Pandoc's behavior.

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <[email protected]>

---------

Co-authored-by: Claude <[email protected]>
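The behavior described in the commit message can be illustrated outside the parser. The following is a minimal sketch in plain JavaScript, not the code in scanner.c: `runLength`, `findClose`, and `splitPipeTableRow` are hypothetical names, and the sketch assumes single `$` delimiters only (display math with `$$` is not handled). It shows the lookahead idea: an opening delimiter makes the span opaque to cell splitting only if a closing run of the same length exists later in the row.

```javascript
// Hypothetical illustration, not the repository's scanner: split one
// pipe-table row into cells, ignoring pipes inside code spans and $...$ spans.

// Length of the run of `ch` starting at index `i`.
function runLength(text, i, ch) {
  let n = 0;
  while (text[i + n] === ch) n++;
  return n;
}

// Look ahead from `from` for a closing run of exactly `len` copies of `ch`.
// Returns the index just past the closer, or -1 if there is none.
function findClose(text, from, ch, len) {
  let i = from;
  while (i < text.length) {
    if (text[i] === ch) {
      const n = runLength(text, i, ch);
      if (n === len) return i + n;
      i += n; // a run of the wrong length is ordinary span content
    } else {
      i++;
    }
  }
  return -1;
}

function splitPipeTableRow(row) {
  const cells = [];
  let current = "";
  let i = 0;
  while (i < row.length) {
    const ch = row[i];
    if (ch === "\\" && i + 1 < row.length) {
      // Backslash escape: keep both characters and never split here.
      current += row.slice(i, i + 2);
      i += 2;
    } else if (ch === "`" || ch === "$") {
      // Backtick delimiters may be runs; this sketch treats `$` as length 1.
      const len = ch === "`" ? runLength(row, i, ch) : 1;
      const end = findClose(row, i + len, ch, len);
      // With a matching closer the whole span is copied verbatim;
      // without one the opener is ordinary text.
      const stop = end === -1 ? i + len : end;
      current += row.slice(i, stop);
      i = stop;
    } else if (ch === "|") {
      cells.push(current.trim());
      current = "";
      i++;
    } else {
      current += ch;
      i++;
    }
  }
  cells.push(current.trim());
  // Drop the empty pieces produced by the leading and trailing pipes.
  if (cells.length && cells[0] === "") cells.shift();
  if (cells.length && cells[cells.length - 1] === "") cells.pop();
  return cells;
}
```

Applied to the fixture rows added in this commit, `splitPipeTableRow` keeps a cell such as `` `a|b|c` `` or `$x | y$` in one piece, while an unmatched backtick falls back to ordinary text and the following pipe still ends the cell.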
1 parent dfdcca9 commit ab92d47

11 files changed: +11517 -9042 lines changed

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+| a | b |
+|---|---|
+| `|` | oh no |
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+| a | b |
+|---|---|
+| $|$ | math |
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+| Test | Description |
+|------|-------------|
+| `` ` `` | backtick in code |
+| ``` | ``` | pipe in triple backtick code |
+| `a|b|c` | multiple pipes |
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+| Test | Description |
+|------|-------------|
+| `|` | backtick code with pipe |
+| $|$ | latex span with pipe |
+| `a|b` | more code |
+| $x|y$ | more math |
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+| Column 1 | Column 2 | Column 3 |
+|----------|----------|----------|
+| `|` | normal text | `a|b` |
+| regular | `||` | more text |
+| `x | y` | text | `|>` |
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+| Column 1 | Column 2 | Column 3 |
+|----------|----------|----------|
+| $|$ | normal text | $a|b$ |
+| regular | $||$ | more text |
+| $x | y$ | text | $|>$ |

crates/tree-sitter-qmd/tree-sitter-markdown/grammar.js

Lines changed: 38 additions & 4 deletions
@@ -454,21 +454,47 @@ module.exports = grammar({
             ),
         ),
 
+        // Code span within pipe table cells - simplified version that only handles backticks
+        _pipe_table_code_span: $ => seq(
+            $._code_span_start,
+            repeat(choice(
+                $._word,
+                $._whitespace,
+                common.punctuation_without($, []),
+            )),
+            $._code_span_close,
+        ),
+
+        // Latex span within pipe table cells - simplified version that only handles dollar signs
+        _pipe_table_latex_span: $ => seq(
+            $._latex_span_start,
+            repeat(choice(
+                $._word,
+                $._whitespace,
+                common.punctuation_without($, []),
+            )),
+            $._latex_span_close,
+        ),
+
         _pipe_table_cell_contents: $ => prec.right(
             seq(
                 choice(
                     $._word,
-                    $._display_math_state_track_marker,
-                    $._inline_math_state_track_marker,
+                    $._display_math_state_track_marker,
+                    $._inline_math_state_track_marker,
                     $._backslash_escape,
+                    $._pipe_table_code_span,
+                    $._pipe_table_latex_span,
                     common.punctuation_without($, ['|']),
                 ),
                 repeat(choice(
                     $._word,
-                    $._display_math_state_track_marker,
-                    $._inline_math_state_track_marker,
+                    $._display_math_state_track_marker,
+                    $._inline_math_state_track_marker,
                     $._whitespace,
                     $._backslash_escape,
+                    $._pipe_table_code_span,
+                    $._pipe_table_latex_span,
                     common.punctuation_without($, ['|']),
                 )))),
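
In this hunk the two span rules call `common.punctuation_without($, [])`, while the cell rule calls it with `['|']`, so a bare pipe is accepted only between a span's start and close tokens. The sketch below models that exclusion list with plain characters. It is an assumption for illustration only: `punctuationWithout` and `PUNCTUATION` are hypothetical names, and the real helper in the repository returns tree-sitter rules rather than strings and may be implemented differently.

```javascript
// Hypothetical stand-in for common.punctuation_without: returns the ASCII
// punctuation characters that remain after removing the excluded ones.
const PUNCTUATION = "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~".split("");

function punctuationWithout(excluded) {
  return PUNCTUATION.filter((ch) => !excluded.includes(ch));
}

// Directly inside a cell: '|' is excluded, so it can only end the cell.
const cellLevel = punctuationWithout(["|"]);

// Inside a code or latex span: nothing is excluded, so '|' is plain content.
const spanLevel = punctuationWithout([]);
```

Under this model the only difference between the two contexts is whether `|` is in the allowed set, which is why the pipe in `` `|` `` no longer terminates the cell once the span rule matches.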

@@ -569,6 +595,14 @@ module.exports = grammar({
         // special tokens to allow external scanner serialization to happen
         $._display_math_state_track_marker,
         $._inline_math_state_track_marker,
+
+        // code span delimiters for parsing pipe table cells
+        $._code_span_start,
+        $._code_span_close,
+
+        // latex span delimiters for parsing pipe table cells
+        $._latex_span_start,
+        $._latex_span_close,
     ],
     precedences: $ => [
         [$._setext_heading1, $._block],
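
Because the start and close tokens are produced by separate scanner calls, the scanner has to remember the length of the opening backtick run in between, which is why the commit message mentions serializing the code span delimiter length. The following is a rough model of that idea in JavaScript, not the C implementation: the one-byte layout and the names `serialize`, `deserialize`, and `classifyBacktickRun` are assumptions made for this sketch, and `hasMatchingCloser` stands for the result of the lookahead.

```javascript
// Hypothetical model of the scanner state; the real state lives in scanner.c.

// Save the state into a byte buffer (length clamped to fit one byte).
function serialize(state) {
  return Uint8Array.of(Math.min(state.codeSpanDelimiterLength, 255));
}

// Restore the state; an empty buffer means "not inside a code span".
function deserialize(bytes) {
  return { codeSpanDelimiterLength: bytes.length > 0 ? bytes[0] : 0 };
}

// Decide what a backtick run of `runLength` means given the current state.
function classifyBacktickRun(state, runLength, hasMatchingCloser) {
  if (state.codeSpanDelimiterLength === runLength) {
    // Same length as the recorded opener: this run closes the span.
    state.codeSpanDelimiterLength = 0;
    return "code_span_close";
  }
  if (state.codeSpanDelimiterLength === 0 && hasMatchingCloser) {
    // Not inside a span and a closer exists further on: open a span.
    state.codeSpanDelimiterLength = runLength;
    return "code_span_start";
  }
  // Otherwise the run is ordinary text.
  return null;
}
```

Round-tripping the state through `serialize` and `deserialize` between the start and close calls keeps the recorded length, so a run of a different length inside the span is still treated as content.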
