@cscheid cscheid commented Oct 14, 2025

Implements support for Pandoc-style table captions that follow pipe tables. Captions are now parsed in the tree-sitter grammar itself rather than detected afterwards in the postprocessor.

Table caption syntax:

  • Blank line followed by ": caption text"
  • Caption must appear immediately after a pipe table
  • Only the ': caption' format is supported
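The syntax rule above can be illustrated with a minimal Rust sketch. This is not the tree-sitter grammar from this PR: `caption_after_pipe_table` is a hypothetical helper, and it assumes (for simplicity) that every pipe-table row starts with `|`, which real pipe tables do not require.

```rust
/// Hypothetical helper, not part of this PR: returns the caption text when
/// `input` ends a pipe table with a blank line followed by `: caption text`.
/// Assumption: a table row is any line whose first non-space character is `|`.
fn caption_after_pipe_table(input: &str) -> Option<&str> {
    let lines: Vec<&str> = input.lines().collect();
    // Index of the last line that looks like a pipe-table row.
    let last_row = lines
        .iter()
        .rposition(|line| line.trim_start().starts_with('|'))?;
    // The row must be followed by a blank line and then the caption line.
    let blank: &str = lines.get(last_row + 1).copied()?;
    let caption: &str = lines.get(last_row + 2).copied()?;
    if !blank.trim().is_empty() {
        return None;
    }
    // Only the `: caption` form is recognized.
    let text = caption.strip_prefix(':')?.trim();
    if text.is_empty() {
        None
    } else {
        Some(text)
    }
}
```

A `: caption` line that directly follows the table without a blank line, or that follows ordinary text, is not treated as a caption in this sketch.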

Implementation approach:

  • Added table_caption rule to tree-sitter grammar as child of pipe_table
  • Table captions are parsed at grammar level, not in postprocessor
  • Reverted _line rule to disallow paragraphs starting with ':' (restores safety check for detecting accidental fenced div continuation)

Changes:

  • Modified tree-sitter-markdown/grammar.js to add table_caption rule
  • Updated pipe_table processor to extract caption from parse tree
  • Added table_caption handler in treesitter.rs dispatcher
  • Removed postprocessor caption detection logic and with_blocks filter
  • Removed extract_table_caption function from postprocess.rs
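To illustrate what "extract caption from parse tree" might look like, here is a rough sketch using a toy node type. `Node` and `extract_caption` are hypothetical stand-ins, not the tree-sitter API or the code in treesitter.rs; only the node kinds `pipe_table` and `table_caption` come from the description above.

```rust
/// Toy stand-in for a parse-tree node (hypothetical; not the tree-sitter API).
#[derive(Debug)]
struct Node {
    kind: &'static str,
    text: String,
    children: Vec<Node>,
}

/// Returns the text of the first `table_caption` child of a `pipe_table` node.
/// Any other node kind yields `None`, so a stray `: text` line elsewhere in
/// the tree is never mistaken for a caption.
fn extract_caption(table: &Node) -> Option<&str> {
    if table.kind != "pipe_table" {
        return None;
    }
    table
        .children
        .iter()
        .find(|child| child.kind == "table_caption")
        .map(|child| child.text.as_str())
}
```

Because the caption is a child of the table node, the lookup is local to that node and no scan over sibling blocks is needed.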

The output matches Pandoc's JSON AST structure: the caption is attached to the table's caption.long field as a Plain block.
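As a sketch of that shape, the types below mirror Pandoc's caption model (an optional short caption plus a list of blocks as the long caption). The type and function names are illustrative assumptions, not the project's actual definitions, and the whitespace-only tokenization stands in for real inline parsing.

```rust
/// Minimal inline and block types, assumed for illustration only.
#[derive(Debug, PartialEq)]
enum Inline {
    Str(String),
    Space,
}

#[derive(Debug, PartialEq)]
enum Block {
    Plain(Vec<Inline>),
}

/// Mirrors Pandoc's caption: optional short form, long form as blocks.
#[derive(Debug, PartialEq)]
struct Caption {
    short: Option<Vec<Inline>>,
    long: Vec<Block>,
}

/// Hypothetical constructor: wraps the caption text in a single Plain block.
/// Words become `Str` inlines separated by `Space`; no markup is interpreted.
fn caption_from_text(text: &str) -> Caption {
    let mut inlines = Vec::new();
    for (i, word) in text.split_whitespace().enumerate() {
        if i > 0 {
            inlines.push(Inline::Space);
        }
        inlines.push(Inline::Str(word.to_string()));
    }
    Caption {
        short: None,
        long: vec![Block::Plain(inlines)],
    }
}
```

The short caption stays `None` here because the `: caption` syntax provides only one caption text.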

🤖 Generated with Claude Code

cscheid and others added 2 commits October 14, 2025 14:07
Benefits:
- Table captions only recognized in correct context (after pipe tables)
- Restored paragraph safety check prevents malformed documents
- Cleaner separation between grammar structure and AST transformation
- Better performance by eliminating block scanning in postprocessor

Co-Authored-By: Claude <[email protected]>
@cscheid cscheid merged commit b248a0c into main Oct 14, 2025
2 checks passed