
refactor(markdown-parser): promote fenced code block skipped trivia to explicit CST nodes #9321

Open
jfmcdowell wants to merge 5 commits into biomejs:main from jfmcdowell:refactor/fenced-code-block-prefix


@jfmcdowell jfmcdowell commented Mar 3, 2026

Note

AI Assistance Disclosure: This PR was developed with assistance from Claude Code.

Summary

  • Add MdIndentToken to AnyMdInline in the grammar for fence indent stripping tokens.
  • Replace 4 parse_as_skipped_trivia_tokens() call sites in fenced_code_block.rs with explicit CST node emission:
    • Sites 1-3: Blockquote > prefixes on continuation lines within fenced code blocks now emit MdQuotePrefix nodes (with MdQuoteIndentList, marker, and optional post-marker space).
    • Site 4: Fence indent stripping per CommonMark §4.5 now emits MdIndentToken nodes with MD_INDENT_CHAR tokens.
  • Add MdIndentToken no-op arm in to_html.rs extract_alt_text_inline exhaustive match.
  • Regenerate codegen output (biome_markdown_syntax, biome_markdown_formatter).
  • Add error fixture fenced_code_in_blockquote.md documenting pre-existing limitation where fenced code blocks inside blockquotes produce unterminated fence diagnostics.
  • Update fenced_code_advanced.md snapshot to reflect new CST shape.

Continues the skipped trivia promotion series (#9219, #9274, #9313). Sites 1-3 (quote prefixes in code content) are structurally correct but exercised only via the pre-existing blockquote+fenced-code path which has a known limitation — the error fixture documents current behavior until a follow-up fix lands.

No user-facing behavior change. Parsed semantics are preserved; only the internal CST representation changes.
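
For readers unfamiliar with the rule behind Site 4, the CommonMark §4.5 indent-stripping behaviour can be sketched at the string level. This is only an illustration, not the parser code from this PR: `strip_fence_indent` is a hypothetical helper, and in the actual change the removed spaces are emitted as `MdIndentToken` nodes rather than dropped.

```rust
/// Hypothetical helper (not part of Biome): remove up to `fence_indent`
/// leading spaces from one content line of a fenced code block, as described
/// in CommonMark §4.5. A line with less indentation than the opening fence
/// simply loses whatever leading spaces it has.
fn strip_fence_indent(line: &str, fence_indent: usize) -> &str {
    // Count leading spaces, but never more than the opening fence's indent.
    let removable = line
        .bytes()
        .take(fence_indent)
        .take_while(|&b| b == b' ')
        .count();
    // Spaces are single-byte ASCII, so `removable` is a valid char boundary.
    &line[removable..]
}
```

With an opening fence indented by two spaces, a content line indented by four keeps two spaces, while a line indented by one keeps none.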

Test Plan

  • just test-crate biome_markdown_parser
  • just test-markdown-conformance
  • just f && just l

Docs

N/A — internal structural change, no new user-facing features.


changeset-bot bot commented Mar 3, 2026

⚠️ No Changeset found

Latest commit: 27f50ef

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


github-actions bot added the A-Parser (Area: parser), A-Formatter (Area: formatter), and A-Tooling (Area: internal tools) labels on Mar 3, 2026
…o explicit CST nodes

Replace 4 parse_as_skipped_trivia_tokens() call sites in fenced_code_block.rs:
- Sites 1-3: blockquote > prefixes on continuation lines emit MdQuotePrefix nodes
- Site 4: fence indent stripping emits MdIndentToken nodes

Add MdIndentToken to AnyMdInline in the grammar and regenerate codegen.
Add MdIndentToken no-op arm in to_html.rs extract_alt_text_inline.
Add error fixture documenting pre-existing fenced-code-in-blockquote limitation.
Extract try_bump_quote_marker as pub(crate) to deduplicate marker-bumping logic.
jfmcdowell force-pushed the refactor/fenced-code-block-prefix branch from ccff014 to 041e82b on March 4, 2026 01:51
jfmcdowell marked this pull request as ready for review on March 4, 2026 02:26

coderabbitai bot commented Mar 4, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 88acaddf-87cc-4ff1-9ab2-64d146bf5952

📥 Commits

Reviewing files that changed from the base of the PR and between 582f51a and 870df65.

⛔ Files ignored due to path filters (1)
  • crates/biome_markdown_parser/tests/md_test_suite/ok/fenced_code_in_blockquote.md.snap is excluded by !**/*.snap and included by **
📒 Files selected for processing (3)
  • crates/biome_markdown_parser/src/syntax/fenced_code_block.rs
  • crates/biome_markdown_parser/tests/md_test_suite/ok/fenced_code_in_blockquote.md
  • crates/biome_markdown_parser/tests/spec_test.rs

Walkthrough

This PR introduces MdIndentToken support throughout the markdown parser, formatter, and HTML extraction pipeline. It refactors fenced code block parsing with a structured, stateful approach using new helper functions (prepare_next_code_content_token, consume_quote_prefixes_in_code_content). The quote parsing logic is significantly expanded with virtual-line-start handling, indentation calculation refinements, and modularised block parsing control flow. Test coverage for fenced code blocks within blockquotes is added.

Possibly related PRs

Suggested reviewers

  • ematipico
  • dyc3
🚥 Pre-merge checks | ✅ 2
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately describes the main structural refactoring: promoting previously-skipped trivia (indentation tokens in fenced code blocks) to explicit CST nodes, which is the core change across multiple files. |
| Description check | ✅ Passed | The description is comprehensive and directly related to the changeset, detailing the grammar additions, four specific refactoring sites, codegen regeneration, test fixtures, and the broader skipped-trivia promotion series this work continues. |



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/biome_markdown_parser/src/syntax/fenced_code_block.rs`:
- Around line 370-373: The call to try_bump_quote_marker(p) is inside
debug_assert! so it is skipped in release builds and the parser state won't be
updated; replace the debug_assert! invocation with an unconditional call to
try_bump_quote_marker(p) (so the marker is always consumed) and keep an optional
debug-only check if desired (e.g., call try_bump_quote_marker(p) and then
debug_assert!(result, "guard above guarantees marker present")); update the code
around the debug_assert! to call try_bump_quote_marker(p) unconditionally and
handle a false result only via debug assertion or by panicking with the same
message.
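
The pitfall described above is a general Rust one: the argument of `debug_assert!` is not evaluated when debug assertions are disabled, so any side effect inside it vanishes in release builds. A minimal sketch follows, assuming a toy parser; `ToyParser` and both `bump_marker_*` functions are hypothetical, and `try_bump_quote_marker` here only borrows the name from the PR, not its real signature or body.

```rust
/// Hypothetical stand-in for the parser; not Biome's actual parser type.
struct ToyParser<'a> {
    src: &'a [u8],
    pos: usize,
}

/// Consume a `>` marker if one sits at the current position.
/// Returns whether a marker was consumed. (Assumed shape, for illustration.)
fn try_bump_quote_marker(p: &mut ToyParser<'_>) -> bool {
    if p.src.get(p.pos) == Some(&b'>') {
        p.pos += 1;
        true
    } else {
        false
    }
}

/// Problematic shape: with debug assertions off, the call inside
/// `debug_assert!` is never executed, so the marker is not consumed.
#[allow(dead_code)]
fn bump_marker_buggy(p: &mut ToyParser<'_>) {
    debug_assert!(try_bump_quote_marker(p), "guard above guarantees marker present");
}

/// Suggested shape: perform the side effect unconditionally, then assert on
/// the stored result in debug builds only.
fn bump_marker_fixed(p: &mut ToyParser<'_>) {
    let bumped = try_bump_quote_marker(p);
    debug_assert!(bumped, "guard above guarantees marker present");
}
```

The fixed variant advances the position in every build profile, while the buggy variant only does so when debug assertions are enabled.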

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 02a409d7-7cbc-4b18-996b-0f5447559e8c

📥 Commits

Reviewing files that changed from the base of the PR and between 1022662 and 8d084b9.

⛔ Files ignored due to path filters (3)
  • crates/biome_markdown_parser/tests/md_test_suite/error/fenced_code_in_blockquote.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/fenced_code_advanced.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_syntax/src/generated/nodes.rs is excluded by !**/generated/**, !**/generated/** and included by **
📒 Files selected for processing (6)
  • crates/biome_markdown_formatter/src/markdown/any/inline.rs
  • crates/biome_markdown_parser/src/syntax/fenced_code_block.rs
  • crates/biome_markdown_parser/src/syntax/quote.rs
  • crates/biome_markdown_parser/src/to_html.rs
  • crates/biome_markdown_parser/tests/md_test_suite/error/fenced_code_in_blockquote.md
  • xtask/codegen/markdown.ungram


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/biome_markdown_parser/src/syntax/fenced_code_block.rs`:
- Around line 314-335: The code currently sets at_line_start = false immediately
after consume_quote_prefixes_in_code_content, which prevents the later
fence-indent stripping block (skip_fenced_content_indent and at_closing_fence)
from running for lines inside blockquotes; update the loop in
fenced_code_block.rs so fence-indent stripping runs after quote prefix
consumption: after calling consume_quote_prefixes_in_code_content (function
name) and before or regardless of resetting at_line_start, call
skip_fenced_content_indent when fence_indent > 0 and then re-check
at_closing_fence (function name) — or alternatively handle blockquote-nested
indentation explicitly by adding a branch that strips fence_indent even when
at_line_start was just true and quote prefixes were consumed; ensure
CodeContentLoopAction semantics and the at_line_start flag are preserved.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7e51518c-9db4-4808-9efb-2a5a282df782

📥 Commits

Reviewing files that changed from the base of the PR and between 8d084b9 and 582f51a.

📒 Files selected for processing (1)
  • crates/biome_markdown_parser/src/syntax/fenced_code_block.rs

Use virtual_line_start in line_has_closing_fence so fence detection
starts after consumed quote prefixes instead of seeing `>` as
non-whitespace. Set virtual_line_start after quote prefix consumption
and allow fence-indent stripping to run on blockquote lines.
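
The idea in this commit message can be illustrated with a string-level sketch. The function below is hypothetical (its name and signature are assumptions, not the real `line_has_closing_fence`); it applies the CommonMark closing-fence rules starting from a byte offset that plays the role of `virtual_line_start`.

```rust
/// Hypothetical sketch: does `line`, read from byte offset
/// `virtual_line_start`, consist of a closing fence? Per CommonMark, a
/// closing fence may be indented by up to three spaces, must use the same
/// character as the opening fence, must be at least as long, and may be
/// followed only by spaces or tabs.
fn line_has_closing_fence_from(
    line: &str,
    virtual_line_start: usize,
    fence_char: char,
    fence_len: usize,
) -> bool {
    // An out-of-range or non-boundary offset cannot start a fence.
    let Some(rest) = line.get(virtual_line_start..) else {
        return false;
    };
    let after_indent = rest.trim_start_matches(' ');
    if rest.len() - after_indent.len() > 3 {
        return false;
    }
    // Length of the run of fence characters, counted in characters.
    let run = after_indent.chars().take_while(|&c| c == fence_char).count();
    let after_fence = &after_indent[run * fence_char.len_utf8()..];
    run >= fence_len && after_fence.chars().all(|c| c == ' ' || c == '\t')
}
```

Scanning `> ``` ` from offset 0 sees `>` as non-whitespace and finds no fence, whereas scanning from offset 2 (just past the consumed quote prefix) finds it.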