refactor(markdown-parser): promote list structural tokens from skipped trivia to explicit CST nodes by jfmcdowell · Pull Request #9274 · biomejs/biome

jfmcdowell · 2026-02-28T14:24:45Z

Note

AI Assistance Disclosure: This PR was developed with assistance from Claude Code.

Summary

Promote list structural tokens (pre-marker indent, marker, post-marker space, content indent) from skipped trivia to explicit CST nodes, mirroring the MdQuotePrefix pattern established in Phase 1.
Introduce MdListMarkerPrefix, MdIndentToken, and MdIndentTokenList to the markdown grammar, making list indentation structure visible and traversable in the CST.
Replace skip_list_marker_indent() (which discarded whitespace as trivia) with emit_indent_char_list() that wraps each indent character in a proper node.
Emit MD_LIST_POST_MARKER_SPACE as an explicit token instead of silently consuming it.
Remove trim_range() — a legacy workaround that tried to normalize node ranges by stripping leading/trailing whitespace. With structural tokens now in the CST, raw node ranges are correct by construction.
Fix to_html.rs to navigate the new CST shape (bullet.prefix().marker() instead of bullet.bullet()) and correctly handle leading newlines in list item rendering.
Add verbatim formatter stubs for all new node types.

Test Plan

cargo test -p biome_markdown_parser
just test-markdown-conformance

Results:

Parser tests pass (129 total, 0 failures)
CommonMark conformance passes (652/652, 100%)

Docs

N/A — internal parser refactor with no user-facing behavior change.

changeset-bot · 2026-02-28T14:24:49Z

⚠️ No Changeset found

Latest commit: eeedde8

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

…d trivia to explicit CST nodes Introduce MdListMarkerPrefix to wrap list marker structure (pre-marker indent, marker, post-marker space, content indent) as real CST nodes instead of skipped trivia. This mirrors the MdQuotePrefix pattern from Phase 1 and makes list structure visible to the formatter harness. - Add MdListMarkerPrefix, MdIndentToken, MdIndentTokenList to grammar - Replace skip_list_marker_indent() with emit_indent_char_list() - Emit MD_LIST_POST_MARKER_SPACE as an explicit token - Promote marker-only line newline to MdNewline node - Remove trim_range(); use raw node ranges for metadata recording - Fix to_html.rs renderer to handle new CST shape correctly - Add verbatim formatter stubs for new node types

coderabbitai · 2026-02-28T15:48:45Z

Walkthrough

This pull request introduces structured representation of Markdown indentation and list marker prefixes. The parser changes convert previously skipped range trivia into explicit CST nodes (MdIndentToken, MdListMarkerPrefix, MdIndentTokenList), refactoring list item parsing to emit these nodes directly. The grammar is updated to decompose list bullets into prefix structures containing pre-marker indent, marker, post-marker space, and content indent. Corresponding formatter implementations and HTML generation logic are updated to handle these new node types.

Possibly related PRs

fix(markdown-parser): promote blockquote prefix markers from skipped trivia to explicit CST nodes #9219: Promotes previously-skipped prefix trivia into explicit CST nodes with similar emission patterns and formatter rule registration approach
refactor(markdown): split list item parser #9167: Modifies list item parsing logic in the same crates/biome_markdown_parser/src/syntax/list.rs file with related refactoring of prefix and indentation handling
feat(formatter): set up boiletplate for markdown formatter #8962: Extends Markdown formatter wiring with generated formatter implementations and auxiliary module infrastructure

Suggested reviewers

ematipico

🚥 Pre-merge checks | ✅ 2

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately describes the main change: promoting list structural tokens from trivia to explicit CST nodes, which is reflected throughout the changeset.
Description check	✅ Passed	The description comprehensively relates to the changeset, detailing the promotion of structural tokens, new grammar nodes, refactored parsing logic, and testing results.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/biome_markdown_parser/src/syntax/list.rs`:
- Around line 750-757: The code currently skips emitting content indent when
first_line_empty is true, which leaves extra spaces before NEWLINE (e.g. "-  
\n") and prevents handle_first_line_marker_only from seeing NEWLINE; update the
condition in the block around emit_indent_char_list so that whenever
!setext_marker && spaces_after_marker > 1 you call emit_indent_char_list(p, 0)
(remove the first_line_empty requirement), and make the same change in the
corresponding block around lines 1015-1020; this ensures remaining whitespace is
emitted as MD_INDENT_TOKEN_LIST tokens so handle_first_line_marker_only (and
NEWLINE detection) works correctly.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 412a08d and eeedde8.

⛔ Files ignored due to path filters (17)

crates/biome_markdown_factory/src/generated/node_factory.rs is excluded by !**/generated/**, !**/generated/** and included by **
crates/biome_markdown_factory/src/generated/syntax_factory.rs is excluded by !**/generated/**, !**/generated/** and included by **
crates/biome_markdown_parser/tests/md_test_suite/ok/bullet_list.md.snap is excluded by !**/*.snap and included by **
crates/biome_markdown_parser/tests/md_test_suite/ok/lazy_continuation.md.snap is excluded by !**/*.snap and included by **
crates/biome_markdown_parser/tests/md_test_suite/ok/list_continuation_edge_cases.md.snap is excluded by !**/*.snap and included by **
crates/biome_markdown_parser/tests/md_test_suite/ok/list_indentation.md.snap is excluded by !**/*.snap and included by **
crates/biome_markdown_parser/tests/md_test_suite/ok/list_interrupt_bullet.md.snap is excluded by !**/*.snap and included by **
crates/biome_markdown_parser/tests/md_test_suite/ok/list_interrupt_ordered.md.snap is excluded by !**/*.snap and included by **
crates/biome_markdown_parser/tests/md_test_suite/ok/list_tightness.md.snap is excluded by !**/*.snap and included by **
crates/biome_markdown_parser/tests/md_test_suite/ok/multiline_list.md.snap is excluded by !**/*.snap and included by **
crates/biome_markdown_parser/tests/md_test_suite/ok/ordered_list.md.snap is excluded by !**/*.snap and included by **
crates/biome_markdown_parser/tests/md_test_suite/ok/paragraph_interruption.md.snap is excluded by !**/*.snap and included by **
crates/biome_markdown_parser/tests/md_test_suite/ok/setext_heading_edge_cases.md.snap is excluded by !**/*.snap and included by **
crates/biome_markdown_syntax/src/generated/kind.rs is excluded by !**/generated/**, !**/generated/** and included by **
crates/biome_markdown_syntax/src/generated/macros.rs is excluded by !**/generated/**, !**/generated/** and included by **
crates/biome_markdown_syntax/src/generated/nodes.rs is excluded by !**/generated/**, !**/generated/** and included by **
crates/biome_markdown_syntax/src/generated/nodes_mut.rs is excluded by !**/generated/**, !**/generated/** and included by **

📒 Files selected for processing (11)

crates/biome_markdown_formatter/src/generated.rs
crates/biome_markdown_formatter/src/markdown/auxiliary/indent_token.rs
crates/biome_markdown_formatter/src/markdown/auxiliary/list_marker_prefix.rs
crates/biome_markdown_formatter/src/markdown/auxiliary/mod.rs
crates/biome_markdown_formatter/src/markdown/lists/indent_token_list.rs
crates/biome_markdown_formatter/src/markdown/lists/mod.rs
crates/biome_markdown_parser/src/parser.rs
crates/biome_markdown_parser/src/syntax/list.rs
crates/biome_markdown_parser/src/to_html.rs
xtask/codegen/markdown.ungram
xtask/codegen/src/markdown_kinds_src.rs

💤 Files with no reviewable changes (1)

crates/biome_markdown_parser/src/parser.rs

coderabbitai · 2026-02-28T15:48:48Z

crates/biome_markdown_parser/src/syntax/list.rs

+    // Content indent (remaining whitespace tokens on first line)
+    if !setext_marker && !first_line_empty && spaces_after_marker > 1 {
+        emit_indent_char_list(p, 0);
+    } else {
+        // Empty first line or no content indent -- emit empty MdIndentTokenList
+        let empty_m = p.start();
+        empty_m.complete(p, MD_INDENT_TOKEN_LIST);
+    }


⚠️ Potential issue | 🟠 Major

Marker-only lines with extra spaces can miss the newline path.

On Line 751 and Line 1015, content indent is skipped when first_line_empty is true. For inputs like - \n, only one post-marker space is consumed, leaving remaining whitespace before NEWLINE. Because handle_first_line_marker_only on Line 1540 requires being at NEWLINE, marker-only handling can be bypassed.

💡 Suggested fix

- if !setext && !first_line_empty && spaces_after_marker > 1 { + if !setext && spaces_after_marker > 1 { emit_indent_char_list(p, 0); } else { // Empty first line or no content indent -- emit empty MdIndentTokenList let empty_m = p.start(); empty_m.complete(p, MD_INDENT_TOKEN_LIST); }

- if !first_line_empty && spaces_after_marker > 1 { + if spaces_after_marker > 1 { emit_indent_char_list(p, 0); } else { let empty_m = p.start(); empty_m.complete(p, MD_INDENT_TOKEN_LIST); }

Also applies to: 1015-1020

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 750 - 757, The code currently skips emitting content indent when first_line_empty is true, which leaves extra spaces before NEWLINE (e.g. "- \n") and prevents handle_first_line_marker_only from seeing NEWLINE; update the condition in the block around emit_indent_char_list so that whenever !setext_marker && spaces_after_marker > 1 you call emit_indent_char_list(p, 0) (remove the first_line_empty requirement), and make the same change in the corresponding block around lines 1015-1020; this ensures remaining whitespace is emitted as MD_INDENT_TOKEN_LIST tokens so handle_first_line_marker_only (and NEWLINE detection) works correctly.

dyc3

At a glance this looks like you're going in the right direction.

github-actions bot added A-Parser Area: parser A-Formatter Area: formatter A-Tooling Area: internal tools labels Feb 28, 2026

jfmcdowell force-pushed the refactor/list-marker-prefix branch 3 times, most recently from c65ad0f to c31f169 Compare February 28, 2026 14:53

jfmcdowell changed the title ~~fix(markdown-parser): restore metadata range contract for list/quote rendering~~ refactor(markdown-parser): promote list structural tokens from skipped trivia to explicit CST nodes Feb 28, 2026

jfmcdowell force-pushed the refactor/list-marker-prefix branch from c31f169 to eeedde8 Compare February 28, 2026 15:39

jfmcdowell marked this pull request as ready for review February 28, 2026 15:41

coderabbitai bot reviewed Feb 28, 2026

View reviewed changes

dyc3 approved these changes Feb 28, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

refactor(markdown-parser): promote list structural tokens from skipped trivia to explicit CST nodes#9274

refactor(markdown-parser): promote list structural tokens from skipped trivia to explicit CST nodes#9274
jfmcdowell wants to merge 1 commit intobiomejs:mainfrom
jfmcdowell:refactor/list-marker-prefix

jfmcdowell commented Feb 28, 2026 •

edited

Loading

Uh oh!

changeset-bot bot commented Feb 28, 2026 •

edited

Loading

Uh oh!

coderabbitai bot commented Feb 28, 2026

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Feb 28, 2026

Uh oh!

dyc3 left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

jfmcdowell commented Feb 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test Plan

Docs

Uh oh!

changeset-bot bot commented Feb 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ No Changeset found

Uh oh!

coderabbitai bot commented Feb 28, 2026

Walkthrough

Possibly related PRs

Suggested reviewers

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 28, 2026

Choose a reason for hiding this comment

Uh oh!

dyc3 left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

jfmcdowell commented Feb 28, 2026 •

edited

Loading

changeset-bot bot commented Feb 28, 2026 •

edited

Loading