Extract anchors from AST #13

Merged
frankharkins merged 9 commits into main from FH/ast-anchors
Jan 29, 2026

Conversation

Collaborator

@frankharkins frankharkins commented Jan 27, 2026

Closes #11 and fixes #12

  • 549c592: A prefactor that moves AST parsing and walking into its own module. The mdx::walk_ast function calls links::extract_from_node on each AST node.
  • 7291e7f: Add tests for new behaviour
  • 38f0b72: The main change in this PR, extracting anchors from the AST. For headings, we concatenate the values of all text and inlinecode nodes. For ids, we look for the id prop.
  • 38c2c67: A quick follow-up to handle inline math. Test expectations come from the website anchors.
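
The heading rule described for 38f0b72 can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Node` enum is a toy stand-in for the parser's real node type, `heading_text` and `to_anchor` are hypothetical names, and the slug rule (lowercase, whitespace to hyphens, other punctuation dropped) is a simplification rather than the rule the PR actually uses.

```rust
/// Toy stand-in for a markdown AST node (hypothetical, not the parser's real type).
#[derive(Debug)]
enum Node {
    Heading(Vec<Node>),
    Emphasis(Vec<Node>),
    Text(String),
    InlineCode(String),
}

impl Node {
    /// Child nodes, if this kind of node can have any.
    fn children(&self) -> Option<&Vec<Node>> {
        match self {
            Node::Heading(c) | Node::Emphasis(c) => Some(c),
            _ => None,
        }
    }
}

/// Append the values of all text and inline-code descendants to `out`.
fn collect_text(node: &Node, out: &mut String) {
    match node {
        Node::Text(value) | Node::InlineCode(value) => out.push_str(value),
        _ => {}
    }
    if let Some(children) = node.children() {
        for child in children {
            collect_text(child, out);
        }
    }
}

/// Plain text of a heading, ignoring wrappers such as emphasis.
fn heading_text(node: &Node) -> String {
    let mut out = String::new();
    collect_text(node, &mut out);
    out
}

/// Simplified slug rule (an assumption, not the project's actual rule).
fn to_anchor(text: &str) -> String {
    text.trim()
        .to_lowercase()
        .chars()
        .filter_map(|c| {
            if c.is_alphanumeric() || c == '-' || c == '_' {
                Some(c)
            } else if c.is_whitespace() {
                Some('-')
            } else {
                None
            }
        })
        .collect()
}
```

Because the text is read from the AST rather than from the raw markdown line, emphasis markers never reach the anchor: only the `Text` and `InlineCode` values are concatenated.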

The rest of the commits are minor changes (renaming / formatting / etc.)

I ran this new code against Qiskit/documentation and it (correctly) detects one broken link.

@frankharkins frankharkins marked this pull request as ready for review January 27, 2026 17:21
Collaborator

@Eric-Arellano Eric-Arellano left a comment

Excellent!

Btw, I recommend simplifying the file organization so that it doesn't use a folder per file, given how simple our project is. For example, you should be able to move src/anchors/mod.rs to src/anchors.rs.

for line in markdown.split("\n") {
    if let Some(heading) = get_first_capture(line, &heading_regex) {
        let anchor = heading_to_anchor(heading);
/// Given any markdown node, extract the
Collaborator

Incomplete comment

Comment on lines -47 to -51
if let Some(children) = node.children() {
    for child in children {
        extract_from_node(child, links);
    }
}
Collaborator

Why is this part removed?

Collaborator Author

It's moved to

/// Walk the markdown AST and call a function on each node
pub fn walk_ast<'a>(node: &'a Node, f: &mut impl FnMut(&'a Node) -> ()) -> () {
    f(node);
    if let Some(children) = node.children() {
        for child in children {
            walk_ast(child, f);
        }
    }
}

Before, extract_from_node was the only function that used the AST, so it walked it too. Now we walk it in walk_ast.
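
To illustrate the split described above, here is a minimal self-contained sketch of a walker paired with a per-node extractor that no longer recurses. The `Node` enum, `extract_link`, and `collect_links` are hypothetical stand-ins for illustration only; the real project uses its parser's node type and its own `links::extract_from_node`.

```rust
/// Toy stand-in for the parser's node type (hypothetical).
enum Node {
    Root(Vec<Node>),
    Paragraph(Vec<Node>),
    Link { url: String },
    Text(String),
}

impl Node {
    /// Child nodes, if this kind of node can have any.
    fn children(&self) -> Option<&Vec<Node>> {
        match self {
            Node::Root(c) | Node::Paragraph(c) => Some(c),
            _ => None,
        }
    }
}

/// Same shape as the `walk_ast` quoted above: visit the node, then recurse.
fn walk_ast<'a>(node: &'a Node, f: &mut impl FnMut(&'a Node)) {
    f(node);
    if let Some(children) = node.children() {
        for child in children {
            walk_ast(child, f);
        }
    }
}

/// Hypothetical per-node extractor: it inspects one node and does not recurse.
fn extract_link<'a>(node: &'a Node, links: &mut Vec<&'a str>) {
    if let Node::Link { url } = node {
        links.push(url.as_str());
    }
}

/// Drive the extractor over the whole tree through the walker.
fn collect_links(root: &Node) -> Vec<&str> {
    let mut links = Vec::new();
    walk_ast(root, &mut |node| extract_link(node, &mut links));
    links
}
```

With this shape, a second extractor (for anchors, say) can reuse the same walker instead of carrying its own copy of the recursion.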

@frankharkins frankharkins merged commit 862f5c1 into main Jan 29, 2026
15 checks passed
@frankharkins frankharkins deleted the FH/ast-anchors branch January 29, 2026 16:41
Development

Successfully merging this pull request may close these issues.

Anchors incorrectly include emphasis in headings
Extract anchors from AST

2 participants