-
-
Notifications
You must be signed in to change notification settings - Fork 3.4k
fix(ofm): support nested markdown in highlights #2223
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
* Replace regex with AST manipulation to handle split text nodes * Implement marking and grouping logic to wrap mixed content. * Unify plugins
built with Refined Cloudflare Pages Action⚡ Cloudflare Pages Deployment
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR refactors the Obsidian Flavored Markdown highlight feature (==text==) to support nested markdown elements like bold, italics, and wikilinks within highlights. The previous regex-based approach failed when highlighted content was split across multiple AST nodes due to nested markdown syntax.
Key Changes
- Changed from regex text matching (
==([^=]+)==) to AST-based token replacement and traversal - Implemented a two-phase approach: marking
==tokens as temporary nodes, then grouping content between marker pairs - Processes marker pairs in reverse order to avoid index-shifting issues during node replacement
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| plugins.push(() => { | ||
| return (tree: Root) => { | ||
| mdastFindReplace(tree, [ | ||
| [highlightRegex, (_value: string) => ({ type: "highlightMarker" }) as any], |
Copilot
AI
Nov 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The as any type assertion bypasses TypeScript's type safety. Consider defining a proper interface for the custom highlightMarker node type to maintain type safety throughout the transformation pipeline.
interface HighlightMarker {
type: "highlightMarker"
}
// Then use:
[highlightRegex, (_value: string): HighlightMarker => ({ type: "highlightMarker" })]| [highlightRegex, (_value: string) => ({ type: "highlightMarker" }) as any], | |
| [highlightRegex, (_value: string): HighlightMarker => ({ type: "highlightMarker" })], |
quartz/plugins/transformers/ofm.ts
Outdated
|
|
||
| visit(tree, (node) => { | ||
| if (!("children" in node)) return | ||
| const children = (node as any).children as any[] |
Copilot
AI
Nov 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Multiple as any type assertions are used here to bypass TypeScript's type checking. Consider properly typing the node types:
import { Parent } from 'mdast'
// In the visit callback:
if (!('children' in node)) return
const parent = node as Parent
const children = parent.children| const children = (node as any).children as any[] | |
| const children = (node as Parent).children |
quartz/plugins/transformers/ofm.ts
Outdated
| const htmlContent = content | ||
| .map((n) => { | ||
| const hast = toHast(n, { allowDangerousHtml: true }) | ||
| return toHtml(hast as any, { allowDangerousHtml: true }) |
Copilot
AI
Nov 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another as any type assertion. The toHtml function returns a string, so this cast is unnecessary. Consider removing it for better type safety.
| return toHtml(hast as any, { allowDangerousHtml: true }) | |
| return toHtml(hast, { allowDangerousHtml: true }) |
| } | ||
|
|
||
| if (markers.length < 2) return | ||
|
|
Copilot
AI
Nov 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When there's an odd number of == markers, the last unpaired marker will be silently ignored. Consider adding a warning or handling this case explicitly to help users identify unclosed highlight markers in their content. For example:
if (markers.length < 2) return
// Check for odd number of markers
if (markers.length % 2 !== 0) {
// Option 1: Log a warning
console.warn(`Unpaired highlight marker found in content`)
// Option 2: Remove the last unpaired marker
children.splice(markers[markers.length - 1], 1)
}| // Check for odd number of markers | |
| if (markers.length % 2 !== 0) { | |
| console.warn("Unpaired highlight marker found in content"); | |
| // Remove the last unpaired marker | |
| children.splice(markers[markers.length - 1], 1); | |
| // Remove the last marker from the list | |
| markers.pop(); | |
| } |
| visit(tree, (node) => { | ||
| if (!("children" in node)) return | ||
| const children = (node as any).children as any[] | ||
|
|
||
| const markers: number[] = [] |
Copilot
AI
Nov 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Consider adding an early return if there are no markers at all to avoid unnecessary processing:
visit(tree, (node) => {
if (!("children" in node)) return
const children = (node as any).children as any[]
// Early return if no children
if (!children || children.length === 0) return
const markers: number[] = []
for (let i = 0; i < children.length; i++) {
if (children[i].type === "highlightMarker") {
markers.push(i)
}
}
if (markers.length < 2) return
// ...
})comment fix Co-authored-by: Copilot <[email protected]>
kiycoh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Type Safety Improvements
- Safer Node Handling
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| for (let i = 0; i < markers.length - 1; i += 2) { | ||
| pairs.push([markers[i], markers[i + 1]]) | ||
| } | ||
|
|
Copilot
AI
Jan 3, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When there is an odd number of highlight markers (e.g., ==text1== some text ==text2), the last unpaired marker remains in the AST as a highlightMarker node. This could cause rendering issues or appear as unexpected content in the output.
Consider adding logic to handle odd markers, such as:
- Removing unpaired markers after processing pairs
- Converting unpaired markers back to their original
==text - Logging a warning about malformed highlight syntax
This would make the behavior more robust and predictable when users have malformed markup.
| if (markers.length % 2 === 1) { | |
| const lastMarkerIndex = markers[markers.length - 1] | |
| // Convert the unpaired marker back into its literal representation | |
| children[lastMarkerIndex] = { type: "text", value: "==" } as any | |
| } |
| visit(tree, (node) => { | ||
| if (!("children" in node)) return | ||
| const parent = node as Parent | ||
| const children = parent.children | ||
| if (children.length === 0) return | ||
|
|
||
| const markers: number[] = [] | ||
| for (let i = 0; i < children.length; i++) { | ||
| if (isHighlightMarker(children[i])) { | ||
| markers.push(i) | ||
| } | ||
| } | ||
|
|
||
| if (markers.length < 2) return | ||
|
|
||
| const pairs: [number, number][] = [] | ||
| for (let i = 0; i < markers.length - 1; i += 2) { | ||
| pairs.push([markers[i], markers[i + 1]]) | ||
| } | ||
|
|
||
| for (let i = pairs.length - 1; i >= 0; i--) { | ||
| const [start, end] = pairs[i] | ||
| const content = children.slice(start + 1, end) | ||
| const htmlContent = content | ||
| .map((n) => { | ||
| const hast = toHast(n, { allowDangerousHtml: true }) | ||
| return hast ? toHtml(hast, { allowDangerousHtml: true }) : "" | ||
| }) | ||
| .join("") | ||
|
|
||
| const newNode: Html = { | ||
| type: "html", | ||
| value: `<span class="text-highlight">${htmlContent}</span>`, | ||
| } | ||
|
|
||
| children.splice(start, end - start + 1, newNode) | ||
| } | ||
| }) |
Copilot
AI
Jan 3, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation only matches highlight markers within a single parent node's immediate children. If a highlight spans across different structural boundaries (e.g., across list items or block quotes), the opening and closing markers won't be paired, leaving unpaired markers in the output.
While this may be acceptable for typical use cases (highlights within paragraphs), it's worth noting this limitation. Consider adding a comment explaining that highlights must be contained within a single parent node, or alternatively, implement a more sophisticated cross-boundary matching algorithm if this is a common use case.
|
Going to close this PR, because the proper fix is to implement micromark extensions for all of obsidian features, i.e marker, wikilinks, tags, recursive callout, etc. but feel free to use this in your own repo (calling toHast and toHtml is considered antipattern within a mdast->hast pipeline afaik and from experience) |
Problem: Previous implementation relied on a regex ==([^=]+)== which attempted to match the highlight within a single text node. This logic fails when the highlighted content contains other Markdown elements (like bold, [[links]], or italics).
Why? When Markdown is nested, the parser splits the content into multiple nodes (e.g., Text Node + Strong Node + Text Node), causing the regex to miss the closing delimiter if it resides in a different node than the opening one.
Solution: I have refactored ofm.ts to move away from regex-based matching on raw text strings to an AST (Abstract Syntax Tree) manipulation approach.
Key changes
==tokens and replaces them with temporary markers, handling text node splits correctly.Before & After
Sample text:
Before:

After:
