Skip to content

Conversation

@kiycoh
Copy link

@kiycoh kiycoh commented Nov 25, 2025

Problem: Previous implementation relied on a regex ==([^=]+)== which attempted to match the highlight within a single text node. This logic fails when the highlighted content contains other Markdown elements (like bold, [[links]], or italics).

Why? When Markdown is nested, the parser splits the content into multiple nodes (e.g., Text Node + Strong Node + Text Node), causing the regex to miss the closing delimiter if it resides in a different node than the opening one.

Solution: I have refactored ofm.ts to move away from regex-based matching on raw text strings to an AST (Abstract Syntax Tree) manipulation approach.

Key changes

  • AST Traversal: Instead of matching strings, the logic now scans the syntax tree.
  • Two-Phase Logic (Unified):
    • Marking: Identifies == tokens and replaces them with temporary markers, handling text node splits correctly.
    • Grouping: Scans children nodes to find marker pairs. It collects all content between the opening and closing markers (including nested HTML/Markdown nodes) and wraps them in the tag.
  • Complexity: The iteration over children nodes is now done in reverse. This simplifies the logic by preventing index shifting issues when nodes are replaced or modified during the loop, no while loop management. All tests are ok and linted with Prettier.

Before & After

Sample text:

==L'**Ingegneria del software** è una disciplina ingegneristica [[Informatica]] che caratterizza l'intero processo di produzione di un *[[software]]*==

Before:
image

After:
image

* Replace regex with AST manipulation to handle split text nodes
* Implement marking and grouping logic to wrap mixed content.
* Unify plugins
@github-actions
Copy link

github-actions bot commented Nov 25, 2025

built with Refined Cloudflare Pages Action

⚡ Cloudflare Pages Deployment

Name Status Preview Last Commit
quartz ✅ Ready (View Log) Visit Preview 23d2cfb

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the Obsidian Flavored Markdown highlight feature (==text==) to support nested markdown elements like bold, italics, and wikilinks within highlights. The previous regex-based approach failed when highlighted content was split across multiple AST nodes due to nested markdown syntax.

Key Changes

  • Changed from regex text matching (==([^=]+)==) to AST-based token replacement and traversal
  • Implemented a two-phase approach: marking == tokens as temporary nodes, then grouping content between marker pairs
  • Processes marker pairs in reverse order to avoid index-shifting issues during node replacement

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

plugins.push(() => {
return (tree: Root) => {
mdastFindReplace(tree, [
[highlightRegex, (_value: string) => ({ type: "highlightMarker" }) as any],
Copy link

Copilot AI Nov 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The as any type assertion bypasses TypeScript's type safety. Consider defining a proper interface for the custom highlightMarker node type to maintain type safety throughout the transformation pipeline.

interface HighlightMarker {
  type: "highlightMarker"
}

// Then use:
[highlightRegex, (_value: string): HighlightMarker => ({ type: "highlightMarker" })]
Suggested change
[highlightRegex, (_value: string) => ({ type: "highlightMarker" }) as any],
[highlightRegex, (_value: string): HighlightMarker => ({ type: "highlightMarker" })],

Copilot uses AI. Check for mistakes.

visit(tree, (node) => {
if (!("children" in node)) return
const children = (node as any).children as any[]
Copy link

Copilot AI Nov 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multiple as any type assertions are used here to bypass TypeScript's type checking. Consider properly typing the node types:

import { Parent } from 'mdast'

// In the visit callback:
if (!('children' in node)) return
const parent = node as Parent
const children = parent.children
Suggested change
const children = (node as any).children as any[]
const children = (node as Parent).children

Copilot uses AI. Check for mistakes.
const htmlContent = content
.map((n) => {
const hast = toHast(n, { allowDangerousHtml: true })
return toHtml(hast as any, { allowDangerousHtml: true })
Copy link

Copilot AI Nov 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another as any type assertion. The toHtml function returns a string, so this cast is unnecessary. Consider removing it for better type safety.

Suggested change
return toHtml(hast as any, { allowDangerousHtml: true })
return toHtml(hast, { allowDangerousHtml: true })

Copilot uses AI. Check for mistakes.
}

if (markers.length < 2) return

Copy link

Copilot AI Nov 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When there's an odd number of == markers, the last unpaired marker will be silently ignored. Consider adding a warning or handling this case explicitly to help users identify unclosed highlight markers in their content. For example:

if (markers.length < 2) return

// Check for odd number of markers
if (markers.length % 2 !== 0) {
  // Option 1: Log a warning
  console.warn(`Unpaired highlight marker found in content`)
  // Option 2: Remove the last unpaired marker
  children.splice(markers[markers.length - 1], 1)
}
Suggested change
// Check for odd number of markers
if (markers.length % 2 !== 0) {
console.warn("Unpaired highlight marker found in content");
// Remove the last unpaired marker
children.splice(markers[markers.length - 1], 1);
// Remove the last marker from the list
markers.pop();
}

Copilot uses AI. Check for mistakes.
Comment on lines 392 to 396
visit(tree, (node) => {
if (!("children" in node)) return
const children = (node as any).children as any[]

const markers: number[] = []
Copy link

Copilot AI Nov 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Consider adding an early return if there are no markers at all to avoid unnecessary processing:

visit(tree, (node) => {
  if (!("children" in node)) return
  const children = (node as any).children as any[]

  // Early return if no children
  if (!children || children.length === 0) return

  const markers: number[] = []
  for (let i = 0; i < children.length; i++) {
    if (children[i].type === "highlightMarker") {
      markers.push(i)
    }
  }

  if (markers.length < 2) return
  // ...
})

Copilot uses AI. Check for mistakes.
Copy link
Author

@kiycoh kiycoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Type Safety Improvements
  2. Safer Node Handling

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

for (let i = 0; i < markers.length - 1; i += 2) {
pairs.push([markers[i], markers[i + 1]])
}

Copy link

Copilot AI Jan 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When there is an odd number of highlight markers (e.g., ==text1== some text ==text2), the last unpaired marker remains in the AST as a highlightMarker node. This could cause rendering issues or appear as unexpected content in the output.

Consider adding logic to handle odd markers, such as:

  • Removing unpaired markers after processing pairs
  • Converting unpaired markers back to their original == text
  • Logging a warning about malformed highlight syntax

This would make the behavior more robust and predictable when users have malformed markup.

Suggested change
if (markers.length % 2 === 1) {
const lastMarkerIndex = markers[markers.length - 1]
// Convert the unpaired marker back into its literal representation
children[lastMarkerIndex] = { type: "text", value: "==" } as any
}

Copilot uses AI. Check for mistakes.
Comment on lines +406 to +443
visit(tree, (node) => {
if (!("children" in node)) return
const parent = node as Parent
const children = parent.children
if (children.length === 0) return

const markers: number[] = []
for (let i = 0; i < children.length; i++) {
if (isHighlightMarker(children[i])) {
markers.push(i)
}
}

if (markers.length < 2) return

const pairs: [number, number][] = []
for (let i = 0; i < markers.length - 1; i += 2) {
pairs.push([markers[i], markers[i + 1]])
}

for (let i = pairs.length - 1; i >= 0; i--) {
const [start, end] = pairs[i]
const content = children.slice(start + 1, end)
const htmlContent = content
.map((n) => {
const hast = toHast(n, { allowDangerousHtml: true })
return hast ? toHtml(hast, { allowDangerousHtml: true }) : ""
})
.join("")

const newNode: Html = {
type: "html",
value: `<span class="text-highlight">${htmlContent}</span>`,
}

children.splice(start, end - start + 1, newNode)
}
})
Copy link

Copilot AI Jan 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current implementation only matches highlight markers within a single parent node's immediate children. If a highlight spans across different structural boundaries (e.g., across list items or block quotes), the opening and closing markers won't be paired, leaving unpaired markers in the output.

While this may be acceptable for typical use cases (highlights within paragraphs), it's worth noting this limitation. Consider adding a comment explaining that highlights must be contained within a single parent node, or alternatively, implement a more sophisticated cross-boundary matching algorithm if this is a common use case.

Copilot uses AI. Check for mistakes.
@aarnphm
Copy link
Collaborator

aarnphm commented Jan 13, 2026

Going to close this PR, because the proper fix is to implement micromark extensions for all of obsidian features, i.e marker, wikilinks, tags, recursive callout, etc.

but feel free to use this in your own repo (calling toHast and toHtml is considered antipattern within a mdast->hast pipeline afaik and from experience)

@aarnphm aarnphm closed this Jan 13, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants