fix(richtext)!: HTML and Markdown parsing for links and edge cases #369

Open
maoberlehner wants to merge 34 commits into main from bugfix/richtext-html-parser

@maoberlehner
Contributor

@maoberlehner maoberlehner commented Nov 7, 2025

Parsing did not work correctly for certain combinations of <a> and nested tags inside or around it.

To ensure parsing is aligned with the Tiptap editor format, we now use the Tiptap editor's generateJSON command, which gives us HTML-to-richtext conversion out of the box.

BREAKING CHANGE: Instead of our own resolver format, we now allow users to override the Tiptap extensions used for parsing.

Fixes WDX-141


Note

Replaces custom HTML/Markdown parsing in packages/richtext with Tiptap-based parsing (via @tiptap/html + extensions), updates tests, and refreshes frameworks/deps across the repo.

  • Richtext parsing (BREAKING):
    • Replace custom HTML/Markdown parsers with Tiptap JSON generation using @tiptap/html and a full set of Tiptap extensions.
    • Remove legacy parsing deps (node-html-parser, markdown-it-github), add Tiptap packages.
    • Update unit tests for new parsing behavior.
    • Parsing customization now via Tiptap extensions instead of previous resolver format.
  • Tooling/Deps:
    • Bump React to 19.2.0, Vue to 3.5.22, Next-related deps, and Nuxt/Vite stacks in examples/playgrounds.
    • Add happy-dom to Vitest where used and perform miscellaneous lockfile upgrades.

Written by Cursor Bugbot for commit f128bb6.

Contributor

Copilot AI left a comment

Pull Request Overview

This PR refactors HTML and Markdown parsing to use Tiptap's built-in generateJSON command, fixing edge cases with links and nested tags that weren't handled correctly by the previous custom parser.

  • Replaced custom HTML parsing logic with Tiptap's generateJSON for more robust HTML-to-richtext conversion
  • Simplified Markdown parser to convert Markdown to HTML first, then use the HTML parser
  • Changed the API from custom resolvers to Tiptap tipTapExtensions for customization

Reviewed Changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| packages/richtext/src/markdown-parser.ts | Simplified to render Markdown as HTML then delegate to HTML parser |
| packages/richtext/src/markdown-parser.test.ts | Updated tests to use Tiptap extensions and reflect new output format |
| packages/richtext/src/html-parser.ts | Complete rewrite using Tiptap's generateJSON with custom extensions |
| packages/richtext/src/html-parser.test.ts | Updated tests for new Tiptap-based implementation |
| packages/richtext/package.json | Added Tiptap dependencies, removed node-html-parser |

@pkg-pr-new

pkg-pr-new bot commented Nov 7, 2025

@storyblok/astro

npm i https://pkg.pr.new/@storyblok/astro@369

storyblok

npm i https://pkg.pr.new/storyblok@369

@storyblok/eslint-config

npm i https://pkg.pr.new/@storyblok/eslint-config@369

@storyblok/js

npm i https://pkg.pr.new/@storyblok/js@369

storyblok-js-client

npm i https://pkg.pr.new/storyblok-js-client@369

@storyblok/management-api-client

npm i https://pkg.pr.new/@storyblok/management-api-client@369

@storyblok/nuxt

npm i https://pkg.pr.new/@storyblok/nuxt@369

@storyblok/react

npm i https://pkg.pr.new/@storyblok/react@369

@storyblok/region-helper

npm i https://pkg.pr.new/@storyblok/region-helper@369

@storyblok/richtext

npm i https://pkg.pr.new/@storyblok/richtext@369

@storyblok/svelte

npm i https://pkg.pr.new/@storyblok/svelte@369

@storyblok/vue

npm i https://pkg.pr.new/@storyblok/vue@369

commit: 7e651a5

@dipankarmaikap
Contributor

Hey @maoberlehner, with this new implementation, it won’t work as expected. As I suspected, we need to follow the specific naming pattern we use internally — for example, bullet_list, list_item, and ordered_list. These are case-sensitive.

The good news is that we can configure the Tiptap extensions to align with our internal structure. I did a quick test, and after making the following changes to the list items, it worked correctly:

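// Import paths are an assumption (per-extension Tiptap packages); the original comment omitted them.
import { BulletList } from '@tiptap/extension-bullet-list';
import { ListItem } from '@tiptap/extension-list-item';
import { OrderedList } from '@tiptap/extension-ordered-list';

// Rename the list nodes to the snake_case names used internally and
// point each extension at its renamed counterparts.
const defaultExtensions = {
  bulletList: BulletList.configure({
    itemTypeName: 'list_item',
  }).extend({
    name: 'bullet_list',
  }),
  listItem: ListItem.configure({
    bulletListTypeName: 'bullet_list',
    orderedListTypeName: 'ordered_list',
  }).extend({
    name: 'list_item',
  }),
  orderedList: OrderedList.configure({
    itemTypeName: 'list_item',
  }).extend({
    name: 'ordered_list',
  }),
  // ...
}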

This only covers these specific list-related types — we’ll need to check and match the rest of the node types as well. I haven’t tested all the possible ones yet.
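
The naming mismatch described above can also be illustrated on the parsed JSON itself. The following is a minimal sketch, assuming a post-processing step rather than the extension configuration proposed in this comment: the hypothetical helper `renameNodeTypes` walks a document and maps Tiptap's default list names to the snake_case names listed above. The `RichTextNode` shape and the helper are illustrative assumptions, not code from this PR.

```typescript
// Assumed minimal shape of a rich text JSON node (only the fields used here).
interface RichTextNode {
  type: string;
  content?: RichTextNode[];
  [key: string]: unknown;
}

// Tiptap default node names mapped to the internal names quoted in the comment.
const TYPE_NAME_MAP = new Map<string, string>([
  ['bulletList', 'bullet_list'],
  ['listItem', 'list_item'],
  ['orderedList', 'ordered_list'],
]);

// Hypothetical helper: returns a renamed copy and leaves the input untouched.
function renameNodeTypes(node: RichTextNode): RichTextNode {
  const renamed: RichTextNode = {
    ...node,
    type: TYPE_NAME_MAP.get(node.type) ?? node.type,
  };
  if (node.content) {
    renamed.content = node.content.map(renameNodeTypes);
  }
  return renamed;
}

// Example input using Tiptap's default names.
const tiptapDoc: RichTextNode = {
  type: 'doc',
  content: [
    {
      type: 'bulletList',
      content: [{ type: 'listItem', content: [{ type: 'paragraph' }] }],
    },
  ],
};

const storyblokDoc = renameNodeTypes(tiptapDoc);
```

Configuring the extensions, as shown in the comment, keeps the names correct at parse time; a post-processing pass like this one would only be a fallback and would need the same full list of node types.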

@maoberlehner
Contributor Author

@dipankarmaikap thank you for spotting this issue and your recommendations on how to work around it!

I'd love to follow the Tiptap editor defaults because this would enable us to simplify our renderers, too. That's why I reached out to the product team, asking if they could add support for both cases in Storyblok (https://storyblok.slack.com/archives/G01AWKD1FGC/p1762846202774159). If this is possible, I'd add support for both cases in our renderer too and keep this code as is. If it's not possible, I'll go with your approach.

@alexjoverm alexjoverm self-assigned this Feb 9, 2026
@alexjoverm alexjoverm removed the request for review from alvarosabu February 9, 2026 15:21
@alexjoverm alexjoverm marked this pull request as draft February 9, 2026 15:23
@alexjoverm alexjoverm added the richtext [Package] `storyblok-richtext` related issues. label Feb 9, 2026
blok,
key,
});
},

Duplicate React keys for multiple bloks in body

Medium Severity

The renderComponent callback receives the same id (the blok node's node.attrs.id) for every blok in the body array, and uses it directly as the React key. When a blok node contains multiple components in its body, all rendered elements get the same key. The old createComponentResolver avoided this by appending -${index} to each key via body.map((blok, index) => ...). This regression causes React duplicate-key warnings and incorrect reconciliation when multiple blok components share a single body array.
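
A minimal sketch of the index-suffix approach the report attributes to the old resolver, kept free of React so it runs standalone. `keysForBody` and the `Blok` shape are hypothetical names used for illustration only; they are not taken from the PR.

```typescript
// Assumed minimal shape of a blok entry inside a blok node's body array.
interface Blok {
  component: string;
}

// Hypothetical helper: derives one key per blok from the shared node id
// by appending the position in the body array, as described in the report.
function keysForBody(nodeId: string, body: Blok[]): string[] {
  return body.map((_blok, index) => `${nodeId}-${index}`);
}

// Two bloks that share a single node id still receive distinct keys.
const keys = keysForBody('abc123', [
  { component: 'teaser' },
  { component: 'grid' },
]);
```

Using the bare node id instead would give both elements the key `abc123`, which is the duplicate-key situation the report describes.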

Contributor Author

@alexjoverm might be a valid concern!

@maoberlehner
Contributor Author

@alexjoverm did you consider using renderToReactElement? It seems like the idiomatic way to render tiptap to React. Unfortunately, TipTap provides no other renderers, but I think it would be easy for Claude Code to create renderToVueElement (and maybe others).

I understand that this PR has been open for some time, but I find it very tempting to go fully idiomatic and, at least for React, reduce our custom code to an absolute minimum. And there is some hope that TipTap will provide renderers for other frameworks too at some point, and we can delete even more code.

@maoberlehner
Contributor Author

Example for renderToReactElement usage:

renderToReactElement({
  content: json,
  extensions: [StarterKit, MyCustomNodeExtension],
  options: {
    nodeMapping: {
      heading({ node, children }) {
        return <h1 className="custom-heading">{children}</h1>
      },
      // your custom node types work here too
      myWidget({ node }) {
        return <MyWidgetComponent data={node.attrs} />
      },
    },
  },
})

@alexjoverm
Contributor

alexjoverm commented Feb 25, 2026

@maoberlehner no worries, it's a great suggestion! I share the vision of using the already provided react renderer, and going fully idiomatic per framework, reducing our custom code.

That said, for this PR specifically I'd prefer to keep the current approach because:

  • it would mean maintaining two rendering approaches (React via Tiptap, and the framework-agnostic one)
  • we still have custom logic that differs from TipTap, and quite a few things to move around to change the approach for React
  • as this PR originated from a high-prio customer issue and has already grown into a major richtext revamp, adding an internal architecture change on top would increase the scope and time considerably.

Even though AI can accelerate the refactor, this kind of internal architecture change still needs thorough review and iteration before shipping to production, so it's not something we'd rush. But definitely something we can revisit later. Would you be ok with tackling the renderToReactElement refactor as a separate effort?

@maoberlehner
Contributor Author

sounds good!

Contributor Author

@maoberlehner maoberlehner left a comment

Huge step forward for our richtext implementation!

My dream is that we have a shared repo with product for all TipTap extensions (they need them too in their code), and that we can use TipTap static renderers, so this repo becomes just a thin wrapper around both. Maybe.. some time.. 🤩

"type": "link",
"attrs": {
"href": "hola@alvarosaburido.dev",
"href": "jane@example.com",
Contributor Author

Was checking if example.com is safe to use - it is!

example.com (served at https://example.com/) is a reserved “example” domain that’s maintained by the Internet Assigned Numbers Authority (IANA) for documentation/testing examples, and it’s not available for public registration or transfer.

Nice!

Contributor

Oh, honestly it's a big coincidence hahahaha. Great to know!

@alexjoverm alexjoverm self-requested a review February 26, 2026 09:15
Contributor

@alexjoverm alexjoverm left a comment

(on behalf of Markus) - Approved

Expose segmentStoryblokRichText to render rich text using framework-native rendering.

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

...customResolvers,
};

const resolver = richTextResolver({ resolvers });

Modified file is dead code after export removal

Low Severity

richTextToHTML.ts was updated to use the new tiptap extension pattern, but the same PR removes it from viteStaticCopy targets and from the client.ts re-export, and drops its type declaration from public.d.ts. The file is now unreachable dead code that was modified unnecessarily. It either needs to be deleted or re-exported if still intended to be part of the public API.

<div set:html={renderedRichText} />

{
richTextSegments.map((segment, index) => {
Contributor

@dipankarmaikap with the current logic, how can a user override a mark or a node?

For example, in Vue or React, users often want to override a link so that it renders as a RouterLink, or as a React Router or Next.js Link component, instead of a plain <a> tag.

I'm not sure if Astro has its own Link component for routing, but in any case, users sometimes want to override default marks or nodes and use components instead.

How would they do it here? Can you add an override example here in the playground? For inspiration, you can see the link or heading override in the Next.js playground https://github.com/storyblok/monoblok/pull/369/changes?mode=single#diff-b31212b3df5bff9f904df8f375b02b0018045aeb174c18272c7f007a6720af3fR13-R21

Labels

richtext [Package] `storyblok-richtext` related issues.

4 participants