fix(richtext)!: HTML and Markdown parsing for links and edge cases#369
fix(richtext)!: HTML and Markdown parsing for links and edge cases#369maoberlehner wants to merge 34 commits intomainfrom
Conversation
There was a problem hiding this comment.
Pull Request Overview
This PR refactors HTML and Markdown parsing to use Tiptap's built-in generateJSON command, fixing edge cases with links and nested tags that weren't handled correctly by the previous custom parser.
- Replaced custom HTML parsing logic with Tiptap's
generateJSONfor more robust HTML-to-richtext conversion - Simplified Markdown parser to convert Markdown to HTML first, then use the HTML parser
- Changed the API from custom
resolversto TiptaptipTapExtensionsfor customization
Reviewed Changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| packages/richtext/src/markdown-parser.ts | Simplified to render Markdown as HTML then delegate to HTML parser |
| packages/richtext/src/markdown-parser.test.ts | Updated tests to use Tiptap extensions and reflect new output format |
| packages/richtext/src/html-parser.ts | Complete rewrite using Tiptap's generateJSON with custom extensions |
| packages/richtext/src/html-parser.test.ts | Updated tests for new Tiptap-based implementation |
| packages/richtext/package.json | Added Tiptap dependencies, removed node-html-parser |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
5cdcca2 to
6d45762
Compare
@storyblok/astro
storyblok
@storyblok/eslint-config
@storyblok/js
storyblok-js-client
@storyblok/management-api-client
@storyblok/nuxt
@storyblok/react
@storyblok/region-helper
@storyblok/richtext
@storyblok/svelte
@storyblok/vue
commit: |
Parsing did not work correctly for certain combinations of `<a>` and nested tags inside or around it. To ensure parsing is aligned with the Tiptap editor format, we now use the Tiptap editor's `generateJSON` command, which gives us HTML-to-richtext conversion out of the box. BREAKING CHANGE: Instead of our own resolver format, we now allow users to override the Tiptap extensions used for parsing. Fixes WDX-141
6d45762 to
f128bb6
Compare
|
Hey @maoberlehner, with this new implementation, it won’t work as expected. As I suspected, we need to follow the specific naming pattern we use internally — for example, The good news is that we can configure the Tiptap extensions to align with our internal structure. I did a quick test, and after making the following changes to the list items, it worked correctly: const defaultExtensions = {
bulletList: BulletList.configure({
itemTypeName: 'list_item',
}).extend({
name: 'bullet_list',
}),
listItem: ListItem.configure({
bulletListTypeName: 'bullet_list',
orderedListTypeName: 'ordered_list',
}).extend({
name: 'list_item',
}),
orderedList: OrderedList.configure({
itemTypeName: 'list_item',
}).extend({
name: 'ordered_list',
}),
// ...
}This only covers these specific list-related types — we’ll need to check and match the rest of the node types as well. I haven’t tested all the possible ones yet. |
|
@dipankarmaikap thank you for spotting this issue and your recommendations on how to work around it! I'd love to follow the Tiptap editor defaults because this would enable us to simplify our renderers, too. That's why I reached out to the product team, asking if they could add support for both cases in Storyblok (https://storyblok.slack.com/archives/G01AWKD1FGC/p1762846202774159). If this is possible, I'd add support for both cases in our renderer too and keep this code as is. If it's not possible, I'll go with your approach. |
| blok, | ||
| key, | ||
| }); | ||
| }, |
There was a problem hiding this comment.
Duplicate React keys for multiple bloks in body
Medium Severity
The renderComponent callback receives the same id (the blok node's node.attrs.id) for every blok in the body array, and uses it directly as the React key. When a blok node contains multiple components in its body, all rendered elements get the same key. The old createComponentResolver avoided this by appending -${index} to each key via body.map((blok, index) => ...). This regression causes React duplicate-key warnings and incorrect reconciliation when multiple blok components share a single body array.
Additional Locations (1)
|
@alexjoverm did you consider using renderToReactElement? It seems like the idiomatic way to render tiptap to React. Unfortunately, TipTap provides no other renderers, but I think it would be easy for Claude Code to create I understand that this PR has been open for some time, but I find it very tempting to go fully idiomatic and, at least for React, reduce our custom code to an absolute minimum. And there is some hope that TipTap will provide renderers for other frameworks too at some point, and we can delete even more code. |
|
Example for renderToReactElement({
content: json,
extensions: [StarterKit, MyCustomNodeExtension],
options: {
nodeMapping: {
heading({ node, children }) {
return <h1 className="custom-heading">{children}</h1>
},
// your custom node types work here too
myWidget({ node }) {
return <MyWidgetComponent data={node.attrs} />
},
},
},
}) |
|
@maoberlehner no worries, it's a great suggestion! I share the vision of using the already provided react renderer, and going fully idiomatic per framework, reducing our custom code. That said, for this PR specifically I'd prefer to keep the current approach because:
Even though AI can accelerate the refactor, this kind of internal architecture change still needs thorough review and iteration before shipping to production, so it's not something we'd rush. But definitely something we can revisit later. Would you be ok with tackling the |
586d3ab to
d97f3c9
Compare
sounds good! |
maoberlehner
left a comment
There was a problem hiding this comment.
Huge step forward for our richtext implementation!
My dream is that we have a shared repo with product for all TipTap extensions (they need them too in their code), and that we can use TipTap static renderers, so this repo becomes just a thin wrapper around both. Maybe.. some time.. 🤩
| "type": "link", | ||
| "attrs": { | ||
| "href": "hola@alvarosaburido.dev", | ||
| "href": "jane@example.com", |
There was a problem hiding this comment.
Was checking if example.com is safe to use - it is!
example.com (served at https://example.com/) is a reserved “example” domain that’s maintained by the Internet Assigned Numbers Authority (IANA) for documentation/testing examples, and it’s not available for public registration or transfer.
Nice!
There was a problem hiding this comment.
Oh, honestly it's a big coincidence hahahaha. Great to know!
| blok, | ||
| key, | ||
| }); | ||
| }, |
alexjoverm
left a comment
There was a problem hiding this comment.
(on behalf of Markus) - Approved
Expose segmentStoryblokRichText to render rich text using framework-native rendering.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable autofix in the Cursor dashboard.
| ...customResolvers, | ||
| }; | ||
|
|
||
| const resolver = richTextResolver({ resolvers }); |
There was a problem hiding this comment.
Modified file is dead code after export removal
Low Severity
richTextToHTML.ts was updated to use the new tiptap extension pattern, but the same PR removes it from viteStaticCopy targets and from the client.ts re-export, and drops its type declaration from public.d.ts. The file is now unreachable dead code that was modified unnecessarily. It either needs to be deleted or re-exported if still intended to be part of the public API.
Additional Locations (1)
| <div set:html={renderedRichText} /> | ||
|
|
||
| { | ||
| richTextSegments.map((segment, index) => { |
There was a problem hiding this comment.
@dipankarmaikap with the current logic, how can a user override a mark or a node?
For example, on Vue or React, often the user wants to override a link to render to a RouterLink or a ReactRouter or Next.js Link component, instead of a plain <a> tag.
I'm not sure if Astro has their own Link component for routing, but in any case, sometimes users want to override default marks or nodes and use components instead.
How would they do it here? Can you add an override example here in the playground? For inspiration, you can see the link or heading override in the Next.js playground https://github.com/storyblok/monoblok/pull/369/changes?mode=single#diff-b31212b3df5bff9f904df8f375b02b0018045aeb174c18272c7f007a6720af3fR13-R21


Parsing did not work correctly for certain combinations of
<a>and nested tags inside or around it.To ensure parsing is aligned with the Tiptap editor format, we now use the Tiptap editor's
generateJSONcommand, which gives us HTML-to-richtext conversion out of the box.BREAKING CHANGE: Instead of our own resolver format, we now allow users to override the Tiptap extensions used for parsing.
Fixes WDX-141
Note
Replaces custom HTML/Markdown parsing in
packages/richtextwith Tiptap-based parsing (via@tiptap/html+ extensions), updates tests, and refreshes frameworks/deps across the repo.@tiptap/htmland a full set of Tiptap extensions.node-html-parser,markdown-it-github), add Tiptap packages.19.2.0, Vue to3.5.22, Next-related deps, and Nuxt/Vite stacks in examples/playgrounds.happy-domto Vitest where used and perform miscellaneous lockfile upgrades.Written by Cursor Bugbot for commit f128bb6. This will update automatically on new commits. Configure here.