-
Notifications
You must be signed in to change notification settings - Fork 116
refactor: parse pasted content from html and md #163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Reviewer's GuideRefactors the editor paste pipeline to use AST-based HTML/Markdown parsing with smart, context-aware merging (including tables and code blocks), adjusts Yjs/Slate integration for new table types, and adds comprehensive unit/E2E tests and dependencies to validate the new behavior and support Jest/bundling. Sequence diagram for the new smart paste pipelinesequenceDiagram
actor "User"
participant "BrowserClipboard"
participant "ReactEditor"
participant "withPasted plugin"
participant "HTML parser"
participant "Markdown parser"
participant "Paste context analyzer"
participant "Paste merger"
participant "Yjs/Slate adapter"
"User" ->> "BrowserClipboard": "Press paste (Ctrl+V / Cmd+V)"
"BrowserClipboard" -->> "withPasted plugin": "Provide DataTransfer with 'text/html' and 'text/plain'"
"withPasted plugin" ->> "ReactEditor": "Override 'insertTextData(data)'"
"withPasted plugin" ->> "withPasted plugin": "Read 'html' and 'text' from DataTransfer"
alt "HTML present and non-empty"
"withPasted plugin" ->> "HTML parser": "handleHTMLPaste(editor, html, text)"
"HTML parser" ->> "HTML parser": "sanitizeHTML(html)"
"HTML parser" ->> "HTML parser": "deserializeHTMLToAST(safeHTML)"
"HTML parser" ->> "HTML parser": "elementToBlock() for each element"
"HTML parser" -->> "withPasted plugin": "Return 'ParsedBlock[]'"
else "No HTML, use plain text"
"withPasted plugin" ->> "withPasted plugin": "handlePlainTextPaste(editor, text)"
alt "Single line URL"
"withPasted plugin" ->> "withPasted plugin": "handleURLPaste(editor, url)"
"withPasted plugin" ->> "ReactEditor": "Transforms.insertNodes(link or video or page ref)"
else "Markdown detected"
"withPasted plugin" ->> "Markdown parser": "handleMarkdownPaste(editor, markdown)"
"Markdown parser" ->> "Markdown parser": "unified().use(remarkParse, remarkGfm)"
"Markdown parser" ->> "Markdown parser": "convertMarkdownASTToAppFlowyBlocks()"
"Markdown parser" -->> "withPasted plugin": "Return 'ParsedBlock[]'"
else "Plain multi-line text"
"withPasted plugin" ->> "withPasted plugin": "handleMultiLinePlainText(editor, lines)"
"withPasted plugin" -->> "withPasted plugin": "Build paragraph 'ParsedBlock[]'"
end
end
opt "Parsed blocks available"
"withPasted plugin" ->> "Paste context analyzer": "analyzePasteContext(editor)"
"Paste context analyzer" ->> "Paste context analyzer": "getBlockEntry(), getBlockTextContent()"
"Paste context analyzer" -->> "withPasted plugin": "Return 'PasteContext' (blockType, isEmpty, cursorPosition, canMerge)"
"withPasted plugin" ->> "Paste merger": "smartPaste(editor, parsedBlocks, context)"
"Paste merger" ->> "Paste merger": "beforePasted(editor)"
alt "shouldReplaceBlock(context, count)"
"Paste merger" ->> "Yjs/Slate adapter": "replaceBlockPaste(editor, blocks) via slateContentInsertToYData()"
"Yjs/Slate adapter" ->> "Yjs/Slate adapter": "deleteBlock(oldBlockId)"
else "shouldMergeInline(context, 1)"
"Paste merger" ->> "Paste merger": "inlineMergePaste(editor, firstBlock)"
"Paste merger" ->> "ReactEditor": "Transforms.insertNodes(inline text nodes)"
else "shouldMergeAndAppend(context, count>1)"
"Paste merger" ->> "ReactEditor": "Insert first block inline, splitNodes()"
"Paste merger" ->> "Yjs/Slate adapter": "appendBlocksPaste(editor, remainingBlocks)"
else "Default append"
"Paste merger" ->> "Yjs/Slate adapter": "appendBlocksPaste(editor, blocks)"
end
"Yjs/Slate adapter" ->> "Yjs/Slate adapter": "findSlateEntryByBlockId(), slateContentInsertToYData()"
"Yjs/Slate adapter" ->> "ReactEditor": "focusBlock(lastBlockId) and select end()"
end
"ReactEditor" -->> "User": "Display pasted content with correct structure and formatting"
Flow diagram for smart paste strategy selectionflowchart TD
A["Start smartPaste(editor, blocks, context)"]
B["Call 'beforePasted(editor)' and check 'blocks.length > 0'"]
C{"shouldReplaceBlock(context, blocks.length)?"}
D["replaceBlockPaste(editor, blocks)\nInsert via 'slateContentInsertToYData' at current index\nDelete original block\nFocus last block"]
E{"shouldMergeInline(context, blocks.length)?"}
F["inlineMergePaste(editor, firstBlock)\nConvert 'text'+'formats' to inline 'Text[]'\n'Transforms.insertNodes' at selection"]
G{"shouldMergeAndAppend(context, blocks.length)?"}
H["mergeAndAppendPaste(editor, blocks)\nInsert inline text from firstBlock\n'SplitNodes' at cursor\nAppend remaining blocks via Yjs"]
I["appendBlocksPaste(editor, blocks)\nInsert all blocks after current block via Yjs\nFocus last block"]
J["Return 'true' or 'false' based on success"]
A --> B
B -->|"'beforePasted' failed or no blocks"| J
B -->|"OK"| C
C -->|"Yes"| D
C -->|"No"| E
D --> J
E -->|"Yes"| F
E -->|"No"| G
F --> J
G -->|"Yes"| H
G -->|"No"| I
H --> J
I --> J
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes - here's some feedback:
- In withPasted, the single-line plain-text path that falls through to Transforms.insertNodes (non-URL, non-Markdown) no longer calls beforePasted, which may skip Yjs-related pre-paste logic that other paths still rely on; consider routing this through smartPaste or explicitly invoking beforePasted for consistency.
- The pasteContent Cypress helper digs into React internals via __reactFiber / __reactInternalInstance traversal to reach the editor, which is quite brittle across React upgrades; consider using more stable hooks (e.g., triggering the native paste event only) or exposing a test-only interface instead of depending on private fiber fields.
- The new paste pipeline adds multiple console.log/console.debug style logs (e.g., in withPasted, html-parser, paste-merger) that will fire on every paste; consider gating these behind a debug flag or removing them to avoid noisy logs in production.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In withPasted, the single-line plain-text path that falls through to Transforms.insertNodes (non-URL, non-Markdown) no longer calls beforePasted, which may skip Yjs-related pre-paste logic that other paths still rely on; consider routing this through smartPaste or explicitly invoking beforePasted for consistency.
- The pasteContent Cypress helper digs into React internals via __reactFiber / __reactInternalInstance traversal to reach the editor, which is quite brittle across React upgrades; consider using more stable hooks (e.g., triggering the native paste event only) or exposing a test-only interface instead of depending on private fiber fields.
- The new paste pipeline adds multiple console.log/console.debug style logs (e.g., in withPasted, html-parser, paste-merger) that will fire on every paste; consider gating these behind a debug flag or removing them to avoid noisy logs in production.
## Individual Comments
### Comment 1
<location> `src/components/editor/plugins/withPasted.ts:32-38` </location>
<code_context>
- const lines = text.split(/\r\n|\r|\n/);
+ // Priority 1: HTML (if available)
+ if (html && html.trim().length > 0) {
+ console.log('[AppFlowy] Handling HTML paste', html);
+ return handleHTMLPaste(editor, html, text);
+ }
- const html = data.getData('text/html');
+ // Priority 2: Plain text
+ if (text && text.trim().length > 0) {
+ console.log('[AppFlowy] Handling Plain Text paste', text);
+ return handlePlainTextPaste(editor, text);
+ }
</code_context>
<issue_to_address>
**🚨 suggestion (security):** Avoid unconditional console logging of full pasted contents in production code.
Both HTML and plain-text paste handlers log the full clipboard contents on every paste, which can clutter production logs and expose sensitive user data. Please either remove these logs, guard them with a debug flag, or log only minimal metadata (e.g., content type and length) instead of the full payload.
</issue_to_address>
### Comment 2
<location> `src/components/editor/plugins/withPasted.ts:37-46` </location>
<code_context>
- console.debug('insert HTML Data', html);
- const fragment = deserializeHTML(html) as Node[];
+ // Special case: Single line
+ if (lineLength === 1) {
+ const isUrl = !!processUrl(text);
- insertFragment(editor, fragment);
+ if (isUrl) {
+ return handleURLPaste(editor, text);
+ }
- return true;
- }
+ // Check if it's Markdown (even for single line)
+ if (detectMarkdown(text)) {
+ return handleMarkdownPaste(editor, text);
+ }
</code_context>
<issue_to_address>
**issue (bug_risk):** Markdown detection on plain-text paste might interfere with code block usage.
Currently, plain-text pastes are always passed through `detectMarkdown`, even inside a `CodeBlock`. In code blocks, users generally expect the text to be inserted verbatim, even if it contains Markdown symbols. Consider skipping Markdown handling when the current block is a `CodeBlock` (e.g., via `getBlockEntry`) and instead insert the raw text to avoid unintended formatting there.
</issue_to_address>
### Comment 3
<location> `src/components/editor/plugins/withPasted.ts:148-157` </location>
<code_context>
+function handleURLPaste(editor: ReactEditor, url: string): boolean {
</code_context>
<issue_to_address>
**issue (bug_risk):** URL-specific paste path bypasses the shared `beforePasted` hook.
`handleURLPaste` → `insertBlock` skips `beforePasted`, while other structured paste paths (e.g. `smartPaste`) run through it. If `beforePasted` is relied on for collaboration sync, validation, or undo grouping, URL pastes may behave inconsistently. Consider calling `beforePasted` (or an equivalent guard) before inserting mention/video/link preview blocks to keep URL pastes aligned with the main paste pipeline.
</issue_to_address>
### Comment 4
<location> `src/components/editor/parsers/__tests__/markdown-parser.test.ts:219` </location>
<code_context>
+ });
+
+ describe('parseList', () => {
+ it('should parse unordered list', () => {
+ const node: HastElement = {
+ type: 'element',
</code_context>
<issue_to_address>
**suggestion (testing):** Add a unit test that covers the special bullet character (•) pre-processing logic in parseMarkdown.
Since `parseMarkdown` normalizes lines starting with `•` into `-` before passing them to remark, but current tests only cover `-` and `*`, please add a unit test using input like `• Item 1` / `• Item 2`. The test should verify that the result items are `BlockType.BulletedListBlock` and that the returned text no longer contains the original `•` characters. This will directly validate the pre-processing behavior in addition to the existing Cypress coverage in `paste-lists.cy.ts`.
</issue_to_address>
### Comment 5
<location> `src/components/editor/parsers/__tests__/markdown-parser.test.ts:354-364` </location>
<code_context>
+ expect(blocks[0].children).toHaveLength(3);
+ });
+
+ it('should disable GFM when option is false', () => {
+ const markdown = '~~strikethrough~~';
+ const blocks = parseMarkdown(markdown, { gfm: false });
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding tests for how tables behave when GFM is disabled to fully validate the options surface.
You already verify that strikethrough is not parsed when `gfm: false`. To round out the configuration coverage, please also add a test that:
- passes a Markdown table into `parseMarkdown` with `{ gfm: false }`, and
- asserts that the result does not produce `BlockType.SimpleTableBlock` (e.g., it falls back to paragraphs).
This keeps `gfm` behavior consistent across all GFM-only constructs, not just strikethrough.
```suggestion
it('should disable GFM when option is false', () => {
const markdown = '~~strikethrough~~';
const blocks = parseMarkdown(markdown, { gfm: false });
// Without GFM, strikethrough should not be parsed
expect(blocks[0].formats.some((f) => f.type === 'strikethrough')).toBe(false);
});
it('should not parse tables as SimpleTableBlock when GFM is disabled', () => {
const markdown = `
| Col1 | Col2 |
| ---- | ---- |
| A | B |
`.trim();
const blocks = parseMarkdown(markdown, { gfm: false });
// Without GFM, tables should not be parsed as SimpleTableBlock
expect(blocks[0].type).not.toBe(BlockType.SimpleTableBlock);
// It should fall back to a regular paragraph (or equivalent non-table block)
expect(blocks[0].type).toBe(BlockType.ParagraphBlock);
});
it('should handle inline code in headings', () => {
const markdown = '# Heading with `code`';
const blocks = parseMarkdown(markdown);
```
</issue_to_address>
### Comment 6
<location> `cypress/support/paste-utils.ts:15` </location>
<code_context>
+ * Helper function to paste content and wait for processing
+ * Directly calls Slate editor's insertData method to bypass event system
+ */
+export const pasteContent = (html: string, plainText: string) => {
+ // Wait for editors to be available
+ cy.get('[contenteditable="true"]').should('have.length.at.least', 1);
</code_context>
<issue_to_address>
**issue (complexity):** Consider refactoring pasteContent into smaller helper functions to separate editor lookup, Slate introspection, and fallback paste logic for better readability and maintainability.
The main complexity is concentrated in `pasteContent`. You can keep all behavior (including the direct `insertData` path) but make it much easier to read/maintain by extracting responsibilities and de-duplicating logic.
Concretely:
1) Extract editor lookup into a small helper
This makes the main flow more self‑describing and removes inline DOM filtering noise:
```ts
const findMainEditor = (): Cypress.Chainable<HTMLElement> => {
return cy.get('[contenteditable="true"]').then($editors => {
let targetEditor: HTMLElement | null = null;
$editors.each((_index: number, el: HTMLElement) => {
const $el = Cypress.$(el);
if (!$el.attr('data-testid')?.includes('title') && !$el.hasClass('editor-title')) {
targetEditor = el;
return false;
}
});
if (!targetEditor && $editors.length > 0) {
targetEditor = $editors.last()[0];
}
if (!targetEditor) {
throw new Error('No editor found');
}
return targetEditor;
});
};
```
2) Extract React/Slate introspection into a dedicated helper
This keeps all the current behavior (including the fiber traversal), but isolates the fragile logic so the main helper reads linearly:
```ts
const getSlateEditor = (targetEditor: HTMLElement, win: Window): any | null => {
const editorKey = Object.keys(targetEditor).find(key =>
key.startsWith('__reactFiber') || key.startsWith('__reactInternalInstance')
);
if (!editorKey) return null;
let currentFiber: any = (targetEditor as any)[editorKey];
let depth = 0;
while (currentFiber && depth < 50) {
const props = currentFiber.memoizedProps;
const stateNode = currentFiber.stateNode;
if (props?.editor) return props.editor;
if (stateNode?.editor) return stateNode.editor;
currentFiber = currentFiber.return;
depth++;
}
return null;
};
```
3) Extract the Cypress fallback trigger to avoid duplication
```ts
const triggerPasteFallback = (
targetEditor: HTMLElement,
html: string,
plainText: string
) => {
cy.wrap(targetEditor).trigger('paste', {
clipboardData: {
getData: (type: string) => {
if (type === 'text/html') return html;
if (type === 'text/plain') return plainText;
return '';
},
types: ['text/html', 'text/plain']
},
bubbles: true,
cancelable: true
});
};
```
4) Rewrite `pasteContent` using the helpers
Same functionality, but with clear primary vs fallback paths and no repeated logic:
```ts
export const pasteContent = (html: string, plainText: string) => {
cy.get('[contenteditable="true"]').should('have.length.at.least', 1);
findMainEditor().then(targetEditor => {
cy.wrap(targetEditor).click({ force: true }).focus();
cy.window().then(win => {
const slateEditor = getSlateEditor(targetEditor, win);
if (slateEditor && typeof slateEditor.insertData === 'function') {
const dataTransfer = new win.DataTransfer();
if (html) dataTransfer.setData('text/html', html);
if (plainText) {
dataTransfer.setData('text/plain', plainText);
} else if (!html) {
dataTransfer.setData('text/plain', '');
}
slateEditor.insertData(dataTransfer);
} else {
triggerPasteFallback(targetEditor, html, plainText);
}
});
});
cy.wait(1500);
};
```
This keeps:
- Direct `insertData` usage when the Slate editor is discoverable.
- Cypress `trigger('paste', …)` as a fallback.
- All existing selection/focus and waiting behavior.
But it:
- Removes deep nested branches from `pasteContent`.
- Clearly separates “find editor”, “get Slate editor via React internals”, and “simulate paste”.
- Centralizes the fragile React introspection in one small, well‑named helper so future changes are localized.
</issue_to_address>
### Comment 7
<location> `src/components/editor/utils/paste-merger.ts:173` </location>
<code_context>
+/**
+ * Strategy 1: Replace the current empty block with pasted content
+ */
+function replaceBlockPaste(editor: ReactEditor, blocks: ParsedBlock[]): boolean {
+ const slateNodes = parsedBlocksToSlateNodes(blocks);
+ const point = editor.selection?.anchor;
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting the shared Yjs insertion and focus logic into reusable helpers so the paste strategy functions stay thin and only decide where to insert content.
You can reduce the complexity (and duplication) without changing behavior by extracting the repeated Yjs insertion plumbing into a focused helper, and keeping the strategy functions as thin as possible.
Right now, `replaceBlockPaste` and `appendBlocksPaste` both do:
- resolve selection → blockId
- get `sharedRoot`, `block`, `parent`, `parentChildren`
- compute index
- get `doc`
- call `slateContentInsertToYData`
- track `lastBlockId`
- focus last inserted block
You can move that into a reusable helper and have each strategy only decide *where* to insert (replace vs after).
Example refactor (core idea):
```ts
type InsertPosition = 'replace' | 'after';
function insertSlateNodesAtBlock(
editor: YjsEditor,
blockId: string,
slateNodes: Node[],
position: InsertPosition,
opts: { deleteOriginal?: boolean } = {}
): string | null {
const sharedRoot = getSharedRoot(editor);
const block = getBlock(blockId, sharedRoot);
const parentId = block.get(YjsEditorKey.block_parent);
const parent = getBlock(parentId, sharedRoot);
const parentChildren = getChildrenArray(parent.get(YjsEditorKey.block_children), sharedRoot);
const index = parentChildren.toArray().findIndex((id) => id === blockId);
if (index === -1) return null;
const doc = assertDocExists(sharedRoot);
let lastBlockId: string | null = blockId;
doc.transact(() => {
const insertIndex = position === 'replace' ? index : index + 1;
const newBlockIds = slateContentInsertToYData(parentId, insertIndex, slateNodes, doc);
lastBlockId = newBlockIds[newBlockIds.length - 1];
if (opts.deleteOriginal) {
deleteBlock(sharedRoot, blockId);
}
});
return lastBlockId;
}
function focusBlock(editor: YjsEditor, blockId: string): void {
setTimeout(() => {
try {
const entry = findSlateEntryByBlockId(editor, blockId);
if (!entry) return;
const [, path] = entry;
const point = editor.end(path);
editor.select(point);
} catch (e) {
console.error('Error focusing block:', e);
}
}, 50);
}
```
Then `replaceBlockPaste` and `appendBlocksPaste` become much simpler and easier to reason about, while preserving all behavior:
```ts
function replaceBlockPaste(editor: ReactEditor, blocks: ParsedBlock[]): boolean {
const slateNodes = parsedBlocksToSlateNodes(blocks);
const point = editor.selection?.anchor;
if (!point) return false;
const entry = getBlockEntry(editor as YjsEditor, point);
if (!entry) return false;
const [node] = entry;
const blockId = (node as { blockId?: string }).blockId;
if (!blockId) return false;
const lastBlockId = insertSlateNodesAtBlock(
editor as YjsEditor,
blockId,
slateNodes,
'replace',
{ deleteOriginal: true }
);
if (!lastBlockId) return false;
focusBlock(editor as YjsEditor, lastBlockId);
return true;
}
function appendBlocksPaste(editor: ReactEditor, blocks: ParsedBlock[]): boolean {
const slateNodes = parsedBlocksToSlateNodes(blocks);
const point = editor.selection?.anchor;
if (!point) return false;
const entry = getBlockEntry(editor as YjsEditor, point);
if (!entry) return false;
const [node] = entry;
const blockId = (node as { blockId?: string }).blockId;
if (!blockId) return false;
const lastBlockId = insertSlateNodesAtBlock(
editor as YjsEditor,
blockId,
slateNodes,
'after'
);
if (!lastBlockId) return false;
focusBlock(editor as YjsEditor, lastBlockId);
return true;
}
```
This:
- Removes the non-trivial Yjs plumbing duplication.
- Keeps `smartPaste` strategies focused on “what to do” (replace vs append) rather than “how to wire Yjs/Slate.”
- Centralizes the transaction and focus behavior in one place, making future changes (e.g., different focus behavior, error handling, or metrics) much easier and less error-prone.
If you want to go a step further, you could also extract the “resolve current blockId from selection” into a small helper:
```ts
function getCurrentBlockId(editor: ReactEditor): string | null {
const point = editor.selection?.anchor;
if (!point) return null;
const entry = getBlockEntry(editor as YjsEditor, point);
if (!entry) return null;
const [node] = entry;
return (node as { blockId?: string }).blockId ?? null;
}
```
Then both strategies can just call `getCurrentBlockId(editor)` and pass the result into `insertSlateNodesAtBlock`, further simplifying the file without changing any behavior.
</issue_to_address>
### Comment 8
<location> `src/components/editor/parsers/html-parser.ts:44-46` </location>
<code_context>
const blocks = convertASTToAppFlowyBlocks(tree.children as HastElement[], options);
return blocks;
</code_context>
<issue_to_address>
**suggestion (code-quality):** Inline variable that is immediately returned ([`inline-immediately-returned-variable`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/TypeScript/Default-Rules/inline-immediately-returned-variable))
```suggestion
return convertASTToAppFlowyBlocks(tree.children as HastElement[], options);
```
<br/><details><summary>Explanation</summary>Something that we often see in people's code is assigning to a result variable
and then immediately returning it.
Returning the result directly shortens the code and removes an unnecessary
variable, reducing the mental load of reading the function.
Where intermediate variables can be useful is if they then get used as a
parameter or a condition, and the name can act like a comment on what the
variable represents. In the case where you're returning it from a function, the
function name is there to tell you what the result is, so the variable name
is unnecessary.
</details>
</issue_to_address>
### Comment 9
<location> `src/components/editor/parsers/html-parser.ts:88-110` </location>
<code_context>
if (block) {
// Handle array of blocks (e.g. flattened lists)
if (Array.isArray(block)) {
// Recursively process nested structures if needed (though flattened blocks are usually leaves)
blocks.push(...block);
} else {
// For blocks with children (like lists that weren't flattened), recursively process
// Note: parseList now returns array, so this branch is less common for lists
if (block.children.length > 0) {
// Children already processed in block converter for some types
blocks.push(block);
} else {
blocks.push(block);
}
}
} else {
// If element couldn't be converted, try processing its children
if (node.children && node.children.length > 0) {
const childBlocks = convertASTToAppFlowyBlocks(node.children as HastElement[], options, depth + 1);
blocks.push(...childBlocks);
}
}
</code_context>
<issue_to_address>
**suggestion (code-quality):** Merge else clause's nested if statement into `else if` ([`merge-else-if`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/TypeScript/Default-Rules/merge-else-if))
```suggestion
if (block) {
// Handle array of blocks (e.g. flattened lists)
if (Array.isArray(block)) {
// Recursively process nested structures if needed (though flattened blocks are usually leaves)
blocks.push(...block);
} else {
// For blocks with children (like lists that weren't flattened), recursively process
// Note: parseList now returns array, so this branch is less common for lists
if (block.children.length > 0) {
// Children already processed in block converter for some types
blocks.push(block);
} else {
blocks.push(block);
}
}
}
else if (node.children && node.children.length > 0) {
const childBlocks = convertASTToAppFlowyBlocks(node.children as HastElement[], options, depth + 1);
blocks.push(...childBlocks);
}
```
<br/><details><summary>Explanation</summary>Flattening if statements nested within else clauses generates code that is
easier to read and expand upon.
</details>
</issue_to_address>
### Comment 10
<location> `src/components/editor/parsers/html-parser.ts:90-102` </location>
<code_context>
if (Array.isArray(block)) {
// Recursively process nested structures if needed (though flattened blocks are usually leaves)
blocks.push(...block);
} else {
// For blocks with children (like lists that weren't flattened), recursively process
// Note: parseList now returns array, so this branch is less common for lists
if (block.children.length > 0) {
// Children already processed in block converter for some types
blocks.push(block);
} else {
blocks.push(block);
}
}
</code_context>
<issue_to_address>
**suggestion (code-quality):** Merge else clause's nested if statement into `else if` ([`merge-else-if`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/TypeScript/Default-Rules/merge-else-if))
```suggestion
if (Array.isArray(block)) {
// Recursively process nested structures if needed (though flattened blocks are usually leaves)
blocks.push(...block);
}
else if (block.children.length > 0) {
// Children already processed in block converter for some types
blocks.push(block);
}
else {
blocks.push(block);
}
```
<br/><details><summary>Explanation</summary>Flattening if statements nested within else clauses generates code that is
easier to read and expand upon.
</details>
</issue_to_address>
### Comment 11
<location> `src/components/editor/parsers/html-parser.ts:145-163` </location>
<code_context>
</code_context>
<issue_to_address>
**issue (code-quality):** Avoid function declarations, favouring function assignment expressions, inside blocks. ([`avoid-function-declarations-in-blocks`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/TypeScript/Default-Rules/avoid-function-declarations-in-blocks))
<details><summary>Explanation</summary>Function declarations may be hoisted in Javascript, but the behaviour is inconsistent between browsers.
Hoisting is generally confusing and should be avoided. Rather than using function declarations inside blocks, you
should use function expressions, which create functions in-scope.
</details>
</issue_to_address>
### Comment 12
<location> `src/components/editor/parsers/inline-converters.ts:35-179` </location>
<code_context>
</code_context>
<issue_to_address>
**issue (code-quality):** Avoid function declarations, favouring function assignment expressions, inside blocks. ([`avoid-function-declarations-in-blocks`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/TypeScript/Default-Rules/avoid-function-declarations-in-blocks))
<details><summary>Explanation</summary>Function declarations may be hoisted in Javascript, but the behaviour is inconsistent between browsers.
Hoisting is generally confusing and should be avoided. Rather than using function declarations inside blocks, you
should use function expressions, which create functions in-scope.
</details>
</issue_to_address>
### Comment 13
<location> `src/components/editor/parsers/markdown-parser.ts:58-60` </location>
<code_context>
const blocks = convertMarkdownASTToAppFlowyBlocks(ast.children as BlockContent[]);
return blocks;
</code_context>
<issue_to_address>
**suggestion (code-quality):** Inline variable that is immediately returned ([`inline-immediately-returned-variable`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/TypeScript/Default-Rules/inline-immediately-returned-variable))
```suggestion
return convertMarkdownASTToAppFlowyBlocks(ast.children as BlockContent[]);
```
<br/><details><summary>Explanation</summary>Something that we often see in people's code is assigning to a result variable
and then immediately returning it.
Returning the result directly shortens the code and removes an unnecessary
variable, reducing the mental load of reading the function.
Where intermediate variables can be useful is if they then get used as a
parameter or a condition, and the name can act like a comment on what the
variable represents. In the case where you're returning it from a function, the
function name is there to tell you what the result is, so the variable name
is unnecessary.
</details>
</issue_to_address>
### Comment 14
<location> `src/components/editor/parsers/mdast-utils.ts:36-118` </location>
<code_context>
</code_context>
<issue_to_address>
**issue (code-quality):** Avoid function declarations, favouring function assignment expressions, inside blocks. ([`avoid-function-declarations-in-blocks`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/TypeScript/Default-Rules/avoid-function-declarations-in-blocks))
<details><summary>Explanation</summary>Function declarations may be hoisted in Javascript, but the behaviour is inconsistent between browsers.
Hoisting is generally confusing and should be avoided. Rather than using function declarations inside blocks, you
should use function expressions, which create functions in-scope.
</details>
</issue_to_address>
### Comment 15
<location> `src/components/editor/parsers/sanitize.ts:80-82` </location>
<code_context>
const sanitized = DOMPurify.sanitize(html, SANITIZE_CONFIG);
return sanitized;
</code_context>
<issue_to_address>
**suggestion (code-quality):** Inline variable that is immediately returned ([`inline-immediately-returned-variable`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/TypeScript/Default-Rules/inline-immediately-returned-variable))
```suggestion
return DOMPurify.sanitize(html, SANITIZE_CONFIG);
```
<br/><details><summary>Explanation</summary>Something that we often see in people's code is assigning to a result variable
and then immediately returning it.
Returning the result directly shortens the code and removes an unnecessary
variable, reducing the mental load of reading the function.
Where intermediate variables can be useful is if they then get used as a
parameter or a condition, and the name can act like a comment on what the
variable represents. In the case where you're returning it from a function, the
function name is there to tell you what the result is, so the variable name
is unnecessary.
</details>
</issue_to_address>
### Comment 16
<location> `src/components/editor/parsers/sanitize.ts:150-155` </location>
<code_context>
if (key && value && allowed.includes(key.toLowerCase())) {
// Basic validation of CSS values (prevent expressions)
if (!/javascript|expression|@import|url\(/i.test(value)) {
result[key.toLowerCase()] = value;
}
}
</code_context>
<issue_to_address>
**suggestion (code-quality):** Merge nested if conditions ([`merge-nested-ifs`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/TypeScript/Default-Rules/merge-nested-ifs))
```suggestion
if (key && value && allowed.includes(key.toLowerCase()) && !/javascript|expression|@import|url\(/i.test(value)) {
result[key.toLowerCase()] = value;
}
```
<br/><details><summary>Explanation</summary>Reading deeply nested conditional code is confusing, since you have to keep track of which
conditions relate to which levels. We therefore strive to reduce nesting where
possible, and the situation where two `if` conditions can be combined using
`and` is an easy win.
</details>
</issue_to_address>
### Comment 17
<location> `src/components/editor/utils/markdown-detector.ts:79-81` </location>
<code_context>
const density = Math.min(totalMatches / (lines * 0.5), 1);
return density;
</code_context>
<issue_to_address>
**suggestion (code-quality):** Inline variable that is immediately returned ([`inline-immediately-returned-variable`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/TypeScript/Default-Rules/inline-immediately-returned-variable))
```suggestion
return Math.min(totalMatches / (lines * 0.5), 1);
```
<br/><details><summary>Explanation</summary>Something that we often see in people's code is assigning to a result variable
and then immediately returning it.
Returning the result directly shortens the code and removes an unnecessary
variable, reducing the mental load of reading the function.
Where intermediate variables can be useful is if they then get used as a
parameter or a condition, and the name can act like a comment on what the
variable represents. In the case where you're returning it from a function, the
function name is there to tell you what the result is, so the variable name
is unnecessary.
</details>
</issue_to_address>
### Comment 18
<location> `src/components/editor/utils/paste-merger.ts:184` </location>
<code_context>
const blockId = (node as { blockId?: string }).blockId;
</code_context>
<issue_to_address>
**suggestion (code-quality):** Prefer object destructuring when accessing and using properties. ([`use-object-destructuring`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/TypeScript/Default-Rules/use-object-destructuring))
```suggestion
const {blockId} = node as { blockId?: string };
```
<br/><details><summary>Explanation</summary>Object destructuring can often remove an unnecessary temporary reference, as well as making your code more succinct.
From the [Airbnb Javascript Style Guide](https://airbnb.io/javascript/#destructuring--object)
</details>
</issue_to_address>
### Comment 19
<location> `src/components/editor/utils/paste-merger.ts:268` </location>
<code_context>
const blockId = (node as { blockId?: string }).blockId;
</code_context>
<issue_to_address>
**suggestion (code-quality):** Prefer object destructuring when accessing and using properties. ([`use-object-destructuring`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/TypeScript/Default-Rules/use-object-destructuring))
```suggestion
const {blockId} = node as { blockId?: string };
```
<br/><details><summary>Explanation</summary>Object destructuring can often remove an unnecessary temporary reference, as well as making your code more succinct.
From the [Airbnb Javascript Style Guide](https://airbnb.io/javascript/#destructuring--object)
</details>
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| if (text && text.trim().length > 0) { | ||
| console.log('[AppFlowy] Handling Plain Text paste', text); | ||
| return handlePlainTextPaste(editor, text); | ||
| } | ||
|
|
||
| const lineLength = lines.filter(Boolean).length; | ||
| const point = editor.selection?.anchor as BasePoint; | ||
| const entry = getBlockEntry(editor as YjsEditor, point); | ||
|
|
||
| if (!entry) return false; | ||
|
|
||
| const [node] = entry; | ||
|
|
||
| if (lineLength === 1) { | ||
| const isUrl = !!processUrl(text); | ||
|
|
||
| if (isUrl) { | ||
| const isAppFlowyLinkUrl = isURL(text, { | ||
| host_whitelist: [window.location.hostname], | ||
| }); | ||
|
|
||
| if (isAppFlowyLinkUrl) { | ||
| const url = new URL(text); | ||
| const blockId = url.searchParams.get('blockId'); | ||
|
|
||
| if (blockId) { | ||
| const pageId = url.pathname.split('/').pop(); | ||
| const point = editor.selection?.anchor as BasePoint; | ||
|
|
||
| Transforms.insertNodes( | ||
| editor, | ||
| { | ||
| text: '@', | ||
| mention: { | ||
| type: MentionType.PageRef, | ||
| page_id: pageId, | ||
| block_id: blockId, | ||
| }, | ||
| }, | ||
| { at: point, select: true, voids: false } | ||
| ); | ||
|
|
||
| return true; | ||
| } | ||
| } | ||
|
|
||
| const isVideoUrl = isURL(text, { | ||
| host_whitelist: ['youtube.com', 'www.youtube.com', 'youtu.be', 'vimeo.com'], | ||
| }); | ||
|
|
||
| if (isVideoUrl) { | ||
| insertFragment(editor, [ | ||
| { | ||
| type: BlockType.VideoBlock, | ||
| data: { url: text } as VideoBlockData, | ||
| children: [ | ||
| { | ||
| text: '', | ||
| }, | ||
| ], | ||
| }, | ||
| ]); | ||
| return true; | ||
| } | ||
|
|
||
| insertFragment(editor, [ | ||
| { | ||
| type: BlockType.LinkPreview, | ||
| data: { url: text } as LinkPreviewBlockData, | ||
| children: [{ text: '' }], | ||
| }, | ||
| ]); | ||
| return false; | ||
| }; | ||
|
|
||
| return true; | ||
| } | ||
| } | ||
| return editor; | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (bug_risk): Markdown detection on plain-text paste might interfere with code block usage.
Currently, plain-text pastes are always passed through detectMarkdown, even inside a CodeBlock. In code blocks, users generally expect the text to be inserted verbatim, even if it contains Markdown symbols. Consider skipping Markdown handling when the current block is a CodeBlock (e.g., via getBlockEntry) and instead insert the raw text to avoid unintended formatting there.
| }); | ||
| }); | ||
|
|
||
| it('should parse unordered list', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (testing): Add a unit test that covers the special bullet character (•) pre-processing logic in parseMarkdown.
Since parseMarkdown normalizes lines starting with • into - before passing them to remark, but current tests only cover - and *, please add a unit test using input like • Item 1 / • Item 2. The test should verify that the result items are BlockType.BulletedListBlock and that the returned text no longer contains the original • characters. This will directly validate the pre-processing behavior in addition to the existing Cypress coverage in paste-lists.cy.ts.
| it('should disable GFM when option is false', () => { | ||
| const markdown = '~~strikethrough~~'; | ||
| const blocks = parseMarkdown(markdown, { gfm: false }); | ||
|
|
||
| // Without GFM, strikethrough should not be parsed | ||
| expect(blocks[0].formats.some((f) => f.type === 'strikethrough')).toBe(false); | ||
| }); | ||
|
|
||
| it('should handle inline code in headings', () => { | ||
| const markdown = '# Heading with `code`'; | ||
| const blocks = parseMarkdown(markdown); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (testing): Consider adding tests for how tables behave when GFM is disabled to fully validate the options surface.
You already verify that strikethrough is not parsed when gfm: false. To round out the configuration coverage, please also add a test that:
- passes a Markdown table into
parseMarkdownwith{ gfm: false }, and - asserts that the result does not produce
BlockType.SimpleTableBlock(e.g., it falls back to paragraphs).
This keepsgfmbehavior consistent across all GFM-only constructs, not just strikethrough.
| it('should disable GFM when option is false', () => { | |
| const markdown = '~~strikethrough~~'; | |
| const blocks = parseMarkdown(markdown, { gfm: false }); | |
| // Without GFM, strikethrough should not be parsed | |
| expect(blocks[0].formats.some((f) => f.type === 'strikethrough')).toBe(false); | |
| }); | |
| it('should handle inline code in headings', () => { | |
| const markdown = '# Heading with `code`'; | |
| const blocks = parseMarkdown(markdown); | |
| it('should disable GFM when option is false', () => { | |
| const markdown = '~~strikethrough~~'; | |
| const blocks = parseMarkdown(markdown, { gfm: false }); | |
| // Without GFM, strikethrough should not be parsed | |
| expect(blocks[0].formats.some((f) => f.type === 'strikethrough')).toBe(false); | |
| }); | |
| it('should not parse tables as SimpleTableBlock when GFM is disabled', () => { | |
| const markdown = ` | |
| | Col1 | Col2 | | |
| | ---- | ---- | | |
| | A | B | | |
| `.trim(); | |
| const blocks = parseMarkdown(markdown, { gfm: false }); | |
| // Without GFM, tables should not be parsed as SimpleTableBlock | |
| expect(blocks[0].type).not.toBe(BlockType.SimpleTableBlock); | |
| // It should fall back to a regular paragraph (or equivalent non-table block) | |
| expect(blocks[0].type).toBe(BlockType.ParagraphBlock); | |
| }); | |
| it('should handle inline code in headings', () => { | |
| const markdown = '# Heading with `code`'; | |
| const blocks = parseMarkdown(markdown); |
| if (key && value && allowed.includes(key.toLowerCase())) { | ||
| // Basic validation of CSS values (prevent expressions) | ||
| if (!/javascript|expression|@import|url\(/i.test(value)) { | ||
| result[key.toLowerCase()] = value; | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (code-quality): Merge nested if conditions (merge-nested-ifs)
| if (key && value && allowed.includes(key.toLowerCase())) { | |
| // Basic validation of CSS values (prevent expressions) | |
| if (!/javascript|expression|@import|url\(/i.test(value)) { | |
| result[key.toLowerCase()] = value; | |
| } | |
| } | |
| if (key && value && allowed.includes(key.toLowerCase()) && !/javascript|expression|@import|url\(/i.test(value)) { | |
| result[key.toLowerCase()] = value; | |
| } | |
Explanation
Reading deeply nested conditional code is confusing, since you have to keep track of whichconditions relate to which levels. We therefore strive to reduce nesting where
possible, and the situation where two
if conditions can be combined usingand is an easy win.
* refactor: paste text and add test for pasting * fix: paste bullet list with empty line * chore: lint * chore: fmt code * chore: fmt jest test * chore: fix test * chore: lint * chore: fix table
Description
Checklist
General
Testing
Feature-Specific
Summary by Sourcery
Improve rich paste handling in the editor with structured HTML/Markdown parsing, smarter context-aware insertion, and extended content type support (lists, code, tables, links, and media), along with new tests and dependencies to validate and support the behavior.
New Features:
Bug Fixes:
Enhancements:
Build:
Tests: