Merged
@@ -178,6 +178,8 @@ function markdownToPlainText(md: string) {
.replace(/`(.*?)`/g, '$1')
// Remove code blocks ```code```
.replace(/```[\s\S]*?```/g, '')
// Remove HTML tags
.replace(/<[^>]+>/g, '')
// Remove extra newlines
.replace(/\n{2,}/g, '\n')
.trim()

Your Markdown-to-plain-text conversion function is generally well written and efficient. However, there are a few things to consider for improvement:

  1. Code Block Handling: The current code block removal logic works for well-formed fences, but a comment explaining the regular expression would help: [\s\S]*? lazily matches any characters, including newlines, up to the next closing fence.

  2. HTML Tag Removal: The regex /<[^>]+>/g removes simple tags, but it stops at the first > it sees, so a tag whose quoted attribute value contains > is only partly removed. A quote-aware pattern such as /<(?:[^"'>]|"[^"]*"|'[^']*')*>/g handles that case. Note that Markdown permits inline HTML, and that either pattern strips only the tags themselves: for <script>alert('XSS')</script> the text alert('XSS') remains, so this step is text cleanup, not HTML sanitization.

  3. Newline Optimization: The /\n{2,}/g replacement collapses any run of two or more newlines into a single newline, which also removes the blank lines between paragraphs. It is not strictly required, but it does no harm if single-spaced output is what you want.

  4. Comments:

    • Add comments above the .replace(/```[\s\S]*?```/g, '') line explaining that it removes multi-line code blocks.
    • Consider adding a comment above the .replace(/<(?:[^"'>]|"[^"]*"|'[^']*')*>/g, '') line explaining that it removes HTML tags, including tags with quoted attribute values.
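
To make point 2 concrete, here is a minimal sketch comparing the tag pattern from the diff with the quote-aware pattern mentioned in point 4. The sample string and the variable names are assumptions chosen for illustration; they do not come from the PR.

```typescript
// Simple pattern from the diff: stops at the first '>' it encounters.
const simpleTag = /<[^>]+>/g;
// Quote-aware pattern: treats quoted attribute values as single units.
const quoteAwareTag = /<(?:[^"'>]|"[^"]*"|'[^']*')*>/g;

// Hypothetical input: the title attribute contains a literal '>'.
const sample = 'Hover <span title="a > b">here</span> please';

// The simple pattern cuts the opening tag short and leaves debris behind.
const withSimple = sample.replace(simpleTag, '');
// The quote-aware pattern removes the whole opening tag.
const withQuoteAware = sample.replace(quoteAwareTag, '');

console.log(JSON.stringify(withSimple));     // "Hover  b\">here please"
console.log(JSON.stringify(withQuoteAware)); // "Hover here please"
```

Neither pattern is a sanitizer. Both only delete the tag delimiters and leave the text between tags in place.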

Here's an updated version with these suggestions:

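function markdownToPlainText(md: string) {
  return md
    .replace(/```[\s\S]*?```/g, '')     // Remove fenced code blocks first, so the inline rule cannot consume their backticks
    .replace(/`(.*?)`/g, '$1')          // Unwrap inline code, keeping its text
    .replace(/<(?:[^"'>]|"[^"]*"|'[^']*')*>/g, '')  // Remove HTML tags, including tags with quoted attribute values
    .replace(/\n{2,}/g, '\n')           // Collapse runs of newlines into a single newline
    .trim();
}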

This version documents what each replace call does and uses a tag pattern that tolerates > inside quoted attribute values.
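
One detail worth checking is the order of the two backtick rules. The sketch below assumes the same two regexes as the function and wraps them in hypothetical helpers (stripInlineCode and stripCodeBlocks are names invented here). It suggests that running the inline rule first can consume part of a fence, so the block rule no longer matches.

```typescript
// Hypothetical helpers wrapping the two regexes discussed in this review.
const stripInlineCode = (s: string): string => s.replace(/`(.*?)`/g, '$1');
const stripCodeBlocks = (s: string): string => s.replace(/```[\s\S]*?```/g, '');

// Sample Markdown with one fenced block between two lines of text.
const fence = '```';
const input = ['before', fence, 'let x = 1', fence, 'after'].join('\n');

// Inline rule first: it matches two of the three backticks in each fence
// as an empty inline span, so the block rule finds no complete fence.
const inlineFirst = stripCodeBlocks(stripInlineCode(input));

// Block rule first: the whole fenced block is removed before the inline rule runs.
const blocksFirst = stripInlineCode(stripCodeBlocks(input));

console.log(JSON.stringify(inlineFirst)); // "before\n`\nlet x = 1\n`\nafter"
console.log(JSON.stringify(blocksFirst)); // "before\n\nafter"
```

If fenced blocks should disappear from the plain text, removing them before unwrapping inline code appears to be the safer order.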
