Replace markdown library with mistune #3767

Open
iangreenleaf wants to merge 9 commits into bookwyrm-social:main from iangreenleaf:new-markdown-parser

Conversation

@iangreenleaf
Contributor

Description

This replaces the library for parsing Markdown formatted text.

I ran into problems while working on a feature that uses media queries in the sizes attribute on img. markdown.py was tripping over symbols like <= and treating them as HTML, even though they were inside a string in an attribute and should have been passed through unchanged. In other words, I suspect the parser used by markdown.py isn't very robust.
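For illustration, here is a minimal sketch of the kind of markup involved. The `img` tag, file name, and `sizes` value are made-up examples, not taken from the Bookwyrm feature, and the check uses Python's standard-library `html.parser` rather than either Markdown library. It only shows what "passed through" should mean: a tolerant HTML tokenizer sees one tag and keeps the `<=` inside the attribute value.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Collects (tag, attributes) pairs for every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append((tag, dict(attrs)))


def collect_tags(html):
    """Parse an HTML fragment and return the start tags found in it."""
    parser = AttrCollector()
    parser.feed(html)
    parser.close()
    return parser.tags


# Hypothetical markup: a media query using <= inside a quoted attribute value.
SAMPLE = '<img src="cover.jpg" sizes="(width <= 600px) 480px, 800px" alt="Cover">'

tags = collect_tags(SAMPLE)
for tag, attrs in tags:
    print(tag, attrs["sizes"])
```

A regression test for the Markdown renderer could plausibly take the same shape: feed it a fragment like `SAMPLE` and assert that the `sizes` value survives intact in the output.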

There are a couple of other libraries that could work, like marko. To be honest, I picked mistune without a ton of research because it seemed modern, fast, and widely adopted. Marko complains that mistune is not fully compliant with the CommonMark standard, but mistune's own docs seem to suggest that it is. I didn't go all the way down that rabbit hole, so I'm open to using some other library instead, though I suspect whatever differences exist in adherence to the spec are quite minor and unlikely to be a big deal to Bookwyrm. Mistune also appears to have excellent performance, which is helpful.

What type of Pull Request is this?

  • Bug Fix
  • Enhancement
  • Plumbing / Internals / Dependencies
  • Refactor

Does this PR change settings or dependencies, or break something?

  • This PR changes or adds default settings, configuration, or .env values
  • This PR changes or adds dependencies
  • This PR introduces other breaking changes

Details of breaking or configuration changes (if any of above checked)

Changes one of the library dependencies.

Documentation

  • New or amended documentation will be required if this PR is merged
  • I have created a matching pull request in the Documentation repository
  • I intend to create a matching pull request in the Documentation repository after this PR is merged

Tests

  • My changes do not need new tests
  • All tests I have added are passing
  • I have written tests but need help to make them pass
  • I have not written tests and need help to write them

markdown.py had parsing errors that were throwing off img srcsets.
It doesn't hugely matter one way or another, but the existing tests all
expect no newline, so why rock the boat.
@hughrun added the dependencies label Jan 3, 2026
Member

@mouse-reeve left a comment

I have no objection to this change, and mistune passes the smell check of being actively maintained.

Unfortunately, you're a victim of timing on this one -- we just moved from requirements.txt to pyproject.toml. Would you be down to adjust accordingly? I'm not sure why mypy is unhappy, but you should be able to run ./bw-dev ruff to resolve that failing check.

@iangreenleaf
Contributor Author

Sure, I can merge and update! Might take me a little bit to do so--lot of other stuff going on at the moment.

@iangreenleaf
Contributor Author

@mouse-reeve should be ready again. I had to cast a couple of the return values because mistune offers a weird return signature. I'm pretty sure that, as called, that method will only ever return a string, so hopefully casting is the appropriate solution there.
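As a rough sketch of the trade-off being described: when a function is annotated as returning a union but a given call path only ever yields a string, `typing.cast` satisfies the type checker without any runtime check, while an `isinstance` guard narrows the type and also fails loudly if the assumption is wrong. The `render` function below is a hypothetical stand-in with a union return type. It is not mistune's API and not the code in this PR.

```python
from typing import Any, Dict, List, Union, cast


def render(text: str, as_html: bool = True) -> Union[str, List[Dict[str, Any]]]:
    """Hypothetical stand-in for a renderer with a union return type."""
    if as_html:
        return f"<p>{text}</p>"
    # Alternative output shape: a list of token dictionaries.
    return [{"type": "paragraph", "text": text}]


def to_html_cast(text: str) -> str:
    # cast() only informs the type checker; nothing is verified at runtime.
    return cast(str, render(text))


def to_html_checked(text: str) -> str:
    # isinstance() narrows the type for the checker and guards at runtime.
    result = render(text)
    if not isinstance(result, str):
        raise TypeError(f"expected str, got {type(result).__name__}")
    return result


print(to_html_cast("hello"))
print(to_html_checked("hello"))
```

Casting is the lighter option when the call is known to produce a string. The guarded version costs one extra check per call and would surface a changed return type as an explicit error.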

Labels

dependencies Pull requests that update a dependency file

3 participants