coxxny commented Oct 3, 2025

Issue

#95

Summary

This PR fixes a crash when converting an .rst file that contains a grid table with double-width characters (e.g. Japanese).

Root cause: Docutils pads double-width characters during table parsing by temporarily inserting a NUL (\x00) sentinel. In the RST→MyST path (Docutils AST → markdown-it tokens → renderer), that NUL can remain in token content and reach the Markdown renderer, which then errors out.
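To illustrate why such padding exists, here is a minimal sketch of the idea (not Docutils' actual code). The helper name `pad_double_width` and the use of `unicodedata.east_asian_width` are assumptions made for this illustration. Each wide character occupies two terminal columns, so appending one sentinel after it makes the string length match the visual width of the table border.

```python
import unicodedata

PAD = "\x00"  # sentinel character, as described in the root cause above


def pad_double_width(line: str, pad_char: str = PAD) -> str:
    """Append pad_char after every wide/full-width character (illustrative helper)."""
    out = []
    for ch in line:
        out.append(ch)
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            out.append(pad_char)
    return "".join(out)


border = "+-------+-------+"
row = "|  列A  |  列B  |"

padded = pad_double_width(row)
# After padding, the row has as many characters as the border has columns,
# so column boundaries can be located by index.
print(len(row), len(padded), len(border))

# The sentinel is meant to be removed again once the cells are isolated.
restored = padded.replace(PAD, "")
```

If any code path skips the final removal step, the sentinel stays in the cell text, which is the situation this PR addresses.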

Fix: Right before rendering, recursively strip \x00 from token content. This keeps the renderer from ever seeing embedded NULs. The change is minimal and localized to the rendering step.
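The stripping step can be sketched roughly as follows. This is an illustration under assumptions, not the exact code in this PR: `FakeToken` is a hypothetical stand-in that only mirrors the `content` and `children` attributes of a markdown-it-py `Token`, and `strip_nul` is a name chosen for this example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class FakeToken:
    """Hypothetical stand-in for a markdown-it token (only the fields used here)."""

    type: str
    content: str = ""
    children: list[FakeToken] | None = None


def strip_nul(tokens: list[FakeToken]) -> None:
    """Remove NUL characters from token content in place, descending into children."""
    for token in tokens:
        if token.content:
            token.content = token.content.replace("\x00", "")
        if token.children:
            strip_nul(token.children)


# An inline token whose content and child text still carry the padding sentinel.
tokens = [
    FakeToken("inline", "列\x00A", children=[FakeToken("text", "列\x00A")]),
    FakeToken("paragraph_close"),
]
strip_nul(tokens)
```

Because the function only relies on attribute access, the same traversal should work on any token objects exposing `content` and `children`. Mutating in place avoids rebuilding the token stream right before rendering.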

Minimal Reproduction

test.rst

+-------+-------+
|  列A  |  列B  |
+-------+-------+
|  あい |  かき |
+-------+-------+

Before

$ rst2myst convert test.rst
test.rst -> test.md
FAILED:
null bytes should be removed by now

FINISHED ALL! (extensions: [])

After

$ rst2myst convert test.rst
test.rst -> test.md
CONVERTED (extensions: [])

FINISHED ALL! (extensions: [])

Test result

(screenshot)

pre-commit result

(screenshot)

coxxny commented Oct 3, 2025

This PR also contains a fix for the Read the Docs configuration so that the docs build succeeds.
Read the Docs now requires a sphinx configuration key.
Ref: https://about.readthedocs.com/blog/2024/12/deprecate-config-files-without-sphinx-or-mkdocs-config/
