[209_9] Fix Chinese line-wrapping display issue in code mode #2635
cd44abf to a5941eb
Pull request overview
This PR fixes a rendering bug where Chinese characters and other CJK text were incorrectly split during automatic line wrapping in code environments, causing them to display as <#XXXX> escape sequences instead of the actual characters.
Changes:
- Added boundary protection logic to prevent splitting TeXmacs internal escape sequences (<#...>) during line wrapping
- Applied the fix to both verb_language_rep (for \code environments) and prog_language_rep (for language-specific code environments like \python-code, \cpp-code, etc.)
- Added documentation and test cases
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/System/Language/verb_language.cpp | Added helper functions to detect and preserve <#...> escape sequences during hyphenation in verbatim/code mode |
| src/System/Language/prog_language.cpp | Added identical helper functions for programming language code environments |
| devel/209_9.md | Added developer documentation explaining the issue, fix, and testing approach |
| TeXmacs/tests/tmu/209_9.tmu | Added test document with code examples in multiple environments to verify the fix |
60d46a7 to 679c276
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
    if (after >= n) return n;

    int i = 0;
    int last= 0;
Copilot AI, Jan 22, 2026
The initialization of last to 0 could lead to returning 0 when no valid split point is found before the requested after position. Consider whether this is the intended behavior. If the entire beginning of the string consists of an escape sequence and after falls within it, the function will return 0, which would place the entire string in the right output and leave left empty. This might cause issues with line breaking algorithms that expect at least some progress. Consider documenting this edge case behavior or initializing last to after to ensure forward progress in line breaking.
Suggested change
    - int last= 0;
    + // Initialize last to 'after' so we guarantee forward progress even if no
    + // valid atom boundary is found at or before 'after'.
    + int last= after;
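The trade-off in this comment can be made concrete with a small sketch. This is not the PR's code: `atom_end` and `split_pos` are hypothetical names, `std::string` stands in for the TeXmacs string type, and the escape `<#4E2D>` for 中 is assumed for illustration. The third parameter plays the role of the initial value of `last`.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: index just past the atom that starts at i.
// A "<#...>" escape is one atom; any other byte is an atom by itself.
std::size_t atom_end(const std::string& s, std::size_t i) {
  assert(i < s.size());
  if (s[i] == '<' && i + 1 < s.size() && s[i + 1] == '#') {
    std::size_t close = s.find('>', i);
    return close == std::string::npos ? s.size() : close + 1;
  }
  return i + 1;
}

// Returns the last atom end that is <= after, or initial_last when the very
// first atom already extends past 'after'.
std::size_t split_pos(const std::string& s, std::size_t after,
                      std::size_t initial_last) {
  const std::size_t n = s.size();
  if (after >= n) return n;
  std::size_t i = 0;
  std::size_t last = initial_last;
  while (i < n) {
    std::size_t next = atom_end(s, i);
    if (next > after) break;  // this atom would cross the requested position
    last = i = next;
  }
  return last;
}
```

In this sketch, a string that starts with an escape and a requested position inside it gives 0 when `last` starts at 0 (no progress) and gives `after` when `last` starts at `after` (progress, but the cut lands inside the escape). Whether the real function behaves the same way depends on its actual loop, so the caller may need to handle the empty-left case explicitly in either variant.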
| code模式示例: | ||
|
|
||
| <\cpp-code> | ||
| <code|<code*|z中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中>> |
Copilot AI, Jan 22, 2026
Line 9 contains nested code tags <code|<code*|...>> while lines 15, 21, and 27 only use <code*|...>. This inconsistency appears to be unintentional. Consider removing the outer <code| tag to match the pattern used in the other examples.
| <code|<code*|z中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中>> | |
| <code*|z中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中> |
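How to test
- Start Mogan
- Insert any one of the following code environments (or another supported code environment): \code, \python-code, \cpp-code, \r-code
- Enter a single line that is long enough to wrap and contains Chinese characters
Expected result:
- Chinese characters are not split
- No <#XXXX> appears, and no line breaks off abnormally at <
- Each Chinese character appears whole, either on the previous line or on the next line
Test document: TeXmacs/tests/tmu/209_9.tmu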
2026/1/21
What
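Fixes Chinese characters being split incorrectly at automatic line breaks in code mode (including the \code, \python-code, \cpp-code and similar environments) and displayed as <#XXXX>.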
Why
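When wrapping lines automatically, code mode splits the string directly at a string index.
When the break position falls inside a <#XXXX> sequence, the split destroys the internal escape structure,
so rendering fails and the text is displayed as <#XXXX>.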
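A minimal sketch of this failure mode, not the actual TeXmacs code: it assumes the internal string stores 中 (U+4E2D) as the escape `<#4E2D>` and uses `std::string` in place of the TeXmacs string type. `naive_split` and `has_unterminated_escape` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for index-based splitting: cut s at byte offset pos.
std::pair<std::string, std::string> naive_split(const std::string& s,
                                                std::size_t pos) {
  assert(pos <= s.size());
  return {s.substr(0, pos), s.substr(pos)};
}

// True if the last '<' in s has no '>' after it, i.e. s ends in a
// truncated escape sequence.
bool has_unterminated_escape(const std::string& s) {
  std::size_t open = s.rfind('<');
  return open != std::string::npos &&
         s.find('>', open) == std::string::npos;
}
```

Splitting `"ab<#4E2D>cd"` at offset 5 yields `"ab<#4"` and `"E2D>cd"`, neither of which contains a complete escape, while splitting at offset 2 keeps `<#4E2D>` whole in the right half.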
Related issue: #2605
How
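In verb_language_rep::hyphenate and prog_language_rep::hyphenate,
introduce a boundary protection mechanism for line breaks:
- Treat each <#...> internal escape sequence as an indivisible atom
- If the break position falls inside an atom, snap it left to the nearest valid boundary
- Split the string only at valid boundaries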
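The three rules above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the code in this PR: `safe_split_pos` is a hypothetical name, `std::string` replaces the TeXmacs string type, and every byte outside a `<#...>` sequence is treated as its own atom.

```cpp
#include <cassert>
#include <string>

// Sketch only: returns the largest atom boundary that is <= after.
// A "<#...>" escape counts as one indivisible atom.
std::size_t safe_split_pos(const std::string& s, std::size_t after) {
  const std::size_t n = s.size();
  if (after >= n) return n;
  std::size_t i = 0;     // always sits on an atom boundary
  std::size_t last = 0;  // most recent boundary not beyond 'after'
  while (i <= after) {
    last = i;
    if (s[i] == '<' && i + 1 < n && s[i + 1] == '#') {
      // Jump over the whole escape; an unterminated one runs to the end.
      std::size_t close = s.find('>', i);
      i = (close == std::string::npos) ? n : close + 1;
    } else {
      ++i;
    }
  }
  assert(last <= after);
  return last;
}
```

For `"ab<#4E2D>cd"`, a requested position of 5 (inside the escape) snaps left to 2, and a requested position of 9 (just after the escape) is returned unchanged. A caller would then split with `s.substr(0, pos)` and `s.substr(pos)`.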