@notfoundzzz
Contributor

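How to test

  1. Launch Mogan

  2. Insert any one of the following code environments (or another supported code environment):

    • \code
    • \python-code
    • \cpp-code
    • \r-code
  3. Enter a sufficiently long line that contains Chinese characters, for example: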

z中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中
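  4. Type characters at the beginning of the line to trigger automatic line wrapping at different positions

Expected result:

  • Chinese characters are not split

  • <#XXXX> no longer appears, and the display no longer breaks at <

  • Each Chinese character appears whole, either on the previous line or on the next line

Test document: TeXmacs/tests/tmu/209_9.tmu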
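
The expected result above can also be checked mechanically on the wrapped lines. The following is a minimal sketch, not code from this PR: `has_only_complete_escapes` is a hypothetical helper, and it assumes that every `<` in the internal string opens an escape that must be closed by `>` on the same line.

```cpp
#include <cassert>
#include <string>

// Hypothetical check (not part of the PR): returns true when a wrapped line
// contains only complete "<...>" escapes.
// Assumption: in the internal encoding every '<' opens an escape and every
// '>' closes one, so an unmatched '<' or '>' means a character was split.
bool has_only_complete_escapes(const std::string& line) {
  bool inside = false;  // true while scanning between '<' and '>'
  for (char c : line) {
    if (c == '<') {
      if (inside) return false;  // '<' inside an unfinished escape
      inside = true;
    } else if (c == '>') {
      if (!inside) return false;  // '>' without a matching '<'
      inside = false;
    }
  }
  return !inside;  // a line must not end inside an escape
}
```

A line such as `z<#4E2D>` passes this check, while the two halves of a broken split, `z<#4` and `E2D><#4E2D>`, both fail it.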

2026/1/21

What

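Fixes Chinese characters being incorrectly split during automatic line wrapping in code mode (including the \code, \python-code, and \cpp-code environments) and then displayed as <#XXXX>.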

Why

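During automatic line wrapping, code mode splits the string directly by string index.
When the break position falls inside a <#XXXX> sequence, the split destroys the internal escape structure,
so rendering fails and the raw <#XXXX> text is displayed.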
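
To make the failure concrete, here is a small sketch of an index-based split. It is an illustration only: `naive_split` is a hypothetical name, `std::string` stands in for the TeXmacs string type, and the character 中 (U+4E2D) is assumed to be stored as the escape `<#4E2D>`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical illustration of the bug: cut purely by index, with no
// knowledge of "<#...>" escapes.
std::pair<std::string, std::string> naive_split(const std::string& s,
                                                std::size_t after) {
  assert(after <= s.size());
  // Everything before 'after' goes left, the rest goes right.
  return {s.substr(0, after), s.substr(after)};
}
```

With the input `z<#4E2D><#4E2D>` and a break position of 4, this returns `z<#4` and `E2D><#4E2D>`. Neither half contains a complete escape for the first character, which matches the broken display described above.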

Related issue: #2605

How

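Introduce a break-boundary protection mechanism in verb_language_rep::hyphenate and
prog_language_rep::hyphenate:

  • Treat each <#...> internal escape sequence as an indivisible atom

  • If the break position falls inside an atom, snap it left to the nearest valid boundary

  • Split the string only at valid boundaries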
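
The three rules above could be sketched roughly as follows. This is not the code from the PR: `snap_left_to_boundary` and `split_at_boundary` are hypothetical names, `std::string` stands in for the TeXmacs string type, and an atom is assumed to be either one character or a whole `<#...>` sequence ending at the next `>`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: returns the largest atom boundary that is <= after.
int snap_left_to_boundary(const std::string& s, int after) {
  const int n = static_cast<int>(s.size());
  if (after <= 0) return 0;
  if (after >= n) return n;

  int i = 0;     // start of the atom currently being examined
  int last = 0;  // last boundary known to be <= after
  while (i < n) {
    int next = i + 1;  // default: a single-character atom
    if (s.compare(i, 2, "<#") == 0) {
      // The escape runs up to and including the closing '>'.
      std::size_t close = s.find('>', i);
      next = (close == std::string::npos) ? n : static_cast<int>(close) + 1;
    }
    if (next > after) break;  // this atom straddles 'after': stop before it
    last = next;
    i = next;
  }
  return last;
}

// Split only at a valid boundary, so no escape is ever cut in half.
std::pair<std::string, std::string> split_at_boundary(const std::string& s,
                                                      int after) {
  const int pos = snap_left_to_boundary(s, after);
  return {s.substr(0, pos), s.substr(pos)};
}
```

For `z<#4E2D><#4E2D>y`, a requested break at 4 falls inside the first escape and is snapped left to 1, giving `z` and `<#4E2D><#4E2D>y`. A requested break at 8 is already a boundary and is left unchanged.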

Contributor

Copilot AI left a comment


Pull request overview

This PR fixes a rendering bug where Chinese characters and other CJK text were incorrectly split during automatic line wrapping in code environments, causing them to display as <#XXXX> escape sequences instead of the actual characters.

Changes:

  • Added boundary protection logic to prevent splitting TeXmacs internal escape sequences (<#...>) during line wrapping
  • Applied the fix to both verb_language_rep (for \code environments) and prog_language_rep (for language-specific code environments like \python-code, \cpp-code, etc.)
  • Added documentation and test cases

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/System/Language/verb_language.cpp | Added helper functions to detect and preserve <#...> escape sequences during hyphenation in verbatim/code mode |
| src/System/Language/prog_language.cpp | Added identical helper functions for programming language code environments |
| devel/209_9.md | Added developer documentation explaining the issue, fix, and testing approach |
| TeXmacs/tests/tmu/209_9.tmu | Added test document with code examples in multiple environments to verify the fix |


@notfoundzzz notfoundzzz force-pushed the zhl/209_9/fix-code-chinese-lexing branch from 60d46a7 to 679c276 Compare January 21, 2026 08:15
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.



if (after >= n) return n;

int i = 0;
int last= 0;

Copilot AI Jan 22, 2026


The initialization of last to 0 could lead to returning 0 when no valid split point is found before the requested after position. Consider whether this is the intended behavior. If the entire beginning of the string consists of an escape sequence and after falls within it, the function will return 0, which would place the entire string in the right output and leave left empty. This might cause issues with line breaking algorithms that expect at least some progress. Consider documenting this edge case behavior or initializing last to after to ensure forward progress in line breaking.

Suggested change (replace the first line below with the lines that follow it):
int last= 0;
// Initialize last to 'after' so we guarantee forward progress even if no
// valid atom boundary is found at or before 'after'.
int last= after;
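
The edge case in this comment can be reproduced with a small sketch. The code below is not from the PR: `atom_end`, `snap_left`, and `snap_with_progress` are hypothetical names, `std::string` stands in for the TeXmacs string type, and atoms are assumed to be single characters or whole `<#...>` sequences. In this sketch, returning `after` itself would cut inside the escape, so the variant shown moves forward to the end of the first atom instead. Whether that suits the actual line breaker is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// End index (exclusive) of the atom that starts at i. Requires i < s.size().
int atom_end(const std::string& s, int i) {
  const int n = static_cast<int>(s.size());
  if (s.compare(i, 2, "<#") == 0) {
    std::size_t close = s.find('>', i);
    return close == std::string::npos ? n : static_cast<int>(close) + 1;
  }
  return i + 1;
}

// Largest atom boundary <= after, with 'last' starting at 0 as in the diff.
int snap_left(const std::string& s, int after) {
  const int n = static_cast<int>(s.size());
  if (after <= 0) return 0;
  if (after >= n) return n;
  int i = 0;
  int last = 0;
  while (i < n) {
    const int next = atom_end(s, i);
    if (next > after) break;
    last = next;
    i = next;
  }
  return last;
}

// Hypothetical variant: when snapping left yields 0, advance to the end of
// the first atom so that the left part is never empty.
int snap_with_progress(const std::string& s, int after) {
  const int pos = snap_left(s, after);
  if (pos == 0 && after > 0 && !s.empty()) return atom_end(s, 0);
  return pos;
}
```

For `<#4E2D>abc` and a requested break at 3, `snap_left` returns 0, which is the empty left part the comment describes, while `snap_with_progress` returns 7, the end of the first escape.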

Code mode examples:

<\cpp-code>
<code|<code*|z中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中>>

Copilot AI Jan 22, 2026


Line 9 contains nested code tags <code|<code*|...>> while lines 15, 21, and 27 only use <code*|...>. This inconsistency appears to be unintentional. Consider removing the outer <code| tag to match the pattern used in the other examples.

Suggested change (replace the first line below with the second):
<code|<code*|z中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中>>
<code*|z中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中>

@da-liii da-liii requested a review from yinyuscloor January 26, 2026 03:32