Commit ade5ab7

[209_9] Fix Chinese line-wrapping display issue in code mode (#2635)
Co-authored-by: notfoundzzz <notfoundzzz@users.noreply.github.com>
1 parent db38eef commit ade5ab7

5 files changed

Lines changed: 157 additions & 4 deletions

TeXmacs/tests/tmu/209_9.tmu

Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
<TMU|<tuple|1.1.0|2026.1.1>>

<style|<tuple|generic|chinese|table-captions-above|number-europe|preview-ref|python|r>>

<\body>
code模式示例:

<\cpp-code>
<code|<code*|z中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中>>
</cpp-code>

python-code模式示例:

<\python-code>
<code*|z中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中>
</python-code>

cpp-code模式示例:

<\cpp-code>
<code*|z中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中>
</cpp-code>

r-code模式示例:

<\r-code>
<code*|z中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中>
</r-code>
</body>

<\initial>
<\collection>
<associate|page-medium|paper>
<associate|page-screen-margin|false>
</collection>
</initial>

devel/209_9.md

Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
# 209_9 Fix Chinese line-wrapping display issue in code mode

## How to test
1. Launch Mogan / TeXmacs
2. Insert any one of the following code environments (or another supported code environment):
- `\code`
- `\python-code`
- `\cpp-code`
- `\r-code`

3. Type a line that is **long enough** and contains **Chinese characters**, for example:
```tex
z中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中中
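```
4. Type characters at the start of the line to trigger automatic wrapping at different positions

Expected results:

- Chinese characters are not split

- No more `<#XXXX>` output or broken rendering where the line breaks at `<`

- Each Chinese character appears in full either on the previous line or on the next line

Test document: TeXmacs/tests/tmu/209_9.tmu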
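
The expected results can also be phrased as a mechanical check on a proposed break position. The sketch below is illustrative only and is not part of this commit: `split_breaks_escape` is a hypothetical helper name, it uses `std::string` instead of the TeXmacs `string` type, and it assumes that a Chinese character is stored internally as an escape of the form `<#4E2D>`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical checker (not in the commit): returns true when cutting `s`
// at index `after` would land strictly inside a complete "<#...>" escape.
// An unterminated "<#" is not treated as an escape, which matches the
// fallback behaviour of the helper added in code_wrap.hpp.
bool split_breaks_escape(const std::string& s, std::size_t after) {
  assert(after <= s.size());
  std::size_t i = 0;
  while ((i = s.find("<#", i)) != std::string::npos) {
    std::size_t close = s.find('>', i + 2);
    if (close == std::string::npos) break;           // no closing '>'
    if (after > i && after <= close) return true;    // cut is inside the escape
    i = close + 1;                                   // continue after this escape
  }
  return false;
}
```

For the string `ab<#4E2D>cd`, cutting at index 2 or 9 is safe, while any index from 3 to 8 would split the escape.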
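
## 2026/1/21
### What
Fix the issue where, in code mode (including the \code, \python-code, \cpp-code environments, etc.),
Chinese characters are incorrectly split during automatic line wrapping and displayed as <#XXXX>.

### Why
During automatic line wrapping, code mode splits the string directly by string index.
When the break position falls inside a <#XXXX> sequence, the internal escape structure is broken,
so rendering fails and the raw <#XXXX> text is displayed instead.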
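
The failure mode can be reproduced outside TeXmacs with a few lines of C++. This is a minimal sketch under stated assumptions: `naive_split` is a hypothetical name, `std::string` stands in for the TeXmacs `string` type, and the character 中 is assumed to be stored as the escape `<#4E2D>`. It mirrors the pre-fix logic `left = s (0, after); right = s (after, N (s));`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Split `s` at the raw index `after`, with no knowledge of "<#...>" escapes.
// This imitates the behaviour of hyphenate() before the fix.
std::pair<std::string, std::string>
naive_split(const std::string& s, std::size_t after) {
  assert(after <= s.size());
  return {s.substr(0, after), s.substr(after)};
}
```

With the input `z<#4E2D><#4E2D>` and `after = 4`, the left part becomes `z<#4` and the right part becomes `E2D><#4E2D>`. Neither fragment contains a complete escape for the first character, which is consistent with the broken display described above.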
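
Related issue #2605

### How
In verb_language_rep::hyphenate and prog_language_rep::hyphenate,
introduce a break-boundary protection mechanism:

- Treat the internal escape sequence <#...> as an indivisible atom

- If the break position falls inside an atom, snap it left to the nearest valid boundary

- Split the string only at valid boundaries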

src/System/Language/code_wrap.hpp

Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
/******************************************************************************
* MODULE : code_wrap.hpp
* DESCRIPTION: helpers for safe code wrapping boundaries
* COPYRIGHT : (C) 2026 The MoganSTEM contributors
*******************************************************************************
* This software falls under the GNU general public license version 3 or later.
* It comes WITHOUT ANY WARRANTY WHATSOEVER. For details, see the file LICENSE
* in the root directory or <http://www.gnu.org/licenses/gpl-3.0.html>.
******************************************************************************/

#ifndef TM_CODE_WRAP_HPP
#define TM_CODE_WRAP_HPP

#include "basic.hpp"
#include "string.hpp"
// Protect TeXmacs internal escape sequences like "<#4E2D>" (CJK, etc.)
// from being split during automatic line wrapping in code/prog environments.
//
// NOTE:
// - We only protect "<#...>" to avoid affecting normal code like "<tag>".
// - This is a last-resort safety net: even if the line breaker proposes an
//   invalid split position, we snap it to a valid boundary here.
static inline int
tm_atom_end_for_code_wrap (string s, int i) {
  int n= N (s);
  if (i < 0 || i >= n) return i;
  if (s[i] != '<') return i + 1;

  if (i + 1 >= n || s[i + 1] != '#') return i + 1;

  int j= i + 2;
  while (j < n && s[j] != '>')
    j++;
  if (j < n && s[j] == '>') return j + 1;

  return i + 1;
}

static inline int
tm_snap_after_boundary_for_code_wrap (string s, int after) {
  int n= N (s);
  if (after <= 0) return 0;
  if (after >= n) return n;

  int i   = 0;
  int last= 0;
  while (i < n) {
    int j= tm_atom_end_for_code_wrap (s, i);
    if (j > after) break;
    last= j;
    i   = j;
  }
  return last;
}

#endif // TM_CODE_WRAP_HPP
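
To experiment with the snapping behaviour outside the TeXmacs tree, the two helpers can be ported to `std::string`. The following is a sketch, not the committed code: the names `atom_end` and `snap_after` are made up for this illustration, and it assumes that TeXmacs `string` indexing and `N (s)` behave like `operator[]` and `size()` on `std::string`.

```cpp
#include <cassert>
#include <string>

// Port of tm_atom_end_for_code_wrap: index just past the atom starting at i.
// An atom is a complete "<#...>" escape or, failing that, a single character.
int atom_end(const std::string& s, int i) {
  int n = static_cast<int>(s.size());
  if (i < 0 || i >= n) return i;
  if (s[i] != '<') return i + 1;
  if (i + 1 >= n || s[i + 1] != '#') return i + 1;
  int j = i + 2;
  while (j < n && s[j] != '>') j++;
  if (j < n) return j + 1;  // found the closing '>'
  return i + 1;             // unterminated "<#": fall back to one character
}

// Port of tm_snap_after_boundary_for_code_wrap: largest atom boundary that
// does not exceed `after`, clamped to the range [0, s.size()].
int snap_after(const std::string& s, int after) {
  int n = static_cast<int>(s.size());
  assert(n >= 0);
  if (after <= 0) return 0;
  if (after >= n) return n;
  int i = 0, last = 0;
  while (i < n) {
    int j = atom_end(s, i);
    if (j > after) break;
    last = j;
    i = j;
  }
  return last;
}
```

For `ab<#4E2D>cd`, every proposed position from 3 to 8 snaps back to 2, while 2 and 9 are already boundaries and are returned unchanged. When the escape sits at the very start of the string, as in `<#4E2D>x` with a proposed position of 3, the result is 0, so a caller should be prepared for an empty left part.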

src/System/Language/prog_language.cpp

Lines changed: 4 additions & 2 deletions
@@ -11,6 +11,7 @@
 ******************************************************************************/
 
 #include "analyze.hpp"
+#include "code_wrap.hpp"
 #include "convert.hpp"
 #include "converter.hpp"
 #include "cork.hpp"
@@ -434,8 +435,9 @@ prog_language_rep::get_hyphens (string s) {
 void
 prog_language_rep::hyphenate (string s, int after, string& left,
                               string& right) {
-  left = s (0, after);
-  right= s (after, N (s));
+  int a= tm_snap_after_boundary_for_code_wrap (s, after);
+  left = s (0, a);
+  right= s (a, N (s));
 }
 
 string

src/System/Language/verb_language.cpp

Lines changed: 14 additions & 2 deletions
@@ -10,6 +10,7 @@
 ******************************************************************************/
 
 #include "analyze.hpp"
+#include "code_wrap.hpp"
 #include "impl_language.hpp"
 #include "observers.hpp"
 #include "packrat.hpp"
@@ -61,11 +62,22 @@ verb_language_rep::get_hyphens (string s) {
   return penalty;
 }
 
+/**
+ * @brief Split verbatim text at code atom boundaries so that a "<#...>"
+ *        sequence is never broken apart.
+ * @param s The original string to split.
+ * @param after The break position suggested by the layout engine (it may
+ *        fall inside an atom).
+ * @param left Receives the content to the left of the break.
+ * @param right Receives the content to the right of the break.
+ *
+ * @note Example: when s is "ab<#4E2D>cd" and after falls inside "<#4E2D>",
+ *       the break is moved back to an atom boundary, avoiding an invalid split.
+ */
 void
 verb_language_rep::hyphenate (string s, int after, string& left,
                               string& right) {
-  left = s (0, after);
-  right= s (after, N (s));
+  int a= tm_snap_after_boundary_for_code_wrap (s, after);
+  left = s (0, a);
+  right= s (a, N (s));
 }
 
 string
