Commit 345c9a5
Fix two out-of-bounds read issues when handling truncated UTF-8 input (#1005)
Two independent out-of-bounds read issues were identified in OpenCC's UTF-8
processing logic when handling malformed or truncated UTF-8 sequences.
1) MaxMatchSegmentation:
NextCharLength() could return a value larger than the remaining input size.
The previous logic subtracted this value from a size_t length counter,
potentially causing underflow and subsequent out-of-bounds reads.
2) Conversion:
Similar length handling could allow reads past the end of the input buffer
during dictionary matching, potentially propagating unintended bytes to the
conversion output.
This patch fixes both issues by:
- Explicitly tracking the end of the input buffer
- Recomputing remaining length on each iteration
- Clamping matched character and key lengths to the remaining buffer size
- Preventing reads past the null terminator
The changes preserve existing behavior for valid UTF-8 input and add test
coverage for truncated UTF-8 sequences.
These issues may have security implications when processing untrusted input
and are classified as heap out-of-bounds reads (CWE-125).
Co-authored-by: Claude <noreply@anthropic.com>1 parent 0277545 commit 345c9a5
File tree
4 files changed
+110
-4
lines changed- src
4 files changed
+110
-4
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
23 | 23 | | |
24 | 24 | | |
25 | 25 | | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
26 | 32 | | |
27 | | - | |
| 33 | + | |
| 34 | + | |
28 | 35 | | |
29 | 36 | | |
30 | 37 | | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
31 | 42 | | |
32 | 43 | | |
33 | 44 | | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
34 | 50 | | |
35 | 51 | | |
36 | 52 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
47 | 47 | | |
48 | 48 | | |
49 | 49 | | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
50 | 90 | | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
30 | 30 | | |
31 | 31 | | |
32 | 32 | | |
33 | | - | |
| 33 | + | |
34 | 34 | | |
35 | | - | |
| 35 | + | |
| 36 | + | |
36 | 37 | | |
37 | 38 | | |
38 | 39 | | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
39 | 44 | | |
40 | 45 | | |
41 | 46 | | |
| |||
44 | 49 | | |
45 | 50 | | |
46 | 51 | | |
47 | | - | |
48 | 52 | | |
49 | 53 | | |
50 | 54 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
43 | 43 | | |
44 | 44 | | |
45 | 45 | | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
46 | 92 | | |
0 commit comments