Commit 56542ce

Add comment about CJK compatibility ideographs to DoNotEmit.txt (#1187)
[UTC-184-A55] Action Item for Roozbeh Pournader, SAH: Add a comment to DoNotEmit.txt header explaining the situation with CJK compatibility characters. [Ref. 4.2 in L2/25-187]
1 parent 74197a5 commit 56542ce

1 file changed: +11 -1 lines changed

unicodetools/data/ucd/dev/DoNotEmit.txt

Lines changed: 11 additions & 1 deletion
@@ -1,5 +1,5 @@
 # DoNotEmit-17.0.0.txt
-# Date: 2025-07-30
+# Date: 2025-08-04
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -42,6 +42,16 @@
 # Sequences for Egyptian Hieroglyphs are not listed here. See
 # the kEH_AltSeq property in UAX #57 for that information.
 #
+# CJK compatibility ideographs are not listed here either. Most of the CJK
+# compatibility ideographs are canonically equivalent to a CJK unified
+# ideograph, which means that distinctions between compatibility ideographs
+# and the unified ideographs that they are canonically equivalent to would
+# be lost in normalization. The preferred form for applications that intend
+# to keep such distinctions is using a standardized variation sequence
+# instead of a CJK compatibility ideograph. For a comprehensive list of
+# these standardized variation sequences, see the section "CJK
+# compatibility ideographs" in StandardizedVariants.txt.
+#
 # Note that some sequences could be considered recursive, in the way that
 # the preferred sequence to use may be a subsequence of the "Do Not Emit"
 # sequence. This may have implications for some implementations who may want
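
Note on the added comment: the loss of distinction under normalization can be checked with Python's standard unicodedata module. The sketch below is illustrative only and is not part of the commit. It uses U+F900 as an example and assumes that the sequence <U+8C48, U+FE00> is the one StandardizedVariants.txt lists for it; consult that file for the authoritative mapping. The helper name survives_normalization is made up for this example.

```python
import unicodedata

# Example characters (the choice of U+F900 is an assumption for illustration).
COMPAT = "\uF900"         # CJK COMPATIBILITY IDEOGRAPH-F900
UNIFIED = "\u8C48"        # the unified ideograph it is canonically equivalent to
SVS = UNIFIED + "\uFE00"  # unified ideograph followed by VARIATION SELECTOR-1

def survives_normalization(text: str, form: str = "NFC") -> bool:
    """Return True if `text` is unchanged by the given normalization form."""
    return unicodedata.normalize(form, text) == text

for form in ("NFC", "NFD"):
    normalized = unicodedata.normalize(form, COMPAT)
    # The compatibility ideograph is replaced by the unified ideograph,
    # so the distinction between the two is lost.
    print(form, "compat ->", f"U+{ord(normalized):04X}")
    # The variation sequence passes through unchanged.
    print(form, "SVS kept:", survives_normalization(SVS, form))
```

An application that needs to keep the distinction would therefore store the variation sequence, since it round-trips through both canonical normalization forms while the compatibility ideograph does not.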
