2222</ tbody > </ table >
2323< div class ="body ">
2424 < h1 > Unicode® Collation Algorithm< br > Conformance Tests</ h1 >
25- < h2 align ="center " class =" changed " > Version 15.0.0< br > 2022-08-12</ h2 >
25+ < h2 align ="center "> Version 15.0.0< br > 2022-08-12</ h2 >
2626< p > The following files provide conformance tests for the Unicode Collation Algorithm
27- (< a href ="https://www.unicode.org/reports/tr10/tr10-46 .html "> UTS #10: Unicode Collation Algorithm</ a > ).</ p >
27+ (< a href ="https://www.unicode.org/reports/tr10/tr10-47 .html "> UTS #10: Unicode Collation Algorithm</ a > ).</ p >
2828 < ul >
2929 < li > CollationTest_SHIFTED.txt</ li >
3030 < li > CollationTest_NON_IGNORABLE.txt</ li >
@@ -33,7 +33,7 @@ <h2 align="center" class="changed">Version 15.0.0<br>2022-08-12</h2>
3333 </ ul >
3434 < p > These files are large, and thus packaged in zip format to save download time.</ p >
3535
36- < blockquote class =" changed " >
36+ < blockquote >
3737 < p > < b > Note:</ b > These files test the sort order of an untailored DUCET table.
3838 If you are using an implementation of the
3939 < a href ="https://www.unicode.org/reports/tr35/tr35-collation.html#CLDR_Collation_Algorithm "> CLDR Collation Algorithm</ a >
@@ -53,7 +53,7 @@ <h2>Format</h2>
5353 < p > There are four different files:</ p >
5454 < ul >
5555 < li > The shifted vs non-ignorable files correspond to the two alternate
56- < a href ="https://www.unicode.org/reports/tr10/tr10-46 .html#Variable_Weighting "> Variable Weighting</ a > values.</ li >
56+ < a href ="https://www.unicode.org/reports/tr10/tr10-47 .html#Variable_Weighting "> Variable Weighting</ a > values.</ li >
5757 < li > The SHORT versions omit the comments, for more compact storage.</ li >
5858 </ ul >
5959< p > The format is illustrated by the following example:</ p >
@@ -67,20 +67,15 @@ <h2>Format</h2>
6767 separator. Between the bars are the primary, secondary, tertiary, and quaternary weights (if any),
6868 in hex.</ p >
6969 < blockquote >
70- < table class ="noborder " border ="0 " cellpadding ="2 " cellspacing ="0 ">
71- < tbody > < tr >
72- < th class ="noborder " align ="left " valign ="top " width ="1% "> < b > Note:</ b > </ th >
73- < td class ="noborder " valign ="top "> The sort key is purely informational. UCA does < i > not</ i >
74- require the production of any particular sort key, as long as the results of comparisons
75- match.</ td >
76- </ tr >
77- </ tbody > </ table >
70+ < p > < b > Note:</ b > The sort key is purely informational. UCA does < i > not</ i >
71+ require the production of any particular sort key, as long as the results of comparisons
72+ match.</ p >
7873 </ blockquote >
7974
8075 < h2 > Testing</ h2 >
8176 < p > The files are designed so each line in the file will order as being greater than or equal to
8277 the previous one, when using the UCA and the
83- < a href ="https://www.unicode.org/reports/tr10/tr10-46 .html#Default_Unicode_Collation_Element_Table "> Default
78+ < a href ="https://www.unicode.org/reports/tr10/tr10-47 .html#Default_Unicode_Collation_Element_Table "> Default
8479 Unicode Collation Element Table</ a > .
8580 A test program can read in each line, compare it to
8681 the last line, and signal an error if order is not correct. The exact comparison that should be
@@ -97,9 +92,6 @@ <h2>Testing</h2>
9792 < p > These files contain test cases that include ill-formed strings, with surrogate code points.
9893 Implementations that do not weight surrogate code points the same way as reserved code points
9994 may filter out such lines in the test cases, before testing for conformance.</ p >
100- < blockquote class ="removed ">
101- < p > < b > Note:</ b > This test is only valid for an untailored DUCET table.</ p >
102- </ blockquote >
10395
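The testing procedure described above (read each line, compare it with the previous one, report an error when the order is wrong) could be sketched roughly as follows. This is a minimal illustration, not the reference test harness. It assumes each data line holds space-separated hexadecimal code points, optionally followed by `;` and a `#` comment. The names `parse_test_line`, `has_surrogate`, and `check_order` are hypothetical, and `compare` stands for whatever three-way comparison the implementation under test provides.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def parse_test_line(line: str) -> Optional[str]:
    """Decode one test-file line into a string (assumed line layout)."""
    # Drop any '#' comment, then anything after ';', then whitespace.
    content = line.split("#", 1)[0].split(";", 1)[0].strip()
    if not content:
        return None  # blank or comment-only line
    return "".join(chr(int(cp, 16)) for cp in content.split())


def has_surrogate(s: str) -> bool:
    """True if the string contains a surrogate code point (ill-formed)."""
    return any(0xD800 <= ord(ch) <= 0xDFFF for ch in s)


def check_order(
    lines: Iterable[str],
    compare: Callable[[str, str], int],
    skip_surrogates: bool = False,
) -> List[Tuple[int, str, str]]:
    """Return (line number, previous, current) for every out-of-order pair.

    `compare` must return a negative, zero, or positive number, and each
    line is expected to be greater than or equal to the previous one.
    """
    failures = []
    previous = None
    for number, line in enumerate(lines, start=1):
        current = parse_test_line(line)
        if current is None:
            continue
        # Optional filter for implementations that do not weight
        # surrogates like reserved code points.
        if skip_surrogates and has_surrogate(current):
            continue
        if previous is not None and compare(previous, current) > 0:
            failures.append((number, previous, current))
        previous = current
    return failures
```

In real use, `compare` would be supplied by a collator configured for the untailored DUCET and the variable-weighting setting that matches the file being read. The sketch leaves the exact comparison to that function and returns all failures rather than stopping at the first, so a single run can report every ordering problem.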
10496 < h2 > Migration</ h2 >
10597 < h3 > Tie-breaker</ h3 >
@@ -131,14 +123,14 @@ <h3>Discontiguous contractions</h3>
131123 < li > S2.1.1 loops over each of the following three characters C,
132124 but there is no table entry for any of those three S+C.
133125 In particular, there is no DUCET mapping for 0FB2+0F71
134- (see < i > < a href ="https://www.unicode.org/reports/tr10/tr10-46 .html#Well_Formed_DUCET "> Tibetan and
126+ (see < i > < a href ="https://www.unicode.org/reports/tr10/tr10-47 .html#Well_Formed_DUCET "> Tibetan and
135127 Well-Formedness of DUCET</ a > </ i > ).</ li >
136128 < li > The loop exits without finding any match beyond S=0FB2.</ li >
137129 </ ul >
138130
139131 < p > See “Also note that the Algorithm employs two distinct contraction matching methods:”
140132 at the end of < i > Section 7.2,
141- < a href ="https://www.unicode.org/reports/tr10/tr10-46 .html#Step_2 "> Produce Collation Element Arrays</ a > </ i > .</ p >
133+ < a href ="https://www.unicode.org/reports/tr10/tr10-47 .html#Step_2 "> Produce Collation Element Arrays</ a > </ i > .</ p >
142134
143135 < hr width ="50% ">
144136 < p class ="copyright "> © 2022 Unicode, Inc. All Rights Reserved.