Commit 77c0188

Revise the format of the LinkFormattingTest
… as per email. Also cleaned up the code for the headers, to be more uniform. Also allow the wikipedia URL with ; in it.
1 parent 1ec265e commit 77c0188

7 files changed: +385 -209 lines changed

unicodetools/data/linkification/dev/LinkBracket.txt

Lines changed: 11 additions & 11 deletions
@@ -1,32 +1,32 @@
 # LinkBracket.txt
-# Date: 2025-12-16, 17:57:01 GMT
+# Date: 2025-12-20, 21:02:29 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
-#
+#
 # The usage and stability of these values is covered in https://www.unicode.org/reports/tr58/
-#
+#
 # ================================================
-#
+#
 # Property: Link_Bracket
 # Format
-#
+#
 # Field 0: code point
 # Field 1: code point
 # For more information, see https://www.unicode.org/reports/tr58/#property-data.
-#
+#
 # For the purpose of regular expressions, the property Link_Bracket is defined as
 # a string property whose value is either a single code point or is <none>.
-#
+#
 # The short name of the property is the same as its long name.
-#
+#
 # All code points not explicitly listed for Link_Bracket
 # have the value <none>.
-#
+#
 # @missing: 0000..10FFFF; <none>
-#
+#
 # ================================================
-
+#
 0029 ; 0028 #1.1 () ⇒ () RIGHT PARENTHESIS
 003E ; 003C #1.1 (> ⇒ <) GREATER-THAN SIGN
 005D ; 005B #1.1 (] ⇒ [) RIGHT SQUARE BRACKET
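
The header above describes Link_Bracket as a string property with two code point fields per line and a default of <none>. A minimal sketch of reading such lines might look like the following. The function name `parse_link_bracket` is hypothetical, the default is modeled as Python `None`, and the sample rows are the three data lines visible in this hunk (each maps a closing bracket to its opening counterpart).

```python
def parse_link_bracket(lines):
    """Map each listed code point (Field 0) to its Link_Bracket value (Field 1)."""
    mapping = {}
    for raw in lines:
        # Everything after the first '#' is a comment; this also skips
        # header lines such as '# @missing: ...'.
        data = raw.split("#", 1)[0].strip()
        if not data:
            continue
        field0, field1 = (f.strip() for f in data.split(";"))
        mapping[chr(int(field0, 16))] = chr(int(field1, 16))
    return mapping


sample = [
    "# @missing: 0000..10FFFF; <none>",
    "0029 ; 0028 #1.1 () ⇒ () RIGHT PARENTHESIS",
    "003E ; 003C #1.1 (> ⇒ <) GREATER-THAN SIGN",
    "005D ; 005B #1.1 (] ⇒ [) RIGHT SQUARE BRACKET",
]
brackets = parse_link_bracket(sample)

# Unlisted code points take the @missing default, modeled here as None.
print(brackets.get(")"), brackets.get("a"))
```

This sketch does not interpret the `@missing` line; it simply relies on `dict.get` returning `None` for code points that are not listed.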

unicodetools/data/linkification/dev/LinkDetectionTest.txt

Lines changed: 6 additions & 6 deletions
@@ -1,16 +1,16 @@
 # LinkDetectionTest.txt
-# Date: 2025-12-16, 20:06:36 GMT
+# Date: 2025-12-20, 21:02:29 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
-#
+#
 # The usage and stability of these values is covered in https://www.unicode.org/reports/tr58/
-#
+#
 # ================================================
-#
+#
 # Format:
 # Each line contains zero or more marked links, such as ⸠abc.com⸡
-#
+#
 # Operation:
 # For each line.
 # • Create a copy of the line, with the characters ⸠ and ⸡ removed.
@@ -19,7 +19,7 @@
 # Empty lines, and lines starting with # are ignored.
 # Otherwise # is treated like any other character.
 # ================================================
-
+#
 
 # Misc. test cases
 
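
The visible part of the LinkDetectionTest.txt header describes marked links and the first operation step (copying the line with ⸠ and ⸡ removed). A minimal sketch of that step could look like this. The function name `parse_test_line` and the choice to report spans as `(start, end)` offsets into the unmarked copy are assumptions; the remaining operation steps fall outside the hunks shown here and are not modeled.

```python
OPEN, CLOSE = "⸠", "⸡"  # marker characters named in the file header


def parse_test_line(line):
    """Return (unmarked_text, spans), or None if the line is ignored.

    spans holds (start, end) offsets, in Python string indices,
    of each marked link within the unmarked copy.
    """
    line = line.rstrip("\n")
    # Empty lines and lines starting with '#' are ignored.
    if not line or line.startswith("#"):
        return None
    plain = []
    spans = []
    start = None
    for ch in line:
        if ch == OPEN:
            start = len(plain)
        elif ch == CLOSE:
            spans.append((start, len(plain)))
            start = None
        else:
            # Any other character, including a later '#', is ordinary text.
            plain.append(ch)
    return "".join(plain), spans


text, spans = parse_test_line("See ⸠abc.com⸡ now # not a comment")
print(text, spans)
```

A harness would presumably compare these spans with what a link detector finds in the unmarked text, but that comparison is not part of this sketch.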
unicodetools/data/linkification/dev/LinkEmail.txt

Lines changed: 10 additions & 10 deletions
@@ -1,29 +1,29 @@
 # LinkEmail.txt
-# Date: 2025-12-16, 17:57:01 GMT
+# Date: 2025-12-20, 21:02:29 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
-#
+#
 # The usage and stability of these values is covered in https://www.unicode.org/reports/tr58/
-#
+#
 # ================================================
-#
+#
 # Property: Link_Email
 # Format
-#
+#
 # Field 0: code point range
 # For more information, see https://www.unicode.org/reports/tr58/#property-data.
-#
+#
 # For the purpose of regular expressions, the property Link_Email is defined as
 # a binary property.
-#
+#
 # The short name of the property is the same as its long name.
-#
+#
 # All code points not explicitly listed for Link_Email
 # have the value No.
-#
+#
 # ================================================
-
+#
 0021 # 1.1 (!) EXCLAMATION MARK
 0023..0027 # 1.1 [5] (#..') NUMBER SIGN..APOSTROPHE
 002A..0039 # 1.1 [16] (*..9) ASTERISK..DIGIT NINE
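
Link_Email is described above as a binary property whose single field is a code point or a `first..last` range, with unlisted code points having the value No. One way to read that format is sketched below. The helper names `parse_binary_property` and `has_property` are hypothetical, and the sample contains only the three data lines visible in this hunk, so results say nothing about code points listed later in the file.

```python
def parse_binary_property(lines):
    """Collect inclusive (first, last) code point ranges from data lines."""
    ranges = []
    for raw in lines:
        data = raw.split("#", 1)[0].strip()  # drop the trailing comment
        if not data:
            continue
        first, _, last = data.partition("..")
        lo = int(first, 16)
        hi = int(last, 16) if last else lo   # a single code point is a 1-long range
        ranges.append((lo, hi))
    return ranges


def has_property(ranges, ch):
    """True if ch falls in a listed range; unlisted code points are No (False)."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ranges)


sample = [
    "0021 # 1.1 (!) EXCLAMATION MARK",
    "0023..0027 # 1.1 [5] (#..') NUMBER SIGN..APOSTROPHE",
    "002A..0039 # 1.1 [16] (*..9) ASTERISK..DIGIT NINE",
]
email_ranges = parse_binary_property(sample)
print(has_property(email_ranges, "!"), has_property(email_ranges, '"'))
```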

unicodetools/data/linkification/dev/LinkFormattingTest.txt

Lines changed: 214 additions & 67 deletions
Large diffs are not rendered by default.

unicodetools/data/linkification/dev/LinkTerm.txt

Lines changed: 11 additions & 11 deletions
@@ -1,34 +1,34 @@
 # LinkTerm.txt
-# Date: 2025-12-16, 20:06:36 GMT
+# Date: 2025-12-20, 21:02:29 GMT
 # © 2025 Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use and license, see https://www.unicode.org/terms_of_use.html
-#
+#
 # The usage and stability of these values is covered in https://www.unicode.org/reports/tr58/
-#
+#
 # ================================================
-#
+#
 # Property: Link_Term
 # Format
-#
+#
 # Field 0: code point range
 # Field 1: a Link_Term value
 # For more information, see https://www.unicode.org/reports/tr58/#property-data.
-#
+#
 # For the purpose of regular expressions, the property Link_Term is defined as
 # an enumerated property of code points.
 # The short name of the property is the same as its long name.
 # The possible values are: Include, Hard, Soft, Close, Open
-#
+#
 # The short name of each value is the same as its long name.
-#
+#
 # All code points not explicitly listed for Link_Term
 # have the value Hard.
-#
+#
 # @missing: 0000..10FFFF; Hard
-#
+#
 # ================================================
-
+#
 0021..0022 ; Soft # 1.1 [2] (!..") EXCLAMATION MARK..QUOTATION MARK
 0027 ; Soft # 1.1 (') APOSTROPHE
 002C ; Soft # 1.1 (,) COMMA
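
Link_Term is described above as an enumerated property with five possible values and a default of Hard for unlisted code points. A lookup over such data could be sketched as follows. The names `parse_link_term` and `link_term` are hypothetical, the sketch assumes the listed ranges do not overlap, and only the three data lines visible in this hunk are used, so the default shown for other characters reflects the sample and not the full file.

```python
import bisect

VALUES = {"Include", "Hard", "Soft", "Close", "Open"}  # from the file header
DEFAULT = "Hard"                                        # from the @missing line


def parse_link_term(lines):
    """Return a sorted list of (first, last, value) entries."""
    entries = []
    for raw in lines:
        data = raw.split("#", 1)[0].strip()  # drop the trailing comment
        if not data:
            continue
        rng, value = (f.strip() for f in data.split(";"))
        if value not in VALUES:
            raise ValueError(f"unknown Link_Term value: {value!r}")
        first, _, last = rng.partition("..")
        lo = int(first, 16)
        hi = int(last, 16) if last else lo
        entries.append((lo, hi, value))
    entries.sort()
    return entries


def link_term(entries, ch):
    """Look up the value for ch, falling back to the default."""
    cp = ord(ch)
    starts = [lo for lo, _, _ in entries]
    i = bisect.bisect_right(starts, cp) - 1  # last range starting at or before cp
    if i >= 0 and cp <= entries[i][1]:
        return entries[i][2]
    return DEFAULT


sample = [
    '0021..0022 ; Soft # 1.1 [2] (!..") EXCLAMATION MARK..QUOTATION MARK',
    "0027 ; Soft # 1.1 (') APOSTROPHE",
    "002C ; Soft # 1.1 (,) COMMA",
]
term_entries = parse_link_term(sample)
print(link_term(term_entries, ","), link_term(term_entries, " "))
```

Rebuilding `starts` on every call keeps the sketch short; a real lookup table would compute it once.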
