Commit a6577cd

script to pub final files; templatize readmes
1 parent 0a889ad commit a6577cd

13 files changed: +216 -62 lines changed

docs/data-workflow.md

Lines changed: 35 additions & 19 deletions
@@ -30,17 +30,15 @@ Process:
 * Iterate between KenW and Michel.
 * Generated from UnicodeData.txt and an annotations file, using some C program.
 * Used for generating code charts.
-* KenW posts NamesList.txt into https://www.unicode.org/Public/draft/UCD/ucd/ .
+* KenW posts NamesList.txt somewhere.
 * A unicodetools GitHub contributor fetches this file
 and creates a pull request as for “regular” data files.
 
 ### Folder readmes
 
-The “source of truth” for these is outside of GitHub for now.
-KenW updates or vets these files and posts them to https://www.unicode.org/Public/draft/ .
-A unicodetools GitHub contributor fetches these files and creates a pull request as above.
-
-See https://github.com/unicode-org/properties/issues/8 “simplify versioning of readme files”
+The various ReadMe.txt files are checked into the unicodetools repo.
+They are templatized, and the publication scripts below replace variables with the
+Unicode and emoji versions, copyright year, and publication date (date when the script was run).
 
 ### “Regular” data files
 
@@ -122,27 +120,45 @@ from a unicodetools workspace to a target folder with the layout of https://www.
 Send the resulting zip file to Rick for posting to https://www.unicode.org/Public/draft/ .
 Ask Rick to add other files that are not tracked in the unicodetools repo:
 * Unihan.zip to .../draft/UCD/ucd
+* UCDXML files to .../draft/UCD/ucdxml
 * beta charts to .../draft/UCD/charts
 
 ### Publish a release
 
-TODO: Write a script like /pub/copy-release-to-draft.sh that will be run on the unicode.org server
-and copy the set of the .../dev/ data files for a beta snapshot
-from a unicodetools workspace to the location behind https://www.unicode.org/Public/draft/ .
+After the last UTC meeting for the release, collect all of the data file updates
+(mostly from recently opened action items).
 
+When complete, publish the draft files once more via the beta script.
 Verify the final set of files in the draft folder.
 
-TODO: Write a script like /pub/copy-draft-to-release.sh that will be run on the unicode.org server
-and copy the files from the location behind https://www.unicode.org/Public/draft/
-to the locations behind the version-specific release folders.
-For example:
-* https://www.unicode.org/Public/draft/UCD/ → https://www.unicode.org/Public/15.1.0/
-* https://www.unicode.org/Public/draft/UCA/ → https://www.unicode.org/Public/UCA/15.1.0/
-* https://www.unicode.org/Public/draft/emoji/ → https://www.unicode.org/Public/emoji/15.1/
-* etc.
+Run the [pub/copy-final.sh](https://github.com/unicode-org/unicodetools/blob/main/pub/copy-final.sh)
+script from an up-to-date repo workspace.
+
+Send the resulting zip file to Rick for posting to https://www.unicode.org/Public/ (not .../Public/draft/).
+Ask Rick to add other files that are not tracked in the unicodetools repo:
+* Unihan.zip to .../<version>/ucd
+* UCDXML files to .../<version>/ucdxml
+* final charts to .../<version>/charts
+
+This script works much like the beta script, except it:
+* assembles all of the files for Public/ in their release folder structure,
+rather than for Public/draft/
+* creates a zipped/<version> folder with UCD.zip
 
-After a Unicode release, copy a snapshot of the unicodetools repo .../dev/ files
-(matching the released files, of course) to a versioned unicodetools folder;
+### After a release
+
+Verify once more that the unicodetools repo .../dev/ files match the released/published files.
+(They better...)
+
+Copy a snapshot of the unicodetools repo .../dev/ files to a versioned unicodetools folder;
 for example: .../unicodetools/data/ucd/15.1.0/ .
 (We no longer append a “-Update” suffix to the folder name.)
 
+Create a release tag in the repo.
+
+Edit the pub/*.sh scripts and advance the version numbers and copyright years.
+
+Change the Unicode Tools code as necessary for the start of work on the next version.
+Settings.java lastVersion & latestVersion and more.
+
+Declare “main” to be open for the next version.
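
The readme templating that this documentation change describes can be tried in isolation. The following is a minimal sketch under stated assumptions, not the committed script: the sample ReadMe text and the temporary directory are made up for illustration, and it writes to a separate output file instead of editing in place, since `sed -i` takes different arguments in GNU and BSD sed.

```shell
#!/bin/sh
# Illustrative demo only; the sample text and paths are hypothetical.
set -eu

WORK=$(mktemp -d)
COPY_YEAR=2023
UNI_VER=15.1.0
# Portable equivalent of GNU "date --iso-8601".
PUB_DATE=$(date +%Y-%m-%d)

# A tiny stand-in for a templatized ReadMe.txt.
# The quoted delimiter keeps the shell from expanding anything here.
cat > "$WORK/ReadMe.txt" << 'eof'
# Date: PUB_DATE
# Copyright COPY_YEAR Unicode, Inc.
This directory contains PUB_STATUS data files for version UNI_VER.
eof

# The sed script. The unquoted delimiter lets the shell expand
# $COPY_YEAR etc. while the file is being written.
cat > "$WORK/sed-readmes.txt" << eof
s/COPY_YEAR/$COPY_YEAR/
s/PUB_DATE/$PUB_DATE/
s/PUB_STATUS/draft/
s/UNI_VER/$UNI_VER/
eof

# Apply the substitutions and show the result.
sed -f "$WORK/sed-readmes.txt" "$WORK/ReadMe.txt" > "$WORK/ReadMe.out"
cat "$WORK/ReadMe.out"
```

Each `s` command here has no `g` flag, so it replaces only the first occurrence of a variable on a line. That is enough as long as no template line uses the same variable twice.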

pub/copy-alpha-to-draft.sh

Lines changed: 25 additions & 0 deletions
@@ -10,6 +10,25 @@ DRAFT=$2
 
 UNITOOLS_DATA=$UNICODETOOLS/unicodetools/data
 
+# Adjust the following for each year and version as needed.
+COPY_YEAR=2023
+UNI_VER=15.1.0
+EMOJI_VER=15.1
+
+TODAY=`date --iso-8601`
+
+mkdir -p $DRAFT
+
+cat > $DRAFT/sed-readmes.txt << eof
+s/COPY_YEAR/$COPY_YEAR/
+s/PUB_DATE/$TODAY/
+s/PUB_STATUS/draft/
+s/UNI_VER/$UNI_VER/
+s/EMOJI_VER/$EMOJI_VER/
+s%PUBLIC_EMOJI%Public/draft/emoji/%
+s%PUBLIC_UCD_EMOJI%Public/draft/UCD/ucd/emoji/%
+eof
+
 mkdir -p $DRAFT/UCD/ucd
 cp -r $UNITOOLS_DATA/ucd/dev/* $DRAFT/UCD/ucd
 rm -r $DRAFT/UCD/ucd/Unihan
@@ -22,6 +41,12 @@ cp $UNITOOLS_DATA/emoji/dev/* $DRAFT/emoji
 # Fix permissions. Everyone can read, and search directories.
 chmod a+rX -R $DRAFT
 
+# Update the readmes in-place (-i) as set up above.
+find $DRAFT -name '*ReadMe.txt' | xargs sed -i -f $DRAFT/sed-readmes.txt
+
+# Cleanup
+rm $DRAFT/sed-readmes.txt
+
 rm $DRAFT/alpha.zip
 (cd $DRAFT; zip -r alpha.zip *)

pub/copy-beta-to-draft.sh

Lines changed: 29 additions & 3 deletions
@@ -10,6 +10,25 @@ DRAFT=$2
 
 UNITOOLS_DATA=$UNICODETOOLS/unicodetools/data
 
+# Adjust the following for each year and version as needed.
+COPY_YEAR=2023
+UNI_VER=15.1.0
+EMOJI_VER=15.1
+
+TODAY=`date --iso-8601`
+
+mkdir -p $DRAFT
+
+cat > $DRAFT/sed-readmes.txt << eof
+s/COPY_YEAR/$COPY_YEAR/
+s/PUB_DATE/$TODAY/
+s/PUB_STATUS/draft/
+s/UNI_VER/$UNI_VER/
+s/EMOJI_VER/$EMOJI_VER/
+s%PUBLIC_EMOJI%Public/draft/emoji/%
+s%PUBLIC_UCD_EMOJI%Public/draft/UCD/ucd/emoji/%
+eof
+
 mkdir -p $DRAFT/UCD/ucd
 cp -r $UNITOOLS_DATA/ucd/dev/* $DRAFT/UCD/ucd
 rm -r $DRAFT/UCD/ucd/Unihan
@@ -27,7 +46,7 @@ cp $UNITOOLS_DATA/idna/dev/* $DRAFT/idna
 
 mkdir -p $DRAFT/idna2008derived
 rm $DRAFT/idna2008derived/*
-cp $UNITOOLS_DATA/idna/idna2008derived/Idna2008-15.1.0.txt $DRAFT/idna2008derived
+cp $UNITOOLS_DATA/idna/idna2008derived/Idna2008-$UNI_VER.txt $DRAFT/idna2008derived
 cp $UNITOOLS_DATA/idna/idna2008derived/ReadMe.txt $DRAFT/idna2008derived
 
 mkdir -p $DRAFT/security
@@ -36,22 +55,29 @@ cp $UNITOOLS_DATA/security/dev/* $DRAFT/security
 # Fix permissions. Everyone can read, and search directories.
 chmod a+rX -R $DRAFT
 
+# Update the readmes in-place (-i) as set up above.
+find $DRAFT -name '*ReadMe.txt' | xargs sed -i -f $DRAFT/sed-readmes.txt
+
 # Zip files for some types of data, after fixing permissions
 rm $DRAFT/UCA/CollationTest.zip
 (cd $DRAFT/UCA; zip -r CollationTest.zip CollationTest && rm -r CollationTest)
 
 rm $DRAFT/security/*.zip
-(cd $DRAFT/security; zip -r uts39-data-15.1.0.zip *)
+(cd $DRAFT/security; zip -r uts39-data-$UNI_VER.zip *)
 
 # Fix permissions again to catch the zip files
 chmod a+rX -R $DRAFT
 
-# Zip file to deliver the whole set of beta data files
+# Cleanup
+rm $DRAFT/sed-readmes.txt
+
+# Zip file to deliver the whole set of data files
 rm $DRAFT/beta.zip
 (cd $DRAFT; zip -r beta.zip *)
 
 echo "--------------------"
 echo "Copy files from elsewhere:"
 echo "- Unihan.zip to $DRAFT/UCD/ucd"
+echo "- UCDXML files to $DRAFT/UCD/ucdxml"
 echo "- beta charts to $DRAFT/UCD/charts"
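
Because the readmes are now generated from templates, a variable that the sed pass misses would ship verbatim in a published ReadMe.txt. A check along the following lines could catch that before zipping. This is a hypothetical helper, not part of the commit: the function name `check_readmes` is invented here, and the variable list is copied from the sed script in this diff.

```shell
#!/bin/sh
# Hypothetical helper (not in the commit): report template variables that
# survived the sed pass in any *ReadMe.txt file under a directory.
check_readmes() {
    dir=$1
    # grep -l lists the files that match; -E enables alternation.
    # "|| true" keeps a no-match result from counting as a failure here.
    leftovers=$(find "$dir" -name '*ReadMe.txt' -exec \
        grep -lE 'COPY_YEAR|PUB_DATE|PUB_STATUS|UNI_VER|EMOJI_VER|PUBLIC_EMOJI|PUBLIC_UCD_EMOJI' {} + || true)
    if [ -n "$leftovers" ]; then
        echo "Unreplaced template variables in:" >&2
        echo "$leftovers" >&2
        return 1
    fi
    return 0
}
```

It would be called after the `find ... | xargs sed -i ...` step, for example as `check_readmes "$DRAFT" || exit 1`.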

pub/copy-final.sh

Lines changed: 87 additions & 0 deletions
@@ -0,0 +1,87 @@
+# Script for
+# https://github.com/unicode-org/unicodetools/blob/main/docs/data-workflow.md#publish-a-release
+#
+# Invoke like this:
+#
+# pub/copy-final.sh ~/unitools/mine/src /tmp/unicode/Public/final
+
+UNICODETOOLS=$1
+DEST=$2
+
+UNITOOLS_DATA=$UNICODETOOLS/unicodetools/data
+
+# Adjust the following for each year and version as needed.
+COPY_YEAR=2023
+UNI_VER=15.1.0
+EMOJI_VER=15.1
+
+TODAY=`date --iso-8601`
+
+mkdir -p $DEST
+
+cat > $DEST/sed-readmes.txt << eof
+s/COPY_YEAR/$COPY_YEAR/
+s/PUB_DATE/$TODAY/
+s/PUB_STATUS/final/
+s/UNI_VER/$UNI_VER/
+s/EMOJI_VER/$EMOJI_VER/
+s%PUBLIC_EMOJI%Public/emoji/$EMOJI_VER/%
+s%PUBLIC_UCD_EMOJI%Public/$UNI_VER/ucd/emoji/%
+eof
+
+mkdir -p $DEST/$UNI_VER/ucd
+mkdir -p $DEST/zipped/$UNI_VER
+cp -r $UNITOOLS_DATA/ucd/dev/* $DEST/$UNI_VER/ucd
+rm -r $DEST/$UNI_VER/ucd/Unihan
+mv $DEST/$UNI_VER/ucd/version-ReadMe.txt $DEST/$UNI_VER/ReadMe.txt
+mv $DEST/$UNI_VER/ucd/zipped-ReadMe.txt $DEST/zipped/$UNI_VER/ReadMe.txt
+
+mkdir -p $DEST/UCA/$UNI_VER
+cp -r $UNITOOLS_DATA/uca/dev/* $DEST/UCA/$UNI_VER
+
+mkdir -p $DEST/emoji/$EMOJI_VER
+cp $UNITOOLS_DATA/emoji/dev/* $DEST/emoji/$EMOJI_VER
+
+mkdir -p $DEST/idna/$UNI_VER
+cp $UNITOOLS_DATA/idna/dev/* $DEST/idna/$UNI_VER
+
+mkdir -p $DEST/idna/idna2008derived
+rm $DEST/idna/idna2008derived/*
+cp $UNITOOLS_DATA/idna/idna2008derived/Idna2008-$UNI_VER.txt $DEST/idna/idna2008derived
+cp $UNITOOLS_DATA/idna/idna2008derived/ReadMe.txt $DEST/idna/idna2008derived
+
+mkdir -p $DEST/security/$UNI_VER
+cp $UNITOOLS_DATA/security/dev/* $DEST/security/$UNI_VER
+
+# Fix permissions. Everyone can read, and search directories.
+chmod a+rX -R $DEST
+
+# Update the readmes in-place (-i) as set up above.
+find $DEST -name '*ReadMe.txt' | xargs sed -i -f $DEST/sed-readmes.txt
+
+# Zip files for some types of data, after fixing permissions
+rm $DEST/$UNI_VER/ucd/UCD.zip
+(cd $DEST/$UNI_VER/ucd; zip -r UCD.zip * && mv UCD.zip $DEST/zipped/$UNI_VER)
+
+rm $DEST/UCA/$UNI_VER/CollationTest.zip
+(cd $DEST/UCA/$UNI_VER; zip -r CollationTest.zip CollationTest && rm -r CollationTest)
+
+rm $DEST/security/$UNI_VER/*.zip
+(cd $DEST/security/$UNI_VER; zip -r uts39-data-$UNI_VER.zip *)
+
+# Fix permissions again to catch the zip files
+chmod a+rX -R $DEST
+
+# Cleanup
+rm $DEST/sed-readmes.txt
+
+# Zip file to deliver the whole set of data files
+rm $DEST/final.zip
+(cd $DEST; zip -r final.zip *)
+
+echo "--------------------"
+echo "Copy files from elsewhere:"
+echo "- Unihan.zip to $DEST/$UNI_VER/ucd"
+echo "- UCDXML files to $DEST/$UNI_VER/ucdxml"
+echo "- final charts to $DEST/$UNI_VER/charts"
+
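
The workflow document asks to verify that the .../dev/ files match the published files. One way to sketch that check is a recursive diff that skips the readmes, which differ by design once the template variables are replaced. The function name `compare_trees` is invented for this example, and the `-x` exclude option is assumed to be available (it is in GNU diff but is not required by POSIX).

```shell
#!/bin/sh
# Hypothetical helper (not in the commit): compare two directory trees,
# ignoring ReadMe files, and return non-zero if anything else differs.
compare_trees() {
    # -r recurses into subdirectories.
    # -q prints only the names of files that differ.
    # -x skips files whose base name matches the pattern.
    diff -r -q -x '*ReadMe.txt' "$1" "$2"
}
```

For a folder that the script copies without further changes, such as emoji, this might be run as `compare_trees "$UNITOOLS_DATA/emoji/dev" "$DEST/emoji/$EMOJI_VER"`. Folders where the script removes Unihan or adds zip files would report those as differences.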
Lines changed: 6 additions & 6 deletions
@@ -1,21 +1,21 @@
 # Unicode Emoji
-# © 2023 Unicode®, Inc.
+# © COPY_YEAR Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use, see https://www.unicode.org/terms_of_use.html
 
-This directory contains draft data files for Unicode Emoji, Version 15.1
+This directory contains PUB_STATUS data files for Unicode Emoji, Version EMOJI_VER
 
-Public/draft/emoji/
+PUBLIC_EMOJI
 
 emoji-sequences.txt
 emoji-zwj-sequences.txt
 emoji-test.txt
 
-The following related files are found in the UCD for Version 15.1
+The following related files are found in the UCD for Version EMOJI_VER
 
-Public/draft/UCD/ucd/emoji/
+PUBLIC_UCD_EMOJI
 
 emoji-data.txt
 emoji-variation-sequences.txt
 
-For documentation, see UTS #51 Unicode Emoji, Version 15.1
+For documentation, see UTS #51 Unicode Emoji, Version EMOJI_VER
Lines changed: 3 additions & 3 deletions
@@ -1,10 +1,10 @@
 # Unicode IDNA Mapping and Test Data
-# Date: 2023-01-30
-# © 2023 Unicode®, Inc.
+# Date: PUB_DATE
+# © COPY_YEAR Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use, see https://www.unicode.org/terms_of_use.html
 
-This directory contains draft data files for version 15.1.0 of
+This directory contains PUB_STATUS data files for version UNI_VER of
 UTS #46, Unicode IDNA Compatibility Processing.
 
 https://www.unicode.org/reports/tr46/

unicodetools/data/idna/idna2008derived/ReadMe.txt

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 # IDNA2008_Category Property
-# Date: 2023-05-16
-# © 2023 Unicode®, Inc.
+# Date: PUB_DATE
+# © COPY_YEAR Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use, see https://www.unicode.org/terms_of_use.html
66

Lines changed: 2 additions & 2 deletions
@@ -1,7 +1,7 @@
 # Unicode Security Data (Confusables and Identifiers)
-# © 2023 Unicode®, Inc.
+# © COPY_YEAR Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use, see https://www.unicode.org/terms_of_use.html
 
-This directory contains the data files for Version 15.1.0 of
+This directory contains the data files for Version UNI_VER of
 UTS #39: Unicode Security Mechanisms (https://www.unicode.org/reports/tr39/)

unicodetools/data/uca/dev/ReadMe.txt

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 # Unicode Collation Algorithm
-# Date: 2023-01-30
-# © 2023 Unicode®, Inc.
+# Date: PUB_DATE
+# © COPY_YEAR Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use, see https://www.unicode.org/terms_of_use.html
 #
Lines changed: 5 additions & 5 deletions
@@ -1,6 +1,6 @@
 # Unicode Character Database
-# Date: 2023-01-30
-# © 2023 Unicode®, Inc.
+# Date: PUB_DATE
+# © COPY_YEAR Unicode®, Inc.
 # Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
 # For terms of use, see https://www.unicode.org/terms_of_use.html
 #
@@ -10,7 +10,7 @@
 # UAX #44, "Unicode Character Database"
 # UTS #51, "Unicode Emoji"
 #
-# The UAXes and UTS #51 can be accessed at https://www.unicode.org/versions/Unicode15.1.0/
+# The UAXes and UTS #51 can be accessed at https://www.unicode.org/versions/UnicodeUNI_VER/
 
-This directory contains draft data files
-for the Unicode Character Database, for Version 15.1.0 of the Unicode Standard.
+This directory contains PUB_STATUS data files
+for the Unicode Character Database, for Version UNI_VER of the Unicode Standard.
