Commit 28ba53c

masahir0y authored and tytso committed
unicode: refactor the rule for regenerating utf8data.h
scripts/mkutf8data is used only when regenerating utf8data.h, which never happens in the normal kernel build. However, it is irrespectively built if CONFIG_UNICODE is enabled.

Moreover, there is no good reason for it to reside in the scripts/ directory since it is only used in fs/unicode/.

Hence, move it from scripts/ to fs/unicode/.

In some cases, we bypass build artifacts in the normal build. The conventional way to do so is to surround the code with ifdef REGENERATE_*.

For example,

 - 7373f4f ("kbuild: add implicit rules for parser generation")
 - 6aaf49b ("crypto: arm,arm64 - Fix random regeneration of S_shipped")

I rewrote the rule in a more kbuild'ish style.

In the normal build, utf8data.h is just shipped from the check-in file.

  $ make
    [ snip ]
    SHIPPED fs/unicode/utf8data.h
    CC      fs/unicode/utf8-norm.o
    CC      fs/unicode/utf8-core.o
    CC      fs/unicode/utf8-selftest.o
    AR      fs/unicode/built-in.a

If you want to generate utf8data.h based on UCD, put *.txt files into fs/unicode/, then pass REGENERATE_UTF8DATA=1 from the command line. The mkutf8data tool will be automatically compiled to generate the utf8data.h from the *.txt files.

  $ make REGENERATE_UTF8DATA=1
    [ snip ]
    HOSTCC  fs/unicode/mkutf8data
    GEN     fs/unicode/utf8data.h
    CC      fs/unicode/utf8-norm.o
    CC      fs/unicode/utf8-core.o
    CC      fs/unicode/utf8-selftest.o
    AR      fs/unicode/built-in.a

I renamed the check-in utf8data.h to utf8data.h_shipped so that this will work for the out-of-tree build.

You can update it based on the latest UCD like this:

  $ make REGENERATE_UTF8DATA=1 fs/unicode/
  $ cp fs/unicode/utf8data.h fs/unicode/utf8data.h_shipped

Also, I added entries to .gitignore and dontdiff.

Signed-off-by: Masahiro Yamada <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
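The update step in the commit message (regenerate, then copy over the shipped file) could be wrapped in a small helper that refuses to copy an empty header and prints line counts for review. This is only a sketch: the function name promote_utf8data is hypothetical and not part of the kernel tree, and it assumes an in-tree build so that the generated header lands in fs/unicode/.

```shell
#!/bin/sh
# Hypothetical helper (not part of the kernel tree): sanity-check a freshly
# generated utf8data.h and, if it is non-empty, copy it over the shipped file.
#
# usage: promote_utf8data <generated> <shipped>
promote_utf8data() {
    generated=$1
    shipped=$2

    # Refuse to overwrite the shipped file with a missing or empty header.
    if [ ! -s "$generated" ]; then
        echo "error: $generated is missing or empty" >&2
        return 1
    fi

    # Report line counts so they can be compared with the figure that
    # README.utf8data gives for the UCD version in use.
    new_lines=$(wc -l < "$generated" | tr -d ' ')
    old_lines=0
    if [ -f "$shipped" ]; then
        old_lines=$(wc -l < "$shipped" | tr -d ' ')
    fi
    echo "lines: $new_lines (generated), $old_lines (shipped)"

    cp "$generated" "$shipped"
}

# Intended use from the top of a kernel tree (in-tree build assumed):
#   make REGENERATE_UTF8DATA=1 fs/unicode/
#   promote_utf8data fs/unicode/utf8data.h fs/unicode/utf8data.h_shipped
```

The line count is reported rather than enforced because the expected value depends on the UCD version.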
1 parent 0a790fe commit 28ba53c

7 files changed: 38 additions, 17 deletions

Documentation/dontdiff

Lines changed: 2 additions & 0 deletions
@@ -176,6 +176,7 @@ mkprep
 mkregtable
 mktables
 mktree
+mkutf8data
 modpost
 modules.builtin
 modules.order
@@ -254,6 +255,7 @@ vsyscall_32.lds
 wanxlfw.inc
 uImage
 unifdef
+utf8data.h
 wakeup.bin
 wakeup.elf
 wakeup.lds

fs/unicode/.gitignore

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+mkutf8data
+utf8data.h

fs/unicode/Makefile

Lines changed: 30 additions & 11 deletions
@@ -5,15 +5,34 @@ obj-$(CONFIG_UNICODE_NORMALIZATION_SELFTEST) += utf8-selftest.o
 
 unicode-y := utf8-norm.o utf8-core.o
 
-# This rule is not invoked during the kernel compilation. It is used to
-# regenerate the utf8data.h header file.
-utf8data.h.new: *.txt $(objdir)/scripts/mkutf8data
-	$(objdir)/scripts/mkutf8data \
-		-a DerivedAge.txt \
-		-c DerivedCombiningClass.txt \
-		-p DerivedCoreProperties.txt \
-		-d UnicodeData.txt \
-		-f CaseFolding.txt \
-		-n NormalizationCorrections.txt \
-		-t NormalizationTest.txt \
+$(obj)/utf8-norm.o: $(obj)/utf8data.h
+
+# In the normal build, the checked-in utf8data.h is just shipped.
+#
+# To generate utf8data.h from UCD, put *.txt files in this directory
+# and pass REGENERATE_UTF8DATA=1 from the command line.
+ifdef REGENERATE_UTF8DATA
+
+quiet_cmd_utf8data = GEN     $@
+      cmd_utf8data = $< \
+		-a $(srctree)/$(src)/DerivedAge.txt \
+		-c $(srctree)/$(src)/DerivedCombiningClass.txt \
+		-p $(srctree)/$(src)/DerivedCoreProperties.txt \
+		-d $(srctree)/$(src)/UnicodeData.txt \
+		-f $(srctree)/$(src)/CaseFolding.txt \
+		-n $(srctree)/$(src)/NormalizationCorrections.txt \
+		-t $(srctree)/$(src)/NormalizationTest.txt \
 		-o $@
+
+$(obj)/utf8data.h: $(obj)/mkutf8data $(filter %.txt, $(cmd_utf8data)) FORCE
+	$(call if_changed,utf8data)
+
+else
+
+$(obj)/utf8data.h: $(src)/utf8data.h_shipped FORCE
+	$(call if_changed,shipped)
+
+endif
+
+targets += utf8data.h
+hostprogs-y += mkutf8data
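
The prerequisite list in the new rule is derived from the command itself: `$(filter %.txt, $(cmd_utf8data))` keeps only the whitespace-separated words of the command that end in `.txt`, so the UCD input files need not be listed twice. A rough shell analogue of that filtering step is sketched below. The function name filter_txt and the sample paths are illustrative assumptions, not taken from the kernel tree.

```shell
#!/bin/sh
# Rough shell analogue of GNU make's $(filter %.txt, WORDS): print each
# argument that ends in ".txt", one per line, and drop everything else.
filter_txt() {
    for word in "$@"; do
        case $word in
            *.txt) printf '%s\n' "$word" ;;
        esac
    done
}

# Shortened stand-in for the expanded cmd_utf8data string (paths are
# illustrative). Left unquoted on purpose so the shell splits it into
# words, much as make treats the command as a list of words.
cmd='-a src/DerivedAge.txt -d src/UnicodeData.txt -o'
filter_txt $cmd
```

Option flags such as `-a` and `-o` do not match the pattern and drop out, leaving only the input files as prerequisites.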

fs/unicode/README.utf8data

Lines changed: 4 additions & 5 deletions
@@ -55,15 +55,14 @@ released version of the UCD can be found here:
 
   http://www.unicode.org/Public/UCD/latest/
 
-To build the utf8data.h file, from a kernel tree that has been built,
-cd to this directory (fs/unicode) and run this command:
+Then, build under fs/unicode/ with REGENERATE_UTF8DATA=1:
 
-	make C=../.. objdir=../.. utf8data.h.new
+	make REGENERATE_UTF8DATA=1 fs/unicode/
 
-After sanity checking the newly generated utf8data.h.new file (the
+After sanity checking the newly generated utf8data.h file (the
 version generated from the 12.1.0 UCD should be 4,109 lines long, and
 have a total size of 324k) and/or comparing it with the older version
-of utf8data.h, rename it to utf8data.h.
+of utf8data.h_shipped, rename it to utf8data.h_shipped.
 
 If you are a kernel developer updating to a newer version of the
 Unicode Character Database, please update this README.utf8data file
File renamed without changes.
File renamed without changes.

scripts/Makefile

Lines changed: 0 additions & 1 deletion
@@ -20,7 +20,6 @@ hostprogs-$(CONFIG_ASN1) += asn1_compiler
 hostprogs-$(CONFIG_MODULE_SIG) += sign-file
 hostprogs-$(CONFIG_SYSTEM_TRUSTED_KEYRING) += extract-cert
 hostprogs-$(CONFIG_SYSTEM_EXTRA_CERTIFICATE) += insert-sys-cert
-hostprogs-$(CONFIG_UNICODE) += mkutf8data
 
 HOSTCFLAGS_sortextable.o = -I$(srctree)/tools/include
 HOSTCFLAGS_asn1_compiler.o = -I$(srctree)/include
