
Commit 540e7bc

PhilipOakley authored and gitster committed
doc: pretty-formats note wide char limitations, and add tests
The previous commits added clarifications to the column alignment placeholders, note that the spaces are optional around the parameters.

Also, a proposed extension [1] to allow hard truncation (without ellipsis '..') highlighted that the existing code does not play well with wide characters, such as Asian fonts and emojis.

For example, N wide characters take 2N columns so won't fit an odd number column width, causing misalignment somewhere.

Further analysis also showed that decomposed characters, e.g. separate `a` + `umlaut` Unicode code-points may also be mis-counted, in some cases leaving multiple loose `umlauts` all combined together.

Add some notes about these limitations, and add basic tests to demonstrate them.

The chosen solution for the tests is to substitute any wide character that overlaps a splitting boundary for the unicode vertical ellipsis code point as a rare but 'obvious' substitution.

An alternative could be the substitution with a single dot '.' which matches regular expression usage, and our two dot ellipsis, and further in scenarios where the bulk of the text is wide characters, would be obvious. In mainly 'ascii' scenarios a singleton emoji being substituted by a dot could be confusing.

It is enough that the tests fail cleanly. The final choice for the substitute character can be deferred.

[1] https://lore.kernel.org/git/[email protected]/

Signed-off-by: Philip Oakley <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
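The mis-counting described in the commit message comes from three different measures of length (bytes, code points, display columns) that only agree for plain ASCII. The following is a minimal sketch, not part of the commit, that measures the first two for the same byte sequences the new test uses. The helper names `byte_len` and `codepoint_len` are hypothetical, and the code-point count assumes valid UTF-8 input (it counts every byte that is not a continuation byte).

```shell
#!/bin/sh
# byte_len STR: number of bytes in STR (hypothetical helper).
byte_len () {
	printf '%s' "$1" | wc -c | tr -d ' '
}

# codepoint_len STR: number of code points in STR, assuming valid UTF-8.
# Continuation bytes (octal 200-277) are deleted, so only the first byte
# of each encoded code point is left to be counted.
codepoint_len () {
	printf '%s' "$1" | LC_ALL=C tr -d '\200-\277' | wc -c | tr -d ' '
}

utf8_nfc=$(printf '\303\251')            # U+00E9, precomposed e acute
utf8_nfd=$(printf '\145\314\201')        # U+0065 + U+0301, decomposed e acute
utf8_emoji=$(printf '\360\237\221\250')  # U+1F468, a wide character

for s in "$utf8_nfc" "$utf8_nfd" "$utf8_emoji"
do
	echo "bytes=$(byte_len "$s") codepoints=$(codepoint_len "$s")"
done
# bytes=2 codepoints=1
# bytes=3 codepoints=2
# bytes=4 codepoints=1
```

The decomposed form has two code points but is drawn in one column, while the emoji has one code point but is normally drawn in two columns, so neither count can stand in for display width.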
1 parent b5cd634 commit 540e7bc

2 files changed: +32 -0 lines changed


Documentation/pretty-formats.txt

Lines changed: 5 additions & 0 deletions
@@ -157,6 +157,11 @@ The placeholders are:
 only works correctly with N >= 2.
 Note 2: spaces around the N and M (see below)
 values are optional.
+Note 3: Emojis and other wide characters
+will take two display columns, which may
+over-run column boundaries.
+Note 4: decomposed character combining marks
+may be misplaced at padding boundaries.
 '%<|( <M> )':: make the next placeholder take at least until Mth
 display column, padding spaces on the right if necessary.
 Use negative M values for column positions measured
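The over-run mentioned in Note 3 follows from simple arithmetic: a run of two-column characters can only fill an even number of columns. The sketch below is an illustration only and does not reflect how Git computes widths; the helper name `fit_wide` is hypothetical.

```shell
#!/bin/sh
# fit_wide WIDTH: print how many two-column characters fit into WIDTH
# display columns, followed by the number of columns left unfilled.
# (Hypothetical helper for illustration.)
fit_wide () {
	width=$1
	echo "$((width / 2)) $((width % 2))"
}

fit_wide 5    # prints "2 1": two wide characters, one column left over
fit_wide 10   # prints "5 0": five wide characters fill the field exactly
```

With an odd width such as 5, the leftover column must either stay empty or be filled by a single-column substitute, which is the role the vertical ellipsis plays in the accompanying test.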

t/t4205-log-pretty-formats.sh

Lines changed: 27 additions & 0 deletions
@@ -1018,4 +1018,31 @@ test_expect_success '%(describe:abbrev=...) vs git describe --abbrev=...' '
 test_cmp expect actual
 '
 
+# pretty-formats note wide char limitations, and add tests
+test_expect_failure 'wide and decomposed characters column counting' '
+
+# from t/lib-unicode-nfc-nfd.sh hex values converted to octal
+utf8_nfc=$(printf "\303\251") && # e acute combined.
+utf8_nfd=$(printf "\145\314\201") && # e with a combining acute (i.e. decomposed)
+utf8_emoji=$(printf "\360\237\221\250") &&
+
+# replacement character when requesting a wide char fits in a single display colum.
+# "half wide" alternative could be a plain ASCII dot `.`
+utf8_vert_ell=$(printf "\342\213\256") &&
+
+# use ${xxx} here!
+nfc10="${utf8_nfc}${utf8_nfc}${utf8_nfc}${utf8_nfc}${utf8_nfc}${utf8_nfc}${utf8_nfc}${utf8_nfc}${utf8_nfc}${utf8_nfc}" &&
+nfd10="${utf8_nfd}${utf8_nfd}${utf8_nfd}${utf8_nfd}${utf8_nfd}${utf8_nfd}${utf8_nfd}${utf8_nfd}${utf8_nfd}${utf8_nfd}" &&
+emoji5="${utf8_emoji}${utf8_emoji}${utf8_emoji}${utf8_emoji}${utf8_emoji}" &&
+# emoji5 uses 10 display columns
+
+test_commit "abcdefghij" &&
+test_commit --no-tag "${nfc10}" &&
+test_commit --no-tag "${nfd10}" &&
+test_commit --no-tag "${emoji5}" &&
+printf "${utf8_emoji}..${utf8_emoji}${utf8_vert_ell}\n${utf8_nfd}..${utf8_nfd}${utf8_nfd}\n${utf8_nfc}..${utf8_nfc}${utf8_nfc}\na..ij\n" >expected &&
+git log --format="%<(5,mtrunc)%s" -4 >actual &&
+test_cmp expected actual
+'
+
 test_done
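The test builds `nfc10`, `nfd10`, and `emoji5` by writing each variable out the required number of times. As a sketch of one alternative, and not something the commit does, a small loop could produce the same strings. The helper name `repeat` is hypothetical and is not part of Git's test library.

```shell
#!/bin/sh
# repeat STR COUNT: print STR COUNT times with no separator.
# (Hypothetical helper; not provided by the test framework.)
repeat () {
	str=$1 count=$2 out=
	while [ "$count" -gt 0 ]
	do
		out="$out$str"
		count=$((count - 1))
	done
	printf '%s' "$out"
}

utf8_emoji=$(printf '\360\237\221\250')
emoji5=$(repeat "$utf8_emoji" 5)   # same bytes as the hand-written emoji5
```

Spelling the strings out, as the commit does, keeps the test free of extra helpers, which may be preferable in a test whose only job is to demonstrate a known failure.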
