Commit 462749b

René Scharfe authored and gitster committed
utf8.c: speculatively assume utf-8 in strbuf_add_wrapped_text()
is_utf8() works by calling utf8_width() for each character at the
supplied location.  In strbuf_add_wrapped_text(), we do that anyway
while wrapping the lines.  So instead of checking the encoding
beforehand, optimistically assume that it's utf-8 and wrap along
until an invalid character is hit, and when that happens start over.
This pays off if the text consists only of valid utf-8 characters.

The following command was run against the Linux kernel repo with
git 1.7.0:

    $ time git log --format='%b' v2.6.32 >/dev/null

    real    0m2.679s
    user    0m2.580s
    sys     0m0.100s

    $ time git log --format='%w(60,4,8)%b' >/dev/null

    real    0m4.342s
    user    0m4.230s
    sys     0m0.110s

And with this patch series:

    $ time git log --format='%w(60,4,8)%b' >/dev/null

    real    0m3.741s
    user    0m3.630s
    sys     0m0.110s

So the cost of wrapping is reduced to 70% in this case.

Signed-off-by: Rene Scharfe <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
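
The speculative approach described in the message can be illustrated outside of git with a minimal sketch. It is not the code from this commit: seq_len() and count_columns() are hypothetical names, seq_len() is a deliberately simplified validator rather than git's utf8_width(), and every character is counted as one column instead of using real display widths.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not git's utf8_width()): return the byte length
 * of the UTF-8 sequence starting at s, or 0 if the bytes do not form a
 * plausible sequence.  Simplified: only lead and continuation byte
 * patterns are checked.
 */
static size_t seq_len(const unsigned char *s)
{
	size_t n, i;

	if (s[0] < 0x80)
		return 1;
	else if (s[0] >= 0xc2 && s[0] <= 0xdf)
		n = 2;
	else if ((s[0] & 0xf0) == 0xe0)
		n = 3;
	else if (s[0] >= 0xf0 && s[0] <= 0xf4)
		n = 4;
	else
		return 0;
	/* A NUL terminator fails this check, so we never read past it. */
	for (i = 1; i < n; i++)
		if ((s[i] & 0xc0) != 0x80)
			return 0;
	return n;
}

/*
 * Count columns in text, one per character.  Start out assuming UTF-8;
 * on the first invalid sequence, throw away the partial result and
 * start over, counting one column per byte instead.
 */
static int count_columns(const char *text, int *used_fallback)
{
	const char *start = text;
	int assume_utf8 = 1;
	int w;

retry:
	w = 0;
	while (*text) {
		if (assume_utf8) {
			size_t n = seq_len((const unsigned char *)text);
			if (!n) {
				/* Speculation failed: rewind and redo bytewise. */
				assume_utf8 = 0;
				text = start;
				goto retry;
			}
			text += n;
		} else {
			text++;
		}
		w++;
	}
	*used_fallback = !assume_utf8;
	return w;
}
```

Valid input is scanned once, where a separate up-front validation pass would scan it twice. Invalid input costs a partial scan plus a full bytewise scan. Code that also writes output while scanning has to undo that output before retrying, which the patch does by remembering buf->len and calling strbuf_setlen().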
1 parent 68ad5e1 commit 462749b

1 file changed: +17 -6 lines changed


utf8.c

Lines changed: 17 additions & 6 deletions
@@ -324,16 +324,21 @@ static size_t display_mode_esc_sequence_len(const char *s)
  * consumed (and no extra indent is necessary for the first line).
  */
 int strbuf_add_wrapped_text(struct strbuf *buf,
-		const char *text, int indent, int indent2, int width)
+		const char *text, int indent1, int indent2, int width)
 {
-	int w = indent, assume_utf8 = is_utf8(text);
-	const char *bol = text, *space = NULL;
+	int indent, w, assume_utf8 = 1;
+	const char *bol, *space, *start = text;
+	size_t orig_len = buf->len;
 
 	if (width <= 0) {
-		strbuf_add_indented_text(buf, text, indent, indent2);
+		strbuf_add_indented_text(buf, text, indent1, indent2);
 		return 1;
 	}
 
+retry:
+	bol = text;
+	w = indent = indent1;
+	space = NULL;
 	if (indent < 0) {
 		w = -indent;
 		space = text;
@@ -385,9 +390,15 @@ int strbuf_add_wrapped_text(struct strbuf *buf,
 			}
 			continue;
 		}
-		if (assume_utf8)
+		if (assume_utf8) {
 			w += utf8_width(&text, NULL);
-		else {
+			if (!text) {
+				assume_utf8 = 0;
+				text = start;
+				strbuf_setlen(buf, orig_len);
+				goto retry;
+			}
+		} else {
 			w++;
 			text++;
 		}
