Commit 3270741

utf8: refactor code to decide fallback encoding
The codepath we use to call iconv_open() has a provision to use a fallback encoding when it fails, hoping that "UTF-8" being spelled differently could be the reason why the library function did not like the encoding names we gave it. Essentially, we turn what we have observed to be used as variants of "UTF-8" (e.g. "utf8") into the most official spelling and use that as a fallback.

We do the same thing for input and output encoding. Introduce a helper function to do just one side and call that twice.

Signed-off-by: Junio C Hamano <[email protected]>
1 parent 0b65a8d commit 3270741

1 file changed: +18 -11 lines changed

utf8.c

Lines changed: 18 additions & 11 deletions
@@ -489,6 +489,21 @@ char *reencode_string_iconv(const char *in, size_t insz, iconv_t conv, int *outs
 	return out;
 }
 
+static const char *fallback_encoding(const char *name)
+{
+	/*
+	 * Some platforms do not have the variously spelled variants of
+	 * UTF-8, so let's fall back to trying the most official
+	 * spelling. We do so only as a fallback in case the platform
+	 * does understand the user's spelling, but not our official
+	 * one.
+	 */
+	if (is_encoding_utf8(name))
+		return "UTF-8";
+
+	return name;
+}
+
 char *reencode_string_len(const char *in, int insz,
 			  const char *out_encoding, const char *in_encoding,
 			  int *outsz)
@@ -501,17 +516,9 @@ char *reencode_string_len(const char *in, int insz,
 
 	conv = iconv_open(out_encoding, in_encoding);
 	if (conv == (iconv_t) -1) {
-		/*
-		 * Some platforms do not have the variously spelled variants of
-		 * UTF-8, so let's fall back to trying the most official
-		 * spelling. We do so only as a fallback in case the platform
-		 * does understand the user's spelling, but not our official
-		 * one.
-		 */
-		if (is_encoding_utf8(in_encoding))
-			in_encoding = "UTF-8";
-		if (is_encoding_utf8(out_encoding))
-			out_encoding = "UTF-8";
+		in_encoding = fallback_encoding(in_encoding);
+		out_encoding = fallback_encoding(out_encoding);
+
 		conv = iconv_open(out_encoding, in_encoding);
 		if (conv == (iconv_t) -1)
 			return NULL;
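
For readers who want to try the retry pattern outside of Git, the following is a minimal standalone sketch of it, not the code from utf8.c. It makes two assumptions: is_encoding_utf8() here is a simplified stand-in that only recognizes "utf-8" and "utf8" case-insensitively and expects a non-NULL name, and open_with_fallback() is a hypothetical wrapper name invented for this illustration. Only fallback_encoding() mirrors the helper introduced by the commit.

```c
#include <iconv.h>
#include <strings.h>

/*
 * Simplified stand-in (assumption): Git has its own is_encoding_utf8()
 * in utf8.c. This version only matches "utf-8" and "utf8", ignoring
 * case, and expects a non-NULL name.
 */
static int is_encoding_utf8(const char *name)
{
	return !strcasecmp(name, "utf-8") || !strcasecmp(name, "utf8");
}

/* Same shape as the helper in the commit: normalize one side only. */
static const char *fallback_encoding(const char *name)
{
	if (is_encoding_utf8(name))
		return "UTF-8";

	return name;
}

/*
 * Hypothetical wrapper (not in Git): try the names as given first and
 * retry with the official spelling only if the first attempt fails.
 * Returns (iconv_t) -1 when both attempts fail.
 */
static iconv_t open_with_fallback(const char *out_encoding,
				  const char *in_encoding)
{
	iconv_t conv = iconv_open(out_encoding, in_encoding);

	if (conv == (iconv_t) -1)
		conv = iconv_open(fallback_encoding(out_encoding),
				  fallback_encoding(in_encoding));
	return conv;
}
```

Because the helper handles a single name, the input and output sides go through the same call, which is the duplication the commit removes. A caller would check the result of open_with_fallback() against (iconv_t) -1 and release a successful descriptor with iconv_close().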
