Commit 1768383

newren authored and gitster committed
fast-export: avoid stripping encoding header if we cannot reencode
When fast-export encounters a commit with an 'encoding' header, it tries
to reencode in utf-8 and then drops the encoding header. However, if it
fails to reencode in utf-8 because e.g. one of the characters in the
commit message was invalid in the old encoding, then we need to retain
the original encoding or otherwise we lose information needed to
understand all the other (valid) characters in the original commit
message.

Signed-off-by: Elijah Newren <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
1 parent 328bd09 commit 1768383

3 files changed

+23
-2
lines changed

builtin/fast-export.c

Lines changed: 5 additions & 2 deletions
@@ -642,9 +642,12 @@ static void handle_commit(struct commit *commit, struct rev_info *rev,
 	printf("commit %s\nmark :%"PRIu32"\n", refname, last_idnum);
 	if (show_original_ids)
 		printf("original-oid %s\n", oid_to_hex(&commit->object.oid));
-	printf("%.*s\n%.*s\ndata %u\n%s",
+	printf("%.*s\n%.*s\n",
 	       (int)(author_end - author), author,
-	       (int)(committer_end - committer), committer,
+	       (int)(committer_end - committer), committer);
+	if (!reencoded && encoding)
+		printf("encoding %s\n", encoding);
+	printf("data %u\n%s",
 	       (unsigned)(reencoded
 			  ? strlen(reencoded) : message
 			  ? strlen(message) : 0),
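
The decision made in this hunk can be isolated as a small standalone function. The following is only a sketch under stated assumptions: `format_commit_tail` is a hypothetical helper that does not exist in git, and it writes into a caller-supplied buffer instead of calling `printf` directly. It emits the `encoding` line only when reencoding failed (`reencoded` is NULL) and the commit carried an encoding header, so the `data` payload that follows can still be interpreted.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of git): formats the "encoding"/"data"
 * tail of a fast-export commit record into buf.
 *
 * reencoded: message converted to UTF-8, or NULL if conversion failed
 * message:   original message bytes, or NULL if there is no message
 * encoding:  value of the commit's 'encoding' header, or NULL if absent
 *
 * Returns the length of the formatted text when it fits in buf.
 */
static int format_commit_tail(char *buf, size_t cap,
			      const char *reencoded,
			      const char *message,
			      const char *encoding)
{
	/* Same precedence as the diff: reencoded, else message, else empty. */
	const char *body = reencoded ? reencoded : message ? message : "";
	int n = 0;

	assert(buf && cap > 0);
	buf[0] = '\0';

	/* Keep the header only when the original bytes are emitted as-is. */
	if (!reencoded && encoding)
		n = snprintf(buf, cap, "encoding %s\n", encoding);
	if (n < 0 || (size_t)n >= cap)
		return n;

	return n + snprintf(buf + n, cap - n, "data %u\n%s",
			    (unsigned)strlen(body), body);
}
```

Calling it with `reencoded == NULL` and `encoding == "iso-8859-7"` yields a tail that starts with `encoding iso-8859-7`, while a successful reencode or a commit without an encoding header yields only the `data` section.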

t/t9350-fast-export.sh

Lines changed: 17 additions & 0 deletions
@@ -120,6 +120,23 @@ test_expect_success 'iso-8859-7' '
 		 ! grep ^encoding actual)
 '
 
+test_expect_success 'encoding preserved if reencoding fails' '
+
+	test_when_finished "git reset --hard HEAD~1" &&
+	test_config i18n.commitencoding iso-8859-7 &&
+	echo rosten >file &&
+	git commit -s -F "$TEST_DIRECTORY/t9350/broken-iso-8859-7-commit-message.txt" file &&
+	git fast-export wer^..wer >iso-8859-7.fi &&
+	sed "s/wer/i18n-invalid/" iso-8859-7.fi |
+		(cd new &&
+		 git fast-import &&
+		 git cat-file commit i18n-invalid >actual &&
+		 grep ^encoding actual &&
+		 # Also verify that the commit has the expected size; i.e.
+		 # that no bytes were re-encoded to a different encoding.
+		 test 252 -eq "$(git cat-file -s i18n-invalid)")
+'
+
 test_expect_success 'import/export-marks' '
 
 	git checkout -b marks master &&

t/t9350/broken-iso-8859-7-commit-message.txt

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+Pi: �; Invalid: �
