Commit aab5720

Author: Eric Wong

git-svn: respect i18n.commitencoding config

SVN itself always stores log messages in the repository as UTF-8.

git always stores/retrieves everything as raw binary data with no
transformations whatsoever.

To interact with SVN, we need to encode log messages as UTF-8 before
sending them to SVN, as SVN cannot do it for us. When retrieving log
messages from SVN, we also need to (attempt to) reencode the UTF-8 log
message back to the user-specified commit encoding.

Note, handling i18n.logoutputencoding for "git svn log" also needs to
be done in a future change. Also, this change only deals with the
encoding of commit messages and nothing else (path names, blob
content, ...).

In-Reply-To: <[email protected]>

James North <[email protected]> wrote:
> Hi,
>
> I'm using git-svn on a system with ISO-8859-1 encoding. The problem is
> when I try to use "git svn dcommit" to send changes to a remote svn
> (also ISO-8859-1).
>
> Seems like git-svn is sending commit messages with utf-8 (just a
> guessing...) and they look bad on the remote svn log. E.g. "Ca?\241a
> de cami?\243n"
>
> I have tried using i18n.commitencoding=ISO-8859-1 as suggested by the
> warning when doing "git svn dcommit" but messages still are sent with
> wrong encoding.

Signed-off-by: Eric Wong <[email protected]>

1 parent 163f368 commit aab5720
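
The two conversion directions described in the message can be illustrated at the byte level. The following is a minimal shell sketch, not part of this commit: it assumes `iconv`, `od`, and `xargs` are available, uses `iconv` as a stand-in for Perl's `Encode::from_to`, and the sample string "Caña" and the `to_hex` helper are made up for illustration.

```shell
#!/bin/sh
# Illustrative only: iconv stands in for Encode::from_to in git-svn.perl.

# Hypothetical helper: print stdin as space-separated hex bytes.
to_hex () {
	od -An -tx1 | xargs echo
}

# dcommit direction: commit encoding -> UTF-8.
# "Caña" in ISO-8859-1 stores n-tilde as the single byte 0xF1 (octal 361).
to_svn_hex=$(printf 'Ca\361a' | iconv -f ISO-8859-1 -t UTF-8 | to_hex)

# fetch direction: UTF-8 -> commit encoding.
# In UTF-8 the same character is the two bytes 0xC3 0xB1 (octal 303 261).
from_svn_hex=$(printf 'Ca\303\261a' | iconv -f UTF-8 -t ISO-8859-1 | to_hex)

echo "$to_svn_hex"    # 43 61 c3 b1 61
echo "$from_svn_hex"  # 43 61 f1 61
```

The first line of output is the byte sequence SVN would receive for such a message; the second is what would end up in the git commit object when i18n.commitencoding is set to ISO-8859-1.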

2 files changed: 101 additions, 3 deletions


git-svn.perl

Lines changed: 21 additions & 3 deletions
@@ -1136,9 +1136,19 @@ sub get_commit_entry {
 		system($editor, $commit_editmsg);
 	}
 	rename $commit_editmsg, $commit_msg or croak $!;
-	open $log_fh, '<', $commit_msg or croak $!;
-	{ local $/; chomp($log_entry{log} = <$log_fh>); }
-	close $log_fh or croak $!;
+	{
+		# SVN requires messages to be UTF-8 when entering the repo
+		local $/;
+		open $log_fh, '<', $commit_msg or croak $!;
+		binmode $log_fh;
+		chomp($log_entry{log} = <$log_fh>);
+
+		if (my $enc = Git::config('i18n.commitencoding')) {
+			require Encode;
+			Encode::from_to($log_entry{log}, $enc, 'UTF-8');
+		}
+		close $log_fh or croak $!;
+	}
 	unlink $commit_msg;
 	\%log_entry;
 }
@@ -2273,6 +2283,14 @@ sub do_git_commit {
 	}
 	defined(my $pid = open3(my $msg_fh, my $out_fh, '>&STDERR', @exec))
 		or croak $!;
+	binmode $msg_fh;
+
+	# we always get UTF-8 from SVN, but we may want our commits in
+	# a different encoding.
+	if (my $enc = Git::config('i18n.commitencoding')) {
+		require Encode;
+		Encode::from_to($log_entry->{log}, 'UTF-8', $enc);
+	}
 	print $msg_fh $log_entry->{log} or croak $!;
 	restore_commit_header_env($old_env);
 	unless ($self->no_metadata) {

Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
+#!/bin/sh
+#
+# Copyright (c) 2008 Eric Wong
+
+test_description='git svn honors i18n.commitEncoding in config'
+
+. ./lib-git-svn.sh
+
+compare_git_head_with () {
+	nr=`wc -l < "$1"`
+	a=7
+	b=$(($a + $nr - 1))
+	git cat-file commit HEAD | sed -ne "$a,${b}p" >current &&
+	test_cmp current "$1"
+}
+
+compare_svn_head_with () {
+	LC_ALL=en_US.UTF-8 svn log --limit 1 `git svn info --url` | \
+		sed -e 1,3d -e "/^-\+\$/d" >current &&
+	test_cmp current "$1"
+}
+
+for H in ISO-8859-1 EUCJP ISO-2022-JP
+do
+	test_expect_success "$H setup" '
+		mkdir $H &&
+		svn import -m "$H test" $H "$svnrepo"/$H &&
+		git svn clone "$svnrepo"/$H $H
+	'
+done
+
+for H in ISO-8859-1 EUCJP ISO-2022-JP
+do
+	test_expect_success "$H commit on git side" '
+	(
+		cd $H &&
+		git config i18n.commitencoding $H &&
+		git checkout -b t refs/remotes/git-svn &&
+		echo $H >F &&
+		git add F &&
+		git commit -a -F "$TEST_DIRECTORY"/t3900/$H.txt &&
+		E=$(git cat-file commit HEAD | sed -ne "s/^encoding //p") &&
+		test "z$E" = "z$H" &&
+		compare_git_head_with "$TEST_DIRECTORY"/t3900/$H.txt
+	)
+	'
+done
+
+for H in ISO-8859-1 EUCJP ISO-2022-JP
+do
+	test_expect_success "$H dcommit to svn" '
+	(
+		cd $H &&
+		git svn dcommit &&
+		git cat-file commit HEAD | grep git-svn-id: &&
+		E=$(git cat-file commit HEAD | sed -ne "s/^encoding //p") &&
+		test "z$E" = "z$H" &&
+		compare_git_head_with "$TEST_DIRECTORY"/t3900/$H.txt
+	)
+	'
+done
+
+test_expect_success 'ISO-8859-1 should match UTF-8 in svn' '
+	(
+		cd ISO-8859-1 &&
+		compare_svn_head_with "$TEST_DIRECTORY"/t3900/1-UTF-8.txt
+	)
+'
+
+for H in EUCJP ISO-2022-JP
+do
+	test_expect_success "$H should match UTF-8 in svn" '
+	(
+		cd $H &&
+		compare_svn_head_with "$TEST_DIRECTORY"/t3900/2-UTF-8.txt
+	)
+	'
+done
+
+test_done
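
The test helpers rely on the layout of a raw commit object: with one parent and an `encoding` header, the message starts on line 7, which is presumably why `compare_git_head_with` sets `a=7`. The sketch below walks through that arithmetic on a hand-written sample instead of real `git cat-file commit HEAD` output; the object IDs, names, and message lines are hypothetical.

```shell
#!/bin/sh
# Hypothetical stand-in for `git cat-file commit HEAD` output (single parent).
commit_object='tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
parent 0000000000000000000000000000000000000000
author A U Thor <author@example.com> 1200000000 +0000
committer C O Mitter <committer@example.com> 1200000000 +0000
encoding ISO-8859-1

first line of message
second line of message'

# Same sed expression as the test: print only the value of the encoding header.
encoding=$(printf '%s\n' "$commit_object" | sed -ne 's/^encoding //p')

# Headers occupy lines 1-5 and line 6 is blank, so the message starts at line 7.
a=7
nr=2                      # number of message lines expected (assumed here)
b=$(($a + $nr - 1))
message=$(printf '%s\n' "$commit_object" | sed -ne "$a,${b}p")

echo "$encoding"
echo "$message"
```

Note that the fixed offset only holds for commits with exactly one parent and an `encoding` header; a root or merge commit, or one without the header, would shift the message to a different line.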
