Commit 4078de5

jltobler authored and gitster committed
builtin: introduce diff-pairs command
Through git-diff(1), a single diff can be generated from a pair of blob
revisions directly. Unfortunately, there is not a mechanism to compute
batches of specific file pair diffs in a single process. Such a feature
is particularly useful on the server-side where diffing between a large
set of changes is not feasible all at once due to timeout concerns.

To facilitate this, introduce git-diff-pairs(1) which takes the
null-terminated raw diff format as input on stdin and produces diffs in
other formats. As the raw diff format already contains the necessary
metadata, it becomes possible to progressively generate batches of diffs
without having to recompute rename detection or retrieve object context.
Something like the following:

	git diff-tree -r -z -M $old $new |
	git diff-pairs -p

should generate the same output as `git diff-tree -p -M`. Furthermore,
each line of raw diff formatted input can also be individually fed to a
separate git-diff-pairs(1) process and still produce the same output.

Based-on-patch-by: Jeff King <[email protected]>
Signed-off-by: Justin Tobler <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
1 parent 6e80500 commit 4078de5

11 files changed, +328 -0 lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -54,6 +54,7 @@
 /git-diff
 /git-diff-files
 /git-diff-index
+/git-diff-pairs
 /git-diff-tree
 /git-difftool
 /git-difftool--helper

Documentation/git-diff-pairs.adoc

Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
+git-diff-pairs(1)
+=================
+
+NAME
+----
+git-diff-pairs - Compare blob pairs generated by `diff-tree --raw`
+
+SYNOPSIS
+--------
+[verse]
+'git diff-pairs' [diff-options]
+
+DESCRIPTION
+-----------
+
+Given the output of `diff-tree -z` on its stdin, `diff-pairs` will
+reformat that output into whatever format is requested on its command
+line. For example:
+
+-----------------------------
+git diff-tree -z -M $a $b |
+	git diff-pairs -p
+-----------------------------
+
+will compute the tree diff in one step (including renames), and then
+`diff-pairs` will compute and format the blob-level diffs for each pair.
+This can be used to modify the raw diff in the middle (without having to
+parse or re-create more complicated formats like `--patch`), or to
+compute diffs progressively over the course of multiple invocations of
+`diff-pairs`.
+
+Each blob pair is fed to the diff machinery individually queued and the output
+is flushed on stdin EOF.
+
+OPTIONS
+-------
+
+include::diff-options.adoc[]
+
+include::diff-generate-patch.adoc[]
+
+NOTES
+----
+
+`diff-pairs` should handle any input generated by `diff-tree --raw -z`.
+It may choke or otherwise misbehave on output from `diff-files`, etc.
+
+Here's an incomplete list of things that `diff-pairs` could do, but
+doesn't (mostly in the name of simplicity):
+
+ - Only `-z` input is accepted, not normal `--raw` input.
+
+ - Abbreviated sha1s are rejected in the input from `diff-tree`; if you
+   want to abbreviate the output, you can pass `--abbrev` to
+   `diff-pairs`.
+
+ - Pathspecs are not handled by `diff-pairs`; you can limit the diff via
+   the initial `diff-tree` invocation.
+
+GIT
+---
+Part of the linkgit:git[1] suite
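
Since the documentation above says that only `-z` input is accepted and that diffs can be computed progressively over several invocations, a caller that wants to batch the stream first has to find record boundaries. The following is a minimal sketch of that step, not part of this commit. It assumes the `diff-tree -z` layout that `builtin/diff-pairs.c` reads: a metadata token, a NUL, a path, a NUL, and for `R`/`C` entries a second path and NUL. The function name `raw_record_len` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the commit): return the byte length of
 * the first complete record in buf, including its trailing NUL, or 0 if
 * buf does not start with a complete record.
 *
 * Assumed layout per record:
 *   ":<mode> <mode> <oid> <oid> <status>[score]" NUL <path> NUL
 * with one extra "<path> NUL" when status is R or C.
 */
size_t raw_record_len(const char *buf, size_t len)
{
	const char *meta_end, *last_space = NULL, *p;
	int npaths, i;

	assert(buf != NULL);
	if (len == 0 || buf[0] != ':')
		return 0;

	/* The metadata token ends at the first NUL. */
	meta_end = memchr(buf, '\0', len);
	if (!meta_end)
		return 0;

	/* The status letter is the first byte after the last space. */
	for (p = buf; p < meta_end; p++)
		if (*p == ' ')
			last_space = p;
	if (!last_space || last_space + 1 == meta_end)
		return 0;

	/* Renames and copies carry a source and a destination path. */
	npaths = (last_space[1] == 'R' || last_space[1] == 'C') ? 2 : 1;

	p = meta_end + 1;
	for (i = 0; i < npaths; i++) {
		const char *nul = memchr(p, '\0', len - (size_t)(p - buf));
		if (!nul)
			return 0;
		p = nul + 1;
	}
	return (size_t)(p - buf);
}
```

Calling this repeatedly on the remaining buffer yields one record at a time, and any run of whole records can then be written to a separate `git diff-pairs` process. The sketch only finds boundaries; it does not validate modes or object IDs.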

Documentation/meson.build

Lines changed: 1 addition & 0 deletions
@@ -41,6 +41,7 @@ manpages = {
   'git-diagnose.adoc' : 1,
   'git-diff-files.adoc' : 1,
   'git-diff-index.adoc' : 1,
+  'git-diff-pairs.adoc' : 1,
   'git-difftool.adoc' : 1,
   'git-diff-tree.adoc' : 1,
   'git-diff.adoc' : 1,

Makefile

Lines changed: 1 addition & 0 deletions
@@ -1232,6 +1232,7 @@ BUILTIN_OBJS += builtin/describe.o
 BUILTIN_OBJS += builtin/diagnose.o
 BUILTIN_OBJS += builtin/diff-files.o
 BUILTIN_OBJS += builtin/diff-index.o
+BUILTIN_OBJS += builtin/diff-pairs.o
 BUILTIN_OBJS += builtin/diff-tree.o
 BUILTIN_OBJS += builtin/diff.o
 BUILTIN_OBJS += builtin/difftool.o

builtin.h

Lines changed: 1 addition & 0 deletions
@@ -152,6 +152,7 @@ int cmd_diagnose(int argc, const char **argv, const char *prefix, struct reposit
 int cmd_diff_files(int argc, const char **argv, const char *prefix, struct repository *repo);
 int cmd_diff_index(int argc, const char **argv, const char *prefix, struct repository *repo);
 int cmd_diff(int argc, const char **argv, const char *prefix, struct repository *repo);
+int cmd_diff_pairs(int argc, const char **argv, const char *prefix, struct repository *repo);
 int cmd_diff_tree(int argc, const char **argv, const char *prefix, struct repository *repo);
 int cmd_difftool(int argc, const char **argv, const char *prefix, struct repository *repo);
 int cmd_env__helper(int argc, const char **argv, const char *prefix, struct repository *repo);

builtin/diff-pairs.c

Lines changed: 178 additions & 0 deletions
@@ -0,0 +1,178 @@
+#include "builtin.h"
+#include "commit.h"
+#include "config.h"
+#include "diff.h"
+#include "diffcore.h"
+#include "gettext.h"
+#include "hex.h"
+#include "object.h"
+#include "parse-options.h"
+#include "revision.h"
+#include "strbuf.h"
+
+static unsigned parse_mode_or_die(const char *mode, const char **endp)
+{
+	uint16_t ret;
+
+	*endp = parse_mode(mode, &ret);
+	if (!*endp)
+		die("unable to parse mode: %s", mode);
+	return ret;
+}
+
+static void parse_oid(const char *p, struct object_id *oid, const char **endp,
+		      const struct git_hash_algo *algop)
+{
+	if (parse_oid_hex_algop(p, oid, endp, algop) || *(*endp)++ != ' ')
+		die("unable to parse object id: %s", p);
+}
+
+static unsigned short parse_score(const char *score)
+{
+	unsigned long ret;
+	char *endp;
+
+	errno = 0;
+	ret = strtoul(score, &endp, 10);
+	ret *= MAX_SCORE / 100;
+	if (errno || endp == score || *endp || (unsigned short)ret != ret)
+		die("unable to parse rename/copy score: %s", score);
+	return ret;
+}
+
+static void flush_diff_queue(struct diff_options *options)
+{
+	/*
+	 * If rename detection is not requested, use rename information from the
+	 * raw diff formatted input. Setting found_follow ensures diffcore_std()
+	 * does not mess with rename information already present in queued
+	 * filepairs.
+	 */
+	if (!options->detect_rename)
+		options->found_follow = 1;
+	diffcore_std(options);
+	diff_flush(options);
+}
+
+int cmd_diff_pairs(int argc, const char **argv, const char *prefix,
+		   struct repository *repo)
+{
+	struct strbuf path_dst = STRBUF_INIT;
+	struct strbuf path = STRBUF_INIT;
+	struct strbuf meta = STRBUF_INIT;
+	struct rev_info revs;
+	int ret;
+
+	const char * const usage[] = {
+		N_("git diff-pairs [diff-options]"),
+		NULL
+	};
+	struct option options[] = {
+		OPT_END()
+	};
+
+	show_usage_with_options_if_asked(argc, argv, usage, options);
+
+	repo_init_revisions(repo, &revs, prefix);
+	repo_config(repo, git_diff_basic_config, NULL);
+	revs.disable_stdin = 1;
+	revs.abbrev = 0;
+	revs.diff = 1;
+
+	argc = setup_revisions(argc, argv, &revs, NULL);
+
+	/* Don't allow pathspecs at all. */
+	if (revs.prune_data.nr)
+		usage_with_options(usage, options);
+
+	if (!revs.diffopt.output_format)
+		revs.diffopt.output_format = DIFF_FORMAT_RAW;
+
+	while (1) {
+		struct object_id oid_a, oid_b;
+		struct diff_filepair *pair;
+		unsigned mode_a, mode_b;
+		const char *p;
+		char status;
+
+		if (strbuf_getline_nul(&meta, stdin) == EOF)
+			break;
+
+		p = meta.buf;
+		if (*p != ':')
+			die("invalid raw diff input");
+		p++;
+
+		mode_a = parse_mode_or_die(p, &p);
+		mode_b = parse_mode_or_die(p, &p);
+
+		parse_oid(p, &oid_a, &p, repo->hash_algo);
+		parse_oid(p, &oid_b, &p, repo->hash_algo);
+
+		status = *p++;
+
+		if (strbuf_getline_nul(&path, stdin) == EOF)
+			die("got EOF while reading path");
+
+		switch (status) {
+		case DIFF_STATUS_ADDED:
+			pair = diff_filepair_addremove(&revs.diffopt, '+',
+						       mode_b, &oid_b,
+						       1, path.buf, 0);
+			if (pair)
+				pair->status = status;
+			break;
+
+		case DIFF_STATUS_DELETED:
+			pair = diff_filepair_addremove(&revs.diffopt, '-',
+						       mode_a, &oid_a,
+						       1, path.buf, 0);
+			if (pair)
+				pair->status = status;
+			break;
+
+		case DIFF_STATUS_TYPE_CHANGED:
+		case DIFF_STATUS_MODIFIED:
+			pair = diff_filepair_change(&revs.diffopt,
+						    mode_a, mode_b,
+						    &oid_a, &oid_b, 1, 1,
+						    path.buf, 0, 0);
+			if (pair)
+				pair->status = status;
+			break;
+
+		case DIFF_STATUS_RENAMED:
+		case DIFF_STATUS_COPIED:
+			{
+				struct diff_filespec *a, *b;
+
+				if (strbuf_getline_nul(&path_dst, stdin) == EOF)
+					die("got EOF while reading destination path");
+
+				a = alloc_filespec(path.buf);
+				b = alloc_filespec(path_dst.buf);
+				fill_filespec(a, &oid_a, 1, mode_a);
+				fill_filespec(b, &oid_b, 1, mode_b);
+
+				pair = diff_queue(&diff_queued_diff, a, b);
+				pair->status = status;
+				pair->score = parse_score(p);
+				pair->renamed_pair = 1;
+			}
+			break;
+
+		default:
+			die("unknown diff status: %c", status);
+		}
+	}
+
+	flush_diff_queue(&revs.diffopt);
+	ret = diff_result_code(&revs);
+
+	strbuf_release(&path_dst);
+	strbuf_release(&path);
+	strbuf_release(&meta);
+	release_revisions(&revs);
+
+	return ret;
+}
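
To make the metadata parsing in `cmd_diff_pairs()` easier to follow outside of Git's tree, here is a standalone sketch of the same steps: leading colon, two octal modes, two full-length object IDs, a status letter, and a percentage score for renames and copies scaled the way `parse_score()` scales it. This is not the commit's code. The names `raw_meta`, `parse_raw_meta`, `SKETCH_MAX_SCORE` and `SKETCH_HEXSZ` are hypothetical. The value 60000 assumes that `MAX_SCORE` in `diffcore.h` is 60000, the OID length assumes SHA-1, and the rejection of scores above 100 is a stricter check chosen for the sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_SCORE 60000	/* assumption: mirrors MAX_SCORE in diffcore.h */
#define SKETCH_HEXSZ 40		/* assumption: SHA-1; SHA-256 would need 64 */

struct raw_meta {
	unsigned mode_a, mode_b;
	char oid_a[SKETCH_HEXSZ + 1], oid_b[SKETCH_HEXSZ + 1];
	char status;
	unsigned score;		/* 0..SKETCH_MAX_SCORE, 0 when absent */
};

/* Parse octal digits followed by a single space; advance *pp past it. */
static int parse_octal_mode(const char **pp, unsigned *mode)
{
	const char *p = *pp;
	unsigned m = 0;

	if (*p < '0' || *p > '7')
		return -1;
	while (*p >= '0' && *p <= '7')
		m = (m << 3) | (unsigned)(*p++ - '0');
	if (*p++ != ' ')
		return -1;
	*pp = p;
	*mode = m;
	return 0;
}

/* Require exactly SKETCH_HEXSZ hex digits and a space (no abbreviations). */
static int parse_hex_oid(const char **pp, char *out)
{
	const char *p = *pp;
	size_t i;

	for (i = 0; i < SKETCH_HEXSZ; i++) {
		if (!isxdigit((unsigned char)p[i]))
			return -1;
		out[i] = p[i];
	}
	out[i] = '\0';
	if (p[SKETCH_HEXSZ] != ' ')
		return -1;
	*pp = p + SKETCH_HEXSZ + 1;
	return 0;
}

/* Convert a percentage such as "090" to the internal 0..MAX_SCORE scale. */
static int parse_score_pct(const char *s, unsigned *score)
{
	unsigned long v;
	char *end;

	if (!isdigit((unsigned char)*s))
		return -1;
	errno = 0;
	v = strtoul(s, &end, 10);
	if (errno || *end || v > 100)
		return -1;
	*score = (unsigned)(v * (SKETCH_MAX_SCORE / 100));
	return 0;
}

/* Returns 0 on success, -1 on malformed input. */
int parse_raw_meta(const char *meta, struct raw_meta *out)
{
	const char *p = meta;

	assert(meta != NULL && out != NULL);
	memset(out, 0, sizeof(*out));

	if (*p++ != ':')
		return -1;
	if (parse_octal_mode(&p, &out->mode_a) ||
	    parse_octal_mode(&p, &out->mode_b))
		return -1;
	if (parse_hex_oid(&p, out->oid_a) || parse_hex_oid(&p, out->oid_b))
		return -1;

	out->status = *p++;
	switch (out->status) {
	case 'A': case 'D': case 'M': case 'T':
		return 0;	/* anything after the letter is ignored */
	case 'R': case 'C':
		return parse_score_pct(p, &out->score);
	default:
		return -1;
	}
}
```

Under these assumptions, a token ending in `R090` yields a score of 90 * (60000 / 100) = 54000. Where the sketch returns -1, the command itself calls `die()`.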

command-list.txt

Lines changed: 1 addition & 0 deletions
@@ -95,6 +95,7 @@ git-diagnose ancillaryinterrogators
 git-diff mainporcelain info
 git-diff-files plumbinginterrogators
 git-diff-index plumbinginterrogators
+git-diff-pairs plumbinginterrogators
 git-diff-tree plumbinginterrogators
 git-difftool ancillaryinterrogators complete
 git-fast-export ancillarymanipulators

git.c

Lines changed: 1 addition & 0 deletions
@@ -540,6 +540,7 @@ static struct cmd_struct commands[] = {
 	{ "diff", cmd_diff, NO_PARSEOPT },
 	{ "diff-files", cmd_diff_files, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "diff-index", cmd_diff_index, RUN_SETUP | NO_PARSEOPT },
+	{ "diff-pairs", cmd_diff_pairs, RUN_SETUP | NO_PARSEOPT },
 	{ "diff-tree", cmd_diff_tree, RUN_SETUP | NO_PARSEOPT },
 	{ "difftool", cmd_difftool, RUN_SETUP_GENTLY },
 	{ "fast-export", cmd_fast_export, RUN_SETUP },

meson.build

Lines changed: 1 addition & 0 deletions
@@ -537,6 +537,7 @@ builtin_sources = [
   'builtin/diagnose.c',
   'builtin/diff-files.c',
   'builtin/diff-index.c',
+  'builtin/diff-pairs.c',
   'builtin/diff-tree.c',
   'builtin/diff.c',
   'builtin/difftool.c',

t/meson.build

Lines changed: 1 addition & 0 deletions
@@ -500,6 +500,7 @@ integration_tests = [
   't4067-diff-partial-clone.sh',
   't4068-diff-symmetric-merge-base.sh',
   't4069-remerge-diff.sh',
+  't4070-diff-pairs.sh',
   't4100-apply-stat.sh',
   't4101-apply-nonl.sh',
   't4102-apply-rename.sh',
