Commit e8adf23

mhagger authored and gitster committed
xdl_change_compact(): introduce the concept of a change group
The idea of xdl_change_compact() is fairly simple:

* Proceed through groups of changed lines in the file to be compacted,
  keeping track of the corresponding location in the "other" file.

* If possible, slide the group up and down to try to give the most
  aesthetically pleasing diff. Whenever it is slid, the current location
  in the other file needs to be adjusted.

But these simple concepts are obfuscated by a lot of index handling that
is written in terse, subtle, and varied patterns. I found it very hard
to convince myself that the function was correct.

So introduce a "struct group" that represents a group of changed lines
in a file. Add some functions that perform elementary operations on
groups:

* Initialize a group to the first group in a file
* Move to the next or previous group in a file
* Slide a group up or down

Even though the resulting code is longer, I think it is easier to
understand and review. Its performance is not changed appreciably
(though it would be if `group_next()` and `group_previous()` were not
inlined).

...and in fact, the rewriting helped me discover another bug in the
--compaction-heuristic code: The update of blank_lines was never done
for the highest possible position of the group. This means that it could
fail to slide the group to its highest possible position, even if that
position had a blank line as its last line. So for example, it yielded
the following diff:

    $ git diff --no-index --compaction-heuristic a.txt b.txt
    diff --git a/a.txt b/b.txt
    index e53969f..0d60c5fe 100644
    --- a/a.txt
    +++ b/b.txt
    @@ -1,3 +1,7 @@
     1
     A
    +
    +B
    +
    +A
     2

when in fact the following diff is better (according to the rules of
--compaction-heuristic):

    $ git diff --no-index --compaction-heuristic a.txt b.txt
    diff --git a/a.txt b/b.txt
    index e53969f..0d60c5fe 100644
    --- a/a.txt
    +++ b/b.txt
    @@ -1,3 +1,7 @@
     1
    +A
    +
    +B
    +
     A
     2

The new code gives the bottom answer.

Signed-off-by: Michael Haggerty <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
1 parent 152598c commit e8adf23
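
The group operations listed in the commit message can be tried in isolation. The following is a minimal sketch, not the code from this commit: it swaps xdfile_t for a hypothetical `struct toy_file`, compares lines with strcmp() rather than recs_match(), and omits the "other" file entirely. The `toy_demo()` function and its input (b.txt from the example above, with the group initially at its lowest position) are illustrative assumptions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for xdfile_t: lines plus one "changed" flag per line. */
struct toy_file {
	const char **recs; /* recs[0..nrec-1] are the lines */
	char *rchg;        /* rchg[-1] and rchg[nrec] are zero sentinels */
	long nrec;
};

struct group {
	long start; /* first changed line, or the unchanged line an empty group sits above */
	long end;   /* first unchanged line after the group */
};

/* Point g at the first (possibly empty) group in f. */
static void group_init(const struct toy_file *f, struct group *g)
{
	g->start = g->end = 0;
	while (f->rchg[g->end])
		g->end++;
}

/* Advance g to the next (possibly empty) group; return -1 at end of file. */
static int group_next(const struct toy_file *f, struct group *g)
{
	if (g->end == f->nrec)
		return -1;
	g->start = g->end + 1;
	for (g->end = g->start; f->rchg[g->end]; g->end++)
		;
	return 0;
}

/* Slide a non-empty group up by one line if possible, merging with a group above. */
static int group_slide_up(struct toy_file *f, struct group *g)
{
	assert(g->end > g->start); /* callers skip empty groups first */
	if (g->start > 0 &&
	    !strcmp(f->recs[g->start - 1], f->recs[g->end - 1])) {
		f->rchg[--g->start] = 1;
		f->rchg[--g->end] = 0;
		while (f->rchg[g->start - 1])
			g->start--;
		return 0;
	}
	return -1;
}

/* Find the first non-empty group in the toy b.txt and slide it up as far as it goes. */
static long toy_demo(struct group *out)
{
	const char *lines[] = { "1", "A", "", "B", "", "A", "2" };
	/* Sentinel, seven per-line flags (lines 2..5 changed), sentinel. */
	char flags[] = { 0, 0, 0, 1, 1, 1, 1, 0, 0 };
	struct toy_file f = { lines, flags + 1, 7 };
	struct group g;
	long slides = 0;

	group_init(&f, &g);
	while (g.end == g.start) /* skip empty groups */
		if (group_next(&f, &g))
			return -1;
	while (!group_slide_up(&f, &g))
		slides++;
	*out = g;
	return slides;
}
```

Called from a `main()`, `toy_demo()` performs one slide on this input and leaves the group at start 1, end 5, which is the position shown in the second diff of the commit message.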

xdiff/xdiffi.c

Lines changed: 203 additions & 90 deletions
@@ -413,126 +413,239 @@ static int recs_match(xrecord_t *rec1, xrecord_t *rec2, long flags)
 			     flags));
 }
 
-int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) {
-	long ix, ixo, ixs, ixref, grpsiz, nrec = xdf->nrec;
-	char *rchg = xdf->rchg, *rchgo = xdfo->rchg;
-	unsigned int blank_lines;
-	xrecord_t **recs = xdf->recs;
+/*
+ * Represent a group of changed lines in an xdfile_t (i.e., a contiguous group
+ * of lines that was inserted or deleted from the corresponding version of the
+ * file). We consider there to be such a group at the beginning of the file, at
+ * the end of the file, and between any two unchanged lines, though most such
+ * groups will usually be empty.
+ *
+ * If the first line in a group is equal to the line following the group, then
+ * the group can be slid down. Similarly, if the last line in a group is equal
+ * to the line preceding the group, then the group can be slid up. See
+ * group_slide_down() and group_slide_up().
+ *
+ * Note that loops that are testing for changed lines in xdf->rchg do not need
+ * index bounding since the array is prepared with a zero at position -1 and N.
+ */
+struct group {
+	/*
+	 * The index of the first changed line in the group, or the index of
+	 * the unchanged line above which the (empty) group is located.
+	 */
+	long start;
 
 	/*
-	 * This is the same of what GNU diff does. Move back and forward
-	 * change groups for a consistent and pretty diff output. This also
-	 * helps in finding joinable change groups and reduce the diff size.
+	 * The index of the first unchanged line after the group. For an empty
+	 * group, end is equal to start.
 	 */
-	for (ix = ixo = 0;;) {
-		/*
-		 * Find the first changed line in the to-be-compacted file.
-		 * We need to keep track of both indexes, so if we find a
-		 * changed lines group on the other file, while scanning the
-		 * to-be-compacted file, we need to skip it properly. Note
-		 * that loops that are testing for changed lines on rchg* do
-		 * not need index bounding since the array is prepared with
-		 * a zero at position -1 and N.
-		 */
-		for (; ix < nrec && !rchg[ix]; ix++)
-			while (rchgo[ixo++]);
-		if (ix == nrec)
-			break;
+	long end;
+};
+
+/*
+ * Initialize g to point at the first group in xdf.
+ */
+static void group_init(xdfile_t *xdf, struct group *g)
+{
+	g->start = g->end = 0;
+	while (xdf->rchg[g->end])
+		g->end++;
+}
+
+/*
+ * Move g to describe the next (possibly empty) group in xdf and return 0. If g
+ * is already at the end of the file, do nothing and return -1.
+ */
+static inline int group_next(xdfile_t *xdf, struct group *g)
+{
+	if (g->end == xdf->nrec)
+		return -1;
+
+	g->start = g->end + 1;
+	for (g->end = g->start; xdf->rchg[g->end]; g->end++)
+		;
+
+	return 0;
+}
+
+/*
+ * Move g to describe the previous (possibly empty) group in xdf and return 0.
+ * If g is already at the beginning of the file, do nothing and return -1.
+ */
+static inline int group_previous(xdfile_t *xdf, struct group *g)
+{
+	if (g->start == 0)
+		return -1;
+
+	g->end = g->start - 1;
+	for (g->start = g->end; xdf->rchg[g->start - 1]; g->start--)
+		;
+
+	return 0;
+}
+
+/*
+ * If g can be slid toward the end of the file, do so, and if it bumps into a
+ * following group, expand this group to include it. Return 0 on success or -1
+ * if g cannot be slid down.
+ */
+static int group_slide_down(xdfile_t *xdf, struct group *g, long flags)
+{
+	if (g->end < xdf->nrec &&
+	    recs_match(xdf->recs[g->start], xdf->recs[g->end], flags)) {
+		xdf->rchg[g->start++] = 0;
+		xdf->rchg[g->end++] = 1;
+
+		while (xdf->rchg[g->end])
+			g->end++;
+
+		return 0;
+	} else {
+		return -1;
+	}
+}
+
+/*
+ * If g can be slid toward the beginning of the file, do so, and if it bumps
+ * into a previous group, expand this group to include it. Return 0 on success
+ * or -1 if g cannot be slid up.
+ */
+static int group_slide_up(xdfile_t *xdf, struct group *g, long flags)
+{
+	if (g->start > 0 &&
+	    recs_match(xdf->recs[g->start - 1], xdf->recs[g->end - 1], flags)) {
+		xdf->rchg[--g->start] = 1;
+		xdf->rchg[--g->end] = 0;
+
+		while (xdf->rchg[g->start - 1])
+			g->start--;
+
+		return 0;
+	} else {
+		return -1;
+	}
+}
+
+static void xdl_bug(const char *msg)
+{
+	fprintf(stderr, "BUG: %s\n", msg);
+	exit(1);
+}
+
+/*
+ * Move back and forward change groups for a consistent and pretty diff output.
+ * This also helps in finding joinable change groups and reducing the diff
+ * size.
+ */
+int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) {
+	struct group g, go;
+	long earliest_end, end_matching_other;
+	long groupsize;
+	unsigned int blank_lines;
+
+	group_init(xdf, &g);
+	group_init(xdfo, &go);
+
+	while (1) {
+		/* If the group is empty in the to-be-compacted file, skip it: */
+		if (g.end == g.start)
+			goto next;
 
 		/*
-		 * Record the start of a changed-group in the to-be-compacted file
-		 * and find the end of it, on both to-be-compacted and other file
-		 * indexes (ix and ixo).
+		 * Now shift the change up and then down as far as possible in
+		 * each direction. If it bumps into any other changes, merge them.
 		 */
-		ixs = ix;
-		for (ix++; rchg[ix]; ix++);
-		for (; rchgo[ixo]; ixo++);
-
 		do {
-			grpsiz = ix - ixs;
-			blank_lines = 0;
+			groupsize = g.end - g.start;
 
 			/*
-			 * If the line before the current change group, is equal to
-			 * the last line of the current change group, shift backward
-			 * the group.
+			 * Keep track of the last "end" index that causes this
+			 * group to align with a group of changed lines in the
+			 * other file. -1 indicates that we haven't found such
+			 * a match yet:
 			 */
-			while (ixs > 0 && recs_match(recs[ixs - 1], recs[ix - 1], flags)) {
-				rchg[--ixs] = 1;
-				rchg[--ix] = 0;
-
-				/*
-				 * This change might have joined two change groups,
-				 * so we try to take this scenario in account by moving
-				 * the start index accordingly (and so the other-file
-				 * end-of-group index).
-				 */
-				for (; rchg[ixs - 1]; ixs--);
-				while (rchgo[--ixo]);
-			}
+			end_matching_other = -1;
 
 			/*
-			 * Record the end-of-group position in case we are matched
-			 * with a group of changes in the other file (that is, the
-			 * change record before the end-of-group index in the other
-			 * file is set).
+			 * Boolean value that records whether there are any blank
+			 * lines that could be made to be the last line of this
+			 * group.
 			 */
-			ixref = rchgo[ixo - 1] ? ix: nrec;
+			blank_lines = 0;
+
+			/* Shift the group backward as much as possible: */
+			while (!group_slide_up(xdf, &g, flags))
+				if (group_previous(xdfo, &go))
+					xdl_bug("group sync broken sliding up");
 
 			/*
-			 * If the first line of the current change group, is equal to
-			 * the line next of the current change group, shift forward
-			 * the group.
+			 * This is this highest that this group can be shifted.
+			 * Record its end index:
 			 */
-			while (ix < nrec && recs_match(recs[ixs], recs[ix], flags)) {
-				blank_lines += is_blank_line(recs[ix], flags);
-
-				rchg[ixs++] = 0;
-				rchg[ix++] = 1;
-
-				/*
-				 * This change might have joined two change groups,
-				 * so we try to take this scenario in account by moving
-				 * the start index accordingly (and so the other-file
-				 * end-of-group index). Keep tracking the reference
-				 * index in case we are shifting together with a
-				 * corresponding group of changes in the other file.
-				 */
-				for (; rchg[ix]; ix++);
-				while (rchgo[++ixo])
-					ixref = ix;
+			earliest_end = g.end;
+
+			if (go.end > go.start)
+				end_matching_other = g.end;
+
+			/* Now shift the group forward as far as possible: */
+			while (1) {
+				if (!blank_lines)
+					blank_lines = is_blank_line(
+							xdf->recs[g.end - 1],
+							flags);
+
+				if (group_slide_down(xdf, &g, flags))
+					break;
+				if (group_next(xdfo, &go))
+					xdl_bug("group sync broken sliding down");
+
+				if (go.end > go.start)
+					end_matching_other = g.end;
 			}
-		} while (grpsiz != ix - ixs);
+		} while (groupsize != g.end - g.start);
 
-		if (ixref < ix) {
+		if (g.end == earliest_end) {
+			/* no shifting was possible */
+		} else if (end_matching_other != -1) {
 			/*
-			 * Try to move back the possibly merged group of changes, to match
-			 * the recorded position in the other file.
+			 * Move the possibly merged group of changes back to line
+			 * up with the last group of changes from the other file
+			 * that it can align with.
 			 */
-			while (ixref < ix) {
-				rchg[--ixs] = 1;
-				rchg[--ix] = 0;
-				while (rchgo[--ixo]);
+			while (go.end == go.start) {
+				if (group_slide_up(xdf, &g, flags))
+					xdl_bug("match disappeared");
+				if (group_previous(xdfo, &go))
+					xdl_bug("group sync broken sliding to match");
 			}
 		} else if ((flags & XDF_COMPACTION_HEURISTIC) && blank_lines) {
 			/*
-			 * The group can be slid up to make its last line a
-			 * blank line. Do so.
+			 * Compaction heuristic: if it is possible to shift the
+			 * group to make its bottom line a blank line, do so.
 			 *
 			 * As we already shifted the group forward as far as
-			 * possible in the earlier loop, we need to shift it
-			 * back only if at all.
+			 * possible in the earlier loop, we only need to handle
+			 * backward shifts, not forward ones.
 			 */
-			while (ixs > 0 &&
-			       !is_blank_line(recs[ix - 1], flags) &&
-			       recs_match(recs[ixs - 1], recs[ix - 1], flags)) {
-				rchg[--ixs] = 1;
-				rchg[--ix] = 0;
-				while (rchgo[--ixo]);
+			while (!is_blank_line(xdf->recs[g.end - 1], flags)) {
+				if (group_slide_up(xdf, &g, flags))
+					xdl_bug("blank line disappeared");
+				if (group_previous(xdfo, &go))
+					xdl_bug("group sync broken sliding to blank line");
 			}
 		}
+
+	next:
+		/* Move past the just-processed group: */
+		if (group_next(xdf, &g))
+			break;
+		if (group_next(xdfo, &go))
+			xdl_bug("group sync broken moving to next group");
 	}
 
+	if (!group_next(xdfo, &go))
+		xdl_bug("group sync broken at end of file");
+
 	return 0;
 }
 
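
The blank_lines fix described in the commit message comes down to which line is tested while the group slides down. The sketch below isolates that difference under simplifying assumptions: a single group that never merges with a neighbour, no "other" file, strcmp() in place of recs_match(), and a hypothetical `toy_is_blank()` that treats only the empty string as blank. The names `blank_seen_old()`, `blank_seen_new()`, and `toy_b_txt` are illustrative and do not appear in xdiff.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: a line counts as blank only if it is the empty string. */
static int toy_is_blank(const char *line)
{
	return line[0] == '\0';
}

/*
 * Old bookkeeping: while sliding [start, end) down, only the line that joins
 * the group (recs[end]) is tested, so the last line of the starting
 * (highest) position is never examined.
 */
static int blank_seen_old(const char **recs, long nrec, long start, long end)
{
	int blank_lines = 0;

	assert(start < end);
	while (end < nrec && !strcmp(recs[start], recs[end])) {
		blank_lines += toy_is_blank(recs[end]);
		start++;
		end++;
	}
	return blank_lines != 0;
}

/*
 * New bookkeeping: test the current last line (recs[end - 1]) before every
 * slide attempt, which also covers the highest position.
 */
static int blank_seen_new(const char **recs, long nrec, long start, long end)
{
	int blank_lines = 0;

	assert(start < end);
	while (1) {
		if (!blank_lines)
			blank_lines = toy_is_blank(recs[end - 1]);
		if (!(end < nrec && !strcmp(recs[start], recs[end])))
			break;
		start++;
		end++;
	}
	return blank_lines;
}

/* b.txt from the commit message; its group's highest position is [1, 5). */
static const char *toy_b_txt[] = { "1", "A", "", "B", "", "A", "2" };
```

With the group at its highest position (start 1, end 5), `blank_seen_old(toy_b_txt, 7, 1, 5)` reports no blank line while `blank_seen_new(toy_b_txt, 7, 1, 5)` reports one, which is why only the new code goes on to slide the group back up to end on the blank line.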
