Skip to content

Commit a2430dd

Browse files
npitregitster
authored andcommitted
pack-objects: fix pack generation when using pack_size_limit
Current handling of pack_size_limit is quite suboptimal. Let's consider a list of objects to pack which contain alternatively big and small objects (which pretty matches reality when big blobs are interlaced with tree objects). Currently, the code simply close the pack and opens a new one when the next object in line breaks the size limit. The current code may degenerate into: - small tree object => store into pack #1 - big blob object busting the pack size limit => store into pack #2 - small blob but pack #2 is over the limit already => pack #3 - big blob busting the size limit => pack #4 - small tree but pack #4 is over the limit => pack #5 - big blob => pack #6 - small tree => pack #7 - ... and so on. The reality is that the content of packs 1, 3, 5 and 7 could well be stored more efficiently (and delta compressed) together in pack #1 if the big blobs were not forcing an immediate transition to a new pack. Incidentally this can be fixed pretty easily by simply skipping over those objects that are too big to fit in the current pack while trying the whole list of unwritten objects, and then that list considered from the beginning again when a new pack is opened. This creates much fewer smallish pack files and help making more predictable test cases for the test suite. This change made one of the self sanity checks useless so it is removed as well. That check was rather redundant already anyway. Signed-off-by: Nicolas Pitre <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
1 parent 2fca19f commit a2430dd

File tree

1 file changed

+13
-24
lines changed

1 file changed

+13
-24
lines changed

builtin-pack-objects.c

Lines changed: 13 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -246,7 +246,7 @@ static unsigned long write_object(struct sha1file *f,
246246

247247
type = entry->type;
248248

249-
/* write limit if limited packsize and not first object */
249+
/* apply size limit if limited packsize and not first object */
250250
if (!pack_size_limit || !nr_written)
251251
limit = 0;
252252
else if (pack_size_limit <= write_offset)
@@ -443,11 +443,15 @@ static int write_one(struct sha1file *f,
443443

444444
/* offset is non zero if object is written already. */
445445
if (e->idx.offset || e->preferred_base)
446-
return 1;
446+
return -1;
447447

448-
/* if we are deltified, write out base object first. */
449-
if (e->delta && !write_one(f, e->delta, offset))
450-
return 0;
448+
/*
449+
* If we are deltified, attempt to write out base object first.
450+
* If that fails due to the pack size limit then the current
451+
* object might still possibly fit undeltified within that limit.
452+
*/
453+
if (e->delta)
454+
write_one(f, e->delta, offset);
451455

452456
e->idx.offset = *offset;
453457
size = write_object(f, e, *offset);
@@ -501,11 +505,9 @@ static void write_pack_file(void)
501505
sha1write(f, &hdr, sizeof(hdr));
502506
offset = sizeof(hdr);
503507
nr_written = 0;
504-
for (; i < nr_objects; i++) {
505-
if (!write_one(f, objects + i, &offset))
506-
break;
507-
display_progress(progress_state, written);
508-
}
508+
for (i = 0; i < nr_objects; i++)
509+
if (write_one(f, objects + i, &offset) == 1)
510+
display_progress(progress_state, written);
509511

510512
/*
511513
* Did we write the wrong # entries in the header?
@@ -580,26 +582,13 @@ static void write_pack_file(void)
580582
written_list[j]->offset = (off_t)-1;
581583
}
582584
nr_remaining -= nr_written;
583-
} while (nr_remaining && i < nr_objects);
585+
} while (nr_remaining);
584586

585587
free(written_list);
586588
stop_progress(&progress_state);
587589
if (written != nr_result)
588590
die("wrote %"PRIu32" objects while expecting %"PRIu32,
589591
written, nr_result);
590-
/*
591-
* We have scanned through [0 ... i). Since we have written
592-
* the correct number of objects, the remaining [i ... nr_objects)
593-
* items must be either already written (due to out-of-order delta base)
594-
* or a preferred base. Count those which are neither and complain if any.
595-
*/
596-
for (j = 0; i < nr_objects; i++) {
597-
struct object_entry *e = objects + i;
598-
j += !e->idx.offset && !e->preferred_base;
599-
}
600-
if (j)
601-
die("wrote %"PRIu32" objects as expected but %"PRIu32
602-
" unwritten", written, j);
603592
}
604593

605594
static int locate_object_entry_hash(const unsigned char *sha1)

0 commit comments

Comments
 (0)