Skip to content

Commit 51b2774

Browse files
peffgitster
authored andcommitted
parse_object_buffer(): respect save_commit_buffer
If the global variable "save_commit_buffer" is set to 0, then parse_commit() will throw away the commit object data after parsing it, rather than sticking it into a commit slab. This goes all the way back to 60ab26d ([PATCH] Avoid wasting memory in git-rev-list, 2005-09-15). But there's another code path which may similarly stash the buffer: parse_object_buffer(). This is where we end up if we parse a commit via parse_object(), and it's used directly in a few other code paths like git-fsck. The original goal of 60ab26d was avoiding extra memory usage for rev-list. And there it's not all that important to catch parse_object(). We use that function only for looking at the tips of the traversal, and the majority of the commits are parsed by following parent links, where we use parse_commit() directly. So we were wasting some memory, but only a small portion. It's much easier to see the effect with fsck. Since we now turn off save_commit_buffer by default there, we _should_ be able to drop the freeing of the commit buffer in fsck_obj(). But if we do so (taking the first hunk of this patch without the rest), then the peak heap of "git fsck" in a clone of git.git goes from 136MB to 194MB. Teaching parse_object_buffer() to respect save_commit_buffer brings that down to 134.5MB (it's hard to tell from massif's output, but I suspect the savings comes from avoiding the overhead of the mostly-empty commit slab). Other programs should see a small improvement. Both "rev-list --all" and "fsck --connectivity-only" improve by a few hundred kilobytes, as they'd avoid loading the tip objects of their traversals. Most importantly, no code should be hurt by doing this. Any program that turns off save_commit_buffer is already making the assumption that any commit it sees may need to have its object data loaded on demand, as it doesn't know which ones were parsed by parse_commit() versus parse_object(). Not to mention that anything parsed by the commit graph may be in the same boat, even if save_commit_buffer was not disabled. This should be the only spot that needs to be fixed. Grepping for set_commit_buffer() shows that this and parse_commit() are the only relevant calls. Signed-off-by: Jeff King <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
1 parent 069e445 commit 51b2774

File tree

2 files changed

+2
-4
lines changed

2 files changed

+2
-4
lines changed

builtin/fsck.c

Lines changed: 0 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -439,9 +439,6 @@ static int fsck_obj(struct object *obj, void *buffer, unsigned long size)
439439
out:
440440
if (obj->type == OBJ_TREE)
441441
free_tree_buffer((struct tree *)obj);
442-
if (obj->type == OBJ_COMMIT)
443-
free_commit_buffer(the_repository->parsed_objects,
444-
(struct commit *)obj);
445442
return err;
446443
}
447444

object.c

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -233,7 +233,8 @@ struct object *parse_object_buffer(struct repository *r, const struct object_id
233233
if (commit) {
234234
if (parse_commit_buffer(r, commit, buffer, size, 1))
235235
return NULL;
236-
if (!get_cached_commit_buffer(r, commit, NULL)) {
236+
if (save_commit_buffer &&
237+
!get_cached_commit_buffer(r, commit, NULL)) {
237238
set_commit_buffer(r, commit, buffer, size);
238239
*eaten_p = 1;
239240
}

0 commit comments

Comments
 (0)