From aad1d1c0d58f36d5cc7822c9ff6f0064708a1f20 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Tue, 6 Jan 2026 05:13:49 -0500 Subject: [PATCH 01/17] t/perf/perf-lib: fix assignment of TEST_OUTPUT_DIRECTORY MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Using the perf suite's "run" helper in a vanilla build fails like this: $ make && (cd t/perf && ./run p0000-perf-lib-sanity.sh) === Running 1 tests in this tree === perf 1 - test_perf_default_repo works: 1 2 3 ok perf 2 - test_checkout_worktree works: 1 2 3 ok ok 3 - test_export works perf 4 - export a weird var: 1 2 3 ok perf 5 - éḿíẗ ńöń-ÁŚĆÍÍ ćḧáŕáćẗéŕś: 1 2 3 ok ok 6 - test_export works with weird vars perf 7 - important variables available in subshells: 1 2 3 ok perf 8 - test-lib-functions correctly loaded in subshells: 1 2 3 ok # passed all 8 test(s) 1..8 cannot open test-results/p0000-perf-lib-sanity.subtests: No such file or directory at ./aggregate.perl line 159. It is trying to aggregate results written into t/perf/test-results, but the p0000 script did not write anything there. The "run" script looks in $TEST_OUTPUT_DIRECTORY/test-results, or if that variable is not set, in test-results in the current working directory (which should be t/perf itself). It pulls the value of $TEST_OUTPUT_DIRECTORY from the GIT-BUILD-OPTIONS file. But that doesn't quite match the setup in perf-lib.sh (which is what scripts like p0000 use). There we do this at the top of the script: TEST_OUTPUT_DIRECTORY=$(pwd) and then let test-lib.sh append "/test-results" to that. Historically, that made the vanilla case work: we'd always use t/perf/test-results. But when $TEST_OUTPUT_DIRECTORY was set, it would break. Commit 5756ccd181 (t/perf: fix benchmarks with out-of-tree builds, 2025-04-28) fixed that second case by loading GIT-BUILD-OPTIONS ourselves. But that broke the vanilla case! Now our setting of $TEST_OUTPUT_DIRECTORY in perf-lib.sh is ignored, because it is overwritten by GIT-BUILD-OPTIONS. And when test-lib.sh sees that the output directory is empty, it defaults to t/test-results, rather than t/perf/test-results. Nobody seems to have noticed, probably for two reasons: 1. It only matters if you're trying to aggregate results (like the "run" script does). Just running "./p0000-perf-lib-sanity.sh" manually still produces useful output; the stored result files are just in an unexpected place. 2. There might be leftover files in t/perf/test-results from previous runs (before 5756ccd181). In particular, the ".subtests" files don't tend to change, and the lack of that file is what causes it to barf completely. So it's possible that the aggregation could have been showing stale results that did not match the run that just happened. We can fix it by setting TEST_OUTPUT_DIRECTORY only after we've loaded GIT-BUILD-OPTIONS, so that we override its value and not the other way around. And we'll do so only when the variable is not set, which should retain the fix for that case from 5756ccd181. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- t/perf/perf-lib.sh | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh index b15c74d6f15cd4..2ac007888e806a 100644 --- a/t/perf/perf-lib.sh +++ b/t/perf/perf-lib.sh @@ -20,7 +20,7 @@ # These variables must be set before the inclusion of test-lib.sh below, # because it will change our working directory. TEST_DIRECTORY=$(pwd)/.. -TEST_OUTPUT_DIRECTORY=$(pwd) +perf_dir=$(pwd) TEST_NO_CREATE_REPO=t TEST_NO_MALLOC_CHECK=t @@ -58,6 +58,7 @@ then fi . "$GIT_BUILD_DIR"/GIT-BUILD-OPTIONS +: ${TEST_OUTPUT_DIRECTORY:=$perf_dir} . "$GIT_SOURCE_DIR"/t/test-lib.sh # Then restore GIT_PERF_* settings. From 79d301c7676e79a09e7a1c65ca754132e1770828 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Tue, 6 Jan 2026 05:16:04 -0500 Subject: [PATCH 02/17] t/perf/run: preserve GIT_PERF_* from environment If you run: GIT_PERF_LARGE_REPO=/some/path ./p1006-cat-file.sh it will use the repo in /some/path. But if you use the "run" helper script to aggregate and compare results, like this: GIT_PERF_LARGE_REPO=/some/path ./run HEAD^ HEAD p1006-cat-file.sh it will ignore that variable. This is because the presence of the LARGE_REPO variable in GIT-BUILD-OPTIONS overrides what's in the environment. This started with 4638e8806e (Makefile: use common template for GIT-BUILD-OPTIONS, 2024-12-06), which now writes even empty variables (though arguably it was wrong even before with a non-empty value, as we generally prefer the environment to take precedence over on-disk config). We had the same problem in perf-lib.sh itself, and we hacked around it with 32b74b9809 (perf: do allow `GIT_PERF_*` to be overridden again, 2025-04-04). That's what lets the direct invocation of "./p1006" work above. And in fact that was sufficient for "./run", too, until it started loading GIT-BUILD-OPTIONS itself in 5756ccd181 (t/perf: fix benchmarks with out-of-tree builds, 2025-04-28). Now it has the same problem: it clobbers any incoming GIT_PERF options from the environment. We can use the same hack here in the "run" script. It's quite ugly, but it's just short enough that I don't think it's worth trying to factor it out into a common shell library. In the long run, we might consider teaching GIT-BUILD-OPTIONS to be more gentle in overwriting existing entries. There are probably other GIT_TEST_* variables which would need the same treatment. And if and when we come up with a more complete solution, we can use it in both spots. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- t/perf/run | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/t/perf/run b/t/perf/run index 073bcb2affffc3..13913db4a392da 100755 --- a/t/perf/run +++ b/t/perf/run @@ -204,8 +204,18 @@ run_subsection () { get_var_from_env_or_config "GIT_PERF_CODESPEED_OUTPUT" "perf" "codespeedOutput" "--bool" get_var_from_env_or_config "GIT_PERF_SEND_TO_CODESPEED" "perf" "sendToCodespeed" +# Preserve GIT_PERF settings from the environment when loading +# GIT-BUILD-OPTIONS; see the similar hack in perf-lib.sh. +git_perf_settings="$(env | + sed -n "/^GIT_PERF_/{ + # escape all single-quotes in the value + s/'/'\\\\''/g + # turn this into an eval-able assignment + s/^\\([^=]*=\\)\\(.*\\)/\\1'\\2'/p + }")" cd "$(dirname $0)" . ../../GIT-BUILD-OPTIONS +eval "$git_perf_settings" if test -n "$TEST_OUTPUT_DIRECTORY" then From 9e8b448dd83297ac85f6a62a0d2408629fb45cc0 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Tue, 6 Jan 2026 05:25:58 -0500 Subject: [PATCH 03/17] cat-file: only use bitmaps when filtering Commit 8002e8ee18 (builtin/cat-file: use bitmaps to efficiently filter by object type, 2025-04-02) introduced a performance regression when we are not filtering objects: it uses bitmaps even when they won't help, incurring extra costs. For example, running the new perf tests from this commit, which check the performance of listing objects by oid: $ export GIT_PERF_LARGE_REPO=/path/to/linux.git $ git -C "$GIT_PERF_LARGE_REPO" repack -adb $ GIT_SKIP_TESTS=p1006.1 ./run 8002e8ee18^ 8002e8ee18 p1006-cat-file.sh [...] Test 8002e8ee18^ 8002e8ee18 ------------------------------------------------------------------------------- 1006.2: list all objects (sorted) 1.48(1.44+0.04) 6.39(6.35+0.04) +331.8% 1006.3: list all objects (unsorted) 3.01(2.97+0.04) 3.40(3.29+0.10) +13.0% 1006.4: list blobs 4.85(4.67+0.17) 1.68(1.58+0.10) -65.4% An invocation that filters, like listing all blobs (1006.4), does benefit from using the bitmaps; it now doesn't have to check the type of each object from the pack data, so the tradeoff is worth it. But for listing all objects in sorted idx order (1006.2), we otherwise would never open the bitmap nor the revindex file. Worse, our sorting step gets much worse. Normally we append into an array in pack .idx order, and the sort step is trivial. But with bitmaps, we get the objects in pack order, which is apparently random with respect to oid, and have to sort the whole thing. (Note that this freshly-packed state represents the best case for .idx sorting; if we had two packs, then we'd have their objects one after the other and qsort would have to interleave them). The unsorted test in 1006.3 is interesting: there we are going in pack order, so we load the revindex for the pack anyway. And though we don't sort the result, we do use an oidset to check for duplicates. So we can see in the 8002e8ee18^ timings that those two things cost ~1.5s over the sorted case (mostly the oidset hash cost). We also incur the extra cost to open the bitmap file as of 8002e8ee18, which seems to be ~400ms. (This would probably be faster with a bitmap lookup table, but writing that out is not yet the default). So we know that bitmaps help when there's filtering to be done, but otherwise make things worse. Let's only use them when there's a filter. The perf script shows that we've fixed the regressions without hurting the bitmap case: Test 8002e8ee18^ 8002e8ee18 HEAD -------------------------------------------------------------------------------------------------------- 1006.2: list all objects (sorted) 1.56(1.53+0.03) 6.44(6.37+0.06) +312.8% 1.62(1.54+0.06) +3.8% 1006.3: list all objects (unsorted) 3.04(2.98+0.06) 3.45(3.38+0.07) +13.5% 3.04(2.99+0.04) +0.0% 1006.4: list blobs 5.14(4.98+0.15) 1.76(1.68+0.06) -65.8% 1.73(1.64+0.09) -66.3% Note that there's another related case: we might have a filter that cannot be used with bitmaps. That check is handled already for us in for_each_bitmapped_object(), though we'd still load the bitmap and revindex files pointlessly in that case. I don't think it can happen in practice for cat-file, though, since it allows only blob:none, blob:limit, and object:type filters, all of which work with bitmaps. It would be easy-ish to insert an extra check like: can_filter_bitmap(&opt->objects_filter); into the conditional, but I didn't bother here. It would be redundant with the call in for_each_bitmapped_object(), and the can_filter helper function is static local in the bitmap code (so we'd have to make it public). Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- builtin/cat-file.c | 8 +++++--- t/perf/p1006-cat-file.sh | 14 ++++++++++++++ 2 files changed, 19 insertions(+), 3 deletions(-) diff --git a/builtin/cat-file.c b/builtin/cat-file.c index 67a5ff2b9ebd29..6d704ec6c98398 100644 --- a/builtin/cat-file.c +++ b/builtin/cat-file.c @@ -839,12 +839,14 @@ static void batch_each_object(struct batch_options *opt, .callback = callback, .payload = _payload, }; - struct bitmap_index *bitmap = prepare_bitmap_git(the_repository); + struct bitmap_index *bitmap = NULL; for_each_loose_object(batch_one_object_loose, &payload, 0); - if (bitmap && !for_each_bitmapped_object(bitmap, &opt->objects_filter, - batch_one_object_bitmapped, &payload)) { + if (opt->objects_filter.choice != LOFC_DISABLED && + (bitmap = prepare_bitmap_git(the_repository)) && + !for_each_bitmapped_object(bitmap, &opt->objects_filter, + batch_one_object_bitmapped, &payload)) { struct packed_git *pack; for (pack = get_all_packs(the_repository); pack; pack = pack->next) { diff --git a/t/perf/p1006-cat-file.sh b/t/perf/p1006-cat-file.sh index dcd801537968b3..da34ece2428a4d 100755 --- a/t/perf/p1006-cat-file.sh +++ b/t/perf/p1006-cat-file.sh @@ -9,4 +9,18 @@ test_perf 'cat-file --batch-check' ' git cat-file --batch-all-objects --batch-check ' +test_perf 'list all objects (sorted)' ' + git cat-file --batch-all-objects --batch-check="%(objectname)" +' + +test_perf 'list all objects (unsorted)' ' + git cat-file --batch-all-objects --batch-check="%(objectname)" \ + --unordered +' + +test_perf 'list blobs' ' + git cat-file --batch-all-objects --batch-check="%(objectname)" \ + --unordered --filter=object:type=blob +' + test_done From 6c5c7e7071eeac33139fc7a7c0387bfb981153c2 Mon Sep 17 00:00:00 2001 From: Andrew Chitester Date: Tue, 6 Jan 2026 08:26:58 -0500 Subject: [PATCH 04/17] t1420: modernize the lost-found test This test indirectly checks that the lost-found folder has 2 files in it and then checks that the expected two files exist. Make this more deliberate by removing the old test -f and compare the actual ls of the lost-found directory with the expected files. Signed-off-by: Andrew Chitester Signed-off-by: Junio C Hamano --- t/t1420-lost-found.sh | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/t/t1420-lost-found.sh b/t/t1420-lost-found.sh index 2fb2f44f021ea3..926c6d63e392db 100755 --- a/t/t1420-lost-found.sh +++ b/t/t1420-lost-found.sh @@ -28,9 +28,12 @@ test_expect_success 'lost and found something' ' test_tick && git reset --hard HEAD^ && git fsck --lost-found && - test 2 = $(ls .git/lost-found/*/* | wc -l) && - test -f .git/lost-found/commit/$(cat lost-commit) && - test -f .git/lost-found/other/$(cat lost-other) + ls .git/lost-found/*/* >actual && + cat >expect <<-EOF && + .git/lost-found/commit/$(cat lost-commit) + .git/lost-found/other/$(cat lost-other) + EOF + test_cmp expect actual ' test_done From 3d61c1988b5d9468113dbe250e99eaabd3832090 Mon Sep 17 00:00:00 2001 From: Kristoffer Haugsbakk Date: Thu, 8 Jan 2026 07:28:15 +0100 Subject: [PATCH 05/17] =?UTF-8?q?doc:=20patch-id:=20don=E2=80=99t=20use=20?= =?UTF-8?q?semicolon=20between=20bullet=20points?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit These bullet points are full-fledged paragraphs with sentences. It’s best to restrict semicolon-termination to the case when the bullet list amounts to a list of items.[1] † 1: Like “List: ... • first; ... • second; and ... • third.” Signed-off-by: Kristoffer Haugsbakk Signed-off-by: Junio C Hamano --- Documentation/git-patch-id.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/git-patch-id.adoc b/Documentation/git-patch-id.adoc index 92a1af36a2765c..bac37db09d486a 100644 --- a/Documentation/git-patch-id.adoc +++ b/Documentation/git-patch-id.adoc @@ -45,7 +45,7 @@ This is the default if `patchid.verbatim` is `true`. with two different settings for `-O` result in the same patch ID signature, thereby allowing the computed result to be used as a key to index some meta-information about the change between - the two trees; + the two trees. - Result is different from the value produced by git 1.9 and older or produced when an "unstable" hash (see `--unstable` below) is From 92a61fe44dad03360141051180ace4aa39984abc Mon Sep 17 00:00:00 2001 From: Kristoffer Haugsbakk Date: Thu, 8 Jan 2026 07:28:16 +0100 Subject: [PATCH 06/17] doc: patch-id: capitalize Git version Git versions are always capitalized. Signed-off-by: Kristoffer Haugsbakk Signed-off-by: Junio C Hamano --- Documentation/git-patch-id.adoc | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/Documentation/git-patch-id.adoc b/Documentation/git-patch-id.adoc index bac37db09d486a..82992e35fc1075 100644 --- a/Documentation/git-patch-id.adoc +++ b/Documentation/git-patch-id.adoc @@ -47,7 +47,7 @@ This is the default if `patchid.verbatim` is `true`. as a key to index some meta-information about the change between the two trees. -- Result is different from the value produced by git 1.9 and older +- Result is different from the value produced by Git 1.9 and older or produced when an "unstable" hash (see `--unstable` below) is configured - even when used on a diff output taken without any use of `-O`, thereby making existing databases storing such @@ -61,8 +61,8 @@ This is the default if `patchid.stable` is set to `true`. `--unstable`:: Use an "unstable" hash as the patch ID. With this option, the result produced is compatible with the patch-id value produced - by git 1.9 and older and whitespace is ignored. Users with pre-existing - databases storing patch-ids produced by git 1.9 and older (who do not deal + by Git 1.9 and older and whitespace is ignored. Users with pre-existing + databases storing patch-ids produced by Git 1.9 and older (who do not deal with reordered patches) may want to use this option. + This is the default. From 285659cc983a60f89a60f96cff7397375766f3f8 Mon Sep 17 00:00:00 2001 From: Kristoffer Haugsbakk Date: Thu, 8 Jan 2026 07:28:17 +0100 Subject: [PATCH 07/17] =?UTF-8?q?patch-id:=20use=20=E2=80=9Cpatch=20ID?= =?UTF-8?q?=E2=80=9D=20throughout?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The “Description” section decided to introduce and use the term “patch ID” for the ID value itself. Let’s use the same term on the options as well. Also make to sure to use bare “ID” instead of “id”. Signed-off-by: Kristoffer Haugsbakk Signed-off-by: Junio C Hamano --- Documentation/git-patch-id.adoc | 10 +++++----- builtin/patch-id.c | 4 ++-- 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/Documentation/git-patch-id.adoc b/Documentation/git-patch-id.adoc index 82992e35fc1075..9999f164b583c9 100644 --- a/Documentation/git-patch-id.adoc +++ b/Documentation/git-patch-id.adoc @@ -31,7 +31,7 @@ OPTIONS ------- `--verbatim`:: - Calculate the patch-id of the input as it is given, do not strip + Calculate the patch ID of the input as it is given, do not strip any whitespace. + This is the default if `patchid.verbatim` is `true`. @@ -51,18 +51,18 @@ This is the default if `patchid.verbatim` is `true`. or produced when an "unstable" hash (see `--unstable` below) is configured - even when used on a diff output taken without any use of `-O`, thereby making existing databases storing such - "unstable" or historical patch-ids unusable. + "unstable" or historical patch IDs unusable. -- All whitespace within the patch is ignored and does not affect the id. +- All whitespace within the patch is ignored and does not affect the ID. -- + This is the default if `patchid.stable` is set to `true`. `--unstable`:: Use an "unstable" hash as the patch ID. With this option, - the result produced is compatible with the patch-id value produced + the result produced is compatible with the patch ID value produced by Git 1.9 and older and whitespace is ignored. Users with pre-existing - databases storing patch-ids produced by Git 1.9 and older (who do not deal + databases storing patch IDs produced by Git 1.9 and older (who do not deal with reordered patches) may want to use this option. + This is the default. diff --git a/builtin/patch-id.c b/builtin/patch-id.c index d26e9d0c1eae6a..2781598ede6ea0 100644 --- a/builtin/patch-id.c +++ b/builtin/patch-id.c @@ -228,9 +228,9 @@ int cmd_patch_id(int argc, int opts = 0; struct option builtin_patch_id_options[] = { OPT_CMDMODE(0, "unstable", &opts, - N_("use the unstable patch-id algorithm"), 1), + N_("use the unstable patch ID algorithm"), 1), OPT_CMDMODE(0, "stable", &opts, - N_("use the stable patch-id algorithm"), 2), + N_("use the stable patch ID algorithm"), 2), OPT_CMDMODE(0, "verbatim", &opts, N_("don't strip whitespace from the patch"), 3), OPT_END() From f671f5a83b78643964bdfb75eb0aab54d23e25e9 Mon Sep 17 00:00:00 2001 From: Kristoffer Haugsbakk Date: Thu, 8 Jan 2026 07:28:18 +0100 Subject: [PATCH 08/17] doc: patch-id: use definite article for the result Signed-off-by: Kristoffer Haugsbakk Signed-off-by: Junio C Hamano --- Documentation/git-patch-id.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/git-patch-id.adoc b/Documentation/git-patch-id.adoc index 9999f164b583c9..abd02fccdc0862 100644 --- a/Documentation/git-patch-id.adoc +++ b/Documentation/git-patch-id.adoc @@ -47,7 +47,7 @@ This is the default if `patchid.verbatim` is `true`. as a key to index some meta-information about the change between the two trees. -- Result is different from the value produced by Git 1.9 and older +- The result is different from the value produced by Git 1.9 and older or produced when an "unstable" hash (see `--unstable` below) is configured - even when used on a diff output taken without any use of `-O`, thereby making existing databases storing such From 89d4f3af16a95fe1b6abe8c9f43fb7fb6826dadf Mon Sep 17 00:00:00 2001 From: Kristoffer Haugsbakk Date: Thu, 8 Jan 2026 07:28:19 +0100 Subject: [PATCH 09/17] doc: patch-id: spell out the git-diff-tree(1) form You specifically need `--patch` since the default output is a raw diff. Signed-off-by: Kristoffer Haugsbakk Signed-off-by: Junio C Hamano --- Documentation/git-patch-id.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/git-patch-id.adoc b/Documentation/git-patch-id.adoc index abd02fccdc0862..61498def317984 100644 --- a/Documentation/git-patch-id.adoc +++ b/Documentation/git-patch-id.adoc @@ -21,7 +21,7 @@ the same time also reasonably unique, i.e., two patches that have the same The main usecase for this command is to look for likely duplicate commits. -When dealing with `git diff-tree` output, it takes advantage of +When dealing with `git diff-tree --patch` output, it takes advantage of the fact that the patch is prefixed with the object name of the commit, and outputs two 40-byte hexadecimal strings. The first string is the patch ID, and the second string is the commit ID. From 3f051fc9c9f9e719ad0e37b66737b466b82d17b3 Mon Sep 17 00:00:00 2001 From: Kristoffer Haugsbakk Date: Thu, 8 Jan 2026 07:28:20 +0100 Subject: [PATCH 10/17] doc: patch-id: --verbatim locks in --stable MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The default `--unstable` is a legacy format that predates `--stable`. That’s why 2871f4d4 (builtin: patch-id: add --verbatim as a command mode, 2022-10-24) made `--verbatim` lock in[1] `--stable`: Users of --unstable mainly care about compatibility with old git versions, which unstripping the whitespace would break. Thus there isn't a usecase for the combination of --verbatim and --unstable, and we don't expose this so as to not add maintainence burden. † 1: imply `--stable`, disallow `--unstable` Signed-off-by: Kristoffer Haugsbakk Signed-off-by: Junio C Hamano --- Documentation/git-patch-id.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/git-patch-id.adoc b/Documentation/git-patch-id.adoc index 61498def317984..013e1a6190681b 100644 --- a/Documentation/git-patch-id.adoc +++ b/Documentation/git-patch-id.adoc @@ -32,7 +32,7 @@ OPTIONS `--verbatim`:: Calculate the patch ID of the input as it is given, do not strip - any whitespace. + any whitespace. Implies `--stable` and forbids `--unstable`. + This is the default if `patchid.verbatim` is `true`. From 2bfc69e648d049a72403a4aa3f11e9360516e470 Mon Sep 17 00:00:00 2001 From: Michael Lyons Date: Thu, 8 Jan 2026 10:30:20 -0500 Subject: [PATCH 11/17] doc: blame-options: convert to new doc format MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Use __ instead of in the description - Modify some samples to use - Use `backticks` for keywords and more complex option descriptions. The new rendering engine will apply synopsis rules to these spans. Signed-off-by: Michael Lyons Acked-by: Jean-Noël AVILA Signed-off-by: Junio C Hamano --- Documentation/blame-options.adoc | 120 +++++++++++++++---------------- 1 file changed, 60 insertions(+), 60 deletions(-) diff --git a/Documentation/blame-options.adoc b/Documentation/blame-options.adoc index 1fb948fc76f3ab..1ae1222b6b5ffa 100644 --- a/Documentation/blame-options.adoc +++ b/Documentation/blame-options.adoc @@ -1,105 +1,105 @@ --b:: +`-b`:: Show blank SHA-1 for boundary commits. This can also be controlled via the `blame.blankBoundary` config option. ---root:: +`--root`:: Do not treat root commits as boundaries. This can also be controlled via the `blame.showRoot` config option. ---show-stats:: +`--show-stats`:: Include additional statistics at the end of blame output. --L ,:: --L ::: - Annotate only the line range given by ',', - or by the function name regex ''. +`-L ,`:: +`-L :`:: + Annotate only the line range given by `,`, + or by the function name regex __. May be specified multiple times. Overlapping ranges are allowed. + -'' and '' are optional. `-L ` or `-L ,` spans from -'' to end of file. `-L ,` spans from start of file to ''. +__ and __ are optional. `-L ` or `-L ,` spans from +__ to end of file. `-L ,` spans from start of file to __. + include::line-range-format.adoc[] --l:: +`-l`:: Show long rev (Default: off). --t:: +`-t`:: Show raw timestamp (Default: off). --S :: - Use revisions from revs-file instead of calling linkgit:git-rev-list[1]. +`-S `:: + Use revisions from __ instead of calling + linkgit:git-rev-list[1]. ---reverse ..:: +`--reverse ..`:: Walk history forward instead of backward. Instead of showing the revision in which a line appeared, this shows the last revision in which a line has existed. This requires a range of - revision like START..END where the path to blame exists in - START. `git blame --reverse START` is taken as `git blame - --reverse START..HEAD` for convenience. + revision like `..` where the path to blame exists in + __. `git blame --reverse ` is taken as `git blame + --reverse ..HEAD` for convenience. ---first-parent:: +`--first-parent`:: Follow only the first parent commit upon seeing a merge commit. This option can be used to determine when a line was introduced to a particular integration branch, rather than when it was introduced to the history overall. --p:: ---porcelain:: +`-p`:: +`--porcelain`:: Show in a format designed for machine consumption. ---line-porcelain:: +`--line-porcelain`:: Show the porcelain format, but output commit information for each line, not just the first time a commit is referenced. - Implies --porcelain. + Implies `--porcelain`. ---incremental:: +`--incremental`:: Show the result incrementally in a format designed for machine consumption. ---encoding=:: - Specifies the encoding used to output author names +`--encoding=`:: + Specify the encoding used to output author names and commit summaries. Setting it to `none` makes blame output unconverted data. For more information see the discussion about encoding in the linkgit:git-log[1] manual page. ---contents :: - Annotate using the contents from the named file, starting from - if it is specified, and HEAD otherwise. You may specify '-' to make +`--contents `:: + Annotate using the contents from __, starting from __ + if it is specified, and `HEAD` otherwise. You may specify `-` to make the command read from the standard input for the file contents. ---date :: - Specifies the format used to output dates. If --date is not - provided, the value of the blame.date config variable is - used. If the blame.date config variable is also not set, the +`--date `:: + Specify the format used to output dates. If `--date` is not + provided, the value of the `blame.date` config variable is + used. If the `blame.date` config variable is also not set, the iso format is used. For supported values, see the discussion - of the --date option at linkgit:git-log[1]. + of the `--date` option at linkgit:git-log[1]. ---progress:: ---no-progress:: - Progress status is reported on the standard error stream - by default when it is attached to a terminal. This flag - enables progress reporting even if not attached to a - terminal. Can't use `--progress` together with `--porcelain` - or `--incremental`. +`--progress`:: +`--no-progress`:: + Enable progress reporting on the standard error stream even if + not attached to a terminal. By default, progress status is + reported only when it is attached. You can't use `--progress` + together with `--porcelain` or `--incremental`. --M[]:: +`-M[]`:: Detect moved or copied lines within a file. When a commit moves or copies a block of lines (e.g. the original file - has A and then B, and the commit changes it to B and then - A), the traditional 'blame' algorithm notices only half of + has _A_ and then _B_, and the commit changes it to _B_ and then + _A_), the traditional `blame` algorithm notices only half of the movement and typically blames the lines that were moved - up (i.e. B) to the parent and assigns blame to the lines that - were moved down (i.e. A) to the child commit. With this + up (i.e. _B_) to the parent and assigns blame to the lines that + were moved down (i.e. _A_) to the child commit. With this option, both groups of lines are blamed on the parent by running extra passes of inspection. + - is optional but it is the lower bound on the number of +__ is optional, but it is the lower bound on the number of alphanumeric characters that Git must detect as moving/copying within a file for it to associate those lines with the parent commit. The default value is 20. --C[]:: +`-C[]`:: In addition to `-M`, detect lines moved or copied from other files that were modified in the same commit. This is useful when you reorganize your program and move code @@ -109,14 +109,14 @@ commit. The default value is 20. option is given three times, the command additionally looks for copies from other files in any commit. + - is optional but it is the lower bound on the number of +__ is optional, but it is the lower bound on the number of alphanumeric characters that Git must detect as moving/copying between files for it to associate those lines with the parent commit. And the default value is 40. If there are more than one -`-C` options given, the argument of the last `-C` will +`-C` options given, the __ argument of the last `-C` will take effect. ---ignore-rev :: +`--ignore-rev `:: Ignore changes made by the revision when assigning blame, as if the change never happened. Lines that were changed or added by an ignored commit will be blamed on the previous commit that changed that line or @@ -126,26 +126,26 @@ take effect. another commit will be marked with a `?` in the blame output. If the `blame.markUnblamableLines` config option is set, then those lines touched by an ignored commit that we could not attribute to another revision are - marked with a '*'. In the porcelain modes, we print 'ignored' and - 'unblamable' on a newline respectively. + marked with a `*`. In the porcelain modes, we print `ignored` and + `unblamable` on a newline respectively. ---ignore-revs-file :: - Ignore revisions listed in `file`, which must be in the same format as an +`--ignore-revs-file `:: + Ignore revisions listed in __, which must be in the same format as an `fsck.skipList`. This option may be repeated, and these files will be processed after any files specified with the `blame.ignoreRevsFile` config option. An empty file name, `""`, will clear the list of revs from previously processed files. ---color-lines:: +`--color-lines`:: Color line annotations in the default format differently if they come from the same commit as the preceding line. This makes it easier to distinguish code blocks introduced by different commits. The color defaults to cyan and can be adjusted using the `color.blame.repeatedLines` config option. ---color-by-age:: - Color line annotations depending on the age of the line in the default format. - The `color.blame.highlightRecent` config option controls what color is used for - each range of age. +`--color-by-age`:: + Color line annotations depending on the age of the line in + the default format. The `color.blame.highlightRecent` config + option controls what color is used for each range of age. --h:: +`-h`:: Show help message. From e40e01a75a795bc40822d80c14a3cfe728d45c1b Mon Sep 17 00:00:00 2001 From: Michael Lyons Date: Thu, 8 Jan 2026 10:30:21 -0500 Subject: [PATCH 12/17] doc: git-blame: convert to new doc format MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Use __ instead of in the description - Use _underscores_ around math associated with - Use `backticks` for keywords and more complex option descriptions. The new rendering engine will apply synopsis rules to these spans. Signed-off-by: Michael Lyons Acked-by: Jean-Noël AVILA Signed-off-by: Junio C Hamano --- Documentation/git-blame.adoc | 72 ++++++++++++++++++------------------ 1 file changed, 37 insertions(+), 35 deletions(-) diff --git a/Documentation/git-blame.adoc b/Documentation/git-blame.adoc index adcbb6f5dc97a3..8808009e87eb1d 100644 --- a/Documentation/git-blame.adoc +++ b/Documentation/git-blame.adoc @@ -7,12 +7,12 @@ git-blame - Show what revision and author last modified each line of a file SYNOPSIS -------- -[verse] -'git blame' [-c] [-b] [-l] [--root] [-t] [-f] [-n] [-s] [-e] [-p] [-w] [--incremental] - [-L ] [-S ] [-M] [-C] [-C] [-C] [--since=] - [--ignore-rev ] [--ignore-revs-file ] - [--color-lines] [--color-by-age] [--progress] [--abbrev=] - [ --contents ] [ | --reverse ..] [--] +[synopsis] +git blame [-c] [-b] [-l] [--root] [-t] [-f] [-n] [-s] [-e] [-p] [-w] [--incremental] + [-L ] [-S ] [-M] [-C] [-C] [-C] [--since=] + [--ignore-rev ] [--ignore-revs-file ] + [--color-lines] [--color-by-age] [--progress] [--abbrev=] + [ --contents ] [ | --reverse ..] [--] DESCRIPTION ----------- @@ -30,7 +30,7 @@ lines that were copied and pasted from another file, etc., see the `-C` and `-M` options. The report does not tell you anything about lines which have been deleted or -replaced; you need to use a tool such as 'git diff' or the "pickaxe" +replaced; you need to use a tool such as `git diff` or the "pickaxe" interface briefly mentioned in the following paragraph. Apart from supporting file annotation, Git also supports searching the @@ -50,47 +50,47 @@ OPTIONS ------- include::blame-options.adoc[] --c:: +`-c`:: Use the same output mode as linkgit:git-annotate[1] (Default: off). ---score-debug:: +`--score-debug`:: Include debugging information related to the movement of lines between files (see `-C`) and lines moved within a file (see `-M`). The first number listed is the score. This is the number of alphanumeric characters detected as having been moved between or within files. This must be above - a certain threshold for 'git blame' to consider those lines + a certain threshold for `git blame` to consider those lines of code to have been moved. --f:: ---show-name:: +`-f`:: +`--show-name`:: Show the filename in the original commit. By default the filename is shown if there is any line that came from a file with a different name, due to rename detection. --n:: ---show-number:: +`-n`:: +`--show-number`:: Show the line number in the original commit (Default: off). --s:: +`-s`:: Suppress the author name and timestamp from the output. --e:: ---show-email:: +`-e`:: +`--show-email`:: Show the author email instead of the author name (Default: off). This can also be controlled via the `blame.showEmail` config option. --w:: +`-w`:: Ignore whitespace when comparing the parent's version and the child's to find where the lines came from. include::diff-algorithm-option.adoc[] ---abbrev=:: - Instead of using the default 7+1 hexadecimal digits as the - abbreviated object name, use +1 digits, where is at - least but ensures the commit object names are unique. +`--abbrev=`:: + Instead of using the default _7+1_ hexadecimal digits as the + abbreviated object name, use _+1_ digits, where __ is at + least __ but ensures the commit object names are unique. Note that 1 column is used for a caret to mark the boundary commit. @@ -124,21 +124,21 @@ header at the minimum has the first line which has: This header line is followed by the following information at least once for each commit: -- the author name ("author"), email ("author-mail"), time - ("author-time"), and time zone ("author-tz"); similarly +- the author name (`author`), email (`author-mail`), time + (`author-time`), and time zone (`author-tz`); similarly for committer. - the filename in the commit that the line is attributed to. -- the first line of the commit log message ("summary"). +- the first line of the commit log message (`summary`). The contents of the actual line are output after the above -header, prefixed by a TAB. This is to allow adding more +header, prefixed by a _TAB_. This is to allow adding more header elements later. The porcelain format generally suppresses commit information that has already been seen. For example, two lines that are blamed to the same commit will both be shown, but the details for that commit will be shown only once. Information which is specific to individual lines will not be -grouped together, like revs to be marked 'ignored' or 'unblamable'. This +grouped together, like revs to be marked `ignored` or `unblamable`. This is more efficient, but may require more state be kept by the reader. The `--line-porcelain` option can be used to output full commit information for each line, allowing simpler (but less efficient) usage like: @@ -152,7 +152,7 @@ for each line, allowing simpler (but less efficient) usage like: SPECIFYING RANGES ----------------- -Unlike 'git blame' and 'git annotate' in older versions of git, the extent +Unlike `git blame` and `git annotate` in older versions of git, the extent of the annotation can be limited to both line ranges and revision ranges. The `-L` option, which limits annotation to a range of lines, may be specified multiple times. @@ -173,7 +173,7 @@ which limits the annotation to the body of the `hello` subroutine. When you are not interested in changes older than version v2.6.18, or changes older than 3 weeks, you can use revision -range specifiers similar to 'git rev-list': +range specifiers similar to `git rev-list`: git blame v2.6.18.. -- foo git blame --since=3.weeks -- foo @@ -212,8 +212,9 @@ does not contain the actual lines from the file that is being annotated. . Each blame entry always starts with a line of: - - <40-byte-hex-sha1> ++ +[synopsis] +<40-byte-hex-sha1> + Line numbers count from 1. @@ -224,16 +225,17 @@ Line numbers count from 1. . Unlike the Porcelain format, the filename information is always given and terminates the entry: - - "filename" ++ +[synopsis] +filename + and thus it is really quite easy to parse for some line- and word-oriented parser (which should be quite natural for most scripting languages). + [NOTE] For people who do parsing: to make it more robust, just ignore any -lines between the first and last one ("" and "filename" lines) -where you do not recognize the tag words (or care about that particular +lines between the first and last one (_<40-byte-hex-sha1>_ and `filename` +lines) where you do not recognize the tag words (or care about that particular one) at the beginning of the "extended information" lines. That way, if there is ever added information (like the commit encoding or extended commit commentary), a blame viewer will not care. From 02fc44a98997f1b1602ca9cce6d460092803dae1 Mon Sep 17 00:00:00 2001 From: "brian m. carlson" Date: Fri, 9 Jan 2026 01:46:08 +0000 Subject: [PATCH 13/17] gitfaq: document using stash import/export to sync working tree Git 2.51 learned how to import and export stashes. This is a secure and robust way to transfer working tree states across machines and comes with almost none of the pitfalls of rsync or other tools. Recommend this as an alternative in the FAQ. Signed-off-by: brian m. carlson Signed-off-by: Junio C Hamano --- Documentation/gitfaq.adoc | 41 +++++++++++++++++++++++++++------------ 1 file changed, 29 insertions(+), 12 deletions(-) diff --git a/Documentation/gitfaq.adoc b/Documentation/gitfaq.adoc index f2917d142c3630..2b62320abb7f11 100644 --- a/Documentation/gitfaq.adoc +++ b/Documentation/gitfaq.adoc @@ -214,14 +214,30 @@ of refs, such that both sides end up with different commits on a branch that the other doesn't have. This can result in important objects becoming unreferenced and possibly pruned by `git gc`, causing data loss. + -Therefore, it's better to push your work to either the other system or a central -server using the normal push and pull mechanism. However, this doesn't always -preserve important data, like stashes, so some people prefer to share a working -tree across systems. -+ -If you do this, the recommended approach is to use `rsync -a --delete-after` -(ideally with an encrypted connection such as with `ssh`) on the root of -repository. You should ensure several things when you do this: +Therefore, it's better to push your work to either the other system or a +central server using the normal push and pull mechanism. In Git 2.51, Git +learned to import and export stashes, so it's possible to synchronize the state +of the working tree by stashing it with `git stash`, then exporting either all +stashes with `git stash export --to-ref refs/heads/stashes` (assuming you want +to export to the `stashes` branch) or selecting stashes by adding their numbers +to the end of that command. It's also possible to include untracked files by +using the `--include-untracked` argument when stashing the data in the first +place, but be careful not to do this if any of these contain sensitive +information. ++ +You can then push the `stashes` branch (or whatever branch you've exported to), +fetch them to the local system (such as with `git fetch origin ++stashes:stashes`), and import the stashes on the other system with `git stash +import stashes` (again, changing the name as necessary). Applying the changes +to the working tree can be done with `git stash pop` or `git stash apply`. +This is the approach that is most robust and most likely to avoid unintended +problems. ++ +Having said that, there are some cases where people nevertheless prefer to +share a working tree across systems. If you do this, the recommended approach +is to use `rsync -a --delete-after` (ideally with an encrypted connection such +as with `ssh`) on the root of repository. You should ensure several things +when you do this: + * If you have additional worktrees or a separate Git directory, they must be synced at the same time as the main working tree and repository. @@ -232,10 +248,11 @@ repository. You should ensure several things when you do this: any sort are taking place on it, including background operations like `git gc` and operations invoked by your editor). + -Be aware that even with these recommendations, syncing in this way has some risk -since it bypasses Git's normal integrity checking for repositories, so having -backups is advised. You may also wish to do a `git fsck` to verify the -integrity of your data on the destination system after syncing. +Be aware that even with these recommendations, syncing working trees in this +way has some risk since it bypasses Git's normal integrity checking for +repositories, so having backups is advised. You may also wish to do a `git +fsck` to verify the integrity of your data on the destination system after +syncing. Common Issues ------------- From dbbf6a901bd0ebe9d43a27aadd517ad7d7d8dae0 Mon Sep 17 00:00:00 2001 From: K Jayatheerth Date: Fri, 9 Jan 2026 08:50:27 +0530 Subject: [PATCH 14/17] t7101: modernize test path checks Replace old-style `test -[df]` and `! test -[df]` assertions with the modern `test_path_is_file`, `test_path_is_dir`, and `test_path_is_missing` helpers. These helpers provide more informative error messages in case of failure (e.g., "File 'foo' is missing" instead of just exit code 1). While at it, fix a typo and an incorrect path reference in one of the test descriptions. Signed-off-by: K Jayatheerth Signed-off-by: Junio C Hamano --- t/t7101-reset-empty-subdirs.sh | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/t/t7101-reset-empty-subdirs.sh b/t/t7101-reset-empty-subdirs.sh index 33d5d5b76e7d17..d1d3e231fc2eb8 100755 --- a/t/t7101-reset-empty-subdirs.sh +++ b/t/t7101-reset-empty-subdirs.sh @@ -34,32 +34,32 @@ test_expect_success 'resetting tree HEAD^' ' ' test_expect_success 'checking initial files exist after rewind' ' - test -d path0 && - test -f path0/COPYING + test_path_is_dir path0 && + test_path_is_file path0/COPYING ' test_expect_success 'checking lack of path1/path2/COPYING' ' - ! test -f path1/path2/COPYING + test_path_is_missing path1/path2/COPYING ' test_expect_success 'checking lack of path1/COPYING' ' - ! test -f path1/COPYING + test_path_is_missing path1/COPYING ' test_expect_success 'checking lack of COPYING' ' - ! test -f COPYING + test_path_is_missing COPYING ' -test_expect_success 'checking checking lack of path1/COPYING-TOO' ' - ! test -f path0/COPYING-TOO +test_expect_success 'checking lack of path0/COPYING-TOO' ' + test_path_is_missing path0/COPYING-TOO ' test_expect_success 'checking lack of path1/path2' ' - ! test -d path1/path2 + test_path_is_missing path1/path2 ' test_expect_success 'checking lack of path1' ' - ! test -d path1 + test_path_is_missing path1 ' test_done From 2ac93bfcbc28e28a4844095648988558dad02aa3 Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Fri, 9 Jan 2026 03:39:01 +0000 Subject: [PATCH 15/17] builtin.h: update documentation The documentation for the builtin API was moved from the technical documentation and into a comment in builtin.h by ec14d4ecb5 (builtin.h: take over documentation from api-builtin.txt, 2017-08-02). This documentation wasn't updated as part of the major overhaul to include a repository struct in 9b1cb5070f (builtin: add a repository parameter for builtin functions, 2024-09-13). There was a brief update regarding the move from *.txt to *.adoc by e8015223c7 (builtin.h: *.txt -> *.adoc fixes, 2025-03-03). I noticed that there was quite a bit missing from the old documentation, which is still visible on git-scm.com [1]. [1] https://github.com/git/git-scm.com/issues/2124 This change updates the documentation in the following ways: 1. Updates the cmd_foo() prototype to include a repository. 2. Adds some newlines to have uniformity in the list of flags. 3. Adds a description of the NO_PARSEOPT flag. 4. Describes the tests that perform checks on all builtins, which may trip up a contributor working on a new builtin. I double-checked these instructions against a toy example in my local branch to be sure that it was complete. Signed-off-by: Derrick Stolee Signed-off-by: Junio C Hamano --- builtin.h | 26 +++++++++++++++++++++++++- 1 file changed, 25 insertions(+), 1 deletion(-) diff --git a/builtin.h b/builtin.h index 1b35565fbd9a3c..e5e16ecaa6c9d7 100644 --- a/builtin.h +++ b/builtin.h @@ -17,7 +17,8 @@ * . Define the implementation of the built-in command `foo` with * signature: * - * int cmd_foo(int argc, const char **argv, const char *prefix); + * int cmd_foo(int argc, const char **argv, + * const char *prefix, struct repository *repo); * * . Add the external declaration for the function to `builtin.h`. * @@ -29,12 +30,14 @@ * where options is the bitwise-or of: * * `RUN_SETUP`: + * * If there is not a Git directory to work on, abort. If there * is a work tree, chdir to the top of it if the command was * invoked in a subdirectory. If there is no work tree, no * chdir() is done. * * `RUN_SETUP_GENTLY`: + * * If there is a Git directory, chdir as per RUN_SETUP, otherwise, * don't chdir anywhere. * @@ -57,6 +60,12 @@ * more informed decision, e.g., by ignoring `pager.` for * certain subcommands. * + * `NO_PARSEOPT`: + * + * Most Git builtins use the parseopt library for parsing options. + * This flag indicates that a custom parser is used and thus the + * builtin would not appear in 'git --list-cmds=parseopt'. + * * . Add `builtin/foo.o` to `BUILTIN_OBJS` in `Makefile`. * * Additionally, if `foo` is a new command, there are 4 more things to do: @@ -69,6 +78,21 @@ * * . Add an entry for `/git-foo` to `.gitignore`. * + * As you work on implementing your builtin, be mindful that the + * following tests will check different aspects of the builtin's + * readiness and adherence to matching the documentation: + * + * * t0012-help.sh checks that the builtin can handle -h, which comes + * automatically with the parseopt API. + * + * * t0450-txt-doc-vs-help.sh checks that the -h help output matches the + * SYNOPSIS in the documentation for the builtin. + * + * * t1517-outside-repo.sh checks that the builtin can handle -h when + * run outside of the context of a repository. Note that this test + * requires that the usage has a space after the builtin name, so some + * minimum description of options is required. + * * * How a built-in is called * ------------------------ From a3d1f391d35762162356201028fb73774a6c4a8b Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Thu, 15 Jan 2026 11:12:53 -0800 Subject: [PATCH 16/17] Revert "Merge branch 'ar/run-command-hook'" This reverts commit f406b8955295d01089ba2baf35eceadff2d11cae, reversing changes made to 1627809eeff75e6ec936fc609e7be46d5eb2fa9e. It seems to have caused a few regressions, two of the three known ones we have proposed solutions for. Let's give ourselves a bit more room to maneuver during the pre-release freeze period and restart once the 2.53 ships. --- Documentation/RelNotes/2.53.0.adoc | 3 - builtin/hook.c | 6 - builtin/receive-pack.c | 270 +++++++++++++++++------------ commit.c | 3 - hook.c | 29 +--- hook.h | 51 ------ refs.c | 100 +++++------ run-command.c | 142 +++------------ run-command.h | 38 ---- sequencer.c | 42 ++--- t/helper/test-run-command.c | 67 +------ t/t0061-run-command.sh | 38 ---- transport.c | 89 +++++----- 13 files changed, 292 insertions(+), 586 deletions(-) diff --git a/Documentation/RelNotes/2.53.0.adoc b/Documentation/RelNotes/2.53.0.adoc index dcdebe8954f067..96d871782a1bc9 100644 --- a/Documentation/RelNotes/2.53.0.adoc +++ b/Documentation/RelNotes/2.53.0.adoc @@ -93,9 +93,6 @@ Performance, Internal Implementation, Development Support etc. * Prepare test suite for Git for Windows that supports symbolic links. - * Use hook API to replace ad-hoc invocation of hook scripts with the - run_command() API. - * Import newer version of "clar", unit testing framework. (merge 84071a6dea ps/clar-integers later to maint). diff --git a/builtin/hook.c b/builtin/hook.c index 73e7b8c2e878eb..7afec380d2e579 100644 --- a/builtin/hook.c +++ b/builtin/hook.c @@ -43,12 +43,6 @@ static int run(int argc, const char **argv, const char *prefix, if (!argc) goto usage; - /* - * All current "hook run" use-cases require ungrouped child output. - * If this changes, a hook run argument can be added to toggle it. - */ - opt.ungroup = 1; - /* * Having a -- for "run" when providing is * mandatory. diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index ef1f77be8c9db6..9c491746168a6f 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -749,7 +749,7 @@ static int check_cert_push_options(const struct string_list *push_options) return retval; } -static void prepare_push_cert_sha1(struct run_hooks_opt *opt) +static void prepare_push_cert_sha1(struct child_process *proc) { static int already_done; @@ -775,23 +775,23 @@ static void prepare_push_cert_sha1(struct run_hooks_opt *opt) nonce_status = check_nonce(sigcheck.payload); } if (!is_null_oid(&push_cert_oid)) { - strvec_pushf(&opt->env, "GIT_PUSH_CERT=%s", + strvec_pushf(&proc->env, "GIT_PUSH_CERT=%s", oid_to_hex(&push_cert_oid)); - strvec_pushf(&opt->env, "GIT_PUSH_CERT_SIGNER=%s", + strvec_pushf(&proc->env, "GIT_PUSH_CERT_SIGNER=%s", sigcheck.signer ? sigcheck.signer : ""); - strvec_pushf(&opt->env, "GIT_PUSH_CERT_KEY=%s", + strvec_pushf(&proc->env, "GIT_PUSH_CERT_KEY=%s", sigcheck.key ? sigcheck.key : ""); - strvec_pushf(&opt->env, "GIT_PUSH_CERT_STATUS=%c", + strvec_pushf(&proc->env, "GIT_PUSH_CERT_STATUS=%c", sigcheck.result); if (push_cert_nonce) { - strvec_pushf(&opt->env, + strvec_pushf(&proc->env, "GIT_PUSH_CERT_NONCE=%s", push_cert_nonce); - strvec_pushf(&opt->env, + strvec_pushf(&proc->env, "GIT_PUSH_CERT_NONCE_STATUS=%s", nonce_status); if (nonce_status == NONCE_SLOP) - strvec_pushf(&opt->env, + strvec_pushf(&proc->env, "GIT_PUSH_CERT_NONCE_SLOP=%ld", nonce_stamp_slop); } @@ -803,74 +803,119 @@ struct receive_hook_feed_state { struct ref_push_report *report; int skip_broken; struct strbuf buf; + const struct string_list *push_options; }; -static int feed_receive_hook_cb(int hook_stdin_fd, void *pp_cb UNUSED, void *pp_task_cb) +typedef int (*feed_fn)(void *, const char **, size_t *); +static int run_and_feed_hook(const char *hook_name, feed_fn feed, + struct receive_hook_feed_state *feed_state) { - struct receive_hook_feed_state *state = pp_task_cb; - struct command *cmd = state->cmd; - unsigned int lines_batch_size = 500; - - strbuf_reset(&state->buf); - - /* batch lines to avoid going through run-command's poll loop for each line */ - for (unsigned int i = 0; i < lines_batch_size; i++) { - while (cmd && - state->skip_broken && (cmd->error_string || cmd->did_not_exist)) - cmd = cmd->next; + struct child_process proc = CHILD_PROCESS_INIT; + struct async muxer; + int code; + const char *hook_path = find_hook(the_repository, hook_name); - if (!cmd) - break; /* no more commands left */ + if (!hook_path) + return 0; - if (!state->report) - state->report = cmd->report; + strvec_push(&proc.args, hook_path); + proc.in = -1; + proc.stdout_to_stderr = 1; + proc.trace2_hook_name = hook_name; + + if (feed_state->push_options) { + size_t i; + for (i = 0; i < feed_state->push_options->nr; i++) + strvec_pushf(&proc.env, + "GIT_PUSH_OPTION_%"PRIuMAX"=%s", + (uintmax_t)i, + feed_state->push_options->items[i].string); + strvec_pushf(&proc.env, "GIT_PUSH_OPTION_COUNT=%"PRIuMAX"", + (uintmax_t)feed_state->push_options->nr); + } else + strvec_pushf(&proc.env, "GIT_PUSH_OPTION_COUNT"); - if (state->report) { - struct object_id *old_oid; - struct object_id *new_oid; - const char *ref_name; + if (tmp_objdir) + strvec_pushv(&proc.env, tmp_objdir_env(tmp_objdir)); - old_oid = state->report->old_oid ? state->report->old_oid : &cmd->old_oid; - new_oid = state->report->new_oid ? state->report->new_oid : &cmd->new_oid; - ref_name = state->report->ref_name ? state->report->ref_name : cmd->ref_name; + if (use_sideband) { + memset(&muxer, 0, sizeof(muxer)); + muxer.proc = copy_to_sideband; + muxer.in = -1; + code = start_async(&muxer); + if (code) + return code; + proc.err = muxer.in; + } - strbuf_addf(&state->buf, "%s %s %s\n", - oid_to_hex(old_oid), oid_to_hex(new_oid), - ref_name); + prepare_push_cert_sha1(&proc); - state->report = state->report->next; - if (!state->report) - cmd = cmd->next; - } else { - strbuf_addf(&state->buf, "%s %s %s\n", - oid_to_hex(&cmd->old_oid), oid_to_hex(&cmd->new_oid), - cmd->ref_name); - cmd = cmd->next; - } + code = start_command(&proc); + if (code) { + if (use_sideband) + finish_async(&muxer); + return code; } - state->cmd = cmd; + sigchain_push(SIGPIPE, SIG_IGN); - if (state->buf.len > 0) { - int ret = write_in_full(hook_stdin_fd, state->buf.buf, state->buf.len); - if (ret < 0) { - if (errno == EPIPE) - return 1; /* child closed pipe */ - return ret; - } + while (1) { + const char *buf; + size_t n; + if (feed(feed_state, &buf, &n)) + break; + if (write_in_full(proc.in, buf, n) < 0) + break; } + close(proc.in); + if (use_sideband) + finish_async(&muxer); - return state->cmd ? 0 : 1; /* 0 = more to come, 1 = EOF */ + sigchain_pop(SIGPIPE); + + return finish_command(&proc); } -static void hook_output_to_sideband(struct strbuf *output, void *cb_data UNUSED) +static int feed_receive_hook(void *state_, const char **bufp, size_t *sizep) { - if (!output) - BUG("output must be non-NULL"); + struct receive_hook_feed_state *state = state_; + struct command *cmd = state->cmd; - /* buffer might be empty for keepalives */ - if (output->len) - send_sideband(1, 2, output->buf, output->len, use_sideband); + while (cmd && + state->skip_broken && (cmd->error_string || cmd->did_not_exist)) + cmd = cmd->next; + if (!cmd) + return -1; /* EOF */ + if (!bufp) + return 0; /* OK, can feed something. */ + strbuf_reset(&state->buf); + if (!state->report) + state->report = cmd->report; + if (state->report) { + struct object_id *old_oid; + struct object_id *new_oid; + const char *ref_name; + + old_oid = state->report->old_oid ? state->report->old_oid : &cmd->old_oid; + new_oid = state->report->new_oid ? state->report->new_oid : &cmd->new_oid; + ref_name = state->report->ref_name ? state->report->ref_name : cmd->ref_name; + strbuf_addf(&state->buf, "%s %s %s\n", + oid_to_hex(old_oid), oid_to_hex(new_oid), + ref_name); + state->report = state->report->next; + if (!state->report) + state->cmd = cmd->next; + } else { + strbuf_addf(&state->buf, "%s %s %s\n", + oid_to_hex(&cmd->old_oid), oid_to_hex(&cmd->new_oid), + cmd->ref_name); + state->cmd = cmd->next; + } + if (bufp) { + *bufp = state->buf.buf; + *sizep = state->buf.len; + } + return 0; } static int run_receive_hook(struct command *commands, @@ -878,65 +923,47 @@ static int run_receive_hook(struct command *commands, int skip_broken, const struct string_list *push_options) { - struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT; - struct command *iter = commands; - struct receive_hook_feed_state feed_state; - int ret; - - /* if there are no valid commands, don't invoke the hook at all. */ - while (iter && skip_broken && (iter->error_string || iter->did_not_exist)) - iter = iter->next; - if (!iter) - return 0; - - if (push_options) { - for (int i = 0; i < push_options->nr; i++) - strvec_pushf(&opt.env, "GIT_PUSH_OPTION_%d=%s", i, - push_options->items[i].string); - strvec_pushf(&opt.env, "GIT_PUSH_OPTION_COUNT=%"PRIuMAX"", - (uintmax_t)push_options->nr); - } else { - strvec_push(&opt.env, "GIT_PUSH_OPTION_COUNT"); - } - - if (tmp_objdir) - strvec_pushv(&opt.env, tmp_objdir_env(tmp_objdir)); - - prepare_push_cert_sha1(&opt); - - /* set up sideband printer */ - if (use_sideband) - opt.consume_output = hook_output_to_sideband; - - /* set up stdin callback */ - feed_state.cmd = commands; - feed_state.skip_broken = skip_broken; - feed_state.report = NULL; - strbuf_init(&feed_state.buf, 0); - opt.feed_pipe_cb_data = &feed_state; - opt.feed_pipe = feed_receive_hook_cb; - - ret = run_hooks_opt(the_repository, hook_name, &opt); - - strbuf_release(&feed_state.buf); + struct receive_hook_feed_state state; + int status; - return ret; + strbuf_init(&state.buf, 0); + state.cmd = commands; + state.skip_broken = skip_broken; + state.report = NULL; + if (feed_receive_hook(&state, NULL, NULL)) + return 0; + state.cmd = commands; + state.push_options = push_options; + status = run_and_feed_hook(hook_name, feed_receive_hook, &state); + strbuf_release(&state.buf); + return status; } static int run_update_hook(struct command *cmd) { - struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT; + struct child_process proc = CHILD_PROCESS_INIT; + int code; + const char *hook_path = find_hook(the_repository, "update"); - strvec_pushl(&opt.args, - cmd->ref_name, - oid_to_hex(&cmd->old_oid), - oid_to_hex(&cmd->new_oid), - NULL); + if (!hook_path) + return 0; - if (use_sideband) - opt.consume_output = hook_output_to_sideband; + strvec_push(&proc.args, hook_path); + strvec_push(&proc.args, cmd->ref_name); + strvec_push(&proc.args, oid_to_hex(&cmd->old_oid)); + strvec_push(&proc.args, oid_to_hex(&cmd->new_oid)); + + proc.no_stdin = 1; + proc.stdout_to_stderr = 1; + proc.err = use_sideband ? -1 : 0; + proc.trace2_hook_name = "update"; - return run_hooks_opt(the_repository, "update", &opt); + code = start_command(&proc); + if (code) + return code; + if (use_sideband) + copy_to_sideband(proc.err, -1, NULL); + return finish_command(&proc); } static struct command *find_command_by_refname(struct command *list, @@ -1613,20 +1640,33 @@ static const char *update(struct command *cmd, struct shallow_info *si) static void run_update_post_hook(struct command *commands) { struct command *cmd; - struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT; + struct child_process proc = CHILD_PROCESS_INIT; + const char *hook; + + hook = find_hook(the_repository, "post-update"); + if (!hook) + return; for (cmd = commands; cmd; cmd = cmd->next) { if (cmd->error_string || cmd->did_not_exist) continue; - strvec_push(&opt.args, cmd->ref_name); + if (!proc.args.nr) + strvec_push(&proc.args, hook); + strvec_push(&proc.args, cmd->ref_name); } - if (!opt.args.nr) + if (!proc.args.nr) return; - if (use_sideband) - opt.consume_output = hook_output_to_sideband; + proc.no_stdin = 1; + proc.stdout_to_stderr = 1; + proc.err = use_sideband ? -1 : 0; + proc.trace2_hook_name = "post-update"; - run_hooks_opt(the_repository, "post-update", &opt); + if (!start_command(&proc)) { + if (use_sideband) + copy_to_sideband(proc.err, -1, NULL); + finish_command(&proc); + } } static void check_aliased_update_internal(struct command *cmd, diff --git a/commit.c b/commit.c index efd0c026831e6b..28bb5ce029f3c5 100644 --- a/commit.c +++ b/commit.c @@ -1978,9 +1978,6 @@ int run_commit_hook(int editor_is_used, const char *index_file, strvec_push(&opt.args, arg); va_end(args); - /* All commit hook use-cases require ungrouping child output. */ - opt.ungroup = 1; - opt.invoked_hook = invoked_hook; return run_hooks_opt(the_repository, name, &opt); } diff --git a/hook.c b/hook.c index 35211e5ed7c474..b3de1048bf44b9 100644 --- a/hook.c +++ b/hook.c @@ -55,7 +55,7 @@ int hook_exists(struct repository *r, const char *name) static int pick_next_hook(struct child_process *cp, struct strbuf *out UNUSED, void *pp_cb, - void **pp_task_cb) + void **pp_task_cb UNUSED) { struct hook_cb_data *hook_cb = pp_cb; const char *hook_path = hook_cb->hook_path; @@ -65,22 +65,11 @@ static int pick_next_hook(struct child_process *cp, cp->no_stdin = 1; strvec_pushv(&cp->env, hook_cb->options->env.v); - - if (hook_cb->options->path_to_stdin && hook_cb->options->feed_pipe) - BUG("options path_to_stdin and feed_pipe are mutually exclusive"); - /* reopen the file for stdin; run_command closes it. */ if (hook_cb->options->path_to_stdin) { cp->no_stdin = 0; cp->in = xopen(hook_cb->options->path_to_stdin, O_RDONLY); } - - if (hook_cb->options->feed_pipe) { - cp->no_stdin = 0; - /* start_command() will allocate a pipe / stdin fd for us */ - cp->in = -1; - } - cp->stdout_to_stderr = 1; cp->trace2_hook_name = hook_cb->hook_name; cp->dir = hook_cb->options->dir; @@ -88,12 +77,6 @@ static int pick_next_hook(struct child_process *cp, strvec_push(&cp->args, hook_path); strvec_pushv(&cp->args, hook_cb->options->args.v); - /* - * Provide per-hook internal state via task_cb for easy access, so - * hook callbacks don't have to go through hook_cb->options. - */ - *pp_task_cb = hook_cb->options->feed_pipe_cb_data; - /* * This pick_next_hook() will be called again, we're only * running one hook, so indicate that no more work will be @@ -153,12 +136,10 @@ int run_hooks_opt(struct repository *r, const char *hook_name, .tr2_label = hook_name, .processes = 1, - .ungroup = options->ungroup, + .ungroup = 1, .get_next_task = pick_next_hook, .start_failure = notify_start_failure, - .feed_pipe = options->feed_pipe, - .consume_output = options->consume_output, .task_finished = notify_hook_finished, .data = &cb_data, @@ -167,9 +148,6 @@ int run_hooks_opt(struct repository *r, const char *hook_name, if (!options) BUG("a struct run_hooks_opt must be provided to run_hooks"); - if (options->path_to_stdin && options->feed_pipe) - BUG("options path_to_stdin and feed_pipe are mutually exclusive"); - if (options->invoked_hook) *options->invoked_hook = 0; @@ -199,9 +177,6 @@ int run_hooks(struct repository *r, const char *hook_name) { struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT; - /* All use-cases of this API require ungrouping. */ - opt.ungroup = 1; - return run_hooks_opt(r, hook_name, &opt); } diff --git a/hook.h b/hook.h index ae502178b9bfad..11863fa7347e6f 100644 --- a/hook.h +++ b/hook.h @@ -1,7 +1,6 @@ #ifndef HOOK_H #define HOOK_H #include "strvec.h" -#include "run-command.h" struct repository; @@ -34,60 +33,10 @@ struct run_hooks_opt */ int *invoked_hook; - /** - * Allow hooks to set run_processes_parallel() 'ungroup' behavior. - */ - unsigned int ungroup:1; - /** * Path to file which should be piped to stdin for each hook. */ const char *path_to_stdin; - - /** - * Callback used to incrementally feed a child hook stdin pipe. - * - * Useful especially if a hook consumes large quantities of data - * (e.g. a list of all refs in a client push), so feeding it via - * in-memory strings or slurping to/from files is inefficient. - * While the callback allows piecemeal writing, it can also be - * used for smaller inputs, where it gets called only once. - * - * Add hook callback initalization context to `feed_pipe_ctx`. - * Add hook callback internal state to `feed_pipe_cb_data`. - * - */ - feed_pipe_fn feed_pipe; - - /** - * Opaque data pointer used to pass context to `feed_pipe_fn`. - * - * It can be accessed via the second callback arg 'pp_cb': - * ((struct hook_cb_data *) pp_cb)->hook_cb->options->feed_pipe_ctx; - * - * The caller is responsible for managing the memory for this data. - * Only useful when using `run_hooks_opt.feed_pipe`, otherwise ignore it. - */ - void *feed_pipe_ctx; - - /** - * Opaque data pointer used to keep internal state across callback calls. - * - * It can be accessed directly via the third callback arg 'pp_task_cb': - * struct ... *state = pp_task_cb; - * - * The caller is responsible for managing the memory for this data. - * Only useful when using `run_hooks_opt.feed_pipe`, otherwise ignore it. - */ - void *feed_pipe_cb_data; - - /* - * Populate this to capture output and prevent it from being printed to - * stderr. This will be passed directly through to - * run_command:run_parallel_processes(). See t/helper/test-run-command.c - * for an example. - */ - consume_output_fn consume_output; }; #define RUN_HOOKS_OPT_INIT { \ diff --git a/refs.c b/refs.c index e06e0cb07283d3..046b695bb20f54 100644 --- a/refs.c +++ b/refs.c @@ -2422,72 +2422,68 @@ static int ref_update_reject_duplicates(struct string_list *refnames, return 0; } -struct transaction_feed_cb_data { - size_t index; - struct strbuf buf; -}; - -static int transaction_hook_feed_stdin(int hook_stdin_fd, void *pp_cb, void *pp_task_cb) +static int run_transaction_hook(struct ref_transaction *transaction, + const char *state) { - struct hook_cb_data *hook_cb = pp_cb; - struct ref_transaction *transaction = hook_cb->options->feed_pipe_ctx; - struct transaction_feed_cb_data *feed_cb_data = pp_task_cb; - struct strbuf *buf = &feed_cb_data->buf; - struct ref_update *update; - size_t i = feed_cb_data->index++; - int ret; - - if (i >= transaction->nr) - return 1; /* No more refs to process */ - - update = transaction->updates[i]; + struct child_process proc = CHILD_PROCESS_INIT; + struct strbuf buf = STRBUF_INIT; + const char *hook; + int ret = 0; - if (update->flags & REF_LOG_ONLY) - return 0; + hook = find_hook(transaction->ref_store->repo, "reference-transaction"); + if (!hook) + return ret; - strbuf_reset(buf); + strvec_pushl(&proc.args, hook, state, NULL); + proc.in = -1; + proc.stdout_to_stderr = 1; + proc.trace2_hook_name = "reference-transaction"; - if (!(update->flags & REF_HAVE_OLD)) - strbuf_addf(buf, "%s ", oid_to_hex(null_oid(the_hash_algo))); - else if (update->old_target) - strbuf_addf(buf, "ref:%s ", update->old_target); - else - strbuf_addf(buf, "%s ", oid_to_hex(&update->old_oid)); + ret = start_command(&proc); + if (ret) + return ret; - if (!(update->flags & REF_HAVE_NEW)) - strbuf_addf(buf, "%s ", oid_to_hex(null_oid(the_hash_algo))); - else if (update->new_target) - strbuf_addf(buf, "ref:%s ", update->new_target); - else - strbuf_addf(buf, "%s ", oid_to_hex(&update->new_oid)); + sigchain_push(SIGPIPE, SIG_IGN); - strbuf_addf(buf, "%s\n", update->refname); + for (size_t i = 0; i < transaction->nr; i++) { + struct ref_update *update = transaction->updates[i]; - ret = write_in_full(hook_stdin_fd, buf->buf, buf->len); - if (ret < 0 && errno != EPIPE) - return ret; + if (update->flags & REF_LOG_ONLY) + continue; - return 0; /* no more input to feed */ -} + strbuf_reset(&buf); -static int run_transaction_hook(struct ref_transaction *transaction, - const char *state) -{ - struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT; - struct transaction_feed_cb_data feed_ctx = { 0 }; - int ret = 0; + if (!(update->flags & REF_HAVE_OLD)) + strbuf_addf(&buf, "%s ", oid_to_hex(null_oid(the_hash_algo))); + else if (update->old_target) + strbuf_addf(&buf, "ref:%s ", update->old_target); + else + strbuf_addf(&buf, "%s ", oid_to_hex(&update->old_oid)); - strvec_push(&opt.args, state); + if (!(update->flags & REF_HAVE_NEW)) + strbuf_addf(&buf, "%s ", oid_to_hex(null_oid(the_hash_algo))); + else if (update->new_target) + strbuf_addf(&buf, "ref:%s ", update->new_target); + else + strbuf_addf(&buf, "%s ", oid_to_hex(&update->new_oid)); - opt.feed_pipe = transaction_hook_feed_stdin; - opt.feed_pipe_ctx = transaction; - opt.feed_pipe_cb_data = &feed_ctx; + strbuf_addf(&buf, "%s\n", update->refname); - strbuf_init(&feed_ctx.buf, 0); + if (write_in_full(proc.in, buf.buf, buf.len) < 0) { + if (errno != EPIPE) { + /* Don't leak errno outside this API */ + errno = 0; + ret = -1; + } + break; + } + } - ret = run_hooks_opt(transaction->ref_store->repo, "reference-transaction", &opt); + close(proc.in); + sigchain_pop(SIGPIPE); + strbuf_release(&buf); - strbuf_release(&feed_ctx.buf); + ret |= finish_command(&proc); return ret; } diff --git a/run-command.c b/run-command.c index 2d3c2ac55c5c02..e3e02475ccec50 100644 --- a/run-command.c +++ b/run-command.c @@ -1478,32 +1478,15 @@ enum child_state { GIT_CP_WAIT_CLEANUP, }; -struct parallel_child { - enum child_state state; - struct child_process process; - struct strbuf err; - void *data; -}; - -static int child_is_working(const struct parallel_child *pp_child) -{ - return pp_child->state == GIT_CP_WORKING; -} - -static int child_is_ready_for_cleanup(const struct parallel_child *pp_child) -{ - return child_is_working(pp_child) && !pp_child->process.in; -} - -static int child_is_receiving_input(const struct parallel_child *pp_child) -{ - return child_is_working(pp_child) && pp_child->process.in > 0; -} - struct parallel_processes { size_t nr_processes; - struct parallel_child *children; + struct { + enum child_state state; + struct child_process process; + struct strbuf err; + void *data; + } *children; /* * The struct pollfd is logically part of *children, * but the system call expects it as its own array. @@ -1526,7 +1509,7 @@ static void kill_children(const struct parallel_processes *pp, int signo) { for (size_t i = 0; i < opts->processes; i++) - if (child_is_working(&pp->children[i])) + if (pp->children[i].state == GIT_CP_WORKING) kill(pp->children[i].process.pid, signo); } @@ -1595,10 +1578,7 @@ static void pp_cleanup(struct parallel_processes *pp, * When get_next_task added messages to the buffer in its last * iteration, the buffered output is non empty. */ - if (opts->consume_output) - opts->consume_output(&pp->buffered_output, opts->data); - else - strbuf_write(&pp->buffered_output, stderr); + strbuf_write(&pp->buffered_output, stderr); strbuf_release(&pp->buffered_output); sigchain_pop_common(); @@ -1672,44 +1652,6 @@ static int pp_start_one(struct parallel_processes *pp, return 0; } -static void pp_buffer_stdin(struct parallel_processes *pp, - const struct run_process_parallel_opts *opts) -{ - /* Buffer stdin for each pipe. */ - for (size_t i = 0; i < opts->processes; i++) { - struct child_process *proc = &pp->children[i].process; - int ret; - - if (!child_is_receiving_input(&pp->children[i])) - continue; - - /* - * child input is provided via path_to_stdin when the feed_pipe cb is - * missing, so we just signal an EOF. - */ - if (!opts->feed_pipe) { - close(proc->in); - proc->in = 0; - continue; - } - - /** - * Feed the pipe: - * ret < 0 means error - * ret == 0 means there is more data to be fed - * ret > 0 means feeding finished - */ - ret = opts->feed_pipe(proc->in, opts->data, pp->children[i].data); - if (ret < 0) - die_errno("feed_pipe"); - - if (ret) { - close(proc->in); - proc->in = 0; - } - } -} - static void pp_buffer_stderr(struct parallel_processes *pp, const struct run_process_parallel_opts *opts, int output_timeout) @@ -1723,7 +1665,7 @@ static void pp_buffer_stderr(struct parallel_processes *pp, /* Buffer output from all pipes. */ for (size_t i = 0; i < opts->processes; i++) { - if (child_is_working(&pp->children[i]) && + if (pp->children[i].state == GIT_CP_WORKING && pp->pfd[i].revents & (POLLIN | POLLHUP)) { int n = strbuf_read_once(&pp->children[i].err, pp->children[i].process.err, 0); @@ -1737,17 +1679,13 @@ static void pp_buffer_stderr(struct parallel_processes *pp, } } -static void pp_output(const struct parallel_processes *pp, - const struct run_process_parallel_opts *opts) +static void pp_output(const struct parallel_processes *pp) { size_t i = pp->output_owner; - if (child_is_working(&pp->children[i]) && + if (pp->children[i].state == GIT_CP_WORKING && pp->children[i].err.len) { - if (opts->consume_output) - opts->consume_output(&pp->children[i].err, opts->data); - else - strbuf_write(&pp->children[i].err, stderr); + strbuf_write(&pp->children[i].err, stderr); strbuf_reset(&pp->children[i].err); } } @@ -1784,7 +1722,6 @@ static int pp_collect_finished(struct parallel_processes *pp, pp->children[i].state = GIT_CP_FREE; if (pp->pfd) pp->pfd[i].fd = -1; - pp->children[i].process.in = 0; child_process_init(&pp->children[i].process); if (opts->ungroup) { @@ -1795,15 +1732,11 @@ static int pp_collect_finished(struct parallel_processes *pp, } else { const size_t n = opts->processes; - /* Output errors, then all other finished child processes */ - if (opts->consume_output) { - opts->consume_output(&pp->children[i].err, opts->data); - opts->consume_output(&pp->buffered_output, opts->data); - } else { - strbuf_write(&pp->children[i].err, stderr); - strbuf_write(&pp->buffered_output, stderr); - } + strbuf_write(&pp->children[i].err, stderr); strbuf_reset(&pp->children[i].err); + + /* Output all other finished child processes */ + strbuf_write(&pp->buffered_output, stderr); strbuf_reset(&pp->buffered_output); /* @@ -1815,7 +1748,7 @@ static int pp_collect_finished(struct parallel_processes *pp, * running process time. */ for (i = 0; i < n; i++) - if (child_is_working(&pp->children[(pp->output_owner + i) % n])) + if (pp->children[(pp->output_owner + i) % n].state == GIT_CP_WORKING) break; pp->output_owner = (pp->output_owner + i) % n; } @@ -1823,27 +1756,6 @@ static int pp_collect_finished(struct parallel_processes *pp, return result; } -static void pp_handle_child_IO(struct parallel_processes *pp, - const struct run_process_parallel_opts *opts, - int output_timeout) -{ - /* - * First push input, if any (it might no-op), to child tasks to avoid them blocking - * after input. This also prevents deadlocks when ungrouping below, if a child blocks - * while the parent also waits for them to finish. - */ - pp_buffer_stdin(pp, opts); - - if (opts->ungroup) { - for (size_t i = 0; i < opts->processes; i++) - if (child_is_ready_for_cleanup(&pp->children[i])) - pp->children[i].state = GIT_CP_WAIT_CLEANUP; - } else { - pp_buffer_stderr(pp, opts, output_timeout); - pp_output(pp, opts); - } -} - void run_processes_parallel(const struct run_process_parallel_opts *opts) { int i, code; @@ -1863,16 +1775,6 @@ void run_processes_parallel(const struct run_process_parallel_opts *opts) "max:%"PRIuMAX, (uintmax_t)opts->processes); - if (opts->ungroup && opts->consume_output) - BUG("ungroup and reading output are mutualy exclusive"); - - /* - * Child tasks might receive input via stdin, terminating early (or not), so - * ignore the default SIGPIPE which gets handled by each feed_pipe_fn which - * actually writes the data to children stdin fds. - */ - sigchain_push(SIGPIPE, SIG_IGN); - pp_init(&pp, opts, &pp_sig); while (1) { for (i = 0; @@ -1890,7 +1792,13 @@ void run_processes_parallel(const struct run_process_parallel_opts *opts) } if (!pp.nr_processes) break; - pp_handle_child_IO(&pp, opts, output_timeout); + if (opts->ungroup) { + for (size_t i = 0; i < opts->processes; i++) + pp.children[i].state = GIT_CP_WAIT_CLEANUP; + } else { + pp_buffer_stderr(&pp, opts, output_timeout); + pp_output(&pp); + } code = pp_collect_finished(&pp, opts); if (code) { pp.shutdown = 1; @@ -1901,8 +1809,6 @@ void run_processes_parallel(const struct run_process_parallel_opts *opts) pp_cleanup(&pp, opts); - sigchain_pop(SIGPIPE); - if (do_trace2) trace2_region_leave(tr2_category, tr2_label, NULL); } diff --git a/run-command.h b/run-command.h index 7093252863966f..0df25e445f001c 100644 --- a/run-command.h +++ b/run-command.h @@ -420,32 +420,6 @@ typedef int (*start_failure_fn)(struct strbuf *out, void *pp_cb, void *pp_task_cb); -/** - * This callback is repeatedly called on every child process who requests - * start_command() to create a pipe by setting child_process.in < 0. - * - * pp_cb is the callback cookie as passed into run_processes_parallel, and - * pp_task_cb is the callback cookie as passed into get_next_task_fn. - * - * Returns < 0 for error - * Returns == 0 when there is more data to be fed (will be called again) - * Returns > 0 when finished (child closed fd or no more data to be fed) - */ -typedef int (*feed_pipe_fn)(int child_in, - void *pp_cb, - void *pp_task_cb); - -/** - * If this callback is provided, output is collated into a new pipe instead - * of the process stderr. Then `consume_output_fn` will be called repeatedly - * with output contained in the `output` arg. It will also be called with an - * empty `output` to allow for keepalives or similar operations if necessary. - * - * pp_cb is the callback cookie as passed into run_processes_parallel. - * No task cookie is provided because the callback receives collated output. - */ -typedef void (*consume_output_fn)(struct strbuf *output, void *pp_cb); - /** * This callback is called on every child process that finished processing. * @@ -499,18 +473,6 @@ struct run_process_parallel_opts */ start_failure_fn start_failure; - /* - * feed_pipe: see feed_pipe_fn() above. This can be NULL to omit any - * special handling. - */ - feed_pipe_fn feed_pipe; - - /* - * consume_output: see consume_output_fn() above. This can be NULL - * to omit any special handling. - */ - consume_output_fn consume_output; - /** * task_finished: See task_finished_fn() above. This can be * NULL to omit any special handling. diff --git a/sequencer.c b/sequencer.c index 71ed31c7740688..5476d39ba9b097 100644 --- a/sequencer.c +++ b/sequencer.c @@ -1292,40 +1292,32 @@ int update_head_with_reflog(const struct commit *old_head, return ret; } -static int pipe_from_strbuf(int hook_stdin_fd, void *pp_cb, void *pp_task_cb UNUSED) -{ - struct hook_cb_data *hook_cb = pp_cb; - struct strbuf *to_pipe = hook_cb->options->feed_pipe_ctx; - int ret; - - if (!to_pipe) - BUG("pipe_from_strbuf called without feed_pipe_ctx"); - - ret = write_in_full(hook_stdin_fd, to_pipe->buf, to_pipe->len); - if (ret < 0 && errno != EPIPE) - return ret; - - return 1; /* done writing */ -} - static int run_rewrite_hook(const struct object_id *oldoid, const struct object_id *newoid) { - struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT; + struct child_process proc = CHILD_PROCESS_INIT; int code; struct strbuf sb = STRBUF_INIT; + const char *hook_path = find_hook(the_repository, "post-rewrite"); - strbuf_addf(&sb, "%s %s\n", oid_to_hex(oldoid), oid_to_hex(newoid)); - - opt.feed_pipe_ctx = &sb; - opt.feed_pipe = pipe_from_strbuf; - - strvec_push(&opt.args, "amend"); + if (!hook_path) + return 0; - code = run_hooks_opt(the_repository, "post-rewrite", &opt); + strvec_pushl(&proc.args, hook_path, "amend", NULL); + proc.in = -1; + proc.stdout_to_stderr = 1; + proc.trace2_hook_name = "post-rewrite"; + code = start_command(&proc); + if (code) + return code; + strbuf_addf(&sb, "%s %s\n", oid_to_hex(oldoid), oid_to_hex(newoid)); + sigchain_push(SIGPIPE, SIG_IGN); + write_in_full(proc.in, sb.buf, sb.len); + close(proc.in); strbuf_release(&sb); - return code; + sigchain_pop(SIGPIPE); + return finish_command(&proc); } void commit_post_rewrite(struct repository *r, diff --git a/t/helper/test-run-command.c b/t/helper/test-run-command.c index 49eace8dce1961..3719f23cc2d02f 100644 --- a/t/helper/test-run-command.c +++ b/t/helper/test-run-command.c @@ -23,26 +23,19 @@ static int number_callbacks; static int parallel_next(struct child_process *cp, struct strbuf *err, void *cb, - void **task_cb) + void **task_cb UNUSED) { struct child_process *d = cb; if (number_callbacks >= 4) return 0; strvec_pushv(&cp->args, d->args.v); - cp->in = d->in; - cp->no_stdin = d->no_stdin; if (err) strbuf_addstr(err, "preloaded output of a child\n"); else fprintf(stderr, "preloaded output of a child\n"); number_callbacks++; - - /* test_stdin callback will use this to count remaining lines */ - *task_cb = xmalloc(sizeof(int)); - *(int*)(*task_cb) = 2; - return 1; } @@ -58,61 +51,18 @@ static int no_job(struct child_process *cp UNUSED, return 0; } -static void test_divert_output(struct strbuf *output, void *cb UNUSED) -{ - FILE *output_file; - - output_file = fopen("./output_file", "a"); - - strbuf_write(output, output_file); - fclose(output_file); -} - static int task_finished(int result UNUSED, struct strbuf *err, void *pp_cb UNUSED, - void *pp_task_cb) + void *pp_task_cb UNUSED) { if (err) strbuf_addstr(err, "asking for a quick stop\n"); else fprintf(stderr, "asking for a quick stop\n"); - - FREE_AND_NULL(pp_task_cb); - return 1; } -static int task_finished_quiet(int result UNUSED, - struct strbuf *err UNUSED, - void *pp_cb UNUSED, - void *pp_task_cb) -{ - FREE_AND_NULL(pp_task_cb); - return 0; -} - -static int test_stdin_pipe_feed(int hook_stdin_fd, void *cb UNUSED, void *task_cb) -{ - int *lines_remaining = task_cb; - - if (*lines_remaining) { - struct strbuf buf = STRBUF_INIT; - strbuf_addf(&buf, "sample stdin %d\n", --(*lines_remaining)); - if (write_in_full(hook_stdin_fd, buf.buf, buf.len) < 0) { - if (errno == EPIPE) { - /* child closed stdin, nothing more to do */ - strbuf_release(&buf); - return 1; - } - die_errno("write"); - } - strbuf_release(&buf); - } - - return !(*lines_remaining); -} - struct testsuite { struct string_list tests, failed; int next; @@ -207,8 +157,6 @@ static int testsuite(int argc, const char **argv) struct run_process_parallel_opts opts = { .get_next_task = next_test, .start_failure = test_failed, - .feed_pipe = test_stdin_pipe_feed, - .consume_output = test_divert_output, .task_finished = test_finished, .data = &suite, }; @@ -512,23 +460,12 @@ int cmd__run_command(int argc, const char **argv) if (!strcmp(argv[1], "run-command-parallel")) { opts.get_next_task = parallel_next; - opts.task_finished = task_finished_quiet; } else if (!strcmp(argv[1], "run-command-abort")) { opts.get_next_task = parallel_next; opts.task_finished = task_finished; } else if (!strcmp(argv[1], "run-command-no-jobs")) { opts.get_next_task = no_job; opts.task_finished = task_finished; - } else if (!strcmp(argv[1], "run-command-stdin")) { - proc.in = -1; - proc.no_stdin = 0; - opts.get_next_task = parallel_next; - opts.task_finished = task_finished_quiet; - opts.feed_pipe = test_stdin_pipe_feed; - } else if (!strcmp(argv[1], "run-command-divert-output")) { - opts.get_next_task = parallel_next; - opts.consume_output = test_divert_output; - opts.task_finished = task_finished_quiet; } else { ret = 1; fprintf(stderr, "check usage\n"); diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh index 74529e219e2aef..76d4936a879afd 100755 --- a/t/t0061-run-command.sh +++ b/t/t0061-run-command.sh @@ -164,44 +164,6 @@ test_expect_success 'run_command runs ungrouped in parallel with more tasks than test_line_count = 4 err ' -test_expect_success 'run_command can divert output' ' - test_when_finished rm output_file && - test-tool run-command run-command-divert-output 3 sh -c "printf \"%s\n%s\n\" Hello World" 2>actual && - test_must_be_empty actual && - test_cmp expect output_file -' - -test_expect_success 'run_command listens to stdin' ' - cat >expect <<-\EOF && - preloaded output of a child - listening for stdin: - sample stdin 1 - sample stdin 0 - preloaded output of a child - listening for stdin: - sample stdin 1 - sample stdin 0 - preloaded output of a child - listening for stdin: - sample stdin 1 - sample stdin 0 - preloaded output of a child - listening for stdin: - sample stdin 1 - sample stdin 0 - EOF - - write_script stdin-script <<-\EOF && - echo "listening for stdin:" - while read line - do - echo "$line" - done - EOF - test-tool run-command run-command-stdin 2 ./stdin-script 2>actual && - test_cmp expect actual -' - cat >expect <<-EOF preloaded output of a child asking for a quick stop diff --git a/transport.c b/transport.c index 6d0f02be5d7c00..c7f06a7382e605 100644 --- a/transport.c +++ b/transport.c @@ -1316,66 +1316,65 @@ static void die_with_unpushed_submodules(struct string_list *needs_pushing) die(_("Aborting.")); } -struct feed_pre_push_hook_data { +static int run_pre_push_hook(struct transport *transport, + struct ref *remote_refs) +{ + int ret = 0, x; + struct ref *r; + struct child_process proc = CHILD_PROCESS_INIT; struct strbuf buf; - const struct ref *refs; -}; + const char *hook_path = find_hook(the_repository, "pre-push"); -static int pre_push_hook_feed_stdin(int hook_stdin_fd, void *pp_cb UNUSED, void *pp_task_cb) -{ - struct feed_pre_push_hook_data *data = pp_task_cb; - const struct ref *r = data->refs; - int ret = 0; + if (!hook_path) + return 0; - if (!r) - return 1; /* no more refs */ + strvec_push(&proc.args, hook_path); + strvec_push(&proc.args, transport->remote->name); + strvec_push(&proc.args, transport->url); - data->refs = r->next; + proc.in = -1; + proc.trace2_hook_name = "pre-push"; - switch (r->status) { - case REF_STATUS_REJECT_NONFASTFORWARD: - case REF_STATUS_REJECT_REMOTE_UPDATED: - case REF_STATUS_REJECT_STALE: - case REF_STATUS_UPTODATE: - return 0; /* skip refs which won't be pushed */ - default: - break; + if (start_command(&proc)) { + finish_command(&proc); + return -1; } - if (!r->peer_ref) - return 0; - - strbuf_reset(&data->buf); - strbuf_addf(&data->buf, "%s %s %s %s\n", - r->peer_ref->name, oid_to_hex(&r->new_oid), - r->name, oid_to_hex(&r->old_oid)); + sigchain_push(SIGPIPE, SIG_IGN); - ret = write_in_full(hook_stdin_fd, data->buf.buf, data->buf.len); - if (ret < 0 && errno != EPIPE) - return ret; /* We do not mind if a hook does not read all refs. */ + strbuf_init(&buf, 256); - return 0; -} + for (r = remote_refs; r; r = r->next) { + if (!r->peer_ref) continue; + if (r->status == REF_STATUS_REJECT_NONFASTFORWARD) continue; + if (r->status == REF_STATUS_REJECT_STALE) continue; + if (r->status == REF_STATUS_REJECT_REMOTE_UPDATED) continue; + if (r->status == REF_STATUS_UPTODATE) continue; -static int run_pre_push_hook(struct transport *transport, - struct ref *remote_refs) -{ - struct run_hooks_opt opt = RUN_HOOKS_OPT_INIT; - struct feed_pre_push_hook_data data; - int ret = 0; + strbuf_reset(&buf); + strbuf_addf( &buf, "%s %s %s %s\n", + r->peer_ref->name, oid_to_hex(&r->new_oid), + r->name, oid_to_hex(&r->old_oid)); - strvec_push(&opt.args, transport->remote->name); - strvec_push(&opt.args, transport->url); + if (write_in_full(proc.in, buf.buf, buf.len) < 0) { + /* We do not mind if a hook does not read all refs. */ + if (errno != EPIPE) + ret = -1; + break; + } + } - strbuf_init(&data.buf, 0); - data.refs = remote_refs; + strbuf_release(&buf); - opt.feed_pipe = pre_push_hook_feed_stdin; - opt.feed_pipe_cb_data = &data; + x = close(proc.in); + if (!ret) + ret = x; - ret = run_hooks_opt(the_repository, "pre-push", &opt); + sigchain_pop(SIGPIPE); - strbuf_release(&data.buf); + x = finish_command(&proc); + if (!ret) + ret = x; return ret; } From b5c409c40f1595e3e590760c6f14a16b6683e22c Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Fri, 16 Jan 2026 11:34:19 -0800 Subject: [PATCH 17/17] Merge a handful more topics after -rc0 Signed-off-by: Junio C Hamano --- Documentation/RelNotes/2.53.0.adoc | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/Documentation/RelNotes/2.53.0.adoc b/Documentation/RelNotes/2.53.0.adoc index 96d871782a1bc9..039f613f12a821 100644 --- a/Documentation/RelNotes/2.53.0.adoc +++ b/Documentation/RelNotes/2.53.0.adoc @@ -249,6 +249,16 @@ Fixes since v2.52 * Mailmap update for Karsten (merge e97678c4ef js/mailmap-karsten-blees later to maint). + * Perf-test fixes. + (merge 79d301c767 jk/t-perf-fixes later to maint). + + * Fix for a performance regression in "git cat-file". + (merge 9e8b448dd8 jk/cat-file-avoid-bitmap-when-unneeded later to maint). + + * Update a FAQ entry on synching two separate repositories using the + "git stash export/import" recently introduced. + (merge 02fc44a989 bc/doc-stash-import-export later to maint). + * Other code cleanup, docfix, build fix, etc. (merge 46207a54cc qj/doc-http-bad-want-response later to maint). (merge df90eccd93 kh/doc-commit-extra-references later to maint). @@ -275,3 +285,6 @@ Fixes since v2.52 (merge 06188ea5f3 rs/parse-config-expiry-simplify later to maint). (merge 861dbb1586 dd/t5403-modernise later to maint). (merge acffc5e9e5 ja/doc-synopsis-style-more later to maint). + (merge 6c5c7e7071 ac/t1420-use-more-direct-check later to maint). + (merge 2ac93bfcbc ds/builtin-doc-update later to maint). + (merge 3f051fc9c9 kh/doc-patch-id later to maint).