From d1e6c61272737926cd049a6126431fe9c12e4a54 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Tue, 2 Jul 2024 13:51:43 -0700 Subject: [PATCH 001/297] checkout: special case error messages during noop switching "git checkout" ran with no branch and no pathspec behaves like switching the branch to the current branch (in other words, a no-op, except that it gives a side-effect "here are the modified paths" report). But unlike "git checkout HEAD" or "git checkout main" (when you are on the 'main' branch), the user is much less conscious that they are "switching" to the current branch. This twists end-user expectation in a strange way. There are options (like "--ours") that make sense only when we are checking out paths out of either the tree-ish or out of the index. So the error message the command below gives $ git checkout --ours fatal: '--ours/theirs' cannot be used with switching branches is technically correct, but because the end-user may not even be aware of the fact that the command they are issuing is about no-op branch switching [*], they may find the error confusing. Let's refactor the code to make it easier to special case the "no-op branch switching" situation, and then customize the exact error message for "--ours/--theirs". Since it is more likely that the end-user forgot to give pathspec that is required by the option, let's make it say $ git checkout --ours fatal: '--ours/theirs' needs the paths to check out instead. Among the other options that are incompatible with branch switching, there may be some that benefit by having messages tweaked when a no-op branch switching is done, but I'll leave them as #leftoverbits material. [Footnote] * Yes, the end-users are irrational. When they did not give "--ours", they take it granted that "git checkout" gives a short status, e.g.. $ git checkout M builtin/checkout.c M t/t7201-co.sh exactly as a branch switching command. Signed-off-by: Junio C Hamano --- builtin/checkout.c | 21 ++++++++++++++------- t/t7201-co.sh | 13 +++++++++++++ 2 files changed, 27 insertions(+), 7 deletions(-) diff --git a/builtin/checkout.c b/builtin/checkout.c index 3a7bfde1588016..2ec873a61249c5 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -1518,6 +1518,10 @@ static void die_if_some_operation_in_progress(void) static int checkout_branch(struct checkout_opts *opts, struct branch_info *new_branch_info) { + int noop_switch = (!new_branch_info->name && + !opts->new_branch && + !opts->force_detach); + if (opts->pathspec.nr) die(_("paths cannot be used with switching branches")); @@ -1529,9 +1533,14 @@ static int checkout_branch(struct checkout_opts *opts, die(_("'%s' cannot be used with switching branches"), "--[no]-overlay"); - if (opts->writeout_stage) - die(_("'%s' cannot be used with switching branches"), - "--ours/--theirs"); + if (opts->writeout_stage) { + const char *msg; + if (noop_switch) + msg = _("'%s' needs the paths to check out"); + else + msg = _("'%s' cannot be used with switching branches"); + die(msg, "--ours/--theirs"); + } if (opts->force && opts->merge) die(_("'%s' cannot be used with '%s'"), "-f", "-m"); @@ -1558,10 +1567,8 @@ static int checkout_branch(struct checkout_opts *opts, die(_("Cannot switch branch to a non-commit '%s'"), new_branch_info->name); - if (!opts->switch_branch_doing_nothing_is_ok && - !new_branch_info->name && - !opts->new_branch && - !opts->force_detach) + if (noop_switch && + !opts->switch_branch_doing_nothing_is_ok) die(_("missing branch or commit argument")); if (!opts->implicit_detach && diff --git a/t/t7201-co.sh b/t/t7201-co.sh index 10cc6c46051e95..42acf965c8feda 100755 --- a/t/t7201-co.sh +++ b/t/t7201-co.sh @@ -497,6 +497,19 @@ test_expect_success 'checkout unmerged stage' ' test ztheirside = "z$(cat file)" ' +test_expect_success 'checkout --ours is incompatible with switching' ' + test_must_fail git checkout --ours 2>error && + test_grep "needs the paths to check out" error && + + test_must_fail git checkout --ours HEAD 2>error && + test_grep "cannot be used with switching" error && + + test_must_fail git checkout --ours main 2>error && + test_grep "cannot be used with switching" error && + + git checkout --ours file +' + test_expect_success 'checkout path with --merge from tree-ish is a no-no' ' setup_conflicting_index && test_must_fail git checkout -m HEAD -- file From 67be8c4de5255efd95c48c9ddd6cd8537fbf03d1 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Mon, 15 Jul 2024 08:05:00 -0700 Subject: [PATCH 002/297] doc: note that AT&T ksh does not work with our test suite The scripted Porcelain commands do not allow use of "local" because it is not universally supported, but we use it liberally in our test scripts, which means some POSIX compliant shells (like "ksh93") can not be used to run our tests. Document the status quo, to help the next person who gets perplexed seeing our tests fail. Signed-off-by: Junio C Hamano --- Documentation/CodingGuidelines | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines index 1d92b2da03e8ca..94ca5cf7c064ed 100644 --- a/Documentation/CodingGuidelines +++ b/Documentation/CodingGuidelines @@ -185,8 +185,8 @@ For shell scripts specifically (not exhaustive): - Even though "local" is not part of POSIX, we make heavy use of it in our test suite. We do not use it in scripted Porcelains, and - hopefully nobody starts using "local" before they are reimplemented - in C ;-) + hopefully nobody starts using "local" before all shells that matter + support it (notably, ksh from AT&T Research does not support it yet). - Some versions of shell do not understand "export variable=value", so we write "variable=value" and then "export variable" on two From 8db8786fc2de36731738fd480d1df422c6e86579 Mon Sep 17 00:00:00 2001 From: Justin Tobler Date: Mon, 15 Jul 2024 13:37:39 -0500 Subject: [PATCH 003/297] doc: clarify post-receive hook behavior The `githooks` documentation mentions that the post-receive hook executes once after git-receive-pack(1) updates all references and that it also receives the same information as the pre-receive hook on standard input. This is misleading though because the hook only executes once if at least one of the attempted reference updates is successful. Also, while each line provided on standard input is in the same format as the pre-receive hook, the information received only includes the set of references that were successfully updated. Update the documentation to clarify these points and also provide a reference to the post-receive hook section of the `git-receive-pack` documentation which has additional information. Signed-off-by: Justin Tobler Signed-off-by: Junio C Hamano --- Documentation/githooks.txt | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt index 883982e7a05162..efd7b7407b1e39 100644 --- a/Documentation/githooks.txt +++ b/Documentation/githooks.txt @@ -415,13 +415,13 @@ post-receive This hook is invoked by linkgit:git-receive-pack[1] when it reacts to `git push` and updates reference(s) in its repository. -It executes on the remote repository once after all the refs have -been updated. +The hook executes on the remote repository once after all the proposed +ref updates are processed and if at least one ref is updated as the +result. -This hook executes once for the receive operation. It takes no -arguments, but gets the same information as the -<> -hook does on its standard input. +The hook takes no arguments. It receives one line on standard input for +each ref that is successfully updated following the same format as the +<> hook. This hook does not affect the outcome of `git receive-pack`, as it is called after the real work is done. @@ -448,6 +448,9 @@ environment variables will not be set. If the client selects to use push options, but doesn't transmit any, the count variable will be set to zero, `GIT_PUSH_OPTION_COUNT=0`. +See the "post-receive" section in linkgit:git-receive-pack[1] for +additional details. + [[post-update]] post-update ~~~~~~~~~~~ From 5133ead5286546d65b038b9381ef8b1cd4663c0b Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Tue, 16 Jul 2024 12:56:25 -0700 Subject: [PATCH 004/297] Revert "reflog expire: don't use lookup_commit_reference_gently()" During Git 2.35 timeframe, daf1d828 (reflog expire: don't use lookup_commit_reference_gently(), 2021-12-22) replaced a call to lookup_commit_reference_gently() with a call to lookup_commit(). What it failed to consider was that our refs do not necessarily point at commits (most notably, we have annotated and signed tags), and more importantly that lookup_commit() does not dereference a tag to return a commit; instead it returns NULL when a tag is given. Since the commit returned is used as a starting point for the reachability check, this ejected the commits that are reachable only by an annotated tag out of the set of reachable commits, breaking the computation to correctly implement the "--expire-unreachable" option. We also started giving an error message that the API function expected to be fed a commit object. This problem hasn't been reported or noticed for a long time, probably because the "refs/tags/" hierarchy by default is not covered by reflogs, as nobody usually moves tags. Revert the change to correctly find the commit pointed at by the ref to restore the previous behaviour, but do so only in a more modern codebase, as we had significant code churn since then and it is not grave enough to worry about for older maintenance tracks. Signed-off-by: Junio C Hamano --- reflog.c | 3 ++- t/t1410-reflog.sh | 8 ++++++++ 2 files changed, 10 insertions(+), 1 deletion(-) diff --git a/reflog.c b/reflog.c index 0a1bc35e8cd2c8..2e45a17e126b4a 100644 --- a/reflog.c +++ b/reflog.c @@ -330,7 +330,8 @@ void reflog_expiry_prepare(const char *refname, if (!cb->cmd.expire_unreachable || is_head(refname)) { cb->unreachable_expire_kind = UE_HEAD; } else { - commit = lookup_commit(the_repository, oid); + commit = lookup_commit_reference_gently(the_repository, + oid, 1); if (commit && is_null_oid(&commit->object.oid)) commit = NULL; cb->unreachable_expire_kind = commit ? UE_NORMAL : UE_ALWAYS; diff --git a/t/t1410-reflog.sh b/t/t1410-reflog.sh index d2f5f42e67440d..16816e82b2e328 100755 --- a/t/t1410-reflog.sh +++ b/t/t1410-reflog.sh @@ -146,6 +146,14 @@ test_expect_success rewind ' test_line_count = 5 output ' +test_expect_success 'reflog expire should not barf on an annotated tag' ' + test_when_finished "git tag -d v0.tag || :" && + git -c core.logAllRefUpdates=always \ + tag -a -m "tag name" v0.tag main && + git reflog expire --dry-run refs/tags/v0.tag 2>err && + test_grep ! "error: [Oo]bject .* not a commit" err +' + test_expect_success 'corrupt and check' ' corrupt $F && From c93dda2e785791c91fc2a9729b0bf03300fe43b3 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Fri, 19 Jul 2024 13:44:35 -0700 Subject: [PATCH 005/297] howto-maintain: cover a whole development cycle The "policy" part is more important than the "daily operation" part in that it establishes why certain maintainer tasks exist and are performed the way they are. The text briefly touches the role each integration branches play in the workflow, but does not give the whole picture of what happens in a single development cycle using these branches. Extend the description to describe a whole development cycle. Signed-off-by: Junio C Hamano --- Documentation/howto/maintain-git.txt | 52 ++++++++++++++++++++++------ 1 file changed, 42 insertions(+), 10 deletions(-) diff --git a/Documentation/howto/maintain-git.txt b/Documentation/howto/maintain-git.txt index 013014bbef67ac..7219faf09f7bbe 100644 --- a/Documentation/howto/maintain-git.txt +++ b/Documentation/howto/maintain-git.txt @@ -37,22 +37,20 @@ The Policy The policy on Integration is informally mentioned in "A Note from the maintainer" message, which is periodically posted to -this mailing list after each feature release is made. +the mailing list after each feature release is made: - Feature releases are numbered as vX.Y.0 and are meant to contain bugfixes and enhancements in any area, including functionality, performance and usability, without regression. - - One release cycle for a feature release is expected to last for - eight to ten weeks. - - - Maintenance releases are numbered as vX.Y.Z and are meant + - Maintenance releases are numbered as vX.Y.Z (0 < Z) and are meant to contain only bugfixes for the corresponding vX.Y.0 feature release and earlier maintenance releases vX.Y.W (W < Z). - - 'master' branch is used to prepare for the next feature + - The 'master' branch is used to prepare for the next feature release. In other words, at some point, the tip of 'master' - branch is tagged with vX.Y.0. + branch is tagged as vX.(Y+1).0, when vX.Y.0 is the latest + feature release. - 'maint' branch is used to prepare for the next maintenance release. After the feature release vX.Y.0 is made, the tip @@ -63,11 +61,13 @@ this mailing list after each feature release is made. - 'next' branch is used to publish changes (both enhancements and fixes) that (1) have worthwhile goal, (2) are in a fairly good shape suitable for everyday use, (3) but have not yet - demonstrated to be regression free. New changes are tested - in 'next' before merged to 'master'. + demonstrated to be regression free. Reviews from contributors on + the mailing list help to make the determination. After a topic + is merged to 'next', it is tested for at least 7 calendar days + before getting merged to 'master'. - 'seen' branch is used to publish other proposed changes that do - not yet pass the criteria set for 'next'. + not yet pass the criteria set for 'next' (see above). - The tips of 'master' and 'maint' branches will not be rewound to allow people to build their own customization on top of them. @@ -86,6 +86,38 @@ this mailing list after each feature release is made. users are encouraged to test it so that regressions and bugs are found before new topics are merged to 'master'. + - When a problem is found in a topic in 'next', the topic is marked + not to be merged to 'master'. Follow-up patches are discussed on + the mailing list and applied to the topic after being reviewed and + then the topic is merged (again) to 'next'. After going through + the usual testing in 'next', the entire (fixed) topic is merged + to 'master'. + + - One release cycle for a feature release is expected to last for + eight to ten weeks. A few "release candidate" releases are + expected to be tagged about a week apart before the final + release, and a "preview" release is tagged about a week before + the first release candidate gets tagged. + + - After the preview release is tagged, topics that were well + reviewed may be merged to 'master' before spending the usual 7 + calendar days in 'next', with the expectation that any bugs in + them can be caught and fixed in the release candidates before + the final release. + + - After the first release candidate is tagged, the contributors are + strongly encouraged to focus on finding and fixing new regressions + introduced during the cycle, over addressing old bugs and any new + features. Topics stop getting merged down from 'next' to 'master', + and new topics stop getting merged to 'next'. Unless they are fixes + to new regressions in the cycle, that is. + + - Soon after a feature release is made, the tip of 'maint' gets + fast-forwarded to point at the release. Topics that have been + kept in 'next' are merged down to 'master' and a new development + cycle starts. + + Note that before v1.9.0 release, the version numbers used to be structured slightly differently. vX.Y.Z were feature releases while vX.Y.Z.W were maintenance releases for vX.Y.Z. From bb0498b1bbfccaa0b7d31494eab49a5aab2cc718 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Fri, 19 Jul 2024 13:52:09 -0700 Subject: [PATCH 006/297] howto-maintain: update daily tasks Some "implementation details" of how I perform these integration tasks day to day have changed since the document was originally written. Update to reflect the way things are currently done. Signed-off-by: Junio C Hamano --- Documentation/howto/maintain-git.txt | 106 ++++++++++++++++----------- 1 file changed, 63 insertions(+), 43 deletions(-) diff --git a/Documentation/howto/maintain-git.txt b/Documentation/howto/maintain-git.txt index 7219faf09f7bbe..41f54050f84e0e 100644 --- a/Documentation/howto/maintain-git.txt +++ b/Documentation/howto/maintain-git.txt @@ -211,12 +211,12 @@ by doing the following: The initial round is done with: $ git checkout ai/topic ;# or "git checkout -b ai/topic master" - $ git am -sc3 mailbox + $ git am -sc3 --whitespace=warn mailbox and replacing an existing topic with subsequent round is done with: $ git checkout master...ai/topic ;# try to reapply to the same base - $ git am -sc3 mailbox + $ git am -sc3 --whitespace=warn mailbox to prepare the new round on a detached HEAD, and then @@ -241,39 +241,59 @@ by doing the following: (trivial typofixes etc. are often squashed directly into the patches that need fixing, without being applied as a separate "SQUASH???" commit), so that they can be removed easily as needed. + The expectation is that the original author will make corrections + in a reroll. + - By now, new topic branches are created and existing topic + branches are updated. The integration branches 'next', 'jch', + and 'seen' need to be updated to contain them. - - Merge maint to master as needed: + - If there are topics that have been merged to 'master' and should + be merged to 'maint', merge them to 'maint', and update the + release notes to the next maintenance release. - $ git checkout master - $ git merge maint - $ make test + - Review the latest issue of "What's cooking" again. Are topics + that have been sufficiently long in 'next' ready to be merged to + 'master'? Are topics we saw earlier and are in 'seen' now got + positive reviews and are ready to be merged to 'next'? - - Merge master to next as needed: + - If there are topics that have been cooking in 'next' long enough + and should be merged to 'master', merge them to 'master', and + update the release notes to the next feature release. - $ git checkout next - $ git merge master - $ make test + - If there were patches directly made on 'maint', merge 'maint' to + 'master'; make sure that the result is what you want. - - Review the last issue of "What's cooking" again and see if topics - that are ready to be merged to 'next' are still in good shape - (e.g. has there any new issue identified on the list with the - series?) + $ git checkout master + $ git merge -m "Sync with 'maint'" --no-log maint + $ git log -p --first-parent ORIG_HEAD.. + $ make test - - Prepare 'jch' branch, which is used to represent somewhere - between 'master' and 'seen' and often is slightly ahead of 'next'. + - Prepare to update the 'jch' branch, which is used to represent + somewhere between 'master' and 'seen' and often is slightly ahead + of 'next', and the 'seen' branch, which is used to hold the rest. $ Meta/Reintegrate master..jch >Meta/redo-jch.sh The result is a script that lists topics to be merged in order to - rebuild 'seen' as the input to Meta/Reintegrate script. Remove - later topics that should not be in 'jch' yet. Add a line that - consists of '### match next' before the name of the first topic - in the output that should be in 'jch' but not in 'next' yet. + rebuild the current 'jch'. Do the same for 'seen'. + + - Review the Meta/redo-jch.sh and Meta/redo-seen.sh scripts. The + former should have a line '### match next'---the idea is that + merging the topics listed before the line on top of 'master' + should result in a tree identical to that of 'next'. - - Now we are ready to start merging topics to 'next'. For each - branch whose tip is not merged to 'next', one of three things can - happen: + - As newly created topics are usually merged near the tip of + 'seen', add them to the end of the Meta/redo-seen.sh script. + Among the topics that were in 'seen', there may be ones that + are not quite ready for 'next' but are getting there. Move + them from Meta/redo-seen.sh to the end of Meta/redo-jch.sh. + The expectation is that you'd use 'jch' as your daily driver + as the first guinea pig, so you should choose carefully. + + - Now we are ready to start rebuilding 'jch' and merging topics to + 'next'. For each branch whose tip is not merged to 'next', one + of three things can happen: - The commits are all next-worthy; merge the topic to next; - The new parts are of mixed quality, but earlier ones are @@ -284,10 +304,12 @@ by doing the following: If a topic that was already in 'next' gained a patch, the script would list it as "ai/topic~1". To include the new patch to the updated 'next', drop the "~1" part; to keep it excluded, do not - touch the line. If a topic that was not in 'next' should be - merged to 'next', add it at the end of the list. Then: + touch the line. + + If a topic that was not in 'next' should be merged to 'next', add + it before the '### match next' line. Then: - $ git checkout -B jch master + $ git checkout --detach master $ sh Meta/redo-jch.sh -c1 to rebuild the 'jch' branch from scratch. "-c1" tells the script @@ -299,26 +321,29 @@ by doing the following: reference to the variable under its old name), in which case prepare an appropriate merge-fix first (see appendix), and rebuild the 'jch' branch from scratch, starting at the tip of - 'master'. + 'master', this time without using "-c1" to merge all topics. - Then do the same to 'next' + Then do the same to 'next'. $ git checkout next $ sh Meta/redo-jch.sh -c1 -e The "-e" option allows the merge message that comes from the history of the topic and the comments in the "What's cooking" to - be edited. The resulting tree should match 'jch' as the same set - of topics are merged on 'master'; otherwise there is a mismerge. - Investigate why and do not proceed until the mismerge is found - and rectified. + be edited. The resulting tree should match 'jch^{/^### match next'}' + as the same set of topics are merged on 'master'; otherwise there + is a mismerge. Investigate why and do not proceed until the mismerge + is found and rectified. - $ git diff jch next + If 'master' was updated before you started redoing 'next', then - Then build the rest of 'jch': + $ git diff 'jch^{/^### match next}' next - $ git checkout jch - $ sh Meta/redo-jch.sh + would show differences that went into 'master' (which 'jch' has, + but 'next' does not yet---often it is updates to the release + notes). Merge 'master' back to 'next' if that is the case. + + $ git merge -m "Sync with 'master'" --no-log master When all is well, clean up the redo-jch.sh script with @@ -328,12 +353,7 @@ by doing the following: merged to 'master'. This may lose '### match next' marker; add it again to the appropriate place when it happens. - - Rebuild 'seen'. - - $ Meta/Reintegrate jch..seen >Meta/redo-seen.sh - - Edit the result by adding new topics that are not still in 'seen' - in the script. Then + - Rebuild 'seen' on top of 'jch'. $ git checkout -B seen jch $ sh Meta/redo-seen.sh @@ -344,7 +364,7 @@ by doing the following: Double check by running - $ git branch --no-merged seen + $ git branch --no-merged seen '??/*' to see there is no unexpected leftover topics. From 39bdd84eaf646aa73a3709b0eb8be3f47378708f Mon Sep 17 00:00:00 2001 From: Phillip Wood Date: Sat, 20 Jul 2024 16:01:59 +0000 Subject: [PATCH 007/297] add-patch: handle splitting hunks with diff.suppressBlankEmpty When "add -p" parses diffs, it looks for context lines starting with a single space. But when diff.suppressBlankEmpty is in effect, an empty context line will omit the space, giving us a true empty line. This confuses the parser, which is unable to split based on such a line. It's tempting to say that we should just make sure that we generate a diff without that option. However, although we do not parse hunks that the user has manually edited with parse_diff() we do allow the user to split such hunks. As POSIX calls the decision of whether to print the space here "implementation-defined" we need to handle edited hunks where empty context lines omit the space. So let's handle both cases: a context line either starts with a space or consists of a totally empty line by normalizing the first character to a space when we parse them. Normalizing the first character rather than changing the code to check for a space or newline will hopefully future proof against introducing similar bugs if the code is changed. Reported-by: Ilya Tumaykin Helped-by: Jeff King Signed-off-by: Phillip Wood Signed-off-by: Junio C Hamano --- add-patch.c | 19 +++++++++++++------ t/t3701-add-interactive.sh | 19 +++++++++++++++++++ 2 files changed, 32 insertions(+), 6 deletions(-) diff --git a/add-patch.c b/add-patch.c index 79eda168ebb7cd..1c28a7fa1f706d 100644 --- a/add-patch.c +++ b/add-patch.c @@ -400,6 +400,12 @@ static void complete_file(char marker, struct hunk *hunk) hunk->splittable_into++; } +/* Empty context lines may omit the leading ' ' */ +static int normalize_marker(const char *p) +{ + return p[0] == '\n' || (p[0] == '\r' && p[1] == '\n') ? ' ' : p[0]; +} + static int parse_diff(struct add_p_state *s, const struct pathspec *ps) { struct strvec args = STRVEC_INIT; @@ -485,6 +491,7 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps) while (p != pend) { char *eol = memchr(p, '\n', pend - p); const char *deleted = NULL, *mode_change = NULL; + char ch = normalize_marker(p); if (!eol) eol = pend; @@ -532,7 +539,7 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps) * Start counting into how many hunks this one can be * split */ - marker = *p; + marker = ch; } else if (hunk == &file_diff->head && starts_with(p, "new file")) { file_diff->added = 1; @@ -586,10 +593,10 @@ static int parse_diff(struct add_p_state *s, const struct pathspec *ps) (int)(eol - (plain->buf + file_diff->head.start)), plain->buf + file_diff->head.start); - if ((marker == '-' || marker == '+') && *p == ' ') + if ((marker == '-' || marker == '+') && ch == ' ') hunk->splittable_into++; - if (marker && *p != '\\') - marker = *p; + if (marker && ch != '\\') + marker = ch; p = eol == pend ? pend : eol + 1; hunk->end = p - plain->buf; @@ -813,7 +820,7 @@ static int merge_hunks(struct add_p_state *s, struct file_diff *file_diff, (int)(hunk->end - hunk->start), plain + hunk->start); - if (plain[overlap_end] != ' ') + if (normalize_marker(&plain[overlap_end]) != ' ') return error(_("expected context line " "#%d in\n%.*s"), (int)(j + 1), @@ -953,7 +960,7 @@ static int split_hunk(struct add_p_state *s, struct file_diff *file_diff, context_line_count = 0; while (splittable_into > 1) { - ch = s->plain.buf[current]; + ch = normalize_marker(&s->plain.buf[current]); if (!ch) BUG("buffer overrun while splitting hunks"); diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh index 0b5339ac6ca824..4c3e2bc82f9724 100755 --- a/t/t3701-add-interactive.sh +++ b/t/t3701-add-interactive.sh @@ -1130,4 +1130,23 @@ test_expect_success 'reset -p with unmerged files' ' test_must_be_empty staged ' +test_expect_success 'hunk splitting works with diff.suppressBlankEmpty' ' + test_config diff.suppressBlankEmpty true && + write_script fake-editor.sh <<-\EOF && + tr F G <"$1" >"$1.tmp" && + mv "$1.tmp" "$1" + EOF + + test_write_lines a b "" c d "" e f "" >file && + git add file && + test_write_lines A b "" c D "" e F "" >file && + ( + test_set_editor "$(pwd)/fake-editor.sh" && + test_write_lines s n y e q | git add -p file + ) && + git cat-file blob :file >actual && + test_write_lines a b "" c D "" e G "" >expect && + test_cmp expect actual +' + test_done From 60cf761ed14298d618597e87e50f25bb61171e84 Mon Sep 17 00:00:00 2001 From: Phillip Wood Date: Sat, 20 Jul 2024 16:02:00 +0000 Subject: [PATCH 008/297] add-patch: use normalize_marker() when recounting edited hunk After the user has edited a hunk the number of lines in the pre- and post- image lines is recounted the hunk header can be updated before passing the hunk to "git apply". The recounting code correctly handles empty context lines where the leading ' ' is omitted by treating '\n' and '\r' as context lines. Update this code to use normalize_marker() so that the handling of empty context lines is consistent with the rest of the hunk parsing code. There is a small change in behavior as normalize_marker() only treats "\r\n" as an empty context line rather than any line starting with '\r'. This should not matter in practice as Macs have used Unix line endings since MacOs 10 was released in 2001 and if it transpires that someone is still using an earlier version of MacOs where lines end with '\r' then we will need to change the handling of '\r' in normalize_marker() anyway. Signed-off-by: Phillip Wood Signed-off-by: Junio C Hamano --- add-patch.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/add-patch.c b/add-patch.c index 1c28a7fa1f706d..537cf400e6e9d5 100644 --- a/add-patch.c +++ b/add-patch.c @@ -1178,14 +1178,14 @@ static ssize_t recount_edited_hunk(struct add_p_state *s, struct hunk *hunk, header->old_count = header->new_count = 0; for (i = hunk->start; i < hunk->end; ) { - switch (s->plain.buf[i]) { + switch(normalize_marker(&s->plain.buf[i])) { case '-': header->old_count++; break; case '+': header->new_count++; break; - case ' ': case '\r': case '\n': + case ' ': header->old_count++; header->new_count++; break; From 1c473dd6af11591e96425386db54aa7838eabefb Mon Sep 17 00:00:00 2001 From: Tomas Nordin Date: Mon, 22 Jul 2024 22:53:02 +0000 Subject: [PATCH 009/297] doc: remove dangling closing parenthesis The second line of the synopsis, starting with [--dry-run] has a dangling closing paren in the second optional group. Probably added by mistake, so remove it. Signed-off-by: Tomas Nordin Signed-off-by: Junio C Hamano --- Documentation/git-commit.txt | 2 +- builtin/commit.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/git-commit.txt b/Documentation/git-commit.txt index 89ecfc63a8d07f..c822113c111ded 100644 --- a/Documentation/git-commit.txt +++ b/Documentation/git-commit.txt @@ -9,7 +9,7 @@ SYNOPSIS -------- [verse] 'git commit' [-a | --interactive | --patch] [-s] [-v] [-u] [--amend] - [--dry-run] [(-c | -C | --squash) | --fixup [(amend|reword):])] + [--dry-run] [(-c | -C | --squash) | --fixup [(amend|reword):]] [-F | -m ] [--reset-author] [--allow-empty] [--allow-empty-message] [--no-verify] [-e] [--author=] [--date=] [--cleanup=] [--[no-]status] diff --git a/builtin/commit.c b/builtin/commit.c index 6e1484446b0cc2..7f9dd45d05221d 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -41,7 +41,7 @@ static const char * const builtin_commit_usage[] = { N_("git commit [-a | --interactive | --patch] [-s] [-v] [-u] [--amend]\n" - " [--dry-run] [(-c | -C | --squash) | --fixup [(amend|reword):])]\n" + " [--dry-run] [(-c | -C | --squash) | --fixup [(amend|reword):]]\n" " [-F | -m ] [--reset-author] [--allow-empty]\n" " [--allow-empty-message] [--no-verify] [-e] [--author=]\n" " [--date=] [--cleanup=] [--[no-]status]\n" From 728a1962cdaf7df6a1f261dcf3167842cd2db170 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Mon, 22 Jul 2024 17:04:54 -0700 Subject: [PATCH 010/297] CodingGuidelines: document a shell that "fails" "VAR=VAL shell_func" Over the years, we accumulated the community wisdom to avoid the common "one-short export" construct for shell functions, but seem to have lost on which exact platform it is known to fail. Now during an investigation on a breakage for a recent topic, we found one example of failing shell. Let's document that. This does *not* mean that we can freely start using the construct once Ubuntu 20.04 is retired. But it does mean that we cannot use the construct until Ubuntu 20.04 is fully retired from the machines that matter. Moreover, posix explicitly says that the behaviour for the construct is unspecified. Helped-by: Kyle Lippincott Signed-off-by: Junio C Hamano --- Documentation/CodingGuidelines | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+) diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines index 1d92b2da03e8ca..52afb2725f240d 100644 --- a/Documentation/CodingGuidelines +++ b/Documentation/CodingGuidelines @@ -204,6 +204,33 @@ For shell scripts specifically (not exhaustive): local variable="$value" local variable="$(command args)" + - The common construct + + VAR=VAL command args + + to temporarily set and export environment variable VAR only while + "command args" is running is handy, but this triggers an + unspecified behaviour according to POSIX when used for a command + that is not an external command (like shell functions). Indeed, + dash 0.5.10.2-6 on Ubuntu 20.04, /bin/sh on FreeBSD 13, and AT&T + ksh all make a temporary assignment without exporting the variable, + in such a case. As it does not work portably across shells, do not + use this syntax for shell functions. A common workaround is to do + an explicit export in a subshell, like so: + + (incorrect) + VAR=VAL func args + + (correct) + ( + VAR=VAL && + export VAR && + func args + ) + + but be careful that the effect "func" makes to the variables in the + current shell will be lost across the subshell boundary. + - Use octal escape sequences (e.g. "\302\242"), not hexadecimal (e.g. "\xc2\xa2") in printf format strings, since hexadecimal escape sequences are not portable. From 70058db385fdf2191918c279ec081d7fee39d716 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Thu, 25 Jul 2024 10:27:29 -0700 Subject: [PATCH 011/297] doc: difference in location to apply is "offset", not "fuzz" The documentation to "git rebase" says that the line numbers (in the rebased change) may not exactly be the same as the line numbers the change gets replayed on top of the new base, but uses a wrong noun "fuzz". It should have said "offset". They are both terms of art. "fuzz" is about context lines not exactly matching. "offset" is about the difference in the location that a change was taken from the original and the change gets replayed on the target. "offset" is often inevitable and part of normal life. "fuzz" on the other hand is often a sign of trouble (and indeed "Git" refuses to apply a change with "fuzz", except there are options to be fuzzy about whitespaces). Signed-off-by: Junio C Hamano --- Documentation/git-rebase.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt index 9a295bcee45f4f..166dcbb70653ff 100644 --- a/Documentation/git-rebase.txt +++ b/Documentation/git-rebase.txt @@ -715,7 +715,7 @@ The 'apply' backend works by creating a sequence of patches (by calling `format-patch` internally), and then applying the patches in sequence (calling `am` internally). Patches are composed of multiple hunks, each with line numbers, a context region, and the actual changes. The -line numbers have to be taken with some fuzz, since the other side +line numbers have to be taken with some offset, since the other side will likely have inserted or deleted lines earlier in the file. The context region is meant to help find how to adjust the line numbers in order to apply the changes to the right lines. However, if multiple From d98d9c77e5d9ac0b0663069e05a512037b9279cf Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Thu, 25 Jul 2024 16:12:41 -0700 Subject: [PATCH 012/297] mailmap: plug memory leak in read_mailmap_blob() When a named object to read mailmap from is not a blob, the code correctly errors out, but it forgot to free the object data before doing so. Signed-off-by: Junio C Hamano --- mailmap.c | 4 +++- t/t4203-mailmap.sh | 1 + 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/mailmap.c b/mailmap.c index 2d0212f4441d5d..2acf97f3076070 100644 --- a/mailmap.c +++ b/mailmap.c @@ -201,8 +201,10 @@ static int read_mailmap_blob(struct string_list *map, const char *name) buf = repo_read_object_file(the_repository, &oid, &type, &size); if (!buf) return error("unable to read mailmap object at %s", name); - if (type != OBJ_BLOB) + if (type != OBJ_BLOB) { + free(buf); return error("mailmap is not a blob: %s", name); + } read_mailmap_string(map, buf); diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh index 8a88dd7900ca8a..79e5f42760d919 100755 --- a/t/t4203-mailmap.sh +++ b/t/t4203-mailmap.sh @@ -5,6 +5,7 @@ test_description='.mailmap configurations' GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME +TEST_PASSES_SANITIZE_LEAK=true . ./test-lib.sh test_expect_success 'setup commits and contacts file' ' From c3d034df1606beca8a4ba3bc8a8266f4c711f23c Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Thu, 25 Jul 2024 16:07:28 -0700 Subject: [PATCH 013/297] csum-file: introduce discard_hashfile() The hashfile API is used to write out a "hashfile", which has a final checksum (typically SHA-1) at the end. An in-core hashfile structure has up to two file descriptors and a few buffers that can only be freed by calling a helper function that is private to the csum-file implementation. The usual flow of a user of the API is to first open a file descriptor for writing, obtain a hashfile associated with that write file descriptor by calling either hashfd() or hashfd_check(), call hashwrite() number of times to write data to the file, and then call finalize_hashfile(), which appends th checksum to the end of the file, closes file descriptors and releases associated buffers. But what if a caller finds some error after calling hashfd() to start the process and/or hashwrite() to send some data to the file, and wants to abort the operation? The underlying file descriptor is often managed by the tempfile API, so aborting will clean the file out of the filesystem, but the resources associated with the in-core hashfile structure is lost. Introduce discard_hashfile() API function to allow them to release the resources held by a hashfile structure the callers want to dispose of, and use that in read-cache.c:do_write_index(), which is a central place that writes the index file. Mark t2107 as leak-free, as this leak in "update-index --cacheinfo" test that deliberately makes it fail is now plugged. Signed-off-by: Junio C Hamano --- csum-file.c | 9 ++++ csum-file.h | 1 + read-cache.c | 99 +++++++++++++++++++++++++++-------- t/t2107-update-index-basic.sh | 1 + 4 files changed, 89 insertions(+), 21 deletions(-) diff --git a/csum-file.c b/csum-file.c index 8abbf013250eef..2131ee6b12e2a6 100644 --- a/csum-file.c +++ b/csum-file.c @@ -102,6 +102,15 @@ int finalize_hashfile(struct hashfile *f, unsigned char *result, return fd; } +void discard_hashfile(struct hashfile *f) +{ + if (0 <= f->check_fd) + close(f->check_fd); + if (0 <= f->fd) + close(f->fd); + free_hashfile(f); +} + void hashwrite(struct hashfile *f, const void *buf, unsigned int count) { while (count) { diff --git a/csum-file.h b/csum-file.h index 566e05cbd25a04..36c7c5585f5672 100644 --- a/csum-file.h +++ b/csum-file.h @@ -47,6 +47,7 @@ struct hashfile *hashfd(int fd, const char *name); struct hashfile *hashfd_check(const char *name); struct hashfile *hashfd_throughput(int fd, const char *name, struct progress *tp); int finalize_hashfile(struct hashfile *, unsigned char *, enum fsync_component, unsigned int); +void discard_hashfile(struct hashfile *); void hashwrite(struct hashfile *, const void *, unsigned int); void hashflush(struct hashfile *f); void crc32_begin(struct hashfile *); diff --git a/read-cache.c b/read-cache.c index 48bf24f87c00c1..1f67bb755b1cd1 100644 --- a/read-cache.c +++ b/read-cache.c @@ -2963,7 +2963,7 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile, if (err) { free(ieot); - return err; + goto cleanup; } offset = hashfile_total(f); @@ -2992,8 +2992,14 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile, hashwrite(f, sb.buf, sb.len); strbuf_release(&sb); free(ieot); - if (err) - return -1; + /* + * NEEDSWORK: write_index_ext_header() never returns a failure, + * and this part may want to be simplified. + */ + if (err) { + err = -1; + goto cleanup; + } } if (write_extensions & WRITE_SPLIT_INDEX_EXTENSION && @@ -3008,8 +3014,14 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile, sb.len) < 0; hashwrite(f, sb.buf, sb.len); strbuf_release(&sb); - if (err) - return -1; + /* + * NEEDSWORK: write_link_extension() never returns a failure, + * and this part may want to be simplified. + */ + if (err) { + err = -1; + goto cleanup; + } } if (write_extensions & WRITE_CACHE_TREE_EXTENSION && !drop_cache_tree && istate->cache_tree) { @@ -3019,8 +3031,14 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile, err = write_index_ext_header(f, eoie_c, CACHE_EXT_TREE, sb.len) < 0; hashwrite(f, sb.buf, sb.len); strbuf_release(&sb); - if (err) - return -1; + /* + * NEEDSWORK: write_index_ext_header() never returns a failure, + * and this part may want to be simplified. + */ + if (err) { + err = -1; + goto cleanup; + } } if (write_extensions & WRITE_RESOLVE_UNDO_EXTENSION && istate->resolve_undo) { @@ -3031,8 +3049,14 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile, sb.len) < 0; hashwrite(f, sb.buf, sb.len); strbuf_release(&sb); - if (err) - return -1; + /* + * NEEDSWORK: write_index_ext_header() never returns a failure, + * and this part may want to be simplified. + */ + if (err) { + err = -1; + goto cleanup; + } } if (write_extensions & WRITE_UNTRACKED_CACHE_EXTENSION && istate->untracked) { @@ -3043,8 +3067,14 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile, sb.len) < 0; hashwrite(f, sb.buf, sb.len); strbuf_release(&sb); - if (err) - return -1; + /* + * NEEDSWORK: write_index_ext_header() never returns a failure, + * and this part may want to be simplified. + */ + if (err) { + err = -1; + goto cleanup; + } } if (write_extensions & WRITE_FSMONITOR_EXTENSION && istate->fsmonitor_last_update) { @@ -3054,12 +3084,25 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile, err = write_index_ext_header(f, eoie_c, CACHE_EXT_FSMONITOR, sb.len) < 0; hashwrite(f, sb.buf, sb.len); strbuf_release(&sb); - if (err) - return -1; + /* + * NEEDSWORK: write_index_ext_header() never returns a failure, + * and this part may want to be simplified. + */ + if (err) { + err = -1; + goto cleanup; + } } if (istate->sparse_index) { - if (write_index_ext_header(f, eoie_c, CACHE_EXT_SPARSE_DIRECTORIES, 0) < 0) - return -1; + err = write_index_ext_header(f, eoie_c, CACHE_EXT_SPARSE_DIRECTORIES, 0); + /* + * NEEDSWORK: write_index_ext_header() never returns a failure, + * and this part may want to be simplified. + */ + if (err) { + err = -1; + goto cleanup; + } } /* @@ -3075,8 +3118,14 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile, err = write_index_ext_header(f, NULL, CACHE_EXT_ENDOFINDEXENTRIES, sb.len) < 0; hashwrite(f, sb.buf, sb.len); strbuf_release(&sb); - if (err) - return -1; + /* + * NEEDSWORK: write_index_ext_header() never returns a failure, + * and this part may want to be simplified. + */ + if (err) { + err = -1; + goto cleanup; + } } csum_fsync_flag = 0; @@ -3085,13 +3134,16 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile, finalize_hashfile(f, istate->oid.hash, FSYNC_COMPONENT_INDEX, CSUM_HASH_IN_STREAM | csum_fsync_flag); + f = NULL; if (close_tempfile_gently(tempfile)) { - error(_("could not close '%s'"), get_tempfile_path(tempfile)); - return -1; + err = error(_("could not close '%s'"), get_tempfile_path(tempfile)); + goto cleanup; + } + if (stat(get_tempfile_path(tempfile), &st)) { + err = error_errno(_("could not stat '%s'"), get_tempfile_path(tempfile)); + goto cleanup; } - if (stat(get_tempfile_path(tempfile), &st)) - return -1; istate->timestamp.sec = (unsigned int)st.st_mtime; istate->timestamp.nsec = ST_MTIME_NSEC(st); trace_performance_since(start, "write index, changed mask = %x", istate->cache_changed); @@ -3106,6 +3158,11 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile, istate->cache_nr); return 0; + +cleanup: + if (f) + discard_hashfile(f); + return err; } void set_alternate_index_output(const char *name) diff --git a/t/t2107-update-index-basic.sh b/t/t2107-update-index-basic.sh index cc72ead79f3978..f0eab13f96a6b9 100755 --- a/t/t2107-update-index-basic.sh +++ b/t/t2107-update-index-basic.sh @@ -5,6 +5,7 @@ test_description='basic update-index tests Tests for command-line parsing and basic operation. ' +TEST_PASSES_SANITIZE_LEAK=true . ./test-lib.sh test_expect_success 'update-index --nonsense fails' ' From c1997074966a9eee7354bd3310048660a6287570 Mon Sep 17 00:00:00 2001 From: Jayson Rhynas Date: Fri, 26 Jul 2024 11:41:40 -0400 Subject: [PATCH 014/297] doc: fix hex code escapes in git-ls-files The --format option on the git-ls-files man page states that `%xx` interpolates to the character with hex code `xx`. This mirrors the documentation and behavior of `git for-each-ref --format=...`. However, in reality it requires the character with code `XX` to be specified as `%xXX`, mirroring the behaviour of `git log --format`. Signed-off-by: Jayson Rhynas Signed-off-by: Junio C Hamano --- Documentation/git-ls-files.txt | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt index d08c7da8f495b3..58c529afbee4c5 100644 --- a/Documentation/git-ls-files.txt +++ b/Documentation/git-ls-files.txt @@ -219,9 +219,9 @@ followed by the ("attr/"). --format=:: A string that interpolates `%(fieldname)` from the result being shown. - It also interpolates `%%` to `%`, and `%xx` where `xx` are hex digits - interpolates to character with hex code `xx`; for example `%00` - interpolates to `\0` (NUL), `%09` to `\t` (TAB) and %0a to `\n` (LF). + It also interpolates `%%` to `%`, and `%xXX` where `XX` are hex digits + interpolates to character with hex code `XX`; for example `%x00` + interpolates to `\0` (NUL), `%x09` to `\t` (TAB) and %x0a to `\n` (LF). --format cannot be combined with `-s`, `-o`, `-k`, `-t`, `--resolve-undo` and `--eol`. \--:: From 6e71d6ac7cb575d04c452762662b00a89cd0b0db Mon Sep 17 00:00:00 2001 From: Kousik Sanagavarapu Date: Mon, 29 Jul 2024 10:02:48 +0530 Subject: [PATCH 015/297] unit-tests/test-lib: fix typo in check_pointer_eq() description The comment surrounding check_pointer_eq() should explain about what this function does instead of explaining check_int(). Correct this. Signed-off-by: Kousik Sanagavarapu Signed-off-by: Junio C Hamano --- t/unit-tests/test-lib.h | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/t/unit-tests/test-lib.h b/t/unit-tests/test-lib.h index 2de6d715d5f4c3..c59f646fd9f485 100644 --- a/t/unit-tests/test-lib.h +++ b/t/unit-tests/test-lib.h @@ -76,8 +76,9 @@ int test_assert(const char *location, const char *check, int ok); int check_bool_loc(const char *loc, const char *check, int ok); /* - * Compare two integers. Prints a message with the two values if the - * comparison fails. NB this is not thread safe. + * Compare the equality of two pointers of same type. Prints a message + * with the two values if the equality fails. NB this is not thread + * safe. */ #define check_pointer_eq(a, b) \ (test__tmp[0].p = (a), test__tmp[1].p = (b), \ From 8b426c84f376657e1e10839f3f5dafd9ba99593d Mon Sep 17 00:00:00 2001 From: David Disseldorp Date: Mon, 29 Jul 2024 17:14:00 +0200 Subject: [PATCH 016/297] notes: do not trigger editor when adding an empty note With "git notes add -C $blob", the given blob contents are to be made into a note without involving an editor. But when "--allow-empty" is given, the editor is invoked, which can cause problems for non-interactive callers[1]. This behaviour started with 90bc19b3ae (notes.c: introduce '--separator=' option, 2023-05-27), which changed editor invocation logic to check for a zero length note_data buffer. Restore the original behaviour of "git note" that takes the contents given via the "-m", "-C", "-F" options without invoking an editor, by checking for any prior parameter callbacks, indicated by a non-zero note_data.msg_nr. Remove the now-unneeded note_data.given flag. Add a test for this regression by checking whether GIT_EDITOR is invoked alongside "git notes add -C $empty_blob --allow-empty" [1] https://github.com/ddiss/icyci/issues/12 Signed-off-by: David Disseldorp [jc: enhanced the test with -m/-F options] Signed-off-by: Junio C Hamano --- builtin/notes.c | 22 ++++++++++------------ t/t3301-notes.sh | 10 ++++++++++ 2 files changed, 20 insertions(+), 12 deletions(-) diff --git a/builtin/notes.c b/builtin/notes.c index d9c356e3543fec..4cc5bfedc3c6dc 100644 --- a/builtin/notes.c +++ b/builtin/notes.c @@ -114,7 +114,6 @@ struct note_msg { }; struct note_data { - int given; int use_editor; int stripspace; char *edit_path; @@ -193,7 +192,7 @@ static void write_commented_object(int fd, const struct object_id *object) static void prepare_note_data(const struct object_id *object, struct note_data *d, const struct object_id *old_note) { - if (d->use_editor || !d->given) { + if (d->use_editor || !d->msg_nr) { int fd; struct strbuf buf = STRBUF_INIT; @@ -201,7 +200,7 @@ static void prepare_note_data(const struct object_id *object, struct note_data * d->edit_path = git_pathdup("NOTES_EDITMSG"); fd = xopen(d->edit_path, O_CREAT | O_TRUNC | O_WRONLY, 0600); - if (d->given) + if (d->msg_nr) write_or_die(fd, d->buf.buf, d->buf.len); else if (old_note) copy_obj_to_fd(fd, old_note); @@ -515,7 +514,6 @@ static int add(int argc, const char **argv, const char *prefix) if (d.msg_nr) concat_messages(&d); - d.given = !!d.buf.len; object_ref = argc > 1 ? argv[1] : "HEAD"; @@ -528,7 +526,7 @@ static int add(int argc, const char **argv, const char *prefix) if (note) { if (!force) { free_notes(t); - if (d.given) { + if (d.msg_nr) { free_note_data(&d); return error(_("Cannot add notes. " "Found existing notes for object %s. " @@ -690,14 +688,14 @@ static int append_edit(int argc, const char **argv, const char *prefix) usage_with_options(usage, options); } - if (d.msg_nr) + if (d.msg_nr) { concat_messages(&d); - d.given = !!d.buf.len; - - if (d.given && edit) - fprintf(stderr, _("The -m/-F/-c/-C options have been deprecated " - "for the 'edit' subcommand.\n" - "Please use 'git notes add -f -m/-F/-c/-C' instead.\n")); + if (edit) + fprintf(stderr, _("The -m/-F/-c/-C options have been " + "deprecated for the 'edit' subcommand.\n" + "Please use 'git notes add -f -m/-F/-c/-C' " + "instead.\n")); + } object_ref = 1 < argc ? argv[1] : "HEAD"; diff --git a/t/t3301-notes.sh b/t/t3301-notes.sh index 536bd11ff4769f..99137fb235731b 100755 --- a/t/t3301-notes.sh +++ b/t/t3301-notes.sh @@ -1557,4 +1557,14 @@ test_expect_success 'empty notes are displayed by git log' ' test_cmp expect actual ' +test_expect_success 'empty notes do not invoke the editor' ' + test_commit 18th && + GIT_EDITOR="false" git notes add -C "$empty_blob" --allow-empty && + git notes remove HEAD && + GIT_EDITOR="false" git notes add -m "" --allow-empty && + git notes remove HEAD && + GIT_EDITOR="false" git notes add -F /dev/null --allow-empty && + git notes remove HEAD +' + test_done From 4210ea6f0f2ee95370694e6a3d4fbb8ccf8c4847 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Mon, 29 Jul 2024 18:17:34 -0700 Subject: [PATCH 017/297] t4204: patch-id supports various input format "git patch-id" was first developed to read from "git diff-tree --stdin -p" output. Later it was enhanced to read from "git diff-tree --stdin -p -v", which was the downstream of an early imitation of "git log" ("git rev-list" run in the upstream of a pipe to feed the "diff-tree"). These days, we also read from "git format-patch". Their output begins slightly differently, but the patch-id computed over them for the same commit should be the same. Ensure that we won't accidentally break this expectation. Signed-off-by: Junio C Hamano --- t/t4204-patch-id.sh | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/t/t4204-patch-id.sh b/t/t4204-patch-id.sh index a7fa94ce0a212e..1627fdda1b266f 100755 --- a/t/t4204-patch-id.sh +++ b/t/t4204-patch-id.sh @@ -114,6 +114,29 @@ test_expect_success 'patch-id supports git-format-patch output' ' test "$2" = $(git rev-parse HEAD) ' +test_expect_success 'patch-id computes the same for various formats' ' + # This test happens to consider "git log -p -1" output + # the canonical input format, so use it as the norm. + git log -1 -p same >log-p.output && + git patch-id expect && + + # format-patch begins with "From " + git format-patch -1 --stdout same >format-patch.output && + git patch-id actual && + test_cmp actual expect && + + # "diff-tree --stdin -p" begins with "" + same=$(git rev-parse same) && + echo $same | git diff-tree --stdin -p >diff-tree.output && + git patch-id actual && + test_cmp actual expect && + + # "diff-tree --stdin -v -p" begins with "commit " + echo $same | git diff-tree --stdin -p -v >diff-tree-v.output && + git patch-id actual && + test_cmp actual expect +' + test_expect_success 'whitespace is irrelevant in footer' ' get_patch_id main && git checkout same && From c92f3195adf61f248af5bc787def379b2db6f2e9 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Mon, 29 Jul 2024 18:17:35 -0700 Subject: [PATCH 018/297] patch-id: call flush_current_id() only when needed The caller passes a flag that is used to become no-op when calling flush_current_id(). Instead of calling something that becomes a no-op, teach the caller not to call it in the first place. Signed-off-by: Junio C Hamano --- builtin/patch-id.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/builtin/patch-id.c b/builtin/patch-id.c index 3894d2b970612c..0f262e7a03cb98 100644 --- a/builtin/patch-id.c +++ b/builtin/patch-id.c @@ -6,10 +6,9 @@ #include "hex.h" #include "parse-options.h" -static void flush_current_id(int patchlen, struct object_id *id, struct object_id *result) +static void flush_current_id(struct object_id *id, struct object_id *result) { - if (patchlen) - printf("%s %s\n", oid_to_hex(result), oid_to_hex(id)); + printf("%s %s\n", oid_to_hex(result), oid_to_hex(id)); } static int remove_space(char *line) @@ -181,7 +180,8 @@ static void generate_id_list(int stable, int verbatim) oidclr(&oid); while (!feof(stdin)) { patchlen = get_one_patchid(&n, &result, &line_buf, stable, verbatim); - flush_current_id(patchlen, &oid, &result); + if (patchlen) + flush_current_id(&oid, &result); oidcpy(&oid, &n); } strbuf_release(&line_buf); From 2438294a13dc3f04684bcfb1af75008b3ca07f86 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Mon, 29 Jul 2024 18:17:36 -0700 Subject: [PATCH 019/297] patch-id: make get_one_patchid() more extensible We pass two independent Boolean flags (i.e. do we want the stable variant of patch-id? do we want to hash the stuff verbatim?) into the function as two separate parameters. Before adding the third one and make the interface even wider, let's consolidate them into a single flag word. No changes in behaviour. Just a trivial interface change. Signed-off-by: Junio C Hamano --- builtin/patch-id.c | 25 ++++++++++++++++++++----- 1 file changed, 20 insertions(+), 5 deletions(-) diff --git a/builtin/patch-id.c b/builtin/patch-id.c index 0f262e7a03cb98..1d3b7ff12bc6f9 100644 --- a/builtin/patch-id.c +++ b/builtin/patch-id.c @@ -58,9 +58,19 @@ static int scan_hunk_header(const char *p, int *p_before, int *p_after) return 1; } +/* + * flag bits to control get_one_patchid()'s behaviour. + */ +enum { + GOPID_STABLE = (1<<0), /* --stable */ + GOPID_VERBATIM = (1<<1), /* --verbatim */ +}; + static int get_one_patchid(struct object_id *next_oid, struct object_id *result, - struct strbuf *line_buf, int stable, int verbatim) + struct strbuf *line_buf, unsigned flags) { + int stable = flags & GOPID_STABLE; + int verbatim = flags & GOPID_VERBATIM; int patchlen = 0, found_next = 0; int before = -1, after = -1; int diff_is_binary = 0; @@ -171,7 +181,7 @@ static int get_one_patchid(struct object_id *next_oid, struct object_id *result, return patchlen; } -static void generate_id_list(int stable, int verbatim) +static void generate_id_list(unsigned flags) { struct object_id oid, n, result; int patchlen; @@ -179,7 +189,7 @@ static void generate_id_list(int stable, int verbatim) oidclr(&oid); while (!feof(stdin)) { - patchlen = get_one_patchid(&n, &result, &line_buf, stable, verbatim); + patchlen = get_one_patchid(&n, &result, &line_buf, flags); if (patchlen) flush_current_id(&oid, &result); oidcpy(&oid, &n); @@ -218,6 +228,7 @@ int cmd_patch_id(int argc, const char **argv, const char *prefix) /* if nothing is set, default to unstable */ struct patch_id_opts config = {0, 0}; int opts = 0; + unsigned flags = 0; struct option builtin_patch_id_options[] = { OPT_CMDMODE(0, "unstable", &opts, N_("use the unstable patch-id algorithm"), 1), @@ -237,7 +248,11 @@ int cmd_patch_id(int argc, const char **argv, const char *prefix) argc = parse_options(argc, argv, prefix, builtin_patch_id_options, patch_id_usage, 0); - generate_id_list(opts ? opts > 1 : config.stable, - opts ? opts == 3 : config.verbatim); + if (opts ? opts > 1 : config.stable) + flags |= GOPID_STABLE; + if (opts ? opts == 3 : config.verbatim) + flags |= GOPID_VERBATIM; + generate_id_list(flags); + return 0; } From 3f288b6fafbb8c76edf53fba7cea90d5a20a9c56 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Mon, 29 Jul 2024 18:17:37 -0700 Subject: [PATCH 020/297] patch-id: rewrite code that detects the beginning of a patch The get_one_patchid() function reads input lines until it finds a patch header (the line that begins a patch), whose beginning is one of: (1) an "", which is what "git diff-tree --stdin" shows; (2) "commit ", which is what "git log" shows; or (3) "From ", which is what "git log --format=email" shows. When it finds such a line, it returns to the caller, reporting the it found, and the size of the "patch" it processed. The caller then calls the function again, which then ignores the commit log message, and then processes the lines in the patch part until it hits another "beginning of a patch". The above logic was fairly easy to see until 2bb73ae8 (patch-id: use starts_with() and skip_prefix(), 2016-05-28) reorganized the code, which made another logic that has nothing to do with the "where does the next patch begin?" logic, which came from 2485eab5 (git-patch-id: do not trip over "no newline" markers, 2011-02-17) that ignores the "\ No newline at the end", rolled into the same single if() statement. Let's split it out. The "\ No newline at the end" marker is part of the patch, should not appear before we start reading the patch part, and does not belong to the detection of patch header. Signed-off-by: Junio C Hamano --- builtin/patch-id.c | 29 +++++++++++++++++++++-------- 1 file changed, 21 insertions(+), 8 deletions(-) diff --git a/builtin/patch-id.c b/builtin/patch-id.c index 1d3b7ff12bc6f9..a2fdc0505d6320 100644 --- a/builtin/patch-id.c +++ b/builtin/patch-id.c @@ -85,16 +85,19 @@ static int get_one_patchid(struct object_id *next_oid, struct object_id *result, const char *p = line; int len; - /* Possibly skip over the prefix added by "log" or "format-patch" */ - if (!skip_prefix(line, "commit ", &p) && - !skip_prefix(line, "From ", &p) && - starts_with(line, "\\ ") && 12 < strlen(line)) { + /* + * If we see a line that begins with "", + * "commit " or "From ", it is + * the beginning of a patch. Return to the caller, as + * we are done with the one we have been processing. + */ + if (skip_prefix(line, "commit ", &p)) + ; + else if (skip_prefix(line, "From ", &p)) + ; + if (!get_oid_hex(p, next_oid)) { if (verbatim) the_hash_algo->update_fn(&ctx, line, strlen(line)); - continue; - } - - if (!get_oid_hex(p, next_oid)) { found_next = 1; break; } @@ -135,6 +138,16 @@ static int get_one_patchid(struct object_id *next_oid, struct object_id *result, break; } + /* + * A hunk about an incomplete line may have this + * marker at the end, which should just be ignored. + */ + if (starts_with(line, "\\ ") && 12 < strlen(line)) { + if (verbatim) + the_hash_algo->update_fn(&ctx, line, strlen(line)); + continue; + } + if (diff_is_binary) { if (starts_with(line, "diff ")) { diff_is_binary = 0; From a6e9429f728d088999b815c25adbd2f2c115e051 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Mon, 29 Jul 2024 18:17:38 -0700 Subject: [PATCH 021/297] patch-id: tighten code to detect the patch header The get_one_patchid() function unconditionally takes a line that matches the patch header (namely, a line that begins with a full object name, possibly prefixed by "commit" or "From" plus a space) as the beginning of a patch. Even when it is *not* looking for one (namely, when the previous call found the patch header and returned, and then we are called again to skip the log message and process the patch whose header was found by the previous invocation). As a consequence, a line in the commit log message that begins with one of these patterns can be mistaken to start another patch, with current message entirely skipped (because we haven't even reached the patch at all). Allow the caller to tell us if it called us already and saw the patch header (in which case we shouldn't be looking for another one, until we see the "diff" part of the patch; instead we simply should be skipping these lines as part of the commit log message), and skip the header processing logic when that is the case. In the helper function, it also needs to flip this "are we looking for a header?" bit, once it finished skipping the commit log message and started processing the patches, as the patch header of the _next_ message is the only clue in the input that the current patch is done. Signed-off-by: Junio C Hamano --- builtin/patch-id.c | 49 +++++++++++++++++++++++++++++++++------------ t/t4204-patch-id.sh | 17 ++++++++++++++++ 2 files changed, 53 insertions(+), 13 deletions(-) diff --git a/builtin/patch-id.c b/builtin/patch-id.c index a2fdc0505d6320..f2e8feb6d3dff2 100644 --- a/builtin/patch-id.c +++ b/builtin/patch-id.c @@ -60,10 +60,17 @@ static int scan_hunk_header(const char *p, int *p_before, int *p_after) /* * flag bits to control get_one_patchid()'s behaviour. + * + * STABLE/VERBATIM are given from the command line option as + * --stable/--verbatim. FIND_HEADER conveys the internal state + * maintained by the caller to allow the function to avoid mistaking + * lines of log message before seeing the "diff" part as the beginning + * of the next patch. */ enum { GOPID_STABLE = (1<<0), /* --stable */ GOPID_VERBATIM = (1<<1), /* --verbatim */ + GOPID_FIND_HEADER = (1<<2), /* stop at the beginning of patch message */ }; static int get_one_patchid(struct object_id *next_oid, struct object_id *result, @@ -71,6 +78,7 @@ static int get_one_patchid(struct object_id *next_oid, struct object_id *result, { int stable = flags & GOPID_STABLE; int verbatim = flags & GOPID_VERBATIM; + int find_header = flags & GOPID_FIND_HEADER; int patchlen = 0, found_next = 0; int before = -1, after = -1; int diff_is_binary = 0; @@ -86,26 +94,39 @@ static int get_one_patchid(struct object_id *next_oid, struct object_id *result, int len; /* - * If we see a line that begins with "", - * "commit " or "From ", it is - * the beginning of a patch. Return to the caller, as - * we are done with the one we have been processing. + * The caller hasn't seen us find a patch header and + * return to it, or we have started processing patch + * and may encounter the beginning of the next patch. */ - if (skip_prefix(line, "commit ", &p)) - ; - else if (skip_prefix(line, "From ", &p)) - ; - if (!get_oid_hex(p, next_oid)) { - if (verbatim) - the_hash_algo->update_fn(&ctx, line, strlen(line)); - found_next = 1; - break; + if (find_header) { + /* + * If we see a line that begins with "", + * "commit " or "From ", it is + * the beginning of a patch. Return to the caller, as + * we are done with the one we have been processing. + */ + if (skip_prefix(line, "commit ", &p)) + ; + else if (skip_prefix(line, "From ", &p)) + ; + if (!get_oid_hex(p, next_oid)) { + if (verbatim) + the_hash_algo->update_fn(&ctx, line, strlen(line)); + found_next = 1; + break; + } } /* Ignore commit comments */ if (!patchlen && !starts_with(line, "diff ")) continue; + /* + * We are past the commit log message. Prepare to + * stop at the beginning of the next patch header. + */ + find_header = 1; + /* Parsing diff header? */ if (before == -1) { if (starts_with(line, "GIT binary patch") || @@ -201,11 +222,13 @@ static void generate_id_list(unsigned flags) struct strbuf line_buf = STRBUF_INIT; oidclr(&oid); + flags |= GOPID_FIND_HEADER; while (!feof(stdin)) { patchlen = get_one_patchid(&n, &result, &line_buf, flags); if (patchlen) flush_current_id(&oid, &result); oidcpy(&oid, &n); + flags &= ~GOPID_FIND_HEADER; } strbuf_release(&line_buf); } diff --git a/t/t4204-patch-id.sh b/t/t4204-patch-id.sh index 1627fdda1b266f..b1d98d4110700c 100755 --- a/t/t4204-patch-id.sh +++ b/t/t4204-patch-id.sh @@ -137,6 +137,23 @@ test_expect_success 'patch-id computes the same for various formats' ' test_cmp actual expect ' +hash=$(git rev-parse same:) +for cruft in "$hash" "commit $hash is bad" "From $hash status" +do + test_expect_success "patch-id with <$cruft> in log message" ' + git format-patch -1 --stdout same >patch-0 && + git patch-id expect && + + { + sed -e "/^$/q" patch-0 && + printf "random message\n%s\n\n" "$cruft" && + sed -e "1,/^$/d" patch-0 + } >patch-cruft && + git patch-id actual && + test_cmp actual expect + ' +done + test_expect_success 'whitespace is irrelevant in footer' ' get_patch_id main && git checkout same && From 1048aa8b7ad44d6c759e894f6ca763614068514d Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Tue, 30 Jul 2024 11:43:49 -0700 Subject: [PATCH 022/297] safe.directory: preliminary clean-up The paths given in the safe.directory configuration variable are allowed to contain "~user" (which interpolates to user's home directory) and "%(prefix)" (which interpolates to the installation location in RUNTIME_PREFIX-enabled builds, and a call to the git_config_pathname() function is tasked to obtain a copy of the path with these constructs interpolated. The function, when it succeeds, always yields an allocated string in the location given as the out-parameter; even when there is nothing to interpolate in the original, a literal copy is made. The code path that contains this caller somehow made two contradicting and incorrect assumptions of the behaviour when there is no need for interpolation, and was written with extra defensiveness against two phantom risks that do not exist. One wrong assumption was that the function might yield NULL when there is no interpolation. This led to the use of an extra "check" variable, conditionally holding either the interpolated or the original string. The assumption was with us since 8959555c (setup_git_directory(): add an owner check for the top-level directory, 2022-03-02) originally introduced the safe.directory feature. Another wrong assumption was that the function might yield the same pointer as the input when there is no interpolation. This led to a conditional free'ing of the interpolated copy, that the conditional never skipped, as we always received an allocated string. Simplify the code by removing the extra defensiveness. Signed-off-by: Junio C Hamano --- setup.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/setup.c b/setup.c index d458edcc028ef1..3177d010d1e082 100644 --- a/setup.c +++ b/setup.c @@ -1235,17 +1235,15 @@ static int safe_directory_cb(const char *key, const char *value, char *allowed = NULL; if (!git_config_pathname(&allowed, key, value)) { - const char *check = allowed ? allowed : value; - if (ends_with(check, "/*")) { - size_t len = strlen(check); - if (!fspathncmp(check, data->path, len - 1)) + if (ends_with(allowed, "/*")) { + size_t len = strlen(allowed); + if (!fspathncmp(allowed, data->path, len - 1)) data->is_safe = 1; - } else if (!fspathcmp(data->path, check)) { + } else if (!fspathcmp(data->path, allowed)) { data->is_safe = 1; } - } - if (allowed != value) free(allowed); + } } return 0; From 7f547c99a627bca120bf44abf3dd95c8837dfdfa Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Tue, 30 Jul 2024 11:43:50 -0700 Subject: [PATCH 023/297] safe.directory: normalize the checked path The pathname of a repository comes from getcwd() and it could be a path aliased via symbolic links, e.g., the real directory may be /home/u/repository but a symbolic link /home/u/repo may point at it, and the clone request may come as "git clone file:///home/u/repo/". A request to check if /home/u/repo is safe would be rejected if the safe.directory configuration allows /home/u/repository/ but not its alias /home/u/repo/. Normalize the path being checked before comparing with safe.directory value(s). Suggested-by: Phillip Wood Signed-off-by: Junio C Hamano --- setup.c | 16 ++++++++--- t/t0033-safe-directory.sh | 57 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 69 insertions(+), 4 deletions(-) diff --git a/setup.c b/setup.c index 3177d010d1e082..54cce7219b9567 100644 --- a/setup.c +++ b/setup.c @@ -1215,7 +1215,7 @@ static int canonicalize_ceiling_entry(struct string_list_item *item, } struct safe_directory_data { - const char *path; + char *path; int is_safe; }; @@ -1261,9 +1261,7 @@ static int ensure_valid_ownership(const char *gitfile, const char *worktree, const char *gitdir, struct strbuf *report) { - struct safe_directory_data data = { - .path = worktree ? worktree : gitdir - }; + struct safe_directory_data data = { 0 }; if (!git_env_bool("GIT_TEST_ASSUME_DIFFERENT_OWNER", 0) && (!gitfile || is_path_owned_by_current_user(gitfile, report)) && @@ -1271,6 +1269,15 @@ static int ensure_valid_ownership(const char *gitfile, (!gitdir || is_path_owned_by_current_user(gitdir, report))) return 1; + /* + * normalize the data.path for comparison with normalized paths + * that come from the configuration file. The path is unsafe + * if it cannot be normalized. + */ + data.path = real_pathdup(worktree ? worktree : gitdir, 0); + if (!data.path) + return 0; + /* * data.path is the "path" that identifies the repository and it is * constant regardless of what failed above. data.is_safe should be @@ -1278,6 +1285,7 @@ static int ensure_valid_ownership(const char *gitfile, */ git_protected_config(safe_directory_cb, &data); + free(data.path); return data.is_safe; } diff --git a/t/t0033-safe-directory.sh b/t/t0033-safe-directory.sh index 5fe61f129129ef..07ac0f9a016323 100755 --- a/t/t0033-safe-directory.sh +++ b/t/t0033-safe-directory.sh @@ -119,4 +119,61 @@ test_expect_success 'local clone of unowned repo accepted in safe directory' ' test_path_is_dir target ' +test_expect_success SYMLINKS 'checked paths are normalized' ' + test_when_finished "rm -rf repository; rm -f repo" && + ( + sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER && + git config --global --unset-all safe.directory + ) && + git init repository && + ln -s repository repo && + ( + cd repository && + sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER && + test_commit sample + ) && + + ( + sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER && + git config --global safe.directory "$(pwd)/repository" + ) && + git -C repository for-each-ref && + git -C repository/ for-each-ref && + git -C repo for-each-ref && + git -C repo/ for-each-ref && + test_must_fail git -C repository/.git for-each-ref && + test_must_fail git -C repository/.git/ for-each-ref && + test_must_fail git -C repo/.git for-each-ref && + test_must_fail git -C repo/.git/ for-each-ref +' + +test_expect_success SYMLINKS 'checked leading paths are normalized' ' + test_when_finished "rm -rf repository; rm -f repo" && + ( + sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER && + git config --global --unset-all safe.directory + ) && + mkdir -p repository && + git init repository/s && + ln -s repository repo && + ( + cd repository/s && + sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER && + test_commit sample + ) && + + ( + sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER && + git config --global safe.directory "$(pwd)/repository/*" + ) && + git -C repository/s for-each-ref && + git -C repository/s/ for-each-ref && + git -C repo/s for-each-ref && + git -C repo/s/ for-each-ref && + git -C repository/s/.git for-each-ref && + git -C repository/s/.git/ for-each-ref && + git -C repo/s/.git for-each-ref && + git -C repo/s/.git/ for-each-ref +' + test_done From dc0edbb01c8bb096525839b8c205d2b2663d6961 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Tue, 30 Jul 2024 11:43:51 -0700 Subject: [PATCH 024/297] safe.directory: normalize the configured path The pathname of a repository comes from getcwd() and it could be a path aliased via symbolic links, e.g., the real directory may be /home/u/repository but a symbolic link /home/u/repo may point at it, and the clone request may come as "git clone file:///home/u/repo/" A request to check if /home/u/repository is safe would be rejected if the safe.directory configuration allows /home/u/repo/ but not its alias /home/u/repository/. Normalize the paths configured for the safe.directory configuration variable before comparing them with the path being checked. Two and a half things to note, compared to the previous step to normalize the actual path of the suspected repository, are: - A configured safe.directory may be coming from .gitignore in the home directory that may be shared across machines. The path meant to match with an entry may not necessarily exist on all of such machines, so not being able to convert them to real path on this machine is *not* a condition that is worthy of warning. Hence, we ignore a path that cannot be converted to a real path. - A configured safe.directory is essentially a random string that user throws at us, written completely unrelated to the directory the current process happens to be in. Hence it makes little sense to give a non-absolute path. Hence we ignore any non-absolute paths, except for ".". - The safe.directory set to "." was once advertised on the list as a valid workaround for the regression caused by the overly tight safe.directory check introduced in 2.45.1; we treat it to mean "if we are at the top level of a repository, it is OK". (cf. <834862fd-b579-438a-b9b3-5246bf27ce8a@gmail.com>). Suggested-by: Phillip Wood Helped-by: Jeff King Signed-off-by: Junio C Hamano --- setup.c | 38 +++++++++++++++++++++++--- t/t0033-safe-directory.sh | 57 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 91 insertions(+), 4 deletions(-) diff --git a/setup.c b/setup.c index 54cce7219b9567..5f81d9fac0d7e8 100644 --- a/setup.c +++ b/setup.c @@ -1235,13 +1235,43 @@ static int safe_directory_cb(const char *key, const char *value, char *allowed = NULL; if (!git_config_pathname(&allowed, key, value)) { - if (ends_with(allowed, "/*")) { - size_t len = strlen(allowed); - if (!fspathncmp(allowed, data->path, len - 1)) + char *normalized = NULL; + + /* + * Setting safe.directory to a non-absolute path + * makes little sense---it won't be relative to + * the configuration file the item is defined in. + * Except for ".", which means "if we are at the top + * level of a repository, then it is OK", which is + * slightly tighter than "*" that allows discovery. + */ + if (!is_absolute_path(allowed) && strcmp(allowed, ".")) { + warning(_("safe.directory '%s' not absolute"), + allowed); + goto next; + } + + /* + * A .gitconfig in $HOME may be shared across + * different machines and safe.directory entries + * may or may not exist as paths on all of these + * machines. In other words, it is not a warning + * worthy event when there is no such path on this + * machine---the entry may be useful elsewhere. + */ + normalized = real_pathdup(allowed, 0); + if (!normalized) + goto next; + + if (ends_with(normalized, "/*")) { + size_t len = strlen(normalized); + if (!fspathncmp(normalized, data->path, len - 1)) data->is_safe = 1; - } else if (!fspathcmp(data->path, allowed)) { + } else if (!fspathcmp(data->path, normalized)) { data->is_safe = 1; } + next: + free(normalized); free(allowed); } } diff --git a/t/t0033-safe-directory.sh b/t/t0033-safe-directory.sh index 07ac0f9a016323..ea7465725564cf 100755 --- a/t/t0033-safe-directory.sh +++ b/t/t0033-safe-directory.sh @@ -176,4 +176,61 @@ test_expect_success SYMLINKS 'checked leading paths are normalized' ' git -C repo/s/.git/ for-each-ref ' +test_expect_success SYMLINKS 'configured paths are normalized' ' + test_when_finished "rm -rf repository; rm -f repo" && + ( + sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER && + git config --global --unset-all safe.directory + ) && + git init repository && + ln -s repository repo && + ( + cd repository && + sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER && + test_commit sample + ) && + + ( + sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER && + git config --global safe.directory "$(pwd)/repo" + ) && + git -C repository for-each-ref && + git -C repository/ for-each-ref && + git -C repo for-each-ref && + git -C repo/ for-each-ref && + test_must_fail git -C repository/.git for-each-ref && + test_must_fail git -C repository/.git/ for-each-ref && + test_must_fail git -C repo/.git for-each-ref && + test_must_fail git -C repo/.git/ for-each-ref +' + +test_expect_success SYMLINKS 'configured leading paths are normalized' ' + test_when_finished "rm -rf repository; rm -f repo" && + ( + sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER && + git config --global --unset-all safe.directory + ) && + mkdir -p repository && + git init repository/s && + ln -s repository repo && + ( + cd repository/s && + sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER && + test_commit sample + ) && + + ( + sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER && + git config --global safe.directory "$(pwd)/repo/*" + ) && + git -C repository/s for-each-ref && + git -C repository/s/ for-each-ref && + git -C repository/s/.git for-each-ref && + git -C repository/s/.git/ for-each-ref && + git -C repo/s for-each-ref && + git -C repo/s/ for-each-ref && + git -C repo/s/.git for-each-ref && + git -C repo/s/.git/ for-each-ref +' + test_done From ee0be850b089a3fc6c2bfa6a4e5172ee449aa23a Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Tue, 30 Jul 2024 11:43:52 -0700 Subject: [PATCH 025/297] safe.directory: setting safe.directory="." allows the "current" directory When "git daemon" enters a repository, it chdir's to the requested repository and then uses "." (the curent directory) to consult the "is this repository considered safe?" when it is not owned by the same owner as the process. Make sure this access will be allowed by setting safe.directory to ".", as that was once advertised on the list as a valid workaround to the overly tight safe.directory settings introduced by 2.45.1 (cf. <834862fd-b579-438a-b9b3-5246bf27ce8a@gmail.com>). Also add simlar test to show what happens in the same setting if the safe.directory is set to "*" instead of "."; in short, "." is a bit tighter (as it is custom designed for git-daemon situation) than "anything goes" settings given by "*". Signed-off-by: Junio C Hamano --- t/t0033-safe-directory.sh | 64 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 64 insertions(+) diff --git a/t/t0033-safe-directory.sh b/t/t0033-safe-directory.sh index ea7465725564cf..e97a84764f7867 100755 --- a/t/t0033-safe-directory.sh +++ b/t/t0033-safe-directory.sh @@ -233,4 +233,68 @@ test_expect_success SYMLINKS 'configured leading paths are normalized' ' git -C repo/s/.git/ for-each-ref ' +test_expect_success 'safe.directory set to a dot' ' + test_when_finished "rm -rf repository" && + ( + sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER && + git config --global --unset-all safe.directory + ) && + mkdir -p repository/subdir && + git init repository && + ( + cd repository && + sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER && + test_commit sample + ) && + + ( + sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER && + git config --global safe.directory "." + ) && + git -C repository for-each-ref && + git -C repository/ for-each-ref && + git -C repository/.git for-each-ref && + git -C repository/.git/ for-each-ref && + + # What is allowed is repository/subdir but the repository + # path is repository. + test_must_fail git -C repository/subdir for-each-ref && + + # Likewise, repository .git/refs is allowed with "." but + # repository/.git that is accessed is not allowed. + test_must_fail git -C repository/.git/refs for-each-ref +' + +test_expect_success 'safe.directory set to asterisk' ' + test_when_finished "rm -rf repository" && + ( + sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER && + git config --global --unset-all safe.directory + ) && + mkdir -p repository/subdir && + git init repository && + ( + cd repository && + sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER && + test_commit sample + ) && + + ( + sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER && + git config --global safe.directory "*" + ) && + # these are trivial + git -C repository for-each-ref && + git -C repository/ for-each-ref && + git -C repository/.git for-each-ref && + git -C repository/.git/ for-each-ref && + + # With "*", everything is allowed, and the repository is + # discovered, which is different behaviour from "." above. + git -C repository/subdir for-each-ref && + + # Likewise. + git -C repository/.git/refs for-each-ref +' + test_done From 098be29f5b4e40b687529d65166c19e6bb096dcb Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= Date: Tue, 30 Jul 2024 16:00:20 +0200 Subject: [PATCH 026/297] t-example-decorate: remove test messages MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The test_msg() calls only repeat information already present in test descriptions and check definitions, which are shown automatically if the checks fail. Remove the redundant messages to simplify the tests and their output. Here it is with all of them failing before: # check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:18 # when adding a brand-new object, NULL should be returned # check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:21 # when adding a brand-new object, NULL should be returned not ok 1 - Add 2 objects, one with a non-NULL decoration and one with a NULL decoration. # check "ret == &vars->decoration_a" failed at t/unit-tests/t-example-decorate.c:29 # when readding an already existing object, existing decoration should be returned # check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:32 # when readding an already existing object, existing decoration should be returned not ok 2 - When re-adding an already existing object, the old decoration is returned. # check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:40 # lookup should return added declaration # check "ret == &vars->decoration_b" failed at t/unit-tests/t-example-decorate.c:43 # lookup should return added declaration # check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:46 # lookup for unknown object should return NULL not ok 3 - Lookup returns the added declarations, or NULL if the object was never added. # check "objects_noticed == 2" failed at t/unit-tests/t-example-decorate.c:58 # left: 1 # right: 2 # should have 2 objects not ok 4 - The user can also loop through all entries. 1..4 ... and here with the patch applied: # check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:18 # check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:20 not ok 1 - Add 2 objects, one with a non-NULL decoration and one with a NULL decoration. # check "ret == &vars->decoration_a" failed at t/unit-tests/t-example-decorate.c:27 # check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:29 not ok 2 - When re-adding an already existing object, the old decoration is returned. # check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:36 # check "ret == &vars->decoration_b" failed at t/unit-tests/t-example-decorate.c:38 # check "ret == NULL" failed at t/unit-tests/t-example-decorate.c:40 not ok 3 - Lookup returns the added declarations, or NULL if the object was never added. # check "objects_noticed == 2" failed at t/unit-tests/t-example-decorate.c:51 # left: 1 # right: 2 not ok 4 - The user can also loop through all entries. 1..4 Signed-off-by: René Scharfe Signed-off-by: Junio C Hamano --- t/unit-tests/t-example-decorate.c | 24 ++++++++---------------- 1 file changed, 8 insertions(+), 16 deletions(-) diff --git a/t/unit-tests/t-example-decorate.c b/t/unit-tests/t-example-decorate.c index a4a75db735afbe..8bf0709c41f681 100644 --- a/t/unit-tests/t-example-decorate.c +++ b/t/unit-tests/t-example-decorate.c @@ -15,36 +15,29 @@ static void t_add(struct test_vars *vars) { void *ret = add_decoration(&vars->n, vars->one, &vars->decoration_a); - if (!check(ret == NULL)) - test_msg("when adding a brand-new object, NULL should be returned"); + check(ret == NULL); ret = add_decoration(&vars->n, vars->two, NULL); - if (!check(ret == NULL)) - test_msg("when adding a brand-new object, NULL should be returned"); + check(ret == NULL); } static void t_readd(struct test_vars *vars) { void *ret = add_decoration(&vars->n, vars->one, NULL); - if (!check(ret == &vars->decoration_a)) - test_msg("when readding an already existing object, existing decoration should be returned"); + check(ret == &vars->decoration_a); ret = add_decoration(&vars->n, vars->two, &vars->decoration_b); - if (!check(ret == NULL)) - test_msg("when readding an already existing object, existing decoration should be returned"); + check(ret == NULL); } static void t_lookup(struct test_vars *vars) { void *ret = lookup_decoration(&vars->n, vars->one); - if (!check(ret == NULL)) - test_msg("lookup should return added declaration"); + check(ret == NULL); ret = lookup_decoration(&vars->n, vars->two); - if (!check(ret == &vars->decoration_b)) - test_msg("lookup should return added declaration"); + check(ret == &vars->decoration_b); ret = lookup_decoration(&vars->n, vars->three); - if (!check(ret == NULL)) - test_msg("lookup for unknown object should return NULL"); + check(ret == NULL); } static void t_loop(struct test_vars *vars) @@ -55,8 +48,7 @@ static void t_loop(struct test_vars *vars) if (vars->n.entries[i].base) objects_noticed++; } - if (!check_int(objects_noticed, ==, 2)) - test_msg("should have 2 objects"); + check_int(objects_noticed, ==, 2); } int cmd_main(int argc UNUSED, const char **argv UNUSED) From 63ad8dbf169ec8e2b3cef40ff51499ee751a84a5 Mon Sep 17 00:00:00 2001 From: D Harithamma Date: Wed, 31 Jul 2024 13:33:59 +0000 Subject: [PATCH 027/297] convert: return early when not tracing When Git adds a file requiring encoding conversion and tracing of encoding conversion is not requested via the GIT_TRACE_WORKING_TREE_ENCODING environment variable, the `trace_encoding()` function still allocates & prepares "human readable" copies of the file contents before and after conversion to show in the trace. This results in a high memory footprint and increased runtime without providing any user-visible benefit. This fix introduces an early exit from the `trace_encoding()` function when tracing is not requested, preventing unnecessary memory allocation and processing. Signed-off-by: D Harithamma Signed-off-by: Junio C Hamano --- convert.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/convert.c b/convert.c index 35b25eb3cb9212..ee4d32978e0fb6 100644 --- a/convert.c +++ b/convert.c @@ -322,6 +322,9 @@ static void trace_encoding(const char *context, const char *path, struct strbuf trace = STRBUF_INIT; int i; + if (!trace_want(&coe)) + return; + strbuf_addf(&trace, "%s (%s, considered %s):\n", context, path, encoding); for (i = 0; i < len && buf; ++i) { strbuf_addf( From 49f4fd901a97863c6458acaa0904b940dc827c40 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Wed, 31 Jul 2024 12:37:40 +0200 Subject: [PATCH 028/297] t98xx: fix Perforce tests with p4d r23 and newer Some of the tests in t98xx modify the Perforce depot in ways that the tool wouldn't normally allow. This is done to test behaviour of git-p4 in certain edge cases that we have observed in the wild, but which should in theory not be possible. Naturally, modifying the depot on disk directly is quite intimate with the tool and thus prone to breakage when Perforce updates the way that data is stored. And indeed, those tests are broken nowadays with r23 of Perforce. While a file revision was previously stored as a plain file "depot/file,v", it is now stored in a directory "depot/file,d" with compression. Adapt those tests to handle both old- and new-style depot layouts. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- t/t9800-git-p4-basic.sh | 16 ++++++++++++++-- t/t9802-git-p4-filetype.sh | 18 +++++++++++++++--- t/t9825-git-p4-handle-utf16-without-bom.sh | 22 +++++++++++++++++++--- 3 files changed, 48 insertions(+), 8 deletions(-) diff --git a/t/t9800-git-p4-basic.sh b/t/t9800-git-p4-basic.sh index 53af8e34ac1c40..0816763e46639c 100755 --- a/t/t9800-git-p4-basic.sh +++ b/t/t9800-git-p4-basic.sh @@ -297,8 +297,20 @@ test_expect_success 'exit when p4 fails to produce marshaled output' ' # p4 changes, files, or describe; just in p4 print. If P4CLIENT is unset, the # message will include "Librarian checkout". test_expect_success 'exit gracefully for p4 server errors' ' - test_when_finished "mv \"$db\"/depot/file1,v,hidden \"$db\"/depot/file1,v" && - mv "$db"/depot/file1,v "$db"/depot/file1,v,hidden && + # Note that newer Perforce versions started to store files + # compressed in directories. The case statement handles both + # old and new layout. + case "$(echo "$db"/depot/file1*)" in + *,v) + test_when_finished "mv \"$db\"/depot/file1,v,hidden \"$db\"/depot/file1,v" && + mv "$db"/depot/file1,v "$db"/depot/file1,v,hidden;; + *,d) + path="$(echo "$db"/depot/file1,d/*.gz)" && + test_when_finished "mv \"$path\",hidden \"$path\"" && + mv "$path" "$path",hidden;; + *) + BUG "unhandled p4d layout";; + esac && test_when_finished cleanup_git && test_expect_code 1 git p4 clone --dest="$git" //depot@1 >out 2>err && test_grep "Error from p4 print" err diff --git a/t/t9802-git-p4-filetype.sh b/t/t9802-git-p4-filetype.sh index bb236cd2b57a3c..df01a5d3389861 100755 --- a/t/t9802-git-p4-filetype.sh +++ b/t/t9802-git-p4-filetype.sh @@ -300,10 +300,22 @@ test_expect_success SYMLINKS 'empty symlink target' ' # text # @@ # + # Note that newer Perforce versions started to store files + # compressed in directories. The case statement handles both + # old and new layout. cd "$db/depot" && - sed "/@target1/{; s/target1/@/; n; d; }" \ - empty-symlink,v >empty-symlink,v.tmp && - mv empty-symlink,v.tmp empty-symlink,v + case "$(echo empty-symlink*)" in + empty-symlink,v) + sed "/@target1/{; s/target1/@/; n; d; }" \ + empty-symlink,v >empty-symlink,v.tmp && + mv empty-symlink,v.tmp empty-symlink,v;; + empty-symlink,d) + path="empty-symlink,d/$(ls empty-symlink,d/ | tail -n1)" && + rm "$path" && + gzip "$path";; + *) + BUG "unhandled p4d layout";; + esac ) && ( # Make sure symlink really is empty. Asking diff --git a/t/t9825-git-p4-handle-utf16-without-bom.sh b/t/t9825-git-p4-handle-utf16-without-bom.sh index f049ff8229c6d3..6a60b323493913 100755 --- a/t/t9825-git-p4-handle-utf16-without-bom.sh +++ b/t/t9825-git-p4-handle-utf16-without-bom.sh @@ -22,9 +22,25 @@ test_expect_success 'init depot with UTF-16 encoded file and artificially remove cd db && p4d -jc && # P4D automatically adds a BOM. Remove it here to make the file invalid. - sed -e "\$d" depot/file1,v >depot/file1,v.new && - mv depot/file1,v.new depot/file1,v && - printf "@$UTF16@" >>depot/file1,v && + # + # Note that newer Perforce versions started to store files + # compressed in directories. The case statement handles both + # old and new layout. + case "$(echo depot/file1*)" in + depot/file1,v) + sed -e "\$d" depot/file1,v >depot/file1,v.new && + mv depot/file1,v.new depot/file1,v && + printf "@$UTF16@" >>depot/file1,v;; + depot/file1,d) + path="$(echo depot/file1,d/*.gz)" && + gunzip -c "$path" >"$path.unzipped" && + sed -e "\$d" "$path.unzipped" >"$path.new" && + printf "$UTF16" >>"$path.new" && + gzip -c "$path.new" >"$path" && + rm "$path.unzipped" "$path.new";; + *) + BUG "unhandled p4d layout";; + esac && p4d -jrF checkpoint.1 ) ' From d707d23d2c23d55845e7ebcd96c3dbc6e8415d8d Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Wed, 31 Jul 2024 12:37:45 +0200 Subject: [PATCH 029/297] ci: update Perforce version to r23.2 Update our Perforce version from r21.2 to r23.2. Note that the updated version is not the newest version. Instead, it is the last version where the way that Perforce is being distributed remains the same as in r21.2. Newer releases stopped distributing p4 and p4d executables as well as the macOS archives directly and would thus require more work. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- ci/install-dependencies.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh index 6ec0f85972f1f9..b59fd7c1fd6af1 100755 --- a/ci/install-dependencies.sh +++ b/ci/install-dependencies.sh @@ -7,7 +7,7 @@ begin_group "Install dependencies" -P4WHENCE=https://cdist2.perforce.com/perforce/r21.2 +P4WHENCE=https://cdist2.perforce.com/perforce/r23.2 LFSWHENCE=https://github.com/github/git-lfs/releases/download/v$LINUX_GIT_LFS_VERSION JGITWHENCE=https://repo.eclipse.org/content/groups/releases//org/eclipse/jgit/org.eclipse.jgit.pgm/6.8.0.202311291450-r/org.eclipse.jgit.pgm-6.8.0.202311291450-r.sh From 63ee9333837ec5c4cad79935a3061b02b51f32fd Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Wed, 31 Jul 2024 12:37:56 +0200 Subject: [PATCH 030/297] t98xx: mark Perforce tests as memory-leak free All the Perforce tests are free of memory leaks. This went unnoticed because most folks do not have p4 and p4d installed on their computers. Consequently, given that the prerequisites for running those tests aren't fulfilled, `TEST_PASSES_SANITIZE_LEAK=check` won't notice that those tests are indeed memory leak free. Mark those tests accordingly. Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- t/t9800-git-p4-basic.sh | 1 + t/t9801-git-p4-branch.sh | 1 + t/t9802-git-p4-filetype.sh | 1 + t/t9803-git-p4-shell-metachars.sh | 1 + t/t9804-git-p4-label.sh | 1 + t/t9805-git-p4-skip-submit-edit.sh | 1 + t/t9806-git-p4-options.sh | 1 + t/t9808-git-p4-chdir.sh | 1 + t/t9809-git-p4-client-view.sh | 1 + t/t9810-git-p4-rcs.sh | 1 + t/t9811-git-p4-label-import.sh | 1 + t/t9812-git-p4-wildcards.sh | 1 + t/t9813-git-p4-preserve-users.sh | 1 + t/t9814-git-p4-rename.sh | 1 + t/t9815-git-p4-submit-fail.sh | 1 + t/t9816-git-p4-locked.sh | 1 + t/t9817-git-p4-exclude.sh | 1 + t/t9818-git-p4-block.sh | 1 + t/t9819-git-p4-case-folding.sh | 1 + t/t9820-git-p4-editor-handling.sh | 1 + t/t9821-git-p4-path-variations.sh | 1 + t/t9822-git-p4-path-encoding.sh | 1 + t/t9823-git-p4-mock-lfs.sh | 1 + t/t9825-git-p4-handle-utf16-without-bom.sh | 1 + t/t9826-git-p4-keep-empty-commits.sh | 1 + t/t9827-git-p4-change-filetype.sh | 1 + t/t9828-git-p4-map-user.sh | 1 + t/t9829-git-p4-jobs.sh | 1 + t/t9830-git-p4-symlink-dir.sh | 1 + t/t9831-git-p4-triggers.sh | 1 + t/t9832-unshelve.sh | 1 + t/t9833-errors.sh | 1 + t/t9834-git-p4-file-dir-bug.sh | 1 + t/t9835-git-p4-metadata-encoding-python2.sh | 1 + t/t9836-git-p4-metadata-encoding-python3.sh | 1 + 35 files changed, 35 insertions(+) diff --git a/t/t9800-git-p4-basic.sh b/t/t9800-git-p4-basic.sh index 0816763e46639c..3e6dfce24833a0 100755 --- a/t/t9800-git-p4-basic.sh +++ b/t/t9800-git-p4-basic.sh @@ -5,6 +5,7 @@ test_description='git p4 tests' GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d' ' diff --git a/t/t9801-git-p4-branch.sh b/t/t9801-git-p4-branch.sh index c598011635ac2f..cdbfacc727a7c6 100755 --- a/t/t9801-git-p4-branch.sh +++ b/t/t9801-git-p4-branch.sh @@ -5,6 +5,7 @@ test_description='git p4 tests for p4 branches' GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d' ' diff --git a/t/t9802-git-p4-filetype.sh b/t/t9802-git-p4-filetype.sh index df01a5d3389861..1bc48305b01789 100755 --- a/t/t9802-git-p4-filetype.sh +++ b/t/t9802-git-p4-filetype.sh @@ -2,6 +2,7 @@ test_description='git p4 filetype tests' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d' ' diff --git a/t/t9803-git-p4-shell-metachars.sh b/t/t9803-git-p4-shell-metachars.sh index 2913277013da56..ab7fe16266872c 100755 --- a/t/t9803-git-p4-shell-metachars.sh +++ b/t/t9803-git-p4-shell-metachars.sh @@ -2,6 +2,7 @@ test_description='git p4 transparency to shell metachars in filenames' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d' ' diff --git a/t/t9804-git-p4-label.sh b/t/t9804-git-p4-label.sh index 32364571063d4c..c8963fd3980e0e 100755 --- a/t/t9804-git-p4-label.sh +++ b/t/t9804-git-p4-label.sh @@ -2,6 +2,7 @@ test_description='git p4 label tests' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d' ' diff --git a/t/t9805-git-p4-skip-submit-edit.sh b/t/t9805-git-p4-skip-submit-edit.sh index 90ef647db7e610..72dce3d2b46f4d 100755 --- a/t/t9805-git-p4-skip-submit-edit.sh +++ b/t/t9805-git-p4-skip-submit-edit.sh @@ -2,6 +2,7 @@ test_description='git p4 skipSubmitEdit config variables' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d' ' diff --git a/t/t9806-git-p4-options.sh b/t/t9806-git-p4-options.sh index c26d29743307b5..e4ce44ebf3783b 100755 --- a/t/t9806-git-p4-options.sh +++ b/t/t9806-git-p4-options.sh @@ -5,6 +5,7 @@ test_description='git p4 options' GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d' ' diff --git a/t/t9808-git-p4-chdir.sh b/t/t9808-git-p4-chdir.sh index 58a9b3b71e6d88..342f7f3d4a09d2 100755 --- a/t/t9808-git-p4-chdir.sh +++ b/t/t9808-git-p4-chdir.sh @@ -2,6 +2,7 @@ test_description='git p4 relative chdir' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d' ' diff --git a/t/t9809-git-p4-client-view.sh b/t/t9809-git-p4-client-view.sh index 9c9710d8c7b871..f33fdea889edc3 100755 --- a/t/t9809-git-p4-client-view.sh +++ b/t/t9809-git-p4-client-view.sh @@ -2,6 +2,7 @@ test_description='git p4 client view' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d' ' diff --git a/t/t9810-git-p4-rcs.sh b/t/t9810-git-p4-rcs.sh index 5fe83315ecd57a..15e32c9f353a73 100755 --- a/t/t9810-git-p4-rcs.sh +++ b/t/t9810-git-p4-rcs.sh @@ -2,6 +2,7 @@ test_description='git p4 rcs keywords' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh CP1252="\223\224" diff --git a/t/t9811-git-p4-label-import.sh b/t/t9811-git-p4-label-import.sh index 5ac5383fb71715..52a4b0af811294 100755 --- a/t/t9811-git-p4-label-import.sh +++ b/t/t9811-git-p4-label-import.sh @@ -5,6 +5,7 @@ test_description='git p4 label tests' GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d' ' diff --git a/t/t9812-git-p4-wildcards.sh b/t/t9812-git-p4-wildcards.sh index 254a7c244698a0..46aa5fd56c7706 100755 --- a/t/t9812-git-p4-wildcards.sh +++ b/t/t9812-git-p4-wildcards.sh @@ -2,6 +2,7 @@ test_description='git p4 wildcards' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d' ' diff --git a/t/t9813-git-p4-preserve-users.sh b/t/t9813-git-p4-preserve-users.sh index fd018c87a80636..0efea28da2cb9d 100755 --- a/t/t9813-git-p4-preserve-users.sh +++ b/t/t9813-git-p4-preserve-users.sh @@ -2,6 +2,7 @@ test_description='git p4 preserve users' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d' ' diff --git a/t/t9814-git-p4-rename.sh b/t/t9814-git-p4-rename.sh index 2a9838f37fe293..00df6ebd3bdc03 100755 --- a/t/t9814-git-p4-rename.sh +++ b/t/t9814-git-p4-rename.sh @@ -2,6 +2,7 @@ test_description='git p4 rename' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d' ' diff --git a/t/t9815-git-p4-submit-fail.sh b/t/t9815-git-p4-submit-fail.sh index c766fd159f14b6..92ef9d8c242559 100755 --- a/t/t9815-git-p4-submit-fail.sh +++ b/t/t9815-git-p4-submit-fail.sh @@ -2,6 +2,7 @@ test_description='git p4 submit failure handling' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d' ' diff --git a/t/t9816-git-p4-locked.sh b/t/t9816-git-p4-locked.sh index 5e904ac80d8522..e687fbc25f60f5 100755 --- a/t/t9816-git-p4-locked.sh +++ b/t/t9816-git-p4-locked.sh @@ -2,6 +2,7 @@ test_description='git p4 locked file behavior' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d' ' diff --git a/t/t9817-git-p4-exclude.sh b/t/t9817-git-p4-exclude.sh index ec3d937c6a73bb..3deb334fed1c6a 100755 --- a/t/t9817-git-p4-exclude.sh +++ b/t/t9817-git-p4-exclude.sh @@ -2,6 +2,7 @@ test_description='git p4 tests for excluded paths during clone and sync' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d' ' diff --git a/t/t9818-git-p4-block.sh b/t/t9818-git-p4-block.sh index de591d875c2bbc..091bb72bdb62bd 100755 --- a/t/t9818-git-p4-block.sh +++ b/t/t9818-git-p4-block.sh @@ -2,6 +2,7 @@ test_description='git p4 fetching changes in multiple blocks' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d' ' diff --git a/t/t9819-git-p4-case-folding.sh b/t/t9819-git-p4-case-folding.sh index b4d93f0c17c357..985be203571620 100755 --- a/t/t9819-git-p4-case-folding.sh +++ b/t/t9819-git-p4-case-folding.sh @@ -2,6 +2,7 @@ test_description='interaction with P4 case-folding' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh if test_have_prereq CASE_INSENSITIVE_FS diff --git a/t/t9820-git-p4-editor-handling.sh b/t/t9820-git-p4-editor-handling.sh index fa1bba1dd93614..48e4dfb95c03c3 100755 --- a/t/t9820-git-p4-editor-handling.sh +++ b/t/t9820-git-p4-editor-handling.sh @@ -2,6 +2,7 @@ test_description='git p4 handling of EDITOR' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d' ' diff --git a/t/t9821-git-p4-path-variations.sh b/t/t9821-git-p4-path-variations.sh index ef80f1690bcb9a..49691c53dadddb 100755 --- a/t/t9821-git-p4-path-variations.sh +++ b/t/t9821-git-p4-path-variations.sh @@ -2,6 +2,7 @@ test_description='Clone repositories with path case variations' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d with case folding enabled' ' diff --git a/t/t9822-git-p4-path-encoding.sh b/t/t9822-git-p4-path-encoding.sh index 572d395498e30a..e62ed49f51e022 100755 --- a/t/t9822-git-p4-path-encoding.sh +++ b/t/t9822-git-p4-path-encoding.sh @@ -2,6 +2,7 @@ test_description='Clone repositories with non ASCII paths' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh UTF8_ESCAPED="a-\303\244_o-\303\266_u-\303\274.txt" diff --git a/t/t9823-git-p4-mock-lfs.sh b/t/t9823-git-p4-mock-lfs.sh index 88b76dc4d6c26f..98a40d8af3793b 100755 --- a/t/t9823-git-p4-mock-lfs.sh +++ b/t/t9823-git-p4-mock-lfs.sh @@ -2,6 +2,7 @@ test_description='Clone repositories and store files in Mock LFS' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_file_is_not_in_mock_lfs () { diff --git a/t/t9825-git-p4-handle-utf16-without-bom.sh b/t/t9825-git-p4-handle-utf16-without-bom.sh index 6a60b323493913..d0b86537dd975a 100755 --- a/t/t9825-git-p4-handle-utf16-without-bom.sh +++ b/t/t9825-git-p4-handle-utf16-without-bom.sh @@ -2,6 +2,7 @@ test_description='git p4 handling of UTF-16 files without BOM' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh UTF16="\227\000\227\000" diff --git a/t/t9826-git-p4-keep-empty-commits.sh b/t/t9826-git-p4-keep-empty-commits.sh index fd64afe064e5a9..54083f842e9259 100755 --- a/t/t9826-git-p4-keep-empty-commits.sh +++ b/t/t9826-git-p4-keep-empty-commits.sh @@ -2,6 +2,7 @@ test_description='Clone repositories and keep empty commits' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d' ' diff --git a/t/t9827-git-p4-change-filetype.sh b/t/t9827-git-p4-change-filetype.sh index d3670bd7a24dbf..3476ea2fd3b577 100755 --- a/t/t9827-git-p4-change-filetype.sh +++ b/t/t9827-git-p4-change-filetype.sh @@ -2,6 +2,7 @@ test_description='git p4 support for file type change' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d' ' diff --git a/t/t9828-git-p4-map-user.sh b/t/t9828-git-p4-map-user.sh index ca6c2942bdf200..7c8f9e3930406d 100755 --- a/t/t9828-git-p4-map-user.sh +++ b/t/t9828-git-p4-map-user.sh @@ -2,6 +2,7 @@ test_description='Clone repositories and map users' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d' ' diff --git a/t/t9829-git-p4-jobs.sh b/t/t9829-git-p4-jobs.sh index 88cfb1fcd3f0a1..3fc0948d9cff20 100755 --- a/t/t9829-git-p4-jobs.sh +++ b/t/t9829-git-p4-jobs.sh @@ -2,6 +2,7 @@ test_description='git p4 retrieve job info' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d' ' diff --git a/t/t9830-git-p4-symlink-dir.sh b/t/t9830-git-p4-symlink-dir.sh index 3fb6960c18fc0c..02561a7f0e62f7 100755 --- a/t/t9830-git-p4-symlink-dir.sh +++ b/t/t9830-git-p4-symlink-dir.sh @@ -2,6 +2,7 @@ test_description='git p4 symlinked directories' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d' ' diff --git a/t/t9831-git-p4-triggers.sh b/t/t9831-git-p4-triggers.sh index ff6c0352e68824..f287f41e3741e1 100755 --- a/t/t9831-git-p4-triggers.sh +++ b/t/t9831-git-p4-triggers.sh @@ -2,6 +2,7 @@ test_description='git p4 with server triggers' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d' ' diff --git a/t/t9832-unshelve.sh b/t/t9832-unshelve.sh index 6b3cb0414aa08c..a2667754086a5c 100755 --- a/t/t9832-unshelve.sh +++ b/t/t9832-unshelve.sh @@ -6,6 +6,7 @@ last_shelved_change () { test_description='git p4 unshelve' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d' ' diff --git a/t/t9833-errors.sh b/t/t9833-errors.sh index e22369ccdf5f15..da1d30c142c4fa 100755 --- a/t/t9833-errors.sh +++ b/t/t9833-errors.sh @@ -2,6 +2,7 @@ test_description='git p4 errors' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d' ' diff --git a/t/t9834-git-p4-file-dir-bug.sh b/t/t9834-git-p4-file-dir-bug.sh index dac67e89d7d720..565870fc740cb2 100755 --- a/t/t9834-git-p4-file-dir-bug.sh +++ b/t/t9834-git-p4-file-dir-bug.sh @@ -6,6 +6,7 @@ This test creates files and directories with the same name in perforce and checks that git-p4 recovers from the error at the same time as the perforce repository.' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh test_expect_success 'start p4d' ' diff --git a/t/t9835-git-p4-metadata-encoding-python2.sh b/t/t9835-git-p4-metadata-encoding-python2.sh index 036bf79c6674f6..ad20ffdededdc9 100755 --- a/t/t9835-git-p4-metadata-encoding-python2.sh +++ b/t/t9835-git-p4-metadata-encoding-python2.sh @@ -6,6 +6,7 @@ This test checks that the import process handles inconsistent text encoding in p4 metadata (author names, commit messages, etc) without failing, and produces maximally sane output in git.' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh python_target_version='2' diff --git a/t/t9836-git-p4-metadata-encoding-python3.sh b/t/t9836-git-p4-metadata-encoding-python3.sh index 63350dc4b5c626..71ae763399c084 100755 --- a/t/t9836-git-p4-metadata-encoding-python3.sh +++ b/t/t9836-git-p4-metadata-encoding-python3.sh @@ -6,6 +6,7 @@ This test checks that the import process handles inconsistent text encoding in p4 metadata (author names, commit messages, etc) without failing, and produces maximally sane output in git.' +TEST_PASSES_SANITIZE_LEAK=true . ./lib-git-p4.sh python_target_version='3' From b201316835bbf2c49c2780f23cfd6146f6b8d1a2 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Thu, 1 Aug 2024 04:25:56 -0400 Subject: [PATCH 031/297] credential/osxkeychain: respect NUL terminator in username This patch fixes a case where git-credential-osxkeychain might output uninitialized bytes to stdout. We need to get the username string from a system API using CFStringGetCString(). To do that, we get the max size for the string from CFStringGetMaximumSizeForEncoding(), allocate a buffer based on that, and then read into it. But then we print the entire buffer to stdout, including the trailing NUL and any extra bytes which were not needed. Instead, we should stop at the NUL. This code comes from 9abe31f5f1 (osxkeychain: replace deprecated SecKeychain API, 2024-02-17). The bug was probably overlooked back then because this code is only used as a fallback when we can't get the string via CFStringGetCStringPtr(). According to Apple's documentation: Whether or not this function returns a valid pointer or NULL depends on many factors, all of which depend on how the string was created and its properties. So it's not clear how we could make a test for this, and we'll have to rely on manually testing on a system that triggered the bug in the first place. Reported-by: Hong Jiang Signed-off-by: Jeff King Tested-by: Hong Jiang Signed-off-by: Junio C Hamano --- contrib/credential/osxkeychain/git-credential-osxkeychain.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/contrib/credential/osxkeychain/git-credential-osxkeychain.c b/contrib/credential/osxkeychain/git-credential-osxkeychain.c index 6a40917b1ef866..b5346005a46a36 100644 --- a/contrib/credential/osxkeychain/git-credential-osxkeychain.c +++ b/contrib/credential/osxkeychain/git-credential-osxkeychain.c @@ -140,7 +140,7 @@ static void find_username_in_item(CFDictionaryRef item) username_buf, buffer_len, ENCODING)) { - write_item("username", username_buf, buffer_len - 1); + write_item("username", username_buf, strlen(username_buf)); } free(username_buf); } From 615d2de3b457272216d4179ceb82b3b2b86b1929 Mon Sep 17 00:00:00 2001 From: Taylor Blau Date: Thu, 1 Aug 2024 13:06:54 -0400 Subject: [PATCH 032/297] config.c: avoid segfault with --fixed-value and valueless config When using `--fixed-value` with a key whose value is left empty (implied as being "true"), 'git config' may crash when invoked like either of: $ git config set --file=config --value=value --fixed-value \ section.key pattern $ git config --file=config --fixed-value section.key value pattern The original bugreport[1] bisects to 00bbdde141 (builtin/config: introduce "set" subcommand, 2024-05-06), which is a red-herring, since the original bugreport uses the new 'git config set' invocation. The behavior likely bisects back to c90702a1f6 (config: plumb --fixed-value into config API, 2020-11-25), which introduces the new --fixed-value option in the first place. Looking at the relevant frame from a failed process's coredump, the crash appears in config.c::matches() like so: (gdb) up #1 0x000055b3e8b06022 in matches (key=0x55b3ea894360 "section.key", value=0x0, store=0x7ffe99076eb0) at config.c:2884 2884 return !strcmp(store->fixed_value, value); where we are trying to compare the `--fixed-value` argument to `value`, which is NULL. Avoid attempting to match `--fixed-value` for configuration keys with no explicit value. A future patch could consider the empty value to mean "true", "yes", "on", etc. when invoked with `--type=bool`, but let's punt on that for now in the name of avoiding the segfault. [1]: https://lore.kernel.org/git/CANrWfmTek1xErBLrnoyhHN+gWU+rw14y6SQ+abZyzGoaBjmiKA@mail.gmail.com/ Reported-by: Han Jiang Signed-off-by: Taylor Blau Signed-off-by: Junio C Hamano --- config.c | 2 +- t/t1300-config.sh | 9 +++++++++ 2 files changed, 10 insertions(+), 1 deletion(-) diff --git a/config.c b/config.c index aa2888d301a35f..53ecb2f32df30a 100644 --- a/config.c +++ b/config.c @@ -2876,7 +2876,7 @@ static int matches(const char *key, const char *value, { if (strcmp(key, store->key)) return 0; /* not ours */ - if (store->fixed_value) + if (store->fixed_value && value) return !strcmp(store->fixed_value, value); if (!store->value_pattern) return 1; /* always matches */ diff --git a/t/t1300-config.sh b/t/t1300-config.sh index f1d42b62b0585c..127671aa3f2618 100755 --- a/t/t1300-config.sh +++ b/t/t1300-config.sh @@ -2434,6 +2434,15 @@ test_expect_success '--get and --get-all with --fixed-value' ' test_must_fail git config --file=config --get-regexp --fixed-value fixed+ non-existent ' +test_expect_success '--fixed-value with value-less configuration' ' + test_when_finished rm -f config && + cat >config <<-\EOF && + [section] + key + EOF + git config --file=config --fixed-value section.key value pattern +' + test_expect_success 'includeIf.hasconfig:remote.*.url' ' git init hasremoteurlTest && test_when_finished "rm -rf hasremoteurlTest" && From 7c7516b8db8b2e6d03379c0e81292cc11e7836bf Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Thu, 1 Aug 2024 11:51:12 -0700 Subject: [PATCH 033/297] t0018: remove leftover debugging cruft The actual file is copied out to /tmp, presumably so that the tester can inspect it after the test is done, which may have been a useful debugging aid. But in the final shape of the test suite, such a code should not exist. We cannot even assume that we are allowed to write into /tmp (our TMPDIR may not even be pointing at it) or read from it for that matter. Noticed-by: Randall S. Becker Signed-off-by: Junio C Hamano --- t/t0018-advice.sh | 1 - 1 file changed, 1 deletion(-) diff --git a/t/t0018-advice.sh b/t/t0018-advice.sh index b02448ea16353d..040a08be07d2e0 100755 --- a/t/t0018-advice.sh +++ b/t/t0018-advice.sh @@ -93,7 +93,6 @@ EOF >README && GIT_ADVICE=true git status ) >actual && - cat actual > /tmp/actual && test_cmp expect actual ' From 9e89dcb66a70902fef966083998a75272d47d998 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Fri, 2 Aug 2024 06:44:11 +0200 Subject: [PATCH 034/297] builtin/ls-remote: fall back to SHA1 outside of a repo In c8aed5e8da (repository: stop setting SHA1 as the default object hash, 2024-05-07), we have stopped setting the default hash algorithm for `the_repository`. Consequently, code that relies on `the_hash_algo` will now crash when it hasn't explicitly been initialized, which may be the case when running outside of a Git repository. It was reported that git-ls-remote(1) may crash in such a way when using a remote helper that advertises refspecs. This is because the refspec announced by the helper will get parsed during capability negotiation. At that point we haven't yet figured out what object format the remote uses though, so when run outside of a repository then we will fail. The course of action is somewhat dubious in the first place. Ideally, we should only parse object IDs once we have asked the remote helper for the object format. And if the helper didn't announce the "object-format" capability, then we should always assume SHA256. But instead, we used to take either SHA1 if there was no repository, or we used the hash of the local repository, which is wrong. Arguably though, crashing hard may not be in the best interest of our users, either. So while the old behaviour was buggy, let's restore it for now as a short-term fix. We should eventually revisit, potentially by deferring the point in time when we parse the refspec until after we have figured out the remote's object hash. Reported-by: Mike Hommey Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- builtin/ls-remote.c | 15 +++++++++++++++ t/t5512-ls-remote.sh | 13 +++++++++++++ 2 files changed, 28 insertions(+) diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c index debf2d4f887ed2..6da63a67f569eb 100644 --- a/builtin/ls-remote.c +++ b/builtin/ls-remote.c @@ -91,6 +91,21 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix) PARSE_OPT_STOP_AT_NON_OPTION); dest = argv[0]; + /* + * TODO: This is buggy, but required for transport helpers. When a + * transport helper advertises a "refspec", then we'd add that to a + * list of refspecs via `refspec_append()`, which transitively depends + * on `the_hash_algo`. Thus, when the hash algorithm isn't properly set + * up, this would lead to a segfault. + * + * We really should fix this in the transport helper logic such that we + * lazily parse refspec capabilities _after_ we have learned about the + * remote's object format. Otherwise, we may end up misparsing refspecs + * depending on what object hash the remote uses. + */ + if (!the_repository->hash_algo) + repo_set_hash_algo(the_repository, GIT_HASH_SHA1); + packet_trace_identity("ls-remote"); if (argc > 1) { diff --git a/t/t5512-ls-remote.sh b/t/t5512-ls-remote.sh index 42e77eb5a98adc..bc442ec221da93 100755 --- a/t/t5512-ls-remote.sh +++ b/t/t5512-ls-remote.sh @@ -402,4 +402,17 @@ test_expect_success 'v0 clients can handle multiple symrefs' ' test_cmp expect actual ' +test_expect_success 'helper with refspec capability fails gracefully' ' + mkdir test-bin && + write_script test-bin/git-remote-foo <<-EOF && + echo import + echo refspec ${SQ}*:*${SQ} + EOF + ( + PATH="$PWD/test-bin:$PATH" && + export PATH && + test_must_fail nongit git ls-remote foo::bar + ) +' + test_done From e95d515141648e4067dbda8af8516e1c718c8f65 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Mon, 5 Aug 2024 02:00:10 -0400 Subject: [PATCH 035/297] apply: canonicalize modes read from patches Git stores only canonical modes for blobs. So for a regular file, we care about only "100644" or "100755" (depending only on the executable bit), but never modes where the group or other permissions are more exotic. So never "100664", "100700", etc. When a file in the working tree has such a mode, we quietly turn it into one of the two canonical modes, and that's what is stored both in the index and in tree objects. However, we don't canonicalize modes we read from incoming patches in git-apply. These may appear in a few lines: - "old mode" / "new mode" lines for mode changes - "new file mode" lines for newly created files - "deleted file mode" for removing files For "new mode" and for "new file mode", this is harmless. The patch is asking the result to have a certain mode, but: - when we add an index entry (for --index or --cached), it is canonicalized as we create the entry, via create_ce_mode(). - for a working tree file, try_create_file() passes either 0777 or 0666 to open(), so what you get depends only on your umask, not any other bits (aside from the executable bit) in the original mode. However, for "old mode" and "deleted file mode", there is a minor annoyance. We compare the patch's expected preimage mode with the current state. But that current state is always going to be a canonical mode itself: - updating an index entry via --cached will have the canonical mode in the index - for updating a working tree file, check_preimage() runs the mode through ce_mode_from_stat(), which does the usual canonicalization So if the patch feeds a non-canonical mode, it's impossible for it to match, and we will always complain with something like: file has type 100644, expected 100664 Since this is just a warning, the operation proceeds, but it's confusing and annoying. These cases should be pretty rare in practice. Git would never produce a patch with non-canonical modes itself (since it doesn't store them). And while we do accept patches from other programs, all of those lines were invented by Git. So you'd need a program trying to be Git compatible, but not handling canonicalization the same way. Reportedly "quilt" is such a program. We should canonicalize the modes as we read them so that the user never sees the useless warning. A few notes on the tests: - I've covered instances of all lines for completeness, even though the "new mode" / "new file mode" ones behave OK currently. - the tests apply patches to both the index and working tree, and check the result of both. Again, we know that all of these paths canonicalize anyway, but it's giving us extra coverage (although we are even less likely to have such a bug now since we canonicalize up front). - the test patches are missing "index" lines, which is also something Git would never produce. But they don't matter for the test, they do match the case from quilt we saw in the wild, and they avoid some sha1/sha256 complexity. Reported-by: Andrew Morton Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- apply.c | 1 + t/t4129-apply-samemode.sh | 62 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 63 insertions(+) diff --git a/apply.c b/apply.c index 901b67e6255af4..bfa80f6584da59 100644 --- a/apply.c +++ b/apply.c @@ -992,6 +992,7 @@ static int parse_mode_line(const char *line, int linenr, unsigned int *mode) *mode = strtoul(line, &end, 8); if (end == line || !isspace(*end)) return error(_("invalid mode on line %d: %s"), linenr, line); + *mode = canon_mode(*mode); return 0; } diff --git a/t/t4129-apply-samemode.sh b/t/t4129-apply-samemode.sh index 4eb84440298552..d9a1084b5e6f2a 100755 --- a/t/t4129-apply-samemode.sh +++ b/t/t4129-apply-samemode.sh @@ -130,4 +130,66 @@ test_expect_success 'git apply respects core.fileMode' ' test_grep ! "has type 100644, expected 100755" err ' +test_expect_success POSIXPERM 'patch mode for new file is canonicalized' ' + cat >patch <<-\EOF && + diff --git a/non-canon b/non-canon + new file mode 100660 + --- /dev/null + +++ b/non-canon + +content + EOF + test_when_finished "git reset --hard" && + ( + umask 0 && + git apply --index patch 2>err + ) && + test_must_be_empty err && + git ls-files -s -- non-canon >staged && + test_grep "^100644" staged && + ls -l non-canon >worktree && + test_grep "^-rw-rw-rw" worktree +' + +test_expect_success POSIXPERM 'patch mode for deleted file is canonicalized' ' + test_when_finished "git reset --hard" && + echo content >non-canon && + git add non-canon && + chmod 666 non-canon && + + cat >patch <<-\EOF && + diff --git a/non-canon b/non-canon + deleted file mode 100660 + --- a/non-canon + +++ /dev/null + @@ -1 +0,0 @@ + -content + EOF + git apply --index patch 2>err && + test_must_be_empty err && + git ls-files -- non-canon >staged && + test_must_be_empty staged && + test_path_is_missing non-canon +' + +test_expect_success POSIXPERM 'patch mode for mode change is canonicalized' ' + test_when_finished "git reset --hard" && + echo content >non-canon && + git add non-canon && + + cat >patch <<-\EOF && + diff --git a/non-canon b/non-canon + old mode 100660 + new mode 100770 + EOF + ( + umask 0 && + git apply --index patch 2>err + ) && + test_must_be_empty err && + git ls-files -s -- non-canon >staged && + test_grep "^100755" staged && + ls -l non-canon >worktree && + test_grep "^-rwxrwxrwx" worktree +' + test_done From e2e373ba82d2cb19b60e91318626bda78eba7872 Mon Sep 17 00:00:00 2001 From: Sven Strickroth Date: Mon, 5 Aug 2024 09:53:32 +0000 Subject: [PATCH 036/297] refs/files: prevent memory leak by freeing packed_ref_store This complements 64a6dd8ffc (refs: implement removal of ref storages, 2024-06-06). Signed-off-by: Sven Strickroth Acked-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- refs/files-backend.c | 1 + 1 file changed, 1 insertion(+) diff --git a/refs/files-backend.c b/refs/files-backend.c index e6637811992f01..48f6717d9a5d5a 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -155,6 +155,7 @@ static void files_ref_store_release(struct ref_store *ref_store) free_ref_cache(refs->loose); free(refs->gitcommondir); ref_store_release(refs->packed_ref_store); + free(refs->packed_ref_store); } static void files_reflog_path(struct files_ref_store *refs, From 0c4d5aa22d65fb395962bdc0165840f66b7b91ea Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ren=C3=A9=20Scharfe?= Date: Sat, 3 Aug 2024 14:33:24 +0200 Subject: [PATCH 037/297] log-tree: use decimal_width() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Reduce code duplication by calling decimal_width() to count the digits in the number of commits instead of calculating it locally. It also has the advantage of returning int, which is the exact type expected by the printf()-like function strbuf_addf() for field width arguments. Additionally, decimal_width() supports numbers bigger than 1410065407, which is (hopefully) just a theoretical advantage. Signed-off-by: René Scharfe Signed-off-by: Junio C Hamano --- log-tree.c | 13 ++----------- 1 file changed, 2 insertions(+), 11 deletions(-) diff --git a/log-tree.c b/log-tree.c index 337b9334cdbafe..0b0c63d5c2a90d 100644 --- a/log-tree.c +++ b/log-tree.c @@ -29,6 +29,7 @@ #include "tree.h" #include "wildmatch.h" #include "write-or-die.h" +#include "pager.h" static struct decoration name_decoration = { "object names" }; static int decoration_loaded; @@ -406,16 +407,6 @@ void show_decorations(struct rev_info *opt, struct commit *commit) strbuf_release(&sb); } -static unsigned int digits_in_number(unsigned int number) -{ - unsigned int i = 10, result = 1; - while (i <= number) { - i *= 10; - result++; - } - return result; -} - void fmt_output_subject(struct strbuf *filename, const char *subject, struct rev_info *info) @@ -459,7 +450,7 @@ void fmt_output_email_subject(struct strbuf *sb, struct rev_info *opt) strbuf_addf(sb, "Subject: [%s%s%0*d/%d] ", opt->subject_prefix, *opt->subject_prefix ? " " : "", - digits_in_number(opt->total), + decimal_width(opt->total), opt->nr, opt->total); } else if (opt->total == 0 && opt->subject_prefix && *opt->subject_prefix) { strbuf_addf(sb, "Subject: [%s] ", From b928d57ca9aa7457ec0dee022c1664e8cd606b22 Mon Sep 17 00:00:00 2001 From: Kyle Lippincott Date: Mon, 5 Aug 2024 17:10:07 +0000 Subject: [PATCH 038/297] set errno=0 before strtoX calls To detect conversion failure after calls to functions like `strtod`, one can check `errno == ERANGE`. These functions are not guaranteed to set `errno` to `0` on successful conversion, however. Manual manipulation of `errno` can likely be avoided by checking that the output pointer differs from the input pointer, but that's not how other locations, such as parse.c:139, handle this issue; they set errno to 0 prior to executing the function. For every place I could find a strtoX function with an ERANGE check following it, set `errno = 0;` prior to executing the conversion function. Signed-off-by: Kyle Lippincott Signed-off-by: Junio C Hamano --- builtin/get-tar-commit-id.c | 1 + ref-filter.c | 1 + t/helper/test-json-writer.c | 2 ++ t/helper/test-trace2.c | 1 + 4 files changed, 5 insertions(+) diff --git a/builtin/get-tar-commit-id.c b/builtin/get-tar-commit-id.c index 66a7389f9f4a21..7195a072edc3d6 100644 --- a/builtin/get-tar-commit-id.c +++ b/builtin/get-tar-commit-id.c @@ -35,6 +35,7 @@ int cmd_get_tar_commit_id(int argc, const char **argv UNUSED, const char *prefix if (header->typeflag[0] != TYPEFLAG_GLOBAL_HEADER) return 1; + errno = 0; len = strtol(content, &end, 10); if (errno == ERANGE || end == content || len < 0) return 1; diff --git a/ref-filter.c b/ref-filter.c index 8c5e673fc0a521..54880a2497aa90 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -1628,6 +1628,7 @@ static void grab_date(const char *buf, struct atom_value *v, const char *atomnam timestamp = parse_timestamp(eoemail + 2, &zone, 10); if (timestamp == TIME_MAX) goto bad; + errno = 0; tz = strtol(zone, NULL, 10); if ((tz == LONG_MIN || tz == LONG_MAX) && errno == ERANGE) goto bad; diff --git a/t/helper/test-json-writer.c b/t/helper/test-json-writer.c index ed52eb76bfcbb2..a288069b04cb3b 100644 --- a/t/helper/test-json-writer.c +++ b/t/helper/test-json-writer.c @@ -415,6 +415,7 @@ static void get_i(struct line *line, intmax_t *s_in) get_s(line, &s); + errno = 0; *s_in = strtol(s, &endptr, 10); if (*endptr || errno == ERANGE) die("line[%d]: invalid integer value", line->nr); @@ -427,6 +428,7 @@ static void get_d(struct line *line, double *s_in) get_s(line, &s); + errno = 0; *s_in = strtod(s, &endptr); if (*endptr || errno == ERANGE) die("line[%d]: invalid float value", line->nr); diff --git a/t/helper/test-trace2.c b/t/helper/test-trace2.c index cd955ec63e90fd..c588c273ce73f1 100644 --- a/t/helper/test-trace2.c +++ b/t/helper/test-trace2.c @@ -26,6 +26,7 @@ static int get_i(int *p_value, const char *data) if (!data || !*data) return MyError; + errno = 0; *p_value = strtol(data, &endptr, 10); if (*endptr || errno == ERANGE) return MyError; From ec60bb9fc416e0ded2336e8923c8249c661bfc80 Mon Sep 17 00:00:00 2001 From: Kyle Lippincott Date: Mon, 5 Aug 2024 17:10:08 +0000 Subject: [PATCH 039/297] t6421: fix test to work when repo dir contains d0 The `grep` statement in this test looks for `d0.*`, attempting to filter to only show lines that had tabular output where the 2nd column had `d0` and the final column had a substring of [`git -c `]`fetch.negotiationAlgorithm`. These lines also have `child_start` in the 4th column, but this isn't part of the condition. A subsequent line will have `d1` in the 2nd column, `start` in the 4th column, and `/path/to/git/git -c fetch.negotiationAlgorihm` in the final column. If `/path/to/git/git` contains the substring `d0`, then this line is included by `grep` as well as the desired line, leading to an effective doubling of the number of lines, and test failures. Tighten the grep expression to require `d0` to be surrounded by spaces, and to have the `child_start` label. Signed-off-by: Kyle Lippincott Signed-off-by: Junio C Hamano --- t/t6421-merge-partial-clone.sh | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/t/t6421-merge-partial-clone.sh b/t/t6421-merge-partial-clone.sh index 711b709e755268..b99f29ef9baded 100755 --- a/t/t6421-merge-partial-clone.sh +++ b/t/t6421-merge-partial-clone.sh @@ -230,8 +230,9 @@ test_expect_merge_algorithm failure success 'Objects downloaded for single relev grep fetch_count trace.output | cut -d "|" -f 9 | tr -d " ." >actual && test_cmp expect actual && - # Check the number of fetch commands exec-ed - grep d0.*fetch.negotiationAlgorithm trace.output >fetches && + # Check the number of fetch commands exec-ed by filtering trace to + # child_start events by the top-level program (2nd field == d0) + grep " d0 .* child_start .*fetch.negotiationAlgorithm" trace.output >fetches && test_line_count = 2 fetches && git rev-list --objects --all --missing=print | @@ -318,8 +319,9 @@ test_expect_merge_algorithm failure success 'Objects downloaded when a directory grep fetch_count trace.output | cut -d "|" -f 9 | tr -d " ." >actual && test_cmp expect actual && - # Check the number of fetch commands exec-ed - grep d0.*fetch.negotiationAlgorithm trace.output >fetches && + # Check the number of fetch commands exec-ed by filtering trace to + # child_start events by the top-level program (2nd field == d0) + grep " d0 .* child_start .*fetch.negotiationAlgorithm" trace.output >fetches && test_line_count = 1 fetches && git rev-list --objects --all --missing=print | @@ -422,8 +424,9 @@ test_expect_merge_algorithm failure success 'Objects downloaded with lots of ren grep fetch_count trace.output | cut -d "|" -f 9 | tr -d " ." >actual && test_cmp expect actual && - # Check the number of fetch commands exec-ed - grep d0.*fetch.negotiationAlgorithm trace.output >fetches && + # Check the number of fetch commands exec-ed by filtering trace to + # child_start events by the top-level program (2nd field == d0) + grep " d0 .* child_start .*fetch.negotiationAlgorithm" trace.output >fetches && test_line_count = 4 fetches && git rev-list --objects --all --missing=print | From 0d66f601a9f82a6f3b4240cffa2b02ed5393f1ee Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Thu, 8 Aug 2024 14:19:25 -0700 Subject: [PATCH 040/297] tests: drop use of 'tee' that hides exit status A few tests have "| tee output" downstream of a git command, and then inspect the contents of the file. The net effect is that we use an extra process, and hide the exit status from the upstream git command. In any of these tests, I do not see a reason why we want to hide a possible failure from these git commands. Replace the use of tee with a plain simple redirection. Signed-off-by: Junio C Hamano --- t/t1001-read-tree-m-2way.sh | 2 +- t/t5523-push-upstream.sh | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/t/t1001-read-tree-m-2way.sh b/t/t1001-read-tree-m-2way.sh index 3fb1b0c162ded2..9dc6eb3e4c17c7 100755 --- a/t/t1001-read-tree-m-2way.sh +++ b/t/t1001-read-tree-m-2way.sh @@ -397,7 +397,7 @@ test_expect_success 'a/b vs a, plus c/d case setup.' ' test_expect_success 'a/b vs a, plus c/d case test.' ' read_tree_u_must_succeed -u -m "$treeH" "$treeM" && - git ls-files --stage | tee >treeMcheck.out && + git ls-files --stage >treeMcheck.out && test_cmp treeM.out treeMcheck.out ' diff --git a/t/t5523-push-upstream.sh b/t/t5523-push-upstream.sh index c9acc076353a6a..77c4075fbd1fe4 100755 --- a/t/t5523-push-upstream.sh +++ b/t/t5523-push-upstream.sh @@ -116,14 +116,14 @@ test_expect_success TTY 'push --no-progress suppresses progress' ' test_expect_success TTY 'quiet push' ' ensure_fresh_upstream && - test_terminal git push --quiet --no-progress upstream main 2>&1 | tee output && + test_terminal git push --quiet --no-progress upstream main >output 2>&1 && test_must_be_empty output ' test_expect_success TTY 'quiet push -u' ' ensure_fresh_upstream && - test_terminal git push --quiet -u --no-progress upstream main 2>&1 | tee output && + test_terminal git push --quiet -u --no-progress upstream main >output 2>&1 && test_must_be_empty output ' From a77554ea0972e4ec3518fb8fb0b2582c76cf3105 Mon Sep 17 00:00:00 2001 From: Xing Xin Date: Fri, 9 Aug 2024 07:24:52 +0000 Subject: [PATCH 041/297] diff-tree: fix crash when used with --remerge-diff When using "git-diff-tree" to get the tree diff for merge commits with the diff format set to `remerge`, a bug is triggered as shown below: $ git diff-tree -r --remerge-diff 363337e6eb 363337e6eb812d0c0d785ed4261544f35559ff8b BUG: log-tree.c:1006: did a remerge diff without remerge_objdir?!? This bug is reported by `log-tree.c:do_remerge_diff`, where a bug check added in commit 7b90ab467a (log: clean unneeded objects during log --remerge-diff, 2022-02-02) detects the absence of `remerge_objdir` when attempting to clean up temporary objects generated during the remerge process. After some further digging, I find that the remerge-related diff options were introduced in db757e8b8d (show, log: provide a --remerge-diff capability, 2022-02-02), which also affect the setup of `rev_info` for "git-diff-tree", but were not accounted for in the original implementation (inferred from the commit message). Elijah Newren, the author of the remerge diff feature, notes that other callers of `log-tree.c:log_tree_commit` (the only caller of `log-tree.c:do_remerge_diff`) also exist, but: `builtin/am.c`: manually sets all flags; remerge_diff is not among them `sequencer.c`: manually sets all flags; remerge_diff is not among them so `builtin/diff-tree.c` really is the only caller that was overlooked when remerge-diff functionality was added. This commit resolves the crash by adding `remerge_objdir` setup logic to `builtin/diff-tree.c`, mirroring `builtin/log.c:cmd_log_walk_no_free`. It also includes the necessary cleanup for `remerge_objdir`. Reviewed-by: Elijah Newren Signed-off-by: Xing Xin Signed-off-by: Junio C Hamano --- builtin/diff-tree.c | 13 +++++++++++++ t/t4069-remerge-diff.sh | 35 +++++++++++++++++++++++++++++++++++ 2 files changed, 48 insertions(+) diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c index a8e68ce8ef6275..12012ea0939fd5 100644 --- a/builtin/diff-tree.c +++ b/builtin/diff-tree.c @@ -9,6 +9,7 @@ #include "read-cache-ll.h" #include "repository.h" #include "revision.h" +#include "tmp-objdir.h" #include "tree.h" static struct rev_info log_tree_opt; @@ -167,6 +168,13 @@ int cmd_diff_tree(int argc, const char **argv, const char *prefix) opt->diffopt.rotate_to_strict = 1; + if (opt->remerge_diff) { + opt->remerge_objdir = tmp_objdir_create("remerge-diff"); + if (!opt->remerge_objdir) + die(_("unable to create temporary object directory")); + tmp_objdir_replace_primary_odb(opt->remerge_objdir, 1); + } + /* * NOTE! We expect "a..b" to expand to "^a b" but it is * perfectly valid for revision range parser to yield "b ^a", @@ -231,5 +239,10 @@ int cmd_diff_tree(int argc, const char **argv, const char *prefix) diff_free(&opt->diffopt); } + if (opt->remerge_diff) { + tmp_objdir_destroy(opt->remerge_objdir); + opt->remerge_objdir = NULL; + } + return diff_result_code(&opt->diffopt); } diff --git a/t/t4069-remerge-diff.sh b/t/t4069-remerge-diff.sh index 07323ebafe0d0c..ca8f999caba54d 100755 --- a/t/t4069-remerge-diff.sh +++ b/t/t4069-remerge-diff.sh @@ -110,6 +110,41 @@ test_expect_success 'can filter out additional headers with pickaxe' ' test_must_be_empty actual ' +test_expect_success 'remerge-diff also works for git-diff-tree' ' + # With a clean merge + git diff-tree -r -p --remerge-diff --no-commit-id bc_resolution >actual && + test_must_be_empty actual && + + # With both a resolved conflict and an unrelated change + cat <<-EOF >tmp && + diff --git a/numbers b/numbers + remerge CONFLICT (content): Merge conflict in numbers + index a1fb731..6875544 100644 + --- a/numbers + +++ b/numbers + @@ -1,13 +1,9 @@ + 1 + 2 + -<<<<<<< b0ed5cb (change_a) + -three + -======= + -tres + ->>>>>>> 6cd3f82 (change_b) + +drei + 4 + 5 + 6 + 7 + -eight + +acht + 9 + EOF + sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >expect && + git diff-tree -r -p --remerge-diff --no-commit-id ab_resolution >tmp && + sed -e "s/[0-9a-f]\{7,\}/HASH/g" tmp >actual && + test_cmp expect actual +' + test_expect_success 'setup non-content conflicts' ' git switch --orphan base && From 9a91f7a4de430235fe401eb8b598724f70bd70b8 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Fri, 9 Aug 2024 10:13:49 -0700 Subject: [PATCH 042/297] tutorial: grammofix We say "these", so "range notations" must be plural. Reported-by: Furkan Akkurt Signed-off-by: Junio C Hamano --- Documentation/gittutorial.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/gittutorial.txt b/Documentation/gittutorial.txt index 0e0b863105b104..8b5b15ce0574a2 100644 --- a/Documentation/gittutorial.txt +++ b/Documentation/gittutorial.txt @@ -361,7 +361,7 @@ $ gitk HEAD...FETCH_HEAD This means "show everything that is reachable from either one, but exclude anything that is reachable from both of them". -Please note that these range notation can be used with both gitk +Please note that these range notations can be used with both gitk and "git log". After inspecting what Bob did, if there is nothing urgent, Alice may From 170cdfc5a41a54014344d25d051e2c7dfb6087da Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Fri, 9 Aug 2024 10:14:12 -0700 Subject: [PATCH 043/297] doc: grammofix in git-diff-tree Describe in present tense what the option does when it is given. Signed-off-by: Junio C Hamano --- Documentation/git-diff-tree.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/git-diff-tree.txt b/Documentation/git-diff-tree.txt index 274d5eaba93dab..59bdc228e4cbcf 100644 --- a/Documentation/git-diff-tree.txt +++ b/Documentation/git-diff-tree.txt @@ -88,7 +88,7 @@ include::pretty-options.txt[] --no-commit-id:: 'git diff-tree' outputs a line with the commit ID when - applicable. This flag suppressed the commit ID output. + applicable. This flag suppresses the commit ID output. -c:: This flag changes the way a merge commit is displayed From 7298bcc573f8e022854b1811a9a7413281da05ea Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Tue, 13 Aug 2024 11:18:08 +0200 Subject: [PATCH 044/297] builtin/bundle: have unbundle check for repo before opening its bundle The `git bundle unbundle` subcommand requires a repository to unbundle the contents into. As thus, the subcommand checks whether we have a startup repository in the first place, and if not it dies. This check happens after we have already opened the bundle though. This causes a segfault when running outside of a repository starting with c8aed5e8da (repository: stop setting SHA1 as the default object hash, 2024-05-07) because we have no hash function set up, but we do try to parse refs advertised by the bundle's header. The next commit will fix that underlying issue by defaulting to the SHA1 object format for bundles, which will also fix the described segfault here. But as we know that we will die anyway, we can do better than that and avoid some vain work by moving the check for a repository before we try to open the bundle. Reported-by: ArcticLampyrid Suggested-by: Jeff King Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- builtin/bundle.c | 5 +++-- t/t6020-bundle-misc.sh | 7 +++++++ 2 files changed, 10 insertions(+), 2 deletions(-) diff --git a/builtin/bundle.c b/builtin/bundle.c index d5d41a8f6722cf..86d0ed7049c653 100644 --- a/builtin/bundle.c +++ b/builtin/bundle.c @@ -207,12 +207,13 @@ static int cmd_bundle_unbundle(int argc, const char **argv, const char *prefix) builtin_bundle_unbundle_usage, options, &bundle_file); /* bundle internals use argv[1] as further parameters */ + if (!startup_info->have_repository) + die(_("Need a repository to unbundle.")); + if ((bundle_fd = open_bundle(bundle_file, &header, NULL)) < 0) { ret = 1; goto cleanup; } - if (!startup_info->have_repository) - die(_("Need a repository to unbundle.")); if (progress) strvec_pushl(&extra_index_pack_args, "-v", "--progress-title", _("Unbundling objects"), NULL); diff --git a/t/t6020-bundle-misc.sh b/t/t6020-bundle-misc.sh index fe75a06572aed6..703434b4720ac8 100755 --- a/t/t6020-bundle-misc.sh +++ b/t/t6020-bundle-misc.sh @@ -652,4 +652,11 @@ test_expect_success 'send a bundle to standard output' ' test_cmp expect actual ' +test_expect_success 'unbundle outside of a repository' ' + git bundle create some.bundle HEAD && + echo "fatal: Need a repository to unbundle." >expect && + nongit test_must_fail git bundle unbundle "$(pwd)/some.bundle" 2>err && + test_cmp expect err +' + test_done From 96a9a3e42e85874ba5edfcf86d91f7d8c05d5f94 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Tue, 13 Aug 2024 11:18:15 +0200 Subject: [PATCH 045/297] bundle: default to SHA1 when reading bundle headers We hit a segfault when trying to open a bundle via `git bundle list-heads` when running outside of a repository. This is caused by c8aed5e8da (repository: stop setting SHA1 as the default object hash, 2024-05-07), which stopped setting the default object hash so that `the_hash_algo` is a `NULL` pointer when running outside of any repo. This is only a symptom of a deeper issue though. Bundles default to the SHA1 object format unless they advertise an "@object-format=" header. Consequently, it has been wrong in the first place to use the object format used by the current repository when parsing bundles. The consequence is that trying to open a bundle that uses a different object hash than the current repository will fail: $ git bundle list-heads sha1.bundle error: unrecognized header: ee4b540943284700a32591ad09f7e15bdeb2a10c HEAD (45) Fix the bug by defaulting to the SHA1 object hash. We already handle the "@object-format=" header as expected, so we don't need to adapt this part. Helped-by: brian m. carlson Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- bundle.c | 7 ++++++- t/t6020-bundle-misc.sh | 25 +++++++++++++++++++++++++ 2 files changed, 31 insertions(+), 1 deletion(-) diff --git a/bundle.c b/bundle.c index ce164c37bc890c..b0a8a925cb1e79 100644 --- a/bundle.c +++ b/bundle.c @@ -89,7 +89,12 @@ int read_bundle_header_fd(int fd, struct bundle_header *header, goto abort; } - header->hash_algo = the_hash_algo; + /* + * The default hash format for bundles is SHA1, unless told otherwise + * by an "object-format=" capability, which is being handled in + * `parse_capability()`. + */ + header->hash_algo = &hash_algos[GIT_HASH_SHA1]; /* The bundle header ends with an empty line */ while (!strbuf_getwholeline_fd(&buf, fd, '\n') && diff --git a/t/t6020-bundle-misc.sh b/t/t6020-bundle-misc.sh index 703434b4720ac8..34b5cd62c201d9 100755 --- a/t/t6020-bundle-misc.sh +++ b/t/t6020-bundle-misc.sh @@ -659,4 +659,29 @@ test_expect_success 'unbundle outside of a repository' ' test_cmp expect err ' +test_expect_success 'list-heads outside of a repository' ' + git bundle create some.bundle HEAD && + cat >expect <<-EOF && + $(git rev-parse HEAD) HEAD + EOF + nongit git bundle list-heads "$(pwd)/some.bundle" >actual && + test_cmp expect actual +' + +for hash in sha1 sha256 +do + test_expect_success "list-heads with bundle using $hash" ' + test_when_finished "rm -rf hash" && + git init --object-format=$hash hash && + test_commit -C hash initial && + git -C hash bundle create hash.bundle HEAD && + + cat >expect <<-EOF && + $(git -C hash rev-parse HEAD) HEAD + EOF + git bundle list-heads hash/hash.bundle >actual && + test_cmp expect actual + ' +done + test_done From 983555a1f262b6a5819cdd235c1f3f9788bb147d Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Wed, 14 Aug 2024 16:03:26 -0700 Subject: [PATCH 046/297] howto-maintain: mention preformatted docs Forgot to mention that the preformatted documentation repositories are updated every time the master branch of the project advances. Signed-off-by: Junio C Hamano --- Documentation/howto/maintain-git.txt | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/Documentation/howto/maintain-git.txt b/Documentation/howto/maintain-git.txt index 41f54050f84e0e..da31332f11398d 100644 --- a/Documentation/howto/maintain-git.txt +++ b/Documentation/howto/maintain-git.txt @@ -181,6 +181,10 @@ by doing the following: $ git diff ORIG_HEAD.. ;# final review $ make test ;# final review + If the tip of 'master' is updated, also generate the preformatted + documentation and push the out result to git-htmldocs and + git-manpages repositories. + - Handle the remaining patches: - Anything unobvious that is applicable to 'master' (in other From 49e5cc5b26550951c2381d959f866db656a97c3c Mon Sep 17 00:00:00 2001 From: Jeff King Date: Thu, 15 Aug 2024 11:30:07 -0400 Subject: [PATCH 047/297] t4129: fix racy index when calling chmod after git-add This patch fixes a racy test failure in t4129. The deletion test added by e95d515141 (apply: canonicalize modes read from patches, 2024-08-05) wants to make sure that git-apply does not complain about a non-canonical mode in the patch, even if that mode does not match the working tree file. So it does this: echo content >non-canon && git add non-canon && chmod 666 non-canon && This is wrong, because running chmod will update the ctime on the file, making it stat-dirty and causing git-apply to refuse to apply the patch. But this only happens sometimes, since it depends on the timestamps crossing a second boundary (but it triggers pretty quickly when run with --stress). We can fix this by doing the chmod before updating the index. The order isn't important here, as the mode will be canonicalized to 100644 in the index anyway (in fact, the chmod is not even that important in the first place, since git-apply will only look at the index; I only added it as an extra confirmation that git-apply would not be confused by it). Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- t/t4129-apply-samemode.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/t/t4129-apply-samemode.sh b/t/t4129-apply-samemode.sh index d9a1084b5e6f2a..87ffd2b8e1a3d3 100755 --- a/t/t4129-apply-samemode.sh +++ b/t/t4129-apply-samemode.sh @@ -153,8 +153,8 @@ test_expect_success POSIXPERM 'patch mode for new file is canonicalized' ' test_expect_success POSIXPERM 'patch mode for deleted file is canonicalized' ' test_when_finished "git reset --hard" && echo content >non-canon && - git add non-canon && chmod 666 non-canon && + git add non-canon && cat >patch <<-\EOF && diff --git a/non-canon b/non-canon From e3209bd4df1f609fb2d730e52d3eea0633dbd786 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Fri, 16 Aug 2024 12:42:25 +0200 Subject: [PATCH 048/297] builtin/stash: fix `--keep-index --include-untracked` with empty HEAD It was reported that creating a stash with `--keep-index --include-untracked` causes an error when HEAD points to a commit whose tree is empty: $ git stash push --keep-index --include-untracked error: pathspec ':/' did not match any file(s) known to git This error comes from `git checkout --no-overlay $i_tree -- :/`, which we execute to reset the working tree to the state in our index. As the tree generated from the index is empty in our case, ':/' does not match any files and thus causes git-checkout(1) to error out. Fix the issue by skipping the checkout when the index tree is empty. As explained in the in-code comment, this should be the correct thing to do as there is nothing that we'd have to reset in the first place. Reported-by: Piotr Siupa Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- builtin/stash.c | 23 ++++++++++++++++++++++- t/t3903-stash.sh | 15 +++++++++++++++ 2 files changed, 37 insertions(+), 1 deletion(-) diff --git a/builtin/stash.c b/builtin/stash.c index 3a4f9fd566d27c..d2b701ef88f755 100644 --- a/builtin/stash.c +++ b/builtin/stash.c @@ -1647,7 +1647,28 @@ static int do_push_stash(const struct pathspec *ps, const char *stash_msg, int q } } - if (keep_index == 1 && !is_null_oid(&info.i_tree)) { + /* + * When keeping staged entries, we need to reset the working + * directory to match the state of our index. This can be + * skipped when the index is the empty tree, because there is + * nothing to reset in that case: + * + * - When the index has any file, regardless of whether + * staged or not, the tree cannot be empty by definition + * and thus we enter the condition. + * + * - When the index has no files, the only thing we need to + * care about is untracked files when `--include-untracked` + * is given. But as we already execute git-clean(1) further + * up to delete such untracked files we don't have to do + * anything here, either. + * + * We thus skip calling git-checkout(1) in this case, also + * because running it on an empty tree will cause it to fail + * due to the pathspec not matching anything. + */ + if (keep_index == 1 && !is_null_oid(&info.i_tree) && + !is_empty_tree_oid(&info.i_tree, the_repository->hash_algo)) { struct child_process cp = CHILD_PROCESS_INIT; cp.git_cmd = 1; diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh index 376cc8f4ab8429..38906785e2a762 100755 --- a/t/t3903-stash.sh +++ b/t/t3903-stash.sh @@ -1384,6 +1384,21 @@ test_expect_success 'stash --keep-index with file deleted in index does not resu test_path_is_missing to-remove ' +test_expect_success 'stash --keep-index --include-untracked with empty tree' ' + test_when_finished "rm -rf empty" && + git init empty && + ( + cd empty && + git commit --allow-empty --message "empty" && + echo content >file && + git stash push --keep-index --include-untracked && + test_path_is_missing file && + git stash pop && + echo content >expect && + test_cmp expect file + ) +' + test_expect_success 'stash apply should succeed with unmodified file' ' echo base >file && git add file && From fa3b914457773d5a0bb02b382862dafd1c4c357e Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Wed, 14 Aug 2024 15:02:29 -0700 Subject: [PATCH 049/297] Prepare for 2.46.1 Signed-off-by: Junio C Hamano --- Documentation/RelNotes/2.46.1.txt | 37 +++++++++++++++++++++++++++++++ GIT-VERSION-GEN | 2 +- RelNotes | 2 +- 3 files changed, 39 insertions(+), 2 deletions(-) create mode 100644 Documentation/RelNotes/2.46.1.txt diff --git a/Documentation/RelNotes/2.46.1.txt b/Documentation/RelNotes/2.46.1.txt new file mode 100644 index 00000000000000..52afb3556ad3f7 --- /dev/null +++ b/Documentation/RelNotes/2.46.1.txt @@ -0,0 +1,37 @@ +Git 2.46.1 Release Notes +======================== + +This release is primarily to merge fixes accumulated on the 'master' +front to prepare for 2.47 release that are still relevant to 2.46.x +maintenance track. + +Fixes since Git 2.46 +-------------------- + + * "git checkout --ours" (no other arguments) complained that the + option is incompatible with branch switching, which is technically + correct, but found confusing by some users. It now says that the + user needs to give pathspec to specify what paths to checkout. + + * It has been documented that we avoid "VAR=VAL shell_func" and why. + + * "git add -p" by users with diff.suppressBlankEmpty set to true + failed to parse the patch that represents an unmodified empty line + with an empty line (not a line with a single space on it), which + has been corrected. + + * "git rebase --help" referred to "offset" (the difference between + the location a change was taken from and the change gets replaced) + incorrectly and called it "fuzz", which has been corrected. + + * "git notes add -m '' --allow-empty" and friends that take prepared + data to create notes should not invoke an editor, but it started + doing so since Git 2.42, which has been corrected. + + * An expensive operation to prepare tracing was done in re-encoding + code path even when the tracing was not requested, which has been + corrected. + + * Perforce tests have been updated. + +Also contains minor documentation updates and code clean-ups. diff --git a/GIT-VERSION-GEN b/GIT-VERSION-GEN index 7b81915e527028..91a06caf636994 100755 --- a/GIT-VERSION-GEN +++ b/GIT-VERSION-GEN @@ -1,7 +1,7 @@ #!/bin/sh GVF=GIT-VERSION-FILE -DEF_VER=v2.46.0 +DEF_VER=v2.46.1-pre LF=' ' diff --git a/RelNotes b/RelNotes index cc696fc869049d..89b65f22b19959 120000 --- a/RelNotes +++ b/RelNotes @@ -1 +1 @@ -Documentation/RelNotes/2.46.0.txt \ No newline at end of file +Documentation/RelNotes/2.46.1.txt \ No newline at end of file From 46cbfd3f7ec0b549f3c298a6dda7812711197921 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 20 Aug 2024 14:31:09 +0000 Subject: [PATCH 050/297] ci: bump microsoft/setup-msbuild from v1 to v2 The main benefit: The new version uses a node.js version that is not yet deprecated. Links: - [Release notes](https://github.com/microsoft/setup-msbuild/releases) - [Changelog](https://github.com/microsoft/setup-msbuild/blob/main/building-release.md) - [Commits](https://github.com/microsoft/setup-msbuild/compare/v1...v2) This patch was originally by GitHub's Dependabot, but I cannot attribute that bot properly because it has no dedicated email address. Probably because it hasn't reached legal age yet, or something. Signed-off-by: Johannes Schindelin Signed-off-by: Junio C Hamano --- .github/workflows/main.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index 13cc0fe8077612..85e5767aae69ea 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -190,7 +190,7 @@ jobs: Expand-Archive compat.zip -DestinationPath . -Force Remove-Item compat.zip - name: add msbuild to PATH - uses: microsoft/setup-msbuild@v1 + uses: microsoft/setup-msbuild@v2 - name: copy dlls to root shell: cmd run: compat\vcbuild\vcpkg_copy_dlls.bat release From 9f39e2fa2652131c45cb94758a4720a48ee55cf8 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 20 Aug 2024 14:31:10 +0000 Subject: [PATCH 051/297] ci(win+VS): download the vcpkg artifacts using a dedicated GitHub Action The Git for Windows project provides a GitHub Action to download and cache Azure Pipelines artifacts (such as the `vcpkg` artifacts), hiding gnarly internals, and also providing some robustness against network glitches. Let's use it. Signed-off-by: Johannes Schindelin Signed-off-by: Junio C Hamano --- .github/workflows/main.yml | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-) diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index 85e5767aae69ea..1ee0433acc14a2 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -181,14 +181,10 @@ jobs: repository: 'microsoft/vcpkg' path: 'compat/vcbuild/vcpkg' - name: download vcpkg artifacts - shell: powershell - run: | - $urlbase = "https://dev.azure.com/git/git/_apis/build/builds" - $id = ((Invoke-WebRequest -UseBasicParsing "${urlbase}?definitions=9&statusFilter=completed&resultFilter=succeeded&`$top=1").content | ConvertFrom-JSON).value[0].id - $downloadUrl = ((Invoke-WebRequest -UseBasicParsing "${urlbase}/$id/artifacts").content | ConvertFrom-JSON).value[0].resource.downloadUrl - (New-Object Net.WebClient).DownloadFile($downloadUrl, "compat.zip") - Expand-Archive compat.zip -DestinationPath . -Force - Remove-Item compat.zip + uses: git-for-windows/get-azure-pipelines-artifact@v0 + with: + repository: git/git + definitionId: 9 - name: add msbuild to PATH uses: microsoft/setup-msbuild@v2 - name: copy dlls to root From 44db6f75cce574a0e410df5be61d40f28ec16f0a Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Tue, 20 Aug 2024 13:36:11 -0700 Subject: [PATCH 052/297] CodingGuidelines: spaces around C operators As we have operated with "write like how your surrounding code is written" for too long, after a huge code drop from another project, we'll end up being inconsistent before such an imported code is cleaned up. We have many uses of cast operator with a space before its operand, mostly in the reftable code. Spell the convention out before it spreads to other places. Signed-off-by: Junio C Hamano --- Documentation/CodingGuidelines | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines index 52afb2725f240d..4e6edf3d66c44d 100644 --- a/Documentation/CodingGuidelines +++ b/Documentation/CodingGuidelines @@ -293,7 +293,9 @@ For C programs: v12.01, 2022-03-28). - Variables have to be declared at the beginning of the block, before - the first statement (i.e. -Wdeclaration-after-statement). + the first statement (i.e. -Wdeclaration-after-statement). It is + encouraged to have a blank line between the end of the declarations + and the first statement in the block. - NULL pointers shall be written as NULL, not as 0. @@ -313,6 +315,13 @@ For C programs: while( condition ) func (bar+1); + - A binary operator (other than ",") and ternary conditional "?:" + have a space on each side of the operator to separate it from its + operands. E.g. "A + 1", not "A+1". + + - A unary operator (other than "." and "->") have no space between it + and its operand. E.g. "(char *)ptr", not "(char *) ptr". + - Do not explicitly compare an integral value with constant 0 or '\0', or a pointer value with constant NULL. For instance, to validate that counted array is initialized but has no elements, write: From 4881328617ee714cdef9d92d3ee25b3a4402ed4f Mon Sep 17 00:00:00 2001 From: ahmed akef Date: Thu, 22 Aug 2024 19:50:31 +0000 Subject: [PATCH 053/297] docs: explain the order of output in the batched mode of git-cat-file(1) The batched mode of git-cat-file(1) reads multiple objects from stdin and prints their respective contents to stdout. The order in which those objects are printed is not documented and may not be immediately obvious to the user. Document it. Signed-off-by: ahmed akef Signed-off-by: Junio C Hamano --- Documentation/git-cat-file.txt | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt index bd95a6c10a7d0b..d5890ae3686f6b 100644 --- a/Documentation/git-cat-file.txt +++ b/Documentation/git-cat-file.txt @@ -270,9 +270,9 @@ BATCH OUTPUT ------------ If `--batch` or `--batch-check` is given, `cat-file` will read objects -from stdin, one per line, and print information about them. By default, -the whole line is considered as an object, as if it were fed to -linkgit:git-rev-parse[1]. +from stdin, one per line, and print information about them in the same +order as they have been read. By default, the whole line is +considered as an object, as if it were fed to linkgit:git-rev-parse[1]. When `--batch-command` is given, `cat-file` will read commands from stdin, one per line, and print information based on the command given. With From 596f4ff6ad76dfab2644013da42bc5cfac4a65a4 Mon Sep 17 00:00:00 2001 From: Celeste Liu Date: Fri, 23 Aug 2024 16:21:08 +0800 Subject: [PATCH 054/297] doc: replace 3 dash with correct 2 dash in git-config(1) Commit 4e51389000 (builtin/config: introduce "get" subcommand, 2024-05-06) introduced this typo. It uses 3 dashes for regexp argument instead of correct 2 dashes. Signed-off-by: Celeste Liu Signed-off-by: Junio C Hamano --- Documentation/git-config.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt index 65c645d461aa14..79360328aa9de3 100644 --- a/Documentation/git-config.txt +++ b/Documentation/git-config.txt @@ -130,7 +130,7 @@ OPTIONS --all:: With `get`, return all values for a multi-valued key. ----regexp:: +--regexp:: With `get`, interpret the name as a regular expression. Regular expression matching is currently case-sensitive and done against a canonicalized version of the key in which section and variable names From 6809f8ccade093875ace7a8493fe09babd539cb1 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Mon, 26 Aug 2024 11:13:19 -0700 Subject: [PATCH 055/297] A bit more topics for 2.46.x maintenance track Signed-off-by: Junio C Hamano --- Documentation/RelNotes/2.46.1.txt | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+) diff --git a/Documentation/RelNotes/2.46.1.txt b/Documentation/RelNotes/2.46.1.txt index 52afb3556ad3f7..3b4c44305f7e98 100644 --- a/Documentation/RelNotes/2.46.1.txt +++ b/Documentation/RelNotes/2.46.1.txt @@ -34,4 +34,26 @@ Fixes since Git 2.46 * Perforce tests have been updated. + * The credential helper to talk to OSX keychain sometimes sent + garbage bytes after the username, which has been corrected. + + * A recent update broke "git ls-remote" used outside a repository, + which has been corrected. + + * "git config --value=foo --fixed-value section.key newvalue" barfed + when the existing value in the configuration file used the + valueless true syntax, which has been corrected. + + * "git reflog expire" failed to honor annotated tags when computing + reachable commits. + + * A flakey test and incorrect calls to strtoX() functions have been + fixed. + + * Follow-up on 2.45.1 regression fix. + + * "git rev-list ... | git diff-tree -p --remerge-diff --stdin" should + behave more or less like "git log -p --remerge-diff" but instead it + crashed, forgetting to prepare a temporary object store needed. + Also contains minor documentation updates and code clean-ups. From 686e9f616fea49f892ee400b4a5424ffef8dc3bb Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Mon, 26 Aug 2024 10:31:19 -0700 Subject: [PATCH 056/297] git-config.1: --get-all description update "git config --get-all foo.bar" shows all values for the foo.bar variable, but does not give the variable name in each output entry. Hence it is equivalent to "git config get --all foo.bar", without "--show-names", in the more modern syntax. Signed-off-by: Junio C Hamano --- Documentation/git-config.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt index 65c645d461aa14..c4c9ed1e98035e 100644 --- a/Documentation/git-config.txt +++ b/Documentation/git-config.txt @@ -309,7 +309,7 @@ recommended to migrate to the new syntax. Replaced by `git config get [--value=] `. --get-all []:: - Replaced by `git config get [--value=] --all --show-names `. + Replaced by `git config get [--value=] --all `. --get-regexp :: Replaced by `git config get --all --show-names --regexp `. From 160947040917c340633f79c6f44d3cae4653d53a Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Mon, 26 Aug 2024 11:48:57 -0700 Subject: [PATCH 057/297] git-config.1: fix description of --regexp in synopsis The synopsis says --regexp= but the --regexp option is a Boolean that says "the name given is not literal, but a pattern to match the name". Signed-off-by: Junio C Hamano --- Documentation/git-config.txt | 2 +- builtin/config.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt index c4c9ed1e98035e..1ee5c89ba2bca9 100644 --- a/Documentation/git-config.txt +++ b/Documentation/git-config.txt @@ -10,7 +10,7 @@ SYNOPSIS -------- [verse] 'git config list' [] [] [--includes] -'git config get' [] [] [--includes] [--all] [--regexp=] [--value=] [--fixed-value] [--default=] +'git config get' [] [] [--includes] [--all] [--regexp] [--value=] [--fixed-value] [--default=] 'git config set' [] [--type=] [--all] [--value=] [--fixed-value] 'git config unset' [] [--all] [--value=] [--fixed-value] 'git config rename-section' [] diff --git a/builtin/config.c b/builtin/config.c index 20a0b64090eab5..97e4d5f57ce318 100644 --- a/builtin/config.c +++ b/builtin/config.c @@ -17,7 +17,7 @@ static const char *const builtin_config_usage[] = { N_("git config list [] [] [--includes]"), - N_("git config get [] [] [--includes] [--all] [--regexp=] [--value=] [--fixed-value] [--default=] "), + N_("git config get [] [] [--includes] [--all] [--regexp] [--value=] [--fixed-value] [--default=] "), N_("git config set [] [--type=] [--all] [--value=] [--fixed-value] "), N_("git config unset [] [--all] [--value=] [--fixed-value] "), N_("git config rename-section [] "), From 6bd2ae67a5be4e08cd2bfc611c3d08707b1416e1 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Fri, 30 Aug 2024 16:53:31 -0400 Subject: [PATCH 058/297] revision: free commit buffers for skipped commits In git-log we leave the save_commit_buffer flag set to "1", which tells the commit parsing code to store the object content after it has parsed it to find parents, tree, etc. That lets us reuse the contents for pretty-printing the commit in the output. And then after printing each commit, we call free_commit_buffer(), since we don't need it anymore. But some options may cause us to traverse commits which are not part of the output. And so git-log does not see them at all, and doesn't free them. One such case is something like: git log -n 1000 --skip=1000000 which will churn through a million commits, before showing only a thousand. We loop through these inside get_revision(), without freeing the contents. As a result, we end up storing the object data for those million commits simultaneously. We should free the stored buffers (if any) for those commits as we skip over them, which is what this patch does. Running the above command in linux.git drops the peak heap usage from ~1.1GB to ~200MB, according to valgrind/massif. (I thought we might get an even bigger improvement, but the remaining memory is going to commit/tree structs, which we do hold on to forever). Note that this problem doesn't occur if: - you're running a git-rev-list without a --format parameter; it turns off save_commit_buffer by default, since it only output the object id - you've built a commit-graph file, since in that case we'd use the optimized graph data instead of the initial parse, and then do a lazy parse for commits we're actually going to output There are probably some other option combinations that can likewise end up with useless stored commit buffers. For example, if you ask for "foo..bar", then we'll have to walk down to the merge base, and everything on the "foo" side won't be shown. Tuning the "save" behavior to handle that might be tricky (I guess maybe drop buffers for anything we mark as UNINTERESTING?). And in the long run, the right solution here is probably to make sure the commit-graph is built (since it fixes the memory problem _and_ drastically reduces CPU usage). But since this "--skip" case is an easy one-liner, it's worth fixing in the meantime. It should be OK to make this call even if there is no saved buffer (e.g., because save_commit_buffer=0, or because a commit-graph was used), since it's O(1) to look up the buffer and is a noop if it isn't present. I verified by running the above command after "git commit-graph write --reachable", and it takes the same time with and without this patch. Reported-by: Yuri Karnilaev Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- revision.c | 1 + 1 file changed, 1 insertion(+) diff --git a/revision.c b/revision.c index 21f5f572c22ec7..f60c556490020b 100644 --- a/revision.c +++ b/revision.c @@ -4249,6 +4249,7 @@ static struct commit *get_revision_internal(struct rev_info *revs) c = get_revision_1(revs); if (!c) break; + free_commit_buffer(revs->repo->parsed_objects, c); } } From d4dc0efd7d4b71acadae61f619b9e84f97d15c83 Mon Sep 17 00:00:00 2001 From: Ramsay Jones Date: Sat, 31 Aug 2024 15:58:56 +0100 Subject: [PATCH 059/297] compat/terminal: mark parameter of git_terminal_prompt() UNUSED If neither HAVE_DEV_TTY nor GIT_WINDOWS_NATIVE is set, the fallback code calls the system getpass(). This unfortunately ignores the "echo" boolean parameter, as we have no way to implement that functionality. But we still have to keep the unused parameter, since our interface has to match the other implementations. Co-authored-by: Jeff King Signed-off-by: Ramsay Jones Signed-off-by: Junio C Hamano --- compat/terminal.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/compat/terminal.c b/compat/terminal.c index 0afda730f278cc..d54efa1c5d6ebf 100644 --- a/compat/terminal.c +++ b/compat/terminal.c @@ -594,7 +594,7 @@ void restore_term(void) { } -char *git_terminal_prompt(const char *prompt, int echo) +char *git_terminal_prompt(const char *prompt, int echo UNUSED) { return getpass(prompt); } From b2dbf97f47870dd71eab319a55b362102d65c209 Mon Sep 17 00:00:00 2001 From: Patrick Steinhardt Date: Wed, 4 Sep 2024 08:26:24 +0200 Subject: [PATCH 060/297] builtin/index-pack: fix segfaults when running outside of a repo It was reported that git-verify-pack(1) has started to crash with Git v2.46.0 when run outside of a repository. This is another fallout from c8aed5e8da (repository: stop setting SHA1 as the default object hash, 2024-05-07), where we have stopped setting the default hash algorithm for `the_repository`. Consequently, code that relies on `the_hash_algo` will now crash when it hasn't explicitly been initialized, which may be the case when running outside of a Git repository. The crash is not in git-verify-pack(1) but instead in git-index-pack(1), which gets called by the former. Ideally, both of these programs should be able to identify the hash algorithm used by the packfile and index without having to rely on external information. But unfortunately, the format for neither of them is completely self-describing, so it is not possible to derive that information. This is a design issue that we should address by introducing a new packfile version that encodes its object hash. For now though the more important fix is to not make either of these programs crash anymore, which we do by falling back to SHA1 when the object hash is unconfigured. This pessimizes reading packfiles which use a different hash than SHA1, but restores previous behaviour. Reported-by: Ilya K Signed-off-by: Patrick Steinhardt Signed-off-by: Junio C Hamano --- builtin/index-pack.c | 9 +++++++++ t/t5300-pack-object.sh | 39 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 48 insertions(+) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index fd968d673d22d8..763b01372aade4 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1868,6 +1868,15 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) if (!index_name && pack_name) index_name = derive_filename(pack_name, "pack", "idx", &index_name_buf); + /* + * Packfiles and indices do not carry enough information to be able to + * identify their object hash. So when we are neither in a repository + * nor has the user told us which object hash to use we have no other + * choice but to guess the object hash. + */ + if (!the_repository->hash_algo) + repo_set_hash_algo(the_repository, GIT_HASH_SHA1); + opts.flags &= ~(WRITE_REV | WRITE_REV_VERIFY); if (rev_index) { opts.flags |= verify ? WRITE_REV_VERIFY : WRITE_REV; diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh index 4ad023c846d06a..3b9dae331a5ea9 100755 --- a/t/t5300-pack-object.sh +++ b/t/t5300-pack-object.sh @@ -635,4 +635,43 @@ test_expect_success 'negative window clamps to 0' ' check_deltas stderr = 0 ' +for hash in sha1 sha256 +do + test_expect_success "verify-pack with $hash packfile" ' + test_when_finished "rm -rf repo" && + git init --object-format=$hash repo && + test_commit -C repo initial && + git -C repo repack -ad && + git -C repo verify-pack "$(pwd)"/repo/.git/objects/pack/*.idx && + if test $hash = sha1 + then + nongit git verify-pack "$(pwd)"/repo/.git/objects/pack/*.idx + else + # We have no way to identify the hash used by packfiles + # or indices, so we always fall back to SHA1. + nongit test_must_fail git verify-pack "$(pwd)"/repo/.git/objects/pack/*.idx && + # But with an explicit object format we should succeed. + nongit git verify-pack --object-format=$hash "$(pwd)"/repo/.git/objects/pack/*.idx + fi + ' + + test_expect_success "index-pack outside of a $hash repository" ' + test_when_finished "rm -rf repo" && + git init --object-format=$hash repo && + test_commit -C repo initial && + git -C repo repack -ad && + git -C repo index-pack --verify "$(pwd)"/repo/.git/objects/pack/*.pack && + if test $hash = sha1 + then + nongit git index-pack --verify "$(pwd)"/repo/.git/objects/pack/*.pack + else + # We have no way to identify the hash used by packfiles + # or indices, so we always fall back to SHA1. + nongit test_must_fail git index-pack --verify "$(pwd)"/repo/.git/objects/pack/*.pack 2>err && + # But with an explicit object format we should succeed. + nongit git index-pack --object-format=$hash --verify "$(pwd)"/repo/.git/objects/pack/*.pack + fi + ' +done + test_done From 6074a7d4ae6b658c18465f10bbbf144882d2d4b0 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Thu, 12 Sep 2024 11:09:46 -0700 Subject: [PATCH 061/297] Another batch of topics for 2.46.1 Signed-off-by: Junio C Hamano --- Documentation/RelNotes/2.46.1.txt | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/Documentation/RelNotes/2.46.1.txt b/Documentation/RelNotes/2.46.1.txt index 3b4c44305f7e98..07dc76d030aadb 100644 --- a/Documentation/RelNotes/2.46.1.txt +++ b/Documentation/RelNotes/2.46.1.txt @@ -56,4 +56,11 @@ Fixes since Git 2.46 behave more or less like "git log -p --remerge-diff" but instead it crashed, forgetting to prepare a temporary object store needed. + * The patch parser in "git patch-id" has been tightened to avoid + getting confused by lines that look like a patch header in the log + message. + + * "git bundle unbundle" outside a repository triggered a BUG() + unnecessarily, which has been corrected. + Also contains minor documentation updates and code clean-ups. From a731929aa8016750c09bccc67c68feaf1259ce90 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Fri, 13 Sep 2024 14:05:56 -0700 Subject: [PATCH 062/297] Git 2.46.1 Signed-off-by: Junio C Hamano --- Documentation/RelNotes/2.46.1.txt | 9 +++++++++ GIT-VERSION-GEN | 2 +- 2 files changed, 10 insertions(+), 1 deletion(-) diff --git a/Documentation/RelNotes/2.46.1.txt b/Documentation/RelNotes/2.46.1.txt index 07dc76d030aadb..e55c2c4a4681e6 100644 --- a/Documentation/RelNotes/2.46.1.txt +++ b/Documentation/RelNotes/2.46.1.txt @@ -63,4 +63,13 @@ Fixes since Git 2.46 * "git bundle unbundle" outside a repository triggered a BUG() unnecessarily, which has been corrected. + * The code forgot to discard unnecessary in-core commit buffer data + for commits that "git log --skip=" traversed but omitted + from the output, which has been corrected. + + * "git verify-pack" and "git index-pack" started dying outside a + repository, which has been corrected. + + * A corner case bug in "git stash" was fixed. + Also contains minor documentation updates and code clean-ups. diff --git a/GIT-VERSION-GEN b/GIT-VERSION-GEN index 91a06caf636994..55850912d13034 100755 --- a/GIT-VERSION-GEN +++ b/GIT-VERSION-GEN @@ -1,7 +1,7 @@ #!/bin/sh GVF=GIT-VERSION-FILE -DEF_VER=v2.46.1-pre +DEF_VER=v2.46.1 LF=' ' From 86802eb2a86f2a02c4f1d38240e9e9d11f1ef8c6 Mon Sep 17 00:00:00 2001 From: Sverre Rabbelier Date: Sun, 24 Jul 2011 15:54:04 +0200 Subject: [PATCH 063/297] t9350: point out that refs are not updated correctly This happens only when the corresponding commits are not exported in the current fast-export run. This can happen either when the relevant commit is already marked, or when the commit is explicitly marked as UNINTERESTING with a negative ref by another argument. This breaks fast-export basec remote helpers. Signed-off-by: Sverre Rabbelier --- t/t9350-fast-export.sh | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/t/t9350-fast-export.sh b/t/t9350-fast-export.sh index 1eb035ee4ce547..f63094d2da280e 100755 --- a/t/t9350-fast-export.sh +++ b/t/t9350-fast-export.sh @@ -801,4 +801,15 @@ test_expect_success 'fast-export handles --end-of-options' ' test_cmp expect actual ' +cat > expected << EOF +reset refs/heads/master +from $(git rev-parse master) + +EOF + +test_expect_failure 'refs are updated even if no commits need to be exported' ' + git fast-export master..master > actual && + test_cmp expected actual +' + test_done From f9372db7d4cd5d34c171feb1efa36af59ce8990a Mon Sep 17 00:00:00 2001 From: Sverre Rabbelier Date: Sat, 28 Aug 2010 20:49:01 -0500 Subject: [PATCH 064/297] transport-helper: add trailing -- [PT: ensure we add an additional element to the argv array] Signed-off-by: Sverre Rabbelier Signed-off-by: Johannes Schindelin --- transport-helper.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/transport-helper.c b/transport-helper.c index 09b3560ffdc41e..dbff8362a6123c 100644 --- a/transport-helper.c +++ b/transport-helper.c @@ -488,6 +488,8 @@ static int get_exporter(struct transport *transport, for (i = 0; i < revlist_args->nr; i++) strvec_push(&fastexport->args, revlist_args->items[i].string); + strvec_push(&fastexport->args, "--"); + fastexport->git_cmd = 1; return start_command(fastexport); } From 9a0537d50cc8c8c4f3eddd0217c3dd1f9243a767 Mon Sep 17 00:00:00 2001 From: Sverre Rabbelier Date: Sun, 24 Jul 2011 00:06:00 +0200 Subject: [PATCH 065/297] remote-helper: check helper status after import/export Signed-off-by: Johannes Schindelin Signed-off-by: Sverre Rabbelier --- t/t5801-remote-helpers.sh | 2 +- transport-helper.c | 15 +++++++++++++++ 2 files changed, 16 insertions(+), 1 deletion(-) diff --git a/t/t5801-remote-helpers.sh b/t/t5801-remote-helpers.sh index 20f43f7b7d3e21..0ba5d0036552c3 100755 --- a/t/t5801-remote-helpers.sh +++ b/t/t5801-remote-helpers.sh @@ -262,7 +262,7 @@ test_expect_success 'push update refs failure' ' echo "update fail" >>file && git commit -a -m "update fail" && git rev-parse --verify testgit/origin/heads/update >expect && - test_expect_code 1 env GIT_REMOTE_TESTGIT_FAILURE="non-fast forward" \ + test_must_fail env GIT_REMOTE_TESTGIT_FAILURE="non-fast forward" \ git push origin update && git rev-parse --verify testgit/origin/heads/update >actual && test_cmp expect actual diff --git a/transport-helper.c b/transport-helper.c index dbff8362a6123c..8bc913c8770288 100644 --- a/transport-helper.c +++ b/transport-helper.c @@ -494,6 +494,19 @@ static int get_exporter(struct transport *transport, return start_command(fastexport); } +static void check_helper_status(struct helper_data *data) +{ + int pid, status; + + pid = waitpid(data->helper->pid, &status, WNOHANG); + if (pid < 0) + die("Could not retrieve status of remote helper '%s'", + data->name); + if (pid > 0 && WIFEXITED(status)) + die("Remote helper '%s' died with %d", + data->name, WEXITSTATUS(status)); +} + static int fetch_with_import(struct transport *transport, int nr_heads, struct ref **to_fetch) { @@ -530,6 +543,7 @@ static int fetch_with_import(struct transport *transport, if (finish_command(&fastimport)) die(_("error while running fast-import")); + check_helper_status(data); /* * The fast-import stream of a remote helper that advertises @@ -1142,6 +1156,7 @@ static int push_refs_with_export(struct transport *transport, if (finish_command(&exporter)) die(_("error while running fast-export")); + check_helper_status(data); if (push_update_refs_status(data, remote_refs, flags)) return 1; From c3458b0a3a0eca20c8df99837942d590b6913164 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Thu, 19 Jan 2023 13:40:31 +0100 Subject: [PATCH 066/297] gitk(Windows): avoid inadvertently calling executables in the worktree Just like CVE-2022-41953 for Git GUI, there exists a vulnerability of `gitk` where it looks for `taskkill.exe` in the current directory before searching `PATH`. Note that the many `exec git` calls are unaffected, due to an obscure quirk in Tcl's `exec` function. Typically, `git.exe` lives next to `wish.exe` (i.e. the program that is run to execute `gitk` or Git GUI) in Git for Windows, and that is the saving grace for `git.exe because `exec` searches the directory where `wish.exe` lives even before the current directory, according to https://www.tcl-lang.org/man/tcl/TclCmd/exec.htm#M24: If a directory name was not specified as part of the application name, the following directories are automatically searched in order when attempting to locate the application: The directory from which the Tcl executable was loaded. The current directory. The Windows 32-bit system directory. The Windows home directory. The directories listed in the path. The same is not true, however, for `taskkill.exe`: it lives in the Windows system directory (never mind the 32-bit, Tcl's documentation is outdated on that point, it really means `C:\Windows\system32`). Signed-off-by: Johannes Schindelin --- gitk-git/gitk | 135 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 135 insertions(+) diff --git a/gitk-git/gitk b/gitk-git/gitk index 7a087f123d7563..076ca3687d7653 100755 --- a/gitk-git/gitk +++ b/gitk-git/gitk @@ -9,6 +9,141 @@ exec wish "$0" -- "$@" package require Tk +###################################################################### +## +## Enabling platform-specific code paths + +proc is_MacOSX {} { + if {[tk windowingsystem] eq {aqua}} { + return 1 + } + return 0 +} + +proc is_Windows {} { + if {$::tcl_platform(platform) eq {windows}} { + return 1 + } + return 0 +} + +set _iscygwin {} +proc is_Cygwin {} { + global _iscygwin + if {$_iscygwin eq {}} { + if {[string match "CYGWIN_*" $::tcl_platform(os)]} { + set _iscygwin 1 + } else { + set _iscygwin 0 + } + } + return $_iscygwin +} + +###################################################################### +## +## PATH lookup + +set _search_path {} +proc _which {what args} { + global env _search_exe _search_path + + if {$_search_path eq {}} { + if {[is_Cygwin] && [regexp {^(/|\.:)} $env(PATH)]} { + set _search_path [split [exec cygpath \ + --windows \ + --path \ + --absolute \ + $env(PATH)] {;}] + set _search_exe .exe + } elseif {[is_Windows]} { + set gitguidir [file dirname [info script]] + regsub -all ";" $gitguidir "\\;" gitguidir + set env(PATH) "$gitguidir;$env(PATH)" + set _search_path [split $env(PATH) {;}] + # Skip empty `PATH` elements + set _search_path [lsearch -all -inline -not -exact \ + $_search_path ""] + set _search_exe .exe + } else { + set _search_path [split $env(PATH) :] + set _search_exe {} + } + } + + if {[is_Windows] && [lsearch -exact $args -script] >= 0} { + set suffix {} + } else { + set suffix $_search_exe + } + + foreach p $_search_path { + set p [file join $p $what$suffix] + if {[file exists $p]} { + return [file normalize $p] + } + } + return {} +} + +proc sanitize_command_line {command_line from_index} { + set i $from_index + while {$i < [llength $command_line]} { + set cmd [lindex $command_line $i] + if {[file pathtype $cmd] ne "absolute"} { + set fullpath [_which $cmd] + if {$fullpath eq ""} { + throw {NOT-FOUND} "$cmd not found in PATH" + } + lset command_line $i $fullpath + } + + # handle piped commands, e.g. `exec A | B` + for {incr i} {$i < [llength $command_line]} {incr i} { + if {[lindex $command_line $i] eq "|"} { + incr i + break + } + } + } + return $command_line +} + +# Override `exec` to avoid unsafe PATH lookup + +rename exec real_exec + +proc exec {args} { + # skip options + for {set i 0} {$i < [llength $args]} {incr i} { + set arg [lindex $args $i] + if {$arg eq "--"} { + incr i + break + } + if {[string range $arg 0 0] ne "-"} { + break + } + } + set args [sanitize_command_line $args $i] + uplevel 1 real_exec $args +} + +# Override `open` to avoid unsafe PATH lookup + +rename open real_open + +proc open {args} { + set arg0 [lindex $args 0] + if {[string range $arg0 0 0] eq "|"} { + set command_line [string trim [string range $arg0 1 end]] + lset args 0 "| [sanitize_command_line $command_line 0]" + } + uplevel 1 real_open $args +} + +# End of safe PATH lookup stuff + proc hasworktree {} { return [expr {[exec git rev-parse --is-bare-repository] == "false" && [exec git rev-parse --is-inside-git-dir] == "false"}] From c875431ba7b738c1e6c875320be173fc63ae0374 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 9 Apr 2012 13:04:35 -0500 Subject: [PATCH 067/297] Always auto-gc after calling a fast-import transport After importing anything with fast-import, we should always let the garbage collector do its job, since the objects are written to disk inefficiently. This brings down an initial import of http://selenic.com/hg from about 230 megabytes to about 14. In the future, we may want to make this configurable on a per-remote basis, or maybe teach fast-import about it in the first place. Signed-off-by: Johannes Schindelin --- transport-helper.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/transport-helper.c b/transport-helper.c index 8bc913c8770288..d4b09c1e84d14a 100644 --- a/transport-helper.c +++ b/transport-helper.c @@ -22,6 +22,8 @@ #include "packfile.h" static int debug; +/* TODO: put somewhere sensible, e.g. git_transport_options? */ +static int auto_gc = 1; struct helper_data { char *name; @@ -577,6 +579,13 @@ static int fetch_with_import(struct transport *transport, } } strbuf_release(&buf); + if (auto_gc) { + struct child_process cmd = CHILD_PROCESS_INIT; + + cmd.git_cmd = 1; + strvec_pushl(&cmd.args, "gc", "--auto", "--quiet", NULL); + run_command(&cmd); + } return 0; } From a165dcacce2490e12cfbc4e19a579f28dd6bd170 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 18 Apr 2017 12:09:08 +0200 Subject: [PATCH 068/297] mingw: demonstrate a problem with certain absolute paths On Windows, there are several categories of absolute paths. One such category starts with a backslash and is implicitly relative to the drive associated with the current working directory. Example: c: git clone https://github.com/git-for-windows/git \G4W should clone into C:\G4W. There is currently a problem with that, in that mingw_mktemp() does not expect the _wmktemp() function to prefix the absolute path with the drive prefix, and as a consequence, the resulting path does not fit into the originally-passed string buffer. The symptom is a "Result too large" error. Reported by Juan Carlos Arevalo Baeza. Signed-off-by: Johannes Schindelin --- t/t5580-unc-paths.sh | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-) diff --git a/t/t5580-unc-paths.sh b/t/t5580-unc-paths.sh index d7537a162b21fe..dc0788cdbb1552 100755 --- a/t/t5580-unc-paths.sh +++ b/t/t5580-unc-paths.sh @@ -21,14 +21,11 @@ fi UNCPATH="$(winpwd)" case "$UNCPATH" in [A-Z]:*) + WITHOUTDRIVE="${UNCPATH#?:}" # Use administrative share e.g. \\localhost\C$\git-sdk-64\usr\src\git # (we use forward slashes here because MSYS2 and Git accept them, and # they are easier on the eyes) - UNCPATH="//localhost/${UNCPATH%%:*}\$/${UNCPATH#?:}" - test -d "$UNCPATH" || { - skip_all='could not access administrative share; skipping' - test_done - } + UNCPATH="//localhost/${UNCPATH%%:*}\$$WITHOUTDRIVE" ;; *) skip_all='skipping UNC path tests, cannot determine current path as UNC' @@ -36,6 +33,18 @@ case "$UNCPATH" in ;; esac +test_expect_failure 'clone into absolute path lacking a drive prefix' ' + USINGBACKSLASHES="$(echo "$WITHOUTDRIVE"/without-drive-prefix | + tr / \\\\)" && + git clone . "$USINGBACKSLASHES" && + test -f without-drive-prefix/.git/HEAD +' + +test -d "$UNCPATH" || { + skip_all='could not access administrative share; skipping' + test_done +} + test_expect_success setup ' test_commit initial ' From bf29ea6f40373ca2310f36eba3cfdcf1d96d5bad Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 7 Dec 2018 13:39:30 +0100 Subject: [PATCH 069/297] clean: do not traverse mount points It seems to be not exactly rare on Windows to install NTFS junction points (the equivalent of "bind mounts" on Linux/Unix) in worktrees, e.g. to map some development tools into a subdirectory. In such a scenario, it is pretty horrible if `git clean -dfx` traverses into the mapped directory and starts to "clean up". Let's just not do that. Let's make sure before we traverse into a directory that it is not a mount point (or junction). This addresses https://github.com/git-for-windows/git/issues/607 Signed-off-by: Johannes Schindelin --- builtin/clean.c | 14 ++++++++++++++ compat/mingw.c | 22 ++++++++++++++++++++++ compat/mingw.h | 3 +++ git-compat-util.h | 4 ++++ path.c | 39 +++++++++++++++++++++++++++++++++++++++ path.h | 1 + t/t7300-clean.sh | 9 +++++++++ 7 files changed, 92 insertions(+) diff --git a/builtin/clean.c b/builtin/clean.c index ded5a91534c494..e700df6ca66acf 100644 --- a/builtin/clean.c +++ b/builtin/clean.c @@ -38,6 +38,8 @@ static const char *msg_remove = N_("Removing %s\n"); static const char *msg_would_remove = N_("Would remove %s\n"); static const char *msg_skip_git_dir = N_("Skipping repository %s\n"); static const char *msg_would_skip_git_dir = N_("Would skip repository %s\n"); +static const char *msg_skip_mount_point = N_("Skipping mount point %s\n"); +static const char *msg_would_skip_mount_point = N_("Would skip mount point %s\n"); static const char *msg_warn_remove_failed = N_("failed to remove %s"); static const char *msg_warn_lstat_failed = N_("could not lstat %s\n"); static const char *msg_skip_cwd = N_("Refusing to remove current working directory\n"); @@ -182,6 +184,18 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag, goto out; } + if (is_mount_point(path)) { + if (!quiet) { + quote_path(path->buf, prefix, "ed, 0); + printf(dry_run ? + _(msg_would_skip_mount_point) : + _(msg_skip_mount_point), quoted.buf); + } + *dir_gone = 0; + + goto out; + } + dir = opendir(path->buf); if (!dir) { /* an empty dir could be removed even if it is unreadble */ diff --git a/compat/mingw.c b/compat/mingw.c index 29d3f09768c29a..b89d7660ed3381 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -2533,6 +2533,28 @@ pid_t waitpid(pid_t pid, int *status, int options) return -1; } +int mingw_is_mount_point(struct strbuf *path) +{ + WIN32_FIND_DATAW findbuf = { 0 }; + HANDLE handle; + wchar_t wfilename[MAX_PATH]; + int wlen = xutftowcs_path(wfilename, path->buf); + if (wlen < 0) + die(_("could not get long path for '%s'"), path->buf); + + /* remove trailing slash, if any */ + if (wlen > 0 && wfilename[wlen - 1] == L'/') + wfilename[--wlen] = L'\0'; + + handle = FindFirstFileW(wfilename, &findbuf); + if (handle == INVALID_HANDLE_VALUE) + return 0; + FindClose(handle); + + return (findbuf.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) && + (findbuf.dwReserved0 == IO_REPARSE_TAG_MOUNT_POINT); +} + int xutftowcsn(wchar_t *wcs, const char *utfs, size_t wcslen, int utflen) { int upos = 0, wpos = 0; diff --git a/compat/mingw.h b/compat/mingw.h index 27b61284f46be6..a1075cd032d294 100644 --- a/compat/mingw.h +++ b/compat/mingw.h @@ -454,6 +454,9 @@ static inline void convert_slashes(char *path) if (*path == '\\') *path = '/'; } +struct strbuf; +int mingw_is_mount_point(struct strbuf *path); +#define is_mount_point mingw_is_mount_point #define PATH_SEP ';' char *mingw_query_user_email(void); #define query_user_email mingw_query_user_email diff --git a/git-compat-util.h b/git-compat-util.h index 71b4d23f0386b2..a61ad81de31cb2 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -603,6 +603,10 @@ static inline int git_has_dir_sep(const char *path) #define has_dir_sep(path) git_has_dir_sep(path) #endif +#ifndef is_mount_point +#define is_mount_point is_mount_point_via_stat +#endif + #ifndef query_user_email #define query_user_email() NULL #endif diff --git a/path.c b/path.c index 19f7684f3876bd..53b713a094710c 100644 --- a/path.c +++ b/path.c @@ -1267,6 +1267,45 @@ char *strip_path_suffix(const char *path, const char *suffix) return offset == -1 ? NULL : xstrndup(path, offset); } +int is_mount_point_via_stat(struct strbuf *path) +{ + size_t len = path->len; + unsigned int current_dev; + struct stat st; + + if (!strcmp("/", path->buf)) + return 1; + + strbuf_addstr(path, "/."); + if (lstat(path->buf, &st)) { + /* + * If we cannot access the current directory, we cannot say + * that it is a bind mount. + */ + strbuf_setlen(path, len); + return 0; + } + current_dev = st.st_dev; + + /* Now look at the parent directory */ + strbuf_addch(path, '.'); + if (lstat(path->buf, &st)) { + /* + * If we cannot access the parent directory, we cannot say + * that it is a bind mount. + */ + strbuf_setlen(path, len); + return 0; + } + strbuf_setlen(path, len); + + /* + * If the device ID differs between current and parent directory, + * then it is a bind mount. + */ + return current_dev != st.st_dev; +} + int daemon_avoid_alias(const char *p) { int sl, ndot; diff --git a/path.h b/path.h index a6f0b70692a5c0..e96df1f640d649 100644 --- a/path.h +++ b/path.h @@ -199,6 +199,7 @@ int normalize_path_copy(char *dst, const char *src); int strbuf_normalize_path(struct strbuf *src); int longest_ancestor_length(const char *path, struct string_list *prefixes); char *strip_path_suffix(const char *path, const char *suffix); +int is_mount_point_via_stat(struct strbuf *path); int daemon_avoid_alias(const char *path); /* diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh index 0aae0dee67078b..ad9336489c234c 100755 --- a/t/t7300-clean.sh +++ b/t/t7300-clean.sh @@ -801,4 +801,13 @@ test_expect_success 'traverse into directories that may have ignored entries' ' ) ' +test_expect_success MINGW 'clean does not traverse mount points' ' + mkdir target && + >target/dont-clean-me && + git init with-mountpoint && + cmd //c "mklink /j with-mountpoint\\mountpoint target" && + git -C with-mountpoint clean -dfx && + test_path_is_file target/dont-clean-me +' + test_done From 1f452546f2ad2b623abfda124f2788eb3653ab8a Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sun, 20 Oct 2019 22:08:58 +0200 Subject: [PATCH 070/297] win32/pthread: avoid name clashes with winpthread The mingw-w64 GCC seems to link implicitly to libwinpthread, which does implement a pthread emulation (that is more complete than Git's). Let's keep preferring Git's. To avoid linker errors where it thinks that the `pthread_self` and the `pthread_create` symbols are defined twice, let's give our version a `win32_` prefix, just like we already do for `pthread_join()`. Signed-off-by: Johannes Schindelin --- compat/win32/pthread.c | 6 +++--- compat/win32/pthread.h | 8 +++++--- 2 files changed, 8 insertions(+), 6 deletions(-) diff --git a/compat/win32/pthread.c b/compat/win32/pthread.c index 85f8f7920ce48d..2f35b603248856 100644 --- a/compat/win32/pthread.c +++ b/compat/win32/pthread.c @@ -21,8 +21,8 @@ static unsigned __stdcall win32_start_routine(void *arg) return 0; } -int pthread_create(pthread_t *thread, const void *unused, - void *(*start_routine)(void *), void *arg) +int win32_pthread_create(pthread_t *thread, const void *unused, + void *(*start_routine)(void *), void *arg) { thread->arg = arg; thread->start_routine = start_routine; @@ -53,7 +53,7 @@ int win32_pthread_join(pthread_t *thread, void **value_ptr) } } -pthread_t pthread_self(void) +pthread_t win32_pthread_self(void) { pthread_t t = { NULL }; t.tid = GetCurrentThreadId(); diff --git a/compat/win32/pthread.h b/compat/win32/pthread.h index cc3221cb2c8a84..d0061ecf33c190 100644 --- a/compat/win32/pthread.h +++ b/compat/win32/pthread.h @@ -50,8 +50,9 @@ typedef struct { DWORD tid; } pthread_t; -int pthread_create(pthread_t *thread, const void *unused, - void *(*start_routine)(void*), void *arg); +int win32_pthread_create(pthread_t *thread, const void *unused, + void *(*start_routine)(void*), void *arg); +#define pthread_create win32_pthread_create /* * To avoid the need of copying a struct, we use small macro wrapper to pass @@ -62,7 +63,8 @@ int pthread_create(pthread_t *thread, const void *unused, int win32_pthread_join(pthread_t *thread, void **value_ptr); #define pthread_equal(t1, t2) ((t1).tid == (t2).tid) -pthread_t pthread_self(void); +pthread_t win32_pthread_self(void); +#define pthread_self win32_pthread_self static inline void NORETURN pthread_exit(void *ret) { From 68477eb78833b8a915ed143a618b7c5e62236691 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 18 Apr 2017 12:38:30 +0200 Subject: [PATCH 071/297] mingw: allow absolute paths without drive prefix When specifying an absolute path without a drive prefix, we convert that path internally. Let's make sure that we handle that case properly, too ;-) This fixes the command git clone https://github.com/git-for-windows/git \G4W Signed-off-by: Johannes Schindelin --- compat/mingw.c | 10 +++++++++- t/t5580-unc-paths.sh | 2 +- 2 files changed, 10 insertions(+), 2 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 29d3f09768c29a..6d5dc618808384 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -1071,11 +1071,19 @@ unsigned int sleep (unsigned int seconds) char *mingw_mktemp(char *template) { wchar_t wtemplate[MAX_PATH]; + int offset = 0; + if (xutftowcs_path(wtemplate, template) < 0) return NULL; + + if (is_dir_sep(template[0]) && !is_dir_sep(template[1]) && + iswalpha(wtemplate[0]) && wtemplate[1] == L':') { + /* We have an absolute path missing the drive prefix */ + offset = 2; + } if (!_wmktemp(wtemplate)) return NULL; - if (xwcstoutf(template, wtemplate, strlen(template) + 1) < 0) + if (xwcstoutf(template, wtemplate + offset, strlen(template) + 1) < 0) return NULL; return template; } diff --git a/t/t5580-unc-paths.sh b/t/t5580-unc-paths.sh index dc0788cdbb1552..a513dede7e9e88 100755 --- a/t/t5580-unc-paths.sh +++ b/t/t5580-unc-paths.sh @@ -33,7 +33,7 @@ case "$UNCPATH" in ;; esac -test_expect_failure 'clone into absolute path lacking a drive prefix' ' +test_expect_success 'clone into absolute path lacking a drive prefix' ' USINGBACKSLASHES="$(echo "$WITHOUTDRIVE"/without-drive-prefix | tr / \\\\)" && git clone . "$USINGBACKSLASHES" && From 6498bc6abe1c7b2c8add7b65a2ba7705adc86cc5 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 11 Dec 2018 12:55:26 +0100 Subject: [PATCH 072/297] clean: remove mount points when possible Windows' equivalent to "bind mounts", NTFS junction points, can be unlinked without affecting the mount target. This is clearly what users expect to happen when they call `git clean -dfx` in a worktree that contains NTFS junction points: the junction should be removed, and the target directory of said junction should be left alone (unless it is inside the worktree). Signed-off-by: Johannes Schindelin --- builtin/clean.c | 13 +++++++++++++ compat/mingw.h | 1 + t/t7300-clean.sh | 1 + 3 files changed, 15 insertions(+) diff --git a/builtin/clean.c b/builtin/clean.c index e700df6ca66acf..9b416b42a96dfe 100644 --- a/builtin/clean.c +++ b/builtin/clean.c @@ -38,8 +38,10 @@ static const char *msg_remove = N_("Removing %s\n"); static const char *msg_would_remove = N_("Would remove %s\n"); static const char *msg_skip_git_dir = N_("Skipping repository %s\n"); static const char *msg_would_skip_git_dir = N_("Would skip repository %s\n"); +#ifndef CAN_UNLINK_MOUNT_POINTS static const char *msg_skip_mount_point = N_("Skipping mount point %s\n"); static const char *msg_would_skip_mount_point = N_("Would skip mount point %s\n"); +#endif static const char *msg_warn_remove_failed = N_("failed to remove %s"); static const char *msg_warn_lstat_failed = N_("could not lstat %s\n"); static const char *msg_skip_cwd = N_("Refusing to remove current working directory\n"); @@ -185,6 +187,7 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag, } if (is_mount_point(path)) { +#ifndef CAN_UNLINK_MOUNT_POINTS if (!quiet) { quote_path(path->buf, prefix, "ed, 0); printf(dry_run ? @@ -192,6 +195,16 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag, _(msg_skip_mount_point), quoted.buf); } *dir_gone = 0; +#else + if (!dry_run && unlink(path->buf)) { + int saved_errno = errno; + quote_path(path->buf, prefix, "ed, 0); + errno = saved_errno; + warning_errno(_(msg_warn_remove_failed), quoted.buf); + *dir_gone = 0; + ret = -1; + } +#endif goto out; } diff --git a/compat/mingw.h b/compat/mingw.h index a1075cd032d294..344586d1c28b7e 100644 --- a/compat/mingw.h +++ b/compat/mingw.h @@ -457,6 +457,7 @@ static inline void convert_slashes(char *path) struct strbuf; int mingw_is_mount_point(struct strbuf *path); #define is_mount_point mingw_is_mount_point +#define CAN_UNLINK_MOUNT_POINTS 1 #define PATH_SEP ';' char *mingw_query_user_email(void); #define query_user_email mingw_query_user_email diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh index ad9336489c234c..63e32961709e54 100755 --- a/t/t7300-clean.sh +++ b/t/t7300-clean.sh @@ -807,6 +807,7 @@ test_expect_success MINGW 'clean does not traverse mount points' ' git init with-mountpoint && cmd //c "mklink /j with-mountpoint\\mountpoint target" && git -C with-mountpoint clean -dfx && + test_path_is_missing with-mountpoint/mountpoint && test_path_is_file target/dont-clean-me ' From 7299e07bdbfaf40d9eac9ad43fd85cdfc792724f Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 16 Feb 2015 14:06:59 +0100 Subject: [PATCH 073/297] mingw: include the Python parts in the build While Git for Windows does not _ship_ Python (in order to save on bandwidth), MSYS2 provides very fine Python interpreters that users can easily take advantage of, by using Git for Windows within its SDK. Signed-off-by: Johannes Schindelin --- config.mak.uname | 1 + 1 file changed, 1 insertion(+) diff --git a/config.mak.uname b/config.mak.uname index 85d63821ec95f6..984a1ef67b450b 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -738,6 +738,7 @@ ifeq ($(uname_S),MINGW) USE_GETTEXT_SCHEME = fallthrough USE_LIBPCRE = YesPlease USE_NED_ALLOCATOR = YesPlease + NO_PYTHON = ifeq (/mingw64,$(subst 32,64,$(prefix))) # Move system config into top-level /etc/ ETC_GITCONFIG = ../etc/gitconfig From 199592baa9a79eeaeef04fadffb90b87e94815ff Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 12 Aug 2022 12:44:15 +0200 Subject: [PATCH 074/297] git-compat-util: avoid redeclaring _DEFAULT_SOURCE We are about to vendor in `mimalloc`'s source code which we will want to include `git-compat-util.h` after defining that constant. Signed-off-by: Johannes Schindelin --- git-compat-util.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/git-compat-util.h b/git-compat-util.h index 71b4d23f0386b2..468d40498fd35a 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -191,7 +191,9 @@ struct strbuf; #define _ALL_SOURCE 1 #define _GNU_SOURCE 1 #define _BSD_SOURCE 1 +#ifndef _DEFAULT_SOURCE #define _DEFAULT_SOURCE 1 +#endif #define _NETBSD_SOURCE 1 #define _SGI_SOURCE 1 From e7da041ff13c3001e9a47a094efb0bd71cfa63e8 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 24 Jun 2019 21:31:30 +0200 Subject: [PATCH 075/297] Import the source code of mimalloc v2.1.2 This commit imports mimalloc's source code as per v2.1.2, fetched from the tag at https://github.com/microsoft/mimalloc. The .c files are from the src/ subdirectory, and the .h files from the include/ and include/mimalloc/ subdirectories. We will subsequently modify the source code to accommodate building within Git's context. Since we plan on using the `mi_*()` family of functions, we skip the C++-specific source code, some POSIX compliant functions to interact with mimalloc, and the code that wants to support auto-magic overriding of the `malloc()` function (mimalloc-new-delete.h, alloc-posix.c, mimalloc-override.h, alloc-override.c, alloc-override-osx.c, alloc-override-win.c and static.c). To appease the `check-whitespace` job of Git's Continuous Integration, this commit was washed one time via `git rebase --whitespace=fix`. Signed-off-by: Johannes Schindelin --- Makefile | 1 + compat/mimalloc/LICENSE | 21 + compat/mimalloc/alloc-aligned.c | 298 +++++ compat/mimalloc/alloc.c | 1060 ++++++++++++++++++ compat/mimalloc/arena.c | 935 ++++++++++++++++ compat/mimalloc/bitmap.c | 432 +++++++ compat/mimalloc/bitmap.h | 115 ++ compat/mimalloc/heap.c | 626 +++++++++++ compat/mimalloc/init.c | 709 ++++++++++++ compat/mimalloc/mimalloc.h | 565 ++++++++++ compat/mimalloc/mimalloc/atomic.h | 385 +++++++ compat/mimalloc/mimalloc/internal.h | 979 ++++++++++++++++ compat/mimalloc/mimalloc/prim.h | 323 ++++++ compat/mimalloc/mimalloc/track.h | 147 +++ compat/mimalloc/mimalloc/types.h | 670 +++++++++++ compat/mimalloc/options.c | 571 ++++++++++ compat/mimalloc/os.c | 689 ++++++++++++ compat/mimalloc/page-queue.c | 332 ++++++ compat/mimalloc/page.c | 939 ++++++++++++++++ compat/mimalloc/prim/windows/prim.c | 622 +++++++++++ compat/mimalloc/random.c | 254 +++++ compat/mimalloc/segment-cache.c | 0 compat/mimalloc/segment-map.c | 153 +++ compat/mimalloc/segment.c | 1617 +++++++++++++++++++++++++++ compat/mimalloc/stats.c | 467 ++++++++ 25 files changed, 12910 insertions(+) create mode 100644 compat/mimalloc/LICENSE create mode 100644 compat/mimalloc/alloc-aligned.c create mode 100644 compat/mimalloc/alloc.c create mode 100644 compat/mimalloc/arena.c create mode 100644 compat/mimalloc/bitmap.c create mode 100644 compat/mimalloc/bitmap.h create mode 100644 compat/mimalloc/heap.c create mode 100644 compat/mimalloc/init.c create mode 100644 compat/mimalloc/mimalloc.h create mode 100644 compat/mimalloc/mimalloc/atomic.h create mode 100644 compat/mimalloc/mimalloc/internal.h create mode 100644 compat/mimalloc/mimalloc/prim.h create mode 100644 compat/mimalloc/mimalloc/track.h create mode 100644 compat/mimalloc/mimalloc/types.h create mode 100644 compat/mimalloc/options.c create mode 100644 compat/mimalloc/os.c create mode 100644 compat/mimalloc/page-queue.c create mode 100644 compat/mimalloc/page.c create mode 100644 compat/mimalloc/prim/windows/prim.c create mode 100644 compat/mimalloc/random.c create mode 100644 compat/mimalloc/segment-cache.c create mode 100644 compat/mimalloc/segment-map.c create mode 100644 compat/mimalloc/segment.c create mode 100644 compat/mimalloc/stats.c diff --git a/Makefile b/Makefile index d6479092a0b8df..db48c72620c78d 100644 --- a/Makefile +++ b/Makefile @@ -1325,6 +1325,7 @@ BUILTIN_OBJS += builtin/write-tree.o # upstream unnecessarily (making merging in future changes easier). THIRD_PARTY_SOURCES += compat/inet_ntop.c THIRD_PARTY_SOURCES += compat/inet_pton.c +THIRD_PARTY_SOURCES += compat/mimalloc/% THIRD_PARTY_SOURCES += compat/nedmalloc/% THIRD_PARTY_SOURCES += compat/obstack.% THIRD_PARTY_SOURCES += compat/poll/% diff --git a/compat/mimalloc/LICENSE b/compat/mimalloc/LICENSE new file mode 100644 index 00000000000000..670b668a0c928e --- /dev/null +++ b/compat/mimalloc/LICENSE @@ -0,0 +1,21 @@ +MIT License + +Copyright (c) 2018-2021 Microsoft Corporation, Daan Leijen + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. diff --git a/compat/mimalloc/alloc-aligned.c b/compat/mimalloc/alloc-aligned.c new file mode 100644 index 00000000000000..e975af5f7c2ad4 --- /dev/null +++ b/compat/mimalloc/alloc-aligned.c @@ -0,0 +1,298 @@ +/* ---------------------------------------------------------------------------- +Copyright (c) 2018-2021, Microsoft Research, Daan Leijen +This is free software; you can redistribute it and/or modify it under the +terms of the MIT license. A copy of the license can be found in the file +"LICENSE" at the root of this distribution. +-----------------------------------------------------------------------------*/ + +#include "mimalloc.h" +#include "mimalloc/internal.h" +#include "mimalloc/prim.h" // mi_prim_get_default_heap + +#include // memset + +// ------------------------------------------------------ +// Aligned Allocation +// ------------------------------------------------------ + +// Fallback primitive aligned allocation -- split out for better codegen +static mi_decl_noinline void* mi_heap_malloc_zero_aligned_at_fallback(mi_heap_t* const heap, const size_t size, const size_t alignment, const size_t offset, const bool zero) mi_attr_noexcept +{ + mi_assert_internal(size <= PTRDIFF_MAX); + mi_assert_internal(alignment != 0 && _mi_is_power_of_two(alignment)); + + const uintptr_t align_mask = alignment - 1; // for any x, `(x & align_mask) == (x % alignment)` + const size_t padsize = size + MI_PADDING_SIZE; + + // use regular allocation if it is guaranteed to fit the alignment constraints + if (offset==0 && alignment<=padsize && padsize<=MI_MAX_ALIGN_GUARANTEE && (padsize&align_mask)==0) { + void* p = _mi_heap_malloc_zero(heap, size, zero); + mi_assert_internal(p == NULL || ((uintptr_t)p % alignment) == 0); + return p; + } + + void* p; + size_t oversize; + if mi_unlikely(alignment > MI_ALIGNMENT_MAX) { + // use OS allocation for very large alignment and allocate inside a huge page (dedicated segment with 1 page) + // This can support alignments >= MI_SEGMENT_SIZE by ensuring the object can be aligned at a point in the + // first (and single) page such that the segment info is `MI_SEGMENT_SIZE` bytes before it (so it can be found by aligning the pointer down) + if mi_unlikely(offset != 0) { + // todo: cannot support offset alignment for very large alignments yet + #if MI_DEBUG > 0 + _mi_error_message(EOVERFLOW, "aligned allocation with a very large alignment cannot be used with an alignment offset (size %zu, alignment %zu, offset %zu)\n", size, alignment, offset); + #endif + return NULL; + } + oversize = (size <= MI_SMALL_SIZE_MAX ? MI_SMALL_SIZE_MAX + 1 /* ensure we use generic malloc path */ : size); + p = _mi_heap_malloc_zero_ex(heap, oversize, false, alignment); // the page block size should be large enough to align in the single huge page block + // zero afterwards as only the area from the aligned_p may be committed! + if (p == NULL) return NULL; + } + else { + // otherwise over-allocate + oversize = size + alignment - 1; + p = _mi_heap_malloc_zero(heap, oversize, zero); + if (p == NULL) return NULL; + } + + // .. and align within the allocation + const uintptr_t poffset = ((uintptr_t)p + offset) & align_mask; + const uintptr_t adjust = (poffset == 0 ? 0 : alignment - poffset); + mi_assert_internal(adjust < alignment); + void* aligned_p = (void*)((uintptr_t)p + adjust); + if (aligned_p != p) { + mi_page_t* page = _mi_ptr_page(p); + mi_page_set_has_aligned(page, true); + _mi_padding_shrink(page, (mi_block_t*)p, adjust + size); + } + // todo: expand padding if overallocated ? + + mi_assert_internal(mi_page_usable_block_size(_mi_ptr_page(p)) >= adjust + size); + mi_assert_internal(p == _mi_page_ptr_unalign(_mi_ptr_segment(aligned_p), _mi_ptr_page(aligned_p), aligned_p)); + mi_assert_internal(((uintptr_t)aligned_p + offset) % alignment == 0); + mi_assert_internal(mi_usable_size(aligned_p)>=size); + mi_assert_internal(mi_usable_size(p) == mi_usable_size(aligned_p)+adjust); + + // now zero the block if needed + if (alignment > MI_ALIGNMENT_MAX) { + // for the tracker, on huge aligned allocations only from the start of the large block is defined + mi_track_mem_undefined(aligned_p, size); + if (zero) { + _mi_memzero_aligned(aligned_p, mi_usable_size(aligned_p)); + } + } + + if (p != aligned_p) { + mi_track_align(p,aligned_p,adjust,mi_usable_size(aligned_p)); + } + return aligned_p; +} + +// Primitive aligned allocation +static void* mi_heap_malloc_zero_aligned_at(mi_heap_t* const heap, const size_t size, const size_t alignment, const size_t offset, const bool zero) mi_attr_noexcept +{ + // note: we don't require `size > offset`, we just guarantee that the address at offset is aligned regardless of the allocated size. + if mi_unlikely(alignment == 0 || !_mi_is_power_of_two(alignment)) { // require power-of-two (see ) + #if MI_DEBUG > 0 + _mi_error_message(EOVERFLOW, "aligned allocation requires the alignment to be a power-of-two (size %zu, alignment %zu)\n", size, alignment); + #endif + return NULL; + } + + if mi_unlikely(size > PTRDIFF_MAX) { // we don't allocate more than PTRDIFF_MAX (see ) + #if MI_DEBUG > 0 + _mi_error_message(EOVERFLOW, "aligned allocation request is too large (size %zu, alignment %zu)\n", size, alignment); + #endif + return NULL; + } + const uintptr_t align_mask = alignment-1; // for any x, `(x & align_mask) == (x % alignment)` + const size_t padsize = size + MI_PADDING_SIZE; // note: cannot overflow due to earlier size > PTRDIFF_MAX check + + // try first if there happens to be a small block available with just the right alignment + if mi_likely(padsize <= MI_SMALL_SIZE_MAX && alignment <= padsize) { + mi_page_t* page = _mi_heap_get_free_small_page(heap, padsize); + const bool is_aligned = (((uintptr_t)page->free+offset) & align_mask)==0; + if mi_likely(page->free != NULL && is_aligned) + { + #if MI_STAT>1 + mi_heap_stat_increase(heap, malloc, size); + #endif + void* p = _mi_page_malloc(heap, page, padsize, zero); // TODO: inline _mi_page_malloc + mi_assert_internal(p != NULL); + mi_assert_internal(((uintptr_t)p + offset) % alignment == 0); + mi_track_malloc(p,size,zero); + return p; + } + } + // fallback + return mi_heap_malloc_zero_aligned_at_fallback(heap, size, alignment, offset, zero); +} + + +// ------------------------------------------------------ +// Optimized mi_heap_malloc_aligned / mi_malloc_aligned +// ------------------------------------------------------ + +mi_decl_nodiscard mi_decl_restrict void* mi_heap_malloc_aligned_at(mi_heap_t* heap, size_t size, size_t alignment, size_t offset) mi_attr_noexcept { + return mi_heap_malloc_zero_aligned_at(heap, size, alignment, offset, false); +} + +mi_decl_nodiscard mi_decl_restrict void* mi_heap_malloc_aligned(mi_heap_t* heap, size_t size, size_t alignment) mi_attr_noexcept { + if mi_unlikely(alignment == 0 || !_mi_is_power_of_two(alignment)) return NULL; + #if !MI_PADDING + // without padding, any small sized allocation is naturally aligned (see also `_mi_segment_page_start`) + if mi_likely(_mi_is_power_of_two(size) && size >= alignment && size <= MI_SMALL_SIZE_MAX) + #else + // with padding, we can only guarantee this for fixed alignments + if mi_likely((alignment == sizeof(void*) || (alignment == MI_MAX_ALIGN_SIZE && size > (MI_MAX_ALIGN_SIZE/2))) + && size <= MI_SMALL_SIZE_MAX) + #endif + { + // fast path for common alignment and size + return mi_heap_malloc_small(heap, size); + } + else { + return mi_heap_malloc_aligned_at(heap, size, alignment, 0); + } +} + +// ensure a definition is emitted +#if defined(__cplusplus) +static void* _mi_heap_malloc_aligned = (void*)&mi_heap_malloc_aligned; +#endif + +// ------------------------------------------------------ +// Aligned Allocation +// ------------------------------------------------------ + +mi_decl_nodiscard mi_decl_restrict void* mi_heap_zalloc_aligned_at(mi_heap_t* heap, size_t size, size_t alignment, size_t offset) mi_attr_noexcept { + return mi_heap_malloc_zero_aligned_at(heap, size, alignment, offset, true); +} + +mi_decl_nodiscard mi_decl_restrict void* mi_heap_zalloc_aligned(mi_heap_t* heap, size_t size, size_t alignment) mi_attr_noexcept { + return mi_heap_zalloc_aligned_at(heap, size, alignment, 0); +} + +mi_decl_nodiscard mi_decl_restrict void* mi_heap_calloc_aligned_at(mi_heap_t* heap, size_t count, size_t size, size_t alignment, size_t offset) mi_attr_noexcept { + size_t total; + if (mi_count_size_overflow(count, size, &total)) return NULL; + return mi_heap_zalloc_aligned_at(heap, total, alignment, offset); +} + +mi_decl_nodiscard mi_decl_restrict void* mi_heap_calloc_aligned(mi_heap_t* heap, size_t count, size_t size, size_t alignment) mi_attr_noexcept { + return mi_heap_calloc_aligned_at(heap,count,size,alignment,0); +} + +mi_decl_nodiscard mi_decl_restrict void* mi_malloc_aligned_at(size_t size, size_t alignment, size_t offset) mi_attr_noexcept { + return mi_heap_malloc_aligned_at(mi_prim_get_default_heap(), size, alignment, offset); +} + +mi_decl_nodiscard mi_decl_restrict void* mi_malloc_aligned(size_t size, size_t alignment) mi_attr_noexcept { + return mi_heap_malloc_aligned(mi_prim_get_default_heap(), size, alignment); +} + +mi_decl_nodiscard mi_decl_restrict void* mi_zalloc_aligned_at(size_t size, size_t alignment, size_t offset) mi_attr_noexcept { + return mi_heap_zalloc_aligned_at(mi_prim_get_default_heap(), size, alignment, offset); +} + +mi_decl_nodiscard mi_decl_restrict void* mi_zalloc_aligned(size_t size, size_t alignment) mi_attr_noexcept { + return mi_heap_zalloc_aligned(mi_prim_get_default_heap(), size, alignment); +} + +mi_decl_nodiscard mi_decl_restrict void* mi_calloc_aligned_at(size_t count, size_t size, size_t alignment, size_t offset) mi_attr_noexcept { + return mi_heap_calloc_aligned_at(mi_prim_get_default_heap(), count, size, alignment, offset); +} + +mi_decl_nodiscard mi_decl_restrict void* mi_calloc_aligned(size_t count, size_t size, size_t alignment) mi_attr_noexcept { + return mi_heap_calloc_aligned(mi_prim_get_default_heap(), count, size, alignment); +} + + +// ------------------------------------------------------ +// Aligned re-allocation +// ------------------------------------------------------ + +static void* mi_heap_realloc_zero_aligned_at(mi_heap_t* heap, void* p, size_t newsize, size_t alignment, size_t offset, bool zero) mi_attr_noexcept { + mi_assert(alignment > 0); + if (alignment <= sizeof(uintptr_t)) return _mi_heap_realloc_zero(heap,p,newsize,zero); + if (p == NULL) return mi_heap_malloc_zero_aligned_at(heap,newsize,alignment,offset,zero); + size_t size = mi_usable_size(p); + if (newsize <= size && newsize >= (size - (size / 2)) + && (((uintptr_t)p + offset) % alignment) == 0) { + return p; // reallocation still fits, is aligned and not more than 50% waste + } + else { + // note: we don't zero allocate upfront so we only zero initialize the expanded part + void* newp = mi_heap_malloc_aligned_at(heap,newsize,alignment,offset); + if (newp != NULL) { + if (zero && newsize > size) { + // also set last word in the previous allocation to zero to ensure any padding is zero-initialized + size_t start = (size >= sizeof(intptr_t) ? size - sizeof(intptr_t) : 0); + _mi_memzero((uint8_t*)newp + start, newsize - start); + } + _mi_memcpy_aligned(newp, p, (newsize > size ? size : newsize)); + mi_free(p); // only free if successful + } + return newp; + } +} + +static void* mi_heap_realloc_zero_aligned(mi_heap_t* heap, void* p, size_t newsize, size_t alignment, bool zero) mi_attr_noexcept { + mi_assert(alignment > 0); + if (alignment <= sizeof(uintptr_t)) return _mi_heap_realloc_zero(heap,p,newsize,zero); + size_t offset = ((uintptr_t)p % alignment); // use offset of previous allocation (p can be NULL) + return mi_heap_realloc_zero_aligned_at(heap,p,newsize,alignment,offset,zero); +} + +mi_decl_nodiscard void* mi_heap_realloc_aligned_at(mi_heap_t* heap, void* p, size_t newsize, size_t alignment, size_t offset) mi_attr_noexcept { + return mi_heap_realloc_zero_aligned_at(heap,p,newsize,alignment,offset,false); +} + +mi_decl_nodiscard void* mi_heap_realloc_aligned(mi_heap_t* heap, void* p, size_t newsize, size_t alignment) mi_attr_noexcept { + return mi_heap_realloc_zero_aligned(heap,p,newsize,alignment,false); +} + +mi_decl_nodiscard void* mi_heap_rezalloc_aligned_at(mi_heap_t* heap, void* p, size_t newsize, size_t alignment, size_t offset) mi_attr_noexcept { + return mi_heap_realloc_zero_aligned_at(heap, p, newsize, alignment, offset, true); +} + +mi_decl_nodiscard void* mi_heap_rezalloc_aligned(mi_heap_t* heap, void* p, size_t newsize, size_t alignment) mi_attr_noexcept { + return mi_heap_realloc_zero_aligned(heap, p, newsize, alignment, true); +} + +mi_decl_nodiscard void* mi_heap_recalloc_aligned_at(mi_heap_t* heap, void* p, size_t newcount, size_t size, size_t alignment, size_t offset) mi_attr_noexcept { + size_t total; + if (mi_count_size_overflow(newcount, size, &total)) return NULL; + return mi_heap_rezalloc_aligned_at(heap, p, total, alignment, offset); +} + +mi_decl_nodiscard void* mi_heap_recalloc_aligned(mi_heap_t* heap, void* p, size_t newcount, size_t size, size_t alignment) mi_attr_noexcept { + size_t total; + if (mi_count_size_overflow(newcount, size, &total)) return NULL; + return mi_heap_rezalloc_aligned(heap, p, total, alignment); +} + +mi_decl_nodiscard void* mi_realloc_aligned_at(void* p, size_t newsize, size_t alignment, size_t offset) mi_attr_noexcept { + return mi_heap_realloc_aligned_at(mi_prim_get_default_heap(), p, newsize, alignment, offset); +} + +mi_decl_nodiscard void* mi_realloc_aligned(void* p, size_t newsize, size_t alignment) mi_attr_noexcept { + return mi_heap_realloc_aligned(mi_prim_get_default_heap(), p, newsize, alignment); +} + +mi_decl_nodiscard void* mi_rezalloc_aligned_at(void* p, size_t newsize, size_t alignment, size_t offset) mi_attr_noexcept { + return mi_heap_rezalloc_aligned_at(mi_prim_get_default_heap(), p, newsize, alignment, offset); +} + +mi_decl_nodiscard void* mi_rezalloc_aligned(void* p, size_t newsize, size_t alignment) mi_attr_noexcept { + return mi_heap_rezalloc_aligned(mi_prim_get_default_heap(), p, newsize, alignment); +} + +mi_decl_nodiscard void* mi_recalloc_aligned_at(void* p, size_t newcount, size_t size, size_t alignment, size_t offset) mi_attr_noexcept { + return mi_heap_recalloc_aligned_at(mi_prim_get_default_heap(), p, newcount, size, alignment, offset); +} + +mi_decl_nodiscard void* mi_recalloc_aligned(void* p, size_t newcount, size_t size, size_t alignment) mi_attr_noexcept { + return mi_heap_recalloc_aligned(mi_prim_get_default_heap(), p, newcount, size, alignment); +} diff --git a/compat/mimalloc/alloc.c b/compat/mimalloc/alloc.c new file mode 100644 index 00000000000000..961f6d53d0f2c7 --- /dev/null +++ b/compat/mimalloc/alloc.c @@ -0,0 +1,1060 @@ +/* ---------------------------------------------------------------------------- +Copyright (c) 2018-2022, Microsoft Research, Daan Leijen +This is free software; you can redistribute it and/or modify it under the +terms of the MIT license. A copy of the license can be found in the file +"LICENSE" at the root of this distribution. +-----------------------------------------------------------------------------*/ +#ifndef _DEFAULT_SOURCE +#define _DEFAULT_SOURCE // for realpath() on Linux +#endif + +#include "mimalloc.h" +#include "mimalloc/internal.h" +#include "mimalloc/atomic.h" +#include "mimalloc/prim.h" // _mi_prim_thread_id() + +#include // memset, strlen (for mi_strdup) +#include // malloc, abort + +#define MI_IN_ALLOC_C +#include "alloc-override.c" +#undef MI_IN_ALLOC_C + +// ------------------------------------------------------ +// Allocation +// ------------------------------------------------------ + +// Fast allocation in a page: just pop from the free list. +// Fall back to generic allocation only if the list is empty. +extern inline void* _mi_page_malloc(mi_heap_t* heap, mi_page_t* page, size_t size, bool zero) mi_attr_noexcept { + mi_assert_internal(page->xblock_size==0||mi_page_block_size(page) >= size); + mi_block_t* const block = page->free; + if mi_unlikely(block == NULL) { + return _mi_malloc_generic(heap, size, zero, 0); + } + mi_assert_internal(block != NULL && _mi_ptr_page(block) == page); + // pop from the free list + page->used++; + page->free = mi_block_next(page, block); + mi_assert_internal(page->free == NULL || _mi_ptr_page(page->free) == page); + #if MI_DEBUG>3 + if (page->free_is_zero) { + mi_assert_expensive(mi_mem_is_zero(block+1,size - sizeof(*block))); + } + #endif + + // allow use of the block internally + // note: when tracking we need to avoid ever touching the MI_PADDING since + // that is tracked by valgrind etc. as non-accessible (through the red-zone, see `mimalloc/track.h`) + mi_track_mem_undefined(block, mi_page_usable_block_size(page)); + + // zero the block? note: we need to zero the full block size (issue #63) + if mi_unlikely(zero) { + mi_assert_internal(page->xblock_size != 0); // do not call with zero'ing for huge blocks (see _mi_malloc_generic) + mi_assert_internal(page->xblock_size >= MI_PADDING_SIZE); + if (page->free_is_zero) { + block->next = 0; + mi_track_mem_defined(block, page->xblock_size - MI_PADDING_SIZE); + } + else { + _mi_memzero_aligned(block, page->xblock_size - MI_PADDING_SIZE); + } + } + +#if (MI_DEBUG>0) && !MI_TRACK_ENABLED && !MI_TSAN + if (!zero && !mi_page_is_huge(page)) { + memset(block, MI_DEBUG_UNINIT, mi_page_usable_block_size(page)); + } +#elif (MI_SECURE!=0) + if (!zero) { block->next = 0; } // don't leak internal data +#endif + +#if (MI_STAT>0) + const size_t bsize = mi_page_usable_block_size(page); + if (bsize <= MI_MEDIUM_OBJ_SIZE_MAX) { + mi_heap_stat_increase(heap, normal, bsize); + mi_heap_stat_counter_increase(heap, normal_count, 1); +#if (MI_STAT>1) + const size_t bin = _mi_bin(bsize); + mi_heap_stat_increase(heap, normal_bins[bin], 1); +#endif + } +#endif + +#if MI_PADDING // && !MI_TRACK_ENABLED + mi_padding_t* const padding = (mi_padding_t*)((uint8_t*)block + mi_page_usable_block_size(page)); + ptrdiff_t delta = ((uint8_t*)padding - (uint8_t*)block - (size - MI_PADDING_SIZE)); + #if (MI_DEBUG>=2) + mi_assert_internal(delta >= 0 && mi_page_usable_block_size(page) >= (size - MI_PADDING_SIZE + delta)); + #endif + mi_track_mem_defined(padding,sizeof(mi_padding_t)); // note: re-enable since mi_page_usable_block_size may set noaccess + padding->canary = (uint32_t)(mi_ptr_encode(page,block,page->keys)); + padding->delta = (uint32_t)(delta); + #if MI_PADDING_CHECK + if (!mi_page_is_huge(page)) { + uint8_t* fill = (uint8_t*)padding - delta; + const size_t maxpad = (delta > MI_MAX_ALIGN_SIZE ? MI_MAX_ALIGN_SIZE : delta); // set at most N initial padding bytes + for (size_t i = 0; i < maxpad; i++) { fill[i] = MI_DEBUG_PADDING; } + } + #endif +#endif + + return block; +} + +static inline mi_decl_restrict void* mi_heap_malloc_small_zero(mi_heap_t* heap, size_t size, bool zero) mi_attr_noexcept { + mi_assert(heap != NULL); + #if MI_DEBUG + const uintptr_t tid = _mi_thread_id(); + mi_assert(heap->thread_id == 0 || heap->thread_id == tid); // heaps are thread local + #endif + mi_assert(size <= MI_SMALL_SIZE_MAX); + #if (MI_PADDING) + if (size == 0) { size = sizeof(void*); } + #endif + mi_page_t* page = _mi_heap_get_free_small_page(heap, size + MI_PADDING_SIZE); + void* const p = _mi_page_malloc(heap, page, size + MI_PADDING_SIZE, zero); + mi_track_malloc(p,size,zero); + #if MI_STAT>1 + if (p != NULL) { + if (!mi_heap_is_initialized(heap)) { heap = mi_prim_get_default_heap(); } + mi_heap_stat_increase(heap, malloc, mi_usable_size(p)); + } + #endif + #if MI_DEBUG>3 + if (p != NULL && zero) { + mi_assert_expensive(mi_mem_is_zero(p, size)); + } + #endif + return p; +} + +// allocate a small block +mi_decl_nodiscard extern inline mi_decl_restrict void* mi_heap_malloc_small(mi_heap_t* heap, size_t size) mi_attr_noexcept { + return mi_heap_malloc_small_zero(heap, size, false); +} + +mi_decl_nodiscard extern inline mi_decl_restrict void* mi_malloc_small(size_t size) mi_attr_noexcept { + return mi_heap_malloc_small(mi_prim_get_default_heap(), size); +} + +// The main allocation function +extern inline void* _mi_heap_malloc_zero_ex(mi_heap_t* heap, size_t size, bool zero, size_t huge_alignment) mi_attr_noexcept { + if mi_likely(size <= MI_SMALL_SIZE_MAX) { + mi_assert_internal(huge_alignment == 0); + return mi_heap_malloc_small_zero(heap, size, zero); + } + else { + mi_assert(heap!=NULL); + mi_assert(heap->thread_id == 0 || heap->thread_id == _mi_thread_id()); // heaps are thread local + void* const p = _mi_malloc_generic(heap, size + MI_PADDING_SIZE, zero, huge_alignment); // note: size can overflow but it is detected in malloc_generic + mi_track_malloc(p,size,zero); + #if MI_STAT>1 + if (p != NULL) { + if (!mi_heap_is_initialized(heap)) { heap = mi_prim_get_default_heap(); } + mi_heap_stat_increase(heap, malloc, mi_usable_size(p)); + } + #endif + #if MI_DEBUG>3 + if (p != NULL && zero) { + mi_assert_expensive(mi_mem_is_zero(p, size)); + } + #endif + return p; + } +} + +extern inline void* _mi_heap_malloc_zero(mi_heap_t* heap, size_t size, bool zero) mi_attr_noexcept { + return _mi_heap_malloc_zero_ex(heap, size, zero, 0); +} + +mi_decl_nodiscard extern inline mi_decl_restrict void* mi_heap_malloc(mi_heap_t* heap, size_t size) mi_attr_noexcept { + return _mi_heap_malloc_zero(heap, size, false); +} + +mi_decl_nodiscard extern inline mi_decl_restrict void* mi_malloc(size_t size) mi_attr_noexcept { + return mi_heap_malloc(mi_prim_get_default_heap(), size); +} + +// zero initialized small block +mi_decl_nodiscard mi_decl_restrict void* mi_zalloc_small(size_t size) mi_attr_noexcept { + return mi_heap_malloc_small_zero(mi_prim_get_default_heap(), size, true); +} + +mi_decl_nodiscard extern inline mi_decl_restrict void* mi_heap_zalloc(mi_heap_t* heap, size_t size) mi_attr_noexcept { + return _mi_heap_malloc_zero(heap, size, true); +} + +mi_decl_nodiscard mi_decl_restrict void* mi_zalloc(size_t size) mi_attr_noexcept { + return mi_heap_zalloc(mi_prim_get_default_heap(),size); +} + + +// ------------------------------------------------------ +// Check for double free in secure and debug mode +// This is somewhat expensive so only enabled for secure mode 4 +// ------------------------------------------------------ + +#if (MI_ENCODE_FREELIST && (MI_SECURE>=4 || MI_DEBUG!=0)) +// linear check if the free list contains a specific element +static bool mi_list_contains(const mi_page_t* page, const mi_block_t* list, const mi_block_t* elem) { + while (list != NULL) { + if (elem==list) return true; + list = mi_block_next(page, list); + } + return false; +} + +static mi_decl_noinline bool mi_check_is_double_freex(const mi_page_t* page, const mi_block_t* block) { + // The decoded value is in the same page (or NULL). + // Walk the free lists to verify positively if it is already freed + if (mi_list_contains(page, page->free, block) || + mi_list_contains(page, page->local_free, block) || + mi_list_contains(page, mi_page_thread_free(page), block)) + { + _mi_error_message(EAGAIN, "double free detected of block %p with size %zu\n", block, mi_page_block_size(page)); + return true; + } + return false; +} + +#define mi_track_page(page,access) { size_t psize; void* pstart = _mi_page_start(_mi_page_segment(page),page,&psize); mi_track_mem_##access( pstart, psize); } + +static inline bool mi_check_is_double_free(const mi_page_t* page, const mi_block_t* block) { + bool is_double_free = false; + mi_block_t* n = mi_block_nextx(page, block, page->keys); // pretend it is freed, and get the decoded first field + if (((uintptr_t)n & (MI_INTPTR_SIZE-1))==0 && // quick check: aligned pointer? + (n==NULL || mi_is_in_same_page(block, n))) // quick check: in same page or NULL? + { + // Suspicous: decoded value a in block is in the same page (or NULL) -- maybe a double free? + // (continue in separate function to improve code generation) + is_double_free = mi_check_is_double_freex(page, block); + } + return is_double_free; +} +#else +static inline bool mi_check_is_double_free(const mi_page_t* page, const mi_block_t* block) { + MI_UNUSED(page); + MI_UNUSED(block); + return false; +} +#endif + +// --------------------------------------------------------------------------- +// Check for heap block overflow by setting up padding at the end of the block +// --------------------------------------------------------------------------- + +#if MI_PADDING // && !MI_TRACK_ENABLED +static bool mi_page_decode_padding(const mi_page_t* page, const mi_block_t* block, size_t* delta, size_t* bsize) { + *bsize = mi_page_usable_block_size(page); + const mi_padding_t* const padding = (mi_padding_t*)((uint8_t*)block + *bsize); + mi_track_mem_defined(padding,sizeof(mi_padding_t)); + *delta = padding->delta; + uint32_t canary = padding->canary; + uintptr_t keys[2]; + keys[0] = page->keys[0]; + keys[1] = page->keys[1]; + bool ok = ((uint32_t)mi_ptr_encode(page,block,keys) == canary && *delta <= *bsize); + mi_track_mem_noaccess(padding,sizeof(mi_padding_t)); + return ok; +} + +// Return the exact usable size of a block. +static size_t mi_page_usable_size_of(const mi_page_t* page, const mi_block_t* block) { + size_t bsize; + size_t delta; + bool ok = mi_page_decode_padding(page, block, &delta, &bsize); + mi_assert_internal(ok); mi_assert_internal(delta <= bsize); + return (ok ? bsize - delta : 0); +} + +// When a non-thread-local block is freed, it becomes part of the thread delayed free +// list that is freed later by the owning heap. If the exact usable size is too small to +// contain the pointer for the delayed list, then shrink the padding (by decreasing delta) +// so it will later not trigger an overflow error in `mi_free_block`. +void _mi_padding_shrink(const mi_page_t* page, const mi_block_t* block, const size_t min_size) { + size_t bsize; + size_t delta; + bool ok = mi_page_decode_padding(page, block, &delta, &bsize); + mi_assert_internal(ok); + if (!ok || (bsize - delta) >= min_size) return; // usually already enough space + mi_assert_internal(bsize >= min_size); + if (bsize < min_size) return; // should never happen + size_t new_delta = (bsize - min_size); + mi_assert_internal(new_delta < bsize); + mi_padding_t* padding = (mi_padding_t*)((uint8_t*)block + bsize); + mi_track_mem_defined(padding,sizeof(mi_padding_t)); + padding->delta = (uint32_t)new_delta; + mi_track_mem_noaccess(padding,sizeof(mi_padding_t)); +} +#else +static size_t mi_page_usable_size_of(const mi_page_t* page, const mi_block_t* block) { + MI_UNUSED(block); + return mi_page_usable_block_size(page); +} + +void _mi_padding_shrink(const mi_page_t* page, const mi_block_t* block, const size_t min_size) { + MI_UNUSED(page); + MI_UNUSED(block); + MI_UNUSED(min_size); +} +#endif + +#if MI_PADDING && MI_PADDING_CHECK + +static bool mi_verify_padding(const mi_page_t* page, const mi_block_t* block, size_t* size, size_t* wrong) { + size_t bsize; + size_t delta; + bool ok = mi_page_decode_padding(page, block, &delta, &bsize); + *size = *wrong = bsize; + if (!ok) return false; + mi_assert_internal(bsize >= delta); + *size = bsize - delta; + if (!mi_page_is_huge(page)) { + uint8_t* fill = (uint8_t*)block + bsize - delta; + const size_t maxpad = (delta > MI_MAX_ALIGN_SIZE ? MI_MAX_ALIGN_SIZE : delta); // check at most the first N padding bytes + mi_track_mem_defined(fill, maxpad); + for (size_t i = 0; i < maxpad; i++) { + if (fill[i] != MI_DEBUG_PADDING) { + *wrong = bsize - delta + i; + ok = false; + break; + } + } + mi_track_mem_noaccess(fill, maxpad); + } + return ok; +} + +static void mi_check_padding(const mi_page_t* page, const mi_block_t* block) { + size_t size; + size_t wrong; + if (!mi_verify_padding(page,block,&size,&wrong)) { + _mi_error_message(EFAULT, "buffer overflow in heap block %p of size %zu: write after %zu bytes\n", block, size, wrong ); + } +} + +#else + +static void mi_check_padding(const mi_page_t* page, const mi_block_t* block) { + MI_UNUSED(page); + MI_UNUSED(block); +} + +#endif + +// only maintain stats for smaller objects if requested +#if (MI_STAT>0) +static void mi_stat_free(const mi_page_t* page, const mi_block_t* block) { + #if (MI_STAT < 2) + MI_UNUSED(block); + #endif + mi_heap_t* const heap = mi_heap_get_default(); + const size_t bsize = mi_page_usable_block_size(page); + #if (MI_STAT>1) + const size_t usize = mi_page_usable_size_of(page, block); + mi_heap_stat_decrease(heap, malloc, usize); + #endif + if (bsize <= MI_MEDIUM_OBJ_SIZE_MAX) { + mi_heap_stat_decrease(heap, normal, bsize); + #if (MI_STAT > 1) + mi_heap_stat_decrease(heap, normal_bins[_mi_bin(bsize)], 1); + #endif + } + else if (bsize <= MI_LARGE_OBJ_SIZE_MAX) { + mi_heap_stat_decrease(heap, large, bsize); + } + else { + mi_heap_stat_decrease(heap, huge, bsize); + } +} +#else +static void mi_stat_free(const mi_page_t* page, const mi_block_t* block) { + MI_UNUSED(page); MI_UNUSED(block); +} +#endif + +#if MI_HUGE_PAGE_ABANDON +#if (MI_STAT>0) +// maintain stats for huge objects +static void mi_stat_huge_free(const mi_page_t* page) { + mi_heap_t* const heap = mi_heap_get_default(); + const size_t bsize = mi_page_block_size(page); // to match stats in `page.c:mi_page_huge_alloc` + if (bsize <= MI_LARGE_OBJ_SIZE_MAX) { + mi_heap_stat_decrease(heap, large, bsize); + } + else { + mi_heap_stat_decrease(heap, huge, bsize); + } +} +#else +static void mi_stat_huge_free(const mi_page_t* page) { + MI_UNUSED(page); +} +#endif +#endif + +// ------------------------------------------------------ +// Free +// ------------------------------------------------------ + +// multi-threaded free (or free in huge block if compiled with MI_HUGE_PAGE_ABANDON) +static mi_decl_noinline void _mi_free_block_mt(mi_page_t* page, mi_block_t* block) +{ + // The padding check may access the non-thread-owned page for the key values. + // that is safe as these are constant and the page won't be freed (as the block is not freed yet). + mi_check_padding(page, block); + _mi_padding_shrink(page, block, sizeof(mi_block_t)); // for small size, ensure we can fit the delayed thread pointers without triggering overflow detection + + // huge page segments are always abandoned and can be freed immediately + mi_segment_t* segment = _mi_page_segment(page); + if (segment->kind == MI_SEGMENT_HUGE) { + #if MI_HUGE_PAGE_ABANDON + // huge page segments are always abandoned and can be freed immediately + mi_stat_huge_free(page); + _mi_segment_huge_page_free(segment, page, block); + return; + #else + // huge pages are special as they occupy the entire segment + // as these are large we reset the memory occupied by the page so it is available to other threads + // (as the owning thread needs to actually free the memory later). + _mi_segment_huge_page_reset(segment, page, block); + #endif + } + + #if (MI_DEBUG>0) && !MI_TRACK_ENABLED && !MI_TSAN // note: when tracking, cannot use mi_usable_size with multi-threading + if (segment->kind != MI_SEGMENT_HUGE) { // not for huge segments as we just reset the content + memset(block, MI_DEBUG_FREED, mi_usable_size(block)); + } + #endif + + // Try to put the block on either the page-local thread free list, or the heap delayed free list. + mi_thread_free_t tfreex; + bool use_delayed; + mi_thread_free_t tfree = mi_atomic_load_relaxed(&page->xthread_free); + do { + use_delayed = (mi_tf_delayed(tfree) == MI_USE_DELAYED_FREE); + if mi_unlikely(use_delayed) { + // unlikely: this only happens on the first concurrent free in a page that is in the full list + tfreex = mi_tf_set_delayed(tfree,MI_DELAYED_FREEING); + } + else { + // usual: directly add to page thread_free list + mi_block_set_next(page, block, mi_tf_block(tfree)); + tfreex = mi_tf_set_block(tfree,block); + } + } while (!mi_atomic_cas_weak_release(&page->xthread_free, &tfree, tfreex)); + + if mi_unlikely(use_delayed) { + // racy read on `heap`, but ok because MI_DELAYED_FREEING is set (see `mi_heap_delete` and `mi_heap_collect_abandon`) + mi_heap_t* const heap = (mi_heap_t*)(mi_atomic_load_acquire(&page->xheap)); //mi_page_heap(page); + mi_assert_internal(heap != NULL); + if (heap != NULL) { + // add to the delayed free list of this heap. (do this atomically as the lock only protects heap memory validity) + mi_block_t* dfree = mi_atomic_load_ptr_relaxed(mi_block_t, &heap->thread_delayed_free); + do { + mi_block_set_nextx(heap,block,dfree, heap->keys); + } while (!mi_atomic_cas_ptr_weak_release(mi_block_t,&heap->thread_delayed_free, &dfree, block)); + } + + // and reset the MI_DELAYED_FREEING flag + tfree = mi_atomic_load_relaxed(&page->xthread_free); + do { + tfreex = tfree; + mi_assert_internal(mi_tf_delayed(tfree) == MI_DELAYED_FREEING); + tfreex = mi_tf_set_delayed(tfree,MI_NO_DELAYED_FREE); + } while (!mi_atomic_cas_weak_release(&page->xthread_free, &tfree, tfreex)); + } +} + +// regular free +static inline void _mi_free_block(mi_page_t* page, bool local, mi_block_t* block) +{ + // and push it on the free list + //const size_t bsize = mi_page_block_size(page); + if mi_likely(local) { + // owning thread can free a block directly + if mi_unlikely(mi_check_is_double_free(page, block)) return; + mi_check_padding(page, block); + #if (MI_DEBUG>0) && !MI_TRACK_ENABLED && !MI_TSAN + if (!mi_page_is_huge(page)) { // huge page content may be already decommitted + memset(block, MI_DEBUG_FREED, mi_page_block_size(page)); + } + #endif + mi_block_set_next(page, block, page->local_free); + page->local_free = block; + page->used--; + if mi_unlikely(mi_page_all_free(page)) { + _mi_page_retire(page); + } + else if mi_unlikely(mi_page_is_in_full(page)) { + _mi_page_unfull(page); + } + } + else { + _mi_free_block_mt(page,block); + } +} + + +// Adjust a block that was allocated aligned, to the actual start of the block in the page. +mi_block_t* _mi_page_ptr_unalign(const mi_segment_t* segment, const mi_page_t* page, const void* p) { + mi_assert_internal(page!=NULL && p!=NULL); + const size_t diff = (uint8_t*)p - _mi_page_start(segment, page, NULL); + const size_t adjust = (diff % mi_page_block_size(page)); + return (mi_block_t*)((uintptr_t)p - adjust); +} + + +void mi_decl_noinline _mi_free_generic(const mi_segment_t* segment, mi_page_t* page, bool is_local, void* p) mi_attr_noexcept { + mi_block_t* const block = (mi_page_has_aligned(page) ? _mi_page_ptr_unalign(segment, page, p) : (mi_block_t*)p); + mi_stat_free(page, block); // stat_free may access the padding + mi_track_free_size(block, mi_page_usable_size_of(page,block)); + _mi_free_block(page, is_local, block); +} + +// Get the segment data belonging to a pointer +// This is just a single `and` in assembly but does further checks in debug mode +// (and secure mode) if this was a valid pointer. +static inline mi_segment_t* mi_checked_ptr_segment(const void* p, const char* msg) +{ + MI_UNUSED(msg); + mi_assert(p != NULL); + +#if (MI_DEBUG>0) + if mi_unlikely(((uintptr_t)p & (MI_INTPTR_SIZE - 1)) != 0) { + _mi_error_message(EINVAL, "%s: invalid (unaligned) pointer: %p\n", msg, p); + return NULL; + } +#endif + + mi_segment_t* const segment = _mi_ptr_segment(p); + mi_assert_internal(segment != NULL); + +#if (MI_DEBUG>0) + if mi_unlikely(!mi_is_in_heap_region(p)) { + #if (MI_INTPTR_SIZE == 8 && defined(__linux__)) + if (((uintptr_t)p >> 40) != 0x7F) { // linux tends to align large blocks above 0x7F000000000 (issue #640) + #else + { + #endif + _mi_warning_message("%s: pointer might not point to a valid heap region: %p\n" + "(this may still be a valid very large allocation (over 64MiB))\n", msg, p); + if mi_likely(_mi_ptr_cookie(segment) == segment->cookie) { + _mi_warning_message("(yes, the previous pointer %p was valid after all)\n", p); + } + } + } +#endif +#if (MI_DEBUG>0 || MI_SECURE>=4) + if mi_unlikely(_mi_ptr_cookie(segment) != segment->cookie) { + _mi_error_message(EINVAL, "%s: pointer does not point to a valid heap space: %p\n", msg, p); + return NULL; + } +#endif + + return segment; +} + +// Free a block +// fast path written carefully to prevent spilling on the stack +void mi_free(void* p) mi_attr_noexcept +{ + if mi_unlikely(p == NULL) return; + mi_segment_t* const segment = mi_checked_ptr_segment(p,"mi_free"); + const bool is_local= (_mi_prim_thread_id() == mi_atomic_load_relaxed(&segment->thread_id)); + mi_page_t* const page = _mi_segment_page_of(segment, p); + + if mi_likely(is_local) { // thread-local free? + if mi_likely(page->flags.full_aligned == 0) // and it is not a full page (full pages need to move from the full bin), nor has aligned blocks (aligned blocks need to be unaligned) + { + mi_block_t* const block = (mi_block_t*)p; + if mi_unlikely(mi_check_is_double_free(page, block)) return; + mi_check_padding(page, block); + mi_stat_free(page, block); + #if (MI_DEBUG>0) && !MI_TRACK_ENABLED && !MI_TSAN + memset(block, MI_DEBUG_FREED, mi_page_block_size(page)); + #endif + mi_track_free_size(p, mi_page_usable_size_of(page,block)); // faster then mi_usable_size as we already know the page and that p is unaligned + mi_block_set_next(page, block, page->local_free); + page->local_free = block; + if mi_unlikely(--page->used == 0) { // using this expression generates better code than: page->used--; if (mi_page_all_free(page)) + _mi_page_retire(page); + } + } + else { + // page is full or contains (inner) aligned blocks; use generic path + _mi_free_generic(segment, page, true, p); + } + } + else { + // not thread-local; use generic path + _mi_free_generic(segment, page, false, p); + } +} + +// return true if successful +bool _mi_free_delayed_block(mi_block_t* block) { + // get segment and page + const mi_segment_t* const segment = _mi_ptr_segment(block); + mi_assert_internal(_mi_ptr_cookie(segment) == segment->cookie); + mi_assert_internal(_mi_thread_id() == segment->thread_id); + mi_page_t* const page = _mi_segment_page_of(segment, block); + + // Clear the no-delayed flag so delayed freeing is used again for this page. + // This must be done before collecting the free lists on this page -- otherwise + // some blocks may end up in the page `thread_free` list with no blocks in the + // heap `thread_delayed_free` list which may cause the page to be never freed! + // (it would only be freed if we happen to scan it in `mi_page_queue_find_free_ex`) + if (!_mi_page_try_use_delayed_free(page, MI_USE_DELAYED_FREE, false /* dont overwrite never delayed */)) { + return false; + } + + // collect all other non-local frees to ensure up-to-date `used` count + _mi_page_free_collect(page, false); + + // and free the block (possibly freeing the page as well since used is updated) + _mi_free_block(page, true, block); + return true; +} + +// Bytes available in a block +mi_decl_noinline static size_t mi_page_usable_aligned_size_of(const mi_segment_t* segment, const mi_page_t* page, const void* p) mi_attr_noexcept { + const mi_block_t* block = _mi_page_ptr_unalign(segment, page, p); + const size_t size = mi_page_usable_size_of(page, block); + const ptrdiff_t adjust = (uint8_t*)p - (uint8_t*)block; + mi_assert_internal(adjust >= 0 && (size_t)adjust <= size); + return (size - adjust); +} + +static inline size_t _mi_usable_size(const void* p, const char* msg) mi_attr_noexcept { + if (p == NULL) return 0; + const mi_segment_t* const segment = mi_checked_ptr_segment(p, msg); + const mi_page_t* const page = _mi_segment_page_of(segment, p); + if mi_likely(!mi_page_has_aligned(page)) { + const mi_block_t* block = (const mi_block_t*)p; + return mi_page_usable_size_of(page, block); + } + else { + // split out to separate routine for improved code generation + return mi_page_usable_aligned_size_of(segment, page, p); + } +} + +mi_decl_nodiscard size_t mi_usable_size(const void* p) mi_attr_noexcept { + return _mi_usable_size(p, "mi_usable_size"); +} + + +// ------------------------------------------------------ +// Allocation extensions +// ------------------------------------------------------ + +void mi_free_size(void* p, size_t size) mi_attr_noexcept { + MI_UNUSED_RELEASE(size); + mi_assert(p == NULL || size <= _mi_usable_size(p,"mi_free_size")); + mi_free(p); +} + +void mi_free_size_aligned(void* p, size_t size, size_t alignment) mi_attr_noexcept { + MI_UNUSED_RELEASE(alignment); + mi_assert(((uintptr_t)p % alignment) == 0); + mi_free_size(p,size); +} + +void mi_free_aligned(void* p, size_t alignment) mi_attr_noexcept { + MI_UNUSED_RELEASE(alignment); + mi_assert(((uintptr_t)p % alignment) == 0); + mi_free(p); +} + +mi_decl_nodiscard extern inline mi_decl_restrict void* mi_heap_calloc(mi_heap_t* heap, size_t count, size_t size) mi_attr_noexcept { + size_t total; + if (mi_count_size_overflow(count,size,&total)) return NULL; + return mi_heap_zalloc(heap,total); +} + +mi_decl_nodiscard mi_decl_restrict void* mi_calloc(size_t count, size_t size) mi_attr_noexcept { + return mi_heap_calloc(mi_prim_get_default_heap(),count,size); +} + +// Uninitialized `calloc` +mi_decl_nodiscard extern mi_decl_restrict void* mi_heap_mallocn(mi_heap_t* heap, size_t count, size_t size) mi_attr_noexcept { + size_t total; + if (mi_count_size_overflow(count, size, &total)) return NULL; + return mi_heap_malloc(heap, total); +} + +mi_decl_nodiscard mi_decl_restrict void* mi_mallocn(size_t count, size_t size) mi_attr_noexcept { + return mi_heap_mallocn(mi_prim_get_default_heap(),count,size); +} + +// Expand (or shrink) in place (or fail) +void* mi_expand(void* p, size_t newsize) mi_attr_noexcept { + #if MI_PADDING + // we do not shrink/expand with padding enabled + MI_UNUSED(p); MI_UNUSED(newsize); + return NULL; + #else + if (p == NULL) return NULL; + const size_t size = _mi_usable_size(p,"mi_expand"); + if (newsize > size) return NULL; + return p; // it fits + #endif +} + +void* _mi_heap_realloc_zero(mi_heap_t* heap, void* p, size_t newsize, bool zero) mi_attr_noexcept { + // if p == NULL then behave as malloc. + // else if size == 0 then reallocate to a zero-sized block (and don't return NULL, just as mi_malloc(0)). + // (this means that returning NULL always indicates an error, and `p` will not have been freed in that case.) + const size_t size = _mi_usable_size(p,"mi_realloc"); // also works if p == NULL (with size 0) + if mi_unlikely(newsize <= size && newsize >= (size / 2) && newsize > 0) { // note: newsize must be > 0 or otherwise we return NULL for realloc(NULL,0) + mi_assert_internal(p!=NULL); + // todo: do not track as the usable size is still the same in the free; adjust potential padding? + // mi_track_resize(p,size,newsize) + // if (newsize < size) { mi_track_mem_noaccess((uint8_t*)p + newsize, size - newsize); } + return p; // reallocation still fits and not more than 50% waste + } + void* newp = mi_heap_malloc(heap,newsize); + if mi_likely(newp != NULL) { + if (zero && newsize > size) { + // also set last word in the previous allocation to zero to ensure any padding is zero-initialized + const size_t start = (size >= sizeof(intptr_t) ? size - sizeof(intptr_t) : 0); + _mi_memzero((uint8_t*)newp + start, newsize - start); + } + else if (newsize == 0) { + ((uint8_t*)newp)[0] = 0; // work around for applications that expect zero-reallocation to be zero initialized (issue #725) + } + if mi_likely(p != NULL) { + const size_t copysize = (newsize > size ? size : newsize); + mi_track_mem_defined(p,copysize); // _mi_useable_size may be too large for byte precise memory tracking.. + _mi_memcpy(newp, p, copysize); + mi_free(p); // only free the original pointer if successful + } + } + return newp; +} + +mi_decl_nodiscard void* mi_heap_realloc(mi_heap_t* heap, void* p, size_t newsize) mi_attr_noexcept { + return _mi_heap_realloc_zero(heap, p, newsize, false); +} + +mi_decl_nodiscard void* mi_heap_reallocn(mi_heap_t* heap, void* p, size_t count, size_t size) mi_attr_noexcept { + size_t total; + if (mi_count_size_overflow(count, size, &total)) return NULL; + return mi_heap_realloc(heap, p, total); +} + + +// Reallocate but free `p` on errors +mi_decl_nodiscard void* mi_heap_reallocf(mi_heap_t* heap, void* p, size_t newsize) mi_attr_noexcept { + void* newp = mi_heap_realloc(heap, p, newsize); + if (newp==NULL && p!=NULL) mi_free(p); + return newp; +} + +mi_decl_nodiscard void* mi_heap_rezalloc(mi_heap_t* heap, void* p, size_t newsize) mi_attr_noexcept { + return _mi_heap_realloc_zero(heap, p, newsize, true); +} + +mi_decl_nodiscard void* mi_heap_recalloc(mi_heap_t* heap, void* p, size_t count, size_t size) mi_attr_noexcept { + size_t total; + if (mi_count_size_overflow(count, size, &total)) return NULL; + return mi_heap_rezalloc(heap, p, total); +} + + +mi_decl_nodiscard void* mi_realloc(void* p, size_t newsize) mi_attr_noexcept { + return mi_heap_realloc(mi_prim_get_default_heap(),p,newsize); +} + +mi_decl_nodiscard void* mi_reallocn(void* p, size_t count, size_t size) mi_attr_noexcept { + return mi_heap_reallocn(mi_prim_get_default_heap(),p,count,size); +} + +// Reallocate but free `p` on errors +mi_decl_nodiscard void* mi_reallocf(void* p, size_t newsize) mi_attr_noexcept { + return mi_heap_reallocf(mi_prim_get_default_heap(),p,newsize); +} + +mi_decl_nodiscard void* mi_rezalloc(void* p, size_t newsize) mi_attr_noexcept { + return mi_heap_rezalloc(mi_prim_get_default_heap(), p, newsize); +} + +mi_decl_nodiscard void* mi_recalloc(void* p, size_t count, size_t size) mi_attr_noexcept { + return mi_heap_recalloc(mi_prim_get_default_heap(), p, count, size); +} + + + +// ------------------------------------------------------ +// strdup, strndup, and realpath +// ------------------------------------------------------ + +// `strdup` using mi_malloc +mi_decl_nodiscard mi_decl_restrict char* mi_heap_strdup(mi_heap_t* heap, const char* s) mi_attr_noexcept { + if (s == NULL) return NULL; + size_t n = strlen(s); + char* t = (char*)mi_heap_malloc(heap,n+1); + if (t == NULL) return NULL; + _mi_memcpy(t, s, n); + t[n] = 0; + return t; +} + +mi_decl_nodiscard mi_decl_restrict char* mi_strdup(const char* s) mi_attr_noexcept { + return mi_heap_strdup(mi_prim_get_default_heap(), s); +} + +// `strndup` using mi_malloc +mi_decl_nodiscard mi_decl_restrict char* mi_heap_strndup(mi_heap_t* heap, const char* s, size_t n) mi_attr_noexcept { + if (s == NULL) return NULL; + const char* end = (const char*)memchr(s, 0, n); // find end of string in the first `n` characters (returns NULL if not found) + const size_t m = (end != NULL ? (size_t)(end - s) : n); // `m` is the minimum of `n` or the end-of-string + mi_assert_internal(m <= n); + char* t = (char*)mi_heap_malloc(heap, m+1); + if (t == NULL) return NULL; + _mi_memcpy(t, s, m); + t[m] = 0; + return t; +} + +mi_decl_nodiscard mi_decl_restrict char* mi_strndup(const char* s, size_t n) mi_attr_noexcept { + return mi_heap_strndup(mi_prim_get_default_heap(),s,n); +} + +#ifndef __wasi__ +// `realpath` using mi_malloc +#ifdef _WIN32 +#ifndef PATH_MAX +#define PATH_MAX MAX_PATH +#endif +#include +mi_decl_nodiscard mi_decl_restrict char* mi_heap_realpath(mi_heap_t* heap, const char* fname, char* resolved_name) mi_attr_noexcept { + // todo: use GetFullPathNameW to allow longer file names + char buf[PATH_MAX]; + DWORD res = GetFullPathNameA(fname, PATH_MAX, (resolved_name == NULL ? buf : resolved_name), NULL); + if (res == 0) { + errno = GetLastError(); return NULL; + } + else if (res > PATH_MAX) { + errno = EINVAL; return NULL; + } + else if (resolved_name != NULL) { + return resolved_name; + } + else { + return mi_heap_strndup(heap, buf, PATH_MAX); + } +} +#else +/* +#include // pathconf +static size_t mi_path_max(void) { + static size_t path_max = 0; + if (path_max <= 0) { + long m = pathconf("/",_PC_PATH_MAX); + if (m <= 0) path_max = 4096; // guess + else if (m < 256) path_max = 256; // at least 256 + else path_max = m; + } + return path_max; +} +*/ +char* mi_heap_realpath(mi_heap_t* heap, const char* fname, char* resolved_name) mi_attr_noexcept { + if (resolved_name != NULL) { + return realpath(fname,resolved_name); + } + else { + char* rname = realpath(fname, NULL); + if (rname == NULL) return NULL; + char* result = mi_heap_strdup(heap, rname); + free(rname); // use regular free! (which may be redirected to our free but that's ok) + return result; + } + /* + const size_t n = mi_path_max(); + char* buf = (char*)mi_malloc(n+1); + if (buf == NULL) { + errno = ENOMEM; + return NULL; + } + char* rname = realpath(fname,buf); + char* result = mi_heap_strndup(heap,rname,n); // ok if `rname==NULL` + mi_free(buf); + return result; + } + */ +} +#endif + +mi_decl_nodiscard mi_decl_restrict char* mi_realpath(const char* fname, char* resolved_name) mi_attr_noexcept { + return mi_heap_realpath(mi_prim_get_default_heap(),fname,resolved_name); +} +#endif + +/*------------------------------------------------------- +C++ new and new_aligned +The standard requires calling into `get_new_handler` and +throwing the bad_alloc exception on failure. If we compile +with a C++ compiler we can implement this precisely. If we +use a C compiler we cannot throw a `bad_alloc` exception +but we call `exit` instead (i.e. not returning). +-------------------------------------------------------*/ + +#ifdef __cplusplus +#include +static bool mi_try_new_handler(bool nothrow) { + #if defined(_MSC_VER) || (__cplusplus >= 201103L) + std::new_handler h = std::get_new_handler(); + #else + std::new_handler h = std::set_new_handler(); + std::set_new_handler(h); + #endif + if (h==NULL) { + _mi_error_message(ENOMEM, "out of memory in 'new'"); + if (!nothrow) { + throw std::bad_alloc(); + } + return false; + } + else { + h(); + return true; + } +} +#else +typedef void (*std_new_handler_t)(void); + +#if (defined(__GNUC__) || (defined(__clang__) && !defined(_MSC_VER))) // exclude clang-cl, see issue #631 +std_new_handler_t __attribute__((weak)) _ZSt15get_new_handlerv(void) { + return NULL; +} +static std_new_handler_t mi_get_new_handler(void) { + return _ZSt15get_new_handlerv(); +} +#else +// note: on windows we could dynamically link to `?get_new_handler@std@@YAP6AXXZXZ`. +static std_new_handler_t mi_get_new_handler() { + return NULL; +} +#endif + +static bool mi_try_new_handler(bool nothrow) { + std_new_handler_t h = mi_get_new_handler(); + if (h==NULL) { + _mi_error_message(ENOMEM, "out of memory in 'new'"); + if (!nothrow) { + abort(); // cannot throw in plain C, use abort + } + return false; + } + else { + h(); + return true; + } +} +#endif + +mi_decl_export mi_decl_noinline void* mi_heap_try_new(mi_heap_t* heap, size_t size, bool nothrow ) { + void* p = NULL; + while(p == NULL && mi_try_new_handler(nothrow)) { + p = mi_heap_malloc(heap,size); + } + return p; +} + +static mi_decl_noinline void* mi_try_new(size_t size, bool nothrow) { + return mi_heap_try_new(mi_prim_get_default_heap(), size, nothrow); +} + + +mi_decl_nodiscard mi_decl_restrict void* mi_heap_alloc_new(mi_heap_t* heap, size_t size) { + void* p = mi_heap_malloc(heap,size); + if mi_unlikely(p == NULL) return mi_heap_try_new(heap, size, false); + return p; +} + +mi_decl_nodiscard mi_decl_restrict void* mi_new(size_t size) { + return mi_heap_alloc_new(mi_prim_get_default_heap(), size); +} + + +mi_decl_nodiscard mi_decl_restrict void* mi_heap_alloc_new_n(mi_heap_t* heap, size_t count, size_t size) { + size_t total; + if mi_unlikely(mi_count_size_overflow(count, size, &total)) { + mi_try_new_handler(false); // on overflow we invoke the try_new_handler once to potentially throw std::bad_alloc + return NULL; + } + else { + return mi_heap_alloc_new(heap,total); + } +} + +mi_decl_nodiscard mi_decl_restrict void* mi_new_n(size_t count, size_t size) { + return mi_heap_alloc_new_n(mi_prim_get_default_heap(), size, count); +} + + +mi_decl_nodiscard mi_decl_restrict void* mi_new_nothrow(size_t size) mi_attr_noexcept { + void* p = mi_malloc(size); + if mi_unlikely(p == NULL) return mi_try_new(size, true); + return p; +} + +mi_decl_nodiscard mi_decl_restrict void* mi_new_aligned(size_t size, size_t alignment) { + void* p; + do { + p = mi_malloc_aligned(size, alignment); + } + while(p == NULL && mi_try_new_handler(false)); + return p; +} + +mi_decl_nodiscard mi_decl_restrict void* mi_new_aligned_nothrow(size_t size, size_t alignment) mi_attr_noexcept { + void* p; + do { + p = mi_malloc_aligned(size, alignment); + } + while(p == NULL && mi_try_new_handler(true)); + return p; +} + +mi_decl_nodiscard void* mi_new_realloc(void* p, size_t newsize) { + void* q; + do { + q = mi_realloc(p, newsize); + } while (q == NULL && mi_try_new_handler(false)); + return q; +} + +mi_decl_nodiscard void* mi_new_reallocn(void* p, size_t newcount, size_t size) { + size_t total; + if mi_unlikely(mi_count_size_overflow(newcount, size, &total)) { + mi_try_new_handler(false); // on overflow we invoke the try_new_handler once to potentially throw std::bad_alloc + return NULL; + } + else { + return mi_new_realloc(p, total); + } +} + +// ------------------------------------------------------ +// ensure explicit external inline definitions are emitted! +// ------------------------------------------------------ + +#ifdef __cplusplus +void* _mi_externs[] = { + (void*)&_mi_page_malloc, + (void*)&_mi_heap_malloc_zero, + (void*)&_mi_heap_malloc_zero_ex, + (void*)&mi_malloc, + (void*)&mi_malloc_small, + (void*)&mi_zalloc_small, + (void*)&mi_heap_malloc, + (void*)&mi_heap_zalloc, + (void*)&mi_heap_malloc_small, + // (void*)&mi_heap_alloc_new, + // (void*)&mi_heap_alloc_new_n +}; +#endif diff --git a/compat/mimalloc/arena.c b/compat/mimalloc/arena.c new file mode 100644 index 00000000000000..879ee9e7e773d4 --- /dev/null +++ b/compat/mimalloc/arena.c @@ -0,0 +1,935 @@ +/* ---------------------------------------------------------------------------- +Copyright (c) 2019-2023, Microsoft Research, Daan Leijen +This is free software; you can redistribute it and/or modify it under the +terms of the MIT license. A copy of the license can be found in the file +"LICENSE" at the root of this distribution. +-----------------------------------------------------------------------------*/ + +/* ---------------------------------------------------------------------------- +"Arenas" are fixed area's of OS memory from which we can allocate +large blocks (>= MI_ARENA_MIN_BLOCK_SIZE, 4MiB). +In contrast to the rest of mimalloc, the arenas are shared between +threads and need to be accessed using atomic operations. + +Arenas are used to for huge OS page (1GiB) reservations or for reserving +OS memory upfront which can be improve performance or is sometimes needed +on embedded devices. We can also employ this with WASI or `sbrk` systems +to reserve large arenas upfront and be able to reuse the memory more effectively. + +The arena allocation needs to be thread safe and we use an atomic bitmap to allocate. +-----------------------------------------------------------------------------*/ +#include "mimalloc.h" +#include "mimalloc/internal.h" +#include "mimalloc/atomic.h" + +#include // memset +#include // ENOMEM + +#include "bitmap.h" // atomic bitmap + +/* ----------------------------------------------------------- + Arena allocation +----------------------------------------------------------- */ + +// Block info: bit 0 contains the `in_use` bit, the upper bits the +// size in count of arena blocks. +typedef uintptr_t mi_block_info_t; +#define MI_ARENA_BLOCK_SIZE (MI_SEGMENT_SIZE) // 64MiB (must be at least MI_SEGMENT_ALIGN) +#define MI_ARENA_MIN_OBJ_SIZE (MI_ARENA_BLOCK_SIZE/2) // 32MiB +#define MI_MAX_ARENAS (112) // not more than 126 (since we use 7 bits in the memid and an arena index + 1) + +// A memory arena descriptor +typedef struct mi_arena_s { + mi_arena_id_t id; // arena id; 0 for non-specific + mi_memid_t memid; // memid of the memory area + _Atomic(uint8_t*) start; // the start of the memory area + size_t block_count; // size of the area in arena blocks (of `MI_ARENA_BLOCK_SIZE`) + size_t field_count; // number of bitmap fields (where `field_count * MI_BITMAP_FIELD_BITS >= block_count`) + size_t meta_size; // size of the arena structure itself (including its bitmaps) + mi_memid_t meta_memid; // memid of the arena structure itself (OS or static allocation) + int numa_node; // associated NUMA node + bool exclusive; // only allow allocations if specifically for this arena + bool is_large; // memory area consists of large- or huge OS pages (always committed) + _Atomic(size_t) search_idx; // optimization to start the search for free blocks + _Atomic(mi_msecs_t) purge_expire; // expiration time when blocks should be decommitted from `blocks_decommit`. + mi_bitmap_field_t* blocks_dirty; // are the blocks potentially non-zero? + mi_bitmap_field_t* blocks_committed; // are the blocks committed? (can be NULL for memory that cannot be decommitted) + mi_bitmap_field_t* blocks_purge; // blocks that can be (reset) decommitted. (can be NULL for memory that cannot be (reset) decommitted) + mi_bitmap_field_t blocks_inuse[1]; // in-place bitmap of in-use blocks (of size `field_count`) +} mi_arena_t; + + +// The available arenas +static mi_decl_cache_align _Atomic(mi_arena_t*) mi_arenas[MI_MAX_ARENAS]; +static mi_decl_cache_align _Atomic(size_t) mi_arena_count; // = 0 + + +//static bool mi_manage_os_memory_ex2(void* start, size_t size, bool is_large, int numa_node, bool exclusive, mi_memid_t memid, mi_arena_id_t* arena_id) mi_attr_noexcept; + +/* ----------------------------------------------------------- + Arena id's + id = arena_index + 1 +----------------------------------------------------------- */ + +static size_t mi_arena_id_index(mi_arena_id_t id) { + return (size_t)(id <= 0 ? MI_MAX_ARENAS : id - 1); +} + +static mi_arena_id_t mi_arena_id_create(size_t arena_index) { + mi_assert_internal(arena_index < MI_MAX_ARENAS); + return (int)arena_index + 1; +} + +mi_arena_id_t _mi_arena_id_none(void) { + return 0; +} + +static bool mi_arena_id_is_suitable(mi_arena_id_t arena_id, bool arena_is_exclusive, mi_arena_id_t req_arena_id) { + return ((!arena_is_exclusive && req_arena_id == _mi_arena_id_none()) || + (arena_id == req_arena_id)); +} + +bool _mi_arena_memid_is_suitable(mi_memid_t memid, mi_arena_id_t request_arena_id) { + if (memid.memkind == MI_MEM_ARENA) { + return mi_arena_id_is_suitable(memid.mem.arena.id, memid.mem.arena.is_exclusive, request_arena_id); + } + else { + return mi_arena_id_is_suitable(0, false, request_arena_id); + } +} + +bool _mi_arena_memid_is_os_allocated(mi_memid_t memid) { + return (memid.memkind == MI_MEM_OS); +} + +/* ----------------------------------------------------------- + Arena allocations get a (currently) 16-bit memory id where the + lower 8 bits are the arena id, and the upper bits the block index. +----------------------------------------------------------- */ + +static size_t mi_block_count_of_size(size_t size) { + return _mi_divide_up(size, MI_ARENA_BLOCK_SIZE); +} + +static size_t mi_arena_block_size(size_t bcount) { + return (bcount * MI_ARENA_BLOCK_SIZE); +} + +static size_t mi_arena_size(mi_arena_t* arena) { + return mi_arena_block_size(arena->block_count); +} + +static mi_memid_t mi_memid_create_arena(mi_arena_id_t id, bool is_exclusive, mi_bitmap_index_t bitmap_index) { + mi_memid_t memid = _mi_memid_create(MI_MEM_ARENA); + memid.mem.arena.id = id; + memid.mem.arena.block_index = bitmap_index; + memid.mem.arena.is_exclusive = is_exclusive; + return memid; +} + +static bool mi_arena_memid_indices(mi_memid_t memid, size_t* arena_index, mi_bitmap_index_t* bitmap_index) { + mi_assert_internal(memid.memkind == MI_MEM_ARENA); + *arena_index = mi_arena_id_index(memid.mem.arena.id); + *bitmap_index = memid.mem.arena.block_index; + return memid.mem.arena.is_exclusive; +} + + + +/* ----------------------------------------------------------- + Special static area for mimalloc internal structures + to avoid OS calls (for example, for the arena metadata) +----------------------------------------------------------- */ + +#define MI_ARENA_STATIC_MAX (MI_INTPTR_SIZE*MI_KiB) // 8 KiB on 64-bit + +static uint8_t mi_arena_static[MI_ARENA_STATIC_MAX]; +static _Atomic(size_t) mi_arena_static_top; + +static void* mi_arena_static_zalloc(size_t size, size_t alignment, mi_memid_t* memid) { + *memid = _mi_memid_none(); + if (size == 0 || size > MI_ARENA_STATIC_MAX) return NULL; + if ((mi_atomic_load_relaxed(&mi_arena_static_top) + size) > MI_ARENA_STATIC_MAX) return NULL; + + // try to claim space + if (alignment == 0) { alignment = 1; } + const size_t oversize = size + alignment - 1; + if (oversize > MI_ARENA_STATIC_MAX) return NULL; + const size_t oldtop = mi_atomic_add_acq_rel(&mi_arena_static_top, oversize); + size_t top = oldtop + oversize; + if (top > MI_ARENA_STATIC_MAX) { + // try to roll back, ok if this fails + mi_atomic_cas_strong_acq_rel(&mi_arena_static_top, &top, oldtop); + return NULL; + } + + // success + *memid = _mi_memid_create(MI_MEM_STATIC); + const size_t start = _mi_align_up(oldtop, alignment); + uint8_t* const p = &mi_arena_static[start]; + _mi_memzero(p, size); + return p; +} + +static void* mi_arena_meta_zalloc(size_t size, mi_memid_t* memid, mi_stats_t* stats) { + *memid = _mi_memid_none(); + + // try static + void* p = mi_arena_static_zalloc(size, MI_ALIGNMENT_MAX, memid); + if (p != NULL) return p; + + // or fall back to the OS + return _mi_os_alloc(size, memid, stats); +} + +static void mi_arena_meta_free(void* p, mi_memid_t memid, size_t size, mi_stats_t* stats) { + if (mi_memkind_is_os(memid.memkind)) { + _mi_os_free(p, size, memid, stats); + } + else { + mi_assert(memid.memkind == MI_MEM_STATIC); + } +} + +static void* mi_arena_block_start(mi_arena_t* arena, mi_bitmap_index_t bindex) { + return (arena->start + mi_arena_block_size(mi_bitmap_index_bit(bindex))); +} + + +/* ----------------------------------------------------------- + Thread safe allocation in an arena +----------------------------------------------------------- */ + +// claim the `blocks_inuse` bits +static bool mi_arena_try_claim(mi_arena_t* arena, size_t blocks, mi_bitmap_index_t* bitmap_idx) +{ + size_t idx = 0; // mi_atomic_load_relaxed(&arena->search_idx); // start from last search; ok to be relaxed as the exact start does not matter + if (_mi_bitmap_try_find_from_claim_across(arena->blocks_inuse, arena->field_count, idx, blocks, bitmap_idx)) { + mi_atomic_store_relaxed(&arena->search_idx, mi_bitmap_index_field(*bitmap_idx)); // start search from found location next time around + return true; + }; + return false; +} + + +/* ----------------------------------------------------------- + Arena Allocation +----------------------------------------------------------- */ + +static mi_decl_noinline void* mi_arena_try_alloc_at(mi_arena_t* arena, size_t arena_index, size_t needed_bcount, + bool commit, mi_memid_t* memid, mi_os_tld_t* tld) +{ + MI_UNUSED(arena_index); + mi_assert_internal(mi_arena_id_index(arena->id) == arena_index); + + mi_bitmap_index_t bitmap_index; + if (!mi_arena_try_claim(arena, needed_bcount, &bitmap_index)) return NULL; + + // claimed it! + void* p = mi_arena_block_start(arena, bitmap_index); + *memid = mi_memid_create_arena(arena->id, arena->exclusive, bitmap_index); + memid->is_pinned = arena->memid.is_pinned; + + // none of the claimed blocks should be scheduled for a decommit + if (arena->blocks_purge != NULL) { + // this is thread safe as a potential purge only decommits parts that are not yet claimed as used (in `blocks_inuse`). + _mi_bitmap_unclaim_across(arena->blocks_purge, arena->field_count, needed_bcount, bitmap_index); + } + + // set the dirty bits (todo: no need for an atomic op here?) + if (arena->memid.initially_zero && arena->blocks_dirty != NULL) { + memid->initially_zero = _mi_bitmap_claim_across(arena->blocks_dirty, arena->field_count, needed_bcount, bitmap_index, NULL); + } + + // set commit state + if (arena->blocks_committed == NULL) { + // always committed + memid->initially_committed = true; + } + else if (commit) { + // commit requested, but the range may not be committed as a whole: ensure it is committed now + memid->initially_committed = true; + bool any_uncommitted; + _mi_bitmap_claim_across(arena->blocks_committed, arena->field_count, needed_bcount, bitmap_index, &any_uncommitted); + if (any_uncommitted) { + bool commit_zero = false; + if (!_mi_os_commit(p, mi_arena_block_size(needed_bcount), &commit_zero, tld->stats)) { + memid->initially_committed = false; + } + else { + if (commit_zero) { memid->initially_zero = true; } + } + } + } + else { + // no need to commit, but check if already fully committed + memid->initially_committed = _mi_bitmap_is_claimed_across(arena->blocks_committed, arena->field_count, needed_bcount, bitmap_index); + } + + return p; +} + +// allocate in a speficic arena +static void* mi_arena_try_alloc_at_id(mi_arena_id_t arena_id, bool match_numa_node, int numa_node, size_t size, size_t alignment, + bool commit, bool allow_large, mi_arena_id_t req_arena_id, mi_memid_t* memid, mi_os_tld_t* tld ) +{ + MI_UNUSED_RELEASE(alignment); + mi_assert_internal(alignment <= MI_SEGMENT_ALIGN); + const size_t bcount = mi_block_count_of_size(size); + const size_t arena_index = mi_arena_id_index(arena_id); + mi_assert_internal(arena_index < mi_atomic_load_relaxed(&mi_arena_count)); + mi_assert_internal(size <= mi_arena_block_size(bcount)); + + // Check arena suitability + mi_arena_t* arena = mi_atomic_load_ptr_acquire(mi_arena_t, &mi_arenas[arena_index]); + if (arena == NULL) return NULL; + if (!allow_large && arena->is_large) return NULL; + if (!mi_arena_id_is_suitable(arena->id, arena->exclusive, req_arena_id)) return NULL; + if (req_arena_id == _mi_arena_id_none()) { // in not specific, check numa affinity + const bool numa_suitable = (numa_node < 0 || arena->numa_node < 0 || arena->numa_node == numa_node); + if (match_numa_node) { if (!numa_suitable) return NULL; } + else { if (numa_suitable) return NULL; } + } + + // try to allocate + void* p = mi_arena_try_alloc_at(arena, arena_index, bcount, commit, memid, tld); + mi_assert_internal(p == NULL || _mi_is_aligned(p, alignment)); + return p; +} + + +// allocate from an arena with fallback to the OS +static mi_decl_noinline void* mi_arena_try_alloc(int numa_node, size_t size, size_t alignment, + bool commit, bool allow_large, + mi_arena_id_t req_arena_id, mi_memid_t* memid, mi_os_tld_t* tld ) +{ + MI_UNUSED(alignment); + mi_assert_internal(alignment <= MI_SEGMENT_ALIGN); + const size_t max_arena = mi_atomic_load_relaxed(&mi_arena_count); + if mi_likely(max_arena == 0) return NULL; + + if (req_arena_id != _mi_arena_id_none()) { + // try a specific arena if requested + if (mi_arena_id_index(req_arena_id) < max_arena) { + void* p = mi_arena_try_alloc_at_id(req_arena_id, true, numa_node, size, alignment, commit, allow_large, req_arena_id, memid, tld); + if (p != NULL) return p; + } + } + else { + // try numa affine allocation + for (size_t i = 0; i < max_arena; i++) { + void* p = mi_arena_try_alloc_at_id(mi_arena_id_create(i), true, numa_node, size, alignment, commit, allow_large, req_arena_id, memid, tld); + if (p != NULL) return p; + } + + // try from another numa node instead.. + if (numa_node >= 0) { // if numa_node was < 0 (no specific affinity requested), all arena's have been tried already + for (size_t i = 0; i < max_arena; i++) { + void* p = mi_arena_try_alloc_at_id(mi_arena_id_create(i), false /* only proceed if not numa local */, numa_node, size, alignment, commit, allow_large, req_arena_id, memid, tld); + if (p != NULL) return p; + } + } + } + return NULL; +} + +// try to reserve a fresh arena space +static bool mi_arena_reserve(size_t req_size, bool allow_large, mi_arena_id_t req_arena_id, mi_arena_id_t *arena_id) +{ + if (_mi_preloading()) return false; // use OS only while pre loading + if (req_arena_id != _mi_arena_id_none()) return false; + + const size_t arena_count = mi_atomic_load_acquire(&mi_arena_count); + if (arena_count > (MI_MAX_ARENAS - 4)) return false; + + size_t arena_reserve = mi_option_get_size(mi_option_arena_reserve); + if (arena_reserve == 0) return false; + + if (!_mi_os_has_virtual_reserve()) { + arena_reserve = arena_reserve/4; // be conservative if virtual reserve is not supported (for some embedded systems for example) + } + arena_reserve = _mi_align_up(arena_reserve, MI_ARENA_BLOCK_SIZE); + if (arena_count >= 8 && arena_count <= 128) { + arena_reserve = ((size_t)1<<(arena_count/8)) * arena_reserve; // scale up the arena sizes exponentially + } + if (arena_reserve < req_size) return false; // should be able to at least handle the current allocation size + + // commit eagerly? + bool arena_commit = false; + if (mi_option_get(mi_option_arena_eager_commit) == 2) { arena_commit = _mi_os_has_overcommit(); } + else if (mi_option_get(mi_option_arena_eager_commit) == 1) { arena_commit = true; } + + return (mi_reserve_os_memory_ex(arena_reserve, arena_commit, allow_large, false /* exclusive */, arena_id) == 0); +} + + +void* _mi_arena_alloc_aligned(size_t size, size_t alignment, size_t align_offset, bool commit, bool allow_large, + mi_arena_id_t req_arena_id, mi_memid_t* memid, mi_os_tld_t* tld) +{ + mi_assert_internal(memid != NULL && tld != NULL); + mi_assert_internal(size > 0); + *memid = _mi_memid_none(); + + const int numa_node = _mi_os_numa_node(tld); // current numa node + + // try to allocate in an arena if the alignment is small enough and the object is not too small (as for heap meta data) + if (size >= MI_ARENA_MIN_OBJ_SIZE && alignment <= MI_SEGMENT_ALIGN && align_offset == 0) { + void* p = mi_arena_try_alloc(numa_node, size, alignment, commit, allow_large, req_arena_id, memid, tld); + if (p != NULL) return p; + + // otherwise, try to first eagerly reserve a new arena + if (req_arena_id == _mi_arena_id_none()) { + mi_arena_id_t arena_id = 0; + if (mi_arena_reserve(size, allow_large, req_arena_id, &arena_id)) { + // and try allocate in there + mi_assert_internal(req_arena_id == _mi_arena_id_none()); + p = mi_arena_try_alloc_at_id(arena_id, true, numa_node, size, alignment, commit, allow_large, req_arena_id, memid, tld); + if (p != NULL) return p; + } + } + } + + // if we cannot use OS allocation, return NULL + if (mi_option_is_enabled(mi_option_limit_os_alloc) || req_arena_id != _mi_arena_id_none()) { + errno = ENOMEM; + return NULL; + } + + // finally, fall back to the OS + if (align_offset > 0) { + return _mi_os_alloc_aligned_at_offset(size, alignment, align_offset, commit, allow_large, memid, tld->stats); + } + else { + return _mi_os_alloc_aligned(size, alignment, commit, allow_large, memid, tld->stats); + } +} + +void* _mi_arena_alloc(size_t size, bool commit, bool allow_large, mi_arena_id_t req_arena_id, mi_memid_t* memid, mi_os_tld_t* tld) +{ + return _mi_arena_alloc_aligned(size, MI_ARENA_BLOCK_SIZE, 0, commit, allow_large, req_arena_id, memid, tld); +} + + +void* mi_arena_area(mi_arena_id_t arena_id, size_t* size) { + if (size != NULL) *size = 0; + size_t arena_index = mi_arena_id_index(arena_id); + if (arena_index >= MI_MAX_ARENAS) return NULL; + mi_arena_t* arena = mi_atomic_load_ptr_acquire(mi_arena_t, &mi_arenas[arena_index]); + if (arena == NULL) return NULL; + if (size != NULL) { *size = mi_arena_block_size(arena->block_count); } + return arena->start; +} + + +/* ----------------------------------------------------------- + Arena purge +----------------------------------------------------------- */ + +static long mi_arena_purge_delay(void) { + // <0 = no purging allowed, 0=immediate purging, >0=milli-second delay + return (mi_option_get(mi_option_purge_delay) * mi_option_get(mi_option_arena_purge_mult)); +} + +// reset or decommit in an arena and update the committed/decommit bitmaps +// assumes we own the area (i.e. blocks_in_use is claimed by us) +static void mi_arena_purge(mi_arena_t* arena, size_t bitmap_idx, size_t blocks, mi_stats_t* stats) { + mi_assert_internal(arena->blocks_committed != NULL); + mi_assert_internal(arena->blocks_purge != NULL); + mi_assert_internal(!arena->memid.is_pinned); + const size_t size = mi_arena_block_size(blocks); + void* const p = mi_arena_block_start(arena, bitmap_idx); + bool needs_recommit; + if (_mi_bitmap_is_claimed_across(arena->blocks_committed, arena->field_count, blocks, bitmap_idx)) { + // all blocks are committed, we can purge freely + needs_recommit = _mi_os_purge(p, size, stats); + } + else { + // some blocks are not committed -- this can happen when a partially committed block is freed + // in `_mi_arena_free` and it is conservatively marked as uncommitted but still scheduled for a purge + // we need to ensure we do not try to reset (as that may be invalid for uncommitted memory), + // and also undo the decommit stats (as it was already adjusted) + mi_assert_internal(mi_option_is_enabled(mi_option_purge_decommits)); + needs_recommit = _mi_os_purge_ex(p, size, false /* allow reset? */, stats); + _mi_stat_increase(&stats->committed, size); + } + + // clear the purged blocks + _mi_bitmap_unclaim_across(arena->blocks_purge, arena->field_count, blocks, bitmap_idx); + // update committed bitmap + if (needs_recommit) { + _mi_bitmap_unclaim_across(arena->blocks_committed, arena->field_count, blocks, bitmap_idx); + } +} + +// Schedule a purge. This is usually delayed to avoid repeated decommit/commit calls. +// Note: assumes we (still) own the area as we may purge immediately +static void mi_arena_schedule_purge(mi_arena_t* arena, size_t bitmap_idx, size_t blocks, mi_stats_t* stats) { + mi_assert_internal(arena->blocks_purge != NULL); + const long delay = mi_arena_purge_delay(); + if (delay < 0) return; // is purging allowed at all? + + if (_mi_preloading() || delay == 0) { + // decommit directly + mi_arena_purge(arena, bitmap_idx, blocks, stats); + } + else { + // schedule decommit + mi_msecs_t expire = mi_atomic_loadi64_relaxed(&arena->purge_expire); + if (expire != 0) { + mi_atomic_addi64_acq_rel(&arena->purge_expire, delay/10); // add smallish extra delay + } + else { + mi_atomic_storei64_release(&arena->purge_expire, _mi_clock_now() + delay); + } + _mi_bitmap_claim_across(arena->blocks_purge, arena->field_count, blocks, bitmap_idx, NULL); + } +} + +// purge a range of blocks +// return true if the full range was purged. +// assumes we own the area (i.e. blocks_in_use is claimed by us) +static bool mi_arena_purge_range(mi_arena_t* arena, size_t idx, size_t startidx, size_t bitlen, size_t purge, mi_stats_t* stats) { + const size_t endidx = startidx + bitlen; + size_t bitidx = startidx; + bool all_purged = false; + while (bitidx < endidx) { + // count consequetive ones in the purge mask + size_t count = 0; + while (bitidx + count < endidx && (purge & ((size_t)1 << (bitidx + count))) != 0) { + count++; + } + if (count > 0) { + // found range to be purged + const mi_bitmap_index_t range_idx = mi_bitmap_index_create(idx, bitidx); + mi_arena_purge(arena, range_idx, count, stats); + if (count == bitlen) { + all_purged = true; + } + } + bitidx += (count+1); // +1 to skip the zero bit (or end) + } + return all_purged; +} + +// returns true if anything was purged +static bool mi_arena_try_purge(mi_arena_t* arena, mi_msecs_t now, bool force, mi_stats_t* stats) +{ + if (arena->memid.is_pinned || arena->blocks_purge == NULL) return false; + mi_msecs_t expire = mi_atomic_loadi64_relaxed(&arena->purge_expire); + if (expire == 0) return false; + if (!force && expire > now) return false; + + // reset expire (if not already set concurrently) + mi_atomic_casi64_strong_acq_rel(&arena->purge_expire, &expire, 0); + + // potential purges scheduled, walk through the bitmap + bool any_purged = false; + bool full_purge = true; + for (size_t i = 0; i < arena->field_count; i++) { + size_t purge = mi_atomic_load_relaxed(&arena->blocks_purge[i]); + if (purge != 0) { + size_t bitidx = 0; + while (bitidx < MI_BITMAP_FIELD_BITS) { + // find consequetive range of ones in the purge mask + size_t bitlen = 0; + while (bitidx + bitlen < MI_BITMAP_FIELD_BITS && (purge & ((size_t)1 << (bitidx + bitlen))) != 0) { + bitlen++; + } + // try to claim the longest range of corresponding in_use bits + const mi_bitmap_index_t bitmap_index = mi_bitmap_index_create(i, bitidx); + while( bitlen > 0 ) { + if (_mi_bitmap_try_claim(arena->blocks_inuse, arena->field_count, bitlen, bitmap_index)) { + break; + } + bitlen--; + } + // actual claimed bits at `in_use` + if (bitlen > 0) { + // read purge again now that we have the in_use bits + purge = mi_atomic_load_acquire(&arena->blocks_purge[i]); + if (!mi_arena_purge_range(arena, i, bitidx, bitlen, purge, stats)) { + full_purge = false; + } + any_purged = true; + // release the claimed `in_use` bits again + _mi_bitmap_unclaim(arena->blocks_inuse, arena->field_count, bitlen, bitmap_index); + } + bitidx += (bitlen+1); // +1 to skip the zero (or end) + } // while bitidx + } // purge != 0 + } + // if not fully purged, make sure to purge again in the future + if (!full_purge) { + const long delay = mi_arena_purge_delay(); + mi_msecs_t expected = 0; + mi_atomic_casi64_strong_acq_rel(&arena->purge_expire,&expected,_mi_clock_now() + delay); + } + return any_purged; +} + +static void mi_arenas_try_purge( bool force, bool visit_all, mi_stats_t* stats ) { + if (_mi_preloading() || mi_arena_purge_delay() <= 0) return; // nothing will be scheduled + + const size_t max_arena = mi_atomic_load_acquire(&mi_arena_count); + if (max_arena == 0) return; + + // allow only one thread to purge at a time + static mi_atomic_guard_t purge_guard; + mi_atomic_guard(&purge_guard) + { + mi_msecs_t now = _mi_clock_now(); + size_t max_purge_count = (visit_all ? max_arena : 1); + for (size_t i = 0; i < max_arena; i++) { + mi_arena_t* arena = mi_atomic_load_ptr_acquire(mi_arena_t, &mi_arenas[i]); + if (arena != NULL) { + if (mi_arena_try_purge(arena, now, force, stats)) { + if (max_purge_count <= 1) break; + max_purge_count--; + } + } + } + } +} + + +/* ----------------------------------------------------------- + Arena free +----------------------------------------------------------- */ + +void _mi_arena_free(void* p, size_t size, size_t committed_size, mi_memid_t memid, mi_stats_t* stats) { + mi_assert_internal(size > 0 && stats != NULL); + mi_assert_internal(committed_size <= size); + if (p==NULL) return; + if (size==0) return; + const bool all_committed = (committed_size == size); + + if (mi_memkind_is_os(memid.memkind)) { + // was a direct OS allocation, pass through + if (!all_committed && committed_size > 0) { + // if partially committed, adjust the committed stats (as `_mi_os_free` will increase decommit by the full size) + _mi_stat_decrease(&stats->committed, committed_size); + } + _mi_os_free(p, size, memid, stats); + } + else if (memid.memkind == MI_MEM_ARENA) { + // allocated in an arena + size_t arena_idx; + size_t bitmap_idx; + mi_arena_memid_indices(memid, &arena_idx, &bitmap_idx); + mi_assert_internal(arena_idx < MI_MAX_ARENAS); + mi_arena_t* arena = mi_atomic_load_ptr_acquire(mi_arena_t,&mi_arenas[arena_idx]); + mi_assert_internal(arena != NULL); + const size_t blocks = mi_block_count_of_size(size); + + // checks + if (arena == NULL) { + _mi_error_message(EINVAL, "trying to free from non-existent arena: %p, size %zu, memid: 0x%zx\n", p, size, memid); + return; + } + mi_assert_internal(arena->field_count > mi_bitmap_index_field(bitmap_idx)); + if (arena->field_count <= mi_bitmap_index_field(bitmap_idx)) { + _mi_error_message(EINVAL, "trying to free from non-existent arena block: %p, size %zu, memid: 0x%zx\n", p, size, memid); + return; + } + + // need to set all memory to undefined as some parts may still be marked as no_access (like padding etc.) + mi_track_mem_undefined(p,size); + + // potentially decommit + if (arena->memid.is_pinned || arena->blocks_committed == NULL) { + mi_assert_internal(all_committed); + } + else { + mi_assert_internal(arena->blocks_committed != NULL); + mi_assert_internal(arena->blocks_purge != NULL); + + if (!all_committed) { + // mark the entire range as no longer committed (so we recommit the full range when re-using) + _mi_bitmap_unclaim_across(arena->blocks_committed, arena->field_count, blocks, bitmap_idx); + mi_track_mem_noaccess(p,size); + if (committed_size > 0) { + // if partially committed, adjust the committed stats (is it will be recommitted when re-using) + // in the delayed purge, we now need to not count a decommit if the range is not marked as committed. + _mi_stat_decrease(&stats->committed, committed_size); + } + // note: if not all committed, it may be that the purge will reset/decommit the entire range + // that contains already decommitted parts. Since purge consistently uses reset or decommit that + // works (as we should never reset decommitted parts). + } + // (delay) purge the entire range + mi_arena_schedule_purge(arena, bitmap_idx, blocks, stats); + } + + // and make it available to others again + bool all_inuse = _mi_bitmap_unclaim_across(arena->blocks_inuse, arena->field_count, blocks, bitmap_idx); + if (!all_inuse) { + _mi_error_message(EAGAIN, "trying to free an already freed arena block: %p, size %zu\n", p, size); + return; + }; + } + else { + // arena was none, external, or static; nothing to do + mi_assert_internal(memid.memkind < MI_MEM_OS); + } + + // purge expired decommits + mi_arenas_try_purge(false, false, stats); +} + +// destroy owned arenas; this is unsafe and should only be done using `mi_option_destroy_on_exit` +// for dynamic libraries that are unloaded and need to release all their allocated memory. +static void mi_arenas_unsafe_destroy(void) { + const size_t max_arena = mi_atomic_load_relaxed(&mi_arena_count); + size_t new_max_arena = 0; + for (size_t i = 0; i < max_arena; i++) { + mi_arena_t* arena = mi_atomic_load_ptr_acquire(mi_arena_t, &mi_arenas[i]); + if (arena != NULL) { + if (arena->start != NULL && mi_memkind_is_os(arena->memid.memkind)) { + mi_atomic_store_ptr_release(mi_arena_t, &mi_arenas[i], NULL); + _mi_os_free(arena->start, mi_arena_size(arena), arena->memid, &_mi_stats_main); + } + else { + new_max_arena = i; + } + mi_arena_meta_free(arena, arena->meta_memid, arena->meta_size, &_mi_stats_main); + } + } + + // try to lower the max arena. + size_t expected = max_arena; + mi_atomic_cas_strong_acq_rel(&mi_arena_count, &expected, new_max_arena); +} + +// Purge the arenas; if `force_purge` is true, amenable parts are purged even if not yet expired +void _mi_arena_collect(bool force_purge, mi_stats_t* stats) { + mi_arenas_try_purge(force_purge, true /* visit all */, stats); +} + +// destroy owned arenas; this is unsafe and should only be done using `mi_option_destroy_on_exit` +// for dynamic libraries that are unloaded and need to release all their allocated memory. +void _mi_arena_unsafe_destroy_all(mi_stats_t* stats) { + mi_arenas_unsafe_destroy(); + _mi_arena_collect(true /* force purge */, stats); // purge non-owned arenas +} + +// Is a pointer inside any of our arenas? +bool _mi_arena_contains(const void* p) { + const size_t max_arena = mi_atomic_load_relaxed(&mi_arena_count); + for (size_t i = 0; i < max_arena; i++) { + mi_arena_t* arena = mi_atomic_load_ptr_acquire(mi_arena_t, &mi_arenas[i]); + if (arena != NULL && arena->start <= (const uint8_t*)p && arena->start + mi_arena_block_size(arena->block_count) > (const uint8_t*)p) { + return true; + } + } + return false; +} + + +/* ----------------------------------------------------------- + Add an arena. +----------------------------------------------------------- */ + +static bool mi_arena_add(mi_arena_t* arena, mi_arena_id_t* arena_id) { + mi_assert_internal(arena != NULL); + mi_assert_internal((uintptr_t)mi_atomic_load_ptr_relaxed(uint8_t,&arena->start) % MI_SEGMENT_ALIGN == 0); + mi_assert_internal(arena->block_count > 0); + if (arena_id != NULL) { *arena_id = -1; } + + size_t i = mi_atomic_increment_acq_rel(&mi_arena_count); + if (i >= MI_MAX_ARENAS) { + mi_atomic_decrement_acq_rel(&mi_arena_count); + return false; + } + arena->id = mi_arena_id_create(i); + mi_atomic_store_ptr_release(mi_arena_t,&mi_arenas[i], arena); + if (arena_id != NULL) { *arena_id = arena->id; } + return true; +} + +static bool mi_manage_os_memory_ex2(void* start, size_t size, bool is_large, int numa_node, bool exclusive, mi_memid_t memid, mi_arena_id_t* arena_id) mi_attr_noexcept +{ + if (arena_id != NULL) *arena_id = _mi_arena_id_none(); + if (size < MI_ARENA_BLOCK_SIZE) return false; + + if (is_large) { + mi_assert_internal(memid.initially_committed && memid.is_pinned); + } + + const size_t bcount = size / MI_ARENA_BLOCK_SIZE; + const size_t fields = _mi_divide_up(bcount, MI_BITMAP_FIELD_BITS); + const size_t bitmaps = (memid.is_pinned ? 2 : 4); + const size_t asize = sizeof(mi_arena_t) + (bitmaps*fields*sizeof(mi_bitmap_field_t)); + mi_memid_t meta_memid; + mi_arena_t* arena = (mi_arena_t*)mi_arena_meta_zalloc(asize, &meta_memid, &_mi_stats_main); // TODO: can we avoid allocating from the OS? + if (arena == NULL) return false; + + // already zero'd due to os_alloc + // _mi_memzero(arena, asize); + arena->id = _mi_arena_id_none(); + arena->memid = memid; + arena->exclusive = exclusive; + arena->meta_size = asize; + arena->meta_memid = meta_memid; + arena->block_count = bcount; + arena->field_count = fields; + arena->start = (uint8_t*)start; + arena->numa_node = numa_node; // TODO: or get the current numa node if -1? (now it allows anyone to allocate on -1) + arena->is_large = is_large; + arena->purge_expire = 0; + arena->search_idx = 0; + arena->blocks_dirty = &arena->blocks_inuse[fields]; // just after inuse bitmap + arena->blocks_committed = (arena->memid.is_pinned ? NULL : &arena->blocks_inuse[2*fields]); // just after dirty bitmap + arena->blocks_purge = (arena->memid.is_pinned ? NULL : &arena->blocks_inuse[3*fields]); // just after committed bitmap + // initialize committed bitmap? + if (arena->blocks_committed != NULL && arena->memid.initially_committed) { + memset((void*)arena->blocks_committed, 0xFF, fields*sizeof(mi_bitmap_field_t)); // cast to void* to avoid atomic warning + } + + // and claim leftover blocks if needed (so we never allocate there) + ptrdiff_t post = (fields * MI_BITMAP_FIELD_BITS) - bcount; + mi_assert_internal(post >= 0); + if (post > 0) { + // don't use leftover bits at the end + mi_bitmap_index_t postidx = mi_bitmap_index_create(fields - 1, MI_BITMAP_FIELD_BITS - post); + _mi_bitmap_claim(arena->blocks_inuse, fields, post, postidx, NULL); + } + return mi_arena_add(arena, arena_id); + +} + +bool mi_manage_os_memory_ex(void* start, size_t size, bool is_committed, bool is_large, bool is_zero, int numa_node, bool exclusive, mi_arena_id_t* arena_id) mi_attr_noexcept { + mi_memid_t memid = _mi_memid_create(MI_MEM_EXTERNAL); + memid.initially_committed = is_committed; + memid.initially_zero = is_zero; + memid.is_pinned = is_large; + return mi_manage_os_memory_ex2(start,size,is_large,numa_node,exclusive,memid, arena_id); +} + +// Reserve a range of regular OS memory +int mi_reserve_os_memory_ex(size_t size, bool commit, bool allow_large, bool exclusive, mi_arena_id_t* arena_id) mi_attr_noexcept { + if (arena_id != NULL) *arena_id = _mi_arena_id_none(); + size = _mi_align_up(size, MI_ARENA_BLOCK_SIZE); // at least one block + mi_memid_t memid; + void* start = _mi_os_alloc_aligned(size, MI_SEGMENT_ALIGN, commit, allow_large, &memid, &_mi_stats_main); + if (start == NULL) return ENOMEM; + const bool is_large = memid.is_pinned; // todo: use separate is_large field? + if (!mi_manage_os_memory_ex2(start, size, is_large, -1 /* numa node */, exclusive, memid, arena_id)) { + _mi_os_free_ex(start, size, commit, memid, &_mi_stats_main); + _mi_verbose_message("failed to reserve %zu k memory\n", _mi_divide_up(size, 1024)); + return ENOMEM; + } + _mi_verbose_message("reserved %zu KiB memory%s\n", _mi_divide_up(size, 1024), is_large ? " (in large os pages)" : ""); + return 0; +} + + +// Manage a range of regular OS memory +bool mi_manage_os_memory(void* start, size_t size, bool is_committed, bool is_large, bool is_zero, int numa_node) mi_attr_noexcept { + return mi_manage_os_memory_ex(start, size, is_committed, is_large, is_zero, numa_node, false /* exclusive? */, NULL); +} + +// Reserve a range of regular OS memory +int mi_reserve_os_memory(size_t size, bool commit, bool allow_large) mi_attr_noexcept { + return mi_reserve_os_memory_ex(size, commit, allow_large, false, NULL); +} + + +/* ----------------------------------------------------------- + Debugging +----------------------------------------------------------- */ + +static size_t mi_debug_show_bitmap(const char* prefix, mi_bitmap_field_t* fields, size_t field_count ) { + size_t inuse_count = 0; + for (size_t i = 0; i < field_count; i++) { + char buf[MI_BITMAP_FIELD_BITS + 1]; + uintptr_t field = mi_atomic_load_relaxed(&fields[i]); + for (size_t bit = 0; bit < MI_BITMAP_FIELD_BITS; bit++) { + bool inuse = ((((uintptr_t)1 << bit) & field) != 0); + if (inuse) inuse_count++; + buf[MI_BITMAP_FIELD_BITS - 1 - bit] = (inuse ? 'x' : '.'); + } + buf[MI_BITMAP_FIELD_BITS] = 0; + _mi_verbose_message("%s%s\n", prefix, buf); + } + return inuse_count; +} + +void mi_debug_show_arenas(void) mi_attr_noexcept { + size_t max_arenas = mi_atomic_load_relaxed(&mi_arena_count); + for (size_t i = 0; i < max_arenas; i++) { + mi_arena_t* arena = mi_atomic_load_ptr_relaxed(mi_arena_t, &mi_arenas[i]); + if (arena == NULL) break; + size_t inuse_count = 0; + _mi_verbose_message("arena %zu: %zu blocks with %zu fields\n", i, arena->block_count, arena->field_count); + inuse_count += mi_debug_show_bitmap(" ", arena->blocks_inuse, arena->field_count); + _mi_verbose_message(" blocks in use ('x'): %zu\n", inuse_count); + } +} + + +/* ----------------------------------------------------------- + Reserve a huge page arena. +----------------------------------------------------------- */ +// reserve at a specific numa node +int mi_reserve_huge_os_pages_at_ex(size_t pages, int numa_node, size_t timeout_msecs, bool exclusive, mi_arena_id_t* arena_id) mi_attr_noexcept { + if (arena_id != NULL) *arena_id = -1; + if (pages==0) return 0; + if (numa_node < -1) numa_node = -1; + if (numa_node >= 0) numa_node = numa_node % _mi_os_numa_node_count(); + size_t hsize = 0; + size_t pages_reserved = 0; + mi_memid_t memid; + void* p = _mi_os_alloc_huge_os_pages(pages, numa_node, timeout_msecs, &pages_reserved, &hsize, &memid); + if (p==NULL || pages_reserved==0) { + _mi_warning_message("failed to reserve %zu GiB huge pages\n", pages); + return ENOMEM; + } + _mi_verbose_message("numa node %i: reserved %zu GiB huge pages (of the %zu GiB requested)\n", numa_node, pages_reserved, pages); + + if (!mi_manage_os_memory_ex2(p, hsize, true, numa_node, exclusive, memid, arena_id)) { + _mi_os_free(p, hsize, memid, &_mi_stats_main); + return ENOMEM; + } + return 0; +} + +int mi_reserve_huge_os_pages_at(size_t pages, int numa_node, size_t timeout_msecs) mi_attr_noexcept { + return mi_reserve_huge_os_pages_at_ex(pages, numa_node, timeout_msecs, false, NULL); +} + +// reserve huge pages evenly among the given number of numa nodes (or use the available ones as detected) +int mi_reserve_huge_os_pages_interleave(size_t pages, size_t numa_nodes, size_t timeout_msecs) mi_attr_noexcept { + if (pages == 0) return 0; + + // pages per numa node + size_t numa_count = (numa_nodes > 0 ? numa_nodes : _mi_os_numa_node_count()); + if (numa_count <= 0) numa_count = 1; + const size_t pages_per = pages / numa_count; + const size_t pages_mod = pages % numa_count; + const size_t timeout_per = (timeout_msecs==0 ? 0 : (timeout_msecs / numa_count) + 50); + + // reserve evenly among numa nodes + for (size_t numa_node = 0; numa_node < numa_count && pages > 0; numa_node++) { + size_t node_pages = pages_per; // can be 0 + if (numa_node < pages_mod) node_pages++; + int err = mi_reserve_huge_os_pages_at(node_pages, (int)numa_node, timeout_per); + if (err) return err; + if (pages < node_pages) { + pages = 0; + } + else { + pages -= node_pages; + } + } + + return 0; +} + +int mi_reserve_huge_os_pages(size_t pages, double max_secs, size_t* pages_reserved) mi_attr_noexcept { + MI_UNUSED(max_secs); + _mi_warning_message("mi_reserve_huge_os_pages is deprecated: use mi_reserve_huge_os_pages_interleave/at instead\n"); + if (pages_reserved != NULL) *pages_reserved = 0; + int err = mi_reserve_huge_os_pages_interleave(pages, 0, (size_t)(max_secs * 1000.0)); + if (err==0 && pages_reserved!=NULL) *pages_reserved = pages; + return err; +} diff --git a/compat/mimalloc/bitmap.c b/compat/mimalloc/bitmap.c new file mode 100644 index 00000000000000..878f0ab3250a47 --- /dev/null +++ b/compat/mimalloc/bitmap.c @@ -0,0 +1,432 @@ +/* ---------------------------------------------------------------------------- +Copyright (c) 2019-2023 Microsoft Research, Daan Leijen +This is free software; you can redistribute it and/or modify it under the +terms of the MIT license. A copy of the license can be found in the file +"LICENSE" at the root of this distribution. +-----------------------------------------------------------------------------*/ + +/* ---------------------------------------------------------------------------- +Concurrent bitmap that can set/reset sequences of bits atomically, +represeted as an array of fields where each field is a machine word (`size_t`) + +There are two api's; the standard one cannot have sequences that cross +between the bitmap fields (and a sequence must be <= MI_BITMAP_FIELD_BITS). + +The `_across` postfixed functions do allow sequences that can cross over +between the fields. (This is used in arena allocation) +---------------------------------------------------------------------------- */ + +#include "mimalloc.h" +#include "mimalloc/internal.h" +#include "bitmap.h" + +/* ----------------------------------------------------------- + Bitmap definition +----------------------------------------------------------- */ + +// The bit mask for a given number of blocks at a specified bit index. +static inline size_t mi_bitmap_mask_(size_t count, size_t bitidx) { + mi_assert_internal(count + bitidx <= MI_BITMAP_FIELD_BITS); + mi_assert_internal(count > 0); + if (count >= MI_BITMAP_FIELD_BITS) return MI_BITMAP_FIELD_FULL; + if (count == 0) return 0; + return ((((size_t)1 << count) - 1) << bitidx); +} + + +/* ----------------------------------------------------------- + Claim a bit sequence atomically +----------------------------------------------------------- */ + +// Try to atomically claim a sequence of `count` bits in a single +// field at `idx` in `bitmap`. Returns `true` on success. +inline bool _mi_bitmap_try_find_claim_field(mi_bitmap_t bitmap, size_t idx, const size_t count, mi_bitmap_index_t* bitmap_idx) +{ + mi_assert_internal(bitmap_idx != NULL); + mi_assert_internal(count <= MI_BITMAP_FIELD_BITS); + mi_assert_internal(count > 0); + mi_bitmap_field_t* field = &bitmap[idx]; + size_t map = mi_atomic_load_relaxed(field); + if (map==MI_BITMAP_FIELD_FULL) return false; // short cut + + // search for 0-bit sequence of length count + const size_t mask = mi_bitmap_mask_(count, 0); + const size_t bitidx_max = MI_BITMAP_FIELD_BITS - count; + +#ifdef MI_HAVE_FAST_BITSCAN + size_t bitidx = mi_ctz(~map); // quickly find the first zero bit if possible +#else + size_t bitidx = 0; // otherwise start at 0 +#endif + size_t m = (mask << bitidx); // invariant: m == mask shifted by bitidx + + // scan linearly for a free range of zero bits + while (bitidx <= bitidx_max) { + const size_t mapm = (map & m); + if (mapm == 0) { // are the mask bits free at bitidx? + mi_assert_internal((m >> bitidx) == mask); // no overflow? + const size_t newmap = (map | m); + mi_assert_internal((newmap^map) >> bitidx == mask); + if (!mi_atomic_cas_strong_acq_rel(field, &map, newmap)) { // TODO: use weak cas here? + // no success, another thread claimed concurrently.. keep going (with updated `map`) + continue; + } + else { + // success, we claimed the bits! + *bitmap_idx = mi_bitmap_index_create(idx, bitidx); + return true; + } + } + else { + // on to the next bit range +#ifdef MI_HAVE_FAST_BITSCAN + mi_assert_internal(mapm != 0); + const size_t shift = (count == 1 ? 1 : (MI_INTPTR_BITS - mi_clz(mapm) - bitidx)); + mi_assert_internal(shift > 0 && shift <= count); +#else + const size_t shift = 1; +#endif + bitidx += shift; + m <<= shift; + } + } + // no bits found + return false; +} + +// Find `count` bits of 0 and set them to 1 atomically; returns `true` on success. +// Starts at idx, and wraps around to search in all `bitmap_fields` fields. +// `count` can be at most MI_BITMAP_FIELD_BITS and will never cross fields. +bool _mi_bitmap_try_find_from_claim(mi_bitmap_t bitmap, const size_t bitmap_fields, const size_t start_field_idx, const size_t count, mi_bitmap_index_t* bitmap_idx) { + size_t idx = start_field_idx; + for (size_t visited = 0; visited < bitmap_fields; visited++, idx++) { + if (idx >= bitmap_fields) { idx = 0; } // wrap + if (_mi_bitmap_try_find_claim_field(bitmap, idx, count, bitmap_idx)) { + return true; + } + } + return false; +} + +// Like _mi_bitmap_try_find_from_claim but with an extra predicate that must be fullfilled +bool _mi_bitmap_try_find_from_claim_pred(mi_bitmap_t bitmap, const size_t bitmap_fields, + const size_t start_field_idx, const size_t count, + mi_bitmap_pred_fun_t pred_fun, void* pred_arg, + mi_bitmap_index_t* bitmap_idx) { + size_t idx = start_field_idx; + for (size_t visited = 0; visited < bitmap_fields; visited++, idx++) { + if (idx >= bitmap_fields) idx = 0; // wrap + if (_mi_bitmap_try_find_claim_field(bitmap, idx, count, bitmap_idx)) { + if (pred_fun == NULL || pred_fun(*bitmap_idx, pred_arg)) { + return true; + } + // predicate returned false, unclaim and look further + _mi_bitmap_unclaim(bitmap, bitmap_fields, count, *bitmap_idx); + } + } + return false; +} + +// Set `count` bits at `bitmap_idx` to 0 atomically +// Returns `true` if all `count` bits were 1 previously. +bool _mi_bitmap_unclaim(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx) { + const size_t idx = mi_bitmap_index_field(bitmap_idx); + const size_t bitidx = mi_bitmap_index_bit_in_field(bitmap_idx); + const size_t mask = mi_bitmap_mask_(count, bitidx); + mi_assert_internal(bitmap_fields > idx); MI_UNUSED(bitmap_fields); + // mi_assert_internal((bitmap[idx] & mask) == mask); + const size_t prev = mi_atomic_and_acq_rel(&bitmap[idx], ~mask); + return ((prev & mask) == mask); +} + + +// Set `count` bits at `bitmap_idx` to 1 atomically +// Returns `true` if all `count` bits were 0 previously. `any_zero` is `true` if there was at least one zero bit. +bool _mi_bitmap_claim(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx, bool* any_zero) { + const size_t idx = mi_bitmap_index_field(bitmap_idx); + const size_t bitidx = mi_bitmap_index_bit_in_field(bitmap_idx); + const size_t mask = mi_bitmap_mask_(count, bitidx); + mi_assert_internal(bitmap_fields > idx); MI_UNUSED(bitmap_fields); + //mi_assert_internal(any_zero != NULL || (bitmap[idx] & mask) == 0); + size_t prev = mi_atomic_or_acq_rel(&bitmap[idx], mask); + if (any_zero != NULL) { *any_zero = ((prev & mask) != mask); } + return ((prev & mask) == 0); +} + +// Returns `true` if all `count` bits were 1. `any_ones` is `true` if there was at least one bit set to one. +static bool mi_bitmap_is_claimedx(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx, bool* any_ones) { + const size_t idx = mi_bitmap_index_field(bitmap_idx); + const size_t bitidx = mi_bitmap_index_bit_in_field(bitmap_idx); + const size_t mask = mi_bitmap_mask_(count, bitidx); + mi_assert_internal(bitmap_fields > idx); MI_UNUSED(bitmap_fields); + const size_t field = mi_atomic_load_relaxed(&bitmap[idx]); + if (any_ones != NULL) { *any_ones = ((field & mask) != 0); } + return ((field & mask) == mask); +} + +// Try to set `count` bits at `bitmap_idx` from 0 to 1 atomically. +// Returns `true` if successful when all previous `count` bits were 0. +bool _mi_bitmap_try_claim(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx) { + const size_t idx = mi_bitmap_index_field(bitmap_idx); + const size_t bitidx = mi_bitmap_index_bit_in_field(bitmap_idx); + const size_t mask = mi_bitmap_mask_(count, bitidx); + mi_assert_internal(bitmap_fields > idx); MI_UNUSED(bitmap_fields); + size_t expected = mi_atomic_load_relaxed(&bitmap[idx]); + do { + if ((expected & mask) != 0) return false; + } + while (!mi_atomic_cas_strong_acq_rel(&bitmap[idx], &expected, expected | mask)); + mi_assert_internal((expected & mask) == 0); + return true; +} + + +bool _mi_bitmap_is_claimed(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx) { + return mi_bitmap_is_claimedx(bitmap, bitmap_fields, count, bitmap_idx, NULL); +} + +bool _mi_bitmap_is_any_claimed(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx) { + bool any_ones; + mi_bitmap_is_claimedx(bitmap, bitmap_fields, count, bitmap_idx, &any_ones); + return any_ones; +} + + +//-------------------------------------------------------------------------- +// the `_across` functions work on bitmaps where sequences can cross over +// between the fields. This is used in arena allocation +//-------------------------------------------------------------------------- + +// Try to atomically claim a sequence of `count` bits starting from the field +// at `idx` in `bitmap` and crossing into subsequent fields. Returns `true` on success. +// Only needs to consider crossing into the next fields (see `mi_bitmap_try_find_from_claim_across`) +static bool mi_bitmap_try_find_claim_field_across(mi_bitmap_t bitmap, size_t bitmap_fields, size_t idx, const size_t count, const size_t retries, mi_bitmap_index_t* bitmap_idx) +{ + mi_assert_internal(bitmap_idx != NULL); + + // check initial trailing zeros + mi_bitmap_field_t* field = &bitmap[idx]; + size_t map = mi_atomic_load_relaxed(field); + const size_t initial = mi_clz(map); // count of initial zeros starting at idx + mi_assert_internal(initial <= MI_BITMAP_FIELD_BITS); + if (initial == 0) return false; + if (initial >= count) return _mi_bitmap_try_find_claim_field(bitmap, idx, count, bitmap_idx); // no need to cross fields (this case won't happen for us) + if (_mi_divide_up(count - initial, MI_BITMAP_FIELD_BITS) >= (bitmap_fields - idx)) return false; // not enough entries + + // scan ahead + size_t found = initial; + size_t mask = 0; // mask bits for the final field + while(found < count) { + field++; + map = mi_atomic_load_relaxed(field); + const size_t mask_bits = (found + MI_BITMAP_FIELD_BITS <= count ? MI_BITMAP_FIELD_BITS : (count - found)); + mi_assert_internal(mask_bits > 0 && mask_bits <= MI_BITMAP_FIELD_BITS); + mask = mi_bitmap_mask_(mask_bits, 0); + if ((map & mask) != 0) return false; // some part is already claimed + found += mask_bits; + } + mi_assert_internal(field < &bitmap[bitmap_fields]); + + // we found a range of contiguous zeros up to the final field; mask contains mask in the final field + // now try to claim the range atomically + mi_bitmap_field_t* const final_field = field; + const size_t final_mask = mask; + mi_bitmap_field_t* const initial_field = &bitmap[idx]; + const size_t initial_idx = MI_BITMAP_FIELD_BITS - initial; + const size_t initial_mask = mi_bitmap_mask_(initial, initial_idx); + + // initial field + size_t newmap; + field = initial_field; + map = mi_atomic_load_relaxed(field); + do { + newmap = (map | initial_mask); + if ((map & initial_mask) != 0) { goto rollback; }; + } while (!mi_atomic_cas_strong_acq_rel(field, &map, newmap)); + + // intermediate fields + while (++field < final_field) { + newmap = MI_BITMAP_FIELD_FULL; + map = 0; + if (!mi_atomic_cas_strong_acq_rel(field, &map, newmap)) { goto rollback; } + } + + // final field + mi_assert_internal(field == final_field); + map = mi_atomic_load_relaxed(field); + do { + newmap = (map | final_mask); + if ((map & final_mask) != 0) { goto rollback; } + } while (!mi_atomic_cas_strong_acq_rel(field, &map, newmap)); + + // claimed! + *bitmap_idx = mi_bitmap_index_create(idx, initial_idx); + return true; + +rollback: + // roll back intermediate fields + // (we just failed to claim `field` so decrement first) + while (--field > initial_field) { + newmap = 0; + map = MI_BITMAP_FIELD_FULL; + mi_assert_internal(mi_atomic_load_relaxed(field) == map); + mi_atomic_store_release(field, newmap); + } + if (field == initial_field) { // (if we failed on the initial field, `field + 1 == initial_field`) + map = mi_atomic_load_relaxed(field); + do { + mi_assert_internal((map & initial_mask) == initial_mask); + newmap = (map & ~initial_mask); + } while (!mi_atomic_cas_strong_acq_rel(field, &map, newmap)); + } + // retry? (we make a recursive call instead of goto to be able to use const declarations) + if (retries <= 2) { + return mi_bitmap_try_find_claim_field_across(bitmap, bitmap_fields, idx, count, retries+1, bitmap_idx); + } + else { + return false; + } +} + + +// Find `count` bits of zeros and set them to 1 atomically; returns `true` on success. +// Starts at idx, and wraps around to search in all `bitmap_fields` fields. +bool _mi_bitmap_try_find_from_claim_across(mi_bitmap_t bitmap, const size_t bitmap_fields, const size_t start_field_idx, const size_t count, mi_bitmap_index_t* bitmap_idx) { + mi_assert_internal(count > 0); + if (count <= 2) { + // we don't bother with crossover fields for small counts + return _mi_bitmap_try_find_from_claim(bitmap, bitmap_fields, start_field_idx, count, bitmap_idx); + } + + // visit the fields + size_t idx = start_field_idx; + for (size_t visited = 0; visited < bitmap_fields; visited++, idx++) { + if (idx >= bitmap_fields) { idx = 0; } // wrap + // first try to claim inside a field + if (count <= MI_BITMAP_FIELD_BITS) { + if (_mi_bitmap_try_find_claim_field(bitmap, idx, count, bitmap_idx)) { + return true; + } + } + // if that fails, then try to claim across fields + if (mi_bitmap_try_find_claim_field_across(bitmap, bitmap_fields, idx, count, 0, bitmap_idx)) { + return true; + } + } + return false; +} + +// Helper for masks across fields; returns the mid count, post_mask may be 0 +static size_t mi_bitmap_mask_across(mi_bitmap_index_t bitmap_idx, size_t bitmap_fields, size_t count, size_t* pre_mask, size_t* mid_mask, size_t* post_mask) { + MI_UNUSED(bitmap_fields); + const size_t bitidx = mi_bitmap_index_bit_in_field(bitmap_idx); + if mi_likely(bitidx + count <= MI_BITMAP_FIELD_BITS) { + *pre_mask = mi_bitmap_mask_(count, bitidx); + *mid_mask = 0; + *post_mask = 0; + mi_assert_internal(mi_bitmap_index_field(bitmap_idx) < bitmap_fields); + return 0; + } + else { + const size_t pre_bits = MI_BITMAP_FIELD_BITS - bitidx; + mi_assert_internal(pre_bits < count); + *pre_mask = mi_bitmap_mask_(pre_bits, bitidx); + count -= pre_bits; + const size_t mid_count = (count / MI_BITMAP_FIELD_BITS); + *mid_mask = MI_BITMAP_FIELD_FULL; + count %= MI_BITMAP_FIELD_BITS; + *post_mask = (count==0 ? 0 : mi_bitmap_mask_(count, 0)); + mi_assert_internal(mi_bitmap_index_field(bitmap_idx) + mid_count + (count==0 ? 0 : 1) < bitmap_fields); + return mid_count; + } +} + +// Set `count` bits at `bitmap_idx` to 0 atomically +// Returns `true` if all `count` bits were 1 previously. +bool _mi_bitmap_unclaim_across(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx) { + size_t idx = mi_bitmap_index_field(bitmap_idx); + size_t pre_mask; + size_t mid_mask; + size_t post_mask; + size_t mid_count = mi_bitmap_mask_across(bitmap_idx, bitmap_fields, count, &pre_mask, &mid_mask, &post_mask); + bool all_one = true; + mi_bitmap_field_t* field = &bitmap[idx]; + size_t prev = mi_atomic_and_acq_rel(field++, ~pre_mask); // clear first part + if ((prev & pre_mask) != pre_mask) all_one = false; + while(mid_count-- > 0) { + prev = mi_atomic_and_acq_rel(field++, ~mid_mask); // clear mid part + if ((prev & mid_mask) != mid_mask) all_one = false; + } + if (post_mask!=0) { + prev = mi_atomic_and_acq_rel(field, ~post_mask); // clear end part + if ((prev & post_mask) != post_mask) all_one = false; + } + return all_one; +} + +// Set `count` bits at `bitmap_idx` to 1 atomically +// Returns `true` if all `count` bits were 0 previously. `any_zero` is `true` if there was at least one zero bit. +bool _mi_bitmap_claim_across(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx, bool* pany_zero) { + size_t idx = mi_bitmap_index_field(bitmap_idx); + size_t pre_mask; + size_t mid_mask; + size_t post_mask; + size_t mid_count = mi_bitmap_mask_across(bitmap_idx, bitmap_fields, count, &pre_mask, &mid_mask, &post_mask); + bool all_zero = true; + bool any_zero = false; + _Atomic(size_t)*field = &bitmap[idx]; + size_t prev = mi_atomic_or_acq_rel(field++, pre_mask); + if ((prev & pre_mask) != 0) all_zero = false; + if ((prev & pre_mask) != pre_mask) any_zero = true; + while (mid_count-- > 0) { + prev = mi_atomic_or_acq_rel(field++, mid_mask); + if ((prev & mid_mask) != 0) all_zero = false; + if ((prev & mid_mask) != mid_mask) any_zero = true; + } + if (post_mask!=0) { + prev = mi_atomic_or_acq_rel(field, post_mask); + if ((prev & post_mask) != 0) all_zero = false; + if ((prev & post_mask) != post_mask) any_zero = true; + } + if (pany_zero != NULL) { *pany_zero = any_zero; } + return all_zero; +} + + +// Returns `true` if all `count` bits were 1. +// `any_ones` is `true` if there was at least one bit set to one. +static bool mi_bitmap_is_claimedx_across(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx, bool* pany_ones) { + size_t idx = mi_bitmap_index_field(bitmap_idx); + size_t pre_mask; + size_t mid_mask; + size_t post_mask; + size_t mid_count = mi_bitmap_mask_across(bitmap_idx, bitmap_fields, count, &pre_mask, &mid_mask, &post_mask); + bool all_ones = true; + bool any_ones = false; + mi_bitmap_field_t* field = &bitmap[idx]; + size_t prev = mi_atomic_load_relaxed(field++); + if ((prev & pre_mask) != pre_mask) all_ones = false; + if ((prev & pre_mask) != 0) any_ones = true; + while (mid_count-- > 0) { + prev = mi_atomic_load_relaxed(field++); + if ((prev & mid_mask) != mid_mask) all_ones = false; + if ((prev & mid_mask) != 0) any_ones = true; + } + if (post_mask!=0) { + prev = mi_atomic_load_relaxed(field); + if ((prev & post_mask) != post_mask) all_ones = false; + if ((prev & post_mask) != 0) any_ones = true; + } + if (pany_ones != NULL) { *pany_ones = any_ones; } + return all_ones; +} + +bool _mi_bitmap_is_claimed_across(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx) { + return mi_bitmap_is_claimedx_across(bitmap, bitmap_fields, count, bitmap_idx, NULL); +} + +bool _mi_bitmap_is_any_claimed_across(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx) { + bool any_ones; + mi_bitmap_is_claimedx_across(bitmap, bitmap_fields, count, bitmap_idx, &any_ones); + return any_ones; +} diff --git a/compat/mimalloc/bitmap.h b/compat/mimalloc/bitmap.h new file mode 100644 index 00000000000000..9ba15d5d6f09ea --- /dev/null +++ b/compat/mimalloc/bitmap.h @@ -0,0 +1,115 @@ +/* ---------------------------------------------------------------------------- +Copyright (c) 2019-2023 Microsoft Research, Daan Leijen +This is free software; you can redistribute it and/or modify it under the +terms of the MIT license. A copy of the license can be found in the file +"LICENSE" at the root of this distribution. +-----------------------------------------------------------------------------*/ + +/* ---------------------------------------------------------------------------- +Concurrent bitmap that can set/reset sequences of bits atomically, +represeted as an array of fields where each field is a machine word (`size_t`) + +There are two api's; the standard one cannot have sequences that cross +between the bitmap fields (and a sequence must be <= MI_BITMAP_FIELD_BITS). +(this is used in region allocation) + +The `_across` postfixed functions do allow sequences that can cross over +between the fields. (This is used in arena allocation) +---------------------------------------------------------------------------- */ +#pragma once +#ifndef MI_BITMAP_H +#define MI_BITMAP_H + +/* ----------------------------------------------------------- + Bitmap definition +----------------------------------------------------------- */ + +#define MI_BITMAP_FIELD_BITS (8*MI_SIZE_SIZE) +#define MI_BITMAP_FIELD_FULL (~((size_t)0)) // all bits set + +// An atomic bitmap of `size_t` fields +typedef _Atomic(size_t) mi_bitmap_field_t; +typedef mi_bitmap_field_t* mi_bitmap_t; + +// A bitmap index is the index of the bit in a bitmap. +typedef size_t mi_bitmap_index_t; + +// Create a bit index. +static inline mi_bitmap_index_t mi_bitmap_index_create(size_t idx, size_t bitidx) { + mi_assert_internal(bitidx < MI_BITMAP_FIELD_BITS); + return (idx*MI_BITMAP_FIELD_BITS) + bitidx; +} + +// Create a bit index. +static inline mi_bitmap_index_t mi_bitmap_index_create_from_bit(size_t full_bitidx) { + return mi_bitmap_index_create(full_bitidx / MI_BITMAP_FIELD_BITS, full_bitidx % MI_BITMAP_FIELD_BITS); +} + +// Get the field index from a bit index. +static inline size_t mi_bitmap_index_field(mi_bitmap_index_t bitmap_idx) { + return (bitmap_idx / MI_BITMAP_FIELD_BITS); +} + +// Get the bit index in a bitmap field +static inline size_t mi_bitmap_index_bit_in_field(mi_bitmap_index_t bitmap_idx) { + return (bitmap_idx % MI_BITMAP_FIELD_BITS); +} + +// Get the full bit index +static inline size_t mi_bitmap_index_bit(mi_bitmap_index_t bitmap_idx) { + return bitmap_idx; +} + +/* ----------------------------------------------------------- + Claim a bit sequence atomically +----------------------------------------------------------- */ + +// Try to atomically claim a sequence of `count` bits in a single +// field at `idx` in `bitmap`. Returns `true` on success. +bool _mi_bitmap_try_find_claim_field(mi_bitmap_t bitmap, size_t idx, const size_t count, mi_bitmap_index_t* bitmap_idx); + +// Starts at idx, and wraps around to search in all `bitmap_fields` fields. +// For now, `count` can be at most MI_BITMAP_FIELD_BITS and will never cross fields. +bool _mi_bitmap_try_find_from_claim(mi_bitmap_t bitmap, const size_t bitmap_fields, const size_t start_field_idx, const size_t count, mi_bitmap_index_t* bitmap_idx); + +// Like _mi_bitmap_try_find_from_claim but with an extra predicate that must be fullfilled +typedef bool (mi_cdecl *mi_bitmap_pred_fun_t)(mi_bitmap_index_t bitmap_idx, void* pred_arg); +bool _mi_bitmap_try_find_from_claim_pred(mi_bitmap_t bitmap, const size_t bitmap_fields, const size_t start_field_idx, const size_t count, mi_bitmap_pred_fun_t pred_fun, void* pred_arg, mi_bitmap_index_t* bitmap_idx); + +// Set `count` bits at `bitmap_idx` to 0 atomically +// Returns `true` if all `count` bits were 1 previously. +bool _mi_bitmap_unclaim(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx); + +// Try to set `count` bits at `bitmap_idx` from 0 to 1 atomically. +// Returns `true` if successful when all previous `count` bits were 0. +bool _mi_bitmap_try_claim(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx); + +// Set `count` bits at `bitmap_idx` to 1 atomically +// Returns `true` if all `count` bits were 0 previously. `any_zero` is `true` if there was at least one zero bit. +bool _mi_bitmap_claim(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx, bool* any_zero); + +bool _mi_bitmap_is_claimed(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx); +bool _mi_bitmap_is_any_claimed(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx); + + +//-------------------------------------------------------------------------- +// the `_across` functions work on bitmaps where sequences can cross over +// between the fields. This is used in arena allocation +//-------------------------------------------------------------------------- + +// Find `count` bits of zeros and set them to 1 atomically; returns `true` on success. +// Starts at idx, and wraps around to search in all `bitmap_fields` fields. +bool _mi_bitmap_try_find_from_claim_across(mi_bitmap_t bitmap, const size_t bitmap_fields, const size_t start_field_idx, const size_t count, mi_bitmap_index_t* bitmap_idx); + +// Set `count` bits at `bitmap_idx` to 0 atomically +// Returns `true` if all `count` bits were 1 previously. +bool _mi_bitmap_unclaim_across(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx); + +// Set `count` bits at `bitmap_idx` to 1 atomically +// Returns `true` if all `count` bits were 0 previously. `any_zero` is `true` if there was at least one zero bit. +bool _mi_bitmap_claim_across(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx, bool* pany_zero); + +bool _mi_bitmap_is_claimed_across(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx); +bool _mi_bitmap_is_any_claimed_across(mi_bitmap_t bitmap, size_t bitmap_fields, size_t count, mi_bitmap_index_t bitmap_idx); + +#endif diff --git a/compat/mimalloc/heap.c b/compat/mimalloc/heap.c new file mode 100644 index 00000000000000..dab8c4bf8ae388 --- /dev/null +++ b/compat/mimalloc/heap.c @@ -0,0 +1,626 @@ +/*---------------------------------------------------------------------------- +Copyright (c) 2018-2021, Microsoft Research, Daan Leijen +This is free software; you can redistribute it and/or modify it under the +terms of the MIT license. A copy of the license can be found in the file +"LICENSE" at the root of this distribution. +-----------------------------------------------------------------------------*/ + +#include "mimalloc.h" +#include "mimalloc/internal.h" +#include "mimalloc/atomic.h" +#include "mimalloc/prim.h" // mi_prim_get_default_heap + +#include // memset, memcpy + +#if defined(_MSC_VER) && (_MSC_VER < 1920) +#pragma warning(disable:4204) // non-constant aggregate initializer +#endif + +/* ----------------------------------------------------------- + Helpers +----------------------------------------------------------- */ + +// return `true` if ok, `false` to break +typedef bool (heap_page_visitor_fun)(mi_heap_t* heap, mi_page_queue_t* pq, mi_page_t* page, void* arg1, void* arg2); + +// Visit all pages in a heap; returns `false` if break was called. +static bool mi_heap_visit_pages(mi_heap_t* heap, heap_page_visitor_fun* fn, void* arg1, void* arg2) +{ + if (heap==NULL || heap->page_count==0) return 0; + + // visit all pages + #if MI_DEBUG>1 + size_t total = heap->page_count; + size_t count = 0; + #endif + + for (size_t i = 0; i <= MI_BIN_FULL; i++) { + mi_page_queue_t* pq = &heap->pages[i]; + mi_page_t* page = pq->first; + while(page != NULL) { + mi_page_t* next = page->next; // save next in case the page gets removed from the queue + mi_assert_internal(mi_page_heap(page) == heap); + #if MI_DEBUG>1 + count++; + #endif + if (!fn(heap, pq, page, arg1, arg2)) return false; + page = next; // and continue + } + } + mi_assert_internal(count == total); + return true; +} + + +#if MI_DEBUG>=2 +static bool mi_heap_page_is_valid(mi_heap_t* heap, mi_page_queue_t* pq, mi_page_t* page, void* arg1, void* arg2) { + MI_UNUSED(arg1); + MI_UNUSED(arg2); + MI_UNUSED(pq); + mi_assert_internal(mi_page_heap(page) == heap); + mi_segment_t* segment = _mi_page_segment(page); + mi_assert_internal(segment->thread_id == heap->thread_id); + mi_assert_expensive(_mi_page_is_valid(page)); + return true; +} +#endif +#if MI_DEBUG>=3 +static bool mi_heap_is_valid(mi_heap_t* heap) { + mi_assert_internal(heap!=NULL); + mi_heap_visit_pages(heap, &mi_heap_page_is_valid, NULL, NULL); + return true; +} +#endif + + + + +/* ----------------------------------------------------------- + "Collect" pages by migrating `local_free` and `thread_free` + lists and freeing empty pages. This is done when a thread + stops (and in that case abandons pages if there are still + blocks alive) +----------------------------------------------------------- */ + +typedef enum mi_collect_e { + MI_NORMAL, + MI_FORCE, + MI_ABANDON +} mi_collect_t; + + +static bool mi_heap_page_collect(mi_heap_t* heap, mi_page_queue_t* pq, mi_page_t* page, void* arg_collect, void* arg2 ) { + MI_UNUSED(arg2); + MI_UNUSED(heap); + mi_assert_internal(mi_heap_page_is_valid(heap, pq, page, NULL, NULL)); + mi_collect_t collect = *((mi_collect_t*)arg_collect); + _mi_page_free_collect(page, collect >= MI_FORCE); + if (mi_page_all_free(page)) { + // no more used blocks, free the page. + // note: this will free retired pages as well. + _mi_page_free(page, pq, collect >= MI_FORCE); + } + else if (collect == MI_ABANDON) { + // still used blocks but the thread is done; abandon the page + _mi_page_abandon(page, pq); + } + return true; // don't break +} + +static bool mi_heap_page_never_delayed_free(mi_heap_t* heap, mi_page_queue_t* pq, mi_page_t* page, void* arg1, void* arg2) { + MI_UNUSED(arg1); + MI_UNUSED(arg2); + MI_UNUSED(heap); + MI_UNUSED(pq); + _mi_page_use_delayed_free(page, MI_NEVER_DELAYED_FREE, false); + return true; // don't break +} + +static void mi_heap_collect_ex(mi_heap_t* heap, mi_collect_t collect) +{ + if (heap==NULL || !mi_heap_is_initialized(heap)) return; + + const bool force = collect >= MI_FORCE; + _mi_deferred_free(heap, force); + + // note: never reclaim on collect but leave it to threads that need storage to reclaim + const bool force_main = + #ifdef NDEBUG + collect == MI_FORCE + #else + collect >= MI_FORCE + #endif + && _mi_is_main_thread() && mi_heap_is_backing(heap) && !heap->no_reclaim; + + if (force_main) { + // the main thread is abandoned (end-of-program), try to reclaim all abandoned segments. + // if all memory is freed by now, all segments should be freed. + _mi_abandoned_reclaim_all(heap, &heap->tld->segments); + } + + // if abandoning, mark all pages to no longer add to delayed_free + if (collect == MI_ABANDON) { + mi_heap_visit_pages(heap, &mi_heap_page_never_delayed_free, NULL, NULL); + } + + // free all current thread delayed blocks. + // (if abandoning, after this there are no more thread-delayed references into the pages.) + _mi_heap_delayed_free_all(heap); + + // collect retired pages + _mi_heap_collect_retired(heap, force); + + // collect all pages owned by this thread + mi_heap_visit_pages(heap, &mi_heap_page_collect, &collect, NULL); + mi_assert_internal( collect != MI_ABANDON || mi_atomic_load_ptr_acquire(mi_block_t,&heap->thread_delayed_free) == NULL ); + + // collect abandoned segments (in particular, purge expired parts of segments in the abandoned segment list) + // note: forced purge can be quite expensive if many threads are created/destroyed so we do not force on abandonment + _mi_abandoned_collect(heap, collect == MI_FORCE /* force? */, &heap->tld->segments); + + // collect segment local caches + if (force) { + _mi_segment_thread_collect(&heap->tld->segments); + } + + // collect regions on program-exit (or shared library unload) + if (force && _mi_is_main_thread() && mi_heap_is_backing(heap)) { + _mi_thread_data_collect(); // collect thread data cache + _mi_arena_collect(true /* force purge */, &heap->tld->stats); + } +} + +void _mi_heap_collect_abandon(mi_heap_t* heap) { + mi_heap_collect_ex(heap, MI_ABANDON); +} + +void mi_heap_collect(mi_heap_t* heap, bool force) mi_attr_noexcept { + mi_heap_collect_ex(heap, (force ? MI_FORCE : MI_NORMAL)); +} + +void mi_collect(bool force) mi_attr_noexcept { + mi_heap_collect(mi_prim_get_default_heap(), force); +} + + +/* ----------------------------------------------------------- + Heap new +----------------------------------------------------------- */ + +mi_heap_t* mi_heap_get_default(void) { + mi_thread_init(); + return mi_prim_get_default_heap(); +} + +static bool mi_heap_is_default(const mi_heap_t* heap) { + return (heap == mi_prim_get_default_heap()); +} + + +mi_heap_t* mi_heap_get_backing(void) { + mi_heap_t* heap = mi_heap_get_default(); + mi_assert_internal(heap!=NULL); + mi_heap_t* bheap = heap->tld->heap_backing; + mi_assert_internal(bheap!=NULL); + mi_assert_internal(bheap->thread_id == _mi_thread_id()); + return bheap; +} + +mi_decl_nodiscard mi_heap_t* mi_heap_new_in_arena(mi_arena_id_t arena_id) { + mi_heap_t* bheap = mi_heap_get_backing(); + mi_heap_t* heap = mi_heap_malloc_tp(bheap, mi_heap_t); // todo: OS allocate in secure mode? + if (heap == NULL) return NULL; + _mi_memcpy_aligned(heap, &_mi_heap_empty, sizeof(mi_heap_t)); + heap->tld = bheap->tld; + heap->thread_id = _mi_thread_id(); + heap->arena_id = arena_id; + _mi_random_split(&bheap->random, &heap->random); + heap->cookie = _mi_heap_random_next(heap) | 1; + heap->keys[0] = _mi_heap_random_next(heap); + heap->keys[1] = _mi_heap_random_next(heap); + heap->no_reclaim = true; // don't reclaim abandoned pages or otherwise destroy is unsafe + // push on the thread local heaps list + heap->next = heap->tld->heaps; + heap->tld->heaps = heap; + return heap; +} + +mi_decl_nodiscard mi_heap_t* mi_heap_new(void) { + return mi_heap_new_in_arena(_mi_arena_id_none()); +} + +bool _mi_heap_memid_is_suitable(mi_heap_t* heap, mi_memid_t memid) { + return _mi_arena_memid_is_suitable(memid, heap->arena_id); +} + +uintptr_t _mi_heap_random_next(mi_heap_t* heap) { + return _mi_random_next(&heap->random); +} + +// zero out the page queues +static void mi_heap_reset_pages(mi_heap_t* heap) { + mi_assert_internal(heap != NULL); + mi_assert_internal(mi_heap_is_initialized(heap)); + // TODO: copy full empty heap instead? + memset(&heap->pages_free_direct, 0, sizeof(heap->pages_free_direct)); + _mi_memcpy_aligned(&heap->pages, &_mi_heap_empty.pages, sizeof(heap->pages)); + heap->thread_delayed_free = NULL; + heap->page_count = 0; +} + +// called from `mi_heap_destroy` and `mi_heap_delete` to free the internal heap resources. +static void mi_heap_free(mi_heap_t* heap) { + mi_assert(heap != NULL); + mi_assert_internal(mi_heap_is_initialized(heap)); + if (heap==NULL || !mi_heap_is_initialized(heap)) return; + if (mi_heap_is_backing(heap)) return; // dont free the backing heap + + // reset default + if (mi_heap_is_default(heap)) { + _mi_heap_set_default_direct(heap->tld->heap_backing); + } + + // remove ourselves from the thread local heaps list + // linear search but we expect the number of heaps to be relatively small + mi_heap_t* prev = NULL; + mi_heap_t* curr = heap->tld->heaps; + while (curr != heap && curr != NULL) { + prev = curr; + curr = curr->next; + } + mi_assert_internal(curr == heap); + if (curr == heap) { + if (prev != NULL) { prev->next = heap->next; } + else { heap->tld->heaps = heap->next; } + } + mi_assert_internal(heap->tld->heaps != NULL); + + // and free the used memory + mi_free(heap); +} + + +/* ----------------------------------------------------------- + Heap destroy +----------------------------------------------------------- */ + +static bool _mi_heap_page_destroy(mi_heap_t* heap, mi_page_queue_t* pq, mi_page_t* page, void* arg1, void* arg2) { + MI_UNUSED(arg1); + MI_UNUSED(arg2); + MI_UNUSED(heap); + MI_UNUSED(pq); + + // ensure no more thread_delayed_free will be added + _mi_page_use_delayed_free(page, MI_NEVER_DELAYED_FREE, false); + + // stats + const size_t bsize = mi_page_block_size(page); + if (bsize > MI_MEDIUM_OBJ_SIZE_MAX) { + if (bsize <= MI_LARGE_OBJ_SIZE_MAX) { + mi_heap_stat_decrease(heap, large, bsize); + } + else { + mi_heap_stat_decrease(heap, huge, bsize); + } + } +#if (MI_STAT) + _mi_page_free_collect(page, false); // update used count + const size_t inuse = page->used; + if (bsize <= MI_LARGE_OBJ_SIZE_MAX) { + mi_heap_stat_decrease(heap, normal, bsize * inuse); +#if (MI_STAT>1) + mi_heap_stat_decrease(heap, normal_bins[_mi_bin(bsize)], inuse); +#endif + } + mi_heap_stat_decrease(heap, malloc, bsize * inuse); // todo: off for aligned blocks... +#endif + + /// pretend it is all free now + mi_assert_internal(mi_page_thread_free(page) == NULL); + page->used = 0; + + // and free the page + // mi_page_free(page,false); + page->next = NULL; + page->prev = NULL; + _mi_segment_page_free(page,false /* no force? */, &heap->tld->segments); + + return true; // keep going +} + +void _mi_heap_destroy_pages(mi_heap_t* heap) { + mi_heap_visit_pages(heap, &_mi_heap_page_destroy, NULL, NULL); + mi_heap_reset_pages(heap); +} + +#if MI_TRACK_HEAP_DESTROY +static bool mi_cdecl mi_heap_track_block_free(const mi_heap_t* heap, const mi_heap_area_t* area, void* block, size_t block_size, void* arg) { + MI_UNUSED(heap); MI_UNUSED(area); MI_UNUSED(arg); MI_UNUSED(block_size); + mi_track_free_size(block,mi_usable_size(block)); + return true; +} +#endif + +void mi_heap_destroy(mi_heap_t* heap) { + mi_assert(heap != NULL); + mi_assert(mi_heap_is_initialized(heap)); + mi_assert(heap->no_reclaim); + mi_assert_expensive(mi_heap_is_valid(heap)); + if (heap==NULL || !mi_heap_is_initialized(heap)) return; + if (!heap->no_reclaim) { + // don't free in case it may contain reclaimed pages + mi_heap_delete(heap); + } + else { + // track all blocks as freed + #if MI_TRACK_HEAP_DESTROY + mi_heap_visit_blocks(heap, true, mi_heap_track_block_free, NULL); + #endif + // free all pages + _mi_heap_destroy_pages(heap); + mi_heap_free(heap); + } +} + +// forcefully destroy all heaps in the current thread +void _mi_heap_unsafe_destroy_all(void) { + mi_heap_t* bheap = mi_heap_get_backing(); + mi_heap_t* curr = bheap->tld->heaps; + while (curr != NULL) { + mi_heap_t* next = curr->next; + if (curr->no_reclaim) { + mi_heap_destroy(curr); + } + else { + _mi_heap_destroy_pages(curr); + } + curr = next; + } +} + +/* ----------------------------------------------------------- + Safe Heap delete +----------------------------------------------------------- */ + +// Transfer the pages from one heap to the other +static void mi_heap_absorb(mi_heap_t* heap, mi_heap_t* from) { + mi_assert_internal(heap!=NULL); + if (from==NULL || from->page_count == 0) return; + + // reduce the size of the delayed frees + _mi_heap_delayed_free_partial(from); + + // transfer all pages by appending the queues; this will set a new heap field + // so threads may do delayed frees in either heap for a while. + // note: appending waits for each page to not be in the `MI_DELAYED_FREEING` state + // so after this only the new heap will get delayed frees + for (size_t i = 0; i <= MI_BIN_FULL; i++) { + mi_page_queue_t* pq = &heap->pages[i]; + mi_page_queue_t* append = &from->pages[i]; + size_t pcount = _mi_page_queue_append(heap, pq, append); + heap->page_count += pcount; + from->page_count -= pcount; + } + mi_assert_internal(from->page_count == 0); + + // and do outstanding delayed frees in the `from` heap + // note: be careful here as the `heap` field in all those pages no longer point to `from`, + // turns out to be ok as `_mi_heap_delayed_free` only visits the list and calls a + // the regular `_mi_free_delayed_block` which is safe. + _mi_heap_delayed_free_all(from); + #if !defined(_MSC_VER) || (_MSC_VER > 1900) // somehow the following line gives an error in VS2015, issue #353 + mi_assert_internal(mi_atomic_load_ptr_relaxed(mi_block_t,&from->thread_delayed_free) == NULL); + #endif + + // and reset the `from` heap + mi_heap_reset_pages(from); +} + +// Safe delete a heap without freeing any still allocated blocks in that heap. +void mi_heap_delete(mi_heap_t* heap) +{ + mi_assert(heap != NULL); + mi_assert(mi_heap_is_initialized(heap)); + mi_assert_expensive(mi_heap_is_valid(heap)); + if (heap==NULL || !mi_heap_is_initialized(heap)) return; + + if (!mi_heap_is_backing(heap)) { + // tranfer still used pages to the backing heap + mi_heap_absorb(heap->tld->heap_backing, heap); + } + else { + // the backing heap abandons its pages + _mi_heap_collect_abandon(heap); + } + mi_assert_internal(heap->page_count==0); + mi_heap_free(heap); +} + +mi_heap_t* mi_heap_set_default(mi_heap_t* heap) { + mi_assert(heap != NULL); + mi_assert(mi_heap_is_initialized(heap)); + if (heap==NULL || !mi_heap_is_initialized(heap)) return NULL; + mi_assert_expensive(mi_heap_is_valid(heap)); + mi_heap_t* old = mi_prim_get_default_heap(); + _mi_heap_set_default_direct(heap); + return old; +} + + + + +/* ----------------------------------------------------------- + Analysis +----------------------------------------------------------- */ + +// static since it is not thread safe to access heaps from other threads. +static mi_heap_t* mi_heap_of_block(const void* p) { + if (p == NULL) return NULL; + mi_segment_t* segment = _mi_ptr_segment(p); + bool valid = (_mi_ptr_cookie(segment) == segment->cookie); + mi_assert_internal(valid); + if mi_unlikely(!valid) return NULL; + return mi_page_heap(_mi_segment_page_of(segment,p)); +} + +bool mi_heap_contains_block(mi_heap_t* heap, const void* p) { + mi_assert(heap != NULL); + if (heap==NULL || !mi_heap_is_initialized(heap)) return false; + return (heap == mi_heap_of_block(p)); +} + + +static bool mi_heap_page_check_owned(mi_heap_t* heap, mi_page_queue_t* pq, mi_page_t* page, void* p, void* vfound) { + MI_UNUSED(heap); + MI_UNUSED(pq); + bool* found = (bool*)vfound; + mi_segment_t* segment = _mi_page_segment(page); + void* start = _mi_page_start(segment, page, NULL); + void* end = (uint8_t*)start + (page->capacity * mi_page_block_size(page)); + *found = (p >= start && p < end); + return (!*found); // continue if not found +} + +bool mi_heap_check_owned(mi_heap_t* heap, const void* p) { + mi_assert(heap != NULL); + if (heap==NULL || !mi_heap_is_initialized(heap)) return false; + if (((uintptr_t)p & (MI_INTPTR_SIZE - 1)) != 0) return false; // only aligned pointers + bool found = false; + mi_heap_visit_pages(heap, &mi_heap_page_check_owned, (void*)p, &found); + return found; +} + +bool mi_check_owned(const void* p) { + return mi_heap_check_owned(mi_prim_get_default_heap(), p); +} + +/* ----------------------------------------------------------- + Visit all heap blocks and areas + Todo: enable visiting abandoned pages, and + enable visiting all blocks of all heaps across threads +----------------------------------------------------------- */ + +// Separate struct to keep `mi_page_t` out of the public interface +typedef struct mi_heap_area_ex_s { + mi_heap_area_t area; + mi_page_t* page; +} mi_heap_area_ex_t; + +static bool mi_heap_area_visit_blocks(const mi_heap_area_ex_t* xarea, mi_block_visit_fun* visitor, void* arg) { + mi_assert(xarea != NULL); + if (xarea==NULL) return true; + const mi_heap_area_t* area = &xarea->area; + mi_page_t* page = xarea->page; + mi_assert(page != NULL); + if (page == NULL) return true; + + _mi_page_free_collect(page,true); + mi_assert_internal(page->local_free == NULL); + if (page->used == 0) return true; + + const size_t bsize = mi_page_block_size(page); + const size_t ubsize = mi_page_usable_block_size(page); // without padding + size_t psize; + uint8_t* pstart = _mi_page_start(_mi_page_segment(page), page, &psize); + + if (page->capacity == 1) { + // optimize page with one block + mi_assert_internal(page->used == 1 && page->free == NULL); + return visitor(mi_page_heap(page), area, pstart, ubsize, arg); + } + + // create a bitmap of free blocks. + #define MI_MAX_BLOCKS (MI_SMALL_PAGE_SIZE / sizeof(void*)) + uintptr_t free_map[MI_MAX_BLOCKS / sizeof(uintptr_t)]; + memset(free_map, 0, sizeof(free_map)); + + #if MI_DEBUG>1 + size_t free_count = 0; + #endif + for (mi_block_t* block = page->free; block != NULL; block = mi_block_next(page,block)) { + #if MI_DEBUG>1 + free_count++; + #endif + mi_assert_internal((uint8_t*)block >= pstart && (uint8_t*)block < (pstart + psize)); + size_t offset = (uint8_t*)block - pstart; + mi_assert_internal(offset % bsize == 0); + size_t blockidx = offset / bsize; // Todo: avoid division? + mi_assert_internal( blockidx < MI_MAX_BLOCKS); + size_t bitidx = (blockidx / sizeof(uintptr_t)); + size_t bit = blockidx - (bitidx * sizeof(uintptr_t)); + free_map[bitidx] |= ((uintptr_t)1 << bit); + } + mi_assert_internal(page->capacity == (free_count + page->used)); + + // walk through all blocks skipping the free ones + #if MI_DEBUG>1 + size_t used_count = 0; + #endif + for (size_t i = 0; i < page->capacity; i++) { + size_t bitidx = (i / sizeof(uintptr_t)); + size_t bit = i - (bitidx * sizeof(uintptr_t)); + uintptr_t m = free_map[bitidx]; + if (bit == 0 && m == UINTPTR_MAX) { + i += (sizeof(uintptr_t) - 1); // skip a run of free blocks + } + else if ((m & ((uintptr_t)1 << bit)) == 0) { + #if MI_DEBUG>1 + used_count++; + #endif + uint8_t* block = pstart + (i * bsize); + if (!visitor(mi_page_heap(page), area, block, ubsize, arg)) return false; + } + } + mi_assert_internal(page->used == used_count); + return true; +} + +typedef bool (mi_heap_area_visit_fun)(const mi_heap_t* heap, const mi_heap_area_ex_t* area, void* arg); + + +static bool mi_heap_visit_areas_page(mi_heap_t* heap, mi_page_queue_t* pq, mi_page_t* page, void* vfun, void* arg) { + MI_UNUSED(heap); + MI_UNUSED(pq); + mi_heap_area_visit_fun* fun = (mi_heap_area_visit_fun*)vfun; + mi_heap_area_ex_t xarea; + const size_t bsize = mi_page_block_size(page); + const size_t ubsize = mi_page_usable_block_size(page); + xarea.page = page; + xarea.area.reserved = page->reserved * bsize; + xarea.area.committed = page->capacity * bsize; + xarea.area.blocks = _mi_page_start(_mi_page_segment(page), page, NULL); + xarea.area.used = page->used; // number of blocks in use (#553) + xarea.area.block_size = ubsize; + xarea.area.full_block_size = bsize; + return fun(heap, &xarea, arg); +} + +// Visit all heap pages as areas +static bool mi_heap_visit_areas(const mi_heap_t* heap, mi_heap_area_visit_fun* visitor, void* arg) { + if (visitor == NULL) return false; + return mi_heap_visit_pages((mi_heap_t*)heap, &mi_heap_visit_areas_page, (void*)(visitor), arg); // note: function pointer to void* :-{ +} + +// Just to pass arguments +typedef struct mi_visit_blocks_args_s { + bool visit_blocks; + mi_block_visit_fun* visitor; + void* arg; +} mi_visit_blocks_args_t; + +static bool mi_heap_area_visitor(const mi_heap_t* heap, const mi_heap_area_ex_t* xarea, void* arg) { + mi_visit_blocks_args_t* args = (mi_visit_blocks_args_t*)arg; + if (!args->visitor(heap, &xarea->area, NULL, xarea->area.block_size, args->arg)) return false; + if (args->visit_blocks) { + return mi_heap_area_visit_blocks(xarea, args->visitor, args->arg); + } + else { + return true; + } +} + +// Visit all blocks in a heap +bool mi_heap_visit_blocks(const mi_heap_t* heap, bool visit_blocks, mi_block_visit_fun* visitor, void* arg) { + mi_visit_blocks_args_t args = { visit_blocks, visitor, arg }; + return mi_heap_visit_areas(heap, &mi_heap_area_visitor, &args); +} diff --git a/compat/mimalloc/init.c b/compat/mimalloc/init.c new file mode 100644 index 00000000000000..4670d5510db187 --- /dev/null +++ b/compat/mimalloc/init.c @@ -0,0 +1,709 @@ +/* ---------------------------------------------------------------------------- +Copyright (c) 2018-2022, Microsoft Research, Daan Leijen +This is free software; you can redistribute it and/or modify it under the +terms of the MIT license. A copy of the license can be found in the file +"LICENSE" at the root of this distribution. +-----------------------------------------------------------------------------*/ +#include "mimalloc.h" +#include "mimalloc/internal.h" +#include "mimalloc/prim.h" + +#include // memcpy, memset +#include // atexit + + +// Empty page used to initialize the small free pages array +const mi_page_t _mi_page_empty = { + 0, false, false, false, + 0, // capacity + 0, // reserved capacity + { 0 }, // flags + false, // is_zero + 0, // retire_expire + NULL, // free + 0, // used + 0, // xblock_size + NULL, // local_free + #if (MI_PADDING || MI_ENCODE_FREELIST) + { 0, 0 }, + #endif + MI_ATOMIC_VAR_INIT(0), // xthread_free + MI_ATOMIC_VAR_INIT(0), // xheap + NULL, NULL + #if MI_INTPTR_SIZE==8 + , { 0 } // padding + #endif +}; + +#define MI_PAGE_EMPTY() ((mi_page_t*)&_mi_page_empty) + +#if (MI_SMALL_WSIZE_MAX==128) +#if (MI_PADDING>0) && (MI_INTPTR_SIZE >= 8) +#define MI_SMALL_PAGES_EMPTY { MI_INIT128(MI_PAGE_EMPTY), MI_PAGE_EMPTY(), MI_PAGE_EMPTY() } +#elif (MI_PADDING>0) +#define MI_SMALL_PAGES_EMPTY { MI_INIT128(MI_PAGE_EMPTY), MI_PAGE_EMPTY(), MI_PAGE_EMPTY(), MI_PAGE_EMPTY() } +#else +#define MI_SMALL_PAGES_EMPTY { MI_INIT128(MI_PAGE_EMPTY), MI_PAGE_EMPTY() } +#endif +#else +#error "define right initialization sizes corresponding to MI_SMALL_WSIZE_MAX" +#endif + +// Empty page queues for every bin +#define QNULL(sz) { NULL, NULL, (sz)*sizeof(uintptr_t) } +#define MI_PAGE_QUEUES_EMPTY \ + { QNULL(1), \ + QNULL( 1), QNULL( 2), QNULL( 3), QNULL( 4), QNULL( 5), QNULL( 6), QNULL( 7), QNULL( 8), /* 8 */ \ + QNULL( 10), QNULL( 12), QNULL( 14), QNULL( 16), QNULL( 20), QNULL( 24), QNULL( 28), QNULL( 32), /* 16 */ \ + QNULL( 40), QNULL( 48), QNULL( 56), QNULL( 64), QNULL( 80), QNULL( 96), QNULL( 112), QNULL( 128), /* 24 */ \ + QNULL( 160), QNULL( 192), QNULL( 224), QNULL( 256), QNULL( 320), QNULL( 384), QNULL( 448), QNULL( 512), /* 32 */ \ + QNULL( 640), QNULL( 768), QNULL( 896), QNULL( 1024), QNULL( 1280), QNULL( 1536), QNULL( 1792), QNULL( 2048), /* 40 */ \ + QNULL( 2560), QNULL( 3072), QNULL( 3584), QNULL( 4096), QNULL( 5120), QNULL( 6144), QNULL( 7168), QNULL( 8192), /* 48 */ \ + QNULL( 10240), QNULL( 12288), QNULL( 14336), QNULL( 16384), QNULL( 20480), QNULL( 24576), QNULL( 28672), QNULL( 32768), /* 56 */ \ + QNULL( 40960), QNULL( 49152), QNULL( 57344), QNULL( 65536), QNULL( 81920), QNULL( 98304), QNULL(114688), QNULL(131072), /* 64 */ \ + QNULL(163840), QNULL(196608), QNULL(229376), QNULL(262144), QNULL(327680), QNULL(393216), QNULL(458752), QNULL(524288), /* 72 */ \ + QNULL(MI_MEDIUM_OBJ_WSIZE_MAX + 1 /* 655360, Huge queue */), \ + QNULL(MI_MEDIUM_OBJ_WSIZE_MAX + 2) /* Full queue */ } + +#define MI_STAT_COUNT_NULL() {0,0,0,0} + +// Empty statistics +#if MI_STAT>1 +#define MI_STAT_COUNT_END_NULL() , { MI_STAT_COUNT_NULL(), MI_INIT32(MI_STAT_COUNT_NULL) } +#else +#define MI_STAT_COUNT_END_NULL() +#endif + +#define MI_STATS_NULL \ + MI_STAT_COUNT_NULL(), MI_STAT_COUNT_NULL(), \ + MI_STAT_COUNT_NULL(), MI_STAT_COUNT_NULL(), \ + MI_STAT_COUNT_NULL(), MI_STAT_COUNT_NULL(), \ + MI_STAT_COUNT_NULL(), MI_STAT_COUNT_NULL(), \ + MI_STAT_COUNT_NULL(), MI_STAT_COUNT_NULL(), \ + MI_STAT_COUNT_NULL(), MI_STAT_COUNT_NULL(), \ + MI_STAT_COUNT_NULL(), MI_STAT_COUNT_NULL(), \ + MI_STAT_COUNT_NULL(), \ + { 0, 0 }, { 0, 0 }, { 0, 0 }, { 0, 0 }, \ + { 0, 0 }, { 0, 0 }, { 0, 0 }, { 0, 0 }, { 0, 0 }, { 0, 0 } \ + MI_STAT_COUNT_END_NULL() + + +// Empty slice span queues for every bin +#define SQNULL(sz) { NULL, NULL, sz } +#define MI_SEGMENT_SPAN_QUEUES_EMPTY \ + { SQNULL(1), \ + SQNULL( 1), SQNULL( 2), SQNULL( 3), SQNULL( 4), SQNULL( 5), SQNULL( 6), SQNULL( 7), SQNULL( 10), /* 8 */ \ + SQNULL( 12), SQNULL( 14), SQNULL( 16), SQNULL( 20), SQNULL( 24), SQNULL( 28), SQNULL( 32), SQNULL( 40), /* 16 */ \ + SQNULL( 48), SQNULL( 56), SQNULL( 64), SQNULL( 80), SQNULL( 96), SQNULL( 112), SQNULL( 128), SQNULL( 160), /* 24 */ \ + SQNULL( 192), SQNULL( 224), SQNULL( 256), SQNULL( 320), SQNULL( 384), SQNULL( 448), SQNULL( 512), SQNULL( 640), /* 32 */ \ + SQNULL( 768), SQNULL( 896), SQNULL( 1024) /* 35 */ } + + +// -------------------------------------------------------- +// Statically allocate an empty heap as the initial +// thread local value for the default heap, +// and statically allocate the backing heap for the main +// thread so it can function without doing any allocation +// itself (as accessing a thread local for the first time +// may lead to allocation itself on some platforms) +// -------------------------------------------------------- + +mi_decl_cache_align const mi_heap_t _mi_heap_empty = { + NULL, + MI_SMALL_PAGES_EMPTY, + MI_PAGE_QUEUES_EMPTY, + MI_ATOMIC_VAR_INIT(NULL), + 0, // tid + 0, // cookie + 0, // arena id + { 0, 0 }, // keys + { {0}, {0}, 0, true }, // random + 0, // page count + MI_BIN_FULL, 0, // page retired min/max + NULL, // next + false +}; + +#define tld_empty_stats ((mi_stats_t*)((uint8_t*)&tld_empty + offsetof(mi_tld_t,stats))) +#define tld_empty_os ((mi_os_tld_t*)((uint8_t*)&tld_empty + offsetof(mi_tld_t,os))) + +mi_decl_cache_align static const mi_tld_t tld_empty = { + 0, + false, + NULL, NULL, + { MI_SEGMENT_SPAN_QUEUES_EMPTY, 0, 0, 0, 0, tld_empty_stats, tld_empty_os }, // segments + { 0, tld_empty_stats }, // os + { MI_STATS_NULL } // stats +}; + +mi_threadid_t _mi_thread_id(void) mi_attr_noexcept { + return _mi_prim_thread_id(); +} + +// the thread-local default heap for allocation +mi_decl_thread mi_heap_t* _mi_heap_default = (mi_heap_t*)&_mi_heap_empty; + +extern mi_heap_t _mi_heap_main; + +static mi_tld_t tld_main = { + 0, false, + &_mi_heap_main, & _mi_heap_main, + { MI_SEGMENT_SPAN_QUEUES_EMPTY, 0, 0, 0, 0, &tld_main.stats, &tld_main.os }, // segments + { 0, &tld_main.stats }, // os + { MI_STATS_NULL } // stats +}; + +mi_heap_t _mi_heap_main = { + &tld_main, + MI_SMALL_PAGES_EMPTY, + MI_PAGE_QUEUES_EMPTY, + MI_ATOMIC_VAR_INIT(NULL), + 0, // thread id + 0, // initial cookie + 0, // arena id + { 0, 0 }, // the key of the main heap can be fixed (unlike page keys that need to be secure!) + { {0x846ca68b}, {0}, 0, true }, // random + 0, // page count + MI_BIN_FULL, 0, // page retired min/max + NULL, // next heap + false // can reclaim +}; + +bool _mi_process_is_initialized = false; // set to `true` in `mi_process_init`. + +mi_stats_t _mi_stats_main = { MI_STATS_NULL }; + + +static void mi_heap_main_init(void) { + if (_mi_heap_main.cookie == 0) { + _mi_heap_main.thread_id = _mi_thread_id(); + _mi_heap_main.cookie = 1; + #if defined(_WIN32) && !defined(MI_SHARED_LIB) + _mi_random_init_weak(&_mi_heap_main.random); // prevent allocation failure during bcrypt dll initialization with static linking + #else + _mi_random_init(&_mi_heap_main.random); + #endif + _mi_heap_main.cookie = _mi_heap_random_next(&_mi_heap_main); + _mi_heap_main.keys[0] = _mi_heap_random_next(&_mi_heap_main); + _mi_heap_main.keys[1] = _mi_heap_random_next(&_mi_heap_main); + } +} + +mi_heap_t* _mi_heap_main_get(void) { + mi_heap_main_init(); + return &_mi_heap_main; +} + + +/* ----------------------------------------------------------- + Initialization and freeing of the thread local heaps +----------------------------------------------------------- */ + +// note: in x64 in release build `sizeof(mi_thread_data_t)` is under 4KiB (= OS page size). +typedef struct mi_thread_data_s { + mi_heap_t heap; // must come first due to cast in `_mi_heap_done` + mi_tld_t tld; + mi_memid_t memid; +} mi_thread_data_t; + + +// Thread meta-data is allocated directly from the OS. For +// some programs that do not use thread pools and allocate and +// destroy many OS threads, this may causes too much overhead +// per thread so we maintain a small cache of recently freed metadata. + +#define TD_CACHE_SIZE (16) +static _Atomic(mi_thread_data_t*) td_cache[TD_CACHE_SIZE]; + +static mi_thread_data_t* mi_thread_data_zalloc(void) { + // try to find thread metadata in the cache + bool is_zero = false; + mi_thread_data_t* td = NULL; + for (int i = 0; i < TD_CACHE_SIZE; i++) { + td = mi_atomic_load_ptr_relaxed(mi_thread_data_t, &td_cache[i]); + if (td != NULL) { + // found cached allocation, try use it + td = mi_atomic_exchange_ptr_acq_rel(mi_thread_data_t, &td_cache[i], NULL); + if (td != NULL) { + break; + } + } + } + + // if that fails, allocate as meta data + if (td == NULL) { + mi_memid_t memid; + td = (mi_thread_data_t*)_mi_os_alloc(sizeof(mi_thread_data_t), &memid, &_mi_stats_main); + if (td == NULL) { + // if this fails, try once more. (issue #257) + td = (mi_thread_data_t*)_mi_os_alloc(sizeof(mi_thread_data_t), &memid, &_mi_stats_main); + if (td == NULL) { + // really out of memory + _mi_error_message(ENOMEM, "unable to allocate thread local heap metadata (%zu bytes)\n", sizeof(mi_thread_data_t)); + } + } + if (td != NULL) { + td->memid = memid; + is_zero = memid.initially_zero; + } + } + + if (td != NULL && !is_zero) { + _mi_memzero_aligned(td, sizeof(*td)); + } + return td; +} + +static void mi_thread_data_free( mi_thread_data_t* tdfree ) { + // try to add the thread metadata to the cache + for (int i = 0; i < TD_CACHE_SIZE; i++) { + mi_thread_data_t* td = mi_atomic_load_ptr_relaxed(mi_thread_data_t, &td_cache[i]); + if (td == NULL) { + mi_thread_data_t* expected = NULL; + if (mi_atomic_cas_ptr_weak_acq_rel(mi_thread_data_t, &td_cache[i], &expected, tdfree)) { + return; + } + } + } + // if that fails, just free it directly + _mi_os_free(tdfree, sizeof(mi_thread_data_t), tdfree->memid, &_mi_stats_main); +} + +void _mi_thread_data_collect(void) { + // free all thread metadata from the cache + for (int i = 0; i < TD_CACHE_SIZE; i++) { + mi_thread_data_t* td = mi_atomic_load_ptr_relaxed(mi_thread_data_t, &td_cache[i]); + if (td != NULL) { + td = mi_atomic_exchange_ptr_acq_rel(mi_thread_data_t, &td_cache[i], NULL); + if (td != NULL) { + _mi_os_free(td, sizeof(mi_thread_data_t), td->memid, &_mi_stats_main); + } + } + } +} + +// Initialize the thread local default heap, called from `mi_thread_init` +static bool _mi_heap_init(void) { + if (mi_heap_is_initialized(mi_prim_get_default_heap())) return true; + if (_mi_is_main_thread()) { + // mi_assert_internal(_mi_heap_main.thread_id != 0); // can happen on freeBSD where alloc is called before any initialization + // the main heap is statically allocated + mi_heap_main_init(); + _mi_heap_set_default_direct(&_mi_heap_main); + //mi_assert_internal(_mi_heap_default->tld->heap_backing == mi_prim_get_default_heap()); + } + else { + // use `_mi_os_alloc` to allocate directly from the OS + mi_thread_data_t* td = mi_thread_data_zalloc(); + if (td == NULL) return false; + + mi_tld_t* tld = &td->tld; + mi_heap_t* heap = &td->heap; + _mi_memcpy_aligned(tld, &tld_empty, sizeof(*tld)); + _mi_memcpy_aligned(heap, &_mi_heap_empty, sizeof(*heap)); + heap->thread_id = _mi_thread_id(); + _mi_random_init(&heap->random); + heap->cookie = _mi_heap_random_next(heap) | 1; + heap->keys[0] = _mi_heap_random_next(heap); + heap->keys[1] = _mi_heap_random_next(heap); + heap->tld = tld; + tld->heap_backing = heap; + tld->heaps = heap; + tld->segments.stats = &tld->stats; + tld->segments.os = &tld->os; + tld->os.stats = &tld->stats; + _mi_heap_set_default_direct(heap); + } + return false; +} + +// Free the thread local default heap (called from `mi_thread_done`) +static bool _mi_heap_done(mi_heap_t* heap) { + if (!mi_heap_is_initialized(heap)) return true; + + // reset default heap + _mi_heap_set_default_direct(_mi_is_main_thread() ? &_mi_heap_main : (mi_heap_t*)&_mi_heap_empty); + + // switch to backing heap + heap = heap->tld->heap_backing; + if (!mi_heap_is_initialized(heap)) return false; + + // delete all non-backing heaps in this thread + mi_heap_t* curr = heap->tld->heaps; + while (curr != NULL) { + mi_heap_t* next = curr->next; // save `next` as `curr` will be freed + if (curr != heap) { + mi_assert_internal(!mi_heap_is_backing(curr)); + mi_heap_delete(curr); + } + curr = next; + } + mi_assert_internal(heap->tld->heaps == heap && heap->next == NULL); + mi_assert_internal(mi_heap_is_backing(heap)); + + // collect if not the main thread + if (heap != &_mi_heap_main) { + _mi_heap_collect_abandon(heap); + } + + // merge stats + _mi_stats_done(&heap->tld->stats); + + // free if not the main thread + if (heap != &_mi_heap_main) { + // the following assertion does not always hold for huge segments as those are always treated + // as abondened: one may allocate it in one thread, but deallocate in another in which case + // the count can be too large or negative. todo: perhaps not count huge segments? see issue #363 + // mi_assert_internal(heap->tld->segments.count == 0 || heap->thread_id != _mi_thread_id()); + mi_thread_data_free((mi_thread_data_t*)heap); + } + else { + #if 0 + // never free the main thread even in debug mode; if a dll is linked statically with mimalloc, + // there may still be delete/free calls after the mi_fls_done is called. Issue #207 + _mi_heap_destroy_pages(heap); + mi_assert_internal(heap->tld->heap_backing == &_mi_heap_main); + #endif + } + return false; +} + + + +// -------------------------------------------------------- +// Try to run `mi_thread_done()` automatically so any memory +// owned by the thread but not yet released can be abandoned +// and re-owned by another thread. +// +// 1. windows dynamic library: +// call from DllMain on DLL_THREAD_DETACH +// 2. windows static library: +// use `FlsAlloc` to call a destructor when the thread is done +// 3. unix, pthreads: +// use a pthread key to call a destructor when a pthread is done +// +// In the last two cases we also need to call `mi_process_init` +// to set up the thread local keys. +// -------------------------------------------------------- + +// Set up handlers so `mi_thread_done` is called automatically +static void mi_process_setup_auto_thread_done(void) { + static bool tls_initialized = false; // fine if it races + if (tls_initialized) return; + tls_initialized = true; + _mi_prim_thread_init_auto_done(); + _mi_heap_set_default_direct(&_mi_heap_main); +} + + +bool _mi_is_main_thread(void) { + return (_mi_heap_main.thread_id==0 || _mi_heap_main.thread_id == _mi_thread_id()); +} + +static _Atomic(size_t) thread_count = MI_ATOMIC_VAR_INIT(1); + +size_t _mi_current_thread_count(void) { + return mi_atomic_load_relaxed(&thread_count); +} + +// This is called from the `mi_malloc_generic` +void mi_thread_init(void) mi_attr_noexcept +{ + // ensure our process has started already + mi_process_init(); + + // initialize the thread local default heap + // (this will call `_mi_heap_set_default_direct` and thus set the + // fiber/pthread key to a non-zero value, ensuring `_mi_thread_done` is called) + if (_mi_heap_init()) return; // returns true if already initialized + + _mi_stat_increase(&_mi_stats_main.threads, 1); + mi_atomic_increment_relaxed(&thread_count); + //_mi_verbose_message("thread init: 0x%zx\n", _mi_thread_id()); +} + +void mi_thread_done(void) mi_attr_noexcept { + _mi_thread_done(NULL); +} + +void _mi_thread_done(mi_heap_t* heap) +{ + // calling with NULL implies using the default heap + if (heap == NULL) { + heap = mi_prim_get_default_heap(); + if (heap == NULL) return; + } + + // prevent re-entrancy through heap_done/heap_set_default_direct (issue #699) + if (!mi_heap_is_initialized(heap)) { + return; + } + + // adjust stats + mi_atomic_decrement_relaxed(&thread_count); + _mi_stat_decrease(&_mi_stats_main.threads, 1); + + // check thread-id as on Windows shutdown with FLS the main (exit) thread may call this on thread-local heaps... + if (heap->thread_id != _mi_thread_id()) return; + + // abandon the thread local heap + if (_mi_heap_done(heap)) return; // returns true if already ran +} + +void _mi_heap_set_default_direct(mi_heap_t* heap) { + mi_assert_internal(heap != NULL); + #if defined(MI_TLS_SLOT) + mi_prim_tls_slot_set(MI_TLS_SLOT,heap); + #elif defined(MI_TLS_PTHREAD_SLOT_OFS) + *mi_tls_pthread_heap_slot() = heap; + #elif defined(MI_TLS_PTHREAD) + // we use _mi_heap_default_key + #else + _mi_heap_default = heap; + #endif + + // ensure the default heap is passed to `_mi_thread_done` + // setting to a non-NULL value also ensures `mi_thread_done` is called. + _mi_prim_thread_associate_default_heap(heap); +} + + +// -------------------------------------------------------- +// Run functions on process init/done, and thread init/done +// -------------------------------------------------------- +static void mi_cdecl mi_process_done(void); + +static bool os_preloading = true; // true until this module is initialized +static bool mi_redirected = false; // true if malloc redirects to mi_malloc + +// Returns true if this module has not been initialized; Don't use C runtime routines until it returns false. +bool mi_decl_noinline _mi_preloading(void) { + return os_preloading; +} + +mi_decl_nodiscard bool mi_is_redirected(void) mi_attr_noexcept { + return mi_redirected; +} + +// Communicate with the redirection module on Windows +#if defined(_WIN32) && defined(MI_SHARED_LIB) && !defined(MI_WIN_NOREDIRECT) +#ifdef __cplusplus +extern "C" { +#endif +mi_decl_export void _mi_redirect_entry(DWORD reason) { + // called on redirection; careful as this may be called before DllMain + if (reason == DLL_PROCESS_ATTACH) { + mi_redirected = true; + } + else if (reason == DLL_PROCESS_DETACH) { + mi_redirected = false; + } + else if (reason == DLL_THREAD_DETACH) { + mi_thread_done(); + } +} +__declspec(dllimport) bool mi_cdecl mi_allocator_init(const char** message); +__declspec(dllimport) void mi_cdecl mi_allocator_done(void); +#ifdef __cplusplus +} +#endif +#else +static bool mi_allocator_init(const char** message) { + if (message != NULL) *message = NULL; + return true; +} +static void mi_allocator_done(void) { + // nothing to do +} +#endif + +// Called once by the process loader +static void mi_process_load(void) { + mi_heap_main_init(); + #if defined(__APPLE__) || defined(MI_TLS_RECURSE_GUARD) + volatile mi_heap_t* dummy = _mi_heap_default; // access TLS to allocate it before setting tls_initialized to true; + if (dummy == NULL) return; // use dummy or otherwise the access may get optimized away (issue #697) + #endif + os_preloading = false; + mi_assert_internal(_mi_is_main_thread()); + #if !(defined(_WIN32) && defined(MI_SHARED_LIB)) // use Dll process detach (see below) instead of atexit (issue #521) + atexit(&mi_process_done); + #endif + _mi_options_init(); + mi_process_setup_auto_thread_done(); + mi_process_init(); + if (mi_redirected) _mi_verbose_message("malloc is redirected.\n"); + + // show message from the redirector (if present) + const char* msg = NULL; + mi_allocator_init(&msg); + if (msg != NULL && (mi_option_is_enabled(mi_option_verbose) || mi_option_is_enabled(mi_option_show_errors))) { + _mi_fputs(NULL,NULL,NULL,msg); + } + + // reseed random + _mi_random_reinit_if_weak(&_mi_heap_main.random); +} + +#if defined(_WIN32) && (defined(_M_IX86) || defined(_M_X64)) +#include +mi_decl_cache_align bool _mi_cpu_has_fsrm = false; + +static void mi_detect_cpu_features(void) { + // FSRM for fast rep movsb support (AMD Zen3+ (~2020) or Intel Ice Lake+ (~2017)) + int32_t cpu_info[4]; + __cpuid(cpu_info, 7); + _mi_cpu_has_fsrm = ((cpu_info[3] & (1 << 4)) != 0); // bit 4 of EDX : see +} +#else +static void mi_detect_cpu_features(void) { + // nothing +} +#endif + +// Initialize the process; called by thread_init or the process loader +void mi_process_init(void) mi_attr_noexcept { + // ensure we are called once + static mi_atomic_once_t process_init; + #if _MSC_VER < 1920 + mi_heap_main_init(); // vs2017 can dynamically re-initialize _mi_heap_main + #endif + if (!mi_atomic_once(&process_init)) return; + _mi_process_is_initialized = true; + _mi_verbose_message("process init: 0x%zx\n", _mi_thread_id()); + mi_process_setup_auto_thread_done(); + + mi_detect_cpu_features(); + _mi_os_init(); + mi_heap_main_init(); + #if MI_DEBUG + _mi_verbose_message("debug level : %d\n", MI_DEBUG); + #endif + _mi_verbose_message("secure level: %d\n", MI_SECURE); + _mi_verbose_message("mem tracking: %s\n", MI_TRACK_TOOL); + #if MI_TSAN + _mi_verbose_message("thread santizer enabled\n"); + #endif + mi_thread_init(); + + #if defined(_WIN32) + // On windows, when building as a static lib the FLS cleanup happens to early for the main thread. + // To avoid this, set the FLS value for the main thread to NULL so the fls cleanup + // will not call _mi_thread_done on the (still executing) main thread. See issue #508. + _mi_prim_thread_associate_default_heap(NULL); + #endif + + mi_stats_reset(); // only call stat reset *after* thread init (or the heap tld == NULL) + mi_track_init(); + + if (mi_option_is_enabled(mi_option_reserve_huge_os_pages)) { + size_t pages = mi_option_get_clamp(mi_option_reserve_huge_os_pages, 0, 128*1024); + long reserve_at = mi_option_get(mi_option_reserve_huge_os_pages_at); + if (reserve_at != -1) { + mi_reserve_huge_os_pages_at(pages, reserve_at, pages*500); + } else { + mi_reserve_huge_os_pages_interleave(pages, 0, pages*500); + } + } + if (mi_option_is_enabled(mi_option_reserve_os_memory)) { + long ksize = mi_option_get(mi_option_reserve_os_memory); + if (ksize > 0) { + mi_reserve_os_memory((size_t)ksize*MI_KiB, true /* commit? */, true /* allow large pages? */); + } + } +} + +// Called when the process is done (through `at_exit`) +static void mi_cdecl mi_process_done(void) { + // only shutdown if we were initialized + if (!_mi_process_is_initialized) return; + // ensure we are called once + static bool process_done = false; + if (process_done) return; + process_done = true; + + // release any thread specific resources and ensure _mi_thread_done is called on all but the main thread + _mi_prim_thread_done_auto_done(); + + #ifndef MI_SKIP_COLLECT_ON_EXIT + #if (MI_DEBUG || !defined(MI_SHARED_LIB)) + // free all memory if possible on process exit. This is not needed for a stand-alone process + // but should be done if mimalloc is statically linked into another shared library which + // is repeatedly loaded/unloaded, see issue #281. + mi_collect(true /* force */ ); + #endif + #endif + + // Forcefully release all retained memory; this can be dangerous in general if overriding regular malloc/free + // since after process_done there might still be other code running that calls `free` (like at_exit routines, + // or C-runtime termination code. + if (mi_option_is_enabled(mi_option_destroy_on_exit)) { + mi_collect(true /* force */); + _mi_heap_unsafe_destroy_all(); // forcefully release all memory held by all heaps (of this thread only!) + _mi_arena_unsafe_destroy_all(& _mi_heap_main_get()->tld->stats); + } + + if (mi_option_is_enabled(mi_option_show_stats) || mi_option_is_enabled(mi_option_verbose)) { + mi_stats_print(NULL); + } + mi_allocator_done(); + _mi_verbose_message("process done: 0x%zx\n", _mi_heap_main.thread_id); + os_preloading = true; // don't call the C runtime anymore +} + + + +#if defined(_WIN32) && defined(MI_SHARED_LIB) + // Windows DLL: easy to hook into process_init and thread_done + __declspec(dllexport) BOOL WINAPI DllMain(HINSTANCE inst, DWORD reason, LPVOID reserved) { + MI_UNUSED(reserved); + MI_UNUSED(inst); + if (reason==DLL_PROCESS_ATTACH) { + mi_process_load(); + } + else if (reason==DLL_PROCESS_DETACH) { + mi_process_done(); + } + else if (reason==DLL_THREAD_DETACH) { + if (!mi_is_redirected()) { + mi_thread_done(); + } + } + return TRUE; + } + +#elif defined(_MSC_VER) + // MSVC: use data section magic for static libraries + // See + static int _mi_process_init(void) { + mi_process_load(); + return 0; + } + typedef int(*_mi_crt_callback_t)(void); + #if defined(_M_X64) || defined(_M_ARM64) + __pragma(comment(linker, "/include:" "_mi_msvc_initu")) + #pragma section(".CRT$XIU", long, read) + #else + __pragma(comment(linker, "/include:" "__mi_msvc_initu")) + #endif + #pragma data_seg(".CRT$XIU") + mi_decl_externc _mi_crt_callback_t _mi_msvc_initu[] = { &_mi_process_init }; + #pragma data_seg() + +#elif defined(__cplusplus) + // C++: use static initialization to detect process start + static bool _mi_process_init(void) { + mi_process_load(); + return (_mi_heap_main.thread_id != 0); + } + static bool mi_initialized = _mi_process_init(); + +#elif defined(__GNUC__) || defined(__clang__) + // GCC,Clang: use the constructor attribute + static void __attribute__((constructor)) _mi_process_init(void) { + mi_process_load(); + } + +#else +#pragma message("define a way to call mi_process_load on your platform") +#endif diff --git a/compat/mimalloc/mimalloc.h b/compat/mimalloc/mimalloc.h new file mode 100644 index 00000000000000..c0f5e96e51e975 --- /dev/null +++ b/compat/mimalloc/mimalloc.h @@ -0,0 +1,565 @@ +/* ---------------------------------------------------------------------------- +Copyright (c) 2018-2023, Microsoft Research, Daan Leijen +This is free software; you can redistribute it and/or modify it under the +terms of the MIT license. A copy of the license can be found in the file +"LICENSE" at the root of this distribution. +-----------------------------------------------------------------------------*/ +#pragma once +#ifndef MIMALLOC_H +#define MIMALLOC_H + +#define MI_MALLOC_VERSION 212 // major + 2 digits minor + +// ------------------------------------------------------ +// Compiler specific attributes +// ------------------------------------------------------ + +#ifdef __cplusplus + #if (__cplusplus >= 201103L) || (_MSC_VER > 1900) // C++11 + #define mi_attr_noexcept noexcept + #else + #define mi_attr_noexcept throw() + #endif +#else + #define mi_attr_noexcept +#endif + +#if defined(__cplusplus) && (__cplusplus >= 201703) + #define mi_decl_nodiscard [[nodiscard]] +#elif (defined(__GNUC__) && (__GNUC__ >= 4)) || defined(__clang__) // includes clang, icc, and clang-cl + #define mi_decl_nodiscard __attribute__((warn_unused_result)) +#elif defined(_HAS_NODISCARD) + #define mi_decl_nodiscard _NODISCARD +#elif (_MSC_VER >= 1700) + #define mi_decl_nodiscard _Check_return_ +#else + #define mi_decl_nodiscard +#endif + +#if defined(_MSC_VER) || defined(__MINGW32__) + #if !defined(MI_SHARED_LIB) + #define mi_decl_export + #elif defined(MI_SHARED_LIB_EXPORT) + #define mi_decl_export __declspec(dllexport) + #else + #define mi_decl_export __declspec(dllimport) + #endif + #if defined(__MINGW32__) + #define mi_decl_restrict + #define mi_attr_malloc __attribute__((malloc)) + #else + #if (_MSC_VER >= 1900) && !defined(__EDG__) + #define mi_decl_restrict __declspec(allocator) __declspec(restrict) + #else + #define mi_decl_restrict __declspec(restrict) + #endif + #define mi_attr_malloc + #endif + #define mi_cdecl __cdecl + #define mi_attr_alloc_size(s) + #define mi_attr_alloc_size2(s1,s2) + #define mi_attr_alloc_align(p) +#elif defined(__GNUC__) // includes clang and icc + #if defined(MI_SHARED_LIB) && defined(MI_SHARED_LIB_EXPORT) + #define mi_decl_export __attribute__((visibility("default"))) + #else + #define mi_decl_export + #endif + #define mi_cdecl // leads to warnings... __attribute__((cdecl)) + #define mi_decl_restrict + #define mi_attr_malloc __attribute__((malloc)) + #if (defined(__clang_major__) && (__clang_major__ < 4)) || (__GNUC__ < 5) + #define mi_attr_alloc_size(s) + #define mi_attr_alloc_size2(s1,s2) + #define mi_attr_alloc_align(p) + #elif defined(__INTEL_COMPILER) + #define mi_attr_alloc_size(s) __attribute__((alloc_size(s))) + #define mi_attr_alloc_size2(s1,s2) __attribute__((alloc_size(s1,s2))) + #define mi_attr_alloc_align(p) + #else + #define mi_attr_alloc_size(s) __attribute__((alloc_size(s))) + #define mi_attr_alloc_size2(s1,s2) __attribute__((alloc_size(s1,s2))) + #define mi_attr_alloc_align(p) __attribute__((alloc_align(p))) + #endif +#else + #define mi_cdecl + #define mi_decl_export + #define mi_decl_restrict + #define mi_attr_malloc + #define mi_attr_alloc_size(s) + #define mi_attr_alloc_size2(s1,s2) + #define mi_attr_alloc_align(p) +#endif + +// ------------------------------------------------------ +// Includes +// ------------------------------------------------------ + +#include // size_t +#include // bool +#include // INTPTR_MAX + +#ifdef __cplusplus +extern "C" { +#endif + +// ------------------------------------------------------ +// Standard malloc interface +// ------------------------------------------------------ + +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_malloc(size_t size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1); +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_calloc(size_t count, size_t size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size2(1,2); +mi_decl_nodiscard mi_decl_export void* mi_realloc(void* p, size_t newsize) mi_attr_noexcept mi_attr_alloc_size(2); +mi_decl_export void* mi_expand(void* p, size_t newsize) mi_attr_noexcept mi_attr_alloc_size(2); + +mi_decl_export void mi_free(void* p) mi_attr_noexcept; +mi_decl_nodiscard mi_decl_export mi_decl_restrict char* mi_strdup(const char* s) mi_attr_noexcept mi_attr_malloc; +mi_decl_nodiscard mi_decl_export mi_decl_restrict char* mi_strndup(const char* s, size_t n) mi_attr_noexcept mi_attr_malloc; +mi_decl_nodiscard mi_decl_export mi_decl_restrict char* mi_realpath(const char* fname, char* resolved_name) mi_attr_noexcept mi_attr_malloc; + +// ------------------------------------------------------ +// Extended functionality +// ------------------------------------------------------ +#define MI_SMALL_WSIZE_MAX (128) +#define MI_SMALL_SIZE_MAX (MI_SMALL_WSIZE_MAX*sizeof(void*)) + +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_malloc_small(size_t size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1); +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_zalloc_small(size_t size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1); +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_zalloc(size_t size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1); + +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_mallocn(size_t count, size_t size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size2(1,2); +mi_decl_nodiscard mi_decl_export void* mi_reallocn(void* p, size_t count, size_t size) mi_attr_noexcept mi_attr_alloc_size2(2,3); +mi_decl_nodiscard mi_decl_export void* mi_reallocf(void* p, size_t newsize) mi_attr_noexcept mi_attr_alloc_size(2); + +mi_decl_nodiscard mi_decl_export size_t mi_usable_size(const void* p) mi_attr_noexcept; +mi_decl_nodiscard mi_decl_export size_t mi_good_size(size_t size) mi_attr_noexcept; + + +// ------------------------------------------------------ +// Internals +// ------------------------------------------------------ + +typedef void (mi_cdecl mi_deferred_free_fun)(bool force, unsigned long long heartbeat, void* arg); +mi_decl_export void mi_register_deferred_free(mi_deferred_free_fun* deferred_free, void* arg) mi_attr_noexcept; + +typedef void (mi_cdecl mi_output_fun)(const char* msg, void* arg); +mi_decl_export void mi_register_output(mi_output_fun* out, void* arg) mi_attr_noexcept; + +typedef void (mi_cdecl mi_error_fun)(int err, void* arg); +mi_decl_export void mi_register_error(mi_error_fun* fun, void* arg); + +mi_decl_export void mi_collect(bool force) mi_attr_noexcept; +mi_decl_export int mi_version(void) mi_attr_noexcept; +mi_decl_export void mi_stats_reset(void) mi_attr_noexcept; +mi_decl_export void mi_stats_merge(void) mi_attr_noexcept; +mi_decl_export void mi_stats_print(void* out) mi_attr_noexcept; // backward compatibility: `out` is ignored and should be NULL +mi_decl_export void mi_stats_print_out(mi_output_fun* out, void* arg) mi_attr_noexcept; + +mi_decl_export void mi_process_init(void) mi_attr_noexcept; +mi_decl_export void mi_thread_init(void) mi_attr_noexcept; +mi_decl_export void mi_thread_done(void) mi_attr_noexcept; +mi_decl_export void mi_thread_stats_print_out(mi_output_fun* out, void* arg) mi_attr_noexcept; + +mi_decl_export void mi_process_info(size_t* elapsed_msecs, size_t* user_msecs, size_t* system_msecs, + size_t* current_rss, size_t* peak_rss, + size_t* current_commit, size_t* peak_commit, size_t* page_faults) mi_attr_noexcept; + +// ------------------------------------------------------------------------------------- +// Aligned allocation +// Note that `alignment` always follows `size` for consistency with unaligned +// allocation, but unfortunately this differs from `posix_memalign` and `aligned_alloc`. +// ------------------------------------------------------------------------------------- + +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_malloc_aligned(size_t size, size_t alignment) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1) mi_attr_alloc_align(2); +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_malloc_aligned_at(size_t size, size_t alignment, size_t offset) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1); +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_zalloc_aligned(size_t size, size_t alignment) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1) mi_attr_alloc_align(2); +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_zalloc_aligned_at(size_t size, size_t alignment, size_t offset) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1); +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_calloc_aligned(size_t count, size_t size, size_t alignment) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size2(1,2) mi_attr_alloc_align(3); +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_calloc_aligned_at(size_t count, size_t size, size_t alignment, size_t offset) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size2(1,2); +mi_decl_nodiscard mi_decl_export void* mi_realloc_aligned(void* p, size_t newsize, size_t alignment) mi_attr_noexcept mi_attr_alloc_size(2) mi_attr_alloc_align(3); +mi_decl_nodiscard mi_decl_export void* mi_realloc_aligned_at(void* p, size_t newsize, size_t alignment, size_t offset) mi_attr_noexcept mi_attr_alloc_size(2); + + +// ------------------------------------------------------------------------------------- +// Heaps: first-class, but can only allocate from the same thread that created it. +// ------------------------------------------------------------------------------------- + +struct mi_heap_s; +typedef struct mi_heap_s mi_heap_t; + +mi_decl_nodiscard mi_decl_export mi_heap_t* mi_heap_new(void); +mi_decl_export void mi_heap_delete(mi_heap_t* heap); +mi_decl_export void mi_heap_destroy(mi_heap_t* heap); +mi_decl_export mi_heap_t* mi_heap_set_default(mi_heap_t* heap); +mi_decl_export mi_heap_t* mi_heap_get_default(void); +mi_decl_export mi_heap_t* mi_heap_get_backing(void); +mi_decl_export void mi_heap_collect(mi_heap_t* heap, bool force) mi_attr_noexcept; + +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_heap_malloc(mi_heap_t* heap, size_t size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(2); +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_heap_zalloc(mi_heap_t* heap, size_t size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(2); +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_heap_calloc(mi_heap_t* heap, size_t count, size_t size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size2(2, 3); +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_heap_mallocn(mi_heap_t* heap, size_t count, size_t size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size2(2, 3); +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_heap_malloc_small(mi_heap_t* heap, size_t size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(2); + +mi_decl_nodiscard mi_decl_export void* mi_heap_realloc(mi_heap_t* heap, void* p, size_t newsize) mi_attr_noexcept mi_attr_alloc_size(3); +mi_decl_nodiscard mi_decl_export void* mi_heap_reallocn(mi_heap_t* heap, void* p, size_t count, size_t size) mi_attr_noexcept mi_attr_alloc_size2(3,4); +mi_decl_nodiscard mi_decl_export void* mi_heap_reallocf(mi_heap_t* heap, void* p, size_t newsize) mi_attr_noexcept mi_attr_alloc_size(3); + +mi_decl_nodiscard mi_decl_export mi_decl_restrict char* mi_heap_strdup(mi_heap_t* heap, const char* s) mi_attr_noexcept mi_attr_malloc; +mi_decl_nodiscard mi_decl_export mi_decl_restrict char* mi_heap_strndup(mi_heap_t* heap, const char* s, size_t n) mi_attr_noexcept mi_attr_malloc; +mi_decl_nodiscard mi_decl_export mi_decl_restrict char* mi_heap_realpath(mi_heap_t* heap, const char* fname, char* resolved_name) mi_attr_noexcept mi_attr_malloc; + +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_heap_malloc_aligned(mi_heap_t* heap, size_t size, size_t alignment) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(2) mi_attr_alloc_align(3); +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_heap_malloc_aligned_at(mi_heap_t* heap, size_t size, size_t alignment, size_t offset) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(2); +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_heap_zalloc_aligned(mi_heap_t* heap, size_t size, size_t alignment) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(2) mi_attr_alloc_align(3); +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_heap_zalloc_aligned_at(mi_heap_t* heap, size_t size, size_t alignment, size_t offset) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(2); +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_heap_calloc_aligned(mi_heap_t* heap, size_t count, size_t size, size_t alignment) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size2(2, 3) mi_attr_alloc_align(4); +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_heap_calloc_aligned_at(mi_heap_t* heap, size_t count, size_t size, size_t alignment, size_t offset) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size2(2, 3); +mi_decl_nodiscard mi_decl_export void* mi_heap_realloc_aligned(mi_heap_t* heap, void* p, size_t newsize, size_t alignment) mi_attr_noexcept mi_attr_alloc_size(3) mi_attr_alloc_align(4); +mi_decl_nodiscard mi_decl_export void* mi_heap_realloc_aligned_at(mi_heap_t* heap, void* p, size_t newsize, size_t alignment, size_t offset) mi_attr_noexcept mi_attr_alloc_size(3); + + +// -------------------------------------------------------------------------------- +// Zero initialized re-allocation. +// Only valid on memory that was originally allocated with zero initialization too. +// e.g. `mi_calloc`, `mi_zalloc`, `mi_zalloc_aligned` etc. +// see +// -------------------------------------------------------------------------------- + +mi_decl_nodiscard mi_decl_export void* mi_rezalloc(void* p, size_t newsize) mi_attr_noexcept mi_attr_alloc_size(2); +mi_decl_nodiscard mi_decl_export void* mi_recalloc(void* p, size_t newcount, size_t size) mi_attr_noexcept mi_attr_alloc_size2(2,3); + +mi_decl_nodiscard mi_decl_export void* mi_rezalloc_aligned(void* p, size_t newsize, size_t alignment) mi_attr_noexcept mi_attr_alloc_size(2) mi_attr_alloc_align(3); +mi_decl_nodiscard mi_decl_export void* mi_rezalloc_aligned_at(void* p, size_t newsize, size_t alignment, size_t offset) mi_attr_noexcept mi_attr_alloc_size(2); +mi_decl_nodiscard mi_decl_export void* mi_recalloc_aligned(void* p, size_t newcount, size_t size, size_t alignment) mi_attr_noexcept mi_attr_alloc_size2(2,3) mi_attr_alloc_align(4); +mi_decl_nodiscard mi_decl_export void* mi_recalloc_aligned_at(void* p, size_t newcount, size_t size, size_t alignment, size_t offset) mi_attr_noexcept mi_attr_alloc_size2(2,3); + +mi_decl_nodiscard mi_decl_export void* mi_heap_rezalloc(mi_heap_t* heap, void* p, size_t newsize) mi_attr_noexcept mi_attr_alloc_size(3); +mi_decl_nodiscard mi_decl_export void* mi_heap_recalloc(mi_heap_t* heap, void* p, size_t newcount, size_t size) mi_attr_noexcept mi_attr_alloc_size2(3,4); + +mi_decl_nodiscard mi_decl_export void* mi_heap_rezalloc_aligned(mi_heap_t* heap, void* p, size_t newsize, size_t alignment) mi_attr_noexcept mi_attr_alloc_size(3) mi_attr_alloc_align(4); +mi_decl_nodiscard mi_decl_export void* mi_heap_rezalloc_aligned_at(mi_heap_t* heap, void* p, size_t newsize, size_t alignment, size_t offset) mi_attr_noexcept mi_attr_alloc_size(3); +mi_decl_nodiscard mi_decl_export void* mi_heap_recalloc_aligned(mi_heap_t* heap, void* p, size_t newcount, size_t size, size_t alignment) mi_attr_noexcept mi_attr_alloc_size2(3,4) mi_attr_alloc_align(5); +mi_decl_nodiscard mi_decl_export void* mi_heap_recalloc_aligned_at(mi_heap_t* heap, void* p, size_t newcount, size_t size, size_t alignment, size_t offset) mi_attr_noexcept mi_attr_alloc_size2(3,4); + + +// ------------------------------------------------------ +// Analysis +// ------------------------------------------------------ + +mi_decl_export bool mi_heap_contains_block(mi_heap_t* heap, const void* p); +mi_decl_export bool mi_heap_check_owned(mi_heap_t* heap, const void* p); +mi_decl_export bool mi_check_owned(const void* p); + +// An area of heap space contains blocks of a single size. +typedef struct mi_heap_area_s { + void* blocks; // start of the area containing heap blocks + size_t reserved; // bytes reserved for this area (virtual) + size_t committed; // current available bytes for this area + size_t used; // number of allocated blocks + size_t block_size; // size in bytes of each block + size_t full_block_size; // size in bytes of a full block including padding and metadata. +} mi_heap_area_t; + +typedef bool (mi_cdecl mi_block_visit_fun)(const mi_heap_t* heap, const mi_heap_area_t* area, void* block, size_t block_size, void* arg); + +mi_decl_export bool mi_heap_visit_blocks(const mi_heap_t* heap, bool visit_all_blocks, mi_block_visit_fun* visitor, void* arg); + +// Experimental +mi_decl_nodiscard mi_decl_export bool mi_is_in_heap_region(const void* p) mi_attr_noexcept; +mi_decl_nodiscard mi_decl_export bool mi_is_redirected(void) mi_attr_noexcept; + +mi_decl_export int mi_reserve_huge_os_pages_interleave(size_t pages, size_t numa_nodes, size_t timeout_msecs) mi_attr_noexcept; +mi_decl_export int mi_reserve_huge_os_pages_at(size_t pages, int numa_node, size_t timeout_msecs) mi_attr_noexcept; + +mi_decl_export int mi_reserve_os_memory(size_t size, bool commit, bool allow_large) mi_attr_noexcept; +mi_decl_export bool mi_manage_os_memory(void* start, size_t size, bool is_committed, bool is_large, bool is_zero, int numa_node) mi_attr_noexcept; + +mi_decl_export void mi_debug_show_arenas(void) mi_attr_noexcept; + +// Experimental: heaps associated with specific memory arena's +typedef int mi_arena_id_t; +mi_decl_export void* mi_arena_area(mi_arena_id_t arena_id, size_t* size); +mi_decl_export int mi_reserve_huge_os_pages_at_ex(size_t pages, int numa_node, size_t timeout_msecs, bool exclusive, mi_arena_id_t* arena_id) mi_attr_noexcept; +mi_decl_export int mi_reserve_os_memory_ex(size_t size, bool commit, bool allow_large, bool exclusive, mi_arena_id_t* arena_id) mi_attr_noexcept; +mi_decl_export bool mi_manage_os_memory_ex(void* start, size_t size, bool is_committed, bool is_large, bool is_zero, int numa_node, bool exclusive, mi_arena_id_t* arena_id) mi_attr_noexcept; + +#if MI_MALLOC_VERSION >= 182 +// Create a heap that only allocates in the specified arena +mi_decl_nodiscard mi_decl_export mi_heap_t* mi_heap_new_in_arena(mi_arena_id_t arena_id); +#endif + +// deprecated +mi_decl_export int mi_reserve_huge_os_pages(size_t pages, double max_secs, size_t* pages_reserved) mi_attr_noexcept; + + +// ------------------------------------------------------ +// Convenience +// ------------------------------------------------------ + +#define mi_malloc_tp(tp) ((tp*)mi_malloc(sizeof(tp))) +#define mi_zalloc_tp(tp) ((tp*)mi_zalloc(sizeof(tp))) +#define mi_calloc_tp(tp,n) ((tp*)mi_calloc(n,sizeof(tp))) +#define mi_mallocn_tp(tp,n) ((tp*)mi_mallocn(n,sizeof(tp))) +#define mi_reallocn_tp(p,tp,n) ((tp*)mi_reallocn(p,n,sizeof(tp))) +#define mi_recalloc_tp(p,tp,n) ((tp*)mi_recalloc(p,n,sizeof(tp))) + +#define mi_heap_malloc_tp(hp,tp) ((tp*)mi_heap_malloc(hp,sizeof(tp))) +#define mi_heap_zalloc_tp(hp,tp) ((tp*)mi_heap_zalloc(hp,sizeof(tp))) +#define mi_heap_calloc_tp(hp,tp,n) ((tp*)mi_heap_calloc(hp,n,sizeof(tp))) +#define mi_heap_mallocn_tp(hp,tp,n) ((tp*)mi_heap_mallocn(hp,n,sizeof(tp))) +#define mi_heap_reallocn_tp(hp,p,tp,n) ((tp*)mi_heap_reallocn(hp,p,n,sizeof(tp))) +#define mi_heap_recalloc_tp(hp,p,tp,n) ((tp*)mi_heap_recalloc(hp,p,n,sizeof(tp))) + + +// ------------------------------------------------------ +// Options +// ------------------------------------------------------ + +typedef enum mi_option_e { + // stable options + mi_option_show_errors, // print error messages + mi_option_show_stats, // print statistics on termination + mi_option_verbose, // print verbose messages + // the following options are experimental (see src/options.h) + mi_option_eager_commit, // eager commit segments? (after `eager_commit_delay` segments) (=1) + mi_option_arena_eager_commit, // eager commit arenas? Use 2 to enable just on overcommit systems (=2) + mi_option_purge_decommits, // should a memory purge decommit (or only reset) (=1) + mi_option_allow_large_os_pages, // allow large (2MiB) OS pages, implies eager commit + mi_option_reserve_huge_os_pages, // reserve N huge OS pages (1GiB/page) at startup + mi_option_reserve_huge_os_pages_at, // reserve huge OS pages at a specific NUMA node + mi_option_reserve_os_memory, // reserve specified amount of OS memory in an arena at startup + mi_option_deprecated_segment_cache, + mi_option_deprecated_page_reset, + mi_option_abandoned_page_purge, // immediately purge delayed purges on thread termination + mi_option_deprecated_segment_reset, + mi_option_eager_commit_delay, + mi_option_purge_delay, // memory purging is delayed by N milli seconds; use 0 for immediate purging or -1 for no purging at all. + mi_option_use_numa_nodes, // 0 = use all available numa nodes, otherwise use at most N nodes. + mi_option_limit_os_alloc, // 1 = do not use OS memory for allocation (but only programmatically reserved arenas) + mi_option_os_tag, // tag used for OS logging (macOS only for now) + mi_option_max_errors, // issue at most N error messages + mi_option_max_warnings, // issue at most N warning messages + mi_option_max_segment_reclaim, + mi_option_destroy_on_exit, // if set, release all memory on exit; sometimes used for dynamic unloading but can be unsafe. + mi_option_arena_reserve, // initial memory size in KiB for arena reservation (1GiB on 64-bit) + mi_option_arena_purge_mult, + mi_option_purge_extend_delay, + _mi_option_last, + // legacy option names + mi_option_large_os_pages = mi_option_allow_large_os_pages, + mi_option_eager_region_commit = mi_option_arena_eager_commit, + mi_option_reset_decommits = mi_option_purge_decommits, + mi_option_reset_delay = mi_option_purge_delay, + mi_option_abandoned_page_reset = mi_option_abandoned_page_purge +} mi_option_t; + + +mi_decl_nodiscard mi_decl_export bool mi_option_is_enabled(mi_option_t option); +mi_decl_export void mi_option_enable(mi_option_t option); +mi_decl_export void mi_option_disable(mi_option_t option); +mi_decl_export void mi_option_set_enabled(mi_option_t option, bool enable); +mi_decl_export void mi_option_set_enabled_default(mi_option_t option, bool enable); + +mi_decl_nodiscard mi_decl_export long mi_option_get(mi_option_t option); +mi_decl_nodiscard mi_decl_export long mi_option_get_clamp(mi_option_t option, long min, long max); +mi_decl_nodiscard mi_decl_export size_t mi_option_get_size(mi_option_t option); +mi_decl_export void mi_option_set(mi_option_t option, long value); +mi_decl_export void mi_option_set_default(mi_option_t option, long value); + + +// ------------------------------------------------------------------------------------------------------- +// "mi" prefixed implementations of various posix, Unix, Windows, and C++ allocation functions. +// (This can be convenient when providing overrides of these functions as done in `mimalloc-override.h`.) +// note: we use `mi_cfree` as "checked free" and it checks if the pointer is in our heap before free-ing. +// ------------------------------------------------------------------------------------------------------- + +mi_decl_export void mi_cfree(void* p) mi_attr_noexcept; +mi_decl_export void* mi__expand(void* p, size_t newsize) mi_attr_noexcept; +mi_decl_nodiscard mi_decl_export size_t mi_malloc_size(const void* p) mi_attr_noexcept; +mi_decl_nodiscard mi_decl_export size_t mi_malloc_good_size(size_t size) mi_attr_noexcept; +mi_decl_nodiscard mi_decl_export size_t mi_malloc_usable_size(const void *p) mi_attr_noexcept; + +mi_decl_export int mi_posix_memalign(void** p, size_t alignment, size_t size) mi_attr_noexcept; +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_memalign(size_t alignment, size_t size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(2) mi_attr_alloc_align(1); +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_valloc(size_t size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1); +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_pvalloc(size_t size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1); +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_aligned_alloc(size_t alignment, size_t size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(2) mi_attr_alloc_align(1); + +mi_decl_nodiscard mi_decl_export void* mi_reallocarray(void* p, size_t count, size_t size) mi_attr_noexcept mi_attr_alloc_size2(2,3); +mi_decl_nodiscard mi_decl_export int mi_reallocarr(void* p, size_t count, size_t size) mi_attr_noexcept; +mi_decl_nodiscard mi_decl_export void* mi_aligned_recalloc(void* p, size_t newcount, size_t size, size_t alignment) mi_attr_noexcept; +mi_decl_nodiscard mi_decl_export void* mi_aligned_offset_recalloc(void* p, size_t newcount, size_t size, size_t alignment, size_t offset) mi_attr_noexcept; + +mi_decl_nodiscard mi_decl_export mi_decl_restrict unsigned short* mi_wcsdup(const unsigned short* s) mi_attr_noexcept mi_attr_malloc; +mi_decl_nodiscard mi_decl_export mi_decl_restrict unsigned char* mi_mbsdup(const unsigned char* s) mi_attr_noexcept mi_attr_malloc; +mi_decl_export int mi_dupenv_s(char** buf, size_t* size, const char* name) mi_attr_noexcept; +mi_decl_export int mi_wdupenv_s(unsigned short** buf, size_t* size, const unsigned short* name) mi_attr_noexcept; + +mi_decl_export void mi_free_size(void* p, size_t size) mi_attr_noexcept; +mi_decl_export void mi_free_size_aligned(void* p, size_t size, size_t alignment) mi_attr_noexcept; +mi_decl_export void mi_free_aligned(void* p, size_t alignment) mi_attr_noexcept; + +// The `mi_new` wrappers implement C++ semantics on out-of-memory instead of directly returning `NULL`. +// (and call `std::get_new_handler` and potentially raise a `std::bad_alloc` exception). +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_new(size_t size) mi_attr_malloc mi_attr_alloc_size(1); +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_new_aligned(size_t size, size_t alignment) mi_attr_malloc mi_attr_alloc_size(1) mi_attr_alloc_align(2); +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_new_nothrow(size_t size) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1); +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_new_aligned_nothrow(size_t size, size_t alignment) mi_attr_noexcept mi_attr_malloc mi_attr_alloc_size(1) mi_attr_alloc_align(2); +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_new_n(size_t count, size_t size) mi_attr_malloc mi_attr_alloc_size2(1, 2); +mi_decl_nodiscard mi_decl_export void* mi_new_realloc(void* p, size_t newsize) mi_attr_alloc_size(2); +mi_decl_nodiscard mi_decl_export void* mi_new_reallocn(void* p, size_t newcount, size_t size) mi_attr_alloc_size2(2, 3); + +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_heap_alloc_new(mi_heap_t* heap, size_t size) mi_attr_malloc mi_attr_alloc_size(2); +mi_decl_nodiscard mi_decl_export mi_decl_restrict void* mi_heap_alloc_new_n(mi_heap_t* heap, size_t count, size_t size) mi_attr_malloc mi_attr_alloc_size2(2, 3); + +#ifdef __cplusplus +} +#endif + +// --------------------------------------------------------------------------------------------- +// Implement the C++ std::allocator interface for use in STL containers. +// (note: see `mimalloc-new-delete.h` for overriding the new/delete operators globally) +// --------------------------------------------------------------------------------------------- +#ifdef __cplusplus + +#include // std::size_t +#include // PTRDIFF_MAX +#if (__cplusplus >= 201103L) || (_MSC_VER > 1900) // C++11 +#include // std::true_type +#include // std::forward +#endif + +template struct _mi_stl_allocator_common { + typedef T value_type; + typedef std::size_t size_type; + typedef std::ptrdiff_t difference_type; + typedef value_type& reference; + typedef value_type const& const_reference; + typedef value_type* pointer; + typedef value_type const* const_pointer; + + #if ((__cplusplus >= 201103L) || (_MSC_VER > 1900)) // C++11 + using propagate_on_container_copy_assignment = std::true_type; + using propagate_on_container_move_assignment = std::true_type; + using propagate_on_container_swap = std::true_type; + template void construct(U* p, Args&& ...args) { ::new(p) U(std::forward(args)...); } + template void destroy(U* p) mi_attr_noexcept { p->~U(); } + #else + void construct(pointer p, value_type const& val) { ::new(p) value_type(val); } + void destroy(pointer p) { p->~value_type(); } + #endif + + size_type max_size() const mi_attr_noexcept { return (PTRDIFF_MAX/sizeof(value_type)); } + pointer address(reference x) const { return &x; } + const_pointer address(const_reference x) const { return &x; } +}; + +template struct mi_stl_allocator : public _mi_stl_allocator_common { + using typename _mi_stl_allocator_common::size_type; + using typename _mi_stl_allocator_common::value_type; + using typename _mi_stl_allocator_common::pointer; + template struct rebind { typedef mi_stl_allocator other; }; + + mi_stl_allocator() mi_attr_noexcept = default; + mi_stl_allocator(const mi_stl_allocator&) mi_attr_noexcept = default; + template mi_stl_allocator(const mi_stl_allocator&) mi_attr_noexcept { } + mi_stl_allocator select_on_container_copy_construction() const { return *this; } + void deallocate(T* p, size_type) { mi_free(p); } + + #if (__cplusplus >= 201703L) // C++17 + mi_decl_nodiscard T* allocate(size_type count) { return static_cast(mi_new_n(count, sizeof(T))); } + mi_decl_nodiscard T* allocate(size_type count, const void*) { return allocate(count); } + #else + mi_decl_nodiscard pointer allocate(size_type count, const void* = 0) { return static_cast(mi_new_n(count, sizeof(value_type))); } + #endif + + #if ((__cplusplus >= 201103L) || (_MSC_VER > 1900)) // C++11 + using is_always_equal = std::true_type; + #endif +}; + +template bool operator==(const mi_stl_allocator& , const mi_stl_allocator& ) mi_attr_noexcept { return true; } +template bool operator!=(const mi_stl_allocator& , const mi_stl_allocator& ) mi_attr_noexcept { return false; } + + +#if (__cplusplus >= 201103L) || (_MSC_VER >= 1900) // C++11 +#define MI_HAS_HEAP_STL_ALLOCATOR 1 + +#include // std::shared_ptr + +// Common base class for STL allocators in a specific heap +template struct _mi_heap_stl_allocator_common : public _mi_stl_allocator_common { + using typename _mi_stl_allocator_common::size_type; + using typename _mi_stl_allocator_common::value_type; + using typename _mi_stl_allocator_common::pointer; + + _mi_heap_stl_allocator_common(mi_heap_t* hp) : heap(hp) { } /* will not delete nor destroy the passed in heap */ + + #if (__cplusplus >= 201703L) // C++17 + mi_decl_nodiscard T* allocate(size_type count) { return static_cast(mi_heap_alloc_new_n(this->heap.get(), count, sizeof(T))); } + mi_decl_nodiscard T* allocate(size_type count, const void*) { return allocate(count); } + #else + mi_decl_nodiscard pointer allocate(size_type count, const void* = 0) { return static_cast(mi_heap_alloc_new_n(this->heap.get(), count, sizeof(value_type))); } + #endif + + #if ((__cplusplus >= 201103L) || (_MSC_VER > 1900)) // C++11 + using is_always_equal = std::false_type; + #endif + + void collect(bool force) { mi_heap_collect(this->heap.get(), force); } + template bool is_equal(const _mi_heap_stl_allocator_common& x) const { return (this->heap == x.heap); } + +protected: + std::shared_ptr heap; + template friend struct _mi_heap_stl_allocator_common; + + _mi_heap_stl_allocator_common() { + mi_heap_t* hp = mi_heap_new(); + this->heap.reset(hp, (_mi_destroy ? &heap_destroy : &heap_delete)); /* calls heap_delete/destroy when the refcount drops to zero */ + } + _mi_heap_stl_allocator_common(const _mi_heap_stl_allocator_common& x) mi_attr_noexcept : heap(x.heap) { } + template _mi_heap_stl_allocator_common(const _mi_heap_stl_allocator_common& x) mi_attr_noexcept : heap(x.heap) { } + +private: + static void heap_delete(mi_heap_t* hp) { if (hp != NULL) { mi_heap_delete(hp); } } + static void heap_destroy(mi_heap_t* hp) { if (hp != NULL) { mi_heap_destroy(hp); } } +}; + +// STL allocator allocation in a specific heap +template struct mi_heap_stl_allocator : public _mi_heap_stl_allocator_common { + using typename _mi_heap_stl_allocator_common::size_type; + mi_heap_stl_allocator() : _mi_heap_stl_allocator_common() { } // creates fresh heap that is deleted when the destructor is called + mi_heap_stl_allocator(mi_heap_t* hp) : _mi_heap_stl_allocator_common(hp) { } // no delete nor destroy on the passed in heap + template mi_heap_stl_allocator(const mi_heap_stl_allocator& x) mi_attr_noexcept : _mi_heap_stl_allocator_common(x) { } + + mi_heap_stl_allocator select_on_container_copy_construction() const { return *this; } + void deallocate(T* p, size_type) { mi_free(p); } + template struct rebind { typedef mi_heap_stl_allocator other; }; +}; + +template bool operator==(const mi_heap_stl_allocator& x, const mi_heap_stl_allocator& y) mi_attr_noexcept { return (x.is_equal(y)); } +template bool operator!=(const mi_heap_stl_allocator& x, const mi_heap_stl_allocator& y) mi_attr_noexcept { return (!x.is_equal(y)); } + + +// STL allocator allocation in a specific heap, where `free` does nothing and +// the heap is destroyed in one go on destruction -- use with care! +template struct mi_heap_destroy_stl_allocator : public _mi_heap_stl_allocator_common { + using typename _mi_heap_stl_allocator_common::size_type; + mi_heap_destroy_stl_allocator() : _mi_heap_stl_allocator_common() { } // creates fresh heap that is destroyed when the destructor is called + mi_heap_destroy_stl_allocator(mi_heap_t* hp) : _mi_heap_stl_allocator_common(hp) { } // no delete nor destroy on the passed in heap + template mi_heap_destroy_stl_allocator(const mi_heap_destroy_stl_allocator& x) mi_attr_noexcept : _mi_heap_stl_allocator_common(x) { } + + mi_heap_destroy_stl_allocator select_on_container_copy_construction() const { return *this; } + void deallocate(T*, size_type) { /* do nothing as we destroy the heap on destruct. */ } + template struct rebind { typedef mi_heap_destroy_stl_allocator other; }; +}; + +template bool operator==(const mi_heap_destroy_stl_allocator& x, const mi_heap_destroy_stl_allocator& y) mi_attr_noexcept { return (x.is_equal(y)); } +template bool operator!=(const mi_heap_destroy_stl_allocator& x, const mi_heap_destroy_stl_allocator& y) mi_attr_noexcept { return (!x.is_equal(y)); } + +#endif // C++11 + +#endif // __cplusplus + +#endif diff --git a/compat/mimalloc/mimalloc/atomic.h b/compat/mimalloc/mimalloc/atomic.h new file mode 100644 index 00000000000000..c6b8146ffdb049 --- /dev/null +++ b/compat/mimalloc/mimalloc/atomic.h @@ -0,0 +1,385 @@ +/* ---------------------------------------------------------------------------- +Copyright (c) 2018-2023 Microsoft Research, Daan Leijen +This is free software; you can redistribute it and/or modify it under the +terms of the MIT license. A copy of the license can be found in the file +"LICENSE" at the root of this distribution. +-----------------------------------------------------------------------------*/ +#pragma once +#ifndef MIMALLOC_ATOMIC_H +#define MIMALLOC_ATOMIC_H + +// -------------------------------------------------------------------------------------------- +// Atomics +// We need to be portable between C, C++, and MSVC. +// We base the primitives on the C/C++ atomics and create a mimimal wrapper for MSVC in C compilation mode. +// This is why we try to use only `uintptr_t` and `*` as atomic types. +// To gain better insight in the range of used atomics, we use explicitly named memory order operations +// instead of passing the memory order as a parameter. +// ----------------------------------------------------------------------------------------------- + +#if defined(__cplusplus) +// Use C++ atomics +#include +#define _Atomic(tp) std::atomic +#define mi_atomic(name) std::atomic_##name +#define mi_memory_order(name) std::memory_order_##name +#if !defined(ATOMIC_VAR_INIT) || (__cplusplus >= 202002L) // c++20, see issue #571 + #define MI_ATOMIC_VAR_INIT(x) x +#else + #define MI_ATOMIC_VAR_INIT(x) ATOMIC_VAR_INIT(x) +#endif +#elif defined(_MSC_VER) +// Use MSVC C wrapper for C11 atomics +#define _Atomic(tp) tp +#define MI_ATOMIC_VAR_INIT(x) x +#define mi_atomic(name) mi_atomic_##name +#define mi_memory_order(name) mi_memory_order_##name +#else +// Use C11 atomics +#include +#define mi_atomic(name) atomic_##name +#define mi_memory_order(name) memory_order_##name +#if !defined(ATOMIC_VAR_INIT) || (__STDC_VERSION__ >= 201710L) // c17, see issue #735 + #define MI_ATOMIC_VAR_INIT(x) x +#else + #define MI_ATOMIC_VAR_INIT(x) ATOMIC_VAR_INIT(x) +#endif +#endif + +// Various defines for all used memory orders in mimalloc +#define mi_atomic_cas_weak(p,expected,desired,mem_success,mem_fail) \ + mi_atomic(compare_exchange_weak_explicit)(p,expected,desired,mem_success,mem_fail) + +#define mi_atomic_cas_strong(p,expected,desired,mem_success,mem_fail) \ + mi_atomic(compare_exchange_strong_explicit)(p,expected,desired,mem_success,mem_fail) + +#define mi_atomic_load_acquire(p) mi_atomic(load_explicit)(p,mi_memory_order(acquire)) +#define mi_atomic_load_relaxed(p) mi_atomic(load_explicit)(p,mi_memory_order(relaxed)) +#define mi_atomic_store_release(p,x) mi_atomic(store_explicit)(p,x,mi_memory_order(release)) +#define mi_atomic_store_relaxed(p,x) mi_atomic(store_explicit)(p,x,mi_memory_order(relaxed)) +#define mi_atomic_exchange_release(p,x) mi_atomic(exchange_explicit)(p,x,mi_memory_order(release)) +#define mi_atomic_exchange_acq_rel(p,x) mi_atomic(exchange_explicit)(p,x,mi_memory_order(acq_rel)) +#define mi_atomic_cas_weak_release(p,exp,des) mi_atomic_cas_weak(p,exp,des,mi_memory_order(release),mi_memory_order(relaxed)) +#define mi_atomic_cas_weak_acq_rel(p,exp,des) mi_atomic_cas_weak(p,exp,des,mi_memory_order(acq_rel),mi_memory_order(acquire)) +#define mi_atomic_cas_strong_release(p,exp,des) mi_atomic_cas_strong(p,exp,des,mi_memory_order(release),mi_memory_order(relaxed)) +#define mi_atomic_cas_strong_acq_rel(p,exp,des) mi_atomic_cas_strong(p,exp,des,mi_memory_order(acq_rel),mi_memory_order(acquire)) + +#define mi_atomic_add_relaxed(p,x) mi_atomic(fetch_add_explicit)(p,x,mi_memory_order(relaxed)) +#define mi_atomic_sub_relaxed(p,x) mi_atomic(fetch_sub_explicit)(p,x,mi_memory_order(relaxed)) +#define mi_atomic_add_acq_rel(p,x) mi_atomic(fetch_add_explicit)(p,x,mi_memory_order(acq_rel)) +#define mi_atomic_sub_acq_rel(p,x) mi_atomic(fetch_sub_explicit)(p,x,mi_memory_order(acq_rel)) +#define mi_atomic_and_acq_rel(p,x) mi_atomic(fetch_and_explicit)(p,x,mi_memory_order(acq_rel)) +#define mi_atomic_or_acq_rel(p,x) mi_atomic(fetch_or_explicit)(p,x,mi_memory_order(acq_rel)) + +#define mi_atomic_increment_relaxed(p) mi_atomic_add_relaxed(p,(uintptr_t)1) +#define mi_atomic_decrement_relaxed(p) mi_atomic_sub_relaxed(p,(uintptr_t)1) +#define mi_atomic_increment_acq_rel(p) mi_atomic_add_acq_rel(p,(uintptr_t)1) +#define mi_atomic_decrement_acq_rel(p) mi_atomic_sub_acq_rel(p,(uintptr_t)1) + +static inline void mi_atomic_yield(void); +static inline intptr_t mi_atomic_addi(_Atomic(intptr_t)*p, intptr_t add); +static inline intptr_t mi_atomic_subi(_Atomic(intptr_t)*p, intptr_t sub); + + +#if defined(__cplusplus) || !defined(_MSC_VER) + +// In C++/C11 atomics we have polymorphic atomics so can use the typed `ptr` variants (where `tp` is the type of atomic value) +// We use these macros so we can provide a typed wrapper in MSVC in C compilation mode as well +#define mi_atomic_load_ptr_acquire(tp,p) mi_atomic_load_acquire(p) +#define mi_atomic_load_ptr_relaxed(tp,p) mi_atomic_load_relaxed(p) + +// In C++ we need to add casts to help resolve templates if NULL is passed +#if defined(__cplusplus) +#define mi_atomic_store_ptr_release(tp,p,x) mi_atomic_store_release(p,(tp*)x) +#define mi_atomic_store_ptr_relaxed(tp,p,x) mi_atomic_store_relaxed(p,(tp*)x) +#define mi_atomic_cas_ptr_weak_release(tp,p,exp,des) mi_atomic_cas_weak_release(p,exp,(tp*)des) +#define mi_atomic_cas_ptr_weak_acq_rel(tp,p,exp,des) mi_atomic_cas_weak_acq_rel(p,exp,(tp*)des) +#define mi_atomic_cas_ptr_strong_release(tp,p,exp,des) mi_atomic_cas_strong_release(p,exp,(tp*)des) +#define mi_atomic_exchange_ptr_release(tp,p,x) mi_atomic_exchange_release(p,(tp*)x) +#define mi_atomic_exchange_ptr_acq_rel(tp,p,x) mi_atomic_exchange_acq_rel(p,(tp*)x) +#else +#define mi_atomic_store_ptr_release(tp,p,x) mi_atomic_store_release(p,x) +#define mi_atomic_store_ptr_relaxed(tp,p,x) mi_atomic_store_relaxed(p,x) +#define mi_atomic_cas_ptr_weak_release(tp,p,exp,des) mi_atomic_cas_weak_release(p,exp,des) +#define mi_atomic_cas_ptr_weak_acq_rel(tp,p,exp,des) mi_atomic_cas_weak_acq_rel(p,exp,des) +#define mi_atomic_cas_ptr_strong_release(tp,p,exp,des) mi_atomic_cas_strong_release(p,exp,des) +#define mi_atomic_exchange_ptr_release(tp,p,x) mi_atomic_exchange_release(p,x) +#define mi_atomic_exchange_ptr_acq_rel(tp,p,x) mi_atomic_exchange_acq_rel(p,x) +#endif + +// These are used by the statistics +static inline int64_t mi_atomic_addi64_relaxed(volatile int64_t* p, int64_t add) { + return mi_atomic(fetch_add_explicit)((_Atomic(int64_t)*)p, add, mi_memory_order(relaxed)); +} +static inline void mi_atomic_maxi64_relaxed(volatile int64_t* p, int64_t x) { + int64_t current = mi_atomic_load_relaxed((_Atomic(int64_t)*)p); + while (current < x && !mi_atomic_cas_weak_release((_Atomic(int64_t)*)p, ¤t, x)) { /* nothing */ }; +} + +// Used by timers +#define mi_atomic_loadi64_acquire(p) mi_atomic(load_explicit)(p,mi_memory_order(acquire)) +#define mi_atomic_loadi64_relaxed(p) mi_atomic(load_explicit)(p,mi_memory_order(relaxed)) +#define mi_atomic_storei64_release(p,x) mi_atomic(store_explicit)(p,x,mi_memory_order(release)) +#define mi_atomic_storei64_relaxed(p,x) mi_atomic(store_explicit)(p,x,mi_memory_order(relaxed)) + +#define mi_atomic_casi64_strong_acq_rel(p,e,d) mi_atomic_cas_strong_acq_rel(p,e,d) +#define mi_atomic_addi64_acq_rel(p,i) mi_atomic_add_acq_rel(p,i) + + +#elif defined(_MSC_VER) + +// MSVC C compilation wrapper that uses Interlocked operations to model C11 atomics. +#define WIN32_LEAN_AND_MEAN +#include +#include +#ifdef _WIN64 +typedef LONG64 msc_intptr_t; +#define MI_64(f) f##64 +#else +typedef LONG msc_intptr_t; +#define MI_64(f) f +#endif + +typedef enum mi_memory_order_e { + mi_memory_order_relaxed, + mi_memory_order_consume, + mi_memory_order_acquire, + mi_memory_order_release, + mi_memory_order_acq_rel, + mi_memory_order_seq_cst +} mi_memory_order; + +static inline uintptr_t mi_atomic_fetch_add_explicit(_Atomic(uintptr_t)*p, uintptr_t add, mi_memory_order mo) { + (void)(mo); + return (uintptr_t)MI_64(_InterlockedExchangeAdd)((volatile msc_intptr_t*)p, (msc_intptr_t)add); +} +static inline uintptr_t mi_atomic_fetch_sub_explicit(_Atomic(uintptr_t)*p, uintptr_t sub, mi_memory_order mo) { + (void)(mo); + return (uintptr_t)MI_64(_InterlockedExchangeAdd)((volatile msc_intptr_t*)p, -((msc_intptr_t)sub)); +} +static inline uintptr_t mi_atomic_fetch_and_explicit(_Atomic(uintptr_t)*p, uintptr_t x, mi_memory_order mo) { + (void)(mo); + return (uintptr_t)MI_64(_InterlockedAnd)((volatile msc_intptr_t*)p, (msc_intptr_t)x); +} +static inline uintptr_t mi_atomic_fetch_or_explicit(_Atomic(uintptr_t)*p, uintptr_t x, mi_memory_order mo) { + (void)(mo); + return (uintptr_t)MI_64(_InterlockedOr)((volatile msc_intptr_t*)p, (msc_intptr_t)x); +} +static inline bool mi_atomic_compare_exchange_strong_explicit(_Atomic(uintptr_t)*p, uintptr_t* expected, uintptr_t desired, mi_memory_order mo1, mi_memory_order mo2) { + (void)(mo1); (void)(mo2); + uintptr_t read = (uintptr_t)MI_64(_InterlockedCompareExchange)((volatile msc_intptr_t*)p, (msc_intptr_t)desired, (msc_intptr_t)(*expected)); + if (read == *expected) { + return true; + } + else { + *expected = read; + return false; + } +} +static inline bool mi_atomic_compare_exchange_weak_explicit(_Atomic(uintptr_t)*p, uintptr_t* expected, uintptr_t desired, mi_memory_order mo1, mi_memory_order mo2) { + return mi_atomic_compare_exchange_strong_explicit(p, expected, desired, mo1, mo2); +} +static inline uintptr_t mi_atomic_exchange_explicit(_Atomic(uintptr_t)*p, uintptr_t exchange, mi_memory_order mo) { + (void)(mo); + return (uintptr_t)MI_64(_InterlockedExchange)((volatile msc_intptr_t*)p, (msc_intptr_t)exchange); +} +static inline void mi_atomic_thread_fence(mi_memory_order mo) { + (void)(mo); + _Atomic(uintptr_t) x = 0; + mi_atomic_exchange_explicit(&x, 1, mo); +} +static inline uintptr_t mi_atomic_load_explicit(_Atomic(uintptr_t) const* p, mi_memory_order mo) { + (void)(mo); +#if defined(_M_IX86) || defined(_M_X64) + return *p; +#else + uintptr_t x = *p; + if (mo > mi_memory_order_relaxed) { + while (!mi_atomic_compare_exchange_weak_explicit(p, &x, x, mo, mi_memory_order_relaxed)) { /* nothing */ }; + } + return x; +#endif +} +static inline void mi_atomic_store_explicit(_Atomic(uintptr_t)*p, uintptr_t x, mi_memory_order mo) { + (void)(mo); +#if defined(_M_IX86) || defined(_M_X64) + *p = x; +#else + mi_atomic_exchange_explicit(p, x, mo); +#endif +} +static inline int64_t mi_atomic_loadi64_explicit(_Atomic(int64_t)*p, mi_memory_order mo) { + (void)(mo); +#if defined(_M_X64) + return *p; +#else + int64_t old = *p; + int64_t x = old; + while ((old = InterlockedCompareExchange64(p, x, old)) != x) { + x = old; + } + return x; +#endif +} +static inline void mi_atomic_storei64_explicit(_Atomic(int64_t)*p, int64_t x, mi_memory_order mo) { + (void)(mo); +#if defined(x_M_IX86) || defined(_M_X64) + *p = x; +#else + InterlockedExchange64(p, x); +#endif +} + +// These are used by the statistics +static inline int64_t mi_atomic_addi64_relaxed(volatile _Atomic(int64_t)*p, int64_t add) { +#ifdef _WIN64 + return (int64_t)mi_atomic_addi((int64_t*)p, add); +#else + int64_t current; + int64_t sum; + do { + current = *p; + sum = current + add; + } while (_InterlockedCompareExchange64(p, sum, current) != current); + return current; +#endif +} +static inline void mi_atomic_maxi64_relaxed(volatile _Atomic(int64_t)*p, int64_t x) { + int64_t current; + do { + current = *p; + } while (current < x && _InterlockedCompareExchange64(p, x, current) != current); +} + +static inline void mi_atomic_addi64_acq_rel(volatile _Atomic(int64_t*)p, int64_t i) { + mi_atomic_addi64_relaxed(p, i); +} + +static inline bool mi_atomic_casi64_strong_acq_rel(volatile _Atomic(int64_t*)p, int64_t* exp, int64_t des) { + int64_t read = _InterlockedCompareExchange64(p, des, *exp); + if (read == *exp) { + return true; + } + else { + *exp = read; + return false; + } +} + +// The pointer macros cast to `uintptr_t`. +#define mi_atomic_load_ptr_acquire(tp,p) (tp*)mi_atomic_load_acquire((_Atomic(uintptr_t)*)(p)) +#define mi_atomic_load_ptr_relaxed(tp,p) (tp*)mi_atomic_load_relaxed((_Atomic(uintptr_t)*)(p)) +#define mi_atomic_store_ptr_release(tp,p,x) mi_atomic_store_release((_Atomic(uintptr_t)*)(p),(uintptr_t)(x)) +#define mi_atomic_store_ptr_relaxed(tp,p,x) mi_atomic_store_relaxed((_Atomic(uintptr_t)*)(p),(uintptr_t)(x)) +#define mi_atomic_cas_ptr_weak_release(tp,p,exp,des) mi_atomic_cas_weak_release((_Atomic(uintptr_t)*)(p),(uintptr_t*)exp,(uintptr_t)des) +#define mi_atomic_cas_ptr_weak_acq_rel(tp,p,exp,des) mi_atomic_cas_weak_acq_rel((_Atomic(uintptr_t)*)(p),(uintptr_t*)exp,(uintptr_t)des) +#define mi_atomic_cas_ptr_strong_release(tp,p,exp,des) mi_atomic_cas_strong_release((_Atomic(uintptr_t)*)(p),(uintptr_t*)exp,(uintptr_t)des) +#define mi_atomic_exchange_ptr_release(tp,p,x) (tp*)mi_atomic_exchange_release((_Atomic(uintptr_t)*)(p),(uintptr_t)x) +#define mi_atomic_exchange_ptr_acq_rel(tp,p,x) (tp*)mi_atomic_exchange_acq_rel((_Atomic(uintptr_t)*)(p),(uintptr_t)x) + +#define mi_atomic_loadi64_acquire(p) mi_atomic(loadi64_explicit)(p,mi_memory_order(acquire)) +#define mi_atomic_loadi64_relaxed(p) mi_atomic(loadi64_explicit)(p,mi_memory_order(relaxed)) +#define mi_atomic_storei64_release(p,x) mi_atomic(storei64_explicit)(p,x,mi_memory_order(release)) +#define mi_atomic_storei64_relaxed(p,x) mi_atomic(storei64_explicit)(p,x,mi_memory_order(relaxed)) + + +#endif + + +// Atomically add a signed value; returns the previous value. +static inline intptr_t mi_atomic_addi(_Atomic(intptr_t)*p, intptr_t add) { + return (intptr_t)mi_atomic_add_acq_rel((_Atomic(uintptr_t)*)p, (uintptr_t)add); +} + +// Atomically subtract a signed value; returns the previous value. +static inline intptr_t mi_atomic_subi(_Atomic(intptr_t)*p, intptr_t sub) { + return (intptr_t)mi_atomic_addi(p, -sub); +} + +typedef _Atomic(uintptr_t) mi_atomic_once_t; + +// Returns true only on the first invocation +static inline bool mi_atomic_once( mi_atomic_once_t* once ) { + if (mi_atomic_load_relaxed(once) != 0) return false; // quick test + uintptr_t expected = 0; + return mi_atomic_cas_strong_acq_rel(once, &expected, (uintptr_t)1); // try to set to 1 +} + +typedef _Atomic(uintptr_t) mi_atomic_guard_t; + +// Allows only one thread to execute at a time +#define mi_atomic_guard(guard) \ + uintptr_t _mi_guard_expected = 0; \ + for(bool _mi_guard_once = true; \ + _mi_guard_once && mi_atomic_cas_strong_acq_rel(guard,&_mi_guard_expected,(uintptr_t)1); \ + (mi_atomic_store_release(guard,(uintptr_t)0), _mi_guard_once = false) ) + + + +// Yield +#if defined(__cplusplus) +#include +static inline void mi_atomic_yield(void) { + std::this_thread::yield(); +} +#elif defined(_WIN32) +#define WIN32_LEAN_AND_MEAN +#include +static inline void mi_atomic_yield(void) { + YieldProcessor(); +} +#elif defined(__SSE2__) +#include +static inline void mi_atomic_yield(void) { + _mm_pause(); +} +#elif (defined(__GNUC__) || defined(__clang__)) && \ + (defined(__x86_64__) || defined(__i386__) || defined(__arm__) || defined(__armel__) || defined(__ARMEL__) || \ + defined(__aarch64__) || defined(__powerpc__) || defined(__ppc__) || defined(__PPC__)) || defined(__POWERPC__) +#if defined(__x86_64__) || defined(__i386__) +static inline void mi_atomic_yield(void) { + __asm__ volatile ("pause" ::: "memory"); +} +#elif defined(__aarch64__) +static inline void mi_atomic_yield(void) { + __asm__ volatile("wfe"); +} +#elif (defined(__arm__) && __ARM_ARCH__ >= 7) +static inline void mi_atomic_yield(void) { + __asm__ volatile("yield" ::: "memory"); +} +#elif defined(__powerpc__) || defined(__ppc__) || defined(__PPC__) || defined(__POWERPC__) +#ifdef __APPLE__ +static inline void mi_atomic_yield(void) { + __asm__ volatile ("or r27,r27,r27" ::: "memory"); +} +#else +static inline void mi_atomic_yield(void) { + __asm__ __volatile__ ("or 27,27,27" ::: "memory"); +} +#endif +#elif defined(__armel__) || defined(__ARMEL__) +static inline void mi_atomic_yield(void) { + __asm__ volatile ("nop" ::: "memory"); +} +#endif +#elif defined(__sun) +// Fallback for other archs +#include +static inline void mi_atomic_yield(void) { + smt_pause(); +} +#elif defined(__wasi__) +#include +static inline void mi_atomic_yield(void) { + sched_yield(); +} +#else +#include +static inline void mi_atomic_yield(void) { + sleep(0); +} +#endif + + +#endif // __MIMALLOC_ATOMIC_H diff --git a/compat/mimalloc/mimalloc/internal.h b/compat/mimalloc/mimalloc/internal.h new file mode 100644 index 00000000000000..f076bc6a40f977 --- /dev/null +++ b/compat/mimalloc/mimalloc/internal.h @@ -0,0 +1,979 @@ +/* ---------------------------------------------------------------------------- +Copyright (c) 2018-2023, Microsoft Research, Daan Leijen +This is free software; you can redistribute it and/or modify it under the +terms of the MIT license. A copy of the license can be found in the file +"LICENSE" at the root of this distribution. +-----------------------------------------------------------------------------*/ +#pragma once +#ifndef MIMALLOC_INTERNAL_H +#define MIMALLOC_INTERNAL_H + + +// -------------------------------------------------------------------------- +// This file contains the interal API's of mimalloc and various utility +// functions and macros. +// -------------------------------------------------------------------------- + +#include "mimalloc/types.h" +#include "mimalloc/track.h" + +#if (MI_DEBUG>0) +#define mi_trace_message(...) _mi_trace_message(__VA_ARGS__) +#else +#define mi_trace_message(...) +#endif + +#define MI_CACHE_LINE 64 +#if defined(_MSC_VER) +#pragma warning(disable:4127) // suppress constant conditional warning (due to MI_SECURE paths) +#pragma warning(disable:26812) // unscoped enum warning +#define mi_decl_noinline __declspec(noinline) +#define mi_decl_thread __declspec(thread) +#define mi_decl_cache_align __declspec(align(MI_CACHE_LINE)) +#elif (defined(__GNUC__) && (__GNUC__ >= 3)) || defined(__clang__) // includes clang and icc +#define mi_decl_noinline __attribute__((noinline)) +#define mi_decl_thread __thread +#define mi_decl_cache_align __attribute__((aligned(MI_CACHE_LINE))) +#else +#define mi_decl_noinline +#define mi_decl_thread __thread // hope for the best :-) +#define mi_decl_cache_align +#endif + +#if defined(__EMSCRIPTEN__) && !defined(__wasi__) +#define __wasi__ +#endif + +#if defined(__cplusplus) +#define mi_decl_externc extern "C" +#else +#define mi_decl_externc +#endif + +// pthreads +#if !defined(_WIN32) && !defined(__wasi__) +#define MI_USE_PTHREADS +#include +#endif + +// "options.c" +void _mi_fputs(mi_output_fun* out, void* arg, const char* prefix, const char* message); +void _mi_fprintf(mi_output_fun* out, void* arg, const char* fmt, ...); +void _mi_warning_message(const char* fmt, ...); +void _mi_verbose_message(const char* fmt, ...); +void _mi_trace_message(const char* fmt, ...); +void _mi_options_init(void); +void _mi_error_message(int err, const char* fmt, ...); + +// random.c +void _mi_random_init(mi_random_ctx_t* ctx); +void _mi_random_init_weak(mi_random_ctx_t* ctx); +void _mi_random_reinit_if_weak(mi_random_ctx_t * ctx); +void _mi_random_split(mi_random_ctx_t* ctx, mi_random_ctx_t* new_ctx); +uintptr_t _mi_random_next(mi_random_ctx_t* ctx); +uintptr_t _mi_heap_random_next(mi_heap_t* heap); +uintptr_t _mi_os_random_weak(uintptr_t extra_seed); +static inline uintptr_t _mi_random_shuffle(uintptr_t x); + +// init.c +extern mi_decl_cache_align mi_stats_t _mi_stats_main; +extern mi_decl_cache_align const mi_page_t _mi_page_empty; +bool _mi_is_main_thread(void); +size_t _mi_current_thread_count(void); +bool _mi_preloading(void); // true while the C runtime is not initialized yet +mi_threadid_t _mi_thread_id(void) mi_attr_noexcept; +mi_heap_t* _mi_heap_main_get(void); // statically allocated main backing heap +void _mi_thread_done(mi_heap_t* heap); +void _mi_thread_data_collect(void); + +// os.c +void _mi_os_init(void); // called from process init +void* _mi_os_alloc(size_t size, mi_memid_t* memid, mi_stats_t* stats); +void _mi_os_free(void* p, size_t size, mi_memid_t memid, mi_stats_t* stats); +void _mi_os_free_ex(void* p, size_t size, bool still_committed, mi_memid_t memid, mi_stats_t* stats); + +size_t _mi_os_page_size(void); +size_t _mi_os_good_alloc_size(size_t size); +bool _mi_os_has_overcommit(void); +bool _mi_os_has_virtual_reserve(void); + +bool _mi_os_purge(void* p, size_t size, mi_stats_t* stats); +bool _mi_os_reset(void* addr, size_t size, mi_stats_t* tld_stats); +bool _mi_os_commit(void* p, size_t size, bool* is_zero, mi_stats_t* stats); +bool _mi_os_decommit(void* addr, size_t size, mi_stats_t* stats); +bool _mi_os_protect(void* addr, size_t size); +bool _mi_os_unprotect(void* addr, size_t size); +bool _mi_os_purge(void* p, size_t size, mi_stats_t* stats); +bool _mi_os_purge_ex(void* p, size_t size, bool allow_reset, mi_stats_t* stats); + +void* _mi_os_alloc_aligned(size_t size, size_t alignment, bool commit, bool allow_large, mi_memid_t* memid, mi_stats_t* stats); +void* _mi_os_alloc_aligned_at_offset(size_t size, size_t alignment, size_t align_offset, bool commit, bool allow_large, mi_memid_t* memid, mi_stats_t* tld_stats); + +void* _mi_os_get_aligned_hint(size_t try_alignment, size_t size); +bool _mi_os_use_large_page(size_t size, size_t alignment); +size_t _mi_os_large_page_size(void); + +void* _mi_os_alloc_huge_os_pages(size_t pages, int numa_node, mi_msecs_t max_secs, size_t* pages_reserved, size_t* psize, mi_memid_t* memid); + +// arena.c +mi_arena_id_t _mi_arena_id_none(void); +void _mi_arena_free(void* p, size_t size, size_t still_committed_size, mi_memid_t memid, mi_stats_t* stats); +void* _mi_arena_alloc(size_t size, bool commit, bool allow_large, mi_arena_id_t req_arena_id, mi_memid_t* memid, mi_os_tld_t* tld); +void* _mi_arena_alloc_aligned(size_t size, size_t alignment, size_t align_offset, bool commit, bool allow_large, mi_arena_id_t req_arena_id, mi_memid_t* memid, mi_os_tld_t* tld); +bool _mi_arena_memid_is_suitable(mi_memid_t memid, mi_arena_id_t request_arena_id); +bool _mi_arena_contains(const void* p); +void _mi_arena_collect(bool force_purge, mi_stats_t* stats); +void _mi_arena_unsafe_destroy_all(mi_stats_t* stats); + +// "segment-map.c" +void _mi_segment_map_allocated_at(const mi_segment_t* segment); +void _mi_segment_map_freed_at(const mi_segment_t* segment); + +// "segment.c" +mi_page_t* _mi_segment_page_alloc(mi_heap_t* heap, size_t block_size, size_t page_alignment, mi_segments_tld_t* tld, mi_os_tld_t* os_tld); +void _mi_segment_page_free(mi_page_t* page, bool force, mi_segments_tld_t* tld); +void _mi_segment_page_abandon(mi_page_t* page, mi_segments_tld_t* tld); +bool _mi_segment_try_reclaim_abandoned( mi_heap_t* heap, bool try_all, mi_segments_tld_t* tld); +void _mi_segment_thread_collect(mi_segments_tld_t* tld); + +#if MI_HUGE_PAGE_ABANDON +void _mi_segment_huge_page_free(mi_segment_t* segment, mi_page_t* page, mi_block_t* block); +#else +void _mi_segment_huge_page_reset(mi_segment_t* segment, mi_page_t* page, mi_block_t* block); +#endif + +uint8_t* _mi_segment_page_start(const mi_segment_t* segment, const mi_page_t* page, size_t* page_size); // page start for any page +void _mi_abandoned_reclaim_all(mi_heap_t* heap, mi_segments_tld_t* tld); +void _mi_abandoned_await_readers(void); +void _mi_abandoned_collect(mi_heap_t* heap, bool force, mi_segments_tld_t* tld); + +// "page.c" +void* _mi_malloc_generic(mi_heap_t* heap, size_t size, bool zero, size_t huge_alignment) mi_attr_noexcept mi_attr_malloc; + +void _mi_page_retire(mi_page_t* page) mi_attr_noexcept; // free the page if there are no other pages with many free blocks +void _mi_page_unfull(mi_page_t* page); +void _mi_page_free(mi_page_t* page, mi_page_queue_t* pq, bool force); // free the page +void _mi_page_abandon(mi_page_t* page, mi_page_queue_t* pq); // abandon the page, to be picked up by another thread... +void _mi_heap_delayed_free_all(mi_heap_t* heap); +bool _mi_heap_delayed_free_partial(mi_heap_t* heap); +void _mi_heap_collect_retired(mi_heap_t* heap, bool force); + +void _mi_page_use_delayed_free(mi_page_t* page, mi_delayed_t delay, bool override_never); +bool _mi_page_try_use_delayed_free(mi_page_t* page, mi_delayed_t delay, bool override_never); +size_t _mi_page_queue_append(mi_heap_t* heap, mi_page_queue_t* pq, mi_page_queue_t* append); +void _mi_deferred_free(mi_heap_t* heap, bool force); + +void _mi_page_free_collect(mi_page_t* page,bool force); +void _mi_page_reclaim(mi_heap_t* heap, mi_page_t* page); // callback from segments + +size_t _mi_bin_size(uint8_t bin); // for stats +uint8_t _mi_bin(size_t size); // for stats + +// "heap.c" +void _mi_heap_destroy_pages(mi_heap_t* heap); +void _mi_heap_collect_abandon(mi_heap_t* heap); +void _mi_heap_set_default_direct(mi_heap_t* heap); +bool _mi_heap_memid_is_suitable(mi_heap_t* heap, mi_memid_t memid); +void _mi_heap_unsafe_destroy_all(void); + +// "stats.c" +void _mi_stats_done(mi_stats_t* stats); +mi_msecs_t _mi_clock_now(void); +mi_msecs_t _mi_clock_end(mi_msecs_t start); +mi_msecs_t _mi_clock_start(void); + +// "alloc.c" +void* _mi_page_malloc(mi_heap_t* heap, mi_page_t* page, size_t size, bool zero) mi_attr_noexcept; // called from `_mi_malloc_generic` +void* _mi_heap_malloc_zero(mi_heap_t* heap, size_t size, bool zero) mi_attr_noexcept; +void* _mi_heap_malloc_zero_ex(mi_heap_t* heap, size_t size, bool zero, size_t huge_alignment) mi_attr_noexcept; // called from `_mi_heap_malloc_aligned` +void* _mi_heap_realloc_zero(mi_heap_t* heap, void* p, size_t newsize, bool zero) mi_attr_noexcept; +mi_block_t* _mi_page_ptr_unalign(const mi_segment_t* segment, const mi_page_t* page, const void* p); +bool _mi_free_delayed_block(mi_block_t* block); +void _mi_free_generic(const mi_segment_t* segment, mi_page_t* page, bool is_local, void* p) mi_attr_noexcept; // for runtime integration +void _mi_padding_shrink(const mi_page_t* page, const mi_block_t* block, const size_t min_size); + +// option.c, c primitives +char _mi_toupper(char c); +int _mi_strnicmp(const char* s, const char* t, size_t n); +void _mi_strlcpy(char* dest, const char* src, size_t dest_size); +void _mi_strlcat(char* dest, const char* src, size_t dest_size); +size_t _mi_strlen(const char* s); +size_t _mi_strnlen(const char* s, size_t max_len); + + +#if MI_DEBUG>1 +bool _mi_page_is_valid(mi_page_t* page); +#endif + + +// ------------------------------------------------------ +// Branches +// ------------------------------------------------------ + +#if defined(__GNUC__) || defined(__clang__) +#define mi_unlikely(x) (__builtin_expect(!!(x),false)) +#define mi_likely(x) (__builtin_expect(!!(x),true)) +#elif (defined(__cplusplus) && (__cplusplus >= 202002L)) || (defined(_MSVC_LANG) && _MSVC_LANG >= 202002L) +#define mi_unlikely(x) (x) [[unlikely]] +#define mi_likely(x) (x) [[likely]] +#else +#define mi_unlikely(x) (x) +#define mi_likely(x) (x) +#endif + +#ifndef __has_builtin +#define __has_builtin(x) 0 +#endif + + +/* ----------------------------------------------------------- + Error codes passed to `_mi_fatal_error` + All are recoverable but EFAULT is a serious error and aborts by default in secure mode. + For portability define undefined error codes using common Unix codes: + +----------------------------------------------------------- */ +#include +#ifndef EAGAIN // double free +#define EAGAIN (11) +#endif +#ifndef ENOMEM // out of memory +#define ENOMEM (12) +#endif +#ifndef EFAULT // corrupted free-list or meta-data +#define EFAULT (14) +#endif +#ifndef EINVAL // trying to free an invalid pointer +#define EINVAL (22) +#endif +#ifndef EOVERFLOW // count*size overflow +#define EOVERFLOW (75) +#endif + + +/* ----------------------------------------------------------- + Inlined definitions +----------------------------------------------------------- */ +#define MI_UNUSED(x) (void)(x) +#if (MI_DEBUG>0) +#define MI_UNUSED_RELEASE(x) +#else +#define MI_UNUSED_RELEASE(x) MI_UNUSED(x) +#endif + +#define MI_INIT4(x) x(),x(),x(),x() +#define MI_INIT8(x) MI_INIT4(x),MI_INIT4(x) +#define MI_INIT16(x) MI_INIT8(x),MI_INIT8(x) +#define MI_INIT32(x) MI_INIT16(x),MI_INIT16(x) +#define MI_INIT64(x) MI_INIT32(x),MI_INIT32(x) +#define MI_INIT128(x) MI_INIT64(x),MI_INIT64(x) +#define MI_INIT256(x) MI_INIT128(x),MI_INIT128(x) + + +#include +// initialize a local variable to zero; use memset as compilers optimize constant sized memset's +#define _mi_memzero_var(x) memset(&x,0,sizeof(x)) + +// Is `x` a power of two? (0 is considered a power of two) +static inline bool _mi_is_power_of_two(uintptr_t x) { + return ((x & (x - 1)) == 0); +} + +// Is a pointer aligned? +static inline bool _mi_is_aligned(void* p, size_t alignment) { + mi_assert_internal(alignment != 0); + return (((uintptr_t)p % alignment) == 0); +} + +// Align upwards +static inline uintptr_t _mi_align_up(uintptr_t sz, size_t alignment) { + mi_assert_internal(alignment != 0); + uintptr_t mask = alignment - 1; + if ((alignment & mask) == 0) { // power of two? + return ((sz + mask) & ~mask); + } + else { + return (((sz + mask)/alignment)*alignment); + } +} + +// Align downwards +static inline uintptr_t _mi_align_down(uintptr_t sz, size_t alignment) { + mi_assert_internal(alignment != 0); + uintptr_t mask = alignment - 1; + if ((alignment & mask) == 0) { // power of two? + return (sz & ~mask); + } + else { + return ((sz / alignment) * alignment); + } +} + +// Divide upwards: `s <= _mi_divide_up(s,d)*d < s+d`. +static inline uintptr_t _mi_divide_up(uintptr_t size, size_t divider) { + mi_assert_internal(divider != 0); + return (divider == 0 ? size : ((size + divider - 1) / divider)); +} + +// Is memory zero initialized? +static inline bool mi_mem_is_zero(const void* p, size_t size) { + for (size_t i = 0; i < size; i++) { + if (((uint8_t*)p)[i] != 0) return false; + } + return true; +} + + +// Align a byte size to a size in _machine words_, +// i.e. byte size == `wsize*sizeof(void*)`. +static inline size_t _mi_wsize_from_size(size_t size) { + mi_assert_internal(size <= SIZE_MAX - sizeof(uintptr_t)); + return (size + sizeof(uintptr_t) - 1) / sizeof(uintptr_t); +} + +// Overflow detecting multiply +#if __has_builtin(__builtin_umul_overflow) || (defined(__GNUC__) && (__GNUC__ >= 5)) +#include // UINT_MAX, ULONG_MAX +#if defined(_CLOCK_T) // for Illumos +#undef _CLOCK_T +#endif +static inline bool mi_mul_overflow(size_t count, size_t size, size_t* total) { + #if (SIZE_MAX == ULONG_MAX) + return __builtin_umull_overflow(count, size, (unsigned long *)total); + #elif (SIZE_MAX == UINT_MAX) + return __builtin_umul_overflow(count, size, (unsigned int *)total); + #else + return __builtin_umulll_overflow(count, size, (unsigned long long *)total); + #endif +} +#else /* __builtin_umul_overflow is unavailable */ +static inline bool mi_mul_overflow(size_t count, size_t size, size_t* total) { + #define MI_MUL_NO_OVERFLOW ((size_t)1 << (4*sizeof(size_t))) // sqrt(SIZE_MAX) + *total = count * size; + // note: gcc/clang optimize this to directly check the overflow flag + return ((size >= MI_MUL_NO_OVERFLOW || count >= MI_MUL_NO_OVERFLOW) && size > 0 && (SIZE_MAX / size) < count); +} +#endif + +// Safe multiply `count*size` into `total`; return `true` on overflow. +static inline bool mi_count_size_overflow(size_t count, size_t size, size_t* total) { + if (count==1) { // quick check for the case where count is one (common for C++ allocators) + *total = size; + return false; + } + else if mi_unlikely(mi_mul_overflow(count, size, total)) { + #if MI_DEBUG > 0 + _mi_error_message(EOVERFLOW, "allocation request is too large (%zu * %zu bytes)\n", count, size); + #endif + *total = SIZE_MAX; + return true; + } + else return false; +} + + +/*---------------------------------------------------------------------------------------- + Heap functions +------------------------------------------------------------------------------------------- */ + +extern const mi_heap_t _mi_heap_empty; // read-only empty heap, initial value of the thread local default heap + +static inline bool mi_heap_is_backing(const mi_heap_t* heap) { + return (heap->tld->heap_backing == heap); +} + +static inline bool mi_heap_is_initialized(mi_heap_t* heap) { + mi_assert_internal(heap != NULL); + return (heap != &_mi_heap_empty); +} + +static inline uintptr_t _mi_ptr_cookie(const void* p) { + extern mi_heap_t _mi_heap_main; + mi_assert_internal(_mi_heap_main.cookie != 0); + return ((uintptr_t)p ^ _mi_heap_main.cookie); +} + +/* ----------------------------------------------------------- + Pages +----------------------------------------------------------- */ + +static inline mi_page_t* _mi_heap_get_free_small_page(mi_heap_t* heap, size_t size) { + mi_assert_internal(size <= (MI_SMALL_SIZE_MAX + MI_PADDING_SIZE)); + const size_t idx = _mi_wsize_from_size(size); + mi_assert_internal(idx < MI_PAGES_DIRECT); + return heap->pages_free_direct[idx]; +} + +// Segment that contains the pointer +// Large aligned blocks may be aligned at N*MI_SEGMENT_SIZE (inside a huge segment > MI_SEGMENT_SIZE), +// and we need align "down" to the segment info which is `MI_SEGMENT_SIZE` bytes before it; +// therefore we align one byte before `p`. +static inline mi_segment_t* _mi_ptr_segment(const void* p) { + mi_assert_internal(p != NULL); + return (mi_segment_t*)(((uintptr_t)p - 1) & ~MI_SEGMENT_MASK); +} + +static inline mi_page_t* mi_slice_to_page(mi_slice_t* s) { + mi_assert_internal(s->slice_offset== 0 && s->slice_count > 0); + return (mi_page_t*)(s); +} + +static inline mi_slice_t* mi_page_to_slice(mi_page_t* p) { + mi_assert_internal(p->slice_offset== 0 && p->slice_count > 0); + return (mi_slice_t*)(p); +} + +// Segment belonging to a page +static inline mi_segment_t* _mi_page_segment(const mi_page_t* page) { + mi_segment_t* segment = _mi_ptr_segment(page); + mi_assert_internal(segment == NULL || ((mi_slice_t*)page >= segment->slices && (mi_slice_t*)page < segment->slices + segment->slice_entries)); + return segment; +} + +static inline mi_slice_t* mi_slice_first(const mi_slice_t* slice) { + mi_slice_t* start = (mi_slice_t*)((uint8_t*)slice - slice->slice_offset); + mi_assert_internal(start >= _mi_ptr_segment(slice)->slices); + mi_assert_internal(start->slice_offset == 0); + mi_assert_internal(start + start->slice_count > slice); + return start; +} + +// Get the page containing the pointer (performance critical as it is called in mi_free) +static inline mi_page_t* _mi_segment_page_of(const mi_segment_t* segment, const void* p) { + mi_assert_internal(p > (void*)segment); + ptrdiff_t diff = (uint8_t*)p - (uint8_t*)segment; + mi_assert_internal(diff > 0 && diff <= (ptrdiff_t)MI_SEGMENT_SIZE); + size_t idx = (size_t)diff >> MI_SEGMENT_SLICE_SHIFT; + mi_assert_internal(idx <= segment->slice_entries); + mi_slice_t* slice0 = (mi_slice_t*)&segment->slices[idx]; + mi_slice_t* slice = mi_slice_first(slice0); // adjust to the block that holds the page data + mi_assert_internal(slice->slice_offset == 0); + mi_assert_internal(slice >= segment->slices && slice < segment->slices + segment->slice_entries); + return mi_slice_to_page(slice); +} + +// Quick page start for initialized pages +static inline uint8_t* _mi_page_start(const mi_segment_t* segment, const mi_page_t* page, size_t* page_size) { + return _mi_segment_page_start(segment, page, page_size); +} + +// Get the page containing the pointer +static inline mi_page_t* _mi_ptr_page(void* p) { + return _mi_segment_page_of(_mi_ptr_segment(p), p); +} + +// Get the block size of a page (special case for huge objects) +static inline size_t mi_page_block_size(const mi_page_t* page) { + const size_t bsize = page->xblock_size; + mi_assert_internal(bsize > 0); + if mi_likely(bsize < MI_HUGE_BLOCK_SIZE) { + return bsize; + } + else { + size_t psize; + _mi_segment_page_start(_mi_page_segment(page), page, &psize); + return psize; + } +} + +static inline bool mi_page_is_huge(const mi_page_t* page) { + return (_mi_page_segment(page)->kind == MI_SEGMENT_HUGE); +} + +// Get the usable block size of a page without fixed padding. +// This may still include internal padding due to alignment and rounding up size classes. +static inline size_t mi_page_usable_block_size(const mi_page_t* page) { + return mi_page_block_size(page) - MI_PADDING_SIZE; +} + +// size of a segment +static inline size_t mi_segment_size(mi_segment_t* segment) { + return segment->segment_slices * MI_SEGMENT_SLICE_SIZE; +} + +static inline uint8_t* mi_segment_end(mi_segment_t* segment) { + return (uint8_t*)segment + mi_segment_size(segment); +} + +// Thread free access +static inline mi_block_t* mi_page_thread_free(const mi_page_t* page) { + return (mi_block_t*)(mi_atomic_load_relaxed(&((mi_page_t*)page)->xthread_free) & ~3); +} + +static inline mi_delayed_t mi_page_thread_free_flag(const mi_page_t* page) { + return (mi_delayed_t)(mi_atomic_load_relaxed(&((mi_page_t*)page)->xthread_free) & 3); +} + +// Heap access +static inline mi_heap_t* mi_page_heap(const mi_page_t* page) { + return (mi_heap_t*)(mi_atomic_load_relaxed(&((mi_page_t*)page)->xheap)); +} + +static inline void mi_page_set_heap(mi_page_t* page, mi_heap_t* heap) { + mi_assert_internal(mi_page_thread_free_flag(page) != MI_DELAYED_FREEING); + mi_atomic_store_release(&page->xheap,(uintptr_t)heap); +} + +// Thread free flag helpers +static inline mi_block_t* mi_tf_block(mi_thread_free_t tf) { + return (mi_block_t*)(tf & ~0x03); +} +static inline mi_delayed_t mi_tf_delayed(mi_thread_free_t tf) { + return (mi_delayed_t)(tf & 0x03); +} +static inline mi_thread_free_t mi_tf_make(mi_block_t* block, mi_delayed_t delayed) { + return (mi_thread_free_t)((uintptr_t)block | (uintptr_t)delayed); +} +static inline mi_thread_free_t mi_tf_set_delayed(mi_thread_free_t tf, mi_delayed_t delayed) { + return mi_tf_make(mi_tf_block(tf),delayed); +} +static inline mi_thread_free_t mi_tf_set_block(mi_thread_free_t tf, mi_block_t* block) { + return mi_tf_make(block, mi_tf_delayed(tf)); +} + +// are all blocks in a page freed? +// note: needs up-to-date used count, (as the `xthread_free` list may not be empty). see `_mi_page_collect_free`. +static inline bool mi_page_all_free(const mi_page_t* page) { + mi_assert_internal(page != NULL); + return (page->used == 0); +} + +// are there any available blocks? +static inline bool mi_page_has_any_available(const mi_page_t* page) { + mi_assert_internal(page != NULL && page->reserved > 0); + return (page->used < page->reserved || (mi_page_thread_free(page) != NULL)); +} + +// are there immediately available blocks, i.e. blocks available on the free list. +static inline bool mi_page_immediate_available(const mi_page_t* page) { + mi_assert_internal(page != NULL); + return (page->free != NULL); +} + +// is more than 7/8th of a page in use? +static inline bool mi_page_mostly_used(const mi_page_t* page) { + if (page==NULL) return true; + uint16_t frac = page->reserved / 8U; + return (page->reserved - page->used <= frac); +} + +static inline mi_page_queue_t* mi_page_queue(const mi_heap_t* heap, size_t size) { + return &((mi_heap_t*)heap)->pages[_mi_bin(size)]; +} + + + +//----------------------------------------------------------- +// Page flags +//----------------------------------------------------------- +static inline bool mi_page_is_in_full(const mi_page_t* page) { + return page->flags.x.in_full; +} + +static inline void mi_page_set_in_full(mi_page_t* page, bool in_full) { + page->flags.x.in_full = in_full; +} + +static inline bool mi_page_has_aligned(const mi_page_t* page) { + return page->flags.x.has_aligned; +} + +static inline void mi_page_set_has_aligned(mi_page_t* page, bool has_aligned) { + page->flags.x.has_aligned = has_aligned; +} + + +/* ------------------------------------------------------------------- +Encoding/Decoding the free list next pointers + +This is to protect against buffer overflow exploits where the +free list is mutated. Many hardened allocators xor the next pointer `p` +with a secret key `k1`, as `p^k1`. This prevents overwriting with known +values but might be still too weak: if the attacker can guess +the pointer `p` this can reveal `k1` (since `p^k1^p == k1`). +Moreover, if multiple blocks can be read as well, the attacker can +xor both as `(p1^k1) ^ (p2^k1) == p1^p2` which may reveal a lot +about the pointers (and subsequently `k1`). + +Instead mimalloc uses an extra key `k2` and encodes as `((p^k2)<<> (MI_INTPTR_BITS - shift)))); +} +static inline uintptr_t mi_rotr(uintptr_t x, uintptr_t shift) { + shift %= MI_INTPTR_BITS; + return (shift==0 ? x : ((x >> shift) | (x << (MI_INTPTR_BITS - shift)))); +} + +static inline void* mi_ptr_decode(const void* null, const mi_encoded_t x, const uintptr_t* keys) { + void* p = (void*)(mi_rotr(x - keys[0], keys[0]) ^ keys[1]); + return (p==null ? NULL : p); +} + +static inline mi_encoded_t mi_ptr_encode(const void* null, const void* p, const uintptr_t* keys) { + uintptr_t x = (uintptr_t)(p==NULL ? null : p); + return mi_rotl(x ^ keys[1], keys[0]) + keys[0]; +} + +static inline mi_block_t* mi_block_nextx( const void* null, const mi_block_t* block, const uintptr_t* keys ) { + mi_track_mem_defined(block,sizeof(mi_block_t)); + mi_block_t* next; + #ifdef MI_ENCODE_FREELIST + next = (mi_block_t*)mi_ptr_decode(null, block->next, keys); + #else + MI_UNUSED(keys); MI_UNUSED(null); + next = (mi_block_t*)block->next; + #endif + mi_track_mem_noaccess(block,sizeof(mi_block_t)); + return next; +} + +static inline void mi_block_set_nextx(const void* null, mi_block_t* block, const mi_block_t* next, const uintptr_t* keys) { + mi_track_mem_undefined(block,sizeof(mi_block_t)); + #ifdef MI_ENCODE_FREELIST + block->next = mi_ptr_encode(null, next, keys); + #else + MI_UNUSED(keys); MI_UNUSED(null); + block->next = (mi_encoded_t)next; + #endif + mi_track_mem_noaccess(block,sizeof(mi_block_t)); +} + +static inline mi_block_t* mi_block_next(const mi_page_t* page, const mi_block_t* block) { + #ifdef MI_ENCODE_FREELIST + mi_block_t* next = mi_block_nextx(page,block,page->keys); + // check for free list corruption: is `next` at least in the same page? + // TODO: check if `next` is `page->block_size` aligned? + if mi_unlikely(next!=NULL && !mi_is_in_same_page(block, next)) { + _mi_error_message(EFAULT, "corrupted free list entry of size %zub at %p: value 0x%zx\n", mi_page_block_size(page), block, (uintptr_t)next); + next = NULL; + } + return next; + #else + MI_UNUSED(page); + return mi_block_nextx(page,block,NULL); + #endif +} + +static inline void mi_block_set_next(const mi_page_t* page, mi_block_t* block, const mi_block_t* next) { + #ifdef MI_ENCODE_FREELIST + mi_block_set_nextx(page,block,next, page->keys); + #else + MI_UNUSED(page); + mi_block_set_nextx(page,block,next,NULL); + #endif +} + + +// ------------------------------------------------------------------- +// commit mask +// ------------------------------------------------------------------- + +static inline void mi_commit_mask_create_empty(mi_commit_mask_t* cm) { + for (size_t i = 0; i < MI_COMMIT_MASK_FIELD_COUNT; i++) { + cm->mask[i] = 0; + } +} + +static inline void mi_commit_mask_create_full(mi_commit_mask_t* cm) { + for (size_t i = 0; i < MI_COMMIT_MASK_FIELD_COUNT; i++) { + cm->mask[i] = ~((size_t)0); + } +} + +static inline bool mi_commit_mask_is_empty(const mi_commit_mask_t* cm) { + for (size_t i = 0; i < MI_COMMIT_MASK_FIELD_COUNT; i++) { + if (cm->mask[i] != 0) return false; + } + return true; +} + +static inline bool mi_commit_mask_is_full(const mi_commit_mask_t* cm) { + for (size_t i = 0; i < MI_COMMIT_MASK_FIELD_COUNT; i++) { + if (cm->mask[i] != ~((size_t)0)) return false; + } + return true; +} + +// defined in `segment.c`: +size_t _mi_commit_mask_committed_size(const mi_commit_mask_t* cm, size_t total); +size_t _mi_commit_mask_next_run(const mi_commit_mask_t* cm, size_t* idx); + +#define mi_commit_mask_foreach(cm,idx,count) \ + idx = 0; \ + while ((count = _mi_commit_mask_next_run(cm,&idx)) > 0) { + +#define mi_commit_mask_foreach_end() \ + idx += count; \ + } + + + +/* ----------------------------------------------------------- + memory id's +----------------------------------------------------------- */ + +static inline mi_memid_t _mi_memid_create(mi_memkind_t memkind) { + mi_memid_t memid; + _mi_memzero_var(memid); + memid.memkind = memkind; + return memid; +} + +static inline mi_memid_t _mi_memid_none(void) { + return _mi_memid_create(MI_MEM_NONE); +} + +static inline mi_memid_t _mi_memid_create_os(bool committed, bool is_zero, bool is_large) { + mi_memid_t memid = _mi_memid_create(MI_MEM_OS); + memid.initially_committed = committed; + memid.initially_zero = is_zero; + memid.is_pinned = is_large; + return memid; +} + + +// ------------------------------------------------------------------- +// Fast "random" shuffle +// ------------------------------------------------------------------- + +static inline uintptr_t _mi_random_shuffle(uintptr_t x) { + if (x==0) { x = 17; } // ensure we don't get stuck in generating zeros +#if (MI_INTPTR_SIZE==8) + // by Sebastiano Vigna, see: + x ^= x >> 30; + x *= 0xbf58476d1ce4e5b9UL; + x ^= x >> 27; + x *= 0x94d049bb133111ebUL; + x ^= x >> 31; +#elif (MI_INTPTR_SIZE==4) + // by Chris Wellons, see: + x ^= x >> 16; + x *= 0x7feb352dUL; + x ^= x >> 15; + x *= 0x846ca68bUL; + x ^= x >> 16; +#endif + return x; +} + +// ------------------------------------------------------------------- +// Optimize numa node access for the common case (= one node) +// ------------------------------------------------------------------- + +int _mi_os_numa_node_get(mi_os_tld_t* tld); +size_t _mi_os_numa_node_count_get(void); + +extern _Atomic(size_t) _mi_numa_node_count; +static inline int _mi_os_numa_node(mi_os_tld_t* tld) { + if mi_likely(mi_atomic_load_relaxed(&_mi_numa_node_count) == 1) { return 0; } + else return _mi_os_numa_node_get(tld); +} +static inline size_t _mi_os_numa_node_count(void) { + const size_t count = mi_atomic_load_relaxed(&_mi_numa_node_count); + if mi_likely(count > 0) { return count; } + else return _mi_os_numa_node_count_get(); +} + + + +// ----------------------------------------------------------------------- +// Count bits: trailing or leading zeros (with MI_INTPTR_BITS on all zero) +// ----------------------------------------------------------------------- + +#if defined(__GNUC__) + +#include // LONG_MAX +#define MI_HAVE_FAST_BITSCAN +static inline size_t mi_clz(uintptr_t x) { + if (x==0) return MI_INTPTR_BITS; +#if (INTPTR_MAX == LONG_MAX) + return __builtin_clzl(x); +#else + return __builtin_clzll(x); +#endif +} +static inline size_t mi_ctz(uintptr_t x) { + if (x==0) return MI_INTPTR_BITS; +#if (INTPTR_MAX == LONG_MAX) + return __builtin_ctzl(x); +#else + return __builtin_ctzll(x); +#endif +} + +#elif defined(_MSC_VER) + +#include // LONG_MAX +#include // BitScanReverse64 +#define MI_HAVE_FAST_BITSCAN +static inline size_t mi_clz(uintptr_t x) { + if (x==0) return MI_INTPTR_BITS; + unsigned long idx; +#if (INTPTR_MAX == LONG_MAX) + _BitScanReverse(&idx, x); +#else + _BitScanReverse64(&idx, x); +#endif + return ((MI_INTPTR_BITS - 1) - idx); +} +static inline size_t mi_ctz(uintptr_t x) { + if (x==0) return MI_INTPTR_BITS; + unsigned long idx; +#if (INTPTR_MAX == LONG_MAX) + _BitScanForward(&idx, x); +#else + _BitScanForward64(&idx, x); +#endif + return idx; +} + +#else +static inline size_t mi_ctz32(uint32_t x) { + // de Bruijn multiplication, see + static const unsigned char debruijn[32] = { + 0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8, + 31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9 + }; + if (x==0) return 32; + return debruijn[((x & -(int32_t)x) * 0x077CB531UL) >> 27]; +} +static inline size_t mi_clz32(uint32_t x) { + // de Bruijn multiplication, see + static const uint8_t debruijn[32] = { + 31, 22, 30, 21, 18, 10, 29, 2, 20, 17, 15, 13, 9, 6, 28, 1, + 23, 19, 11, 3, 16, 14, 7, 24, 12, 4, 8, 25, 5, 26, 27, 0 + }; + if (x==0) return 32; + x |= x >> 1; + x |= x >> 2; + x |= x >> 4; + x |= x >> 8; + x |= x >> 16; + return debruijn[(uint32_t)(x * 0x07C4ACDDUL) >> 27]; +} + +static inline size_t mi_clz(uintptr_t x) { + if (x==0) return MI_INTPTR_BITS; +#if (MI_INTPTR_BITS <= 32) + return mi_clz32((uint32_t)x); +#else + size_t count = mi_clz32((uint32_t)(x >> 32)); + if (count < 32) return count; + return (32 + mi_clz32((uint32_t)x)); +#endif +} +static inline size_t mi_ctz(uintptr_t x) { + if (x==0) return MI_INTPTR_BITS; +#if (MI_INTPTR_BITS <= 32) + return mi_ctz32((uint32_t)x); +#else + size_t count = mi_ctz32((uint32_t)x); + if (count < 32) return count; + return (32 + mi_ctz32((uint32_t)(x>>32))); +#endif +} + +#endif + +// "bit scan reverse": Return index of the highest bit (or MI_INTPTR_BITS if `x` is zero) +static inline size_t mi_bsr(uintptr_t x) { + return (x==0 ? MI_INTPTR_BITS : MI_INTPTR_BITS - 1 - mi_clz(x)); +} + + +// --------------------------------------------------------------------------------- +// Provide our own `_mi_memcpy` for potential performance optimizations. +// +// For now, only on Windows with msvc/clang-cl we optimize to `rep movsb` if +// we happen to run on x86/x64 cpu's that have "fast short rep movsb" (FSRM) support +// (AMD Zen3+ (~2020) or Intel Ice Lake+ (~2017). See also issue #201 and pr #253. +// --------------------------------------------------------------------------------- + +#if !MI_TRACK_ENABLED && defined(_WIN32) && (defined(_M_IX86) || defined(_M_X64)) +#include +extern bool _mi_cpu_has_fsrm; +static inline void _mi_memcpy(void* dst, const void* src, size_t n) { + if (_mi_cpu_has_fsrm) { + __movsb((unsigned char*)dst, (const unsigned char*)src, n); + } + else { + memcpy(dst, src, n); + } +} +static inline void _mi_memzero(void* dst, size_t n) { + if (_mi_cpu_has_fsrm) { + __stosb((unsigned char*)dst, 0, n); + } + else { + memset(dst, 0, n); + } +} +#else +static inline void _mi_memcpy(void* dst, const void* src, size_t n) { + memcpy(dst, src, n); +} +static inline void _mi_memzero(void* dst, size_t n) { + memset(dst, 0, n); +} +#endif + +// ------------------------------------------------------------------------------- +// The `_mi_memcpy_aligned` can be used if the pointers are machine-word aligned +// This is used for example in `mi_realloc`. +// ------------------------------------------------------------------------------- + +#if (defined(__GNUC__) && (__GNUC__ >= 4)) || defined(__clang__) +// On GCC/CLang we provide a hint that the pointers are word aligned. +static inline void _mi_memcpy_aligned(void* dst, const void* src, size_t n) { + mi_assert_internal(((uintptr_t)dst % MI_INTPTR_SIZE == 0) && ((uintptr_t)src % MI_INTPTR_SIZE == 0)); + void* adst = __builtin_assume_aligned(dst, MI_INTPTR_SIZE); + const void* asrc = __builtin_assume_aligned(src, MI_INTPTR_SIZE); + _mi_memcpy(adst, asrc, n); +} + +static inline void _mi_memzero_aligned(void* dst, size_t n) { + mi_assert_internal((uintptr_t)dst % MI_INTPTR_SIZE == 0); + void* adst = __builtin_assume_aligned(dst, MI_INTPTR_SIZE); + _mi_memzero(adst, n); +} +#else +// Default fallback on `_mi_memcpy` +static inline void _mi_memcpy_aligned(void* dst, const void* src, size_t n) { + mi_assert_internal(((uintptr_t)dst % MI_INTPTR_SIZE == 0) && ((uintptr_t)src % MI_INTPTR_SIZE == 0)); + _mi_memcpy(dst, src, n); +} + +static inline void _mi_memzero_aligned(void* dst, size_t n) { + mi_assert_internal((uintptr_t)dst % MI_INTPTR_SIZE == 0); + _mi_memzero(dst, n); +} +#endif + + +#endif diff --git a/compat/mimalloc/mimalloc/prim.h b/compat/mimalloc/mimalloc/prim.h new file mode 100644 index 00000000000000..1e55cb5f8802d7 --- /dev/null +++ b/compat/mimalloc/mimalloc/prim.h @@ -0,0 +1,323 @@ +/* ---------------------------------------------------------------------------- +Copyright (c) 2018-2023, Microsoft Research, Daan Leijen +This is free software; you can redistribute it and/or modify it under the +terms of the MIT license. A copy of the license can be found in the file +"LICENSE" at the root of this distribution. +-----------------------------------------------------------------------------*/ +#pragma once +#ifndef MIMALLOC_PRIM_H +#define MIMALLOC_PRIM_H + + +// -------------------------------------------------------------------------- +// This file specifies the primitive portability API. +// Each OS/host needs to implement these primitives, see `src/prim` +// for implementations on Window, macOS, WASI, and Linux/Unix. +// +// note: on all primitive functions, we always have result parameters != NUL, and: +// addr != NULL and page aligned +// size > 0 and page aligned +// return value is an error code an int where 0 is success. +// -------------------------------------------------------------------------- + +// OS memory configuration +typedef struct mi_os_mem_config_s { + size_t page_size; // 4KiB + size_t large_page_size; // 2MiB + size_t alloc_granularity; // smallest allocation size (on Windows 64KiB) + bool has_overcommit; // can we reserve more memory than can be actually committed? + bool must_free_whole; // must allocated blocks be freed as a whole (false for mmap, true for VirtualAlloc) + bool has_virtual_reserve; // supports virtual address space reservation? (if true we can reserve virtual address space without using commit or physical memory) +} mi_os_mem_config_t; + +// Initialize +void _mi_prim_mem_init( mi_os_mem_config_t* config ); + +// Free OS memory +int _mi_prim_free(void* addr, size_t size ); + +// Allocate OS memory. Return NULL on error. +// The `try_alignment` is just a hint and the returned pointer does not have to be aligned. +// If `commit` is false, the virtual memory range only needs to be reserved (with no access) +// which will later be committed explicitly using `_mi_prim_commit`. +// `is_zero` is set to true if the memory was zero initialized (as on most OS's) +// pre: !commit => !allow_large +// try_alignment >= _mi_os_page_size() and a power of 2 +int _mi_prim_alloc(size_t size, size_t try_alignment, bool commit, bool allow_large, bool* is_large, bool* is_zero, void** addr); + +// Commit memory. Returns error code or 0 on success. +// For example, on Linux this would make the memory PROT_READ|PROT_WRITE. +// `is_zero` is set to true if the memory was zero initialized (e.g. on Windows) +int _mi_prim_commit(void* addr, size_t size, bool* is_zero); + +// Decommit memory. Returns error code or 0 on success. The `needs_recommit` result is true +// if the memory would need to be re-committed. For example, on Windows this is always true, +// but on Linux we could use MADV_DONTNEED to decommit which does not need a recommit. +// pre: needs_recommit != NULL +int _mi_prim_decommit(void* addr, size_t size, bool* needs_recommit); + +// Reset memory. The range keeps being accessible but the content might be reset. +// Returns error code or 0 on success. +int _mi_prim_reset(void* addr, size_t size); + +// Protect memory. Returns error code or 0 on success. +int _mi_prim_protect(void* addr, size_t size, bool protect); + +// Allocate huge (1GiB) pages possibly associated with a NUMA node. +// `is_zero` is set to true if the memory was zero initialized (as on most OS's) +// pre: size > 0 and a multiple of 1GiB. +// numa_node is either negative (don't care), or a numa node number. +int _mi_prim_alloc_huge_os_pages(void* hint_addr, size_t size, int numa_node, bool* is_zero, void** addr); + +// Return the current NUMA node +size_t _mi_prim_numa_node(void); + +// Return the number of logical NUMA nodes +size_t _mi_prim_numa_node_count(void); + +// Clock ticks +mi_msecs_t _mi_prim_clock_now(void); + +// Return process information (only for statistics) +typedef struct mi_process_info_s { + mi_msecs_t elapsed; + mi_msecs_t utime; + mi_msecs_t stime; + size_t current_rss; + size_t peak_rss; + size_t current_commit; + size_t peak_commit; + size_t page_faults; +} mi_process_info_t; + +void _mi_prim_process_info(mi_process_info_t* pinfo); + +// Default stderr output. (only for warnings etc. with verbose enabled) +// msg != NULL && _mi_strlen(msg) > 0 +void _mi_prim_out_stderr( const char* msg ); + +// Get an environment variable. (only for options) +// name != NULL, result != NULL, result_size >= 64 +bool _mi_prim_getenv(const char* name, char* result, size_t result_size); + + +// Fill a buffer with strong randomness; return `false` on error or if +// there is no strong randomization available. +bool _mi_prim_random_buf(void* buf, size_t buf_len); + +// Called on the first thread start, and should ensure `_mi_thread_done` is called on thread termination. +void _mi_prim_thread_init_auto_done(void); + +// Called on process exit and may take action to clean up resources associated with the thread auto done. +void _mi_prim_thread_done_auto_done(void); + +// Called when the default heap for a thread changes +void _mi_prim_thread_associate_default_heap(mi_heap_t* heap); + + +//------------------------------------------------------------------- +// Thread id: `_mi_prim_thread_id()` +// +// Getting the thread id should be performant as it is called in the +// fast path of `_mi_free` and we specialize for various platforms as +// inlined definitions. Regular code should call `init.c:_mi_thread_id()`. +// We only require _mi_prim_thread_id() to return a unique id +// for each thread (unequal to zero). +//------------------------------------------------------------------- + +// defined in `init.c`; do not use these directly +extern mi_decl_thread mi_heap_t* _mi_heap_default; // default heap to allocate from +extern bool _mi_process_is_initialized; // has mi_process_init been called? + +static inline mi_threadid_t _mi_prim_thread_id(void) mi_attr_noexcept; + +#if defined(_WIN32) + +#define WIN32_LEAN_AND_MEAN +#include +static inline mi_threadid_t _mi_prim_thread_id(void) mi_attr_noexcept { + // Windows: works on Intel and ARM in both 32- and 64-bit + return (uintptr_t)NtCurrentTeb(); +} + +// We use assembly for a fast thread id on the main platforms. The TLS layout depends on +// both the OS and libc implementation so we use specific tests for each main platform. +// If you test on another platform and it works please send a PR :-) +// see also https://akkadia.org/drepper/tls.pdf for more info on the TLS register. +#elif defined(__GNUC__) && ( \ + (defined(__GLIBC__) && (defined(__x86_64__) || defined(__i386__) || defined(__arm__) || defined(__aarch64__))) \ + || (defined(__APPLE__) && (defined(__x86_64__) || defined(__aarch64__))) \ + || (defined(__BIONIC__) && (defined(__x86_64__) || defined(__i386__) || defined(__arm__) || defined(__aarch64__))) \ + || (defined(__FreeBSD__) && (defined(__x86_64__) || defined(__i386__) || defined(__aarch64__))) \ + || (defined(__OpenBSD__) && (defined(__x86_64__) || defined(__i386__) || defined(__aarch64__))) \ + ) + +static inline void* mi_prim_tls_slot(size_t slot) mi_attr_noexcept { + void* res; + const size_t ofs = (slot*sizeof(void*)); + #if defined(__i386__) + __asm__("movl %%gs:%1, %0" : "=r" (res) : "m" (*((void**)ofs)) : ); // x86 32-bit always uses GS + #elif defined(__APPLE__) && defined(__x86_64__) + __asm__("movq %%gs:%1, %0" : "=r" (res) : "m" (*((void**)ofs)) : ); // x86_64 macOSX uses GS + #elif defined(__x86_64__) && (MI_INTPTR_SIZE==4) + __asm__("movl %%fs:%1, %0" : "=r" (res) : "m" (*((void**)ofs)) : ); // x32 ABI + #elif defined(__x86_64__) + __asm__("movq %%fs:%1, %0" : "=r" (res) : "m" (*((void**)ofs)) : ); // x86_64 Linux, BSD uses FS + #elif defined(__arm__) + void** tcb; MI_UNUSED(ofs); + __asm__ volatile ("mrc p15, 0, %0, c13, c0, 3\nbic %0, %0, #3" : "=r" (tcb)); + res = tcb[slot]; + #elif defined(__aarch64__) + void** tcb; MI_UNUSED(ofs); + #if defined(__APPLE__) // M1, issue #343 + __asm__ volatile ("mrs %0, tpidrro_el0\nbic %0, %0, #7" : "=r" (tcb)); + #else + __asm__ volatile ("mrs %0, tpidr_el0" : "=r" (tcb)); + #endif + res = tcb[slot]; + #endif + return res; +} + +// setting a tls slot is only used on macOS for now +static inline void mi_prim_tls_slot_set(size_t slot, void* value) mi_attr_noexcept { + const size_t ofs = (slot*sizeof(void*)); + #if defined(__i386__) + __asm__("movl %1,%%gs:%0" : "=m" (*((void**)ofs)) : "rn" (value) : ); // 32-bit always uses GS + #elif defined(__APPLE__) && defined(__x86_64__) + __asm__("movq %1,%%gs:%0" : "=m" (*((void**)ofs)) : "rn" (value) : ); // x86_64 macOS uses GS + #elif defined(__x86_64__) && (MI_INTPTR_SIZE==4) + __asm__("movl %1,%%fs:%0" : "=m" (*((void**)ofs)) : "rn" (value) : ); // x32 ABI + #elif defined(__x86_64__) + __asm__("movq %1,%%fs:%0" : "=m" (*((void**)ofs)) : "rn" (value) : ); // x86_64 Linux, BSD uses FS + #elif defined(__arm__) + void** tcb; MI_UNUSED(ofs); + __asm__ volatile ("mrc p15, 0, %0, c13, c0, 3\nbic %0, %0, #3" : "=r" (tcb)); + tcb[slot] = value; + #elif defined(__aarch64__) + void** tcb; MI_UNUSED(ofs); + #if defined(__APPLE__) // M1, issue #343 + __asm__ volatile ("mrs %0, tpidrro_el0\nbic %0, %0, #7" : "=r" (tcb)); + #else + __asm__ volatile ("mrs %0, tpidr_el0" : "=r" (tcb)); + #endif + tcb[slot] = value; + #endif +} + +static inline mi_threadid_t _mi_prim_thread_id(void) mi_attr_noexcept { + #if defined(__BIONIC__) + // issue #384, #495: on the Bionic libc (Android), slot 1 is the thread id + // see: https://github.com/aosp-mirror/platform_bionic/blob/c44b1d0676ded732df4b3b21c5f798eacae93228/libc/platform/bionic/tls_defines.h#L86 + return (uintptr_t)mi_prim_tls_slot(1); + #else + // in all our other targets, slot 0 is the thread id + // glibc: https://sourceware.org/git/?p=glibc.git;a=blob_plain;f=sysdeps/x86_64/nptl/tls.h + // apple: https://github.com/apple/darwin-xnu/blob/main/libsyscall/os/tsd.h#L36 + return (uintptr_t)mi_prim_tls_slot(0); + #endif +} + +#else + +// otherwise use portable C, taking the address of a thread local variable (this is still very fast on most platforms). +static inline mi_threadid_t _mi_prim_thread_id(void) mi_attr_noexcept { + return (uintptr_t)&_mi_heap_default; +} + +#endif + + + +/* ---------------------------------------------------------------------------------------- +The thread local default heap: `_mi_prim_get_default_heap()` +This is inlined here as it is on the fast path for allocation functions. + +On most platforms (Windows, Linux, FreeBSD, NetBSD, etc), this just returns a +__thread local variable (`_mi_heap_default`). With the initial-exec TLS model this ensures +that the storage will always be available (allocated on the thread stacks). + +On some platforms though we cannot use that when overriding `malloc` since the underlying +TLS implementation (or the loader) will call itself `malloc` on a first access and recurse. +We try to circumvent this in an efficient way: +- macOSX : we use an unused TLS slot from the OS allocated slots (MI_TLS_SLOT). On OSX, the + loader itself calls `malloc` even before the modules are initialized. +- OpenBSD: we use an unused slot from the pthread block (MI_TLS_PTHREAD_SLOT_OFS). +- DragonFly: defaults are working but seem slow compared to freeBSD (see PR #323) +------------------------------------------------------------------------------------------- */ + +static inline mi_heap_t* mi_prim_get_default_heap(void); + +#if defined(MI_MALLOC_OVERRIDE) +#if defined(__APPLE__) // macOS + #define MI_TLS_SLOT 89 // seems unused? + // #define MI_TLS_RECURSE_GUARD 1 + // other possible unused ones are 9, 29, __PTK_FRAMEWORK_JAVASCRIPTCORE_KEY4 (94), __PTK_FRAMEWORK_GC_KEY9 (112) and __PTK_FRAMEWORK_OLDGC_KEY9 (89) + // see +#elif defined(__OpenBSD__) + // use end bytes of a name; goes wrong if anyone uses names > 23 characters (ptrhread specifies 16) + // see + #define MI_TLS_PTHREAD_SLOT_OFS (6*sizeof(int) + 4*sizeof(void*) + 24) + // #elif defined(__DragonFly__) + // #warning "mimalloc is not working correctly on DragonFly yet." + // #define MI_TLS_PTHREAD_SLOT_OFS (4 + 1*sizeof(void*)) // offset `uniqueid` (also used by gdb?) +#elif defined(__ANDROID__) + // See issue #381 + #define MI_TLS_PTHREAD +#endif +#endif + + +#if defined(MI_TLS_SLOT) + +static inline mi_heap_t* mi_prim_get_default_heap(void) { + mi_heap_t* heap = (mi_heap_t*)mi_prim_tls_slot(MI_TLS_SLOT); + if mi_unlikely(heap == NULL) { + #ifdef __GNUC__ + __asm(""); // prevent conditional load of the address of _mi_heap_empty + #endif + heap = (mi_heap_t*)&_mi_heap_empty; + } + return heap; +} + +#elif defined(MI_TLS_PTHREAD_SLOT_OFS) + +static inline mi_heap_t** mi_prim_tls_pthread_heap_slot(void) { + pthread_t self = pthread_self(); + #if defined(__DragonFly__) + if (self==NULL) return NULL; + #endif + return (mi_heap_t**)((uint8_t*)self + MI_TLS_PTHREAD_SLOT_OFS); +} + +static inline mi_heap_t* mi_prim_get_default_heap(void) { + mi_heap_t** pheap = mi_prim_tls_pthread_heap_slot(); + if mi_unlikely(pheap == NULL) return _mi_heap_main_get(); + mi_heap_t* heap = *pheap; + if mi_unlikely(heap == NULL) return (mi_heap_t*)&_mi_heap_empty; + return heap; +} + +#elif defined(MI_TLS_PTHREAD) + +extern pthread_key_t _mi_heap_default_key; +static inline mi_heap_t* mi_prim_get_default_heap(void) { + mi_heap_t* heap = (mi_unlikely(_mi_heap_default_key == (pthread_key_t)(-1)) ? _mi_heap_main_get() : (mi_heap_t*)pthread_getspecific(_mi_heap_default_key)); + return (mi_unlikely(heap == NULL) ? (mi_heap_t*)&_mi_heap_empty : heap); +} + +#else // default using a thread local variable; used on most platforms. + +static inline mi_heap_t* mi_prim_get_default_heap(void) { + #if defined(MI_TLS_RECURSE_GUARD) + if (mi_unlikely(!_mi_process_is_initialized)) return _mi_heap_main_get(); + #endif + return _mi_heap_default; +} + +#endif // mi_prim_get_default_heap() + + + +#endif // MIMALLOC_PRIM_H diff --git a/compat/mimalloc/mimalloc/track.h b/compat/mimalloc/mimalloc/track.h new file mode 100644 index 00000000000000..fa1a048d846a9c --- /dev/null +++ b/compat/mimalloc/mimalloc/track.h @@ -0,0 +1,147 @@ +/* ---------------------------------------------------------------------------- +Copyright (c) 2018-2023, Microsoft Research, Daan Leijen +This is free software; you can redistribute it and/or modify it under the +terms of the MIT license. A copy of the license can be found in the file +"LICENSE" at the root of this distribution. +-----------------------------------------------------------------------------*/ +#pragma once +#ifndef MIMALLOC_TRACK_H +#define MIMALLOC_TRACK_H + +/* ------------------------------------------------------------------------------------------------------ +Track memory ranges with macros for tools like Valgrind address sanitizer, or other memory checkers. +These can be defined for tracking allocation: + + #define mi_track_malloc_size(p,reqsize,size,zero) + #define mi_track_free_size(p,_size) + +The macros are set up such that the size passed to `mi_track_free_size` +always matches the size of `mi_track_malloc_size`. (currently, `size == mi_usable_size(p)`). +The `reqsize` is what the user requested, and `size >= reqsize`. +The `size` is either byte precise (and `size==reqsize`) if `MI_PADDING` is enabled, +or otherwise it is the usable block size which may be larger than the original request. +Use `_mi_block_size_of(void* p)` to get the full block size that was allocated (including padding etc). +The `zero` parameter is `true` if the allocated block is zero initialized. + +Optional: + + #define mi_track_align(p,alignedp,offset,size) + #define mi_track_resize(p,oldsize,newsize) + #define mi_track_init() + +The `mi_track_align` is called right after a `mi_track_malloc` for aligned pointers in a block. +The corresponding `mi_track_free` still uses the block start pointer and original size (corresponding to the `mi_track_malloc`). +The `mi_track_resize` is currently unused but could be called on reallocations within a block. +`mi_track_init` is called at program start. + +The following macros are for tools like asan and valgrind to track whether memory is +defined, undefined, or not accessible at all: + + #define mi_track_mem_defined(p,size) + #define mi_track_mem_undefined(p,size) + #define mi_track_mem_noaccess(p,size) + +-------------------------------------------------------------------------------------------------------*/ + +#if MI_TRACK_VALGRIND +// valgrind tool + +#define MI_TRACK_ENABLED 1 +#define MI_TRACK_HEAP_DESTROY 1 // track free of individual blocks on heap_destroy +#define MI_TRACK_TOOL "valgrind" + +#include +#include + +#define mi_track_malloc_size(p,reqsize,size,zero) VALGRIND_MALLOCLIKE_BLOCK(p,size,MI_PADDING_SIZE /*red zone*/,zero) +#define mi_track_free_size(p,_size) VALGRIND_FREELIKE_BLOCK(p,MI_PADDING_SIZE /*red zone*/) +#define mi_track_resize(p,oldsize,newsize) VALGRIND_RESIZEINPLACE_BLOCK(p,oldsize,newsize,MI_PADDING_SIZE /*red zone*/) +#define mi_track_mem_defined(p,size) VALGRIND_MAKE_MEM_DEFINED(p,size) +#define mi_track_mem_undefined(p,size) VALGRIND_MAKE_MEM_UNDEFINED(p,size) +#define mi_track_mem_noaccess(p,size) VALGRIND_MAKE_MEM_NOACCESS(p,size) + +#elif MI_TRACK_ASAN +// address sanitizer + +#define MI_TRACK_ENABLED 1 +#define MI_TRACK_HEAP_DESTROY 0 +#define MI_TRACK_TOOL "asan" + +#include + +#define mi_track_malloc_size(p,reqsize,size,zero) ASAN_UNPOISON_MEMORY_REGION(p,size) +#define mi_track_free_size(p,size) ASAN_POISON_MEMORY_REGION(p,size) +#define mi_track_mem_defined(p,size) ASAN_UNPOISON_MEMORY_REGION(p,size) +#define mi_track_mem_undefined(p,size) ASAN_UNPOISON_MEMORY_REGION(p,size) +#define mi_track_mem_noaccess(p,size) ASAN_POISON_MEMORY_REGION(p,size) + +#elif MI_TRACK_ETW +// windows event tracing + +#define MI_TRACK_ENABLED 1 +#define MI_TRACK_HEAP_DESTROY 1 +#define MI_TRACK_TOOL "ETW" + +#define WIN32_LEAN_AND_MEAN +#include +#include "../src/prim/windows/etw.h" + +#define mi_track_init() EventRegistermicrosoft_windows_mimalloc(); +#define mi_track_malloc_size(p,reqsize,size,zero) EventWriteETW_MI_ALLOC((UINT64)(p), size) +#define mi_track_free_size(p,size) EventWriteETW_MI_FREE((UINT64)(p), size) + +#else +// no tracking + +#define MI_TRACK_ENABLED 0 +#define MI_TRACK_HEAP_DESTROY 0 +#define MI_TRACK_TOOL "none" + +#define mi_track_malloc_size(p,reqsize,size,zero) +#define mi_track_free_size(p,_size) + +#endif + +// ------------------- +// Utility definitions + +#ifndef mi_track_resize +#define mi_track_resize(p,oldsize,newsize) mi_track_free_size(p,oldsize); mi_track_malloc(p,newsize,false) +#endif + +#ifndef mi_track_align +#define mi_track_align(p,alignedp,offset,size) mi_track_mem_noaccess(p,offset) +#endif + +#ifndef mi_track_init +#define mi_track_init() +#endif + +#ifndef mi_track_mem_defined +#define mi_track_mem_defined(p,size) +#endif + +#ifndef mi_track_mem_undefined +#define mi_track_mem_undefined(p,size) +#endif + +#ifndef mi_track_mem_noaccess +#define mi_track_mem_noaccess(p,size) +#endif + + +#if MI_PADDING +#define mi_track_malloc(p,reqsize,zero) \ + if ((p)!=NULL) { \ + mi_assert_internal(mi_usable_size(p)==(reqsize)); \ + mi_track_malloc_size(p,reqsize,reqsize,zero); \ + } +#else +#define mi_track_malloc(p,reqsize,zero) \ + if ((p)!=NULL) { \ + mi_assert_internal(mi_usable_size(p)>=(reqsize)); \ + mi_track_malloc_size(p,reqsize,mi_usable_size(p),zero); \ + } +#endif + +#endif diff --git a/compat/mimalloc/mimalloc/types.h b/compat/mimalloc/mimalloc/types.h new file mode 100644 index 00000000000000..7616f37e4b978f --- /dev/null +++ b/compat/mimalloc/mimalloc/types.h @@ -0,0 +1,670 @@ +/* ---------------------------------------------------------------------------- +Copyright (c) 2018-2023, Microsoft Research, Daan Leijen +This is free software; you can redistribute it and/or modify it under the +terms of the MIT license. A copy of the license can be found in the file +"LICENSE" at the root of this distribution. +-----------------------------------------------------------------------------*/ +#pragma once +#ifndef MIMALLOC_TYPES_H +#define MIMALLOC_TYPES_H + +// -------------------------------------------------------------------------- +// This file contains the main type definitions for mimalloc: +// mi_heap_t : all data for a thread-local heap, contains +// lists of all managed heap pages. +// mi_segment_t : a larger chunk of memory (32GiB) from where pages +// are allocated. +// mi_page_t : a mimalloc page (usually 64KiB or 512KiB) from +// where objects are allocated. +// -------------------------------------------------------------------------- + + +#include // ptrdiff_t +#include // uintptr_t, uint16_t, etc +#include "mimalloc/atomic.h" // _Atomic + +#ifdef _MSC_VER +#pragma warning(disable:4214) // bitfield is not int +#endif + +// Minimal alignment necessary. On most platforms 16 bytes are needed +// due to SSE registers for example. This must be at least `sizeof(void*)` +#ifndef MI_MAX_ALIGN_SIZE +#define MI_MAX_ALIGN_SIZE 16 // sizeof(max_align_t) +#endif + +// ------------------------------------------------------ +// Variants +// ------------------------------------------------------ + +// Define NDEBUG in the release version to disable assertions. +// #define NDEBUG + +// Define MI_TRACK_ to enable tracking support +// #define MI_TRACK_VALGRIND 1 +// #define MI_TRACK_ASAN 1 +// #define MI_TRACK_ETW 1 + +// Define MI_STAT as 1 to maintain statistics; set it to 2 to have detailed statistics (but costs some performance). +// #define MI_STAT 1 + +// Define MI_SECURE to enable security mitigations +// #define MI_SECURE 1 // guard page around metadata +// #define MI_SECURE 2 // guard page around each mimalloc page +// #define MI_SECURE 3 // encode free lists (detect corrupted free list (buffer overflow), and invalid pointer free) +// #define MI_SECURE 4 // checks for double free. (may be more expensive) + +#if !defined(MI_SECURE) +#define MI_SECURE 0 +#endif + +// Define MI_DEBUG for debug mode +// #define MI_DEBUG 1 // basic assertion checks and statistics, check double free, corrupted free list, and invalid pointer free. +// #define MI_DEBUG 2 // + internal assertion checks +// #define MI_DEBUG 3 // + extensive internal invariant checking (cmake -DMI_DEBUG_FULL=ON) +#if !defined(MI_DEBUG) +#if !defined(NDEBUG) || defined(_DEBUG) +#define MI_DEBUG 2 +#else +#define MI_DEBUG 0 +#endif +#endif + +// Reserve extra padding at the end of each block to be more resilient against heap block overflows. +// The padding can detect buffer overflow on free. +#if !defined(MI_PADDING) && (MI_SECURE>=3 || MI_DEBUG>=1 || (MI_TRACK_VALGRIND || MI_TRACK_ASAN || MI_TRACK_ETW)) +#define MI_PADDING 1 +#endif + +// Check padding bytes; allows byte-precise buffer overflow detection +#if !defined(MI_PADDING_CHECK) && MI_PADDING && (MI_SECURE>=3 || MI_DEBUG>=1) +#define MI_PADDING_CHECK 1 +#endif + + +// Encoded free lists allow detection of corrupted free lists +// and can detect buffer overflows, modify after free, and double `free`s. +#if (MI_SECURE>=3 || MI_DEBUG>=1) +#define MI_ENCODE_FREELIST 1 +#endif + + +// We used to abandon huge pages but to eagerly deallocate if freed from another thread, +// but that makes it not possible to visit them during a heap walk or include them in a +// `mi_heap_destroy`. We therefore instead reset/decommit the huge blocks if freed from +// another thread so most memory is available until it gets properly freed by the owning thread. +// #define MI_HUGE_PAGE_ABANDON 1 + + +// ------------------------------------------------------ +// Platform specific values +// ------------------------------------------------------ + +// ------------------------------------------------------ +// Size of a pointer. +// We assume that `sizeof(void*)==sizeof(intptr_t)` +// and it holds for all platforms we know of. +// +// However, the C standard only requires that: +// p == (void*)((intptr_t)p)) +// but we also need: +// i == (intptr_t)((void*)i) +// or otherwise one might define an intptr_t type that is larger than a pointer... +// ------------------------------------------------------ + +#if INTPTR_MAX > INT64_MAX +# define MI_INTPTR_SHIFT (4) // assume 128-bit (as on arm CHERI for example) +#elif INTPTR_MAX == INT64_MAX +# define MI_INTPTR_SHIFT (3) +#elif INTPTR_MAX == INT32_MAX +# define MI_INTPTR_SHIFT (2) +#else +#error platform pointers must be 32, 64, or 128 bits +#endif + +#if SIZE_MAX == UINT64_MAX +# define MI_SIZE_SHIFT (3) +typedef int64_t mi_ssize_t; +#elif SIZE_MAX == UINT32_MAX +# define MI_SIZE_SHIFT (2) +typedef int32_t mi_ssize_t; +#else +#error platform objects must be 32 or 64 bits +#endif + +#if (SIZE_MAX/2) > LONG_MAX +# define MI_ZU(x) x##ULL +# define MI_ZI(x) x##LL +#else +# define MI_ZU(x) x##UL +# define MI_ZI(x) x##L +#endif + +#define MI_INTPTR_SIZE (1< 4 +#define MI_SEGMENT_SHIFT ( 9 + MI_SEGMENT_SLICE_SHIFT) // 32MiB +#else +#define MI_SEGMENT_SHIFT ( 7 + MI_SEGMENT_SLICE_SHIFT) // 4MiB on 32-bit +#endif + +#define MI_SMALL_PAGE_SHIFT (MI_SEGMENT_SLICE_SHIFT) // 64KiB +#define MI_MEDIUM_PAGE_SHIFT ( 3 + MI_SMALL_PAGE_SHIFT) // 512KiB + + +// Derived constants +#define MI_SEGMENT_SIZE (MI_ZU(1)<= 655360) +#error "mimalloc internal: define more bins" +#endif + +// Maximum slice offset (15) +#define MI_MAX_SLICE_OFFSET ((MI_ALIGNMENT_MAX / MI_SEGMENT_SLICE_SIZE) - 1) + +// Used as a special value to encode block sizes in 32 bits. +#define MI_HUGE_BLOCK_SIZE ((uint32_t)(2*MI_GiB)) + +// blocks up to this size are always allocated aligned +#define MI_MAX_ALIGN_GUARANTEE (8*MI_MAX_ALIGN_SIZE) + +// Alignments over MI_ALIGNMENT_MAX are allocated in dedicated huge page segments +#define MI_ALIGNMENT_MAX (MI_SEGMENT_SIZE >> 1) + + +// ------------------------------------------------------ +// Mimalloc pages contain allocated blocks +// ------------------------------------------------------ + +// The free lists use encoded next fields +// (Only actually encodes when MI_ENCODED_FREELIST is defined.) +typedef uintptr_t mi_encoded_t; + +// thread id's +typedef size_t mi_threadid_t; + +// free lists contain blocks +typedef struct mi_block_s { + mi_encoded_t next; +} mi_block_t; + + +// The delayed flags are used for efficient multi-threaded free-ing +typedef enum mi_delayed_e { + MI_USE_DELAYED_FREE = 0, // push on the owning heap thread delayed list + MI_DELAYED_FREEING = 1, // temporary: another thread is accessing the owning heap + MI_NO_DELAYED_FREE = 2, // optimize: push on page local thread free queue if another block is already in the heap thread delayed free list + MI_NEVER_DELAYED_FREE = 3 // sticky, only resets on page reclaim +} mi_delayed_t; + + +// The `in_full` and `has_aligned` page flags are put in a union to efficiently +// test if both are false (`full_aligned == 0`) in the `mi_free` routine. +#if !MI_TSAN +typedef union mi_page_flags_s { + uint8_t full_aligned; + struct { + uint8_t in_full : 1; + uint8_t has_aligned : 1; + } x; +} mi_page_flags_t; +#else +// under thread sanitizer, use a byte for each flag to suppress warning, issue #130 +typedef union mi_page_flags_s { + uint16_t full_aligned; + struct { + uint8_t in_full; + uint8_t has_aligned; + } x; +} mi_page_flags_t; +#endif + +// Thread free list. +// We use the bottom 2 bits of the pointer for mi_delayed_t flags +typedef uintptr_t mi_thread_free_t; + +// A page contains blocks of one specific size (`block_size`). +// Each page has three list of free blocks: +// `free` for blocks that can be allocated, +// `local_free` for freed blocks that are not yet available to `mi_malloc` +// `thread_free` for freed blocks by other threads +// The `local_free` and `thread_free` lists are migrated to the `free` list +// when it is exhausted. The separate `local_free` list is necessary to +// implement a monotonic heartbeat. The `thread_free` list is needed for +// avoiding atomic operations in the common case. +// +// +// `used - |thread_free|` == actual blocks that are in use (alive) +// `used - |thread_free| + |free| + |local_free| == capacity` +// +// We don't count `freed` (as |free|) but use `used` to reduce +// the number of memory accesses in the `mi_page_all_free` function(s). +// +// Notes: +// - Access is optimized for `mi_free` and `mi_page_alloc` (in `alloc.c`) +// - Using `uint16_t` does not seem to slow things down +// - The size is 8 words on 64-bit which helps the page index calculations +// (and 10 words on 32-bit, and encoded free lists add 2 words. Sizes 10 +// and 12 are still good for address calculation) +// - To limit the structure size, the `xblock_size` is 32-bits only; for +// blocks > MI_HUGE_BLOCK_SIZE the size is determined from the segment page size +// - `thread_free` uses the bottom bits as a delayed-free flags to optimize +// concurrent frees where only the first concurrent free adds to the owning +// heap `thread_delayed_free` list (see `alloc.c:mi_free_block_mt`). +// The invariant is that no-delayed-free is only set if there is +// at least one block that will be added, or as already been added, to +// the owning heap `thread_delayed_free` list. This guarantees that pages +// will be freed correctly even if only other threads free blocks. +typedef struct mi_page_s { + // "owned" by the segment + uint32_t slice_count; // slices in this page (0 if not a page) + uint32_t slice_offset; // distance from the actual page data slice (0 if a page) + uint8_t is_committed : 1; // `true` if the page virtual memory is committed + uint8_t is_zero_init : 1; // `true` if the page was initially zero initialized + + // layout like this to optimize access in `mi_malloc` and `mi_free` + uint16_t capacity; // number of blocks committed, must be the first field, see `segment.c:page_clear` + uint16_t reserved; // number of blocks reserved in memory + mi_page_flags_t flags; // `in_full` and `has_aligned` flags (8 bits) + uint8_t free_is_zero : 1; // `true` if the blocks in the free list are zero initialized + uint8_t retire_expire : 7; // expiration count for retired blocks + + mi_block_t* free; // list of available free blocks (`malloc` allocates from this list) + uint32_t used; // number of blocks in use (including blocks in `local_free` and `thread_free`) + uint32_t xblock_size; // size available in each block (always `>0`) + mi_block_t* local_free; // list of deferred free blocks by this thread (migrates to `free`) + + #if (MI_ENCODE_FREELIST || MI_PADDING) + uintptr_t keys[2]; // two random keys to encode the free lists (see `_mi_block_next`) or padding canary + #endif + + _Atomic(mi_thread_free_t) xthread_free; // list of deferred free blocks freed by other threads + _Atomic(uintptr_t) xheap; + + struct mi_page_s* next; // next page owned by this thread with the same `block_size` + struct mi_page_s* prev; // previous page owned by this thread with the same `block_size` + + // 64-bit 9 words, 32-bit 12 words, (+2 for secure) + #if MI_INTPTR_SIZE==8 + uintptr_t padding[1]; + #endif +} mi_page_t; + + + +// ------------------------------------------------------ +// Mimalloc segments contain mimalloc pages +// ------------------------------------------------------ + +typedef enum mi_page_kind_e { + MI_PAGE_SMALL, // small blocks go into 64KiB pages inside a segment + MI_PAGE_MEDIUM, // medium blocks go into medium pages inside a segment + MI_PAGE_LARGE, // larger blocks go into a page of just one block + MI_PAGE_HUGE, // huge blocks (> 16 MiB) are put into a single page in a single segment. +} mi_page_kind_t; + +typedef enum mi_segment_kind_e { + MI_SEGMENT_NORMAL, // MI_SEGMENT_SIZE size with pages inside. + MI_SEGMENT_HUGE, // > MI_LARGE_SIZE_MAX segment with just one huge page inside. +} mi_segment_kind_t; + +// ------------------------------------------------------ +// A segment holds a commit mask where a bit is set if +// the corresponding MI_COMMIT_SIZE area is committed. +// The MI_COMMIT_SIZE must be a multiple of the slice +// size. If it is equal we have the most fine grained +// decommit (but setting it higher can be more efficient). +// The MI_MINIMAL_COMMIT_SIZE is the minimal amount that will +// be committed in one go which can be set higher than +// MI_COMMIT_SIZE for efficiency (while the decommit mask +// is still tracked in fine-grained MI_COMMIT_SIZE chunks) +// ------------------------------------------------------ + +#define MI_MINIMAL_COMMIT_SIZE (1*MI_SEGMENT_SLICE_SIZE) +#define MI_COMMIT_SIZE (MI_SEGMENT_SLICE_SIZE) // 64KiB +#define MI_COMMIT_MASK_BITS (MI_SEGMENT_SIZE / MI_COMMIT_SIZE) +#define MI_COMMIT_MASK_FIELD_BITS MI_SIZE_BITS +#define MI_COMMIT_MASK_FIELD_COUNT (MI_COMMIT_MASK_BITS / MI_COMMIT_MASK_FIELD_BITS) + +#if (MI_COMMIT_MASK_BITS != (MI_COMMIT_MASK_FIELD_COUNT * MI_COMMIT_MASK_FIELD_BITS)) +#error "the segment size must be exactly divisible by the (commit size * size_t bits)" +#endif + +typedef struct mi_commit_mask_s { + size_t mask[MI_COMMIT_MASK_FIELD_COUNT]; +} mi_commit_mask_t; + +typedef mi_page_t mi_slice_t; +typedef int64_t mi_msecs_t; + + +// Memory can reside in arena's, direct OS allocated, or statically allocated. The memid keeps track of this. +typedef enum mi_memkind_e { + MI_MEM_NONE, // not allocated + MI_MEM_EXTERNAL, // not owned by mimalloc but provided externally (via `mi_manage_os_memory` for example) + MI_MEM_STATIC, // allocated in a static area and should not be freed (for arena meta data for example) + MI_MEM_OS, // allocated from the OS + MI_MEM_OS_HUGE, // allocated as huge os pages + MI_MEM_OS_REMAP, // allocated in a remapable area (i.e. using `mremap`) + MI_MEM_ARENA // allocated from an arena (the usual case) +} mi_memkind_t; + +static inline bool mi_memkind_is_os(mi_memkind_t memkind) { + return (memkind >= MI_MEM_OS && memkind <= MI_MEM_OS_REMAP); +} + +typedef struct mi_memid_os_info { + void* base; // actual base address of the block (used for offset aligned allocations) + size_t alignment; // alignment at allocation +} mi_memid_os_info_t; + +typedef struct mi_memid_arena_info { + size_t block_index; // index in the arena + mi_arena_id_t id; // arena id (>= 1) + bool is_exclusive; // the arena can only be used for specific arena allocations +} mi_memid_arena_info_t; + +typedef struct mi_memid_s { + union { + mi_memid_os_info_t os; // only used for MI_MEM_OS + mi_memid_arena_info_t arena; // only used for MI_MEM_ARENA + } mem; + bool is_pinned; // `true` if we cannot decommit/reset/protect in this memory (e.g. when allocated using large OS pages) + bool initially_committed;// `true` if the memory was originally allocated as committed + bool initially_zero; // `true` if the memory was originally zero initialized + mi_memkind_t memkind; +} mi_memid_t; + + +// Segments are large allocated memory blocks (8mb on 64 bit) from +// the OS. Inside segments we allocated fixed size _pages_ that +// contain blocks. +typedef struct mi_segment_s { + // constant fields + mi_memid_t memid; // memory id for arena allocation + bool allow_decommit; + bool allow_purge; + size_t segment_size; + + // segment fields + mi_msecs_t purge_expire; + mi_commit_mask_t purge_mask; + mi_commit_mask_t commit_mask; + + _Atomic(struct mi_segment_s*) abandoned_next; + + // from here is zero initialized + struct mi_segment_s* next; // the list of freed segments in the cache (must be first field, see `segment.c:mi_segment_init`) + + size_t abandoned; // abandoned pages (i.e. the original owning thread stopped) (`abandoned <= used`) + size_t abandoned_visits; // count how often this segment is visited in the abandoned list (to force reclaim it it is too long) + size_t used; // count of pages in use + uintptr_t cookie; // verify addresses in debug mode: `mi_ptr_cookie(segment) == segment->cookie` + + size_t segment_slices; // for huge segments this may be different from `MI_SLICES_PER_SEGMENT` + size_t segment_info_slices; // initial slices we are using segment info and possible guard pages. + + // layout like this to optimize access in `mi_free` + mi_segment_kind_t kind; + size_t slice_entries; // entries in the `slices` array, at most `MI_SLICES_PER_SEGMENT` + _Atomic(mi_threadid_t) thread_id; // unique id of the thread owning this segment + + mi_slice_t slices[MI_SLICES_PER_SEGMENT+1]; // one more for huge blocks with large alignment +} mi_segment_t; + + +// ------------------------------------------------------ +// Heaps +// Provide first-class heaps to allocate from. +// A heap just owns a set of pages for allocation and +// can only be allocate/reallocate from the thread that created it. +// Freeing blocks can be done from any thread though. +// Per thread, the segments are shared among its heaps. +// Per thread, there is always a default heap that is +// used for allocation; it is initialized to statically +// point to an empty heap to avoid initialization checks +// in the fast path. +// ------------------------------------------------------ + +// Thread local data +typedef struct mi_tld_s mi_tld_t; + +// Pages of a certain block size are held in a queue. +typedef struct mi_page_queue_s { + mi_page_t* first; + mi_page_t* last; + size_t block_size; +} mi_page_queue_t; + +#define MI_BIN_FULL (MI_BIN_HUGE+1) + +// Random context +typedef struct mi_random_cxt_s { + uint32_t input[16]; + uint32_t output[16]; + int output_available; + bool weak; +} mi_random_ctx_t; + + +// In debug mode there is a padding structure at the end of the blocks to check for buffer overflows +#if (MI_PADDING) +typedef struct mi_padding_s { + uint32_t canary; // encoded block value to check validity of the padding (in case of overflow) + uint32_t delta; // padding bytes before the block. (mi_usable_size(p) - delta == exact allocated bytes) +} mi_padding_t; +#define MI_PADDING_SIZE (sizeof(mi_padding_t)) +#define MI_PADDING_WSIZE ((MI_PADDING_SIZE + MI_INTPTR_SIZE - 1) / MI_INTPTR_SIZE) +#else +#define MI_PADDING_SIZE 0 +#define MI_PADDING_WSIZE 0 +#endif + +#define MI_PAGES_DIRECT (MI_SMALL_WSIZE_MAX + MI_PADDING_WSIZE + 1) + + +// A heap owns a set of pages. +struct mi_heap_s { + mi_tld_t* tld; + mi_page_t* pages_free_direct[MI_PAGES_DIRECT]; // optimize: array where every entry points a page with possibly free blocks in the corresponding queue for that size. + mi_page_queue_t pages[MI_BIN_FULL + 1]; // queue of pages for each size class (or "bin") + _Atomic(mi_block_t*) thread_delayed_free; + mi_threadid_t thread_id; // thread this heap belongs too + mi_arena_id_t arena_id; // arena id if the heap belongs to a specific arena (or 0) + uintptr_t cookie; // random cookie to verify pointers (see `_mi_ptr_cookie`) + uintptr_t keys[2]; // two random keys used to encode the `thread_delayed_free` list + mi_random_ctx_t random; // random number context used for secure allocation + size_t page_count; // total number of pages in the `pages` queues. + size_t page_retired_min; // smallest retired index (retired pages are fully free, but still in the page queues) + size_t page_retired_max; // largest retired index into the `pages` array. + mi_heap_t* next; // list of heaps per thread + bool no_reclaim; // `true` if this heap should not reclaim abandoned pages +}; + + + +// ------------------------------------------------------ +// Debug +// ------------------------------------------------------ + +#if !defined(MI_DEBUG_UNINIT) +#define MI_DEBUG_UNINIT (0xD0) +#endif +#if !defined(MI_DEBUG_FREED) +#define MI_DEBUG_FREED (0xDF) +#endif +#if !defined(MI_DEBUG_PADDING) +#define MI_DEBUG_PADDING (0xDE) +#endif + +#if (MI_DEBUG) +// use our own assertion to print without memory allocation +void _mi_assert_fail(const char* assertion, const char* fname, unsigned int line, const char* func ); +#define mi_assert(expr) ((expr) ? (void)0 : _mi_assert_fail(#expr,__FILE__,__LINE__,__func__)) +#else +#define mi_assert(x) +#endif + +#if (MI_DEBUG>1) +#define mi_assert_internal mi_assert +#else +#define mi_assert_internal(x) +#endif + +#if (MI_DEBUG>2) +#define mi_assert_expensive mi_assert +#else +#define mi_assert_expensive(x) +#endif + +// ------------------------------------------------------ +// Statistics +// ------------------------------------------------------ + +#ifndef MI_STAT +#if (MI_DEBUG>0) +#define MI_STAT 2 +#else +#define MI_STAT 0 +#endif +#endif + +typedef struct mi_stat_count_s { + int64_t allocated; + int64_t freed; + int64_t peak; + int64_t current; +} mi_stat_count_t; + +typedef struct mi_stat_counter_s { + int64_t total; + int64_t count; +} mi_stat_counter_t; + +typedef struct mi_stats_s { + mi_stat_count_t segments; + mi_stat_count_t pages; + mi_stat_count_t reserved; + mi_stat_count_t committed; + mi_stat_count_t reset; + mi_stat_count_t purged; + mi_stat_count_t page_committed; + mi_stat_count_t segments_abandoned; + mi_stat_count_t pages_abandoned; + mi_stat_count_t threads; + mi_stat_count_t normal; + mi_stat_count_t huge; + mi_stat_count_t large; + mi_stat_count_t malloc; + mi_stat_count_t segments_cache; + mi_stat_counter_t pages_extended; + mi_stat_counter_t mmap_calls; + mi_stat_counter_t commit_calls; + mi_stat_counter_t reset_calls; + mi_stat_counter_t purge_calls; + mi_stat_counter_t page_no_retire; + mi_stat_counter_t searches; + mi_stat_counter_t normal_count; + mi_stat_counter_t huge_count; + mi_stat_counter_t large_count; +#if MI_STAT>1 + mi_stat_count_t normal_bins[MI_BIN_HUGE+1]; +#endif +} mi_stats_t; + + +void _mi_stat_increase(mi_stat_count_t* stat, size_t amount); +void _mi_stat_decrease(mi_stat_count_t* stat, size_t amount); +void _mi_stat_counter_increase(mi_stat_counter_t* stat, size_t amount); + +#if (MI_STAT) +#define mi_stat_increase(stat,amount) _mi_stat_increase( &(stat), amount) +#define mi_stat_decrease(stat,amount) _mi_stat_decrease( &(stat), amount) +#define mi_stat_counter_increase(stat,amount) _mi_stat_counter_increase( &(stat), amount) +#else +#define mi_stat_increase(stat,amount) (void)0 +#define mi_stat_decrease(stat,amount) (void)0 +#define mi_stat_counter_increase(stat,amount) (void)0 +#endif + +#define mi_heap_stat_counter_increase(heap,stat,amount) mi_stat_counter_increase( (heap)->tld->stats.stat, amount) +#define mi_heap_stat_increase(heap,stat,amount) mi_stat_increase( (heap)->tld->stats.stat, amount) +#define mi_heap_stat_decrease(heap,stat,amount) mi_stat_decrease( (heap)->tld->stats.stat, amount) + +// ------------------------------------------------------ +// Thread Local data +// ------------------------------------------------------ + +// A "span" is is an available range of slices. The span queues keep +// track of slice spans of at most the given `slice_count` (but more than the previous size class). +typedef struct mi_span_queue_s { + mi_slice_t* first; + mi_slice_t* last; + size_t slice_count; +} mi_span_queue_t; + +#define MI_SEGMENT_BIN_MAX (35) // 35 == mi_segment_bin(MI_SLICES_PER_SEGMENT) + +// OS thread local data +typedef struct mi_os_tld_s { + size_t region_idx; // start point for next allocation + mi_stats_t* stats; // points to tld stats +} mi_os_tld_t; + + +// Segments thread local data +typedef struct mi_segments_tld_s { + mi_span_queue_t spans[MI_SEGMENT_BIN_MAX+1]; // free slice spans inside segments + size_t count; // current number of segments; + size_t peak_count; // peak number of segments + size_t current_size; // current size of all segments + size_t peak_size; // peak size of all segments + mi_stats_t* stats; // points to tld stats + mi_os_tld_t* os; // points to os stats +} mi_segments_tld_t; + +// Thread local data +struct mi_tld_s { + unsigned long long heartbeat; // monotonic heartbeat count + bool recurse; // true if deferred was called; used to prevent infinite recursion. + mi_heap_t* heap_backing; // backing heap of this thread (cannot be deleted) + mi_heap_t* heaps; // list of heaps in this thread (so we can abandon all when the thread terminates) + mi_segments_tld_t segments; // segment tld + mi_os_tld_t os; // os tld + mi_stats_t stats; // statistics +}; + +#endif diff --git a/compat/mimalloc/options.c b/compat/mimalloc/options.c new file mode 100644 index 00000000000000..3a3090d9acfc94 --- /dev/null +++ b/compat/mimalloc/options.c @@ -0,0 +1,571 @@ +/* ---------------------------------------------------------------------------- +Copyright (c) 2018-2021, Microsoft Research, Daan Leijen +This is free software; you can redistribute it and/or modify it under the +terms of the MIT license. A copy of the license can be found in the file +"LICENSE" at the root of this distribution. +-----------------------------------------------------------------------------*/ +#include "mimalloc.h" +#include "mimalloc/internal.h" +#include "mimalloc/atomic.h" +#include "mimalloc/prim.h" // mi_prim_out_stderr + +#include // FILE +#include // abort +#include + + +static long mi_max_error_count = 16; // stop outputting errors after this (use < 0 for no limit) +static long mi_max_warning_count = 16; // stop outputting warnings after this (use < 0 for no limit) + +static void mi_add_stderr_output(void); + +int mi_version(void) mi_attr_noexcept { + return MI_MALLOC_VERSION; +} + + +// -------------------------------------------------------- +// Options +// These can be accessed by multiple threads and may be +// concurrently initialized, but an initializing data race +// is ok since they resolve to the same value. +// -------------------------------------------------------- +typedef enum mi_init_e { + UNINIT, // not yet initialized + DEFAULTED, // not found in the environment, use default value + INITIALIZED // found in environment or set explicitly +} mi_init_t; + +typedef struct mi_option_desc_s { + long value; // the value + mi_init_t init; // is it initialized yet? (from the environment) + mi_option_t option; // for debugging: the option index should match the option + const char* name; // option name without `mimalloc_` prefix + const char* legacy_name; // potential legacy option name +} mi_option_desc_t; + +#define MI_OPTION(opt) mi_option_##opt, #opt, NULL +#define MI_OPTION_LEGACY(opt,legacy) mi_option_##opt, #opt, #legacy + +static mi_option_desc_t options[_mi_option_last] = +{ + // stable options + #if MI_DEBUG || defined(MI_SHOW_ERRORS) + { 1, UNINIT, MI_OPTION(show_errors) }, + #else + { 0, UNINIT, MI_OPTION(show_errors) }, + #endif + { 0, UNINIT, MI_OPTION(show_stats) }, + { 0, UNINIT, MI_OPTION(verbose) }, + + // the following options are experimental and not all combinations make sense. + { 1, UNINIT, MI_OPTION(eager_commit) }, // commit per segment directly (4MiB) (but see also `eager_commit_delay`) + { 2, UNINIT, MI_OPTION_LEGACY(arena_eager_commit,eager_region_commit) }, // eager commit arena's? 2 is used to enable this only on an OS that has overcommit (i.e. linux) + { 1, UNINIT, MI_OPTION_LEGACY(purge_decommits,reset_decommits) }, // purge decommits memory (instead of reset) (note: on linux this uses MADV_DONTNEED for decommit) + { 0, UNINIT, MI_OPTION_LEGACY(allow_large_os_pages,large_os_pages) }, // use large OS pages, use only with eager commit to prevent fragmentation of VMA's + { 0, UNINIT, MI_OPTION(reserve_huge_os_pages) }, // per 1GiB huge pages + {-1, UNINIT, MI_OPTION(reserve_huge_os_pages_at) }, // reserve huge pages at node N + { 0, UNINIT, MI_OPTION(reserve_os_memory) }, + { 0, UNINIT, MI_OPTION(deprecated_segment_cache) }, // cache N segments per thread + { 0, UNINIT, MI_OPTION(deprecated_page_reset) }, // reset page memory on free + { 0, UNINIT, MI_OPTION_LEGACY(abandoned_page_purge,abandoned_page_reset) }, // reset free page memory when a thread terminates + { 0, UNINIT, MI_OPTION(deprecated_segment_reset) }, // reset segment memory on free (needs eager commit) +#if defined(__NetBSD__) + { 0, UNINIT, MI_OPTION(eager_commit_delay) }, // the first N segments per thread are not eagerly committed +#else + { 1, UNINIT, MI_OPTION(eager_commit_delay) }, // the first N segments per thread are not eagerly committed (but per page in the segment on demand) +#endif + { 10, UNINIT, MI_OPTION_LEGACY(purge_delay,reset_delay) }, // purge delay in milli-seconds + { 0, UNINIT, MI_OPTION(use_numa_nodes) }, // 0 = use available numa nodes, otherwise use at most N nodes. + { 0, UNINIT, MI_OPTION(limit_os_alloc) }, // 1 = do not use OS memory for allocation (but only reserved arenas) + { 100, UNINIT, MI_OPTION(os_tag) }, // only apple specific for now but might serve more or less related purpose + { 16, UNINIT, MI_OPTION(max_errors) }, // maximum errors that are output + { 16, UNINIT, MI_OPTION(max_warnings) }, // maximum warnings that are output + { 8, UNINIT, MI_OPTION(max_segment_reclaim)}, // max. number of segment reclaims from the abandoned segments per try. + { 0, UNINIT, MI_OPTION(destroy_on_exit)}, // release all OS memory on process exit; careful with dangling pointer or after-exit frees! + #if (MI_INTPTR_SIZE>4) + { 1024L * 1024L, UNINIT, MI_OPTION(arena_reserve) }, // reserve memory N KiB at a time + #else + { 128L * 1024L, UNINIT, MI_OPTION(arena_reserve) }, + #endif + { 10, UNINIT, MI_OPTION(arena_purge_mult) }, // purge delay multiplier for arena's + { 1, UNINIT, MI_OPTION_LEGACY(purge_extend_delay, decommit_extend_delay) }, +}; + +static void mi_option_init(mi_option_desc_t* desc); + +void _mi_options_init(void) { + // called on process load; should not be called before the CRT is initialized! + // (e.g. do not call this from process_init as that may run before CRT initialization) + mi_add_stderr_output(); // now it safe to use stderr for output + for(int i = 0; i < _mi_option_last; i++ ) { + mi_option_t option = (mi_option_t)i; + long l = mi_option_get(option); MI_UNUSED(l); // initialize + // if (option != mi_option_verbose) + { + mi_option_desc_t* desc = &options[option]; + _mi_verbose_message("option '%s': %ld\n", desc->name, desc->value); + } + } + mi_max_error_count = mi_option_get(mi_option_max_errors); + mi_max_warning_count = mi_option_get(mi_option_max_warnings); +} + +mi_decl_nodiscard long mi_option_get(mi_option_t option) { + mi_assert(option >= 0 && option < _mi_option_last); + if (option < 0 || option >= _mi_option_last) return 0; + mi_option_desc_t* desc = &options[option]; + mi_assert(desc->option == option); // index should match the option + if mi_unlikely(desc->init == UNINIT) { + mi_option_init(desc); + } + return desc->value; +} + +mi_decl_nodiscard long mi_option_get_clamp(mi_option_t option, long min, long max) { + long x = mi_option_get(option); + return (x < min ? min : (x > max ? max : x)); +} + +mi_decl_nodiscard size_t mi_option_get_size(mi_option_t option) { + mi_assert_internal(option == mi_option_reserve_os_memory || option == mi_option_arena_reserve); + long x = mi_option_get(option); + return (x < 0 ? 0 : (size_t)x * MI_KiB); +} + +void mi_option_set(mi_option_t option, long value) { + mi_assert(option >= 0 && option < _mi_option_last); + if (option < 0 || option >= _mi_option_last) return; + mi_option_desc_t* desc = &options[option]; + mi_assert(desc->option == option); // index should match the option + desc->value = value; + desc->init = INITIALIZED; +} + +void mi_option_set_default(mi_option_t option, long value) { + mi_assert(option >= 0 && option < _mi_option_last); + if (option < 0 || option >= _mi_option_last) return; + mi_option_desc_t* desc = &options[option]; + if (desc->init != INITIALIZED) { + desc->value = value; + } +} + +mi_decl_nodiscard bool mi_option_is_enabled(mi_option_t option) { + return (mi_option_get(option) != 0); +} + +void mi_option_set_enabled(mi_option_t option, bool enable) { + mi_option_set(option, (enable ? 1 : 0)); +} + +void mi_option_set_enabled_default(mi_option_t option, bool enable) { + mi_option_set_default(option, (enable ? 1 : 0)); +} + +void mi_option_enable(mi_option_t option) { + mi_option_set_enabled(option,true); +} + +void mi_option_disable(mi_option_t option) { + mi_option_set_enabled(option,false); +} + +static void mi_cdecl mi_out_stderr(const char* msg, void* arg) { + MI_UNUSED(arg); + if (msg != NULL && msg[0] != 0) { + _mi_prim_out_stderr(msg); + } +} + +// Since an output function can be registered earliest in the `main` +// function we also buffer output that happens earlier. When +// an output function is registered it is called immediately with +// the output up to that point. +#ifndef MI_MAX_DELAY_OUTPUT +#define MI_MAX_DELAY_OUTPUT ((size_t)(32*1024)) +#endif +static char out_buf[MI_MAX_DELAY_OUTPUT+1]; +static _Atomic(size_t) out_len; + +static void mi_cdecl mi_out_buf(const char* msg, void* arg) { + MI_UNUSED(arg); + if (msg==NULL) return; + if (mi_atomic_load_relaxed(&out_len)>=MI_MAX_DELAY_OUTPUT) return; + size_t n = _mi_strlen(msg); + if (n==0) return; + // claim space + size_t start = mi_atomic_add_acq_rel(&out_len, n); + if (start >= MI_MAX_DELAY_OUTPUT) return; + // check bound + if (start+n >= MI_MAX_DELAY_OUTPUT) { + n = MI_MAX_DELAY_OUTPUT-start-1; + } + _mi_memcpy(&out_buf[start], msg, n); +} + +static void mi_out_buf_flush(mi_output_fun* out, bool no_more_buf, void* arg) { + if (out==NULL) return; + // claim (if `no_more_buf == true`, no more output will be added after this point) + size_t count = mi_atomic_add_acq_rel(&out_len, (no_more_buf ? MI_MAX_DELAY_OUTPUT : 1)); + // and output the current contents + if (count>MI_MAX_DELAY_OUTPUT) count = MI_MAX_DELAY_OUTPUT; + out_buf[count] = 0; + out(out_buf,arg); + if (!no_more_buf) { + out_buf[count] = '\n'; // if continue with the buffer, insert a newline + } +} + + +// Once this module is loaded, switch to this routine +// which outputs to stderr and the delayed output buffer. +static void mi_cdecl mi_out_buf_stderr(const char* msg, void* arg) { + mi_out_stderr(msg,arg); + mi_out_buf(msg,arg); +} + + + +// -------------------------------------------------------- +// Default output handler +// -------------------------------------------------------- + +// Should be atomic but gives errors on many platforms as generally we cannot cast a function pointer to a uintptr_t. +// For now, don't register output from multiple threads. +static mi_output_fun* volatile mi_out_default; // = NULL +static _Atomic(void*) mi_out_arg; // = NULL + +static mi_output_fun* mi_out_get_default(void** parg) { + if (parg != NULL) { *parg = mi_atomic_load_ptr_acquire(void,&mi_out_arg); } + mi_output_fun* out = mi_out_default; + return (out == NULL ? &mi_out_buf : out); +} + +void mi_register_output(mi_output_fun* out, void* arg) mi_attr_noexcept { + mi_out_default = (out == NULL ? &mi_out_stderr : out); // stop using the delayed output buffer + mi_atomic_store_ptr_release(void,&mi_out_arg, arg); + if (out!=NULL) mi_out_buf_flush(out,true,arg); // output all the delayed output now +} + +// add stderr to the delayed output after the module is loaded +static void mi_add_stderr_output(void) { + mi_assert_internal(mi_out_default == NULL); + mi_out_buf_flush(&mi_out_stderr, false, NULL); // flush current contents to stderr + mi_out_default = &mi_out_buf_stderr; // and add stderr to the delayed output +} + +// -------------------------------------------------------- +// Messages, all end up calling `_mi_fputs`. +// -------------------------------------------------------- +static _Atomic(size_t) error_count; // = 0; // when >= max_error_count stop emitting errors +static _Atomic(size_t) warning_count; // = 0; // when >= max_warning_count stop emitting warnings + +// When overriding malloc, we may recurse into mi_vfprintf if an allocation +// inside the C runtime causes another message. +// In some cases (like on macOS) the loader already allocates which +// calls into mimalloc; if we then access thread locals (like `recurse`) +// this may crash as the access may call _tlv_bootstrap that tries to +// (recursively) invoke malloc again to allocate space for the thread local +// variables on demand. This is why we use a _mi_preloading test on such +// platforms. However, C code generator may move the initial thread local address +// load before the `if` and we therefore split it out in a separate funcion. +static mi_decl_thread bool recurse = false; + +static mi_decl_noinline bool mi_recurse_enter_prim(void) { + if (recurse) return false; + recurse = true; + return true; +} + +static mi_decl_noinline void mi_recurse_exit_prim(void) { + recurse = false; +} + +static bool mi_recurse_enter(void) { + #if defined(__APPLE__) || defined(MI_TLS_RECURSE_GUARD) + if (_mi_preloading()) return false; + #endif + return mi_recurse_enter_prim(); +} + +static void mi_recurse_exit(void) { + #if defined(__APPLE__) || defined(MI_TLS_RECURSE_GUARD) + if (_mi_preloading()) return; + #endif + mi_recurse_exit_prim(); +} + +void _mi_fputs(mi_output_fun* out, void* arg, const char* prefix, const char* message) { + if (out==NULL || (void*)out==(void*)stdout || (void*)out==(void*)stderr) { // TODO: use mi_out_stderr for stderr? + if (!mi_recurse_enter()) return; + out = mi_out_get_default(&arg); + if (prefix != NULL) out(prefix, arg); + out(message, arg); + mi_recurse_exit(); + } + else { + if (prefix != NULL) out(prefix, arg); + out(message, arg); + } +} + +// Define our own limited `fprintf` that avoids memory allocation. +// We do this using `snprintf` with a limited buffer. +static void mi_vfprintf( mi_output_fun* out, void* arg, const char* prefix, const char* fmt, va_list args ) { + char buf[512]; + if (fmt==NULL) return; + if (!mi_recurse_enter()) return; + vsnprintf(buf,sizeof(buf)-1,fmt,args); + mi_recurse_exit(); + _mi_fputs(out,arg,prefix,buf); +} + +void _mi_fprintf( mi_output_fun* out, void* arg, const char* fmt, ... ) { + va_list args; + va_start(args,fmt); + mi_vfprintf(out,arg,NULL,fmt,args); + va_end(args); +} + +static void mi_vfprintf_thread(mi_output_fun* out, void* arg, const char* prefix, const char* fmt, va_list args) { + if (prefix != NULL && _mi_strnlen(prefix,33) <= 32 && !_mi_is_main_thread()) { + char tprefix[64]; + snprintf(tprefix, sizeof(tprefix), "%sthread 0x%llx: ", prefix, (unsigned long long)_mi_thread_id()); + mi_vfprintf(out, arg, tprefix, fmt, args); + } + else { + mi_vfprintf(out, arg, prefix, fmt, args); + } +} + +void _mi_trace_message(const char* fmt, ...) { + if (mi_option_get(mi_option_verbose) <= 1) return; // only with verbose level 2 or higher + va_list args; + va_start(args, fmt); + mi_vfprintf_thread(NULL, NULL, "mimalloc: ", fmt, args); + va_end(args); +} + +void _mi_verbose_message(const char* fmt, ...) { + if (!mi_option_is_enabled(mi_option_verbose)) return; + va_list args; + va_start(args,fmt); + mi_vfprintf(NULL, NULL, "mimalloc: ", fmt, args); + va_end(args); +} + +static void mi_show_error_message(const char* fmt, va_list args) { + if (!mi_option_is_enabled(mi_option_verbose)) { + if (!mi_option_is_enabled(mi_option_show_errors)) return; + if (mi_max_error_count >= 0 && (long)mi_atomic_increment_acq_rel(&error_count) > mi_max_error_count) return; + } + mi_vfprintf_thread(NULL, NULL, "mimalloc: error: ", fmt, args); +} + +void _mi_warning_message(const char* fmt, ...) { + if (!mi_option_is_enabled(mi_option_verbose)) { + if (!mi_option_is_enabled(mi_option_show_errors)) return; + if (mi_max_warning_count >= 0 && (long)mi_atomic_increment_acq_rel(&warning_count) > mi_max_warning_count) return; + } + va_list args; + va_start(args,fmt); + mi_vfprintf_thread(NULL, NULL, "mimalloc: warning: ", fmt, args); + va_end(args); +} + + +#if MI_DEBUG +void _mi_assert_fail(const char* assertion, const char* fname, unsigned line, const char* func ) { + _mi_fprintf(NULL, NULL, "mimalloc: assertion failed: at \"%s\":%u, %s\n assertion: \"%s\"\n", fname, line, (func==NULL?"":func), assertion); + abort(); +} +#endif + +// -------------------------------------------------------- +// Errors +// -------------------------------------------------------- + +static mi_error_fun* volatile mi_error_handler; // = NULL +static _Atomic(void*) mi_error_arg; // = NULL + +static void mi_error_default(int err) { + MI_UNUSED(err); +#if (MI_DEBUG>0) + if (err==EFAULT) { + #ifdef _MSC_VER + __debugbreak(); + #endif + abort(); + } +#endif +#if (MI_SECURE>0) + if (err==EFAULT) { // abort on serious errors in secure mode (corrupted meta-data) + abort(); + } +#endif +#if defined(MI_XMALLOC) + if (err==ENOMEM || err==EOVERFLOW) { // abort on memory allocation fails in xmalloc mode + abort(); + } +#endif +} + +void mi_register_error(mi_error_fun* fun, void* arg) { + mi_error_handler = fun; // can be NULL + mi_atomic_store_ptr_release(void,&mi_error_arg, arg); +} + +void _mi_error_message(int err, const char* fmt, ...) { + // show detailed error message + va_list args; + va_start(args, fmt); + mi_show_error_message(fmt, args); + va_end(args); + // and call the error handler which may abort (or return normally) + if (mi_error_handler != NULL) { + mi_error_handler(err, mi_atomic_load_ptr_acquire(void,&mi_error_arg)); + } + else { + mi_error_default(err); + } +} + +// -------------------------------------------------------- +// Initialize options by checking the environment +// -------------------------------------------------------- +char _mi_toupper(char c) { + if (c >= 'a' && c <= 'z') return (c - 'a' + 'A'); + else return c; +} + +int _mi_strnicmp(const char* s, const char* t, size_t n) { + if (n == 0) return 0; + for (; *s != 0 && *t != 0 && n > 0; s++, t++, n--) { + if (_mi_toupper(*s) != _mi_toupper(*t)) break; + } + return (n == 0 ? 0 : *s - *t); +} + +void _mi_strlcpy(char* dest, const char* src, size_t dest_size) { + if (dest==NULL || src==NULL || dest_size == 0) return; + // copy until end of src, or when dest is (almost) full + while (*src != 0 && dest_size > 1) { + *dest++ = *src++; + dest_size--; + } + // always zero terminate + *dest = 0; +} + +void _mi_strlcat(char* dest, const char* src, size_t dest_size) { + if (dest==NULL || src==NULL || dest_size == 0) return; + // find end of string in the dest buffer + while (*dest != 0 && dest_size > 1) { + dest++; + dest_size--; + } + // and catenate + _mi_strlcpy(dest, src, dest_size); +} + +size_t _mi_strlen(const char* s) { + if (s==NULL) return 0; + size_t len = 0; + while(s[len] != 0) { len++; } + return len; +} + +size_t _mi_strnlen(const char* s, size_t max_len) { + if (s==NULL) return 0; + size_t len = 0; + while(s[len] != 0 && len < max_len) { len++; } + return len; +} + +#ifdef MI_NO_GETENV +static bool mi_getenv(const char* name, char* result, size_t result_size) { + MI_UNUSED(name); + MI_UNUSED(result); + MI_UNUSED(result_size); + return false; +} +#else +static bool mi_getenv(const char* name, char* result, size_t result_size) { + if (name==NULL || result == NULL || result_size < 64) return false; + return _mi_prim_getenv(name,result,result_size); +} +#endif + +// TODO: implement ourselves to reduce dependencies on the C runtime +#include // strtol +#include // strstr + + +static void mi_option_init(mi_option_desc_t* desc) { + // Read option value from the environment + char s[64 + 1]; + char buf[64+1]; + _mi_strlcpy(buf, "mimalloc_", sizeof(buf)); + _mi_strlcat(buf, desc->name, sizeof(buf)); + bool found = mi_getenv(buf, s, sizeof(s)); + if (!found && desc->legacy_name != NULL) { + _mi_strlcpy(buf, "mimalloc_", sizeof(buf)); + _mi_strlcat(buf, desc->legacy_name, sizeof(buf)); + found = mi_getenv(buf, s, sizeof(s)); + if (found) { + _mi_warning_message("environment option \"mimalloc_%s\" is deprecated -- use \"mimalloc_%s\" instead.\n", desc->legacy_name, desc->name); + } + } + + if (found) { + size_t len = _mi_strnlen(s, sizeof(buf) - 1); + for (size_t i = 0; i < len; i++) { + buf[i] = _mi_toupper(s[i]); + } + buf[len] = 0; + if (buf[0] == 0 || strstr("1;TRUE;YES;ON", buf) != NULL) { + desc->value = 1; + desc->init = INITIALIZED; + } + else if (strstr("0;FALSE;NO;OFF", buf) != NULL) { + desc->value = 0; + desc->init = INITIALIZED; + } + else { + char* end = buf; + long value = strtol(buf, &end, 10); + if (desc->option == mi_option_reserve_os_memory || desc->option == mi_option_arena_reserve) { + // this option is interpreted in KiB to prevent overflow of `long` + if (*end == 'K') { end++; } + else if (*end == 'M') { value *= MI_KiB; end++; } + else if (*end == 'G') { value *= MI_MiB; end++; } + else { value = (value + MI_KiB - 1) / MI_KiB; } + if (end[0] == 'I' && end[1] == 'B') { end += 2; } + else if (*end == 'B') { end++; } + } + if (*end == 0) { + desc->value = value; + desc->init = INITIALIZED; + } + else { + // set `init` first to avoid recursion through _mi_warning_message on mimalloc_verbose. + desc->init = DEFAULTED; + if (desc->option == mi_option_verbose && desc->value == 0) { + // if the 'mimalloc_verbose' env var has a bogus value we'd never know + // (since the value defaults to 'off') so in that case briefly enable verbose + desc->value = 1; + _mi_warning_message("environment option mimalloc_%s has an invalid value.\n", desc->name); + desc->value = 0; + } + else { + _mi_warning_message("environment option mimalloc_%s has an invalid value.\n", desc->name); + } + } + } + mi_assert_internal(desc->init != UNINIT); + } + else if (!_mi_preloading()) { + desc->init = DEFAULTED; + } +} diff --git a/compat/mimalloc/os.c b/compat/mimalloc/os.c new file mode 100644 index 00000000000000..bf9de1be0fdb49 --- /dev/null +++ b/compat/mimalloc/os.c @@ -0,0 +1,689 @@ +/* ---------------------------------------------------------------------------- +Copyright (c) 2018-2023, Microsoft Research, Daan Leijen +This is free software; you can redistribute it and/or modify it under the +terms of the MIT license. A copy of the license can be found in the file +"LICENSE" at the root of this distribution. +-----------------------------------------------------------------------------*/ +#include "mimalloc.h" +#include "mimalloc/internal.h" +#include "mimalloc/atomic.h" +#include "mimalloc/prim.h" + + +/* ----------------------------------------------------------- + Initialization. + On windows initializes support for aligned allocation and + large OS pages (if MIMALLOC_LARGE_OS_PAGES is true). +----------------------------------------------------------- */ + +static mi_os_mem_config_t mi_os_mem_config = { + 4096, // page size + 0, // large page size (usually 2MiB) + 4096, // allocation granularity + true, // has overcommit? (if true we use MAP_NORESERVE on mmap systems) + false, // must free whole? (on mmap systems we can free anywhere in a mapped range, but on Windows we must free the entire span) + true // has virtual reserve? (if true we can reserve virtual address space without using commit or physical memory) +}; + +bool _mi_os_has_overcommit(void) { + return mi_os_mem_config.has_overcommit; +} + +bool _mi_os_has_virtual_reserve(void) { + return mi_os_mem_config.has_virtual_reserve; +} + + +// OS (small) page size +size_t _mi_os_page_size(void) { + return mi_os_mem_config.page_size; +} + +// if large OS pages are supported (2 or 4MiB), then return the size, otherwise return the small page size (4KiB) +size_t _mi_os_large_page_size(void) { + return (mi_os_mem_config.large_page_size != 0 ? mi_os_mem_config.large_page_size : _mi_os_page_size()); +} + +bool _mi_os_use_large_page(size_t size, size_t alignment) { + // if we have access, check the size and alignment requirements + if (mi_os_mem_config.large_page_size == 0 || !mi_option_is_enabled(mi_option_allow_large_os_pages)) return false; + return ((size % mi_os_mem_config.large_page_size) == 0 && (alignment % mi_os_mem_config.large_page_size) == 0); +} + +// round to a good OS allocation size (bounded by max 12.5% waste) +size_t _mi_os_good_alloc_size(size_t size) { + size_t align_size; + if (size < 512*MI_KiB) align_size = _mi_os_page_size(); + else if (size < 2*MI_MiB) align_size = 64*MI_KiB; + else if (size < 8*MI_MiB) align_size = 256*MI_KiB; + else if (size < 32*MI_MiB) align_size = 1*MI_MiB; + else align_size = 4*MI_MiB; + if mi_unlikely(size >= (SIZE_MAX - align_size)) return size; // possible overflow? + return _mi_align_up(size, align_size); +} + +void _mi_os_init(void) { + _mi_prim_mem_init(&mi_os_mem_config); +} + + +/* ----------------------------------------------------------- + Util +-------------------------------------------------------------- */ +bool _mi_os_decommit(void* addr, size_t size, mi_stats_t* stats); +bool _mi_os_commit(void* addr, size_t size, bool* is_zero, mi_stats_t* tld_stats); + +static void* mi_align_up_ptr(void* p, size_t alignment) { + return (void*)_mi_align_up((uintptr_t)p, alignment); +} + +static void* mi_align_down_ptr(void* p, size_t alignment) { + return (void*)_mi_align_down((uintptr_t)p, alignment); +} + + +/* ----------------------------------------------------------- + aligned hinting +-------------------------------------------------------------- */ + +// On 64-bit systems, we can do efficient aligned allocation by using +// the 2TiB to 30TiB area to allocate those. +#if (MI_INTPTR_SIZE >= 8) +static mi_decl_cache_align _Atomic(uintptr_t)aligned_base; + +// Return a MI_SEGMENT_SIZE aligned address that is probably available. +// If this returns NULL, the OS will determine the address but on some OS's that may not be +// properly aligned which can be more costly as it needs to be adjusted afterwards. +// For a size > 1GiB this always returns NULL in order to guarantee good ASLR randomization; +// (otherwise an initial large allocation of say 2TiB has a 50% chance to include (known) addresses +// in the middle of the 2TiB - 6TiB address range (see issue #372)) + +#define MI_HINT_BASE ((uintptr_t)2 << 40) // 2TiB start +#define MI_HINT_AREA ((uintptr_t)4 << 40) // upto 6TiB (since before win8 there is "only" 8TiB available to processes) +#define MI_HINT_MAX ((uintptr_t)30 << 40) // wrap after 30TiB (area after 32TiB is used for huge OS pages) + +void* _mi_os_get_aligned_hint(size_t try_alignment, size_t size) +{ + if (try_alignment <= 1 || try_alignment > MI_SEGMENT_SIZE) return NULL; + size = _mi_align_up(size, MI_SEGMENT_SIZE); + if (size > 1*MI_GiB) return NULL; // guarantee the chance of fixed valid address is at most 1/(MI_HINT_AREA / 1<<30) = 1/4096. + #if (MI_SECURE>0) + size += MI_SEGMENT_SIZE; // put in `MI_SEGMENT_SIZE` virtual gaps between hinted blocks; this splits VLA's but increases guarded areas. + #endif + + uintptr_t hint = mi_atomic_add_acq_rel(&aligned_base, size); + if (hint == 0 || hint > MI_HINT_MAX) { // wrap or initialize + uintptr_t init = MI_HINT_BASE; + #if (MI_SECURE>0 || MI_DEBUG==0) // security: randomize start of aligned allocations unless in debug mode + uintptr_t r = _mi_heap_random_next(mi_prim_get_default_heap()); + init = init + ((MI_SEGMENT_SIZE * ((r>>17) & 0xFFFFF)) % MI_HINT_AREA); // (randomly 20 bits)*4MiB == 0 to 4TiB + #endif + uintptr_t expected = hint + size; + mi_atomic_cas_strong_acq_rel(&aligned_base, &expected, init); + hint = mi_atomic_add_acq_rel(&aligned_base, size); // this may still give 0 or > MI_HINT_MAX but that is ok, it is a hint after all + } + if (hint%try_alignment != 0) return NULL; + return (void*)hint; +} +#else +void* _mi_os_get_aligned_hint(size_t try_alignment, size_t size) { + MI_UNUSED(try_alignment); MI_UNUSED(size); + return NULL; +} +#endif + + +/* ----------------------------------------------------------- + Free memory +-------------------------------------------------------------- */ + +static void mi_os_free_huge_os_pages(void* p, size_t size, mi_stats_t* stats); + +static void mi_os_prim_free(void* addr, size_t size, bool still_committed, mi_stats_t* tld_stats) { + MI_UNUSED(tld_stats); + mi_assert_internal((size % _mi_os_page_size()) == 0); + if (addr == NULL || size == 0) return; // || _mi_os_is_huge_reserved(addr) + int err = _mi_prim_free(addr, size); + if (err != 0) { + _mi_warning_message("unable to free OS memory (error: %d (0x%x), size: 0x%zx bytes, address: %p)\n", err, err, size, addr); + } + mi_stats_t* stats = &_mi_stats_main; + if (still_committed) { _mi_stat_decrease(&stats->committed, size); } + _mi_stat_decrease(&stats->reserved, size); +} + +void _mi_os_free_ex(void* addr, size_t size, bool still_committed, mi_memid_t memid, mi_stats_t* tld_stats) { + if (mi_memkind_is_os(memid.memkind)) { + size_t csize = _mi_os_good_alloc_size(size); + void* base = addr; + // different base? (due to alignment) + if (memid.mem.os.base != NULL) { + mi_assert(memid.mem.os.base <= addr); + mi_assert((uint8_t*)memid.mem.os.base + memid.mem.os.alignment >= (uint8_t*)addr); + base = memid.mem.os.base; + csize += ((uint8_t*)addr - (uint8_t*)memid.mem.os.base); + } + // free it + if (memid.memkind == MI_MEM_OS_HUGE) { + mi_assert(memid.is_pinned); + mi_os_free_huge_os_pages(base, csize, tld_stats); + } + else { + mi_os_prim_free(base, csize, still_committed, tld_stats); + } + } + else { + // nothing to do + mi_assert(memid.memkind < MI_MEM_OS); + } +} + +void _mi_os_free(void* p, size_t size, mi_memid_t memid, mi_stats_t* tld_stats) { + _mi_os_free_ex(p, size, true, memid, tld_stats); +} + + +/* ----------------------------------------------------------- + Primitive allocation from the OS. +-------------------------------------------------------------- */ + +// Note: the `try_alignment` is just a hint and the returned pointer is not guaranteed to be aligned. +static void* mi_os_prim_alloc(size_t size, size_t try_alignment, bool commit, bool allow_large, bool* is_large, bool* is_zero, mi_stats_t* stats) { + mi_assert_internal(size > 0 && (size % _mi_os_page_size()) == 0); + mi_assert_internal(is_zero != NULL); + mi_assert_internal(is_large != NULL); + if (size == 0) return NULL; + if (!commit) { allow_large = false; } + if (try_alignment == 0) { try_alignment = 1; } // avoid 0 to ensure there will be no divide by zero when aligning + + *is_zero = false; + void* p = NULL; + int err = _mi_prim_alloc(size, try_alignment, commit, allow_large, is_large, is_zero, &p); + if (err != 0) { + _mi_warning_message("unable to allocate OS memory (error: %d (0x%x), size: 0x%zx bytes, align: 0x%zx, commit: %d, allow large: %d)\n", err, err, size, try_alignment, commit, allow_large); + } + mi_stat_counter_increase(stats->mmap_calls, 1); + if (p != NULL) { + _mi_stat_increase(&stats->reserved, size); + if (commit) { + _mi_stat_increase(&stats->committed, size); + // seems needed for asan (or `mimalloc-test-api` fails) + #ifdef MI_TRACK_ASAN + if (*is_zero) { mi_track_mem_defined(p,size); } + else { mi_track_mem_undefined(p,size); } + #endif + } + } + return p; +} + + +// Primitive aligned allocation from the OS. +// This function guarantees the allocated memory is aligned. +static void* mi_os_prim_alloc_aligned(size_t size, size_t alignment, bool commit, bool allow_large, bool* is_large, bool* is_zero, void** base, mi_stats_t* stats) { + mi_assert_internal(alignment >= _mi_os_page_size() && ((alignment & (alignment - 1)) == 0)); + mi_assert_internal(size > 0 && (size % _mi_os_page_size()) == 0); + mi_assert_internal(is_large != NULL); + mi_assert_internal(is_zero != NULL); + mi_assert_internal(base != NULL); + if (!commit) allow_large = false; + if (!(alignment >= _mi_os_page_size() && ((alignment & (alignment - 1)) == 0))) return NULL; + size = _mi_align_up(size, _mi_os_page_size()); + + // try first with a hint (this will be aligned directly on Win 10+ or BSD) + void* p = mi_os_prim_alloc(size, alignment, commit, allow_large, is_large, is_zero, stats); + if (p == NULL) return NULL; + + // aligned already? + if (((uintptr_t)p % alignment) == 0) { + *base = p; + } + else { + // if not aligned, free it, overallocate, and unmap around it + _mi_warning_message("unable to allocate aligned OS memory directly, fall back to over-allocation (size: 0x%zx bytes, address: %p, alignment: 0x%zx, commit: %d)\n", size, p, alignment, commit); + mi_os_prim_free(p, size, commit, stats); + if (size >= (SIZE_MAX - alignment)) return NULL; // overflow + const size_t over_size = size + alignment; + + if (mi_os_mem_config.must_free_whole) { // win32 virtualAlloc cannot free parts of an allocate block + // over-allocate uncommitted (virtual) memory + p = mi_os_prim_alloc(over_size, 1 /*alignment*/, false /* commit? */, false /* allow_large */, is_large, is_zero, stats); + if (p == NULL) return NULL; + + // set p to the aligned part in the full region + // note: this is dangerous on Windows as VirtualFree needs the actual base pointer + // this is handled though by having the `base` field in the memid's + *base = p; // remember the base + p = mi_align_up_ptr(p, alignment); + + // explicitly commit only the aligned part + if (commit) { + _mi_os_commit(p, size, NULL, stats); + } + } + else { // mmap can free inside an allocation + // overallocate... + p = mi_os_prim_alloc(over_size, 1, commit, false, is_large, is_zero, stats); + if (p == NULL) return NULL; + + // and selectively unmap parts around the over-allocated area. (noop on sbrk) + void* aligned_p = mi_align_up_ptr(p, alignment); + size_t pre_size = (uint8_t*)aligned_p - (uint8_t*)p; + size_t mid_size = _mi_align_up(size, _mi_os_page_size()); + size_t post_size = over_size - pre_size - mid_size; + mi_assert_internal(pre_size < over_size&& post_size < over_size&& mid_size >= size); + if (pre_size > 0) { mi_os_prim_free(p, pre_size, commit, stats); } + if (post_size > 0) { mi_os_prim_free((uint8_t*)aligned_p + mid_size, post_size, commit, stats); } + // we can return the aligned pointer on `mmap` (and sbrk) systems + p = aligned_p; + *base = aligned_p; // since we freed the pre part, `*base == p`. + } + } + + mi_assert_internal(p == NULL || (p != NULL && *base != NULL && ((uintptr_t)p % alignment) == 0)); + return p; +} + + +/* ----------------------------------------------------------- + OS API: alloc and alloc_aligned +----------------------------------------------------------- */ + +void* _mi_os_alloc(size_t size, mi_memid_t* memid, mi_stats_t* tld_stats) { + MI_UNUSED(tld_stats); + *memid = _mi_memid_none(); + mi_stats_t* stats = &_mi_stats_main; + if (size == 0) return NULL; + size = _mi_os_good_alloc_size(size); + bool os_is_large = false; + bool os_is_zero = false; + void* p = mi_os_prim_alloc(size, 0, true, false, &os_is_large, &os_is_zero, stats); + if (p != NULL) { + *memid = _mi_memid_create_os(true, os_is_zero, os_is_large); + } + return p; +} + +void* _mi_os_alloc_aligned(size_t size, size_t alignment, bool commit, bool allow_large, mi_memid_t* memid, mi_stats_t* tld_stats) +{ + MI_UNUSED(&_mi_os_get_aligned_hint); // suppress unused warnings + MI_UNUSED(tld_stats); + *memid = _mi_memid_none(); + if (size == 0) return NULL; + size = _mi_os_good_alloc_size(size); + alignment = _mi_align_up(alignment, _mi_os_page_size()); + + bool os_is_large = false; + bool os_is_zero = false; + void* os_base = NULL; + void* p = mi_os_prim_alloc_aligned(size, alignment, commit, allow_large, &os_is_large, &os_is_zero, &os_base, &_mi_stats_main /*tld->stats*/ ); + if (p != NULL) { + *memid = _mi_memid_create_os(commit, os_is_zero, os_is_large); + memid->mem.os.base = os_base; + memid->mem.os.alignment = alignment; + } + return p; +} + +/* ----------------------------------------------------------- + OS aligned allocation with an offset. This is used + for large alignments > MI_ALIGNMENT_MAX. We use a large mimalloc + page where the object can be aligned at an offset from the start of the segment. + As we may need to overallocate, we need to free such pointers using `mi_free_aligned` + to use the actual start of the memory region. +----------------------------------------------------------- */ + +void* _mi_os_alloc_aligned_at_offset(size_t size, size_t alignment, size_t offset, bool commit, bool allow_large, mi_memid_t* memid, mi_stats_t* tld_stats) { + mi_assert(offset <= MI_SEGMENT_SIZE); + mi_assert(offset <= size); + mi_assert((alignment % _mi_os_page_size()) == 0); + *memid = _mi_memid_none(); + if (offset > MI_SEGMENT_SIZE) return NULL; + if (offset == 0) { + // regular aligned allocation + return _mi_os_alloc_aligned(size, alignment, commit, allow_large, memid, tld_stats); + } + else { + // overallocate to align at an offset + const size_t extra = _mi_align_up(offset, alignment) - offset; + const size_t oversize = size + extra; + void* const start = _mi_os_alloc_aligned(oversize, alignment, commit, allow_large, memid, tld_stats); + if (start == NULL) return NULL; + + void* const p = (uint8_t*)start + extra; + mi_assert(_mi_is_aligned((uint8_t*)p + offset, alignment)); + // decommit the overallocation at the start + if (commit && extra > _mi_os_page_size()) { + _mi_os_decommit(start, extra, tld_stats); + } + return p; + } +} + +/* ----------------------------------------------------------- + OS memory API: reset, commit, decommit, protect, unprotect. +----------------------------------------------------------- */ + +// OS page align within a given area, either conservative (pages inside the area only), +// or not (straddling pages outside the area is possible) +static void* mi_os_page_align_areax(bool conservative, void* addr, size_t size, size_t* newsize) { + mi_assert(addr != NULL && size > 0); + if (newsize != NULL) *newsize = 0; + if (size == 0 || addr == NULL) return NULL; + + // page align conservatively within the range + void* start = (conservative ? mi_align_up_ptr(addr, _mi_os_page_size()) + : mi_align_down_ptr(addr, _mi_os_page_size())); + void* end = (conservative ? mi_align_down_ptr((uint8_t*)addr + size, _mi_os_page_size()) + : mi_align_up_ptr((uint8_t*)addr + size, _mi_os_page_size())); + ptrdiff_t diff = (uint8_t*)end - (uint8_t*)start; + if (diff <= 0) return NULL; + + mi_assert_internal((conservative && (size_t)diff <= size) || (!conservative && (size_t)diff >= size)); + if (newsize != NULL) *newsize = (size_t)diff; + return start; +} + +static void* mi_os_page_align_area_conservative(void* addr, size_t size, size_t* newsize) { + return mi_os_page_align_areax(true, addr, size, newsize); +} + +bool _mi_os_commit(void* addr, size_t size, bool* is_zero, mi_stats_t* tld_stats) { + MI_UNUSED(tld_stats); + mi_stats_t* stats = &_mi_stats_main; + if (is_zero != NULL) { *is_zero = false; } + _mi_stat_increase(&stats->committed, size); // use size for precise commit vs. decommit + _mi_stat_counter_increase(&stats->commit_calls, 1); + + // page align range + size_t csize; + void* start = mi_os_page_align_areax(false /* conservative? */, addr, size, &csize); + if (csize == 0) return true; + + // commit + bool os_is_zero = false; + int err = _mi_prim_commit(start, csize, &os_is_zero); + if (err != 0) { + _mi_warning_message("cannot commit OS memory (error: %d (0x%x), address: %p, size: 0x%zx bytes)\n", err, err, start, csize); + return false; + } + if (os_is_zero && is_zero != NULL) { + *is_zero = true; + mi_assert_expensive(mi_mem_is_zero(start, csize)); + } + // note: the following seems required for asan (otherwise `mimalloc-test-stress` fails) + #ifdef MI_TRACK_ASAN + if (os_is_zero) { mi_track_mem_defined(start,csize); } + else { mi_track_mem_undefined(start,csize); } + #endif + return true; +} + +static bool mi_os_decommit_ex(void* addr, size_t size, bool* needs_recommit, mi_stats_t* tld_stats) { + MI_UNUSED(tld_stats); + mi_stats_t* stats = &_mi_stats_main; + mi_assert_internal(needs_recommit!=NULL); + _mi_stat_decrease(&stats->committed, size); + + // page align + size_t csize; + void* start = mi_os_page_align_area_conservative(addr, size, &csize); + if (csize == 0) return true; + + // decommit + *needs_recommit = true; + int err = _mi_prim_decommit(start,csize,needs_recommit); + if (err != 0) { + _mi_warning_message("cannot decommit OS memory (error: %d (0x%x), address: %p, size: 0x%zx bytes)\n", err, err, start, csize); + } + mi_assert_internal(err == 0); + return (err == 0); +} + +bool _mi_os_decommit(void* addr, size_t size, mi_stats_t* tld_stats) { + bool needs_recommit; + return mi_os_decommit_ex(addr, size, &needs_recommit, tld_stats); +} + + +// Signal to the OS that the address range is no longer in use +// but may be used later again. This will release physical memory +// pages and reduce swapping while keeping the memory committed. +// We page align to a conservative area inside the range to reset. +bool _mi_os_reset(void* addr, size_t size, mi_stats_t* stats) { + // page align conservatively within the range + size_t csize; + void* start = mi_os_page_align_area_conservative(addr, size, &csize); + if (csize == 0) return true; // || _mi_os_is_huge_reserved(addr) + _mi_stat_increase(&stats->reset, csize); + _mi_stat_counter_increase(&stats->reset_calls, 1); + + #if (MI_DEBUG>1) && !MI_SECURE && !MI_TRACK_ENABLED // && !MI_TSAN + memset(start, 0, csize); // pretend it is eagerly reset + #endif + + int err = _mi_prim_reset(start, csize); + if (err != 0) { + _mi_warning_message("cannot reset OS memory (error: %d (0x%x), address: %p, size: 0x%zx bytes)\n", err, err, start, csize); + } + return (err == 0); +} + + +// either resets or decommits memory, returns true if the memory needs +// to be recommitted if it is to be re-used later on. +bool _mi_os_purge_ex(void* p, size_t size, bool allow_reset, mi_stats_t* stats) +{ + if (mi_option_get(mi_option_purge_delay) < 0) return false; // is purging allowed? + _mi_stat_counter_increase(&stats->purge_calls, 1); + _mi_stat_increase(&stats->purged, size); + + if (mi_option_is_enabled(mi_option_purge_decommits) && // should decommit? + !_mi_preloading()) // don't decommit during preloading (unsafe) + { + bool needs_recommit = true; + mi_os_decommit_ex(p, size, &needs_recommit, stats); + return needs_recommit; + } + else { + if (allow_reset) { // this can sometimes be not allowed if the range is not fully committed + _mi_os_reset(p, size, stats); + } + return false; // needs no recommit + } +} + +// either resets or decommits memory, returns true if the memory needs +// to be recommitted if it is to be re-used later on. +bool _mi_os_purge(void* p, size_t size, mi_stats_t * stats) { + return _mi_os_purge_ex(p, size, true, stats); +} + +// Protect a region in memory to be not accessible. +static bool mi_os_protectx(void* addr, size_t size, bool protect) { + // page align conservatively within the range + size_t csize = 0; + void* start = mi_os_page_align_area_conservative(addr, size, &csize); + if (csize == 0) return false; + /* + if (_mi_os_is_huge_reserved(addr)) { + _mi_warning_message("cannot mprotect memory allocated in huge OS pages\n"); + } + */ + int err = _mi_prim_protect(start,csize,protect); + if (err != 0) { + _mi_warning_message("cannot %s OS memory (error: %d (0x%x), address: %p, size: 0x%zx bytes)\n", (protect ? "protect" : "unprotect"), err, err, start, csize); + } + return (err == 0); +} + +bool _mi_os_protect(void* addr, size_t size) { + return mi_os_protectx(addr, size, true); +} + +bool _mi_os_unprotect(void* addr, size_t size) { + return mi_os_protectx(addr, size, false); +} + + + +/* ---------------------------------------------------------------------------- +Support for allocating huge OS pages (1Gib) that are reserved up-front +and possibly associated with a specific NUMA node. (use `numa_node>=0`) +-----------------------------------------------------------------------------*/ +#define MI_HUGE_OS_PAGE_SIZE (MI_GiB) + + +#if (MI_INTPTR_SIZE >= 8) +// To ensure proper alignment, use our own area for huge OS pages +static mi_decl_cache_align _Atomic(uintptr_t) mi_huge_start; // = 0 + +// Claim an aligned address range for huge pages +static uint8_t* mi_os_claim_huge_pages(size_t pages, size_t* total_size) { + if (total_size != NULL) *total_size = 0; + const size_t size = pages * MI_HUGE_OS_PAGE_SIZE; + + uintptr_t start = 0; + uintptr_t end = 0; + uintptr_t huge_start = mi_atomic_load_relaxed(&mi_huge_start); + do { + start = huge_start; + if (start == 0) { + // Initialize the start address after the 32TiB area + start = ((uintptr_t)32 << 40); // 32TiB virtual start address + #if (MI_SECURE>0 || MI_DEBUG==0) // security: randomize start of huge pages unless in debug mode + uintptr_t r = _mi_heap_random_next(mi_prim_get_default_heap()); + start = start + ((uintptr_t)MI_HUGE_OS_PAGE_SIZE * ((r>>17) & 0x0FFF)); // (randomly 12bits)*1GiB == between 0 to 4TiB + #endif + } + end = start + size; + mi_assert_internal(end % MI_SEGMENT_SIZE == 0); + } while (!mi_atomic_cas_strong_acq_rel(&mi_huge_start, &huge_start, end)); + + if (total_size != NULL) *total_size = size; + return (uint8_t*)start; +} +#else +static uint8_t* mi_os_claim_huge_pages(size_t pages, size_t* total_size) { + MI_UNUSED(pages); + if (total_size != NULL) *total_size = 0; + return NULL; +} +#endif + +// Allocate MI_SEGMENT_SIZE aligned huge pages +void* _mi_os_alloc_huge_os_pages(size_t pages, int numa_node, mi_msecs_t max_msecs, size_t* pages_reserved, size_t* psize, mi_memid_t* memid) { + *memid = _mi_memid_none(); + if (psize != NULL) *psize = 0; + if (pages_reserved != NULL) *pages_reserved = 0; + size_t size = 0; + uint8_t* start = mi_os_claim_huge_pages(pages, &size); + if (start == NULL) return NULL; // or 32-bit systems + + // Allocate one page at the time but try to place them contiguously + // We allocate one page at the time to be able to abort if it takes too long + // or to at least allocate as many as available on the system. + mi_msecs_t start_t = _mi_clock_start(); + size_t page = 0; + bool all_zero = true; + while (page < pages) { + // allocate a page + bool is_zero = false; + void* addr = start + (page * MI_HUGE_OS_PAGE_SIZE); + void* p = NULL; + int err = _mi_prim_alloc_huge_os_pages(addr, MI_HUGE_OS_PAGE_SIZE, numa_node, &is_zero, &p); + if (!is_zero) { all_zero = false; } + if (err != 0) { + _mi_warning_message("unable to allocate huge OS page (error: %d (0x%x), address: %p, size: %zx bytes)\n", err, err, addr, MI_HUGE_OS_PAGE_SIZE); + break; + } + + // Did we succeed at a contiguous address? + if (p != addr) { + // no success, issue a warning and break + if (p != NULL) { + _mi_warning_message("could not allocate contiguous huge OS page %zu at %p\n", page, addr); + mi_os_prim_free(p, MI_HUGE_OS_PAGE_SIZE, true, &_mi_stats_main); + } + break; + } + + // success, record it + page++; // increase before timeout check (see issue #711) + _mi_stat_increase(&_mi_stats_main.committed, MI_HUGE_OS_PAGE_SIZE); + _mi_stat_increase(&_mi_stats_main.reserved, MI_HUGE_OS_PAGE_SIZE); + + // check for timeout + if (max_msecs > 0) { + mi_msecs_t elapsed = _mi_clock_end(start_t); + if (page >= 1) { + mi_msecs_t estimate = ((elapsed / (page+1)) * pages); + if (estimate > 2*max_msecs) { // seems like we are going to timeout, break + elapsed = max_msecs + 1; + } + } + if (elapsed > max_msecs) { + _mi_warning_message("huge OS page allocation timed out (after allocating %zu page(s))\n", page); + break; + } + } + } + mi_assert_internal(page*MI_HUGE_OS_PAGE_SIZE <= size); + if (pages_reserved != NULL) { *pages_reserved = page; } + if (psize != NULL) { *psize = page * MI_HUGE_OS_PAGE_SIZE; } + if (page != 0) { + mi_assert(start != NULL); + *memid = _mi_memid_create_os(true /* is committed */, all_zero, true /* is_large */); + memid->memkind = MI_MEM_OS_HUGE; + mi_assert(memid->is_pinned); + #ifdef MI_TRACK_ASAN + if (all_zero) { mi_track_mem_defined(start,size); } + #endif + } + return (page == 0 ? NULL : start); +} + +// free every huge page in a range individually (as we allocated per page) +// note: needed with VirtualAlloc but could potentially be done in one go on mmap'd systems. +static void mi_os_free_huge_os_pages(void* p, size_t size, mi_stats_t* stats) { + if (p==NULL || size==0) return; + uint8_t* base = (uint8_t*)p; + while (size >= MI_HUGE_OS_PAGE_SIZE) { + mi_os_prim_free(base, MI_HUGE_OS_PAGE_SIZE, true, stats); + size -= MI_HUGE_OS_PAGE_SIZE; + base += MI_HUGE_OS_PAGE_SIZE; + } +} + +/* ---------------------------------------------------------------------------- +Support NUMA aware allocation +-----------------------------------------------------------------------------*/ + +_Atomic(size_t) _mi_numa_node_count; // = 0 // cache the node count + +size_t _mi_os_numa_node_count_get(void) { + size_t count = mi_atomic_load_acquire(&_mi_numa_node_count); + if (count <= 0) { + long ncount = mi_option_get(mi_option_use_numa_nodes); // given explicitly? + if (ncount > 0) { + count = (size_t)ncount; + } + else { + count = _mi_prim_numa_node_count(); // or detect dynamically + if (count == 0) count = 1; + } + mi_atomic_store_release(&_mi_numa_node_count, count); // save it + _mi_verbose_message("using %zd numa regions\n", count); + } + return count; +} + +int _mi_os_numa_node_get(mi_os_tld_t* tld) { + MI_UNUSED(tld); + size_t numa_count = _mi_os_numa_node_count(); + if (numa_count<=1) return 0; // optimize on single numa node systems: always node 0 + // never more than the node count and >= 0 + size_t numa_node = _mi_prim_numa_node(); + if (numa_node >= numa_count) { numa_node = numa_node % numa_count; } + return (int)numa_node; +} diff --git a/compat/mimalloc/page-queue.c b/compat/mimalloc/page-queue.c new file mode 100644 index 00000000000000..5619a81f9917fe --- /dev/null +++ b/compat/mimalloc/page-queue.c @@ -0,0 +1,332 @@ +/*---------------------------------------------------------------------------- +Copyright (c) 2018-2020, Microsoft Research, Daan Leijen +This is free software; you can redistribute it and/or modify it under the +terms of the MIT license. A copy of the license can be found in the file +"LICENSE" at the root of this distribution. +-----------------------------------------------------------------------------*/ + +/* ----------------------------------------------------------- + Definition of page queues for each block size +----------------------------------------------------------- */ + +#ifndef MI_IN_PAGE_C +#error "this file should be included from 'page.c'" +#endif + +/* ----------------------------------------------------------- + Minimal alignment in machine words (i.e. `sizeof(void*)`) +----------------------------------------------------------- */ + +#if (MI_MAX_ALIGN_SIZE > 4*MI_INTPTR_SIZE) + #error "define alignment for more than 4x word size for this platform" +#elif (MI_MAX_ALIGN_SIZE > 2*MI_INTPTR_SIZE) + #define MI_ALIGN4W // 4 machine words minimal alignment +#elif (MI_MAX_ALIGN_SIZE > MI_INTPTR_SIZE) + #define MI_ALIGN2W // 2 machine words minimal alignment +#else + // ok, default alignment is 1 word +#endif + + +/* ----------------------------------------------------------- + Queue query +----------------------------------------------------------- */ + + +static inline bool mi_page_queue_is_huge(const mi_page_queue_t* pq) { + return (pq->block_size == (MI_MEDIUM_OBJ_SIZE_MAX+sizeof(uintptr_t))); +} + +static inline bool mi_page_queue_is_full(const mi_page_queue_t* pq) { + return (pq->block_size == (MI_MEDIUM_OBJ_SIZE_MAX+(2*sizeof(uintptr_t)))); +} + +static inline bool mi_page_queue_is_special(const mi_page_queue_t* pq) { + return (pq->block_size > MI_MEDIUM_OBJ_SIZE_MAX); +} + +/* ----------------------------------------------------------- + Bins +----------------------------------------------------------- */ + +// Return the bin for a given field size. +// Returns MI_BIN_HUGE if the size is too large. +// We use `wsize` for the size in "machine word sizes", +// i.e. byte size == `wsize*sizeof(void*)`. +static inline uint8_t mi_bin(size_t size) { + size_t wsize = _mi_wsize_from_size(size); + uint8_t bin; + if (wsize <= 1) { + bin = 1; + } + #if defined(MI_ALIGN4W) + else if (wsize <= 4) { + bin = (uint8_t)((wsize+1)&~1); // round to double word sizes + } + #elif defined(MI_ALIGN2W) + else if (wsize <= 8) { + bin = (uint8_t)((wsize+1)&~1); // round to double word sizes + } + #else + else if (wsize <= 8) { + bin = (uint8_t)wsize; + } + #endif + else if (wsize > MI_MEDIUM_OBJ_WSIZE_MAX) { + bin = MI_BIN_HUGE; + } + else { + #if defined(MI_ALIGN4W) + if (wsize <= 16) { wsize = (wsize+3)&~3; } // round to 4x word sizes + #endif + wsize--; + // find the highest bit + uint8_t b = (uint8_t)mi_bsr(wsize); // note: wsize != 0 + // and use the top 3 bits to determine the bin (~12.5% worst internal fragmentation). + // - adjust with 3 because we use do not round the first 8 sizes + // which each get an exact bin + bin = ((b << 2) + (uint8_t)((wsize >> (b - 2)) & 0x03)) - 3; + mi_assert_internal(bin < MI_BIN_HUGE); + } + mi_assert_internal(bin > 0 && bin <= MI_BIN_HUGE); + return bin; +} + + + +/* ----------------------------------------------------------- + Queue of pages with free blocks +----------------------------------------------------------- */ + +uint8_t _mi_bin(size_t size) { + return mi_bin(size); +} + +size_t _mi_bin_size(uint8_t bin) { + return _mi_heap_empty.pages[bin].block_size; +} + +// Good size for allocation +size_t mi_good_size(size_t size) mi_attr_noexcept { + if (size <= MI_MEDIUM_OBJ_SIZE_MAX) { + return _mi_bin_size(mi_bin(size)); + } + else { + return _mi_align_up(size,_mi_os_page_size()); + } +} + +#if (MI_DEBUG>1) +static bool mi_page_queue_contains(mi_page_queue_t* queue, const mi_page_t* page) { + mi_assert_internal(page != NULL); + mi_page_t* list = queue->first; + while (list != NULL) { + mi_assert_internal(list->next == NULL || list->next->prev == list); + mi_assert_internal(list->prev == NULL || list->prev->next == list); + if (list == page) break; + list = list->next; + } + return (list == page); +} + +#endif + +#if (MI_DEBUG>1) +static bool mi_heap_contains_queue(const mi_heap_t* heap, const mi_page_queue_t* pq) { + return (pq >= &heap->pages[0] && pq <= &heap->pages[MI_BIN_FULL]); +} +#endif + +static mi_page_queue_t* mi_page_queue_of(const mi_page_t* page) { + uint8_t bin = (mi_page_is_in_full(page) ? MI_BIN_FULL : mi_bin(page->xblock_size)); + mi_heap_t* heap = mi_page_heap(page); + mi_assert_internal(heap != NULL && bin <= MI_BIN_FULL); + mi_page_queue_t* pq = &heap->pages[bin]; + mi_assert_internal(bin >= MI_BIN_HUGE || page->xblock_size == pq->block_size); + mi_assert_expensive(mi_page_queue_contains(pq, page)); + return pq; +} + +static mi_page_queue_t* mi_heap_page_queue_of(mi_heap_t* heap, const mi_page_t* page) { + uint8_t bin = (mi_page_is_in_full(page) ? MI_BIN_FULL : mi_bin(page->xblock_size)); + mi_assert_internal(bin <= MI_BIN_FULL); + mi_page_queue_t* pq = &heap->pages[bin]; + mi_assert_internal(mi_page_is_in_full(page) || page->xblock_size == pq->block_size); + return pq; +} + +// The current small page array is for efficiency and for each +// small size (up to 256) it points directly to the page for that +// size without having to compute the bin. This means when the +// current free page queue is updated for a small bin, we need to update a +// range of entries in `_mi_page_small_free`. +static inline void mi_heap_queue_first_update(mi_heap_t* heap, const mi_page_queue_t* pq) { + mi_assert_internal(mi_heap_contains_queue(heap,pq)); + size_t size = pq->block_size; + if (size > MI_SMALL_SIZE_MAX) return; + + mi_page_t* page = pq->first; + if (pq->first == NULL) page = (mi_page_t*)&_mi_page_empty; + + // find index in the right direct page array + size_t start; + size_t idx = _mi_wsize_from_size(size); + mi_page_t** pages_free = heap->pages_free_direct; + + if (pages_free[idx] == page) return; // already set + + // find start slot + if (idx<=1) { + start = 0; + } + else { + // find previous size; due to minimal alignment upto 3 previous bins may need to be skipped + uint8_t bin = mi_bin(size); + const mi_page_queue_t* prev = pq - 1; + while( bin == mi_bin(prev->block_size) && prev > &heap->pages[0]) { + prev--; + } + start = 1 + _mi_wsize_from_size(prev->block_size); + if (start > idx) start = idx; + } + + // set size range to the right page + mi_assert(start <= idx); + for (size_t sz = start; sz <= idx; sz++) { + pages_free[sz] = page; + } +} + +/* +static bool mi_page_queue_is_empty(mi_page_queue_t* queue) { + return (queue->first == NULL); +} +*/ + +static void mi_page_queue_remove(mi_page_queue_t* queue, mi_page_t* page) { + mi_assert_internal(page != NULL); + mi_assert_expensive(mi_page_queue_contains(queue, page)); + mi_assert_internal(page->xblock_size == queue->block_size || (page->xblock_size > MI_MEDIUM_OBJ_SIZE_MAX && mi_page_queue_is_huge(queue)) || (mi_page_is_in_full(page) && mi_page_queue_is_full(queue))); + mi_heap_t* heap = mi_page_heap(page); + + if (page->prev != NULL) page->prev->next = page->next; + if (page->next != NULL) page->next->prev = page->prev; + if (page == queue->last) queue->last = page->prev; + if (page == queue->first) { + queue->first = page->next; + // update first + mi_assert_internal(mi_heap_contains_queue(heap, queue)); + mi_heap_queue_first_update(heap,queue); + } + heap->page_count--; + page->next = NULL; + page->prev = NULL; + // mi_atomic_store_ptr_release(mi_atomic_cast(void*, &page->heap), NULL); + mi_page_set_in_full(page,false); +} + + +static void mi_page_queue_push(mi_heap_t* heap, mi_page_queue_t* queue, mi_page_t* page) { + mi_assert_internal(mi_page_heap(page) == heap); + mi_assert_internal(!mi_page_queue_contains(queue, page)); + #if MI_HUGE_PAGE_ABANDON + mi_assert_internal(_mi_page_segment(page)->kind != MI_SEGMENT_HUGE); + #endif + mi_assert_internal(page->xblock_size == queue->block_size || + (page->xblock_size > MI_MEDIUM_OBJ_SIZE_MAX) || + (mi_page_is_in_full(page) && mi_page_queue_is_full(queue))); + + mi_page_set_in_full(page, mi_page_queue_is_full(queue)); + // mi_atomic_store_ptr_release(mi_atomic_cast(void*, &page->heap), heap); + page->next = queue->first; + page->prev = NULL; + if (queue->first != NULL) { + mi_assert_internal(queue->first->prev == NULL); + queue->first->prev = page; + queue->first = page; + } + else { + queue->first = queue->last = page; + } + + // update direct + mi_heap_queue_first_update(heap, queue); + heap->page_count++; +} + + +static void mi_page_queue_enqueue_from(mi_page_queue_t* to, mi_page_queue_t* from, mi_page_t* page) { + mi_assert_internal(page != NULL); + mi_assert_expensive(mi_page_queue_contains(from, page)); + mi_assert_expensive(!mi_page_queue_contains(to, page)); + + mi_assert_internal((page->xblock_size == to->block_size && page->xblock_size == from->block_size) || + (page->xblock_size == to->block_size && mi_page_queue_is_full(from)) || + (page->xblock_size == from->block_size && mi_page_queue_is_full(to)) || + (page->xblock_size > MI_LARGE_OBJ_SIZE_MAX && mi_page_queue_is_huge(to)) || + (page->xblock_size > MI_LARGE_OBJ_SIZE_MAX && mi_page_queue_is_full(to))); + + mi_heap_t* heap = mi_page_heap(page); + if (page->prev != NULL) page->prev->next = page->next; + if (page->next != NULL) page->next->prev = page->prev; + if (page == from->last) from->last = page->prev; + if (page == from->first) { + from->first = page->next; + // update first + mi_assert_internal(mi_heap_contains_queue(heap, from)); + mi_heap_queue_first_update(heap, from); + } + + page->prev = to->last; + page->next = NULL; + if (to->last != NULL) { + mi_assert_internal(heap == mi_page_heap(to->last)); + to->last->next = page; + to->last = page; + } + else { + to->first = page; + to->last = page; + mi_heap_queue_first_update(heap, to); + } + + mi_page_set_in_full(page, mi_page_queue_is_full(to)); +} + +// Only called from `mi_heap_absorb`. +size_t _mi_page_queue_append(mi_heap_t* heap, mi_page_queue_t* pq, mi_page_queue_t* append) { + mi_assert_internal(mi_heap_contains_queue(heap,pq)); + mi_assert_internal(pq->block_size == append->block_size); + + if (append->first==NULL) return 0; + + // set append pages to new heap and count + size_t count = 0; + for (mi_page_t* page = append->first; page != NULL; page = page->next) { + // inline `mi_page_set_heap` to avoid wrong assertion during absorption; + // in this case it is ok to be delayed freeing since both "to" and "from" heap are still alive. + mi_atomic_store_release(&page->xheap, (uintptr_t)heap); + // set the flag to delayed free (not overriding NEVER_DELAYED_FREE) which has as a + // side effect that it spins until any DELAYED_FREEING is finished. This ensures + // that after appending only the new heap will be used for delayed free operations. + _mi_page_use_delayed_free(page, MI_USE_DELAYED_FREE, false); + count++; + } + + if (pq->last==NULL) { + // take over afresh + mi_assert_internal(pq->first==NULL); + pq->first = append->first; + pq->last = append->last; + mi_heap_queue_first_update(heap, pq); + } + else { + // append to end + mi_assert_internal(pq->last!=NULL); + mi_assert_internal(append->first!=NULL); + pq->last->next = append->first; + append->first->prev = pq->last; + pq->last = append->last; + } + return count; +} diff --git a/compat/mimalloc/page.c b/compat/mimalloc/page.c new file mode 100644 index 00000000000000..211204aa79e59d --- /dev/null +++ b/compat/mimalloc/page.c @@ -0,0 +1,939 @@ +/*---------------------------------------------------------------------------- +Copyright (c) 2018-2020, Microsoft Research, Daan Leijen +This is free software; you can redistribute it and/or modify it under the +terms of the MIT license. A copy of the license can be found in the file +"LICENSE" at the root of this distribution. +-----------------------------------------------------------------------------*/ + +/* ----------------------------------------------------------- + The core of the allocator. Every segment contains + pages of a certain block size. The main function + exported is `mi_malloc_generic`. +----------------------------------------------------------- */ + +#include "mimalloc.h" +#include "mimalloc/internal.h" +#include "mimalloc/atomic.h" + +/* ----------------------------------------------------------- + Definition of page queues for each block size +----------------------------------------------------------- */ + +#define MI_IN_PAGE_C +#include "page-queue.c" +#undef MI_IN_PAGE_C + + +/* ----------------------------------------------------------- + Page helpers +----------------------------------------------------------- */ + +// Index a block in a page +static inline mi_block_t* mi_page_block_at(const mi_page_t* page, void* page_start, size_t block_size, size_t i) { + MI_UNUSED(page); + mi_assert_internal(page != NULL); + mi_assert_internal(i <= page->reserved); + return (mi_block_t*)((uint8_t*)page_start + (i * block_size)); +} + +static void mi_page_init(mi_heap_t* heap, mi_page_t* page, size_t size, mi_tld_t* tld); +static void mi_page_extend_free(mi_heap_t* heap, mi_page_t* page, mi_tld_t* tld); + +#if (MI_DEBUG>=3) +static size_t mi_page_list_count(mi_page_t* page, mi_block_t* head) { + size_t count = 0; + while (head != NULL) { + mi_assert_internal(page == _mi_ptr_page(head)); + count++; + head = mi_block_next(page, head); + } + return count; +} + +/* +// Start of the page available memory +static inline uint8_t* mi_page_area(const mi_page_t* page) { + return _mi_page_start(_mi_page_segment(page), page, NULL); +} +*/ + +static bool mi_page_list_is_valid(mi_page_t* page, mi_block_t* p) { + size_t psize; + uint8_t* page_area = _mi_page_start(_mi_page_segment(page), page, &psize); + mi_block_t* start = (mi_block_t*)page_area; + mi_block_t* end = (mi_block_t*)(page_area + psize); + while(p != NULL) { + if (p < start || p >= end) return false; + p = mi_block_next(page, p); + } +#if MI_DEBUG>3 // generally too expensive to check this + if (page->free_is_zero) { + const size_t ubsize = mi_page_usable_block_size(page); + for (mi_block_t* block = page->free; block != NULL; block = mi_block_next(page, block)) { + mi_assert_expensive(mi_mem_is_zero(block + 1, ubsize - sizeof(mi_block_t))); + } + } +#endif + return true; +} + +static bool mi_page_is_valid_init(mi_page_t* page) { + mi_assert_internal(page->xblock_size > 0); + mi_assert_internal(page->used <= page->capacity); + mi_assert_internal(page->capacity <= page->reserved); + + mi_segment_t* segment = _mi_page_segment(page); + uint8_t* start = _mi_page_start(segment,page,NULL); + mi_assert_internal(start == _mi_segment_page_start(segment,page,NULL)); + //const size_t bsize = mi_page_block_size(page); + //mi_assert_internal(start + page->capacity*page->block_size == page->top); + + mi_assert_internal(mi_page_list_is_valid(page,page->free)); + mi_assert_internal(mi_page_list_is_valid(page,page->local_free)); + + #if MI_DEBUG>3 // generally too expensive to check this + if (page->free_is_zero) { + const size_t ubsize = mi_page_usable_block_size(page); + for(mi_block_t* block = page->free; block != NULL; block = mi_block_next(page,block)) { + mi_assert_expensive(mi_mem_is_zero(block + 1, ubsize - sizeof(mi_block_t))); + } + } + #endif + + #if !MI_TRACK_ENABLED && !MI_TSAN + mi_block_t* tfree = mi_page_thread_free(page); + mi_assert_internal(mi_page_list_is_valid(page, tfree)); + //size_t tfree_count = mi_page_list_count(page, tfree); + //mi_assert_internal(tfree_count <= page->thread_freed + 1); + #endif + + size_t free_count = mi_page_list_count(page, page->free) + mi_page_list_count(page, page->local_free); + mi_assert_internal(page->used + free_count == page->capacity); + + return true; +} + +extern bool _mi_process_is_initialized; // has mi_process_init been called? + +bool _mi_page_is_valid(mi_page_t* page) { + mi_assert_internal(mi_page_is_valid_init(page)); + #if MI_SECURE + mi_assert_internal(page->keys[0] != 0); + #endif + if (mi_page_heap(page)!=NULL) { + mi_segment_t* segment = _mi_page_segment(page); + + mi_assert_internal(!_mi_process_is_initialized || segment->thread_id==0 || segment->thread_id == mi_page_heap(page)->thread_id); + #if MI_HUGE_PAGE_ABANDON + if (segment->kind != MI_SEGMENT_HUGE) + #endif + { + mi_page_queue_t* pq = mi_page_queue_of(page); + mi_assert_internal(mi_page_queue_contains(pq, page)); + mi_assert_internal(pq->block_size==mi_page_block_size(page) || mi_page_block_size(page) > MI_MEDIUM_OBJ_SIZE_MAX || mi_page_is_in_full(page)); + mi_assert_internal(mi_heap_contains_queue(mi_page_heap(page),pq)); + } + } + return true; +} +#endif + +void _mi_page_use_delayed_free(mi_page_t* page, mi_delayed_t delay, bool override_never) { + while (!_mi_page_try_use_delayed_free(page, delay, override_never)) { + mi_atomic_yield(); + } +} + +bool _mi_page_try_use_delayed_free(mi_page_t* page, mi_delayed_t delay, bool override_never) { + mi_thread_free_t tfreex; + mi_delayed_t old_delay; + mi_thread_free_t tfree; + size_t yield_count = 0; + do { + tfree = mi_atomic_load_acquire(&page->xthread_free); // note: must acquire as we can break/repeat this loop and not do a CAS; + tfreex = mi_tf_set_delayed(tfree, delay); + old_delay = mi_tf_delayed(tfree); + if mi_unlikely(old_delay == MI_DELAYED_FREEING) { + if (yield_count >= 4) return false; // give up after 4 tries + yield_count++; + mi_atomic_yield(); // delay until outstanding MI_DELAYED_FREEING are done. + // tfree = mi_tf_set_delayed(tfree, MI_NO_DELAYED_FREE); // will cause CAS to busy fail + } + else if (delay == old_delay) { + break; // avoid atomic operation if already equal + } + else if (!override_never && old_delay == MI_NEVER_DELAYED_FREE) { + break; // leave never-delayed flag set + } + } while ((old_delay == MI_DELAYED_FREEING) || + !mi_atomic_cas_weak_release(&page->xthread_free, &tfree, tfreex)); + + return true; // success +} + +/* ----------------------------------------------------------- + Page collect the `local_free` and `thread_free` lists +----------------------------------------------------------- */ + +// Collect the local `thread_free` list using an atomic exchange. +// Note: The exchange must be done atomically as this is used right after +// moving to the full list in `mi_page_collect_ex` and we need to +// ensure that there was no race where the page became unfull just before the move. +static void _mi_page_thread_free_collect(mi_page_t* page) +{ + mi_block_t* head; + mi_thread_free_t tfreex; + mi_thread_free_t tfree = mi_atomic_load_relaxed(&page->xthread_free); + do { + head = mi_tf_block(tfree); + tfreex = mi_tf_set_block(tfree,NULL); + } while (!mi_atomic_cas_weak_acq_rel(&page->xthread_free, &tfree, tfreex)); + + // return if the list is empty + if (head == NULL) return; + + // find the tail -- also to get a proper count (without data races) + uint32_t max_count = page->capacity; // cannot collect more than capacity + uint32_t count = 1; + mi_block_t* tail = head; + mi_block_t* next; + while ((next = mi_block_next(page,tail)) != NULL && count <= max_count) { + count++; + tail = next; + } + // if `count > max_count` there was a memory corruption (possibly infinite list due to double multi-threaded free) + if (count > max_count) { + _mi_error_message(EFAULT, "corrupted thread-free list\n"); + return; // the thread-free items cannot be freed + } + + // and append the current local free list + mi_block_set_next(page,tail, page->local_free); + page->local_free = head; + + // update counts now + page->used -= count; +} + +void _mi_page_free_collect(mi_page_t* page, bool force) { + mi_assert_internal(page!=NULL); + + // collect the thread free list + if (force || mi_page_thread_free(page) != NULL) { // quick test to avoid an atomic operation + _mi_page_thread_free_collect(page); + } + + // and the local free list + if (page->local_free != NULL) { + if mi_likely(page->free == NULL) { + // usual case + page->free = page->local_free; + page->local_free = NULL; + page->free_is_zero = false; + } + else if (force) { + // append -- only on shutdown (force) as this is a linear operation + mi_block_t* tail = page->local_free; + mi_block_t* next; + while ((next = mi_block_next(page, tail)) != NULL) { + tail = next; + } + mi_block_set_next(page, tail, page->free); + page->free = page->local_free; + page->local_free = NULL; + page->free_is_zero = false; + } + } + + mi_assert_internal(!force || page->local_free == NULL); +} + + + +/* ----------------------------------------------------------- + Page fresh and retire +----------------------------------------------------------- */ + +// called from segments when reclaiming abandoned pages +void _mi_page_reclaim(mi_heap_t* heap, mi_page_t* page) { + mi_assert_expensive(mi_page_is_valid_init(page)); + + mi_assert_internal(mi_page_heap(page) == heap); + mi_assert_internal(mi_page_thread_free_flag(page) != MI_NEVER_DELAYED_FREE); + #if MI_HUGE_PAGE_ABANDON + mi_assert_internal(_mi_page_segment(page)->kind != MI_SEGMENT_HUGE); + #endif + + // TODO: push on full queue immediately if it is full? + mi_page_queue_t* pq = mi_page_queue(heap, mi_page_block_size(page)); + mi_page_queue_push(heap, pq, page); + mi_assert_expensive(_mi_page_is_valid(page)); +} + +// allocate a fresh page from a segment +static mi_page_t* mi_page_fresh_alloc(mi_heap_t* heap, mi_page_queue_t* pq, size_t block_size, size_t page_alignment) { + #if !MI_HUGE_PAGE_ABANDON + mi_assert_internal(pq != NULL); + mi_assert_internal(mi_heap_contains_queue(heap, pq)); + mi_assert_internal(page_alignment > 0 || block_size > MI_MEDIUM_OBJ_SIZE_MAX || block_size == pq->block_size); + #endif + mi_page_t* page = _mi_segment_page_alloc(heap, block_size, page_alignment, &heap->tld->segments, &heap->tld->os); + if (page == NULL) { + // this may be out-of-memory, or an abandoned page was reclaimed (and in our queue) + return NULL; + } + mi_assert_internal(page_alignment >0 || block_size > MI_MEDIUM_OBJ_SIZE_MAX || _mi_page_segment(page)->kind != MI_SEGMENT_HUGE); + mi_assert_internal(pq!=NULL || page->xblock_size != 0); + mi_assert_internal(pq!=NULL || mi_page_block_size(page) >= block_size); + // a fresh page was found, initialize it + const size_t full_block_size = ((pq == NULL || mi_page_queue_is_huge(pq)) ? mi_page_block_size(page) : block_size); // see also: mi_segment_huge_page_alloc + mi_assert_internal(full_block_size >= block_size); + mi_page_init(heap, page, full_block_size, heap->tld); + mi_heap_stat_increase(heap, pages, 1); + if (pq != NULL) { mi_page_queue_push(heap, pq, page); } + mi_assert_expensive(_mi_page_is_valid(page)); + return page; +} + +// Get a fresh page to use +static mi_page_t* mi_page_fresh(mi_heap_t* heap, mi_page_queue_t* pq) { + mi_assert_internal(mi_heap_contains_queue(heap, pq)); + mi_page_t* page = mi_page_fresh_alloc(heap, pq, pq->block_size, 0); + if (page==NULL) return NULL; + mi_assert_internal(pq->block_size==mi_page_block_size(page)); + mi_assert_internal(pq==mi_page_queue(heap, mi_page_block_size(page))); + return page; +} + +/* ----------------------------------------------------------- + Do any delayed frees + (put there by other threads if they deallocated in a full page) +----------------------------------------------------------- */ +void _mi_heap_delayed_free_all(mi_heap_t* heap) { + while (!_mi_heap_delayed_free_partial(heap)) { + mi_atomic_yield(); + } +} + +// returns true if all delayed frees were processed +bool _mi_heap_delayed_free_partial(mi_heap_t* heap) { + // take over the list (note: no atomic exchange since it is often NULL) + mi_block_t* block = mi_atomic_load_ptr_relaxed(mi_block_t, &heap->thread_delayed_free); + while (block != NULL && !mi_atomic_cas_ptr_weak_acq_rel(mi_block_t, &heap->thread_delayed_free, &block, NULL)) { /* nothing */ }; + bool all_freed = true; + + // and free them all + while(block != NULL) { + mi_block_t* next = mi_block_nextx(heap,block, heap->keys); + // use internal free instead of regular one to keep stats etc correct + if (!_mi_free_delayed_block(block)) { + // we might already start delayed freeing while another thread has not yet + // reset the delayed_freeing flag; in that case delay it further by reinserting the current block + // into the delayed free list + all_freed = false; + mi_block_t* dfree = mi_atomic_load_ptr_relaxed(mi_block_t, &heap->thread_delayed_free); + do { + mi_block_set_nextx(heap, block, dfree, heap->keys); + } while (!mi_atomic_cas_ptr_weak_release(mi_block_t,&heap->thread_delayed_free, &dfree, block)); + } + block = next; + } + return all_freed; +} + +/* ----------------------------------------------------------- + Unfull, abandon, free and retire +----------------------------------------------------------- */ + +// Move a page from the full list back to a regular list +void _mi_page_unfull(mi_page_t* page) { + mi_assert_internal(page != NULL); + mi_assert_expensive(_mi_page_is_valid(page)); + mi_assert_internal(mi_page_is_in_full(page)); + if (!mi_page_is_in_full(page)) return; + + mi_heap_t* heap = mi_page_heap(page); + mi_page_queue_t* pqfull = &heap->pages[MI_BIN_FULL]; + mi_page_set_in_full(page, false); // to get the right queue + mi_page_queue_t* pq = mi_heap_page_queue_of(heap, page); + mi_page_set_in_full(page, true); + mi_page_queue_enqueue_from(pq, pqfull, page); +} + +static void mi_page_to_full(mi_page_t* page, mi_page_queue_t* pq) { + mi_assert_internal(pq == mi_page_queue_of(page)); + mi_assert_internal(!mi_page_immediate_available(page)); + mi_assert_internal(!mi_page_is_in_full(page)); + + if (mi_page_is_in_full(page)) return; + mi_page_queue_enqueue_from(&mi_page_heap(page)->pages[MI_BIN_FULL], pq, page); + _mi_page_free_collect(page,false); // try to collect right away in case another thread freed just before MI_USE_DELAYED_FREE was set +} + + +// Abandon a page with used blocks at the end of a thread. +// Note: only call if it is ensured that no references exist from +// the `page->heap->thread_delayed_free` into this page. +// Currently only called through `mi_heap_collect_ex` which ensures this. +void _mi_page_abandon(mi_page_t* page, mi_page_queue_t* pq) { + mi_assert_internal(page != NULL); + mi_assert_expensive(_mi_page_is_valid(page)); + mi_assert_internal(pq == mi_page_queue_of(page)); + mi_assert_internal(mi_page_heap(page) != NULL); + + mi_heap_t* pheap = mi_page_heap(page); + + // remove from our page list + mi_segments_tld_t* segments_tld = &pheap->tld->segments; + mi_page_queue_remove(pq, page); + + // page is no longer associated with our heap + mi_assert_internal(mi_page_thread_free_flag(page)==MI_NEVER_DELAYED_FREE); + mi_page_set_heap(page, NULL); + +#if (MI_DEBUG>1) && !MI_TRACK_ENABLED + // check there are no references left.. + for (mi_block_t* block = (mi_block_t*)pheap->thread_delayed_free; block != NULL; block = mi_block_nextx(pheap, block, pheap->keys)) { + mi_assert_internal(_mi_ptr_page(block) != page); + } +#endif + + // and abandon it + mi_assert_internal(mi_page_heap(page) == NULL); + _mi_segment_page_abandon(page,segments_tld); +} + + +// Free a page with no more free blocks +void _mi_page_free(mi_page_t* page, mi_page_queue_t* pq, bool force) { + mi_assert_internal(page != NULL); + mi_assert_expensive(_mi_page_is_valid(page)); + mi_assert_internal(pq == mi_page_queue_of(page)); + mi_assert_internal(mi_page_all_free(page)); + mi_assert_internal(mi_page_thread_free_flag(page)!=MI_DELAYED_FREEING); + + // no more aligned blocks in here + mi_page_set_has_aligned(page, false); + + mi_heap_t* heap = mi_page_heap(page); + + // remove from the page list + // (no need to do _mi_heap_delayed_free first as all blocks are already free) + mi_segments_tld_t* segments_tld = &heap->tld->segments; + mi_page_queue_remove(pq, page); + + // and free it + mi_page_set_heap(page,NULL); + _mi_segment_page_free(page, force, segments_tld); +} + +// Retire parameters +#define MI_MAX_RETIRE_SIZE (MI_MEDIUM_OBJ_SIZE_MAX) +#define MI_RETIRE_CYCLES (16) + +// Retire a page with no more used blocks +// Important to not retire too quickly though as new +// allocations might coming. +// Note: called from `mi_free` and benchmarks often +// trigger this due to freeing everything and then +// allocating again so careful when changing this. +void _mi_page_retire(mi_page_t* page) mi_attr_noexcept { + mi_assert_internal(page != NULL); + mi_assert_expensive(_mi_page_is_valid(page)); + mi_assert_internal(mi_page_all_free(page)); + + mi_page_set_has_aligned(page, false); + + // don't retire too often.. + // (or we end up retiring and re-allocating most of the time) + // NOTE: refine this more: we should not retire if this + // is the only page left with free blocks. It is not clear + // how to check this efficiently though... + // for now, we don't retire if it is the only page left of this size class. + mi_page_queue_t* pq = mi_page_queue_of(page); + if mi_likely(page->xblock_size <= MI_MAX_RETIRE_SIZE && !mi_page_queue_is_special(pq)) { // not too large && not full or huge queue? + if (pq->last==page && pq->first==page) { // the only page in the queue? + mi_stat_counter_increase(_mi_stats_main.page_no_retire,1); + page->retire_expire = 1 + (page->xblock_size <= MI_SMALL_OBJ_SIZE_MAX ? MI_RETIRE_CYCLES : MI_RETIRE_CYCLES/4); + mi_heap_t* heap = mi_page_heap(page); + mi_assert_internal(pq >= heap->pages); + const size_t index = pq - heap->pages; + mi_assert_internal(index < MI_BIN_FULL && index < MI_BIN_HUGE); + if (index < heap->page_retired_min) heap->page_retired_min = index; + if (index > heap->page_retired_max) heap->page_retired_max = index; + mi_assert_internal(mi_page_all_free(page)); + return; // dont't free after all + } + } + _mi_page_free(page, pq, false); +} + +// free retired pages: we don't need to look at the entire queues +// since we only retire pages that are at the head position in a queue. +void _mi_heap_collect_retired(mi_heap_t* heap, bool force) { + size_t min = MI_BIN_FULL; + size_t max = 0; + for(size_t bin = heap->page_retired_min; bin <= heap->page_retired_max; bin++) { + mi_page_queue_t* pq = &heap->pages[bin]; + mi_page_t* page = pq->first; + if (page != NULL && page->retire_expire != 0) { + if (mi_page_all_free(page)) { + page->retire_expire--; + if (force || page->retire_expire == 0) { + _mi_page_free(pq->first, pq, force); + } + else { + // keep retired, update min/max + if (bin < min) min = bin; + if (bin > max) max = bin; + } + } + else { + page->retire_expire = 0; + } + } + } + heap->page_retired_min = min; + heap->page_retired_max = max; +} + + +/* ----------------------------------------------------------- + Initialize the initial free list in a page. + In secure mode we initialize a randomized list by + alternating between slices. +----------------------------------------------------------- */ + +#define MI_MAX_SLICE_SHIFT (6) // at most 64 slices +#define MI_MAX_SLICES (1UL << MI_MAX_SLICE_SHIFT) +#define MI_MIN_SLICES (2) + +static void mi_page_free_list_extend_secure(mi_heap_t* const heap, mi_page_t* const page, const size_t bsize, const size_t extend, mi_stats_t* const stats) { + MI_UNUSED(stats); + #if (MI_SECURE<=2) + mi_assert_internal(page->free == NULL); + mi_assert_internal(page->local_free == NULL); + #endif + mi_assert_internal(page->capacity + extend <= page->reserved); + mi_assert_internal(bsize == mi_page_block_size(page)); + void* const page_area = _mi_page_start(_mi_page_segment(page), page, NULL); + + // initialize a randomized free list + // set up `slice_count` slices to alternate between + size_t shift = MI_MAX_SLICE_SHIFT; + while ((extend >> shift) == 0) { + shift--; + } + const size_t slice_count = (size_t)1U << shift; + const size_t slice_extend = extend / slice_count; + mi_assert_internal(slice_extend >= 1); + mi_block_t* blocks[MI_MAX_SLICES]; // current start of the slice + size_t counts[MI_MAX_SLICES]; // available objects in the slice + for (size_t i = 0; i < slice_count; i++) { + blocks[i] = mi_page_block_at(page, page_area, bsize, page->capacity + i*slice_extend); + counts[i] = slice_extend; + } + counts[slice_count-1] += (extend % slice_count); // final slice holds the modulus too (todo: distribute evenly?) + + // and initialize the free list by randomly threading through them + // set up first element + const uintptr_t r = _mi_heap_random_next(heap); + size_t current = r % slice_count; + counts[current]--; + mi_block_t* const free_start = blocks[current]; + // and iterate through the rest; use `random_shuffle` for performance + uintptr_t rnd = _mi_random_shuffle(r|1); // ensure not 0 + for (size_t i = 1; i < extend; i++) { + // call random_shuffle only every INTPTR_SIZE rounds + const size_t round = i%MI_INTPTR_SIZE; + if (round == 0) rnd = _mi_random_shuffle(rnd); + // select a random next slice index + size_t next = ((rnd >> 8*round) & (slice_count-1)); + while (counts[next]==0) { // ensure it still has space + next++; + if (next==slice_count) next = 0; + } + // and link the current block to it + counts[next]--; + mi_block_t* const block = blocks[current]; + blocks[current] = (mi_block_t*)((uint8_t*)block + bsize); // bump to the following block + mi_block_set_next(page, block, blocks[next]); // and set next; note: we may have `current == next` + current = next; + } + // prepend to the free list (usually NULL) + mi_block_set_next(page, blocks[current], page->free); // end of the list + page->free = free_start; +} + +static mi_decl_noinline void mi_page_free_list_extend( mi_page_t* const page, const size_t bsize, const size_t extend, mi_stats_t* const stats) +{ + MI_UNUSED(stats); + #if (MI_SECURE <= 2) + mi_assert_internal(page->free == NULL); + mi_assert_internal(page->local_free == NULL); + #endif + mi_assert_internal(page->capacity + extend <= page->reserved); + mi_assert_internal(bsize == mi_page_block_size(page)); + void* const page_area = _mi_page_start(_mi_page_segment(page), page, NULL ); + + mi_block_t* const start = mi_page_block_at(page, page_area, bsize, page->capacity); + + // initialize a sequential free list + mi_block_t* const last = mi_page_block_at(page, page_area, bsize, page->capacity + extend - 1); + mi_block_t* block = start; + while(block <= last) { + mi_block_t* next = (mi_block_t*)((uint8_t*)block + bsize); + mi_block_set_next(page,block,next); + block = next; + } + // prepend to free list (usually `NULL`) + mi_block_set_next(page, last, page->free); + page->free = start; +} + +/* ----------------------------------------------------------- + Page initialize and extend the capacity +----------------------------------------------------------- */ + +#define MI_MAX_EXTEND_SIZE (4*1024) // heuristic, one OS page seems to work well. +#if (MI_SECURE>0) +#define MI_MIN_EXTEND (8*MI_SECURE) // extend at least by this many +#else +#define MI_MIN_EXTEND (4) +#endif + +// Extend the capacity (up to reserved) by initializing a free list +// We do at most `MI_MAX_EXTEND` to avoid touching too much memory +// Note: we also experimented with "bump" allocation on the first +// allocations but this did not speed up any benchmark (due to an +// extra test in malloc? or cache effects?) +static void mi_page_extend_free(mi_heap_t* heap, mi_page_t* page, mi_tld_t* tld) { + MI_UNUSED(tld); + mi_assert_expensive(mi_page_is_valid_init(page)); + #if (MI_SECURE<=2) + mi_assert(page->free == NULL); + mi_assert(page->local_free == NULL); + if (page->free != NULL) return; + #endif + if (page->capacity >= page->reserved) return; + + size_t page_size; + _mi_page_start(_mi_page_segment(page), page, &page_size); + mi_stat_counter_increase(tld->stats.pages_extended, 1); + + // calculate the extend count + const size_t bsize = (page->xblock_size < MI_HUGE_BLOCK_SIZE ? page->xblock_size : page_size); + size_t extend = page->reserved - page->capacity; + mi_assert_internal(extend > 0); + + size_t max_extend = (bsize >= MI_MAX_EXTEND_SIZE ? MI_MIN_EXTEND : MI_MAX_EXTEND_SIZE/(uint32_t)bsize); + if (max_extend < MI_MIN_EXTEND) { max_extend = MI_MIN_EXTEND; } + mi_assert_internal(max_extend > 0); + + if (extend > max_extend) { + // ensure we don't touch memory beyond the page to reduce page commit. + // the `lean` benchmark tests this. Going from 1 to 8 increases rss by 50%. + extend = max_extend; + } + + mi_assert_internal(extend > 0 && extend + page->capacity <= page->reserved); + mi_assert_internal(extend < (1UL<<16)); + + // and append the extend the free list + if (extend < MI_MIN_SLICES || MI_SECURE==0) { //!mi_option_is_enabled(mi_option_secure)) { + mi_page_free_list_extend(page, bsize, extend, &tld->stats ); + } + else { + mi_page_free_list_extend_secure(heap, page, bsize, extend, &tld->stats); + } + // enable the new free list + page->capacity += (uint16_t)extend; + mi_stat_increase(tld->stats.page_committed, extend * bsize); + mi_assert_expensive(mi_page_is_valid_init(page)); +} + +// Initialize a fresh page +static void mi_page_init(mi_heap_t* heap, mi_page_t* page, size_t block_size, mi_tld_t* tld) { + mi_assert(page != NULL); + mi_segment_t* segment = _mi_page_segment(page); + mi_assert(segment != NULL); + mi_assert_internal(block_size > 0); + // set fields + mi_page_set_heap(page, heap); + page->xblock_size = (block_size < MI_HUGE_BLOCK_SIZE ? (uint32_t)block_size : MI_HUGE_BLOCK_SIZE); // initialize before _mi_segment_page_start + size_t page_size; + const void* page_start = _mi_segment_page_start(segment, page, &page_size); + MI_UNUSED(page_start); + mi_track_mem_noaccess(page_start,page_size); + mi_assert_internal(mi_page_block_size(page) <= page_size); + mi_assert_internal(page_size <= page->slice_count*MI_SEGMENT_SLICE_SIZE); + mi_assert_internal(page_size / block_size < (1L<<16)); + page->reserved = (uint16_t)(page_size / block_size); + mi_assert_internal(page->reserved > 0); + #if (MI_PADDING || MI_ENCODE_FREELIST) + page->keys[0] = _mi_heap_random_next(heap); + page->keys[1] = _mi_heap_random_next(heap); + #endif + page->free_is_zero = page->is_zero_init; + #if MI_DEBUG>2 + if (page->is_zero_init) { + mi_track_mem_defined(page_start, page_size); + mi_assert_expensive(mi_mem_is_zero(page_start, page_size)); + } + #endif + + mi_assert_internal(page->is_committed); + mi_assert_internal(page->capacity == 0); + mi_assert_internal(page->free == NULL); + mi_assert_internal(page->used == 0); + mi_assert_internal(page->xthread_free == 0); + mi_assert_internal(page->next == NULL); + mi_assert_internal(page->prev == NULL); + mi_assert_internal(page->retire_expire == 0); + mi_assert_internal(!mi_page_has_aligned(page)); + #if (MI_PADDING || MI_ENCODE_FREELIST) + mi_assert_internal(page->keys[0] != 0); + mi_assert_internal(page->keys[1] != 0); + #endif + mi_assert_expensive(mi_page_is_valid_init(page)); + + // initialize an initial free list + mi_page_extend_free(heap,page,tld); + mi_assert(mi_page_immediate_available(page)); +} + + +/* ----------------------------------------------------------- + Find pages with free blocks +-------------------------------------------------------------*/ + +// Find a page with free blocks of `page->block_size`. +static mi_page_t* mi_page_queue_find_free_ex(mi_heap_t* heap, mi_page_queue_t* pq, bool first_try) +{ + // search through the pages in "next fit" order + #if MI_STAT + size_t count = 0; + #endif + mi_page_t* page = pq->first; + while (page != NULL) + { + mi_page_t* next = page->next; // remember next + #if MI_STAT + count++; + #endif + + // 0. collect freed blocks by us and other threads + _mi_page_free_collect(page, false); + + // 1. if the page contains free blocks, we are done + if (mi_page_immediate_available(page)) { + break; // pick this one + } + + // 2. Try to extend + if (page->capacity < page->reserved) { + mi_page_extend_free(heap, page, heap->tld); + mi_assert_internal(mi_page_immediate_available(page)); + break; + } + + // 3. If the page is completely full, move it to the `mi_pages_full` + // queue so we don't visit long-lived pages too often. + mi_assert_internal(!mi_page_is_in_full(page) && !mi_page_immediate_available(page)); + mi_page_to_full(page, pq); + + page = next; + } // for each page + + mi_heap_stat_counter_increase(heap, searches, count); + + if (page == NULL) { + _mi_heap_collect_retired(heap, false); // perhaps make a page available? + page = mi_page_fresh(heap, pq); + if (page == NULL && first_try) { + // out-of-memory _or_ an abandoned page with free blocks was reclaimed, try once again + page = mi_page_queue_find_free_ex(heap, pq, false); + } + } + else { + mi_assert(pq->first == page); + page->retire_expire = 0; + } + mi_assert_internal(page == NULL || mi_page_immediate_available(page)); + return page; +} + + + +// Find a page with free blocks of `size`. +static inline mi_page_t* mi_find_free_page(mi_heap_t* heap, size_t size) { + mi_page_queue_t* pq = mi_page_queue(heap,size); + mi_page_t* page = pq->first; + if (page != NULL) { + #if (MI_SECURE>=3) // in secure mode, we extend half the time to increase randomness + if (page->capacity < page->reserved && ((_mi_heap_random_next(heap) & 1) == 1)) { + mi_page_extend_free(heap, page, heap->tld); + mi_assert_internal(mi_page_immediate_available(page)); + } + else + #endif + { + _mi_page_free_collect(page,false); + } + + if (mi_page_immediate_available(page)) { + page->retire_expire = 0; + return page; // fast path + } + } + return mi_page_queue_find_free_ex(heap, pq, true); +} + + +/* ----------------------------------------------------------- + Users can register a deferred free function called + when the `free` list is empty. Since the `local_free` + is separate this is deterministically called after + a certain number of allocations. +----------------------------------------------------------- */ + +static mi_deferred_free_fun* volatile deferred_free = NULL; +static _Atomic(void*) deferred_arg; // = NULL + +void _mi_deferred_free(mi_heap_t* heap, bool force) { + heap->tld->heartbeat++; + if (deferred_free != NULL && !heap->tld->recurse) { + heap->tld->recurse = true; + deferred_free(force, heap->tld->heartbeat, mi_atomic_load_ptr_relaxed(void,&deferred_arg)); + heap->tld->recurse = false; + } +} + +void mi_register_deferred_free(mi_deferred_free_fun* fn, void* arg) mi_attr_noexcept { + deferred_free = fn; + mi_atomic_store_ptr_release(void,&deferred_arg, arg); +} + + +/* ----------------------------------------------------------- + General allocation +----------------------------------------------------------- */ + +// Large and huge page allocation. +// Huge pages are allocated directly without being in a queue. +// Because huge pages contain just one block, and the segment contains +// just that page, we always treat them as abandoned and any thread +// that frees the block can free the whole page and segment directly. +// Huge pages are also use if the requested alignment is very large (> MI_ALIGNMENT_MAX). +static mi_page_t* mi_large_huge_page_alloc(mi_heap_t* heap, size_t size, size_t page_alignment) { + size_t block_size = _mi_os_good_alloc_size(size); + mi_assert_internal(mi_bin(block_size) == MI_BIN_HUGE || page_alignment > 0); + bool is_huge = (block_size > MI_LARGE_OBJ_SIZE_MAX || page_alignment > 0); + #if MI_HUGE_PAGE_ABANDON + mi_page_queue_t* pq = (is_huge ? NULL : mi_page_queue(heap, block_size)); + #else + mi_page_queue_t* pq = mi_page_queue(heap, is_huge ? MI_HUGE_BLOCK_SIZE : block_size); // not block_size as that can be low if the page_alignment > 0 + mi_assert_internal(!is_huge || mi_page_queue_is_huge(pq)); + #endif + mi_page_t* page = mi_page_fresh_alloc(heap, pq, block_size, page_alignment); + if (page != NULL) { + mi_assert_internal(mi_page_immediate_available(page)); + + if (is_huge) { + mi_assert_internal(_mi_page_segment(page)->kind == MI_SEGMENT_HUGE); + mi_assert_internal(_mi_page_segment(page)->used==1); + #if MI_HUGE_PAGE_ABANDON + mi_assert_internal(_mi_page_segment(page)->thread_id==0); // abandoned, not in the huge queue + mi_page_set_heap(page, NULL); + #endif + } + else { + mi_assert_internal(_mi_page_segment(page)->kind != MI_SEGMENT_HUGE); + } + + const size_t bsize = mi_page_usable_block_size(page); // note: not `mi_page_block_size` to account for padding + if (bsize <= MI_LARGE_OBJ_SIZE_MAX) { + mi_heap_stat_increase(heap, large, bsize); + mi_heap_stat_counter_increase(heap, large_count, 1); + } + else { + mi_heap_stat_increase(heap, huge, bsize); + mi_heap_stat_counter_increase(heap, huge_count, 1); + } + } + return page; +} + + +// Allocate a page +// Note: in debug mode the size includes MI_PADDING_SIZE and might have overflowed. +static mi_page_t* mi_find_page(mi_heap_t* heap, size_t size, size_t huge_alignment) mi_attr_noexcept { + // huge allocation? + const size_t req_size = size - MI_PADDING_SIZE; // correct for padding_size in case of an overflow on `size` + if mi_unlikely(req_size > (MI_MEDIUM_OBJ_SIZE_MAX - MI_PADDING_SIZE) || huge_alignment > 0) { + if mi_unlikely(req_size > PTRDIFF_MAX) { // we don't allocate more than PTRDIFF_MAX (see ) + _mi_error_message(EOVERFLOW, "allocation request is too large (%zu bytes)\n", req_size); + return NULL; + } + else { + return mi_large_huge_page_alloc(heap,size,huge_alignment); + } + } + else { + // otherwise find a page with free blocks in our size segregated queues + #if MI_PADDING + mi_assert_internal(size >= MI_PADDING_SIZE); + #endif + return mi_find_free_page(heap, size); + } +} + +// Generic allocation routine if the fast path (`alloc.c:mi_page_malloc`) does not succeed. +// Note: in debug mode the size includes MI_PADDING_SIZE and might have overflowed. +// The `huge_alignment` is normally 0 but is set to a multiple of MI_SEGMENT_SIZE for +// very large requested alignments in which case we use a huge segment. +void* _mi_malloc_generic(mi_heap_t* heap, size_t size, bool zero, size_t huge_alignment) mi_attr_noexcept +{ + mi_assert_internal(heap != NULL); + + // initialize if necessary + if mi_unlikely(!mi_heap_is_initialized(heap)) { + heap = mi_heap_get_default(); // calls mi_thread_init + if mi_unlikely(!mi_heap_is_initialized(heap)) { return NULL; } + } + mi_assert_internal(mi_heap_is_initialized(heap)); + + // call potential deferred free routines + _mi_deferred_free(heap, false); + + // free delayed frees from other threads (but skip contended ones) + _mi_heap_delayed_free_partial(heap); + + // find (or allocate) a page of the right size + mi_page_t* page = mi_find_page(heap, size, huge_alignment); + if mi_unlikely(page == NULL) { // first time out of memory, try to collect and retry the allocation once more + mi_heap_collect(heap, true /* force */); + page = mi_find_page(heap, size, huge_alignment); + } + + if mi_unlikely(page == NULL) { // out of memory + const size_t req_size = size - MI_PADDING_SIZE; // correct for padding_size in case of an overflow on `size` + _mi_error_message(ENOMEM, "unable to allocate memory (%zu bytes)\n", req_size); + return NULL; + } + + mi_assert_internal(mi_page_immediate_available(page)); + mi_assert_internal(mi_page_block_size(page) >= size); + + // and try again, this time succeeding! (i.e. this should never recurse through _mi_page_malloc) + if mi_unlikely(zero && page->xblock_size == 0) { + // note: we cannot call _mi_page_malloc with zeroing for huge blocks; we zero it afterwards in that case. + void* p = _mi_page_malloc(heap, page, size, false); + mi_assert_internal(p != NULL); + _mi_memzero_aligned(p, mi_page_usable_block_size(page)); + return p; + } + else { + return _mi_page_malloc(heap, page, size, zero); + } +} diff --git a/compat/mimalloc/prim/windows/prim.c b/compat/mimalloc/prim/windows/prim.c new file mode 100644 index 00000000000000..d060833c5b644d --- /dev/null +++ b/compat/mimalloc/prim/windows/prim.c @@ -0,0 +1,622 @@ +/* ---------------------------------------------------------------------------- +Copyright (c) 2018-2023, Microsoft Research, Daan Leijen +This is free software; you can redistribute it and/or modify it under the +terms of the MIT license. A copy of the license can be found in the file +"LICENSE" at the root of this distribution. +-----------------------------------------------------------------------------*/ + +// This file is included in `src/prim/prim.c` + +#include "mimalloc.h" +#include "mimalloc/internal.h" +#include "mimalloc/atomic.h" +#include "mimalloc/prim.h" +#include // fputs, stderr + + +//--------------------------------------------- +// Dynamically bind Windows API points for portability +//--------------------------------------------- + +// We use VirtualAlloc2 for aligned allocation, but it is only supported on Windows 10 and Windows Server 2016. +// So, we need to look it up dynamically to run on older systems. (use __stdcall for 32-bit compatibility) +// NtAllocateVirtualAllocEx is used for huge OS page allocation (1GiB) +// We define a minimal MEM_EXTENDED_PARAMETER ourselves in order to be able to compile with older SDK's. +typedef enum MI_MEM_EXTENDED_PARAMETER_TYPE_E { + MiMemExtendedParameterInvalidType = 0, + MiMemExtendedParameterAddressRequirements, + MiMemExtendedParameterNumaNode, + MiMemExtendedParameterPartitionHandle, + MiMemExtendedParameterUserPhysicalHandle, + MiMemExtendedParameterAttributeFlags, + MiMemExtendedParameterMax +} MI_MEM_EXTENDED_PARAMETER_TYPE; + +typedef struct DECLSPEC_ALIGN(8) MI_MEM_EXTENDED_PARAMETER_S { + struct { DWORD64 Type : 8; DWORD64 Reserved : 56; } Type; + union { DWORD64 ULong64; PVOID Pointer; SIZE_T Size; HANDLE Handle; DWORD ULong; } Arg; +} MI_MEM_EXTENDED_PARAMETER; + +typedef struct MI_MEM_ADDRESS_REQUIREMENTS_S { + PVOID LowestStartingAddress; + PVOID HighestEndingAddress; + SIZE_T Alignment; +} MI_MEM_ADDRESS_REQUIREMENTS; + +#define MI_MEM_EXTENDED_PARAMETER_NONPAGED_HUGE 0x00000010 + +#include +typedef PVOID (__stdcall *PVirtualAlloc2)(HANDLE, PVOID, SIZE_T, ULONG, ULONG, MI_MEM_EXTENDED_PARAMETER*, ULONG); +typedef NTSTATUS (__stdcall *PNtAllocateVirtualMemoryEx)(HANDLE, PVOID*, SIZE_T*, ULONG, ULONG, MI_MEM_EXTENDED_PARAMETER*, ULONG); +static PVirtualAlloc2 pVirtualAlloc2 = NULL; +static PNtAllocateVirtualMemoryEx pNtAllocateVirtualMemoryEx = NULL; + +// Similarly, GetNumaProcesorNodeEx is only supported since Windows 7 +typedef struct MI_PROCESSOR_NUMBER_S { WORD Group; BYTE Number; BYTE Reserved; } MI_PROCESSOR_NUMBER; + +typedef VOID (__stdcall *PGetCurrentProcessorNumberEx)(MI_PROCESSOR_NUMBER* ProcNumber); +typedef BOOL (__stdcall *PGetNumaProcessorNodeEx)(MI_PROCESSOR_NUMBER* Processor, PUSHORT NodeNumber); +typedef BOOL (__stdcall* PGetNumaNodeProcessorMaskEx)(USHORT Node, PGROUP_AFFINITY ProcessorMask); +typedef BOOL (__stdcall *PGetNumaProcessorNode)(UCHAR Processor, PUCHAR NodeNumber); +static PGetCurrentProcessorNumberEx pGetCurrentProcessorNumberEx = NULL; +static PGetNumaProcessorNodeEx pGetNumaProcessorNodeEx = NULL; +static PGetNumaNodeProcessorMaskEx pGetNumaNodeProcessorMaskEx = NULL; +static PGetNumaProcessorNode pGetNumaProcessorNode = NULL; + +//--------------------------------------------- +// Enable large page support dynamically (if possible) +//--------------------------------------------- + +static bool win_enable_large_os_pages(size_t* large_page_size) +{ + static bool large_initialized = false; + if (large_initialized) return (_mi_os_large_page_size() > 0); + large_initialized = true; + + // Try to see if large OS pages are supported + // To use large pages on Windows, we first need access permission + // Set "Lock pages in memory" permission in the group policy editor + // + unsigned long err = 0; + HANDLE token = NULL; + BOOL ok = OpenProcessToken(GetCurrentProcess(), TOKEN_ADJUST_PRIVILEGES | TOKEN_QUERY, &token); + if (ok) { + TOKEN_PRIVILEGES tp; + ok = LookupPrivilegeValue(NULL, TEXT("SeLockMemoryPrivilege"), &tp.Privileges[0].Luid); + if (ok) { + tp.PrivilegeCount = 1; + tp.Privileges[0].Attributes = SE_PRIVILEGE_ENABLED; + ok = AdjustTokenPrivileges(token, FALSE, &tp, 0, (PTOKEN_PRIVILEGES)NULL, 0); + if (ok) { + err = GetLastError(); + ok = (err == ERROR_SUCCESS); + if (ok && large_page_size != NULL) { + *large_page_size = GetLargePageMinimum(); + } + } + } + CloseHandle(token); + } + if (!ok) { + if (err == 0) err = GetLastError(); + _mi_warning_message("cannot enable large OS page support, error %lu\n", err); + } + return (ok!=0); +} + + +//--------------------------------------------- +// Initialize +//--------------------------------------------- + +void _mi_prim_mem_init( mi_os_mem_config_t* config ) +{ + config->has_overcommit = false; + config->must_free_whole = true; + config->has_virtual_reserve = true; + // get the page size + SYSTEM_INFO si; + GetSystemInfo(&si); + if (si.dwPageSize > 0) { config->page_size = si.dwPageSize; } + if (si.dwAllocationGranularity > 0) { config->alloc_granularity = si.dwAllocationGranularity; } + // get the VirtualAlloc2 function + HINSTANCE hDll; + hDll = LoadLibrary(TEXT("kernelbase.dll")); + if (hDll != NULL) { + // use VirtualAlloc2FromApp if possible as it is available to Windows store apps + pVirtualAlloc2 = (PVirtualAlloc2)(void (*)(void))GetProcAddress(hDll, "VirtualAlloc2FromApp"); + if (pVirtualAlloc2==NULL) pVirtualAlloc2 = (PVirtualAlloc2)(void (*)(void))GetProcAddress(hDll, "VirtualAlloc2"); + FreeLibrary(hDll); + } + // NtAllocateVirtualMemoryEx is used for huge page allocation + hDll = LoadLibrary(TEXT("ntdll.dll")); + if (hDll != NULL) { + pNtAllocateVirtualMemoryEx = (PNtAllocateVirtualMemoryEx)(void (*)(void))GetProcAddress(hDll, "NtAllocateVirtualMemoryEx"); + FreeLibrary(hDll); + } + // Try to use Win7+ numa API + hDll = LoadLibrary(TEXT("kernel32.dll")); + if (hDll != NULL) { + pGetCurrentProcessorNumberEx = (PGetCurrentProcessorNumberEx)(void (*)(void))GetProcAddress(hDll, "GetCurrentProcessorNumberEx"); + pGetNumaProcessorNodeEx = (PGetNumaProcessorNodeEx)(void (*)(void))GetProcAddress(hDll, "GetNumaProcessorNodeEx"); + pGetNumaNodeProcessorMaskEx = (PGetNumaNodeProcessorMaskEx)(void (*)(void))GetProcAddress(hDll, "GetNumaNodeProcessorMaskEx"); + pGetNumaProcessorNode = (PGetNumaProcessorNode)(void (*)(void))GetProcAddress(hDll, "GetNumaProcessorNode"); + FreeLibrary(hDll); + } + if (mi_option_is_enabled(mi_option_allow_large_os_pages) || mi_option_is_enabled(mi_option_reserve_huge_os_pages)) { + win_enable_large_os_pages(&config->large_page_size); + } +} + + +//--------------------------------------------- +// Free +//--------------------------------------------- + +int _mi_prim_free(void* addr, size_t size ) { + MI_UNUSED(size); + DWORD errcode = 0; + bool err = (VirtualFree(addr, 0, MEM_RELEASE) == 0); + if (err) { errcode = GetLastError(); } + if (errcode == ERROR_INVALID_ADDRESS) { + // In mi_os_mem_alloc_aligned the fallback path may have returned a pointer inside + // the memory region returned by VirtualAlloc; in that case we need to free using + // the start of the region. + MEMORY_BASIC_INFORMATION info = { 0 }; + VirtualQuery(addr, &info, sizeof(info)); + if (info.AllocationBase < addr && ((uint8_t*)addr - (uint8_t*)info.AllocationBase) < (ptrdiff_t)MI_SEGMENT_SIZE) { + errcode = 0; + err = (VirtualFree(info.AllocationBase, 0, MEM_RELEASE) == 0); + if (err) { errcode = GetLastError(); } + } + } + return (int)errcode; +} + + +//--------------------------------------------- +// VirtualAlloc +//--------------------------------------------- + +static void* win_virtual_alloc_prim(void* addr, size_t size, size_t try_alignment, DWORD flags) { + #if (MI_INTPTR_SIZE >= 8) + // on 64-bit systems, try to use the virtual address area after 2TiB for 4MiB aligned allocations + if (addr == NULL) { + void* hint = _mi_os_get_aligned_hint(try_alignment,size); + if (hint != NULL) { + void* p = VirtualAlloc(hint, size, flags, PAGE_READWRITE); + if (p != NULL) return p; + _mi_verbose_message("warning: unable to allocate hinted aligned OS memory (%zu bytes, error code: 0x%x, address: %p, alignment: %zu, flags: 0x%x)\n", size, GetLastError(), hint, try_alignment, flags); + // fall through on error + } + } + #endif + // on modern Windows try use VirtualAlloc2 for aligned allocation + if (try_alignment > 1 && (try_alignment % _mi_os_page_size()) == 0 && pVirtualAlloc2 != NULL) { + MI_MEM_ADDRESS_REQUIREMENTS reqs = { 0, 0, 0 }; + reqs.Alignment = try_alignment; + MI_MEM_EXTENDED_PARAMETER param = { {0, 0}, {0} }; + param.Type.Type = MiMemExtendedParameterAddressRequirements; + param.Arg.Pointer = &reqs; + void* p = (*pVirtualAlloc2)(GetCurrentProcess(), addr, size, flags, PAGE_READWRITE, ¶m, 1); + if (p != NULL) return p; + _mi_warning_message("unable to allocate aligned OS memory (%zu bytes, error code: 0x%x, address: %p, alignment: %zu, flags: 0x%x)\n", size, GetLastError(), addr, try_alignment, flags); + // fall through on error + } + // last resort + return VirtualAlloc(addr, size, flags, PAGE_READWRITE); +} + +static void* win_virtual_alloc(void* addr, size_t size, size_t try_alignment, DWORD flags, bool large_only, bool allow_large, bool* is_large) { + mi_assert_internal(!(large_only && !allow_large)); + static _Atomic(size_t) large_page_try_ok; // = 0; + void* p = NULL; + // Try to allocate large OS pages (2MiB) if allowed or required. + if ((large_only || _mi_os_use_large_page(size, try_alignment)) + && allow_large && (flags&MEM_COMMIT)!=0 && (flags&MEM_RESERVE)!=0) { + size_t try_ok = mi_atomic_load_acquire(&large_page_try_ok); + if (!large_only && try_ok > 0) { + // if a large page allocation fails, it seems the calls to VirtualAlloc get very expensive. + // therefore, once a large page allocation failed, we don't try again for `large_page_try_ok` times. + mi_atomic_cas_strong_acq_rel(&large_page_try_ok, &try_ok, try_ok - 1); + } + else { + // large OS pages must always reserve and commit. + *is_large = true; + p = win_virtual_alloc_prim(addr, size, try_alignment, flags | MEM_LARGE_PAGES); + if (large_only) return p; + // fall back to non-large page allocation on error (`p == NULL`). + if (p == NULL) { + mi_atomic_store_release(&large_page_try_ok,10UL); // on error, don't try again for the next N allocations + } + } + } + // Fall back to regular page allocation + if (p == NULL) { + *is_large = ((flags&MEM_LARGE_PAGES) != 0); + p = win_virtual_alloc_prim(addr, size, try_alignment, flags); + } + //if (p == NULL) { _mi_warning_message("unable to allocate OS memory (%zu bytes, error code: 0x%x, address: %p, alignment: %zu, flags: 0x%x, large only: %d, allow large: %d)\n", size, GetLastError(), addr, try_alignment, flags, large_only, allow_large); } + return p; +} + +int _mi_prim_alloc(size_t size, size_t try_alignment, bool commit, bool allow_large, bool* is_large, bool* is_zero, void** addr) { + mi_assert_internal(size > 0 && (size % _mi_os_page_size()) == 0); + mi_assert_internal(commit || !allow_large); + mi_assert_internal(try_alignment > 0); + *is_zero = true; + int flags = MEM_RESERVE; + if (commit) { flags |= MEM_COMMIT; } + *addr = win_virtual_alloc(NULL, size, try_alignment, flags, false, allow_large, is_large); + return (*addr != NULL ? 0 : (int)GetLastError()); +} + + +//--------------------------------------------- +// Commit/Reset/Protect +//--------------------------------------------- +#ifdef _MSC_VER +#pragma warning(disable:6250) // suppress warning calling VirtualFree without MEM_RELEASE (for decommit) +#endif + +int _mi_prim_commit(void* addr, size_t size, bool* is_zero) { + *is_zero = false; + /* + // zero'ing only happens on an initial commit... but checking upfront seems expensive.. + _MEMORY_BASIC_INFORMATION meminfo; _mi_memzero_var(meminfo); + if (VirtualQuery(addr, &meminfo, size) > 0) { + if ((meminfo.State & MEM_COMMIT) == 0) { + *is_zero = true; + } + } + */ + // commit + void* p = VirtualAlloc(addr, size, MEM_COMMIT, PAGE_READWRITE); + if (p == NULL) return (int)GetLastError(); + return 0; +} + +int _mi_prim_decommit(void* addr, size_t size, bool* needs_recommit) { + BOOL ok = VirtualFree(addr, size, MEM_DECOMMIT); + *needs_recommit = true; // for safety, assume always decommitted even in the case of an error. + return (ok ? 0 : (int)GetLastError()); +} + +int _mi_prim_reset(void* addr, size_t size) { + void* p = VirtualAlloc(addr, size, MEM_RESET, PAGE_READWRITE); + mi_assert_internal(p == addr); + #if 0 + if (p != NULL) { + VirtualUnlock(addr,size); // VirtualUnlock after MEM_RESET removes the memory directly from the working set + } + #endif + return (p != NULL ? 0 : (int)GetLastError()); +} + +int _mi_prim_protect(void* addr, size_t size, bool protect) { + DWORD oldprotect = 0; + BOOL ok = VirtualProtect(addr, size, protect ? PAGE_NOACCESS : PAGE_READWRITE, &oldprotect); + return (ok ? 0 : (int)GetLastError()); +} + + +//--------------------------------------------- +// Huge page allocation +//--------------------------------------------- + +static void* _mi_prim_alloc_huge_os_pagesx(void* hint_addr, size_t size, int numa_node) +{ + const DWORD flags = MEM_LARGE_PAGES | MEM_COMMIT | MEM_RESERVE; + + win_enable_large_os_pages(NULL); + + MI_MEM_EXTENDED_PARAMETER params[3] = { {{0,0},{0}},{{0,0},{0}},{{0,0},{0}} }; + // on modern Windows try use NtAllocateVirtualMemoryEx for 1GiB huge pages + static bool mi_huge_pages_available = true; + if (pNtAllocateVirtualMemoryEx != NULL && mi_huge_pages_available) { + params[0].Type.Type = MiMemExtendedParameterAttributeFlags; + params[0].Arg.ULong64 = MI_MEM_EXTENDED_PARAMETER_NONPAGED_HUGE; + ULONG param_count = 1; + if (numa_node >= 0) { + param_count++; + params[1].Type.Type = MiMemExtendedParameterNumaNode; + params[1].Arg.ULong = (unsigned)numa_node; + } + SIZE_T psize = size; + void* base = hint_addr; + NTSTATUS err = (*pNtAllocateVirtualMemoryEx)(GetCurrentProcess(), &base, &psize, flags, PAGE_READWRITE, params, param_count); + if (err == 0 && base != NULL) { + return base; + } + else { + // fall back to regular large pages + mi_huge_pages_available = false; // don't try further huge pages + _mi_warning_message("unable to allocate using huge (1GiB) pages, trying large (2MiB) pages instead (status 0x%lx)\n", err); + } + } + // on modern Windows try use VirtualAlloc2 for numa aware large OS page allocation + if (pVirtualAlloc2 != NULL && numa_node >= 0) { + params[0].Type.Type = MiMemExtendedParameterNumaNode; + params[0].Arg.ULong = (unsigned)numa_node; + return (*pVirtualAlloc2)(GetCurrentProcess(), hint_addr, size, flags, PAGE_READWRITE, params, 1); + } + + // otherwise use regular virtual alloc on older windows + return VirtualAlloc(hint_addr, size, flags, PAGE_READWRITE); +} + +int _mi_prim_alloc_huge_os_pages(void* hint_addr, size_t size, int numa_node, bool* is_zero, void** addr) { + *is_zero = true; + *addr = _mi_prim_alloc_huge_os_pagesx(hint_addr,size,numa_node); + return (*addr != NULL ? 0 : (int)GetLastError()); +} + + +//--------------------------------------------- +// Numa nodes +//--------------------------------------------- + +size_t _mi_prim_numa_node(void) { + USHORT numa_node = 0; + if (pGetCurrentProcessorNumberEx != NULL && pGetNumaProcessorNodeEx != NULL) { + // Extended API is supported + MI_PROCESSOR_NUMBER pnum; + (*pGetCurrentProcessorNumberEx)(&pnum); + USHORT nnode = 0; + BOOL ok = (*pGetNumaProcessorNodeEx)(&pnum, &nnode); + if (ok) { numa_node = nnode; } + } + else if (pGetNumaProcessorNode != NULL) { + // Vista or earlier, use older API that is limited to 64 processors. Issue #277 + DWORD pnum = GetCurrentProcessorNumber(); + UCHAR nnode = 0; + BOOL ok = pGetNumaProcessorNode((UCHAR)pnum, &nnode); + if (ok) { numa_node = nnode; } + } + return numa_node; +} + +size_t _mi_prim_numa_node_count(void) { + ULONG numa_max = 0; + GetNumaHighestNodeNumber(&numa_max); + // find the highest node number that has actual processors assigned to it. Issue #282 + while(numa_max > 0) { + if (pGetNumaNodeProcessorMaskEx != NULL) { + // Extended API is supported + GROUP_AFFINITY affinity; + if ((*pGetNumaNodeProcessorMaskEx)((USHORT)numa_max, &affinity)) { + if (affinity.Mask != 0) break; // found the maximum non-empty node + } + } + else { + // Vista or earlier, use older API that is limited to 64 processors. + ULONGLONG mask; + if (GetNumaNodeProcessorMask((UCHAR)numa_max, &mask)) { + if (mask != 0) break; // found the maximum non-empty node + }; + } + // max node was invalid or had no processor assigned, try again + numa_max--; + } + return ((size_t)numa_max + 1); +} + + +//---------------------------------------------------------------- +// Clock +//---------------------------------------------------------------- + +static mi_msecs_t mi_to_msecs(LARGE_INTEGER t) { + static LARGE_INTEGER mfreq; // = 0 + if (mfreq.QuadPart == 0LL) { + LARGE_INTEGER f; + QueryPerformanceFrequency(&f); + mfreq.QuadPart = f.QuadPart/1000LL; + if (mfreq.QuadPart == 0) mfreq.QuadPart = 1; + } + return (mi_msecs_t)(t.QuadPart / mfreq.QuadPart); +} + +mi_msecs_t _mi_prim_clock_now(void) { + LARGE_INTEGER t; + QueryPerformanceCounter(&t); + return mi_to_msecs(t); +} + + +//---------------------------------------------------------------- +// Process Info +//---------------------------------------------------------------- + +#include +#include + +static mi_msecs_t filetime_msecs(const FILETIME* ftime) { + ULARGE_INTEGER i; + i.LowPart = ftime->dwLowDateTime; + i.HighPart = ftime->dwHighDateTime; + mi_msecs_t msecs = (i.QuadPart / 10000); // FILETIME is in 100 nano seconds + return msecs; +} + +typedef BOOL (WINAPI *PGetProcessMemoryInfo)(HANDLE, PPROCESS_MEMORY_COUNTERS, DWORD); +static PGetProcessMemoryInfo pGetProcessMemoryInfo = NULL; + +void _mi_prim_process_info(mi_process_info_t* pinfo) +{ + FILETIME ct; + FILETIME ut; + FILETIME st; + FILETIME et; + GetProcessTimes(GetCurrentProcess(), &ct, &et, &st, &ut); + pinfo->utime = filetime_msecs(&ut); + pinfo->stime = filetime_msecs(&st); + + // load psapi on demand + if (pGetProcessMemoryInfo == NULL) { + HINSTANCE hDll = LoadLibrary(TEXT("psapi.dll")); + if (hDll != NULL) { + pGetProcessMemoryInfo = (PGetProcessMemoryInfo)(void (*)(void))GetProcAddress(hDll, "GetProcessMemoryInfo"); + } + } + + // get process info + PROCESS_MEMORY_COUNTERS info; + memset(&info, 0, sizeof(info)); + if (pGetProcessMemoryInfo != NULL) { + pGetProcessMemoryInfo(GetCurrentProcess(), &info, sizeof(info)); + } + pinfo->current_rss = (size_t)info.WorkingSetSize; + pinfo->peak_rss = (size_t)info.PeakWorkingSetSize; + pinfo->current_commit = (size_t)info.PagefileUsage; + pinfo->peak_commit = (size_t)info.PeakPagefileUsage; + pinfo->page_faults = (size_t)info.PageFaultCount; +} + +//---------------------------------------------------------------- +// Output +//---------------------------------------------------------------- + +void _mi_prim_out_stderr( const char* msg ) +{ + // on windows with redirection, the C runtime cannot handle locale dependent output + // after the main thread closes so we use direct console output. + if (!_mi_preloading()) { + // _cputs(msg); // _cputs cannot be used at is aborts if it fails to lock the console + static HANDLE hcon = INVALID_HANDLE_VALUE; + static bool hconIsConsole; + if (hcon == INVALID_HANDLE_VALUE) { + CONSOLE_SCREEN_BUFFER_INFO sbi; + hcon = GetStdHandle(STD_ERROR_HANDLE); + hconIsConsole = ((hcon != INVALID_HANDLE_VALUE) && GetConsoleScreenBufferInfo(hcon, &sbi)); + } + const size_t len = _mi_strlen(msg); + if (len > 0 && len < UINT32_MAX) { + DWORD written = 0; + if (hconIsConsole) { + WriteConsoleA(hcon, msg, (DWORD)len, &written, NULL); + } + else if (hcon != INVALID_HANDLE_VALUE) { + // use direct write if stderr was redirected + WriteFile(hcon, msg, (DWORD)len, &written, NULL); + } + else { + // finally fall back to fputs after all + fputs(msg, stderr); + } + } + } +} + + +//---------------------------------------------------------------- +// Environment +//---------------------------------------------------------------- + +// On Windows use GetEnvironmentVariable instead of getenv to work +// reliably even when this is invoked before the C runtime is initialized. +// i.e. when `_mi_preloading() == true`. +// Note: on windows, environment names are not case sensitive. +bool _mi_prim_getenv(const char* name, char* result, size_t result_size) { + result[0] = 0; + size_t len = GetEnvironmentVariableA(name, result, (DWORD)result_size); + return (len > 0 && len < result_size); +} + + + +//---------------------------------------------------------------- +// Random +//---------------------------------------------------------------- + +#if defined(MI_USE_RTLGENRANDOM) // || defined(__cplusplus) +// We prefer to use BCryptGenRandom instead of (the unofficial) RtlGenRandom but when using +// dynamic overriding, we observed it can raise an exception when compiled with C++, and +// sometimes deadlocks when also running under the VS debugger. +// In contrast, issue #623 implies that on Windows Server 2019 we need to use BCryptGenRandom. +// To be continued.. +#pragma comment (lib,"advapi32.lib") +#define RtlGenRandom SystemFunction036 +mi_decl_externc BOOLEAN NTAPI RtlGenRandom(PVOID RandomBuffer, ULONG RandomBufferLength); + +bool _mi_prim_random_buf(void* buf, size_t buf_len) { + return (RtlGenRandom(buf, (ULONG)buf_len) != 0); +} + +#else + +#ifndef BCRYPT_USE_SYSTEM_PREFERRED_RNG +#define BCRYPT_USE_SYSTEM_PREFERRED_RNG 0x00000002 +#endif + +typedef LONG (NTAPI *PBCryptGenRandom)(HANDLE, PUCHAR, ULONG, ULONG); +static PBCryptGenRandom pBCryptGenRandom = NULL; + +bool _mi_prim_random_buf(void* buf, size_t buf_len) { + if (pBCryptGenRandom == NULL) { + HINSTANCE hDll = LoadLibrary(TEXT("bcrypt.dll")); + if (hDll != NULL) { + pBCryptGenRandom = (PBCryptGenRandom)(void (*)(void))GetProcAddress(hDll, "BCryptGenRandom"); + } + if (pBCryptGenRandom == NULL) return false; + } + return (pBCryptGenRandom(NULL, (PUCHAR)buf, (ULONG)buf_len, BCRYPT_USE_SYSTEM_PREFERRED_RNG) >= 0); +} + +#endif // MI_USE_RTLGENRANDOM + +//---------------------------------------------------------------- +// Thread init/done +//---------------------------------------------------------------- + +#if !defined(MI_SHARED_LIB) + +// use thread local storage keys to detect thread ending +#include +#if (_WIN32_WINNT < 0x600) // before Windows Vista +WINBASEAPI DWORD WINAPI FlsAlloc( _In_opt_ PFLS_CALLBACK_FUNCTION lpCallback ); +WINBASEAPI PVOID WINAPI FlsGetValue( _In_ DWORD dwFlsIndex ); +WINBASEAPI BOOL WINAPI FlsSetValue( _In_ DWORD dwFlsIndex, _In_opt_ PVOID lpFlsData ); +WINBASEAPI BOOL WINAPI FlsFree(_In_ DWORD dwFlsIndex); +#endif + +static DWORD mi_fls_key = (DWORD)(-1); + +static void NTAPI mi_fls_done(PVOID value) { + mi_heap_t* heap = (mi_heap_t*)value; + if (heap != NULL) { + _mi_thread_done(heap); + FlsSetValue(mi_fls_key, NULL); // prevent recursion as _mi_thread_done may set it back to the main heap, issue #672 + } +} + +void _mi_prim_thread_init_auto_done(void) { + mi_fls_key = FlsAlloc(&mi_fls_done); +} + +void _mi_prim_thread_done_auto_done(void) { + // call thread-done on all threads (except the main thread) to prevent + // dangling callback pointer if statically linked with a DLL; Issue #208 + FlsFree(mi_fls_key); +} + +void _mi_prim_thread_associate_default_heap(mi_heap_t* heap) { + mi_assert_internal(mi_fls_key != (DWORD)(-1)); + FlsSetValue(mi_fls_key, heap); +} + +#else + +// Dll; nothing to do as in that case thread_done is handled through the DLL_THREAD_DETACH event. + +void _mi_prim_thread_init_auto_done(void) { +} + +void _mi_prim_thread_done_auto_done(void) { +} + +void _mi_prim_thread_associate_default_heap(mi_heap_t* heap) { + MI_UNUSED(heap); +} + +#endif diff --git a/compat/mimalloc/random.c b/compat/mimalloc/random.c new file mode 100644 index 00000000000000..2a18b5aa992dad --- /dev/null +++ b/compat/mimalloc/random.c @@ -0,0 +1,254 @@ +/* ---------------------------------------------------------------------------- +Copyright (c) 2019-2021, Microsoft Research, Daan Leijen +This is free software; you can redistribute it and/or modify it under the +terms of the MIT license. A copy of the license can be found in the file +"LICENSE" at the root of this distribution. +-----------------------------------------------------------------------------*/ +#include "mimalloc.h" +#include "mimalloc/internal.h" +#include "mimalloc/prim.h" // _mi_prim_random_buf +#include // memset + +/* ---------------------------------------------------------------------------- +We use our own PRNG to keep predictable performance of random number generation +and to avoid implementations that use a lock. We only use the OS provided +random source to initialize the initial seeds. Since we do not need ultimate +performance but we do rely on the security (for secret cookies in secure mode) +we use a cryptographically secure generator (chacha20). +-----------------------------------------------------------------------------*/ + +#define MI_CHACHA_ROUNDS (20) // perhaps use 12 for better performance? + + +/* ---------------------------------------------------------------------------- +Chacha20 implementation as the original algorithm with a 64-bit nonce +and counter: https://en.wikipedia.org/wiki/Salsa20 +The input matrix has sixteen 32-bit values: +Position 0 to 3: constant key +Position 4 to 11: the key +Position 12 to 13: the counter. +Position 14 to 15: the nonce. + +The implementation uses regular C code which compiles very well on modern compilers. +(gcc x64 has no register spills, and clang 6+ uses SSE instructions) +-----------------------------------------------------------------------------*/ + +static inline uint32_t rotl(uint32_t x, uint32_t shift) { + return (x << shift) | (x >> (32 - shift)); +} + +static inline void qround(uint32_t x[16], size_t a, size_t b, size_t c, size_t d) { + x[a] += x[b]; x[d] = rotl(x[d] ^ x[a], 16); + x[c] += x[d]; x[b] = rotl(x[b] ^ x[c], 12); + x[a] += x[b]; x[d] = rotl(x[d] ^ x[a], 8); + x[c] += x[d]; x[b] = rotl(x[b] ^ x[c], 7); +} + +static void chacha_block(mi_random_ctx_t* ctx) +{ + // scramble into `x` + uint32_t x[16]; + for (size_t i = 0; i < 16; i++) { + x[i] = ctx->input[i]; + } + for (size_t i = 0; i < MI_CHACHA_ROUNDS; i += 2) { + qround(x, 0, 4, 8, 12); + qround(x, 1, 5, 9, 13); + qround(x, 2, 6, 10, 14); + qround(x, 3, 7, 11, 15); + qround(x, 0, 5, 10, 15); + qround(x, 1, 6, 11, 12); + qround(x, 2, 7, 8, 13); + qround(x, 3, 4, 9, 14); + } + + // add scrambled data to the initial state + for (size_t i = 0; i < 16; i++) { + ctx->output[i] = x[i] + ctx->input[i]; + } + ctx->output_available = 16; + + // increment the counter for the next round + ctx->input[12] += 1; + if (ctx->input[12] == 0) { + ctx->input[13] += 1; + if (ctx->input[13] == 0) { // and keep increasing into the nonce + ctx->input[14] += 1; + } + } +} + +static uint32_t chacha_next32(mi_random_ctx_t* ctx) { + if (ctx->output_available <= 0) { + chacha_block(ctx); + ctx->output_available = 16; // (assign again to suppress static analysis warning) + } + const uint32_t x = ctx->output[16 - ctx->output_available]; + ctx->output[16 - ctx->output_available] = 0; // reset once the data is handed out + ctx->output_available--; + return x; +} + +static inline uint32_t read32(const uint8_t* p, size_t idx32) { + const size_t i = 4*idx32; + return ((uint32_t)p[i+0] | (uint32_t)p[i+1] << 8 | (uint32_t)p[i+2] << 16 | (uint32_t)p[i+3] << 24); +} + +static void chacha_init(mi_random_ctx_t* ctx, const uint8_t key[32], uint64_t nonce) +{ + // since we only use chacha for randomness (and not encryption) we + // do not _need_ to read 32-bit values as little endian but we do anyways + // just for being compatible :-) + memset(ctx, 0, sizeof(*ctx)); + for (size_t i = 0; i < 4; i++) { + const uint8_t* sigma = (uint8_t*)"expand 32-byte k"; + ctx->input[i] = read32(sigma,i); + } + for (size_t i = 0; i < 8; i++) { + ctx->input[i + 4] = read32(key,i); + } + ctx->input[12] = 0; + ctx->input[13] = 0; + ctx->input[14] = (uint32_t)nonce; + ctx->input[15] = (uint32_t)(nonce >> 32); +} + +static void chacha_split(mi_random_ctx_t* ctx, uint64_t nonce, mi_random_ctx_t* ctx_new) { + memset(ctx_new, 0, sizeof(*ctx_new)); + _mi_memcpy(ctx_new->input, ctx->input, sizeof(ctx_new->input)); + ctx_new->input[12] = 0; + ctx_new->input[13] = 0; + ctx_new->input[14] = (uint32_t)nonce; + ctx_new->input[15] = (uint32_t)(nonce >> 32); + mi_assert_internal(ctx->input[14] != ctx_new->input[14] || ctx->input[15] != ctx_new->input[15]); // do not reuse nonces! + chacha_block(ctx_new); +} + + +/* ---------------------------------------------------------------------------- +Random interface +-----------------------------------------------------------------------------*/ + +#if MI_DEBUG>1 +static bool mi_random_is_initialized(mi_random_ctx_t* ctx) { + return (ctx != NULL && ctx->input[0] != 0); +} +#endif + +void _mi_random_split(mi_random_ctx_t* ctx, mi_random_ctx_t* ctx_new) { + mi_assert_internal(mi_random_is_initialized(ctx)); + mi_assert_internal(ctx != ctx_new); + chacha_split(ctx, (uintptr_t)ctx_new /*nonce*/, ctx_new); +} + +uintptr_t _mi_random_next(mi_random_ctx_t* ctx) { + mi_assert_internal(mi_random_is_initialized(ctx)); + #if MI_INTPTR_SIZE <= 4 + return chacha_next32(ctx); + #elif MI_INTPTR_SIZE == 8 + return (((uintptr_t)chacha_next32(ctx) << 32) | chacha_next32(ctx)); + #else + # error "define mi_random_next for this platform" + #endif +} + + +/* ---------------------------------------------------------------------------- +To initialize a fresh random context. +If we cannot get good randomness, we fall back to weak randomness based on a timer and ASLR. +-----------------------------------------------------------------------------*/ + +uintptr_t _mi_os_random_weak(uintptr_t extra_seed) { + uintptr_t x = (uintptr_t)&_mi_os_random_weak ^ extra_seed; // ASLR makes the address random + x ^= _mi_prim_clock_now(); + // and do a few randomization steps + uintptr_t max = ((x ^ (x >> 17)) & 0x0F) + 1; + for (uintptr_t i = 0; i < max; i++) { + x = _mi_random_shuffle(x); + } + mi_assert_internal(x != 0); + return x; +} + +static void mi_random_init_ex(mi_random_ctx_t* ctx, bool use_weak) { + uint8_t key[32]; + if (use_weak || !_mi_prim_random_buf(key, sizeof(key))) { + // if we fail to get random data from the OS, we fall back to a + // weak random source based on the current time + #if !defined(__wasi__) + if (!use_weak) { _mi_warning_message("unable to use secure randomness\n"); } + #endif + uintptr_t x = _mi_os_random_weak(0); + for (size_t i = 0; i < 8; i++) { // key is eight 32-bit words. + x = _mi_random_shuffle(x); + ((uint32_t*)key)[i] = (uint32_t)x; + } + ctx->weak = true; + } + else { + ctx->weak = false; + } + chacha_init(ctx, key, (uintptr_t)ctx /*nonce*/ ); +} + +void _mi_random_init(mi_random_ctx_t* ctx) { + mi_random_init_ex(ctx, false); +} + +void _mi_random_init_weak(mi_random_ctx_t * ctx) { + mi_random_init_ex(ctx, true); +} + +void _mi_random_reinit_if_weak(mi_random_ctx_t * ctx) { + if (ctx->weak) { + _mi_random_init(ctx); + } +} + +/* -------------------------------------------------------- +test vectors from +----------------------------------------------------------- */ +/* +static bool array_equals(uint32_t* x, uint32_t* y, size_t n) { + for (size_t i = 0; i < n; i++) { + if (x[i] != y[i]) return false; + } + return true; +} +static void chacha_test(void) +{ + uint32_t x[4] = { 0x11111111, 0x01020304, 0x9b8d6f43, 0x01234567 }; + uint32_t x_out[4] = { 0xea2a92f4, 0xcb1cf8ce, 0x4581472e, 0x5881c4bb }; + qround(x, 0, 1, 2, 3); + mi_assert_internal(array_equals(x, x_out, 4)); + + uint32_t y[16] = { + 0x879531e0, 0xc5ecf37d, 0x516461b1, 0xc9a62f8a, + 0x44c20ef3, 0x3390af7f, 0xd9fc690b, 0x2a5f714c, + 0x53372767, 0xb00a5631, 0x974c541a, 0x359e9963, + 0x5c971061, 0x3d631689, 0x2098d9d6, 0x91dbd320 }; + uint32_t y_out[16] = { + 0x879531e0, 0xc5ecf37d, 0xbdb886dc, 0xc9a62f8a, + 0x44c20ef3, 0x3390af7f, 0xd9fc690b, 0xcfacafd2, + 0xe46bea80, 0xb00a5631, 0x974c541a, 0x359e9963, + 0x5c971061, 0xccc07c79, 0x2098d9d6, 0x91dbd320 }; + qround(y, 2, 7, 8, 13); + mi_assert_internal(array_equals(y, y_out, 16)); + + mi_random_ctx_t r = { + { 0x61707865, 0x3320646e, 0x79622d32, 0x6b206574, + 0x03020100, 0x07060504, 0x0b0a0908, 0x0f0e0d0c, + 0x13121110, 0x17161514, 0x1b1a1918, 0x1f1e1d1c, + 0x00000001, 0x09000000, 0x4a000000, 0x00000000 }, + {0}, + 0 + }; + uint32_t r_out[16] = { + 0xe4e7f110, 0x15593bd1, 0x1fdd0f50, 0xc47120a3, + 0xc7f4d1c7, 0x0368c033, 0x9aaa2204, 0x4e6cd4c3, + 0x466482d2, 0x09aa9f07, 0x05d7c214, 0xa2028bd9, + 0xd19c12b5, 0xb94e16de, 0xe883d0cb, 0x4e3c50a2 }; + chacha_block(&r); + mi_assert_internal(array_equals(r.output, r_out, 16)); +} +*/ diff --git a/compat/mimalloc/segment-cache.c b/compat/mimalloc/segment-cache.c new file mode 100644 index 00000000000000..e69de29bb2d1d6 diff --git a/compat/mimalloc/segment-map.c b/compat/mimalloc/segment-map.c new file mode 100644 index 00000000000000..3cd2127e56c1a7 --- /dev/null +++ b/compat/mimalloc/segment-map.c @@ -0,0 +1,153 @@ +/* ---------------------------------------------------------------------------- +Copyright (c) 2019-2023, Microsoft Research, Daan Leijen +This is free software; you can redistribute it and/or modify it under the +terms of the MIT license. A copy of the license can be found in the file +"LICENSE" at the root of this distribution. +-----------------------------------------------------------------------------*/ + +/* ----------------------------------------------------------- + The following functions are to reliably find the segment or + block that encompasses any pointer p (or NULL if it is not + in any of our segments). + We maintain a bitmap of all memory with 1 bit per MI_SEGMENT_SIZE (64MiB) + set to 1 if it contains the segment meta data. +----------------------------------------------------------- */ +#include "mimalloc.h" +#include "mimalloc/internal.h" +#include "mimalloc/atomic.h" + +#if (MI_INTPTR_SIZE==8) +#define MI_MAX_ADDRESS ((size_t)40 << 40) // 40TB (to include huge page areas) +#else +#define MI_MAX_ADDRESS ((size_t)2 << 30) // 2Gb +#endif + +#define MI_SEGMENT_MAP_BITS (MI_MAX_ADDRESS / MI_SEGMENT_SIZE) +#define MI_SEGMENT_MAP_SIZE (MI_SEGMENT_MAP_BITS / 8) +#define MI_SEGMENT_MAP_WSIZE (MI_SEGMENT_MAP_SIZE / MI_INTPTR_SIZE) + +static _Atomic(uintptr_t) mi_segment_map[MI_SEGMENT_MAP_WSIZE + 1]; // 2KiB per TB with 64MiB segments + +static size_t mi_segment_map_index_of(const mi_segment_t* segment, size_t* bitidx) { + mi_assert_internal(_mi_ptr_segment(segment + 1) == segment); // is it aligned on MI_SEGMENT_SIZE? + if ((uintptr_t)segment >= MI_MAX_ADDRESS) { + *bitidx = 0; + return MI_SEGMENT_MAP_WSIZE; + } + else { + const uintptr_t segindex = ((uintptr_t)segment) / MI_SEGMENT_SIZE; + *bitidx = segindex % MI_INTPTR_BITS; + const size_t mapindex = segindex / MI_INTPTR_BITS; + mi_assert_internal(mapindex < MI_SEGMENT_MAP_WSIZE); + return mapindex; + } +} + +void _mi_segment_map_allocated_at(const mi_segment_t* segment) { + size_t bitidx; + size_t index = mi_segment_map_index_of(segment, &bitidx); + mi_assert_internal(index <= MI_SEGMENT_MAP_WSIZE); + if (index==MI_SEGMENT_MAP_WSIZE) return; + uintptr_t mask = mi_atomic_load_relaxed(&mi_segment_map[index]); + uintptr_t newmask; + do { + newmask = (mask | ((uintptr_t)1 << bitidx)); + } while (!mi_atomic_cas_weak_release(&mi_segment_map[index], &mask, newmask)); +} + +void _mi_segment_map_freed_at(const mi_segment_t* segment) { + size_t bitidx; + size_t index = mi_segment_map_index_of(segment, &bitidx); + mi_assert_internal(index <= MI_SEGMENT_MAP_WSIZE); + if (index == MI_SEGMENT_MAP_WSIZE) return; + uintptr_t mask = mi_atomic_load_relaxed(&mi_segment_map[index]); + uintptr_t newmask; + do { + newmask = (mask & ~((uintptr_t)1 << bitidx)); + } while (!mi_atomic_cas_weak_release(&mi_segment_map[index], &mask, newmask)); +} + +// Determine the segment belonging to a pointer or NULL if it is not in a valid segment. +static mi_segment_t* _mi_segment_of(const void* p) { + if (p == NULL) return NULL; + mi_segment_t* segment = _mi_ptr_segment(p); + mi_assert_internal(segment != NULL); + size_t bitidx; + size_t index = mi_segment_map_index_of(segment, &bitidx); + // fast path: for any pointer to valid small/medium/large object or first MI_SEGMENT_SIZE in huge + const uintptr_t mask = mi_atomic_load_relaxed(&mi_segment_map[index]); + if mi_likely((mask & ((uintptr_t)1 << bitidx)) != 0) { + return segment; // yes, allocated by us + } + if (index==MI_SEGMENT_MAP_WSIZE) return NULL; + + // TODO: maintain max/min allocated range for efficiency for more efficient rejection of invalid pointers? + + // search downwards for the first segment in case it is an interior pointer + // could be slow but searches in MI_INTPTR_SIZE * MI_SEGMENT_SIZE (512MiB) steps trough + // valid huge objects + // note: we could maintain a lowest index to speed up the path for invalid pointers? + size_t lobitidx; + size_t loindex; + uintptr_t lobits = mask & (((uintptr_t)1 << bitidx) - 1); + if (lobits != 0) { + loindex = index; + lobitidx = mi_bsr(lobits); // lobits != 0 + } + else if (index == 0) { + return NULL; + } + else { + mi_assert_internal(index > 0); + uintptr_t lomask = mask; + loindex = index; + do { + loindex--; + lomask = mi_atomic_load_relaxed(&mi_segment_map[loindex]); + } while (lomask != 0 && loindex > 0); + if (lomask == 0) return NULL; + lobitidx = mi_bsr(lomask); // lomask != 0 + } + mi_assert_internal(loindex < MI_SEGMENT_MAP_WSIZE); + // take difference as the addresses could be larger than the MAX_ADDRESS space. + size_t diff = (((index - loindex) * (8*MI_INTPTR_SIZE)) + bitidx - lobitidx) * MI_SEGMENT_SIZE; + segment = (mi_segment_t*)((uint8_t*)segment - diff); + + if (segment == NULL) return NULL; + mi_assert_internal((void*)segment < p); + bool cookie_ok = (_mi_ptr_cookie(segment) == segment->cookie); + mi_assert_internal(cookie_ok); + if mi_unlikely(!cookie_ok) return NULL; + if (((uint8_t*)segment + mi_segment_size(segment)) <= (uint8_t*)p) return NULL; // outside the range + mi_assert_internal(p >= (void*)segment && (uint8_t*)p < (uint8_t*)segment + mi_segment_size(segment)); + return segment; +} + +// Is this a valid pointer in our heap? +static bool mi_is_valid_pointer(const void* p) { + return ((_mi_segment_of(p) != NULL) || (_mi_arena_contains(p))); +} + +mi_decl_nodiscard mi_decl_export bool mi_is_in_heap_region(const void* p) mi_attr_noexcept { + return mi_is_valid_pointer(p); +} + +/* +// Return the full segment range belonging to a pointer +static void* mi_segment_range_of(const void* p, size_t* size) { + mi_segment_t* segment = _mi_segment_of(p); + if (segment == NULL) { + if (size != NULL) *size = 0; + return NULL; + } + else { + if (size != NULL) *size = segment->segment_size; + return segment; + } + mi_assert_expensive(page == NULL || mi_segment_is_valid(_mi_page_segment(page),tld)); + mi_assert_internal(page == NULL || (mi_segment_page_size(_mi_page_segment(page)) - (MI_SECURE == 0 ? 0 : _mi_os_page_size())) >= block_size); + mi_reset_delayed(tld); + mi_assert_internal(page == NULL || mi_page_not_in_queue(page, tld)); + return page; +} +*/ diff --git a/compat/mimalloc/segment.c b/compat/mimalloc/segment.c new file mode 100644 index 00000000000000..6b901f6cc80f13 --- /dev/null +++ b/compat/mimalloc/segment.c @@ -0,0 +1,1617 @@ +/* ---------------------------------------------------------------------------- +Copyright (c) 2018-2020, Microsoft Research, Daan Leijen +This is free software; you can redistribute it and/or modify it under the +terms of the MIT license. A copy of the license can be found in the file +"LICENSE" at the root of this distribution. +-----------------------------------------------------------------------------*/ +#include "mimalloc.h" +#include "mimalloc/internal.h" +#include "mimalloc/atomic.h" + +#include // memset +#include + +#define MI_PAGE_HUGE_ALIGN (256*1024) + +static void mi_segment_try_purge(mi_segment_t* segment, bool force, mi_stats_t* stats); + + +// ------------------------------------------------------------------- +// commit mask +// ------------------------------------------------------------------- + +static bool mi_commit_mask_all_set(const mi_commit_mask_t* commit, const mi_commit_mask_t* cm) { + for (size_t i = 0; i < MI_COMMIT_MASK_FIELD_COUNT; i++) { + if ((commit->mask[i] & cm->mask[i]) != cm->mask[i]) return false; + } + return true; +} + +static bool mi_commit_mask_any_set(const mi_commit_mask_t* commit, const mi_commit_mask_t* cm) { + for (size_t i = 0; i < MI_COMMIT_MASK_FIELD_COUNT; i++) { + if ((commit->mask[i] & cm->mask[i]) != 0) return true; + } + return false; +} + +static void mi_commit_mask_create_intersect(const mi_commit_mask_t* commit, const mi_commit_mask_t* cm, mi_commit_mask_t* res) { + for (size_t i = 0; i < MI_COMMIT_MASK_FIELD_COUNT; i++) { + res->mask[i] = (commit->mask[i] & cm->mask[i]); + } +} + +static void mi_commit_mask_clear(mi_commit_mask_t* res, const mi_commit_mask_t* cm) { + for (size_t i = 0; i < MI_COMMIT_MASK_FIELD_COUNT; i++) { + res->mask[i] &= ~(cm->mask[i]); + } +} + +static void mi_commit_mask_set(mi_commit_mask_t* res, const mi_commit_mask_t* cm) { + for (size_t i = 0; i < MI_COMMIT_MASK_FIELD_COUNT; i++) { + res->mask[i] |= cm->mask[i]; + } +} + +static void mi_commit_mask_create(size_t bitidx, size_t bitcount, mi_commit_mask_t* cm) { + mi_assert_internal(bitidx < MI_COMMIT_MASK_BITS); + mi_assert_internal((bitidx + bitcount) <= MI_COMMIT_MASK_BITS); + if (bitcount == MI_COMMIT_MASK_BITS) { + mi_assert_internal(bitidx==0); + mi_commit_mask_create_full(cm); + } + else if (bitcount == 0) { + mi_commit_mask_create_empty(cm); + } + else { + mi_commit_mask_create_empty(cm); + size_t i = bitidx / MI_COMMIT_MASK_FIELD_BITS; + size_t ofs = bitidx % MI_COMMIT_MASK_FIELD_BITS; + while (bitcount > 0) { + mi_assert_internal(i < MI_COMMIT_MASK_FIELD_COUNT); + size_t avail = MI_COMMIT_MASK_FIELD_BITS - ofs; + size_t count = (bitcount > avail ? avail : bitcount); + size_t mask = (count >= MI_COMMIT_MASK_FIELD_BITS ? ~((size_t)0) : (((size_t)1 << count) - 1) << ofs); + cm->mask[i] = mask; + bitcount -= count; + ofs = 0; + i++; + } + } +} + +size_t _mi_commit_mask_committed_size(const mi_commit_mask_t* cm, size_t total) { + mi_assert_internal((total%MI_COMMIT_MASK_BITS)==0); + size_t count = 0; + for (size_t i = 0; i < MI_COMMIT_MASK_FIELD_COUNT; i++) { + size_t mask = cm->mask[i]; + if (~mask == 0) { + count += MI_COMMIT_MASK_FIELD_BITS; + } + else { + for (; mask != 0; mask >>= 1) { // todo: use popcount + if ((mask&1)!=0) count++; + } + } + } + // we use total since for huge segments each commit bit may represent a larger size + return ((total / MI_COMMIT_MASK_BITS) * count); +} + + +size_t _mi_commit_mask_next_run(const mi_commit_mask_t* cm, size_t* idx) { + size_t i = (*idx) / MI_COMMIT_MASK_FIELD_BITS; + size_t ofs = (*idx) % MI_COMMIT_MASK_FIELD_BITS; + size_t mask = 0; + // find first ones + while (i < MI_COMMIT_MASK_FIELD_COUNT) { + mask = cm->mask[i]; + mask >>= ofs; + if (mask != 0) { + while ((mask&1) == 0) { + mask >>= 1; + ofs++; + } + break; + } + i++; + ofs = 0; + } + if (i >= MI_COMMIT_MASK_FIELD_COUNT) { + // not found + *idx = MI_COMMIT_MASK_BITS; + return 0; + } + else { + // found, count ones + size_t count = 0; + *idx = (i*MI_COMMIT_MASK_FIELD_BITS) + ofs; + do { + mi_assert_internal(ofs < MI_COMMIT_MASK_FIELD_BITS && (mask&1) == 1); + do { + count++; + mask >>= 1; + } while ((mask&1) == 1); + if ((((*idx + count) % MI_COMMIT_MASK_FIELD_BITS) == 0)) { + i++; + if (i >= MI_COMMIT_MASK_FIELD_COUNT) break; + mask = cm->mask[i]; + ofs = 0; + } + } while ((mask&1) == 1); + mi_assert_internal(count > 0); + return count; + } +} + + +/* -------------------------------------------------------------------------------- + Segment allocation + + If a thread ends, it "abandons" pages with used blocks + and there is an abandoned segment list whose segments can + be reclaimed by still running threads, much like work-stealing. +-------------------------------------------------------------------------------- */ + + +/* ----------------------------------------------------------- + Slices +----------------------------------------------------------- */ + + +static const mi_slice_t* mi_segment_slices_end(const mi_segment_t* segment) { + return &segment->slices[segment->slice_entries]; +} + +static uint8_t* mi_slice_start(const mi_slice_t* slice) { + mi_segment_t* segment = _mi_ptr_segment(slice); + mi_assert_internal(slice >= segment->slices && slice < mi_segment_slices_end(segment)); + return ((uint8_t*)segment + ((slice - segment->slices)*MI_SEGMENT_SLICE_SIZE)); +} + + +/* ----------------------------------------------------------- + Bins +----------------------------------------------------------- */ +// Use bit scan forward to quickly find the first zero bit if it is available + +static inline size_t mi_slice_bin8(size_t slice_count) { + if (slice_count<=1) return slice_count; + mi_assert_internal(slice_count <= MI_SLICES_PER_SEGMENT); + slice_count--; + size_t s = mi_bsr(slice_count); // slice_count > 1 + if (s <= 2) return slice_count + 1; + size_t bin = ((s << 2) | ((slice_count >> (s - 2))&0x03)) - 4; + return bin; +} + +static inline size_t mi_slice_bin(size_t slice_count) { + mi_assert_internal(slice_count*MI_SEGMENT_SLICE_SIZE <= MI_SEGMENT_SIZE); + mi_assert_internal(mi_slice_bin8(MI_SLICES_PER_SEGMENT) <= MI_SEGMENT_BIN_MAX); + size_t bin = mi_slice_bin8(slice_count); + mi_assert_internal(bin <= MI_SEGMENT_BIN_MAX); + return bin; +} + +static inline size_t mi_slice_index(const mi_slice_t* slice) { + mi_segment_t* segment = _mi_ptr_segment(slice); + ptrdiff_t index = slice - segment->slices; + mi_assert_internal(index >= 0 && index < (ptrdiff_t)segment->slice_entries); + return index; +} + + +/* ----------------------------------------------------------- + Slice span queues +----------------------------------------------------------- */ + +static void mi_span_queue_push(mi_span_queue_t* sq, mi_slice_t* slice) { + // todo: or push to the end? + mi_assert_internal(slice->prev == NULL && slice->next==NULL); + slice->prev = NULL; // paranoia + slice->next = sq->first; + sq->first = slice; + if (slice->next != NULL) slice->next->prev = slice; + else sq->last = slice; + slice->xblock_size = 0; // free +} + +static mi_span_queue_t* mi_span_queue_for(size_t slice_count, mi_segments_tld_t* tld) { + size_t bin = mi_slice_bin(slice_count); + mi_span_queue_t* sq = &tld->spans[bin]; + mi_assert_internal(sq->slice_count >= slice_count); + return sq; +} + +static void mi_span_queue_delete(mi_span_queue_t* sq, mi_slice_t* slice) { + mi_assert_internal(slice->xblock_size==0 && slice->slice_count>0 && slice->slice_offset==0); + // should work too if the queue does not contain slice (which can happen during reclaim) + if (slice->prev != NULL) slice->prev->next = slice->next; + if (slice == sq->first) sq->first = slice->next; + if (slice->next != NULL) slice->next->prev = slice->prev; + if (slice == sq->last) sq->last = slice->prev; + slice->prev = NULL; + slice->next = NULL; + slice->xblock_size = 1; // no more free +} + + +/* ----------------------------------------------------------- + Invariant checking +----------------------------------------------------------- */ + +static bool mi_slice_is_used(const mi_slice_t* slice) { + return (slice->xblock_size > 0); +} + + +#if (MI_DEBUG>=3) +static bool mi_span_queue_contains(mi_span_queue_t* sq, mi_slice_t* slice) { + for (mi_slice_t* s = sq->first; s != NULL; s = s->next) { + if (s==slice) return true; + } + return false; +} + +static bool mi_segment_is_valid(mi_segment_t* segment, mi_segments_tld_t* tld) { + mi_assert_internal(segment != NULL); + mi_assert_internal(_mi_ptr_cookie(segment) == segment->cookie); + mi_assert_internal(segment->abandoned <= segment->used); + mi_assert_internal(segment->thread_id == 0 || segment->thread_id == _mi_thread_id()); + mi_assert_internal(mi_commit_mask_all_set(&segment->commit_mask, &segment->purge_mask)); // can only decommit committed blocks + //mi_assert_internal(segment->segment_info_size % MI_SEGMENT_SLICE_SIZE == 0); + mi_slice_t* slice = &segment->slices[0]; + const mi_slice_t* end = mi_segment_slices_end(segment); + size_t used_count = 0; + mi_span_queue_t* sq; + while(slice < end) { + mi_assert_internal(slice->slice_count > 0); + mi_assert_internal(slice->slice_offset == 0); + size_t index = mi_slice_index(slice); + size_t maxindex = (index + slice->slice_count >= segment->slice_entries ? segment->slice_entries : index + slice->slice_count) - 1; + if (mi_slice_is_used(slice)) { // a page in use, we need at least MAX_SLICE_OFFSET valid back offsets + used_count++; + for (size_t i = 0; i <= MI_MAX_SLICE_OFFSET && index + i <= maxindex; i++) { + mi_assert_internal(segment->slices[index + i].slice_offset == i*sizeof(mi_slice_t)); + mi_assert_internal(i==0 || segment->slices[index + i].slice_count == 0); + mi_assert_internal(i==0 || segment->slices[index + i].xblock_size == 1); + } + // and the last entry as well (for coalescing) + const mi_slice_t* last = slice + slice->slice_count - 1; + if (last > slice && last < mi_segment_slices_end(segment)) { + mi_assert_internal(last->slice_offset == (slice->slice_count-1)*sizeof(mi_slice_t)); + mi_assert_internal(last->slice_count == 0); + mi_assert_internal(last->xblock_size == 1); + } + } + else { // free range of slices; only last slice needs a valid back offset + mi_slice_t* last = &segment->slices[maxindex]; + if (segment->kind != MI_SEGMENT_HUGE || slice->slice_count <= (segment->slice_entries - segment->segment_info_slices)) { + mi_assert_internal((uint8_t*)slice == (uint8_t*)last - last->slice_offset); + } + mi_assert_internal(slice == last || last->slice_count == 0 ); + mi_assert_internal(last->xblock_size == 0 || (segment->kind==MI_SEGMENT_HUGE && last->xblock_size==1)); + if (segment->kind != MI_SEGMENT_HUGE && segment->thread_id != 0) { // segment is not huge or abandoned + sq = mi_span_queue_for(slice->slice_count,tld); + mi_assert_internal(mi_span_queue_contains(sq,slice)); + } + } + slice = &segment->slices[maxindex+1]; + } + mi_assert_internal(slice == end); + mi_assert_internal(used_count == segment->used + 1); + return true; +} +#endif + +/* ----------------------------------------------------------- + Segment size calculations +----------------------------------------------------------- */ + +static size_t mi_segment_info_size(mi_segment_t* segment) { + return segment->segment_info_slices * MI_SEGMENT_SLICE_SIZE; +} + +static uint8_t* _mi_segment_page_start_from_slice(const mi_segment_t* segment, const mi_slice_t* slice, size_t xblock_size, size_t* page_size) +{ + ptrdiff_t idx = slice - segment->slices; + size_t psize = (size_t)slice->slice_count * MI_SEGMENT_SLICE_SIZE; + // make the start not OS page aligned for smaller blocks to avoid page/cache effects + // note: the offset must always be an xblock_size multiple since we assume small allocations + // are aligned (see `mi_heap_malloc_aligned`). + size_t start_offset = 0; + if (xblock_size >= MI_INTPTR_SIZE) { + if (xblock_size <= 64) { start_offset = 3*xblock_size; } + else if (xblock_size <= 512) { start_offset = xblock_size; } + } + if (page_size != NULL) { *page_size = psize - start_offset; } + return (uint8_t*)segment + ((idx*MI_SEGMENT_SLICE_SIZE) + start_offset); +} + +// Start of the page available memory; can be used on uninitialized pages +uint8_t* _mi_segment_page_start(const mi_segment_t* segment, const mi_page_t* page, size_t* page_size) +{ + const mi_slice_t* slice = mi_page_to_slice((mi_page_t*)page); + uint8_t* p = _mi_segment_page_start_from_slice(segment, slice, page->xblock_size, page_size); + mi_assert_internal(page->xblock_size > 0 || _mi_ptr_page(p) == page); + mi_assert_internal(_mi_ptr_segment(p) == segment); + return p; +} + + +static size_t mi_segment_calculate_slices(size_t required, size_t* pre_size, size_t* info_slices) { + size_t page_size = _mi_os_page_size(); + size_t isize = _mi_align_up(sizeof(mi_segment_t), page_size); + size_t guardsize = 0; + + if (MI_SECURE>0) { + // in secure mode, we set up a protected page in between the segment info + // and the page data (and one at the end of the segment) + guardsize = page_size; + if (required > 0) { + required = _mi_align_up(required, MI_SEGMENT_SLICE_SIZE) + page_size; + } + } + + if (pre_size != NULL) *pre_size = isize; + isize = _mi_align_up(isize + guardsize, MI_SEGMENT_SLICE_SIZE); + if (info_slices != NULL) *info_slices = isize / MI_SEGMENT_SLICE_SIZE; + size_t segment_size = (required==0 ? MI_SEGMENT_SIZE : _mi_align_up( required + isize + guardsize, MI_SEGMENT_SLICE_SIZE) ); + mi_assert_internal(segment_size % MI_SEGMENT_SLICE_SIZE == 0); + return (segment_size / MI_SEGMENT_SLICE_SIZE); +} + + +/* ---------------------------------------------------------------------------- +Segment caches +We keep a small segment cache per thread to increase local +reuse and avoid setting/clearing guard pages in secure mode. +------------------------------------------------------------------------------- */ + +static void mi_segments_track_size(long segment_size, mi_segments_tld_t* tld) { + if (segment_size>=0) _mi_stat_increase(&tld->stats->segments,1); + else _mi_stat_decrease(&tld->stats->segments,1); + tld->count += (segment_size >= 0 ? 1 : -1); + if (tld->count > tld->peak_count) tld->peak_count = tld->count; + tld->current_size += segment_size; + if (tld->current_size > tld->peak_size) tld->peak_size = tld->current_size; +} + +static void mi_segment_os_free(mi_segment_t* segment, mi_segments_tld_t* tld) { + segment->thread_id = 0; + _mi_segment_map_freed_at(segment); + mi_segments_track_size(-((long)mi_segment_size(segment)),tld); + if (MI_SECURE>0) { + // _mi_os_unprotect(segment, mi_segment_size(segment)); // ensure no more guard pages are set + // unprotect the guard pages; we cannot just unprotect the whole segment size as part may be decommitted + size_t os_pagesize = _mi_os_page_size(); + _mi_os_unprotect((uint8_t*)segment + mi_segment_info_size(segment) - os_pagesize, os_pagesize); + uint8_t* end = (uint8_t*)segment + mi_segment_size(segment) - os_pagesize; + _mi_os_unprotect(end, os_pagesize); + } + + // purge delayed decommits now? (no, leave it to the arena) + // mi_segment_try_purge(segment,true,tld->stats); + + const size_t size = mi_segment_size(segment); + const size_t csize = _mi_commit_mask_committed_size(&segment->commit_mask, size); + + _mi_abandoned_await_readers(); // wait until safe to free + _mi_arena_free(segment, mi_segment_size(segment), csize, segment->memid, tld->stats); +} + +// called by threads that are terminating +void _mi_segment_thread_collect(mi_segments_tld_t* tld) { + MI_UNUSED(tld); + // nothing to do +} + + +/* ----------------------------------------------------------- + Commit/Decommit ranges +----------------------------------------------------------- */ + +static void mi_segment_commit_mask(mi_segment_t* segment, bool conservative, uint8_t* p, size_t size, uint8_t** start_p, size_t* full_size, mi_commit_mask_t* cm) { + mi_assert_internal(_mi_ptr_segment(p + 1) == segment); + mi_assert_internal(segment->kind != MI_SEGMENT_HUGE); + mi_commit_mask_create_empty(cm); + if (size == 0 || size > MI_SEGMENT_SIZE || segment->kind == MI_SEGMENT_HUGE) return; + const size_t segstart = mi_segment_info_size(segment); + const size_t segsize = mi_segment_size(segment); + if (p >= (uint8_t*)segment + segsize) return; + + size_t pstart = (p - (uint8_t*)segment); + mi_assert_internal(pstart + size <= segsize); + + size_t start; + size_t end; + if (conservative) { + // decommit conservative + start = _mi_align_up(pstart, MI_COMMIT_SIZE); + end = _mi_align_down(pstart + size, MI_COMMIT_SIZE); + mi_assert_internal(start >= segstart); + mi_assert_internal(end <= segsize); + } + else { + // commit liberal + start = _mi_align_down(pstart, MI_MINIMAL_COMMIT_SIZE); + end = _mi_align_up(pstart + size, MI_MINIMAL_COMMIT_SIZE); + } + if (pstart >= segstart && start < segstart) { // note: the mask is also calculated for an initial commit of the info area + start = segstart; + } + if (end > segsize) { + end = segsize; + } + + mi_assert_internal(start <= pstart && (pstart + size) <= end); + mi_assert_internal(start % MI_COMMIT_SIZE==0 && end % MI_COMMIT_SIZE == 0); + *start_p = (uint8_t*)segment + start; + *full_size = (end > start ? end - start : 0); + if (*full_size == 0) return; + + size_t bitidx = start / MI_COMMIT_SIZE; + mi_assert_internal(bitidx < MI_COMMIT_MASK_BITS); + + size_t bitcount = *full_size / MI_COMMIT_SIZE; // can be 0 + if (bitidx + bitcount > MI_COMMIT_MASK_BITS) { + _mi_warning_message("commit mask overflow: idx=%zu count=%zu start=%zx end=%zx p=0x%p size=%zu fullsize=%zu\n", bitidx, bitcount, start, end, p, size, *full_size); + } + mi_assert_internal((bitidx + bitcount) <= MI_COMMIT_MASK_BITS); + mi_commit_mask_create(bitidx, bitcount, cm); +} + +static bool mi_segment_commit(mi_segment_t* segment, uint8_t* p, size_t size, mi_stats_t* stats) { + mi_assert_internal(mi_commit_mask_all_set(&segment->commit_mask, &segment->purge_mask)); + + // commit liberal + uint8_t* start = NULL; + size_t full_size = 0; + mi_commit_mask_t mask; + mi_segment_commit_mask(segment, false /* conservative? */, p, size, &start, &full_size, &mask); + if (mi_commit_mask_is_empty(&mask) || full_size == 0) return true; + + if (!mi_commit_mask_all_set(&segment->commit_mask, &mask)) { + // committing + bool is_zero = false; + mi_commit_mask_t cmask; + mi_commit_mask_create_intersect(&segment->commit_mask, &mask, &cmask); + _mi_stat_decrease(&_mi_stats_main.committed, _mi_commit_mask_committed_size(&cmask, MI_SEGMENT_SIZE)); // adjust for overlap + if (!_mi_os_commit(start, full_size, &is_zero, stats)) return false; + mi_commit_mask_set(&segment->commit_mask, &mask); + } + + // increase purge expiration when using part of delayed purges -- we assume more allocations are coming soon. + if (mi_commit_mask_any_set(&segment->purge_mask, &mask)) { + segment->purge_expire = _mi_clock_now() + mi_option_get(mi_option_purge_delay); + } + + // always clear any delayed purges in our range (as they are either committed now) + mi_commit_mask_clear(&segment->purge_mask, &mask); + return true; +} + +static bool mi_segment_ensure_committed(mi_segment_t* segment, uint8_t* p, size_t size, mi_stats_t* stats) { + mi_assert_internal(mi_commit_mask_all_set(&segment->commit_mask, &segment->purge_mask)); + // note: assumes commit_mask is always full for huge segments as otherwise the commit mask bits can overflow + if (mi_commit_mask_is_full(&segment->commit_mask) && mi_commit_mask_is_empty(&segment->purge_mask)) return true; // fully committed + mi_assert_internal(segment->kind != MI_SEGMENT_HUGE); + return mi_segment_commit(segment, p, size, stats); +} + +static bool mi_segment_purge(mi_segment_t* segment, uint8_t* p, size_t size, mi_stats_t* stats) { + mi_assert_internal(mi_commit_mask_all_set(&segment->commit_mask, &segment->purge_mask)); + if (!segment->allow_purge) return true; + + // purge conservative + uint8_t* start = NULL; + size_t full_size = 0; + mi_commit_mask_t mask; + mi_segment_commit_mask(segment, true /* conservative? */, p, size, &start, &full_size, &mask); + if (mi_commit_mask_is_empty(&mask) || full_size==0) return true; + + if (mi_commit_mask_any_set(&segment->commit_mask, &mask)) { + // purging + mi_assert_internal((void*)start != (void*)segment); + mi_assert_internal(segment->allow_decommit); + const bool decommitted = _mi_os_purge(start, full_size, stats); // reset or decommit + if (decommitted) { + mi_commit_mask_t cmask; + mi_commit_mask_create_intersect(&segment->commit_mask, &mask, &cmask); + _mi_stat_increase(&_mi_stats_main.committed, full_size - _mi_commit_mask_committed_size(&cmask, MI_SEGMENT_SIZE)); // adjust for double counting + mi_commit_mask_clear(&segment->commit_mask, &mask); + } + } + + // always clear any scheduled purges in our range + mi_commit_mask_clear(&segment->purge_mask, &mask); + return true; +} + +static void mi_segment_schedule_purge(mi_segment_t* segment, uint8_t* p, size_t size, mi_stats_t* stats) { + if (!segment->allow_purge) return; + + if (mi_option_get(mi_option_purge_delay) == 0) { + mi_segment_purge(segment, p, size, stats); + } + else { + // register for future purge in the purge mask + uint8_t* start = NULL; + size_t full_size = 0; + mi_commit_mask_t mask; + mi_segment_commit_mask(segment, true /*conservative*/, p, size, &start, &full_size, &mask); + if (mi_commit_mask_is_empty(&mask) || full_size==0) return; + + // update delayed commit + mi_assert_internal(segment->purge_expire > 0 || mi_commit_mask_is_empty(&segment->purge_mask)); + mi_commit_mask_t cmask; + mi_commit_mask_create_intersect(&segment->commit_mask, &mask, &cmask); // only purge what is committed; span_free may try to decommit more + mi_commit_mask_set(&segment->purge_mask, &cmask); + mi_msecs_t now = _mi_clock_now(); + if (segment->purge_expire == 0) { + // no previous purgess, initialize now + segment->purge_expire = now + mi_option_get(mi_option_purge_delay); + } + else if (segment->purge_expire <= now) { + // previous purge mask already expired + if (segment->purge_expire + mi_option_get(mi_option_purge_extend_delay) <= now) { + mi_segment_try_purge(segment, true, stats); + } + else { + segment->purge_expire = now + mi_option_get(mi_option_purge_extend_delay); // (mi_option_get(mi_option_purge_delay) / 8); // wait a tiny bit longer in case there is a series of free's + } + } + else { + // previous purge mask is not yet expired, increase the expiration by a bit. + segment->purge_expire += mi_option_get(mi_option_purge_extend_delay); + } + } +} + +static void mi_segment_try_purge(mi_segment_t* segment, bool force, mi_stats_t* stats) { + if (!segment->allow_purge || mi_commit_mask_is_empty(&segment->purge_mask)) return; + mi_msecs_t now = _mi_clock_now(); + if (!force && now < segment->purge_expire) return; + + mi_commit_mask_t mask = segment->purge_mask; + segment->purge_expire = 0; + mi_commit_mask_create_empty(&segment->purge_mask); + + size_t idx; + size_t count; + mi_commit_mask_foreach(&mask, idx, count) { + // if found, decommit that sequence + if (count > 0) { + uint8_t* p = (uint8_t*)segment + (idx*MI_COMMIT_SIZE); + size_t size = count * MI_COMMIT_SIZE; + mi_segment_purge(segment, p, size, stats); + } + } + mi_commit_mask_foreach_end() + mi_assert_internal(mi_commit_mask_is_empty(&segment->purge_mask)); +} + + +/* ----------------------------------------------------------- + Span free +----------------------------------------------------------- */ + +static bool mi_segment_is_abandoned(mi_segment_t* segment) { + return (segment->thread_id == 0); +} + +// note: can be called on abandoned segments +static void mi_segment_span_free(mi_segment_t* segment, size_t slice_index, size_t slice_count, bool allow_purge, mi_segments_tld_t* tld) { + mi_assert_internal(slice_index < segment->slice_entries); + mi_span_queue_t* sq = (segment->kind == MI_SEGMENT_HUGE || mi_segment_is_abandoned(segment) + ? NULL : mi_span_queue_for(slice_count,tld)); + if (slice_count==0) slice_count = 1; + mi_assert_internal(slice_index + slice_count - 1 < segment->slice_entries); + + // set first and last slice (the intermediates can be undetermined) + mi_slice_t* slice = &segment->slices[slice_index]; + slice->slice_count = (uint32_t)slice_count; + mi_assert_internal(slice->slice_count == slice_count); // no overflow? + slice->slice_offset = 0; + if (slice_count > 1) { + mi_slice_t* last = &segment->slices[slice_index + slice_count - 1]; + last->slice_count = 0; + last->slice_offset = (uint32_t)(sizeof(mi_page_t)*(slice_count - 1)); + last->xblock_size = 0; + } + + // perhaps decommit + if (allow_purge) { + mi_segment_schedule_purge(segment, mi_slice_start(slice), slice_count * MI_SEGMENT_SLICE_SIZE, tld->stats); + } + + // and push it on the free page queue (if it was not a huge page) + if (sq != NULL) mi_span_queue_push( sq, slice ); + else slice->xblock_size = 0; // mark huge page as free anyways +} + +/* +// called from reclaim to add existing free spans +static void mi_segment_span_add_free(mi_slice_t* slice, mi_segments_tld_t* tld) { + mi_segment_t* segment = _mi_ptr_segment(slice); + mi_assert_internal(slice->xblock_size==0 && slice->slice_count>0 && slice->slice_offset==0); + size_t slice_index = mi_slice_index(slice); + mi_segment_span_free(segment,slice_index,slice->slice_count,tld); +} +*/ + +static void mi_segment_span_remove_from_queue(mi_slice_t* slice, mi_segments_tld_t* tld) { + mi_assert_internal(slice->slice_count > 0 && slice->slice_offset==0 && slice->xblock_size==0); + mi_assert_internal(_mi_ptr_segment(slice)->kind != MI_SEGMENT_HUGE); + mi_span_queue_t* sq = mi_span_queue_for(slice->slice_count, tld); + mi_span_queue_delete(sq, slice); +} + +// note: can be called on abandoned segments +static mi_slice_t* mi_segment_span_free_coalesce(mi_slice_t* slice, mi_segments_tld_t* tld) { + mi_assert_internal(slice != NULL && slice->slice_count > 0 && slice->slice_offset == 0); + mi_segment_t* segment = _mi_ptr_segment(slice); + bool is_abandoned = mi_segment_is_abandoned(segment); + + // for huge pages, just mark as free but don't add to the queues + if (segment->kind == MI_SEGMENT_HUGE) { + // issue #691: segment->used can be 0 if the huge page block was freed while abandoned (reclaim will get here in that case) + mi_assert_internal((segment->used==0 && slice->xblock_size==0) || segment->used == 1); // decreased right after this call in `mi_segment_page_clear` + slice->xblock_size = 0; // mark as free anyways + // we should mark the last slice `xblock_size=0` now to maintain invariants but we skip it to + // avoid a possible cache miss (and the segment is about to be freed) + return slice; + } + + // otherwise coalesce the span and add to the free span queues + size_t slice_count = slice->slice_count; + mi_slice_t* next = slice + slice->slice_count; + mi_assert_internal(next <= mi_segment_slices_end(segment)); + if (next < mi_segment_slices_end(segment) && next->xblock_size==0) { + // free next block -- remove it from free and merge + mi_assert_internal(next->slice_count > 0 && next->slice_offset==0); + slice_count += next->slice_count; // extend + if (!is_abandoned) { mi_segment_span_remove_from_queue(next, tld); } + } + if (slice > segment->slices) { + mi_slice_t* prev = mi_slice_first(slice - 1); + mi_assert_internal(prev >= segment->slices); + if (prev->xblock_size==0) { + // free previous slice -- remove it from free and merge + mi_assert_internal(prev->slice_count > 0 && prev->slice_offset==0); + slice_count += prev->slice_count; + if (!is_abandoned) { mi_segment_span_remove_from_queue(prev, tld); } + slice = prev; + } + } + + // and add the new free page + mi_segment_span_free(segment, mi_slice_index(slice), slice_count, true, tld); + return slice; +} + + + +/* ----------------------------------------------------------- + Page allocation +----------------------------------------------------------- */ + +// Note: may still return NULL if committing the memory failed +static mi_page_t* mi_segment_span_allocate(mi_segment_t* segment, size_t slice_index, size_t slice_count, mi_segments_tld_t* tld) { + mi_assert_internal(slice_index < segment->slice_entries); + mi_slice_t* const slice = &segment->slices[slice_index]; + mi_assert_internal(slice->xblock_size==0 || slice->xblock_size==1); + + // commit before changing the slice data + if (!mi_segment_ensure_committed(segment, _mi_segment_page_start_from_slice(segment, slice, 0, NULL), slice_count * MI_SEGMENT_SLICE_SIZE, tld->stats)) { + return NULL; // commit failed! + } + + // convert the slices to a page + slice->slice_offset = 0; + slice->slice_count = (uint32_t)slice_count; + mi_assert_internal(slice->slice_count == slice_count); + const size_t bsize = slice_count * MI_SEGMENT_SLICE_SIZE; + slice->xblock_size = (uint32_t)(bsize >= MI_HUGE_BLOCK_SIZE ? MI_HUGE_BLOCK_SIZE : bsize); + mi_page_t* page = mi_slice_to_page(slice); + mi_assert_internal(mi_page_block_size(page) == bsize); + + // set slice back pointers for the first MI_MAX_SLICE_OFFSET entries + size_t extra = slice_count-1; + if (extra > MI_MAX_SLICE_OFFSET) extra = MI_MAX_SLICE_OFFSET; + if (slice_index + extra >= segment->slice_entries) extra = segment->slice_entries - slice_index - 1; // huge objects may have more slices than avaiable entries in the segment->slices + + mi_slice_t* slice_next = slice + 1; + for (size_t i = 1; i <= extra; i++, slice_next++) { + slice_next->slice_offset = (uint32_t)(sizeof(mi_slice_t)*i); + slice_next->slice_count = 0; + slice_next->xblock_size = 1; + } + + // and also for the last one (if not set already) (the last one is needed for coalescing and for large alignments) + // note: the cast is needed for ubsan since the index can be larger than MI_SLICES_PER_SEGMENT for huge allocations (see #543) + mi_slice_t* last = slice + slice_count - 1; + mi_slice_t* end = (mi_slice_t*)mi_segment_slices_end(segment); + if (last > end) last = end; + if (last > slice) { + last->slice_offset = (uint32_t)(sizeof(mi_slice_t) * (last - slice)); + last->slice_count = 0; + last->xblock_size = 1; + } + + // and initialize the page + page->is_committed = true; + segment->used++; + return page; +} + +static void mi_segment_slice_split(mi_segment_t* segment, mi_slice_t* slice, size_t slice_count, mi_segments_tld_t* tld) { + mi_assert_internal(_mi_ptr_segment(slice) == segment); + mi_assert_internal(slice->slice_count >= slice_count); + mi_assert_internal(slice->xblock_size > 0); // no more in free queue + if (slice->slice_count <= slice_count) return; + mi_assert_internal(segment->kind != MI_SEGMENT_HUGE); + size_t next_index = mi_slice_index(slice) + slice_count; + size_t next_count = slice->slice_count - slice_count; + mi_segment_span_free(segment, next_index, next_count, false /* don't purge left-over part */, tld); + slice->slice_count = (uint32_t)slice_count; +} + +static mi_page_t* mi_segments_page_find_and_allocate(size_t slice_count, mi_arena_id_t req_arena_id, mi_segments_tld_t* tld) { + mi_assert_internal(slice_count*MI_SEGMENT_SLICE_SIZE <= MI_LARGE_OBJ_SIZE_MAX); + // search from best fit up + mi_span_queue_t* sq = mi_span_queue_for(slice_count, tld); + if (slice_count == 0) slice_count = 1; + while (sq <= &tld->spans[MI_SEGMENT_BIN_MAX]) { + for (mi_slice_t* slice = sq->first; slice != NULL; slice = slice->next) { + if (slice->slice_count >= slice_count) { + // found one + mi_segment_t* segment = _mi_ptr_segment(slice); + if (_mi_arena_memid_is_suitable(segment->memid, req_arena_id)) { + // found a suitable page span + mi_span_queue_delete(sq, slice); + + if (slice->slice_count > slice_count) { + mi_segment_slice_split(segment, slice, slice_count, tld); + } + mi_assert_internal(slice != NULL && slice->slice_count == slice_count && slice->xblock_size > 0); + mi_page_t* page = mi_segment_span_allocate(segment, mi_slice_index(slice), slice->slice_count, tld); + if (page == NULL) { + // commit failed; return NULL but first restore the slice + mi_segment_span_free_coalesce(slice, tld); + return NULL; + } + return page; + } + } + } + sq++; + } + // could not find a page.. + return NULL; +} + + +/* ----------------------------------------------------------- + Segment allocation +----------------------------------------------------------- */ + +static mi_segment_t* mi_segment_os_alloc( size_t required, size_t page_alignment, bool eager_delayed, mi_arena_id_t req_arena_id, + size_t* psegment_slices, size_t* ppre_size, size_t* pinfo_slices, + bool commit, mi_segments_tld_t* tld, mi_os_tld_t* os_tld) + +{ + mi_memid_t memid; + bool allow_large = (!eager_delayed && (MI_SECURE == 0)); // only allow large OS pages once we are no longer lazy + size_t align_offset = 0; + size_t alignment = MI_SEGMENT_ALIGN; + + if (page_alignment > 0) { + // mi_assert_internal(huge_page != NULL); + mi_assert_internal(page_alignment >= MI_SEGMENT_ALIGN); + alignment = page_alignment; + const size_t info_size = (*pinfo_slices) * MI_SEGMENT_SLICE_SIZE; + align_offset = _mi_align_up( info_size, MI_SEGMENT_ALIGN ); + const size_t extra = align_offset - info_size; + // recalculate due to potential guard pages + *psegment_slices = mi_segment_calculate_slices(required + extra, ppre_size, pinfo_slices); + } + + const size_t segment_size = (*psegment_slices) * MI_SEGMENT_SLICE_SIZE; + mi_segment_t* segment = (mi_segment_t*)_mi_arena_alloc_aligned(segment_size, alignment, align_offset, commit, allow_large, req_arena_id, &memid, os_tld); + if (segment == NULL) { + return NULL; // failed to allocate + } + + // ensure metadata part of the segment is committed + mi_commit_mask_t commit_mask; + if (memid.initially_committed) { + mi_commit_mask_create_full(&commit_mask); + } + else { + // at least commit the info slices + const size_t commit_needed = _mi_divide_up((*pinfo_slices)*MI_SEGMENT_SLICE_SIZE, MI_COMMIT_SIZE); + mi_assert_internal(commit_needed>0); + mi_commit_mask_create(0, commit_needed, &commit_mask); + mi_assert_internal(commit_needed*MI_COMMIT_SIZE >= (*pinfo_slices)*MI_SEGMENT_SLICE_SIZE); + if (!_mi_os_commit(segment, commit_needed*MI_COMMIT_SIZE, NULL, tld->stats)) { + _mi_arena_free(segment,segment_size,0,memid,tld->stats); + return NULL; + } + } + mi_assert_internal(segment != NULL && (uintptr_t)segment % MI_SEGMENT_SIZE == 0); + + segment->memid = memid; + segment->allow_decommit = !memid.is_pinned; + segment->allow_purge = segment->allow_decommit && (mi_option_get(mi_option_purge_delay) >= 0); + segment->segment_size = segment_size; + segment->commit_mask = commit_mask; + segment->purge_expire = 0; + mi_commit_mask_create_empty(&segment->purge_mask); + mi_atomic_store_ptr_release(mi_segment_t, &segment->abandoned_next, NULL); // tsan + + mi_segments_track_size((long)(segment_size), tld); + _mi_segment_map_allocated_at(segment); + return segment; +} + + +// Allocate a segment from the OS aligned to `MI_SEGMENT_SIZE` . +static mi_segment_t* mi_segment_alloc(size_t required, size_t page_alignment, mi_arena_id_t req_arena_id, mi_segments_tld_t* tld, mi_os_tld_t* os_tld, mi_page_t** huge_page) +{ + mi_assert_internal((required==0 && huge_page==NULL) || (required>0 && huge_page != NULL)); + + // calculate needed sizes first + size_t info_slices; + size_t pre_size; + size_t segment_slices = mi_segment_calculate_slices(required, &pre_size, &info_slices); + + // Commit eagerly only if not the first N lazy segments (to reduce impact of many threads that allocate just a little) + const bool eager_delay = (// !_mi_os_has_overcommit() && // never delay on overcommit systems + _mi_current_thread_count() > 1 && // do not delay for the first N threads + tld->count < (size_t)mi_option_get(mi_option_eager_commit_delay)); + const bool eager = !eager_delay && mi_option_is_enabled(mi_option_eager_commit); + bool commit = eager || (required > 0); + + // Allocate the segment from the OS + mi_segment_t* segment = mi_segment_os_alloc(required, page_alignment, eager_delay, req_arena_id, + &segment_slices, &pre_size, &info_slices, commit, tld, os_tld); + if (segment == NULL) return NULL; + + // zero the segment info? -- not always needed as it may be zero initialized from the OS + if (!segment->memid.initially_zero) { + ptrdiff_t ofs = offsetof(mi_segment_t, next); + size_t prefix = offsetof(mi_segment_t, slices) - ofs; + size_t zsize = prefix + (sizeof(mi_slice_t) * (segment_slices + 1)); // one more + _mi_memzero((uint8_t*)segment + ofs, zsize); + } + + // initialize the rest of the segment info + const size_t slice_entries = (segment_slices > MI_SLICES_PER_SEGMENT ? MI_SLICES_PER_SEGMENT : segment_slices); + segment->segment_slices = segment_slices; + segment->segment_info_slices = info_slices; + segment->thread_id = _mi_thread_id(); + segment->cookie = _mi_ptr_cookie(segment); + segment->slice_entries = slice_entries; + segment->kind = (required == 0 ? MI_SEGMENT_NORMAL : MI_SEGMENT_HUGE); + + // _mi_memzero(segment->slices, sizeof(mi_slice_t)*(info_slices+1)); + _mi_stat_increase(&tld->stats->page_committed, mi_segment_info_size(segment)); + + // set up guard pages + size_t guard_slices = 0; + if (MI_SECURE>0) { + // in secure mode, we set up a protected page in between the segment info + // and the page data, and at the end of the segment. + size_t os_pagesize = _mi_os_page_size(); + mi_assert_internal(mi_segment_info_size(segment) - os_pagesize >= pre_size); + _mi_os_protect((uint8_t*)segment + mi_segment_info_size(segment) - os_pagesize, os_pagesize); + uint8_t* end = (uint8_t*)segment + mi_segment_size(segment) - os_pagesize; + mi_segment_ensure_committed(segment, end, os_pagesize, tld->stats); + _mi_os_protect(end, os_pagesize); + if (slice_entries == segment_slices) segment->slice_entries--; // don't use the last slice :-( + guard_slices = 1; + } + + // reserve first slices for segment info + mi_page_t* page0 = mi_segment_span_allocate(segment, 0, info_slices, tld); + mi_assert_internal(page0!=NULL); if (page0==NULL) return NULL; // cannot fail as we always commit in advance + mi_assert_internal(segment->used == 1); + segment->used = 0; // don't count our internal slices towards usage + + // initialize initial free pages + if (segment->kind == MI_SEGMENT_NORMAL) { // not a huge page + mi_assert_internal(huge_page==NULL); + mi_segment_span_free(segment, info_slices, segment->slice_entries - info_slices, false /* don't purge */, tld); + } + else { + mi_assert_internal(huge_page!=NULL); + mi_assert_internal(mi_commit_mask_is_empty(&segment->purge_mask)); + mi_assert_internal(mi_commit_mask_is_full(&segment->commit_mask)); + *huge_page = mi_segment_span_allocate(segment, info_slices, segment_slices - info_slices - guard_slices, tld); + mi_assert_internal(*huge_page != NULL); // cannot fail as we commit in advance + } + + mi_assert_expensive(mi_segment_is_valid(segment,tld)); + return segment; +} + + +static void mi_segment_free(mi_segment_t* segment, bool force, mi_segments_tld_t* tld) { + MI_UNUSED(force); + mi_assert_internal(segment != NULL); + mi_assert_internal(segment->next == NULL); + mi_assert_internal(segment->used == 0); + + // Remove the free pages + mi_slice_t* slice = &segment->slices[0]; + const mi_slice_t* end = mi_segment_slices_end(segment); + #if MI_DEBUG>1 + size_t page_count = 0; + #endif + while (slice < end) { + mi_assert_internal(slice->slice_count > 0); + mi_assert_internal(slice->slice_offset == 0); + mi_assert_internal(mi_slice_index(slice)==0 || slice->xblock_size == 0); // no more used pages .. + if (slice->xblock_size == 0 && segment->kind != MI_SEGMENT_HUGE) { + mi_segment_span_remove_from_queue(slice, tld); + } + #if MI_DEBUG>1 + page_count++; + #endif + slice = slice + slice->slice_count; + } + mi_assert_internal(page_count == 2); // first page is allocated by the segment itself + + // stats + _mi_stat_decrease(&tld->stats->page_committed, mi_segment_info_size(segment)); + + // return it to the OS + mi_segment_os_free(segment, tld); +} + + +/* ----------------------------------------------------------- + Page Free +----------------------------------------------------------- */ + +static void mi_segment_abandon(mi_segment_t* segment, mi_segments_tld_t* tld); + +// note: can be called on abandoned pages +static mi_slice_t* mi_segment_page_clear(mi_page_t* page, mi_segments_tld_t* tld) { + mi_assert_internal(page->xblock_size > 0); + mi_assert_internal(mi_page_all_free(page)); + mi_segment_t* segment = _mi_ptr_segment(page); + mi_assert_internal(segment->used > 0); + + size_t inuse = page->capacity * mi_page_block_size(page); + _mi_stat_decrease(&tld->stats->page_committed, inuse); + _mi_stat_decrease(&tld->stats->pages, 1); + + // reset the page memory to reduce memory pressure? + if (segment->allow_decommit && mi_option_is_enabled(mi_option_deprecated_page_reset)) { + size_t psize; + uint8_t* start = _mi_page_start(segment, page, &psize); + _mi_os_reset(start, psize, tld->stats); + } + + // zero the page data, but not the segment fields + page->is_zero_init = false; + ptrdiff_t ofs = offsetof(mi_page_t, capacity); + _mi_memzero((uint8_t*)page + ofs, sizeof(*page) - ofs); + page->xblock_size = 1; + + // and free it + mi_slice_t* slice = mi_segment_span_free_coalesce(mi_page_to_slice(page), tld); + segment->used--; + // cannot assert segment valid as it is called during reclaim + // mi_assert_expensive(mi_segment_is_valid(segment, tld)); + return slice; +} + +void _mi_segment_page_free(mi_page_t* page, bool force, mi_segments_tld_t* tld) +{ + mi_assert(page != NULL); + + mi_segment_t* segment = _mi_page_segment(page); + mi_assert_expensive(mi_segment_is_valid(segment,tld)); + + // mark it as free now + mi_segment_page_clear(page, tld); + mi_assert_expensive(mi_segment_is_valid(segment, tld)); + + if (segment->used == 0) { + // no more used pages; remove from the free list and free the segment + mi_segment_free(segment, force, tld); + } + else if (segment->used == segment->abandoned) { + // only abandoned pages; remove from free list and abandon + mi_segment_abandon(segment,tld); + } +} + + +/* ----------------------------------------------------------- +Abandonment + +When threads terminate, they can leave segments with +live blocks (reachable through other threads). Such segments +are "abandoned" and will be reclaimed by other threads to +reuse their pages and/or free them eventually + +We maintain a global list of abandoned segments that are +reclaimed on demand. Since this is shared among threads +the implementation needs to avoid the A-B-A problem on +popping abandoned segments: +We use tagged pointers to avoid accidentally identifying +reused segments, much like stamped references in Java. +Secondly, we maintain a reader counter to avoid resetting +or decommitting segments that have a pending read operation. + +Note: the current implementation is one possible design; +another way might be to keep track of abandoned segments +in the arenas/segment_cache's. This would have the advantage of keeping +all concurrent code in one place and not needing to deal +with ABA issues. The drawback is that it is unclear how to +scan abandoned segments efficiently in that case as they +would be spread among all other segments in the arenas. +----------------------------------------------------------- */ + +// Use the bottom 20-bits (on 64-bit) of the aligned segment pointers +// to put in a tag that increments on update to avoid the A-B-A problem. +#define MI_TAGGED_MASK MI_SEGMENT_MASK +typedef uintptr_t mi_tagged_segment_t; + +static mi_segment_t* mi_tagged_segment_ptr(mi_tagged_segment_t ts) { + return (mi_segment_t*)(ts & ~MI_TAGGED_MASK); +} + +static mi_tagged_segment_t mi_tagged_segment(mi_segment_t* segment, mi_tagged_segment_t ts) { + mi_assert_internal(((uintptr_t)segment & MI_TAGGED_MASK) == 0); + uintptr_t tag = ((ts & MI_TAGGED_MASK) + 1) & MI_TAGGED_MASK; + return ((uintptr_t)segment | tag); +} + +// This is a list of visited abandoned pages that were full at the time. +// this list migrates to `abandoned` when that becomes NULL. The use of +// this list reduces contention and the rate at which segments are visited. +static mi_decl_cache_align _Atomic(mi_segment_t*) abandoned_visited; // = NULL + +// The abandoned page list (tagged as it supports pop) +static mi_decl_cache_align _Atomic(mi_tagged_segment_t) abandoned; // = NULL + +// Maintain these for debug purposes (these counts may be a bit off) +static mi_decl_cache_align _Atomic(size_t) abandoned_count; +static mi_decl_cache_align _Atomic(size_t) abandoned_visited_count; + +// We also maintain a count of current readers of the abandoned list +// in order to prevent resetting/decommitting segment memory if it might +// still be read. +static mi_decl_cache_align _Atomic(size_t) abandoned_readers; // = 0 + +// Push on the visited list +static void mi_abandoned_visited_push(mi_segment_t* segment) { + mi_assert_internal(segment->thread_id == 0); + mi_assert_internal(mi_atomic_load_ptr_relaxed(mi_segment_t,&segment->abandoned_next) == NULL); + mi_assert_internal(segment->next == NULL); + mi_assert_internal(segment->used > 0); + mi_segment_t* anext = mi_atomic_load_ptr_relaxed(mi_segment_t, &abandoned_visited); + do { + mi_atomic_store_ptr_release(mi_segment_t, &segment->abandoned_next, anext); + } while (!mi_atomic_cas_ptr_weak_release(mi_segment_t, &abandoned_visited, &anext, segment)); + mi_atomic_increment_relaxed(&abandoned_visited_count); +} + +// Move the visited list to the abandoned list. +static bool mi_abandoned_visited_revisit(void) +{ + // quick check if the visited list is empty + if (mi_atomic_load_ptr_relaxed(mi_segment_t, &abandoned_visited) == NULL) return false; + + // grab the whole visited list + mi_segment_t* first = mi_atomic_exchange_ptr_acq_rel(mi_segment_t, &abandoned_visited, NULL); + if (first == NULL) return false; + + // first try to swap directly if the abandoned list happens to be NULL + mi_tagged_segment_t afirst; + mi_tagged_segment_t ts = mi_atomic_load_relaxed(&abandoned); + if (mi_tagged_segment_ptr(ts)==NULL) { + size_t count = mi_atomic_load_relaxed(&abandoned_visited_count); + afirst = mi_tagged_segment(first, ts); + if (mi_atomic_cas_strong_acq_rel(&abandoned, &ts, afirst)) { + mi_atomic_add_relaxed(&abandoned_count, count); + mi_atomic_sub_relaxed(&abandoned_visited_count, count); + return true; + } + } + + // find the last element of the visited list: O(n) + mi_segment_t* last = first; + mi_segment_t* next; + while ((next = mi_atomic_load_ptr_relaxed(mi_segment_t, &last->abandoned_next)) != NULL) { + last = next; + } + + // and atomically prepend to the abandoned list + // (no need to increase the readers as we don't access the abandoned segments) + mi_tagged_segment_t anext = mi_atomic_load_relaxed(&abandoned); + size_t count; + do { + count = mi_atomic_load_relaxed(&abandoned_visited_count); + mi_atomic_store_ptr_release(mi_segment_t, &last->abandoned_next, mi_tagged_segment_ptr(anext)); + afirst = mi_tagged_segment(first, anext); + } while (!mi_atomic_cas_weak_release(&abandoned, &anext, afirst)); + mi_atomic_add_relaxed(&abandoned_count, count); + mi_atomic_sub_relaxed(&abandoned_visited_count, count); + return true; +} + +// Push on the abandoned list. +static void mi_abandoned_push(mi_segment_t* segment) { + mi_assert_internal(segment->thread_id == 0); + mi_assert_internal(mi_atomic_load_ptr_relaxed(mi_segment_t, &segment->abandoned_next) == NULL); + mi_assert_internal(segment->next == NULL); + mi_assert_internal(segment->used > 0); + mi_tagged_segment_t next; + mi_tagged_segment_t ts = mi_atomic_load_relaxed(&abandoned); + do { + mi_atomic_store_ptr_release(mi_segment_t, &segment->abandoned_next, mi_tagged_segment_ptr(ts)); + next = mi_tagged_segment(segment, ts); + } while (!mi_atomic_cas_weak_release(&abandoned, &ts, next)); + mi_atomic_increment_relaxed(&abandoned_count); +} + +// Wait until there are no more pending reads on segments that used to be in the abandoned list +// called for example from `arena.c` before decommitting +void _mi_abandoned_await_readers(void) { + size_t n; + do { + n = mi_atomic_load_acquire(&abandoned_readers); + if (n != 0) mi_atomic_yield(); + } while (n != 0); +} + +// Pop from the abandoned list +static mi_segment_t* mi_abandoned_pop(void) { + mi_segment_t* segment; + // Check efficiently if it is empty (or if the visited list needs to be moved) + mi_tagged_segment_t ts = mi_atomic_load_relaxed(&abandoned); + segment = mi_tagged_segment_ptr(ts); + if mi_likely(segment == NULL) { + if mi_likely(!mi_abandoned_visited_revisit()) { // try to swap in the visited list on NULL + return NULL; + } + } + + // Do a pop. We use a reader count to prevent + // a segment to be decommitted while a read is still pending, + // and a tagged pointer to prevent A-B-A link corruption. + // (this is called from `region.c:_mi_mem_free` for example) + mi_atomic_increment_relaxed(&abandoned_readers); // ensure no segment gets decommitted + mi_tagged_segment_t next = 0; + ts = mi_atomic_load_acquire(&abandoned); + do { + segment = mi_tagged_segment_ptr(ts); + if (segment != NULL) { + mi_segment_t* anext = mi_atomic_load_ptr_relaxed(mi_segment_t, &segment->abandoned_next); + next = mi_tagged_segment(anext, ts); // note: reads the segment's `abandoned_next` field so should not be decommitted + } + } while (segment != NULL && !mi_atomic_cas_weak_acq_rel(&abandoned, &ts, next)); + mi_atomic_decrement_relaxed(&abandoned_readers); // release reader lock + if (segment != NULL) { + mi_atomic_store_ptr_release(mi_segment_t, &segment->abandoned_next, NULL); + mi_atomic_decrement_relaxed(&abandoned_count); + } + return segment; +} + +/* ----------------------------------------------------------- + Abandon segment/page +----------------------------------------------------------- */ + +static void mi_segment_abandon(mi_segment_t* segment, mi_segments_tld_t* tld) { + mi_assert_internal(segment->used == segment->abandoned); + mi_assert_internal(segment->used > 0); + mi_assert_internal(mi_atomic_load_ptr_relaxed(mi_segment_t, &segment->abandoned_next) == NULL); + mi_assert_internal(segment->abandoned_visits == 0); + mi_assert_expensive(mi_segment_is_valid(segment,tld)); + + // remove the free pages from the free page queues + mi_slice_t* slice = &segment->slices[0]; + const mi_slice_t* end = mi_segment_slices_end(segment); + while (slice < end) { + mi_assert_internal(slice->slice_count > 0); + mi_assert_internal(slice->slice_offset == 0); + if (slice->xblock_size == 0) { // a free page + mi_segment_span_remove_from_queue(slice,tld); + slice->xblock_size = 0; // but keep it free + } + slice = slice + slice->slice_count; + } + + // perform delayed decommits (forcing is much slower on mstress) + mi_segment_try_purge(segment, mi_option_is_enabled(mi_option_abandoned_page_purge) /* force? */, tld->stats); + + // all pages in the segment are abandoned; add it to the abandoned list + _mi_stat_increase(&tld->stats->segments_abandoned, 1); + mi_segments_track_size(-((long)mi_segment_size(segment)), tld); + segment->thread_id = 0; + mi_atomic_store_ptr_release(mi_segment_t, &segment->abandoned_next, NULL); + segment->abandoned_visits = 1; // from 0 to 1 to signify it is abandoned + mi_abandoned_push(segment); +} + +void _mi_segment_page_abandon(mi_page_t* page, mi_segments_tld_t* tld) { + mi_assert(page != NULL); + mi_assert_internal(mi_page_thread_free_flag(page)==MI_NEVER_DELAYED_FREE); + mi_assert_internal(mi_page_heap(page) == NULL); + mi_segment_t* segment = _mi_page_segment(page); + + mi_assert_expensive(mi_segment_is_valid(segment,tld)); + segment->abandoned++; + + _mi_stat_increase(&tld->stats->pages_abandoned, 1); + mi_assert_internal(segment->abandoned <= segment->used); + if (segment->used == segment->abandoned) { + // all pages are abandoned, abandon the entire segment + mi_segment_abandon(segment, tld); + } +} + +/* ----------------------------------------------------------- + Reclaim abandoned pages +----------------------------------------------------------- */ + +static mi_slice_t* mi_slices_start_iterate(mi_segment_t* segment, const mi_slice_t** end) { + mi_slice_t* slice = &segment->slices[0]; + *end = mi_segment_slices_end(segment); + mi_assert_internal(slice->slice_count>0 && slice->xblock_size>0); // segment allocated page + slice = slice + slice->slice_count; // skip the first segment allocated page + return slice; +} + +// Possibly free pages and check if free space is available +static bool mi_segment_check_free(mi_segment_t* segment, size_t slices_needed, size_t block_size, mi_segments_tld_t* tld) +{ + mi_assert_internal(block_size < MI_HUGE_BLOCK_SIZE); + mi_assert_internal(mi_segment_is_abandoned(segment)); + bool has_page = false; + + // for all slices + const mi_slice_t* end; + mi_slice_t* slice = mi_slices_start_iterate(segment, &end); + while (slice < end) { + mi_assert_internal(slice->slice_count > 0); + mi_assert_internal(slice->slice_offset == 0); + if (mi_slice_is_used(slice)) { // used page + // ensure used count is up to date and collect potential concurrent frees + mi_page_t* const page = mi_slice_to_page(slice); + _mi_page_free_collect(page, false); + if (mi_page_all_free(page)) { + // if this page is all free now, free it without adding to any queues (yet) + mi_assert_internal(page->next == NULL && page->prev==NULL); + _mi_stat_decrease(&tld->stats->pages_abandoned, 1); + segment->abandoned--; + slice = mi_segment_page_clear(page, tld); // re-assign slice due to coalesce! + mi_assert_internal(!mi_slice_is_used(slice)); + if (slice->slice_count >= slices_needed) { + has_page = true; + } + } + else { + if (page->xblock_size == block_size && mi_page_has_any_available(page)) { + // a page has available free blocks of the right size + has_page = true; + } + } + } + else { + // empty span + if (slice->slice_count >= slices_needed) { + has_page = true; + } + } + slice = slice + slice->slice_count; + } + return has_page; +} + +// Reclaim an abandoned segment; returns NULL if the segment was freed +// set `right_page_reclaimed` to `true` if it reclaimed a page of the right `block_size` that was not full. +static mi_segment_t* mi_segment_reclaim(mi_segment_t* segment, mi_heap_t* heap, size_t requested_block_size, bool* right_page_reclaimed, mi_segments_tld_t* tld) { + mi_assert_internal(mi_atomic_load_ptr_relaxed(mi_segment_t, &segment->abandoned_next) == NULL); + mi_assert_expensive(mi_segment_is_valid(segment, tld)); + if (right_page_reclaimed != NULL) { *right_page_reclaimed = false; } + + segment->thread_id = _mi_thread_id(); + segment->abandoned_visits = 0; + mi_segments_track_size((long)mi_segment_size(segment), tld); + mi_assert_internal(segment->next == NULL); + _mi_stat_decrease(&tld->stats->segments_abandoned, 1); + + // for all slices + const mi_slice_t* end; + mi_slice_t* slice = mi_slices_start_iterate(segment, &end); + while (slice < end) { + mi_assert_internal(slice->slice_count > 0); + mi_assert_internal(slice->slice_offset == 0); + if (mi_slice_is_used(slice)) { + // in use: reclaim the page in our heap + mi_page_t* page = mi_slice_to_page(slice); + mi_assert_internal(page->is_committed); + mi_assert_internal(mi_page_thread_free_flag(page)==MI_NEVER_DELAYED_FREE); + mi_assert_internal(mi_page_heap(page) == NULL); + mi_assert_internal(page->next == NULL && page->prev==NULL); + _mi_stat_decrease(&tld->stats->pages_abandoned, 1); + segment->abandoned--; + // set the heap again and allow delayed free again + mi_page_set_heap(page, heap); + _mi_page_use_delayed_free(page, MI_USE_DELAYED_FREE, true); // override never (after heap is set) + _mi_page_free_collect(page, false); // ensure used count is up to date + if (mi_page_all_free(page)) { + // if everything free by now, free the page + slice = mi_segment_page_clear(page, tld); // set slice again due to coalesceing + } + else { + // otherwise reclaim it into the heap + _mi_page_reclaim(heap, page); + if (requested_block_size == page->xblock_size && mi_page_has_any_available(page)) { + if (right_page_reclaimed != NULL) { *right_page_reclaimed = true; } + } + } + } + else { + // the span is free, add it to our page queues + slice = mi_segment_span_free_coalesce(slice, tld); // set slice again due to coalesceing + } + mi_assert_internal(slice->slice_count>0 && slice->slice_offset==0); + slice = slice + slice->slice_count; + } + + mi_assert(segment->abandoned == 0); + if (segment->used == 0) { // due to page_clear + mi_assert_internal(right_page_reclaimed == NULL || !(*right_page_reclaimed)); + mi_segment_free(segment, false, tld); + return NULL; + } + else { + return segment; + } +} + + +void _mi_abandoned_reclaim_all(mi_heap_t* heap, mi_segments_tld_t* tld) { + mi_segment_t* segment; + while ((segment = mi_abandoned_pop()) != NULL) { + mi_segment_reclaim(segment, heap, 0, NULL, tld); + } +} + +static mi_segment_t* mi_segment_try_reclaim(mi_heap_t* heap, size_t needed_slices, size_t block_size, bool* reclaimed, mi_segments_tld_t* tld) +{ + *reclaimed = false; + mi_segment_t* segment; + long max_tries = mi_option_get_clamp(mi_option_max_segment_reclaim, 8, 1024); // limit the work to bound allocation times + while ((max_tries-- > 0) && ((segment = mi_abandoned_pop()) != NULL)) { + segment->abandoned_visits++; + // todo: an arena exclusive heap will potentially visit many abandoned unsuitable segments + // and push them into the visited list and use many tries. Perhaps we can skip non-suitable ones in a better way? + bool is_suitable = _mi_heap_memid_is_suitable(heap, segment->memid); + bool has_page = mi_segment_check_free(segment,needed_slices,block_size,tld); // try to free up pages (due to concurrent frees) + if (segment->used == 0) { + // free the segment (by forced reclaim) to make it available to other threads. + // note1: we prefer to free a segment as that might lead to reclaiming another + // segment that is still partially used. + // note2: we could in principle optimize this by skipping reclaim and directly + // freeing but that would violate some invariants temporarily) + mi_segment_reclaim(segment, heap, 0, NULL, tld); + } + else if (has_page && is_suitable) { + // found a large enough free span, or a page of the right block_size with free space + // we return the result of reclaim (which is usually `segment`) as it might free + // the segment due to concurrent frees (in which case `NULL` is returned). + return mi_segment_reclaim(segment, heap, block_size, reclaimed, tld); + } + else if (segment->abandoned_visits > 3 && is_suitable) { + // always reclaim on 3rd visit to limit the abandoned queue length. + mi_segment_reclaim(segment, heap, 0, NULL, tld); + } + else { + // otherwise, push on the visited list so it gets not looked at too quickly again + mi_segment_try_purge(segment, true /* force? */, tld->stats); // force purge if needed as we may not visit soon again + mi_abandoned_visited_push(segment); + } + } + return NULL; +} + + +void _mi_abandoned_collect(mi_heap_t* heap, bool force, mi_segments_tld_t* tld) +{ + mi_segment_t* segment; + int max_tries = (force ? 16*1024 : 1024); // limit latency + if (force) { + mi_abandoned_visited_revisit(); + } + while ((max_tries-- > 0) && ((segment = mi_abandoned_pop()) != NULL)) { + mi_segment_check_free(segment,0,0,tld); // try to free up pages (due to concurrent frees) + if (segment->used == 0) { + // free the segment (by forced reclaim) to make it available to other threads. + // note: we could in principle optimize this by skipping reclaim and directly + // freeing but that would violate some invariants temporarily) + mi_segment_reclaim(segment, heap, 0, NULL, tld); + } + else { + // otherwise, purge if needed and push on the visited list + // note: forced purge can be expensive if many threads are destroyed/created as in mstress. + mi_segment_try_purge(segment, force, tld->stats); + mi_abandoned_visited_push(segment); + } + } +} + +/* ----------------------------------------------------------- + Reclaim or allocate +----------------------------------------------------------- */ + +static mi_segment_t* mi_segment_reclaim_or_alloc(mi_heap_t* heap, size_t needed_slices, size_t block_size, mi_segments_tld_t* tld, mi_os_tld_t* os_tld) +{ + mi_assert_internal(block_size < MI_HUGE_BLOCK_SIZE); + mi_assert_internal(block_size <= MI_LARGE_OBJ_SIZE_MAX); + + // 1. try to reclaim an abandoned segment + bool reclaimed; + mi_segment_t* segment = mi_segment_try_reclaim(heap, needed_slices, block_size, &reclaimed, tld); + if (reclaimed) { + // reclaimed the right page right into the heap + mi_assert_internal(segment != NULL); + return NULL; // pretend out-of-memory as the page will be in the page queue of the heap with available blocks + } + else if (segment != NULL) { + // reclaimed a segment with a large enough empty span in it + return segment; + } + // 2. otherwise allocate a fresh segment + return mi_segment_alloc(0, 0, heap->arena_id, tld, os_tld, NULL); +} + + +/* ----------------------------------------------------------- + Page allocation +----------------------------------------------------------- */ + +static mi_page_t* mi_segments_page_alloc(mi_heap_t* heap, mi_page_kind_t page_kind, size_t required, size_t block_size, mi_segments_tld_t* tld, mi_os_tld_t* os_tld) +{ + mi_assert_internal(required <= MI_LARGE_OBJ_SIZE_MAX && page_kind <= MI_PAGE_LARGE); + + // find a free page + size_t page_size = _mi_align_up(required, (required > MI_MEDIUM_PAGE_SIZE ? MI_MEDIUM_PAGE_SIZE : MI_SEGMENT_SLICE_SIZE)); + size_t slices_needed = page_size / MI_SEGMENT_SLICE_SIZE; + mi_assert_internal(slices_needed * MI_SEGMENT_SLICE_SIZE == page_size); + mi_page_t* page = mi_segments_page_find_and_allocate(slices_needed, heap->arena_id, tld); //(required <= MI_SMALL_SIZE_MAX ? 0 : slices_needed), tld); + if (page==NULL) { + // no free page, allocate a new segment and try again + if (mi_segment_reclaim_or_alloc(heap, slices_needed, block_size, tld, os_tld) == NULL) { + // OOM or reclaimed a good page in the heap + return NULL; + } + else { + // otherwise try again + return mi_segments_page_alloc(heap, page_kind, required, block_size, tld, os_tld); + } + } + mi_assert_internal(page != NULL && page->slice_count*MI_SEGMENT_SLICE_SIZE == page_size); + mi_assert_internal(_mi_ptr_segment(page)->thread_id == _mi_thread_id()); + mi_segment_try_purge(_mi_ptr_segment(page), false, tld->stats); + return page; +} + + + +/* ----------------------------------------------------------- + Huge page allocation +----------------------------------------------------------- */ + +static mi_page_t* mi_segment_huge_page_alloc(size_t size, size_t page_alignment, mi_arena_id_t req_arena_id, mi_segments_tld_t* tld, mi_os_tld_t* os_tld) +{ + mi_page_t* page = NULL; + mi_segment_t* segment = mi_segment_alloc(size,page_alignment,req_arena_id,tld,os_tld,&page); + if (segment == NULL || page==NULL) return NULL; + mi_assert_internal(segment->used==1); + mi_assert_internal(mi_page_block_size(page) >= size); + #if MI_HUGE_PAGE_ABANDON + segment->thread_id = 0; // huge segments are immediately abandoned + #endif + + // for huge pages we initialize the xblock_size as we may + // overallocate to accommodate large alignments. + size_t psize; + uint8_t* start = _mi_segment_page_start(segment, page, &psize); + page->xblock_size = (psize > MI_HUGE_BLOCK_SIZE ? MI_HUGE_BLOCK_SIZE : (uint32_t)psize); + + // decommit the part of the prefix of a page that will not be used; this can be quite large (close to MI_SEGMENT_SIZE) + if (page_alignment > 0 && segment->allow_decommit) { + uint8_t* aligned_p = (uint8_t*)_mi_align_up((uintptr_t)start, page_alignment); + mi_assert_internal(_mi_is_aligned(aligned_p, page_alignment)); + mi_assert_internal(psize - (aligned_p - start) >= size); + uint8_t* decommit_start = start + sizeof(mi_block_t); // for the free list + ptrdiff_t decommit_size = aligned_p - decommit_start; + _mi_os_reset(decommit_start, decommit_size, &_mi_stats_main); // note: cannot use segment_decommit on huge segments + } + + return page; +} + +#if MI_HUGE_PAGE_ABANDON +// free huge block from another thread +void _mi_segment_huge_page_free(mi_segment_t* segment, mi_page_t* page, mi_block_t* block) { + // huge page segments are always abandoned and can be freed immediately by any thread + mi_assert_internal(segment->kind==MI_SEGMENT_HUGE); + mi_assert_internal(segment == _mi_page_segment(page)); + mi_assert_internal(mi_atomic_load_relaxed(&segment->thread_id)==0); + + // claim it and free + mi_heap_t* heap = mi_heap_get_default(); // issue #221; don't use the internal get_default_heap as we need to ensure the thread is initialized. + // paranoia: if this it the last reference, the cas should always succeed + size_t expected_tid = 0; + if (mi_atomic_cas_strong_acq_rel(&segment->thread_id, &expected_tid, heap->thread_id)) { + mi_block_set_next(page, block, page->free); + page->free = block; + page->used--; + page->is_zero = false; + mi_assert(page->used == 0); + mi_tld_t* tld = heap->tld; + _mi_segment_page_free(page, true, &tld->segments); + } +#if (MI_DEBUG!=0) + else { + mi_assert_internal(false); + } +#endif +} + +#else +// reset memory of a huge block from another thread +void _mi_segment_huge_page_reset(mi_segment_t* segment, mi_page_t* page, mi_block_t* block) { + MI_UNUSED(page); + mi_assert_internal(segment->kind == MI_SEGMENT_HUGE); + mi_assert_internal(segment == _mi_page_segment(page)); + mi_assert_internal(page->used == 1); // this is called just before the free + mi_assert_internal(page->free == NULL); + if (segment->allow_decommit) { + size_t csize = mi_usable_size(block); + if (csize > sizeof(mi_block_t)) { + csize = csize - sizeof(mi_block_t); + uint8_t* p = (uint8_t*)block + sizeof(mi_block_t); + _mi_os_reset(p, csize, &_mi_stats_main); // note: cannot use segment_decommit on huge segments + } + } +} +#endif + +/* ----------------------------------------------------------- + Page allocation and free +----------------------------------------------------------- */ +mi_page_t* _mi_segment_page_alloc(mi_heap_t* heap, size_t block_size, size_t page_alignment, mi_segments_tld_t* tld, mi_os_tld_t* os_tld) { + mi_page_t* page; + if mi_unlikely(page_alignment > MI_ALIGNMENT_MAX) { + mi_assert_internal(_mi_is_power_of_two(page_alignment)); + mi_assert_internal(page_alignment >= MI_SEGMENT_SIZE); + if (page_alignment < MI_SEGMENT_SIZE) { page_alignment = MI_SEGMENT_SIZE; } + page = mi_segment_huge_page_alloc(block_size,page_alignment,heap->arena_id,tld,os_tld); + } + else if (block_size <= MI_SMALL_OBJ_SIZE_MAX) { + page = mi_segments_page_alloc(heap,MI_PAGE_SMALL,block_size,block_size,tld,os_tld); + } + else if (block_size <= MI_MEDIUM_OBJ_SIZE_MAX) { + page = mi_segments_page_alloc(heap,MI_PAGE_MEDIUM,MI_MEDIUM_PAGE_SIZE,block_size,tld, os_tld); + } + else if (block_size <= MI_LARGE_OBJ_SIZE_MAX) { + page = mi_segments_page_alloc(heap,MI_PAGE_LARGE,block_size,block_size,tld, os_tld); + } + else { + page = mi_segment_huge_page_alloc(block_size,page_alignment,heap->arena_id,tld,os_tld); + } + mi_assert_internal(page == NULL || _mi_heap_memid_is_suitable(heap, _mi_page_segment(page)->memid)); + mi_assert_expensive(page == NULL || mi_segment_is_valid(_mi_page_segment(page),tld)); + return page; +} diff --git a/compat/mimalloc/stats.c b/compat/mimalloc/stats.c new file mode 100644 index 00000000000000..6817e07aa1ee9f --- /dev/null +++ b/compat/mimalloc/stats.c @@ -0,0 +1,467 @@ +/* ---------------------------------------------------------------------------- +Copyright (c) 2018-2021, Microsoft Research, Daan Leijen +This is free software; you can redistribute it and/or modify it under the +terms of the MIT license. A copy of the license can be found in the file +"LICENSE" at the root of this distribution. +-----------------------------------------------------------------------------*/ +#include "mimalloc.h" +#include "mimalloc/internal.h" +#include "mimalloc/atomic.h" +#include "mimalloc/prim.h" + +#include // snprintf +#include // memset + +#if defined(_MSC_VER) && (_MSC_VER < 1920) +#pragma warning(disable:4204) // non-constant aggregate initializer +#endif + +/* ----------------------------------------------------------- + Statistics operations +----------------------------------------------------------- */ + +static bool mi_is_in_main(void* stat) { + return ((uint8_t*)stat >= (uint8_t*)&_mi_stats_main + && (uint8_t*)stat < ((uint8_t*)&_mi_stats_main + sizeof(mi_stats_t))); +} + +static void mi_stat_update(mi_stat_count_t* stat, int64_t amount) { + if (amount == 0) return; + if (mi_is_in_main(stat)) + { + // add atomically (for abandoned pages) + int64_t current = mi_atomic_addi64_relaxed(&stat->current, amount); + mi_atomic_maxi64_relaxed(&stat->peak, current + amount); + if (amount > 0) { + mi_atomic_addi64_relaxed(&stat->allocated,amount); + } + else { + mi_atomic_addi64_relaxed(&stat->freed, -amount); + } + } + else { + // add thread local + stat->current += amount; + if (stat->current > stat->peak) stat->peak = stat->current; + if (amount > 0) { + stat->allocated += amount; + } + else { + stat->freed += -amount; + } + } +} + +void _mi_stat_counter_increase(mi_stat_counter_t* stat, size_t amount) { + if (mi_is_in_main(stat)) { + mi_atomic_addi64_relaxed( &stat->count, 1 ); + mi_atomic_addi64_relaxed( &stat->total, (int64_t)amount ); + } + else { + stat->count++; + stat->total += amount; + } +} + +void _mi_stat_increase(mi_stat_count_t* stat, size_t amount) { + mi_stat_update(stat, (int64_t)amount); +} + +void _mi_stat_decrease(mi_stat_count_t* stat, size_t amount) { + mi_stat_update(stat, -((int64_t)amount)); +} + +// must be thread safe as it is called from stats_merge +static void mi_stat_add(mi_stat_count_t* stat, const mi_stat_count_t* src, int64_t unit) { + if (stat==src) return; + if (src->allocated==0 && src->freed==0) return; + mi_atomic_addi64_relaxed( &stat->allocated, src->allocated * unit); + mi_atomic_addi64_relaxed( &stat->current, src->current * unit); + mi_atomic_addi64_relaxed( &stat->freed, src->freed * unit); + // peak scores do not work across threads.. + mi_atomic_addi64_relaxed( &stat->peak, src->peak * unit); +} + +static void mi_stat_counter_add(mi_stat_counter_t* stat, const mi_stat_counter_t* src, int64_t unit) { + if (stat==src) return; + mi_atomic_addi64_relaxed( &stat->total, src->total * unit); + mi_atomic_addi64_relaxed( &stat->count, src->count * unit); +} + +// must be thread safe as it is called from stats_merge +static void mi_stats_add(mi_stats_t* stats, const mi_stats_t* src) { + if (stats==src) return; + mi_stat_add(&stats->segments, &src->segments,1); + mi_stat_add(&stats->pages, &src->pages,1); + mi_stat_add(&stats->reserved, &src->reserved, 1); + mi_stat_add(&stats->committed, &src->committed, 1); + mi_stat_add(&stats->reset, &src->reset, 1); + mi_stat_add(&stats->purged, &src->purged, 1); + mi_stat_add(&stats->page_committed, &src->page_committed, 1); + + mi_stat_add(&stats->pages_abandoned, &src->pages_abandoned, 1); + mi_stat_add(&stats->segments_abandoned, &src->segments_abandoned, 1); + mi_stat_add(&stats->threads, &src->threads, 1); + + mi_stat_add(&stats->malloc, &src->malloc, 1); + mi_stat_add(&stats->segments_cache, &src->segments_cache, 1); + mi_stat_add(&stats->normal, &src->normal, 1); + mi_stat_add(&stats->huge, &src->huge, 1); + mi_stat_add(&stats->large, &src->large, 1); + + mi_stat_counter_add(&stats->pages_extended, &src->pages_extended, 1); + mi_stat_counter_add(&stats->mmap_calls, &src->mmap_calls, 1); + mi_stat_counter_add(&stats->commit_calls, &src->commit_calls, 1); + mi_stat_counter_add(&stats->reset_calls, &src->reset_calls, 1); + mi_stat_counter_add(&stats->purge_calls, &src->purge_calls, 1); + + mi_stat_counter_add(&stats->page_no_retire, &src->page_no_retire, 1); + mi_stat_counter_add(&stats->searches, &src->searches, 1); + mi_stat_counter_add(&stats->normal_count, &src->normal_count, 1); + mi_stat_counter_add(&stats->huge_count, &src->huge_count, 1); + mi_stat_counter_add(&stats->large_count, &src->large_count, 1); +#if MI_STAT>1 + for (size_t i = 0; i <= MI_BIN_HUGE; i++) { + if (src->normal_bins[i].allocated > 0 || src->normal_bins[i].freed > 0) { + mi_stat_add(&stats->normal_bins[i], &src->normal_bins[i], 1); + } + } +#endif +} + +/* ----------------------------------------------------------- + Display statistics +----------------------------------------------------------- */ + +// unit > 0 : size in binary bytes +// unit == 0: count as decimal +// unit < 0 : count in binary +static void mi_printf_amount(int64_t n, int64_t unit, mi_output_fun* out, void* arg, const char* fmt) { + char buf[32]; buf[0] = 0; + int len = 32; + const char* suffix = (unit <= 0 ? " " : "B"); + const int64_t base = (unit == 0 ? 1000 : 1024); + if (unit>0) n *= unit; + + const int64_t pos = (n < 0 ? -n : n); + if (pos < base) { + if (n!=1 || suffix[0] != 'B') { // skip printing 1 B for the unit column + snprintf(buf, len, "%d %-3s", (int)n, (n==0 ? "" : suffix)); + } + } + else { + int64_t divider = base; + const char* magnitude = "K"; + if (pos >= divider*base) { divider *= base; magnitude = "M"; } + if (pos >= divider*base) { divider *= base; magnitude = "G"; } + const int64_t tens = (n / (divider/10)); + const long whole = (long)(tens/10); + const long frac1 = (long)(tens%10); + char unitdesc[8]; + snprintf(unitdesc, 8, "%s%s%s", magnitude, (base==1024 ? "i" : ""), suffix); + snprintf(buf, len, "%ld.%ld %-3s", whole, (frac1 < 0 ? -frac1 : frac1), unitdesc); + } + _mi_fprintf(out, arg, (fmt==NULL ? "%12s" : fmt), buf); +} + + +static void mi_print_amount(int64_t n, int64_t unit, mi_output_fun* out, void* arg) { + mi_printf_amount(n,unit,out,arg,NULL); +} + +static void mi_print_count(int64_t n, int64_t unit, mi_output_fun* out, void* arg) { + if (unit==1) _mi_fprintf(out, arg, "%12s"," "); + else mi_print_amount(n,0,out,arg); +} + +static void mi_stat_print_ex(const mi_stat_count_t* stat, const char* msg, int64_t unit, mi_output_fun* out, void* arg, const char* notok ) { + _mi_fprintf(out, arg,"%10s:", msg); + if (unit > 0) { + mi_print_amount(stat->peak, unit, out, arg); + mi_print_amount(stat->allocated, unit, out, arg); + mi_print_amount(stat->freed, unit, out, arg); + mi_print_amount(stat->current, unit, out, arg); + mi_print_amount(unit, 1, out, arg); + mi_print_count(stat->allocated, unit, out, arg); + if (stat->allocated > stat->freed) { + _mi_fprintf(out, arg, " "); + _mi_fprintf(out, arg, (notok == NULL ? "not all freed" : notok)); + _mi_fprintf(out, arg, "\n"); + } + else { + _mi_fprintf(out, arg, " ok\n"); + } + } + else if (unit<0) { + mi_print_amount(stat->peak, -1, out, arg); + mi_print_amount(stat->allocated, -1, out, arg); + mi_print_amount(stat->freed, -1, out, arg); + mi_print_amount(stat->current, -1, out, arg); + if (unit==-1) { + _mi_fprintf(out, arg, "%24s", ""); + } + else { + mi_print_amount(-unit, 1, out, arg); + mi_print_count((stat->allocated / -unit), 0, out, arg); + } + if (stat->allocated > stat->freed) + _mi_fprintf(out, arg, " not all freed!\n"); + else + _mi_fprintf(out, arg, " ok\n"); + } + else { + mi_print_amount(stat->peak, 1, out, arg); + mi_print_amount(stat->allocated, 1, out, arg); + _mi_fprintf(out, arg, "%11s", " "); // no freed + mi_print_amount(stat->current, 1, out, arg); + _mi_fprintf(out, arg, "\n"); + } +} + +static void mi_stat_print(const mi_stat_count_t* stat, const char* msg, int64_t unit, mi_output_fun* out, void* arg) { + mi_stat_print_ex(stat, msg, unit, out, arg, NULL); +} + +static void mi_stat_peak_print(const mi_stat_count_t* stat, const char* msg, int64_t unit, mi_output_fun* out, void* arg) { + _mi_fprintf(out, arg, "%10s:", msg); + mi_print_amount(stat->peak, unit, out, arg); + _mi_fprintf(out, arg, "\n"); +} + +static void mi_stat_counter_print(const mi_stat_counter_t* stat, const char* msg, mi_output_fun* out, void* arg ) { + _mi_fprintf(out, arg, "%10s:", msg); + mi_print_amount(stat->total, -1, out, arg); + _mi_fprintf(out, arg, "\n"); +} + + +static void mi_stat_counter_print_avg(const mi_stat_counter_t* stat, const char* msg, mi_output_fun* out, void* arg) { + const int64_t avg_tens = (stat->count == 0 ? 0 : (stat->total*10 / stat->count)); + const long avg_whole = (long)(avg_tens/10); + const long avg_frac1 = (long)(avg_tens%10); + _mi_fprintf(out, arg, "%10s: %5ld.%ld avg\n", msg, avg_whole, avg_frac1); +} + + +static void mi_print_header(mi_output_fun* out, void* arg ) { + _mi_fprintf(out, arg, "%10s: %11s %11s %11s %11s %11s %11s\n", "heap stats", "peak ", "total ", "freed ", "current ", "unit ", "count "); +} + +#if MI_STAT>1 +static void mi_stats_print_bins(const mi_stat_count_t* bins, size_t max, const char* fmt, mi_output_fun* out, void* arg) { + bool found = false; + char buf[64]; + for (size_t i = 0; i <= max; i++) { + if (bins[i].allocated > 0) { + found = true; + int64_t unit = _mi_bin_size((uint8_t)i); + snprintf(buf, 64, "%s %3lu", fmt, (long)i); + mi_stat_print(&bins[i], buf, unit, out, arg); + } + } + if (found) { + _mi_fprintf(out, arg, "\n"); + mi_print_header(out, arg); + } +} +#endif + + + +//------------------------------------------------------------ +// Use an output wrapper for line-buffered output +// (which is nice when using loggers etc.) +//------------------------------------------------------------ +typedef struct buffered_s { + mi_output_fun* out; // original output function + void* arg; // and state + char* buf; // local buffer of at least size `count+1` + size_t used; // currently used chars `used <= count` + size_t count; // total chars available for output +} buffered_t; + +static void mi_buffered_flush(buffered_t* buf) { + buf->buf[buf->used] = 0; + _mi_fputs(buf->out, buf->arg, NULL, buf->buf); + buf->used = 0; +} + +static void mi_cdecl mi_buffered_out(const char* msg, void* arg) { + buffered_t* buf = (buffered_t*)arg; + if (msg==NULL || buf==NULL) return; + for (const char* src = msg; *src != 0; src++) { + char c = *src; + if (buf->used >= buf->count) mi_buffered_flush(buf); + mi_assert_internal(buf->used < buf->count); + buf->buf[buf->used++] = c; + if (c == '\n') mi_buffered_flush(buf); + } +} + +//------------------------------------------------------------ +// Print statistics +//------------------------------------------------------------ + +static void _mi_stats_print(mi_stats_t* stats, mi_output_fun* out0, void* arg0) mi_attr_noexcept { + // wrap the output function to be line buffered + char buf[256]; + buffered_t buffer = { out0, arg0, NULL, 0, 255 }; + buffer.buf = buf; + mi_output_fun* out = &mi_buffered_out; + void* arg = &buffer; + + // and print using that + mi_print_header(out,arg); + #if MI_STAT>1 + mi_stats_print_bins(stats->normal_bins, MI_BIN_HUGE, "normal",out,arg); + #endif + #if MI_STAT + mi_stat_print(&stats->normal, "normal", (stats->normal_count.count == 0 ? 1 : -(stats->normal.allocated / stats->normal_count.count)), out, arg); + mi_stat_print(&stats->large, "large", (stats->large_count.count == 0 ? 1 : -(stats->large.allocated / stats->large_count.count)), out, arg); + mi_stat_print(&stats->huge, "huge", (stats->huge_count.count == 0 ? 1 : -(stats->huge.allocated / stats->huge_count.count)), out, arg); + mi_stat_count_t total = { 0,0,0,0 }; + mi_stat_add(&total, &stats->normal, 1); + mi_stat_add(&total, &stats->large, 1); + mi_stat_add(&total, &stats->huge, 1); + mi_stat_print(&total, "total", 1, out, arg); + #endif + #if MI_STAT>1 + mi_stat_print(&stats->malloc, "malloc req", 1, out, arg); + _mi_fprintf(out, arg, "\n"); + #endif + mi_stat_print_ex(&stats->reserved, "reserved", 1, out, arg, ""); + mi_stat_print_ex(&stats->committed, "committed", 1, out, arg, ""); + mi_stat_peak_print(&stats->reset, "reset", 1, out, arg ); + mi_stat_peak_print(&stats->purged, "purged", 1, out, arg ); + mi_stat_print(&stats->page_committed, "touched", 1, out, arg); + mi_stat_print(&stats->segments, "segments", -1, out, arg); + mi_stat_print(&stats->segments_abandoned, "-abandoned", -1, out, arg); + mi_stat_print(&stats->segments_cache, "-cached", -1, out, arg); + mi_stat_print(&stats->pages, "pages", -1, out, arg); + mi_stat_print(&stats->pages_abandoned, "-abandoned", -1, out, arg); + mi_stat_counter_print(&stats->pages_extended, "-extended", out, arg); + mi_stat_counter_print(&stats->page_no_retire, "-noretire", out, arg); + mi_stat_counter_print(&stats->mmap_calls, "mmaps", out, arg); + mi_stat_counter_print(&stats->commit_calls, "commits", out, arg); + mi_stat_counter_print(&stats->reset_calls, "resets", out, arg); + mi_stat_counter_print(&stats->purge_calls, "purges", out, arg); + mi_stat_print(&stats->threads, "threads", -1, out, arg); + mi_stat_counter_print_avg(&stats->searches, "searches", out, arg); + _mi_fprintf(out, arg, "%10s: %5zu\n", "numa nodes", _mi_os_numa_node_count()); + + size_t elapsed; + size_t user_time; + size_t sys_time; + size_t current_rss; + size_t peak_rss; + size_t current_commit; + size_t peak_commit; + size_t page_faults; + mi_process_info(&elapsed, &user_time, &sys_time, ¤t_rss, &peak_rss, ¤t_commit, &peak_commit, &page_faults); + _mi_fprintf(out, arg, "%10s: %5ld.%03ld s\n", "elapsed", elapsed/1000, elapsed%1000); + _mi_fprintf(out, arg, "%10s: user: %ld.%03ld s, system: %ld.%03ld s, faults: %lu, rss: ", "process", + user_time/1000, user_time%1000, sys_time/1000, sys_time%1000, (unsigned long)page_faults ); + mi_printf_amount((int64_t)peak_rss, 1, out, arg, "%s"); + if (peak_commit > 0) { + _mi_fprintf(out, arg, ", commit: "); + mi_printf_amount((int64_t)peak_commit, 1, out, arg, "%s"); + } + _mi_fprintf(out, arg, "\n"); +} + +static mi_msecs_t mi_process_start; // = 0 + +static mi_stats_t* mi_stats_get_default(void) { + mi_heap_t* heap = mi_heap_get_default(); + return &heap->tld->stats; +} + +static void mi_stats_merge_from(mi_stats_t* stats) { + if (stats != &_mi_stats_main) { + mi_stats_add(&_mi_stats_main, stats); + memset(stats, 0, sizeof(mi_stats_t)); + } +} + +void mi_stats_reset(void) mi_attr_noexcept { + mi_stats_t* stats = mi_stats_get_default(); + if (stats != &_mi_stats_main) { memset(stats, 0, sizeof(mi_stats_t)); } + memset(&_mi_stats_main, 0, sizeof(mi_stats_t)); + if (mi_process_start == 0) { mi_process_start = _mi_clock_start(); }; +} + +void mi_stats_merge(void) mi_attr_noexcept { + mi_stats_merge_from( mi_stats_get_default() ); +} + +void _mi_stats_done(mi_stats_t* stats) { // called from `mi_thread_done` + mi_stats_merge_from(stats); +} + +void mi_stats_print_out(mi_output_fun* out, void* arg) mi_attr_noexcept { + mi_stats_merge_from(mi_stats_get_default()); + _mi_stats_print(&_mi_stats_main, out, arg); +} + +void mi_stats_print(void* out) mi_attr_noexcept { + // for compatibility there is an `out` parameter (which can be `stdout` or `stderr`) + mi_stats_print_out((mi_output_fun*)out, NULL); +} + +void mi_thread_stats_print_out(mi_output_fun* out, void* arg) mi_attr_noexcept { + _mi_stats_print(mi_stats_get_default(), out, arg); +} + + +// ---------------------------------------------------------------- +// Basic timer for convenience; use milli-seconds to avoid doubles +// ---------------------------------------------------------------- + +static mi_msecs_t mi_clock_diff; + +mi_msecs_t _mi_clock_now(void) { + return _mi_prim_clock_now(); +} + +mi_msecs_t _mi_clock_start(void) { + if (mi_clock_diff == 0.0) { + mi_msecs_t t0 = _mi_clock_now(); + mi_clock_diff = _mi_clock_now() - t0; + } + return _mi_clock_now(); +} + +mi_msecs_t _mi_clock_end(mi_msecs_t start) { + mi_msecs_t end = _mi_clock_now(); + return (end - start - mi_clock_diff); +} + + +// -------------------------------------------------------- +// Basic process statistics +// -------------------------------------------------------- + +mi_decl_export void mi_process_info(size_t* elapsed_msecs, size_t* user_msecs, size_t* system_msecs, size_t* current_rss, size_t* peak_rss, size_t* current_commit, size_t* peak_commit, size_t* page_faults) mi_attr_noexcept +{ + mi_process_info_t pinfo; + _mi_memzero_var(pinfo); + pinfo.elapsed = _mi_clock_end(mi_process_start); + pinfo.current_commit = (size_t)(mi_atomic_loadi64_relaxed((_Atomic(int64_t)*)&_mi_stats_main.committed.current)); + pinfo.peak_commit = (size_t)(mi_atomic_loadi64_relaxed((_Atomic(int64_t)*)&_mi_stats_main.committed.peak)); + pinfo.current_rss = pinfo.current_commit; + pinfo.peak_rss = pinfo.peak_commit; + pinfo.utime = 0; + pinfo.stime = 0; + pinfo.page_faults = 0; + + _mi_prim_process_info(&pinfo); + + if (elapsed_msecs!=NULL) *elapsed_msecs = (pinfo.elapsed < 0 ? 0 : (pinfo.elapsed < (mi_msecs_t)PTRDIFF_MAX ? (size_t)pinfo.elapsed : PTRDIFF_MAX)); + if (user_msecs!=NULL) *user_msecs = (pinfo.utime < 0 ? 0 : (pinfo.utime < (mi_msecs_t)PTRDIFF_MAX ? (size_t)pinfo.utime : PTRDIFF_MAX)); + if (system_msecs!=NULL) *system_msecs = (pinfo.stime < 0 ? 0 : (pinfo.stime < (mi_msecs_t)PTRDIFF_MAX ? (size_t)pinfo.stime : PTRDIFF_MAX)); + if (current_rss!=NULL) *current_rss = pinfo.current_rss; + if (peak_rss!=NULL) *peak_rss = pinfo.peak_rss; + if (current_commit!=NULL) *current_commit = pinfo.current_commit; + if (peak_commit!=NULL) *peak_commit = pinfo.peak_commit; + if (page_faults!=NULL) *page_faults = pinfo.page_faults; +} From 7940371c0584836772be156da7248056baf219f6 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 24 Jun 2019 23:41:27 +0200 Subject: [PATCH 076/297] mimalloc: adjust for building inside Git We want to compile mimalloc's source code as part of Git, rather than requiring the code to be built as an external library: mimalloc uses a CMake-based build, which is not necessarily easy to integrate into the flavors of Git for Windows (which will be the main benefitting port). Signed-off-by: Johannes Schindelin --- compat/mimalloc/alloc.c | 4 ---- compat/mimalloc/mimalloc.h | 3 ++- 2 files changed, 2 insertions(+), 5 deletions(-) diff --git a/compat/mimalloc/alloc.c b/compat/mimalloc/alloc.c index 961f6d53d0f2c7..ae272c1fb54504 100644 --- a/compat/mimalloc/alloc.c +++ b/compat/mimalloc/alloc.c @@ -16,10 +16,6 @@ terms of the MIT license. A copy of the license can be found in the file #include // memset, strlen (for mi_strdup) #include // malloc, abort -#define MI_IN_ALLOC_C -#include "alloc-override.c" -#undef MI_IN_ALLOC_C - // ------------------------------------------------------ // Allocation // ------------------------------------------------------ diff --git a/compat/mimalloc/mimalloc.h b/compat/mimalloc/mimalloc.h index c0f5e96e51e975..7e3b5dd66e91a0 100644 --- a/compat/mimalloc/mimalloc.h +++ b/compat/mimalloc/mimalloc.h @@ -95,7 +95,8 @@ terms of the MIT license. A copy of the license can be found in the file // Includes // ------------------------------------------------------ -#include // size_t +#include "git-compat-util.h" + #include // bool #include // INTPTR_MAX From eb0f08026e124141164a14699823d97d61bf9c37 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 24 Jun 2019 23:43:06 +0200 Subject: [PATCH 077/297] mimalloc: offer a build-time option to enable it By defining `USE_MIMALLOC`, Git can now be compiled with that nicely-fast and small allocator. Note that we have to disable a couple `DEVELOPER` options to build mimalloc's source code, as it makes heavy use of declarations after statements, among other things that disagree with Git's conventions. We even have to silence some GCC warnings in non-DEVELOPER mode. For example, the `-Wno-array-bounds` flag is needed because in `-O2` builds, trying to call `NtCurrentTeb()` (which `_mi_thread_id()` does on Windows) causes the bogus warning about a system header, likely related to https://sourceforge.net/p/mingw-w64/mailman/message/37674519/ and to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99578: C:/git-sdk-64-minimal/mingw64/include/psdk_inc/intrin-impl.h:838:1: error: array subscript 0 is outside array bounds of 'long long unsigned int[0]' [-Werror=array-bounds] 838 | __buildreadseg(__readgsqword, unsigned __int64, "gs", "q") | ^~~~~~~~~~~~~~ Also: The `mimalloc` library uses C11-style atomics, therefore we must require that standard when compiling with GCC if we want to use `mimalloc` (instead of requiring "only" C99). This is what we do in the CMake definition already, therefore this commit does not need to touch `contrib/buildsystems/`. Signed-off-by: Johannes Schindelin --- Makefile | 37 +++++++++++++++++++++++++++++++++++++ config.mak.dev | 2 ++ config.mak.uname | 2 +- git-compat-util.h | 10 ++++++++++ 4 files changed, 50 insertions(+), 1 deletion(-) diff --git a/Makefile b/Makefile index db48c72620c78d..600f7380e70127 100644 --- a/Makefile +++ b/Makefile @@ -2072,6 +2072,43 @@ ifdef USE_NED_ALLOCATOR OVERRIDE_STRDUP = YesPlease endif +ifdef USE_MIMALLOC + MIMALLOC_OBJS = \ + compat/mimalloc/alloc-aligned.o \ + compat/mimalloc/alloc.o \ + compat/mimalloc/arena.o \ + compat/mimalloc/bitmap.o \ + compat/mimalloc/heap.o \ + compat/mimalloc/init.o \ + compat/mimalloc/options.o \ + compat/mimalloc/os.o \ + compat/mimalloc/page.o \ + compat/mimalloc/random.o \ + compat/mimalloc/prim/windows/prim.o \ + compat/mimalloc/segment.o \ + compat/mimalloc/segment-cache.o \ + compat/mimalloc/segment-map.o \ + compat/mimalloc/stats.o + + COMPAT_CFLAGS += -Icompat/mimalloc -DMI_DEBUG=0 -DUSE_MIMALLOC --std=gnu11 + COMPAT_OBJS += $(MIMALLOC_OBJS) + +$(MIMALLOC_OBJS): COMPAT_CFLAGS += -DBANNED_H + +$(MIMALLOC_OBJS): COMPAT_CFLAGS += \ + -Wno-attributes \ + -Wno-unknown-pragmas \ + -Wno-array-bounds + +ifdef DEVELOPER +$(MIMALLOC_OBJS): COMPAT_CFLAGS += \ + -Wno-pedantic \ + -Wno-declaration-after-statement \ + -Wno-old-style-definition \ + -Wno-missing-prototypes +endif +endif + ifdef OVERRIDE_STRDUP COMPAT_CFLAGS += -DOVERRIDE_STRDUP COMPAT_OBJS += compat/strdup.o diff --git a/config.mak.dev b/config.mak.dev index 5229c3548451b7..56b2a581475160 100644 --- a/config.mak.dev +++ b/config.mak.dev @@ -22,8 +22,10 @@ endif ifneq ($(uname_S),FreeBSD) ifneq ($(or $(filter gcc6,$(COMPILER_FEATURES)),$(filter clang7,$(COMPILER_FEATURES))),) +ifndef USE_MIMALLOC DEVELOPER_CFLAGS += -std=gnu99 endif +endif else # FreeBSD cannot limit to C99 because its system headers unconditionally # rely on C11 features. diff --git a/config.mak.uname b/config.mak.uname index 984a1ef67b450b..1b2980d9eec829 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -494,7 +494,7 @@ endif CC = compat/vcbuild/scripts/clink.pl AR = compat/vcbuild/scripts/lib.pl CFLAGS = - BASIC_CFLAGS = -nologo -I. -Icompat/vcbuild/include -DWIN32 -D_CONSOLE -DHAVE_STRING_H -D_CRT_SECURE_NO_WARNINGS -D_CRT_NONSTDC_NO_DEPRECATE + BASIC_CFLAGS = -nologo -I. -Icompat/vcbuild/include -DWIN32 -D_CONSOLE -DHAVE_STRING_H -D_CRT_SECURE_NO_WARNINGS -D_CRT_NONSTDC_NO_DEPRECATE -MP -std:c11 COMPAT_OBJS = compat/msvc.o compat/winansi.o \ compat/win32/flush.o \ compat/win32/path-utils.o \ diff --git a/git-compat-util.h b/git-compat-util.h index 468d40498fd35a..4fc80210ede16f 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -420,6 +420,16 @@ char *gitdirname(char *); # include #endif +#ifdef USE_MIMALLOC +#include "mimalloc.h" +#define malloc mi_malloc +#define calloc mi_calloc +#define realloc mi_realloc +#define free mi_free +#define strdup mi_strdup +#define strndup mi_strndup +#endif + /* On most systems would have given us this, but * not on some systems (e.g. z/OS). */ From 8be8dec090b9f842ca18138e75a748b05456d19d Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Fri, 12 May 2023 15:54:11 -0400 Subject: [PATCH 078/297] mimalloc: use "weak" random seed when statically linked Always use the internal "use_weak" random seed when initializing the "mimalloc" heap when statically linked on Windows. The imported "mimalloc" routines support several random sources to seed the heap data structures, including BCrypt.dll and RtlGenRandom. Crashes have been reported when using BCrypt.dll if it initialized during an `atexit()` handler function. Granted, such DLL initialization should not happen in an atexit handler, but yet the crashes remain. It should be noted that on Windows when statically linked, the mimalloc startup code (called by the GCC CRT to initialize static data prior to calling `main()`) always uses the internal "weak" random seed. "mimalloc" does not try to load an alternate random source until after the OS initialization has completed. Heap data is stored in `__declspec(thread)` TLS data and in theory each Git thread will have its own heap data. However, testing shows that the "mimalloc" library doesn't actually call `os_random_buf()` (to load a new random source) when creating these new per-thread heap structures. However, if an atexit handler is forced to run on a non-main thread, the "mimalloc" library *WILL* try to create a new heap and seed it with `os_random_buf()`. (The reason for this is still a mystery to this author.) The `os_random_buf()` call can cause the (previously uninitialized BCrypt.dll library) to be dynamically loaded and a call made into it. Crashes have been reported in v2.40.1.vfs.0.0 while in this call. As a workaround, the fix here forces the use of the internal "use_weak" random code for the subsequent `os_random_buf()` calls. Since we have been using that random generator for the majority of the program, it seems safe to use it for the final few mallocs in the atexit handler (of which there really shouldn't be that many. Signed-off-by: Jeff Hostetler Signed-off-by: Johannes Schindelin --- compat/mimalloc/init.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/compat/mimalloc/init.c b/compat/mimalloc/init.c index 4670d5510db187..4ec5812e3ce1d0 100644 --- a/compat/mimalloc/init.c +++ b/compat/mimalloc/init.c @@ -302,7 +302,11 @@ static bool _mi_heap_init(void) { _mi_memcpy_aligned(tld, &tld_empty, sizeof(*tld)); _mi_memcpy_aligned(heap, &_mi_heap_empty, sizeof(*heap)); heap->thread_id = _mi_thread_id(); + #if defined(_WIN32) && !defined(MI_SHARED_LIB) + _mi_random_init_weak(&heap->random); // match mi_heap_main_init() + #else _mi_random_init(&heap->random); + #endif heap->cookie = _mi_heap_random_next(heap) | 1; heap->keys[0] = _mi_heap_random_next(heap); heap->keys[1] = _mi_heap_random_next(heap); From 6ea34cc96c96e09fc6c0a3f3a850eb6bc7510aea Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 24 Jun 2019 23:45:21 +0200 Subject: [PATCH 079/297] mingw: use mimalloc Thorough benchmarking with repacking a subset of linux.git (the commit history reachable from 93a6fefe2f ([PATCH] fix the SYSCTL=n compilation, 2007-02-28), to be precise) suggest that this allocator is on par, in multi-threaded situations maybe even better than nedmalloc: `git repack -adfq` with mimalloc, 8 threads: 31.166991900 27.576763800 28.712311000 27.373859000 27.163141900 `git repack -adfq` with nedmalloc, 8 threads: 31.915032900 27.149883100 28.244933700 27.240188800 28.580849500 In a different test using GitHub Actions build agents (probably single-threaded, a core-strength of nedmalloc)): `git repack -q -d -l -A --unpack-unreachable=2.weeks.ago` with mimalloc: 943.426 978.500 939.709 959.811 954.605 `git repack -q -d -l -A --unpack-unreachable=2.weeks.ago` with nedmalloc: 995.383 952.179 943.253 963.043 980.468 While these measurements were not executed with complete scientific rigor, as no hardware was set aside specifically for these benchmarks, it shows that mimalloc and nedmalloc perform almost the same, nedmalloc with a bit higher variance and also slightly higher average (further testing suggests that nedmalloc performs worse in multi-threaded situations than in single-threaded ones). In short: mimalloc seems to be slightly better suited for our purposes than nedmalloc. Seeing that mimalloc is developed actively, while nedmalloc ceased to see any updates in eight years, let's use mimalloc on Windows instead. Signed-off-by: Johannes Schindelin --- config.mak.uname | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/config.mak.uname b/config.mak.uname index 1b2980d9eec829..958d4c9c1326f3 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -737,7 +737,7 @@ ifeq ($(uname_S),MINGW) HAVE_LIBCHARSET_H = YesPlease USE_GETTEXT_SCHEME = fallthrough USE_LIBPCRE = YesPlease - USE_NED_ALLOCATOR = YesPlease + USE_MIMALLOC = YesPlease NO_PYTHON = ifeq (/mingw64,$(subst 32,64,$(prefix))) # Move system config into top-level /etc/ From ffcf774b409968f28d313fa64ca10e678a87de1f Mon Sep 17 00:00:00 2001 From: Thomas Braun Date: Thu, 8 May 2014 21:43:24 +0200 Subject: [PATCH 080/297] transport: optionally disable side-band-64k Since commit 0c499ea60fda (send-pack: demultiplex a sideband stream with status data, 2010-02-05) the send-pack builtin uses the side-band-64k capability if advertised by the server. Unfortunately this breaks pushing over the dump git protocol if used over a network connection. The detailed reasons for this breakage are (by courtesy of Jeff Preshing, quoted from https://groups.google.com/d/msg/msysgit/at8D7J-h7mw/eaLujILGUWoJ): MinGW wraps Windows sockets in CRT file descriptors in order to mimic the functionality of POSIX sockets. This causes msvcrt.dll to treat sockets as Installable File System (IFS) handles, calling ReadFile, WriteFile, DuplicateHandle and CloseHandle on them. This approach works well in simple cases on recent versions of Windows, but does not support all usage patterns. In particular, using this approach, any attempt to read & write concurrently on the same socket (from one or more processes) will deadlock in a scenario where the read waits for a response from the server which is only invoked after the write. This is what send_pack currently attempts to do in the use_sideband codepath. The new config option `sendpack.sideband` allows to override the side-band-64k capability of the server, and thus makes the dumb git protocol work. Other transportation methods like ssh and http/https still benefit from the sideband channel, therefore the default value of `sendpack.sideband` is still true. Signed-off-by: Thomas Braun Signed-off-by: Oliver Schneider Signed-off-by: Johannes Schindelin --- Documentation/config.txt | 2 ++ Documentation/config/sendpack.txt | 5 +++++ send-pack.c | 6 +++--- 3 files changed, 10 insertions(+), 3 deletions(-) create mode 100644 Documentation/config/sendpack.txt diff --git a/Documentation/config.txt b/Documentation/config.txt index 8c0b3ed8075214..3ad4956fe48944 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -518,6 +518,8 @@ include::config/safe.txt[] include::config/sendemail.txt[] +include::config/sendpack.txt[] + include::config/sequencer.txt[] include::config/showbranch.txt[] diff --git a/Documentation/config/sendpack.txt b/Documentation/config/sendpack.txt new file mode 100644 index 00000000000000..e306f657fba7dd --- /dev/null +++ b/Documentation/config/sendpack.txt @@ -0,0 +1,5 @@ +sendpack.sideband:: + Allows to disable the side-band-64k capability for send-pack even + when it is advertised by the server. Makes it possible to work + around a limitation in the git for windows implementation together + with the dump git protocol. Defaults to true. diff --git a/send-pack.c b/send-pack.c index fa2f5eec17bba1..f9e6255ee13328 100644 --- a/send-pack.c +++ b/send-pack.c @@ -488,7 +488,7 @@ int send_pack(struct send_pack_args *args, int need_pack_data = 0; int allow_deleting_refs = 0; int status_report = 0; - int use_sideband = 0; + int use_sideband = 1; int quiet_supported = 0; int agent_supported = 0; int advertise_sid = 0; @@ -511,6 +511,7 @@ int send_pack(struct send_pack_args *args, return 0; } + git_config_get_bool("sendpack.sideband", &use_sideband); git_config_get_bool("push.negotiate", &push_negotiate); if (push_negotiate) get_commons_through_negotiation(args->url, remote_refs, &commons); @@ -529,8 +530,7 @@ int send_pack(struct send_pack_args *args, allow_deleting_refs = 1; if (server_supports("ofs-delta")) args->use_ofs_delta = 1; - if (server_supports("side-band-64k")) - use_sideband = 1; + use_sideband = use_sideband && server_supports("side-band-64k"); if (server_supports("quiet")) quiet_supported = 1; if (server_supports("agent")) From f5100a65db65a447993390f11273ee54a072b2d1 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Thu, 14 Nov 2019 20:09:23 +0100 Subject: [PATCH 081/297] mingw: make sure `errno` is set correctly when socket operations fail The winsock2 library provides functions that work on different data types than file descriptors, therefore we wrap them. But that is not the only difference: they also do not set `errno` but expect the callers to enquire about errors via `WSAGetLastError()`. Let's translate that into appropriate `errno` values whenever the socket operations fail so that Git's code base does not have to change its expectations. This closes https://github.com/git-for-windows/git/issues/2404 Helped-by: Jeff Hostetler Signed-off-by: Johannes Schindelin --- compat/mingw.c | 157 +++++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 147 insertions(+), 10 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 29d3f09768c29a..40b952071e8243 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -2041,18 +2041,150 @@ static void ensure_socket_initialization(void) initialized = 1; } +static int winsock_error_to_errno(DWORD err) +{ + switch (err) { + case WSAEINTR: return EINTR; + case WSAEBADF: return EBADF; + case WSAEACCES: return EACCES; + case WSAEFAULT: return EFAULT; + case WSAEINVAL: return EINVAL; + case WSAEMFILE: return EMFILE; + case WSAEWOULDBLOCK: return EWOULDBLOCK; + case WSAEINPROGRESS: return EINPROGRESS; + case WSAEALREADY: return EALREADY; + case WSAENOTSOCK: return ENOTSOCK; + case WSAEDESTADDRREQ: return EDESTADDRREQ; + case WSAEMSGSIZE: return EMSGSIZE; + case WSAEPROTOTYPE: return EPROTOTYPE; + case WSAENOPROTOOPT: return ENOPROTOOPT; + case WSAEPROTONOSUPPORT: return EPROTONOSUPPORT; + case WSAEOPNOTSUPP: return EOPNOTSUPP; + case WSAEAFNOSUPPORT: return EAFNOSUPPORT; + case WSAEADDRINUSE: return EADDRINUSE; + case WSAEADDRNOTAVAIL: return EADDRNOTAVAIL; + case WSAENETDOWN: return ENETDOWN; + case WSAENETUNREACH: return ENETUNREACH; + case WSAENETRESET: return ENETRESET; + case WSAECONNABORTED: return ECONNABORTED; + case WSAECONNRESET: return ECONNRESET; + case WSAENOBUFS: return ENOBUFS; + case WSAEISCONN: return EISCONN; + case WSAENOTCONN: return ENOTCONN; + case WSAETIMEDOUT: return ETIMEDOUT; + case WSAECONNREFUSED: return ECONNREFUSED; + case WSAELOOP: return ELOOP; + case WSAENAMETOOLONG: return ENAMETOOLONG; + case WSAEHOSTUNREACH: return EHOSTUNREACH; + case WSAENOTEMPTY: return ENOTEMPTY; + /* No errno equivalent; default to EIO */ + case WSAESOCKTNOSUPPORT: + case WSAEPFNOSUPPORT: + case WSAESHUTDOWN: + case WSAETOOMANYREFS: + case WSAEHOSTDOWN: + case WSAEPROCLIM: + case WSAEUSERS: + case WSAEDQUOT: + case WSAESTALE: + case WSAEREMOTE: + case WSASYSNOTREADY: + case WSAVERNOTSUPPORTED: + case WSANOTINITIALISED: + case WSAEDISCON: + case WSAENOMORE: + case WSAECANCELLED: + case WSAEINVALIDPROCTABLE: + case WSAEINVALIDPROVIDER: + case WSAEPROVIDERFAILEDINIT: + case WSASYSCALLFAILURE: + case WSASERVICE_NOT_FOUND: + case WSATYPE_NOT_FOUND: + case WSA_E_NO_MORE: + case WSA_E_CANCELLED: + case WSAEREFUSED: + case WSAHOST_NOT_FOUND: + case WSATRY_AGAIN: + case WSANO_RECOVERY: + case WSANO_DATA: + case WSA_QOS_RECEIVERS: + case WSA_QOS_SENDERS: + case WSA_QOS_NO_SENDERS: + case WSA_QOS_NO_RECEIVERS: + case WSA_QOS_REQUEST_CONFIRMED: + case WSA_QOS_ADMISSION_FAILURE: + case WSA_QOS_POLICY_FAILURE: + case WSA_QOS_BAD_STYLE: + case WSA_QOS_BAD_OBJECT: + case WSA_QOS_TRAFFIC_CTRL_ERROR: + case WSA_QOS_GENERIC_ERROR: + case WSA_QOS_ESERVICETYPE: + case WSA_QOS_EFLOWSPEC: + case WSA_QOS_EPROVSPECBUF: + case WSA_QOS_EFILTERSTYLE: + case WSA_QOS_EFILTERTYPE: + case WSA_QOS_EFILTERCOUNT: + case WSA_QOS_EOBJLENGTH: + case WSA_QOS_EFLOWCOUNT: +#ifndef _MSC_VER + case WSA_QOS_EUNKNOWNPSOBJ: +#endif + case WSA_QOS_EPOLICYOBJ: + case WSA_QOS_EFLOWDESC: + case WSA_QOS_EPSFLOWSPEC: + case WSA_QOS_EPSFILTERSPEC: + case WSA_QOS_ESDMODEOBJ: + case WSA_QOS_ESHAPERATEOBJ: + case WSA_QOS_RESERVED_PETYPE: + default: return EIO; + } +} + +/* + * On Windows, `errno` is a global macro to a function call. + * This makes it difficult to debug and single-step our mappings. + */ +static inline void set_wsa_errno(void) +{ + DWORD wsa = WSAGetLastError(); + int e = winsock_error_to_errno(wsa); + errno = e; + +#ifdef DEBUG_WSA_ERRNO + fprintf(stderr, "winsock error: %d -> %d\n", wsa, e); + fflush(stderr); +#endif +} + +static inline int winsock_return(int ret) +{ + if (ret < 0) + set_wsa_errno(); + + return ret; +} + +#define WINSOCK_RETURN(x) do { return winsock_return(x); } while (0) + #undef gethostname int mingw_gethostname(char *name, int namelen) { - ensure_socket_initialization(); - return gethostname(name, namelen); + ensure_socket_initialization(); + WINSOCK_RETURN(gethostname(name, namelen)); } #undef gethostbyname struct hostent *mingw_gethostbyname(const char *host) { + struct hostent *ret; + ensure_socket_initialization(); - return gethostbyname(host); + + ret = gethostbyname(host); + if (!ret) + set_wsa_errno(); + + return ret; } #undef getaddrinfo @@ -2060,7 +2192,7 @@ int mingw_getaddrinfo(const char *node, const char *service, const struct addrinfo *hints, struct addrinfo **res) { ensure_socket_initialization(); - return getaddrinfo(node, service, hints, res); + WINSOCK_RETURN(getaddrinfo(node, service, hints, res)); } int mingw_socket(int domain, int type, int protocol) @@ -2080,7 +2212,7 @@ int mingw_socket(int domain, int type, int protocol) * in errno so that _if_ someone looks up the code somewhere, * then it is at least the number that are usually listed. */ - errno = WSAGetLastError(); + set_wsa_errno(); return -1; } /* convert into a file descriptor */ @@ -2096,35 +2228,35 @@ int mingw_socket(int domain, int type, int protocol) int mingw_connect(int sockfd, struct sockaddr *sa, size_t sz) { SOCKET s = (SOCKET)_get_osfhandle(sockfd); - return connect(s, sa, sz); + WINSOCK_RETURN(connect(s, sa, sz)); } #undef bind int mingw_bind(int sockfd, struct sockaddr *sa, size_t sz) { SOCKET s = (SOCKET)_get_osfhandle(sockfd); - return bind(s, sa, sz); + WINSOCK_RETURN(bind(s, sa, sz)); } #undef setsockopt int mingw_setsockopt(int sockfd, int lvl, int optname, void *optval, int optlen) { SOCKET s = (SOCKET)_get_osfhandle(sockfd); - return setsockopt(s, lvl, optname, (const char*)optval, optlen); + WINSOCK_RETURN(setsockopt(s, lvl, optname, (const char*)optval, optlen)); } #undef shutdown int mingw_shutdown(int sockfd, int how) { SOCKET s = (SOCKET)_get_osfhandle(sockfd); - return shutdown(s, how); + WINSOCK_RETURN(shutdown(s, how)); } #undef listen int mingw_listen(int sockfd, int backlog) { SOCKET s = (SOCKET)_get_osfhandle(sockfd); - return listen(s, backlog); + WINSOCK_RETURN(listen(s, backlog)); } #undef accept @@ -2135,6 +2267,11 @@ int mingw_accept(int sockfd1, struct sockaddr *sa, socklen_t *sz) SOCKET s1 = (SOCKET)_get_osfhandle(sockfd1); SOCKET s2 = accept(s1, sa, sz); + if (s2 == INVALID_SOCKET) { + set_wsa_errno(); + return -1; + } + /* convert into a file descriptor */ if ((sockfd2 = _open_osfhandle(s2, O_RDWR|O_BINARY)) < 0) { int err = errno; From 007ede4d4a386db02b8efe8188338c3a78958412 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 1 Jan 2020 21:07:22 +0100 Subject: [PATCH 082/297] mingw: do resolve symlinks in `getcwd()` As pointed out in https://github.com/git-for-windows/git/issues/1676, the `git rev-parse --is-inside-work-tree` command currently fails when the current directory's path contains symbolic links. The underlying reason for this bug is that `getcwd()` is supposed to resolve symbolic links, but our `mingw_getcwd()` implementation did not. We do have all the building blocks for that, though: the `GetFinalPathByHandleW()` function will resolve symbolic links. However, we only called that function if `GetLongPathNameW()` failed, for historical reasons: the latter function was supported for a long time, but the former API function was introduced only with Windows Vista, and we used to support also Windows XP. With that support having been dropped, we are free to call the symbolic link-resolving function right away. Signed-off-by: Johannes Schindelin --- compat/mingw.c | 18 +++++++----------- 1 file changed, 7 insertions(+), 11 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 29d3f09768c29a..0c8e526e3ece64 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -1141,18 +1141,16 @@ char *mingw_getcwd(char *pointer, int len) { wchar_t cwd[MAX_PATH], wpointer[MAX_PATH]; DWORD ret = GetCurrentDirectoryW(ARRAY_SIZE(cwd), cwd); + HANDLE hnd; if (!ret || ret >= ARRAY_SIZE(cwd)) { errno = ret ? ENAMETOOLONG : err_win_to_posix(GetLastError()); return NULL; } - ret = GetLongPathNameW(cwd, wpointer, ARRAY_SIZE(wpointer)); - if (!ret && GetLastError() == ERROR_ACCESS_DENIED) { - HANDLE hnd = CreateFileW(cwd, 0, - FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, NULL, - OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL); - if (hnd == INVALID_HANDLE_VALUE) - return NULL; + hnd = CreateFileW(cwd, 0, + FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, NULL, + OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL); + if (hnd != INVALID_HANDLE_VALUE) { ret = GetFinalPathNameByHandleW(hnd, wpointer, ARRAY_SIZE(wpointer), 0); CloseHandle(hnd); if (!ret || ret >= ARRAY_SIZE(wpointer)) @@ -1161,13 +1159,11 @@ char *mingw_getcwd(char *pointer, int len) return NULL; return pointer; } - if (!ret || ret >= ARRAY_SIZE(wpointer)) - return NULL; - if (GetFileAttributesW(wpointer) == INVALID_FILE_ATTRIBUTES) { + if (GetFileAttributesW(cwd) == INVALID_FILE_ATTRIBUTES) { errno = ENOENT; return NULL; } - if (xwcstoutf(pointer, wpointer, len) < 0) + if (xwcstoutf(pointer, cwd, len) < 0) return NULL; convert_slashes(pointer); return pointer; From c300d5eb77f1bfc6490476a1ea0b8ca7a795463a Mon Sep 17 00:00:00 2001 From: Bjoern Mueller Date: Wed, 22 Jan 2020 13:49:13 +0100 Subject: [PATCH 083/297] mingw: fix fatal error working on mapped network drives on Windows In 1e64d18 (mingw: do resolve symlinks in `getcwd()`) a problem was introduced that causes git for Windows to stop working with certain mapped network drives (in particular, drives that are mapped to locations with long path names). Error message was "fatal: Unable to read current working directory: No such file or directory". Present change fixes this issue as discussed in https://github.com/git-for-windows/git/issues/2480 Signed-off-by: Bjoern Mueller --- compat/mingw.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 29d3f09768c29a..d45ba0a06400a2 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -1155,8 +1155,13 @@ char *mingw_getcwd(char *pointer, int len) return NULL; ret = GetFinalPathNameByHandleW(hnd, wpointer, ARRAY_SIZE(wpointer), 0); CloseHandle(hnd); - if (!ret || ret >= ARRAY_SIZE(wpointer)) - return NULL; + if (!ret || ret >= ARRAY_SIZE(wpointer)) { + ret = GetLongPathNameW(cwd, wpointer, ARRAY_SIZE(wpointer)); + if (!ret || ret >= ARRAY_SIZE(wpointer)) { + errno = ret ? ENAMETOOLONG : err_win_to_posix(GetLastError()); + return NULL; + } + } if (xwcstoutf(pointer, normalize_ntpath(wpointer), len) < 0) return NULL; return pointer; From 0dcd919989c83963883ecd4a1d027456d91146eb Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 31 Jan 2020 12:02:47 +0100 Subject: [PATCH 084/297] mingw: demonstrate a `git add` issue with NTFS junctions NTFS junctions are somewhat similar in spirit to Unix bind mounts: they point to a different directory and are resolved by the filesystem driver. As such, they appear to `lstat()` as if they are directories, not as if they are symbolic links. _Any_ user can create junctions, while symbolic links can only be created by non-administrators in Developer Mode on Windows 10. Hence NTFS junctions are much more common "in the wild" than NTFS symbolic links. It was reported in https://github.com/git-for-windows/git/issues/2481 that adding files via an absolute path that traverses an NTFS junction: since 1e64d18 (mingw: do resolve symlinks in `getcwd()`), we resolve not only symbolic links but also NTFS junctions when determining the absolute path of the current directory. The same is not true for `git add `, where symbolic links are resolved in ``, but not NTFS junctions. Signed-off-by: Johannes Schindelin --- t/t3700-add.sh | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/t/t3700-add.sh b/t/t3700-add.sh index 839c904745a286..2e0cd98d0e7167 100755 --- a/t/t3700-add.sh +++ b/t/t3700-add.sh @@ -549,4 +549,15 @@ test_expect_success CASE_INSENSITIVE_FS 'path is case-insensitive' ' git add "$downcased" ' +test_expect_failure MINGW 'can add files via NTFS junctions' ' + test_when_finished "cmd //c rmdir junction && rm -rf target" && + test_create_repo target && + cmd //c "mklink /j junction target" && + >target/via-junction && + git -C junction add "$(pwd)/junction/via-junction" && + echo via-junction >expect && + git -C target diff --cached --name-only >actual && + test_cmp expect actual +' + test_done From a688f81f809f153f43ecc8a0facd818701640500 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 21 Feb 2017 13:28:58 +0100 Subject: [PATCH 085/297] mingw: ensure valid CTYPE A change between versions 2.4.1 and 2.6.0 of the MSYS2 runtime modified how Cygwin's runtime (and hence Git for Windows' MSYS2 runtime derivative) handles locales: d16a56306d (Consolidate wctomb/mbtowc calls for POSIX-1.2008, 2016-07-20). An unintended side-effect is that "cold-calling" into the POSIX emulation will start with a locale based on the current code page, something that Git for Windows is very ill-prepared for, as it expects to be able to pass a command-line containing non-ASCII characters to the shell without having those characters munged. One symptom of this behavior: when `git clone` or `git fetch` shell out to call `git-upload-pack` with a path that contains non-ASCII characters, the shell tried to interpret the entire command-line (including command-line parameters) as executable path, which obviously must fail. This fixes https://github.com/git-for-windows/git/issues/1036 Signed-off-by: Johannes Schindelin --- compat/mingw.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/compat/mingw.c b/compat/mingw.c index 29d3f09768c29a..f20d36a04113eb 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -2669,6 +2669,9 @@ static void setup_windows_environment(void) if (!tmp && (tmp = getenv("USERPROFILE"))) setenv("HOME", tmp, 1); } + + if (!getenv("LC_ALL") && !getenv("LC_CTYPE") && !getenv("LANG")) + setenv("LC_CTYPE", "C.UTF-8", 1); } static PSID get_current_user_sid(void) From 5f51144462f85fb4e897f52556e6af38c1e1f1f1 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 31 Jan 2020 11:44:31 +0100 Subject: [PATCH 086/297] strbuf_realpath(): use platform-dependent API if available Some platforms (e.g. Windows) provide API functions to resolve paths much quicker. Let's offer a way to short-cut `strbuf_realpath()` on those platforms. Signed-off-by: Johannes Schindelin --- abspath.c | 3 +++ git-compat-util.h | 4 ++++ 2 files changed, 7 insertions(+) diff --git a/abspath.c b/abspath.c index 1202cde23dbc9b..0c17e98654e4b0 100644 --- a/abspath.c +++ b/abspath.c @@ -93,6 +93,9 @@ static char *strbuf_realpath_1(struct strbuf *resolved, const char *path, goto error_out; } + if (platform_strbuf_realpath(resolved, path)) + return resolved->buf; + strbuf_addstr(&remaining, path); get_root_part(resolved, &remaining); diff --git a/git-compat-util.h b/git-compat-util.h index 71b4d23f0386b2..6c5d2e18d203dc 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -607,6 +607,10 @@ static inline int git_has_dir_sep(const char *path) #define query_user_email() NULL #endif +#ifndef platform_strbuf_realpath +#define platform_strbuf_realpath(resolved, path) NULL +#endif + #ifdef __TANDEM #include #include From 2b23718fcfe5433e083f7df9c6190af0c69a53d5 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sat, 1 Feb 2020 00:31:16 +0100 Subject: [PATCH 087/297] mingw: allow `git.exe` to be used instead of the "Git wrapper" Git for Windows wants to add `git.exe` to the users' `PATH`, without cluttering the latter with unnecessary executables such as `wish.exe`. To that end, it invented the concept of its "Git wrapper", i.e. a tiny executable located in `C:\Program Files\Git\cmd\git.exe` (originally a CMD script) whose sole purpose is to set up a couple of environment variables and then spawn the _actual_ `git.exe` (which nowadays lives in `C:\Program Files\Git\mingw64\bin\git.exe` for 64-bit, and the obvious equivalent for 32-bit installations). Currently, the following environment variables are set unless already initialized: - `MSYSTEM`, to make sure that the MSYS2 Bash and the MSYS2 Perl interpreter behave as expected, and - `PLINK_PROTOCOL`, to force PuTTY's `plink.exe` to use the SSH protocol instead of Telnet, - `PATH`, to make sure that the `bin` folder in the user's home directory, as well as the `/mingw64/bin` and the `/usr/bin` directories are included. The trick here is that the `/mingw64/bin/` and `/usr/bin/` directories are relative to the top-level installation directory of Git for Windows (which the included Bash interprets as `/`, i.e. as the MSYS pseudo root directory). Using the absence of `MSYSTEM` as a tell-tale, we can detect in `git.exe` whether these environment variables have been initialized properly. Therefore we can call `C:\Program Files\Git\mingw64\bin\git` in-place after this change, without having to call Git through the Git wrapper. Obviously, above-mentioned directories must be _prepended_ to the `PATH` variable, otherwise we risk picking up executables from unrelated Git installations. We do that by constructing the new `PATH` value from scratch, appending `$HOME/bin` (if `HOME` is set), then the MSYS2 system directories, and then appending the original `PATH`. Side note: this modification of the `PATH` variable is independent of the modification necessary to reach the executables and scripts in `/mingw64/libexec/git-core/`, i.e. the `GIT_EXEC_PATH`. That modification is still performed by Git, elsewhere, long after making the changes described above. While we _still_ cannot simply hard-link `mingw64\bin\git.exe` to `cmd` (because the former depends on a couple of `.dll` files that are only in `mingw64\bin`, i.e. calling `...\cmd\git.exe` would fail to load due to missing dependencies), at least we can now avoid that extra process of running the Git wrapper (which then has to wait for the spawned `git.exe` to finish) by calling `...\mingw64\bin\git.exe` directly, via its absolute path. Testing this is in Git's test suite tricky: we set up a "new" MSYS pseudo-root and copy the `git.exe` file into the appropriate location, then verify that `MSYSTEM` is set properly, and also that the `PATH` is modified so that scripts can be found in `$HOME/bin`, `/mingw64/bin/` and `/usr/bin/`. This addresses https://github.com/git-for-windows/git/issues/2283 Signed-off-by: Johannes Schindelin --- compat/mingw.c | 69 +++++++++++++++++++++++++++++++++++++++++++ config.mak.uname | 4 +-- t/t0060-path-utils.sh | 31 ++++++++++++++++++- 3 files changed, 101 insertions(+), 3 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index f20d36a04113eb..3ec814aa1e8f2a 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -2618,6 +2618,47 @@ int xwcstoutf(char *utf, const wchar_t *wcs, size_t utflen) return -1; } +#ifdef ENSURE_MSYSTEM_IS_SET +static size_t append_system_bin_dirs(char *path, size_t size) +{ +#if !defined(RUNTIME_PREFIX) || !defined(HAVE_WPGMPTR) + return 0; +#else + char prefix[32768]; + const char *slash; + size_t len = xwcstoutf(prefix, _wpgmptr, sizeof(prefix)), off = 0; + + if (len == 0 || len >= sizeof(prefix) || + !(slash = find_last_dir_sep(prefix))) + return 0; + /* strip trailing `git.exe` */ + len = slash - prefix; + + /* strip trailing `cmd` or `mingw64\bin` or `mingw32\bin` or `bin` or `libexec\git-core` */ + if (strip_suffix_mem(prefix, &len, "\\mingw64\\libexec\\git-core") || + strip_suffix_mem(prefix, &len, "\\mingw64\\bin")) + off += xsnprintf(path + off, size - off, + "%.*s\\mingw64\\bin;", (int)len, prefix); + else if (strip_suffix_mem(prefix, &len, "\\mingw32\\libexec\\git-core") || + strip_suffix_mem(prefix, &len, "\\mingw32\\bin")) + off += xsnprintf(path + off, size - off, + "%.*s\\mingw32\\bin;", (int)len, prefix); + else if (strip_suffix_mem(prefix, &len, "\\cmd") || + strip_suffix_mem(prefix, &len, "\\bin") || + strip_suffix_mem(prefix, &len, "\\libexec\\git-core")) + off += xsnprintf(path + off, size - off, + "%.*s\\mingw%d\\bin;", (int)len, prefix, + (int)(sizeof(void *) * 8)); + else + return 0; + + off += xsnprintf(path + off, size - off, + "%.*s\\usr\\bin;", (int)len, prefix); + return off; +#endif +} +#endif + static void setup_windows_environment(void) { char *tmp = getenv("TMPDIR"); @@ -2670,6 +2711,34 @@ static void setup_windows_environment(void) setenv("HOME", tmp, 1); } + if (!getenv("PLINK_PROTOCOL")) + setenv("PLINK_PROTOCOL", "ssh", 0); + +#ifdef ENSURE_MSYSTEM_IS_SET + if (!(tmp = getenv("MSYSTEM")) || !tmp[0]) { + const char *home = getenv("HOME"), *path = getenv("PATH"); + char buf[32768]; + size_t off = 0; + + xsnprintf(buf, sizeof(buf), + "MINGW%d", (int)(sizeof(void *) * 8)); + setenv("MSYSTEM", buf, 1); + + if (home) + off += xsnprintf(buf + off, sizeof(buf) - off, + "%s\\bin;", home); + off += append_system_bin_dirs(buf + off, sizeof(buf) - off); + if (path) + off += xsnprintf(buf + off, sizeof(buf) - off, + "%s", path); + else if (off > 0) + buf[off - 1] = '\0'; + else + buf[0] = '\0'; + setenv("PATH", buf, 1); + } +#endif + if (!getenv("LC_ALL") && !getenv("LC_CTYPE") && !getenv("LANG")) setenv("LC_CTYPE", "C.UTF-8", 1); } diff --git a/config.mak.uname b/config.mak.uname index 85d63821ec95f6..e91b9beafe0a01 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -501,7 +501,7 @@ endif compat/win32/pthread.o compat/win32/syslog.o \ compat/win32/trace2_win32_process_info.o \ compat/win32/dirent.o - COMPAT_CFLAGS = -D__USE_MINGW_ACCESS -DDETECT_MSYS_TTY -DNOGDI -DHAVE_STRING_H -Icompat -Icompat/regex -Icompat/win32 -DSTRIP_EXTENSION=\".exe\" + COMPAT_CFLAGS = -D__USE_MINGW_ACCESS -DDETECT_MSYS_TTY -DENSURE_MSYSTEM_IS_SET -DNOGDI -DHAVE_STRING_H -Icompat -Icompat/regex -Icompat/win32 -DSTRIP_EXTENSION=\".exe\" BASIC_LDFLAGS = -IGNORE:4217 -IGNORE:4049 -NOLOGO -ENTRY:wmainCRTStartup -SUBSYSTEM:CONSOLE # invalidcontinue.obj allows Git's source code to close the same file # handle twice, or to access the osfhandle of an already-closed stdout @@ -729,7 +729,7 @@ ifeq ($(uname_S),MINGW) endif CC = gcc COMPAT_CFLAGS += -D__USE_MINGW_ANSI_STDIO=0 -DDETECT_MSYS_TTY \ - -fstack-protector-strong + -DENSURE_MSYSTEM_IS_SET -fstack-protector-strong EXTLIBS += -lntdll EXTRA_PROGRAMS += headless-git$X INSTALL = /bin/install diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh index 0afa3d0d312ca6..e73835e41b96e4 100755 --- a/t/t0060-path-utils.sh +++ b/t/t0060-path-utils.sh @@ -599,7 +599,8 @@ test_expect_success !VALGRIND,RUNTIME_PREFIX,CAN_EXEC_IN_PWD 'RUNTIME_PREFIX wor cp "$GIT_EXEC_PATH"/git$X pretend/bin/ && GIT_EXEC_PATH= ./pretend/bin/git here >actual && echo HERE >expect && - test_cmp expect actual' + test_cmp expect actual +' test_expect_success !VALGRIND,RUNTIME_PREFIX,CAN_EXEC_IN_PWD '%(prefix)/ works' ' mkdir -p pretend/bin && @@ -610,4 +611,32 @@ test_expect_success !VALGRIND,RUNTIME_PREFIX,CAN_EXEC_IN_PWD '%(prefix)/ works' test_cmp expect actual ' +test_expect_success MINGW 'MSYSTEM/PATH is adjusted if necessary' ' + mkdir -p "$HOME"/bin pretend/mingw64/bin \ + pretend/mingw64/libexec/git-core pretend/usr/bin && + cp "$GIT_EXEC_PATH"/git.exe pretend/mingw64/bin/ && + cp "$GIT_EXEC_PATH"/git.exe pretend/mingw64/libexec/git-core/ && + # copy the .dll files, if any (happens when building via CMake) + case "$GIT_EXEC_PATH"/*.dll in + */"*.dll") ;; # no `.dll` files to be copied + *) + cp "$GIT_EXEC_PATH"/*.dll pretend/mingw64/bin/ && + cp "$GIT_EXEC_PATH"/*.dll pretend/mingw64/libexec/git-core/ + ;; + esac && + echo "env | grep MSYSTEM=" | write_script "$HOME"/bin/git-test-home && + echo "echo mingw64" | write_script pretend/mingw64/bin/git-test-bin && + echo "echo usr" | write_script pretend/usr/bin/git-test-bin2 && + + ( + MSYSTEM= && + GIT_EXEC_PATH= && + pretend/mingw64/libexec/git-core/git.exe test-home >actual && + pretend/mingw64/libexec/git-core/git.exe test-bin >>actual && + pretend/mingw64/bin/git.exe test-bin2 >>actual + ) && + test_write_lines MSYSTEM=$MSYSTEM mingw64 usr >expect && + test_cmp expect actual +' + test_done From 0898dca4cf6d974936c1fd45d65c17e0a66e10ba Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Thu, 30 Jan 2020 14:22:27 -0500 Subject: [PATCH 088/297] clink.pl: fix MSVC compile script to handle libcurl-d.lib Update clink.pl to link with either libcurl.lib or libcurl-d.lib depending on whether DEBUG=1 is set. Signed-off-by: Jeff Hostetler Signed-off-by: Johannes Schindelin --- compat/vcbuild/scripts/clink.pl | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/compat/vcbuild/scripts/clink.pl b/compat/vcbuild/scripts/clink.pl index 3bd824154be381..c4c99d1a11f18c 100755 --- a/compat/vcbuild/scripts/clink.pl +++ b/compat/vcbuild/scripts/clink.pl @@ -56,7 +56,8 @@ # need to use that instead? foreach my $flag (@lflags) { if ($flag =~ /^-LIBPATH:(.*)/) { - foreach my $l ("libcurl_imp.lib", "libcurl.lib") { + my $libcurl = $is_debug ? "libcurl-d.lib" : "libcurl.lib"; + foreach my $l ("libcurl_imp.lib", $libcurl) { if (-f "$1/$l") { $lib = $l; last; From 1afb7a6cab799852e8a9b8b9c5a9f3bc40d2a192 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 31 Jan 2020 11:49:04 +0100 Subject: [PATCH 089/297] mingw: implement a platform-specific `strbuf_realpath()` There is a Win32 API function to resolve symbolic links, and we can use that instead of resolving them manually. Even better, this function also resolves NTFS junction points (which are somewhat similar to bind mounts). This fixes https://github.com/git-for-windows/git/issues/2481. Signed-off-by: Johannes Schindelin --- compat/mingw.c | 76 +++++++++++++++++++++++++++++++++++++++++++ compat/mingw.h | 3 ++ t/t0060-path-utils.sh | 8 +++++ t/t3700-add.sh | 2 +- t/t5601-clone.sh | 7 ++++ 5 files changed, 95 insertions(+), 1 deletion(-) diff --git a/compat/mingw.c b/compat/mingw.c index 29d3f09768c29a..e9d2b4d11c8e8c 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -1137,6 +1137,82 @@ struct tm *localtime_r(const time_t *timep, struct tm *result) } #endif +char *mingw_strbuf_realpath(struct strbuf *resolved, const char *path) +{ + wchar_t wpath[MAX_PATH]; + HANDLE h; + DWORD ret; + int len; + const char *last_component = NULL; + char *append = NULL; + + if (xutftowcs_path(wpath, path) < 0) + return NULL; + + h = CreateFileW(wpath, 0, + FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, NULL, + OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL); + + /* + * strbuf_realpath() allows the last path component to not exist. If + * that is the case, now it's time to try without last component. + */ + if (h == INVALID_HANDLE_VALUE && + GetLastError() == ERROR_FILE_NOT_FOUND) { + /* cut last component off of `wpath` */ + wchar_t *p = wpath + wcslen(wpath); + + while (p != wpath) + if (*(--p) == L'/' || *p == L'\\') + break; /* found start of last component */ + + if (p != wpath && (last_component = find_last_dir_sep(path))) { + append = xstrdup(last_component + 1); /* skip directory separator */ + /* + * Do not strip the trailing slash at the drive root, otherwise + * the path would be e.g. `C:` (which resolves to the + * _current_ directory on that drive). + */ + if (p[-1] == L':') + p[1] = L'\0'; + else + *p = L'\0'; + h = CreateFileW(wpath, 0, FILE_SHARE_READ | + FILE_SHARE_WRITE | FILE_SHARE_DELETE, + NULL, OPEN_EXISTING, + FILE_FLAG_BACKUP_SEMANTICS, NULL); + } + } + + if (h == INVALID_HANDLE_VALUE) { +realpath_failed: + FREE_AND_NULL(append); + return NULL; + } + + ret = GetFinalPathNameByHandleW(h, wpath, ARRAY_SIZE(wpath), 0); + CloseHandle(h); + if (!ret || ret >= ARRAY_SIZE(wpath)) + goto realpath_failed; + + len = wcslen(wpath) * 3; + strbuf_grow(resolved, len); + len = xwcstoutf(resolved->buf, normalize_ntpath(wpath), len); + if (len < 0) + goto realpath_failed; + resolved->len = len; + + if (append) { + /* Use forward-slash, like `normalize_ntpath()` */ + strbuf_complete(resolved, '/'); + strbuf_addstr(resolved, append); + FREE_AND_NULL(append); + } + + return resolved->buf; + +} + char *mingw_getcwd(char *pointer, int len) { wchar_t cwd[MAX_PATH], wpointer[MAX_PATH]; diff --git a/compat/mingw.h b/compat/mingw.h index 27b61284f46be6..150ce2432f8f99 100644 --- a/compat/mingw.h +++ b/compat/mingw.h @@ -457,6 +457,9 @@ static inline void convert_slashes(char *path) #define PATH_SEP ';' char *mingw_query_user_email(void); #define query_user_email mingw_query_user_email +struct strbuf; +char *mingw_strbuf_realpath(struct strbuf *resolved, const char *path); +#define platform_strbuf_realpath mingw_strbuf_realpath #if !defined(__MINGW64_VERSION_MAJOR) && (!defined(_MSC_VER) || _MSC_VER < 1800) #define PRIuMAX "I64u" #define PRId64 "I64d" diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh index 0afa3d0d312ca6..7a31cdd0d0ca48 100755 --- a/t/t0060-path-utils.sh +++ b/t/t0060-path-utils.sh @@ -282,6 +282,14 @@ test_expect_success SYMLINKS 'real path works on symlinks' ' test_cmp expect actual ' +test_expect_success MINGW 'real path works near drive root' ' + # we need a non-existing path at the drive root; simply skip if C:/xyz exists + if test ! -e C:/xyz + then + test C:/xyz = $(test-tool path-utils real_path C:/xyz) + fi +' + test_expect_success SYMLINKS 'prefix_path works with absolute paths to work tree symlinks' ' ln -s target symlink && echo "symlink" >expect && diff --git a/t/t3700-add.sh b/t/t3700-add.sh index 2e0cd98d0e7167..ad7a0e925b195e 100755 --- a/t/t3700-add.sh +++ b/t/t3700-add.sh @@ -549,7 +549,7 @@ test_expect_success CASE_INSENSITIVE_FS 'path is case-insensitive' ' git add "$downcased" ' -test_expect_failure MINGW 'can add files via NTFS junctions' ' +test_expect_success MINGW 'can add files via NTFS junctions' ' test_when_finished "cmd //c rmdir junction && rm -rf target" && test_create_repo target && cmd //c "mklink /j junction target" && diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh index 5d7ea147f1abc5..6ee5495a32ab73 100755 --- a/t/t5601-clone.sh +++ b/t/t5601-clone.sh @@ -78,6 +78,13 @@ test_expect_success 'clone respects GIT_WORK_TREE' ' ' +test_expect_success CASE_INSENSITIVE_FS 'core.worktree is not added due to path case' ' + + mkdir UPPERCASE && + git clone src "$(pwd)/uppercase" && + test "unset" = "$(git -C UPPERCASE config --default unset core.worktree)" +' + test_expect_success 'clone from hooks' ' test_create_repo r0 && From f2131b242334b60ddb824592c0e786f2cefe6395 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 25 Aug 2020 12:13:26 +0200 Subject: [PATCH 090/297] mingw: ignore HOMEDRIVE/HOMEPATH if it points to Windows' system directory Internally, Git expects the environment variable `HOME` to be set, and to point to the current user's home directory. This environment variable is not set by default on Windows, and therefore Git tries its best to construct one if it finds `HOME` unset. There are actually two different approaches Git tries: first, it looks at `HOMEDRIVE`/`HOMEPATH` because this is widely used in corporate environments with roaming profiles, and a user generally wants their global Git settings to be in a roaming profile. Only when `HOMEDRIVE`/`HOMEPATH` is either unset or does not point to a valid location, Git will fall back to using `USERPROFILE` instead. However, starting with Windows Vista, for secondary logons and services, the environment variables `HOMEDRIVE`/`HOMEPATH` point to Windows' system directory (usually `C:\Windows\system32`). That is undesirable, and that location is usually write-protected anyway. So let's verify that the `HOMEDRIVE`/`HOMEPATH` combo does not point to Windows' system directory before using it, falling back to `USERPROFILE` if it does. This fixes git-for-windows#2709 Initial-Path-by: Ivan Pozdeev Signed-off-by: Johannes Schindelin --- compat/mingw.c | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/compat/mingw.c b/compat/mingw.c index 3ec814aa1e8f2a..7a98ce22273b3a 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -2659,6 +2659,18 @@ static size_t append_system_bin_dirs(char *path, size_t size) } #endif +static int is_system32_path(const char *path) +{ + WCHAR system32[MAX_PATH], wpath[MAX_PATH]; + + if (xutftowcs_path(wpath, path) < 0 || + !GetSystemDirectoryW(system32, ARRAY_SIZE(system32)) || + _wcsicmp(system32, wpath)) + return 0; + + return 1; +} + static void setup_windows_environment(void) { char *tmp = getenv("TMPDIR"); @@ -2699,7 +2711,8 @@ static void setup_windows_environment(void) strbuf_addstr(&buf, tmp); if ((tmp = getenv("HOMEPATH"))) { strbuf_addstr(&buf, tmp); - if (is_directory(buf.buf)) + if (!is_system32_path(buf.buf) && + is_directory(buf.buf)) setenv("HOME", buf.buf, 1); else tmp = NULL; /* use $USERPROFILE */ From 4ee453cdb870fabf182aec474e18957ead63d7b4 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sat, 9 May 2020 14:08:36 +0200 Subject: [PATCH 091/297] vcxproj: unclash project directories with build outputs It already caused problems with the test suite that the directory containing `git.vcxproj` is called the same as the Git executable without its file extension: `./git` is ambiguous, it could refer both to the directory `git/` as well as to `git.exe`. Now there is one more problem: when our GitHub workflow runs on the `vs/master` branch, it fails in all but the Windows builds, as they want to write the file `git` but there is already a directory in the way. Let's just go ahead and append `.proj` to all of those directories, e.g. `git.proj/` instead of `git/`. Signed-off-by: Johannes Schindelin --- config.mak.uname | 8 ++++---- contrib/buildsystems/Generators/Vcxproj.pm | 18 ++++++++++-------- 2 files changed, 14 insertions(+), 12 deletions(-) diff --git a/config.mak.uname b/config.mak.uname index 85d63821ec95f6..3c9b2e81ebaa04 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -767,7 +767,7 @@ vcxproj: # Make .vcxproj files and add them perl contrib/buildsystems/generate -g Vcxproj - git add -f git.sln {*,*/lib,t/helper/*}/*.vcxproj + git add -f git.sln {*,*/lib.proj,t/helper/*}/*.vcxproj # Generate the LinkOrCopyBuiltins.targets and LinkOrCopyRemoteHttp.targets file (echo '' && \ @@ -777,7 +777,7 @@ vcxproj: echo ' '; \ done && \ echo ' ' && \ - echo '') >git/LinkOrCopyBuiltins.targets + echo '') >git.proj/LinkOrCopyBuiltins.targets (echo '' && \ echo ' ' && \ for name in $(REMOTE_CURL_ALIASES); \ @@ -785,8 +785,8 @@ vcxproj: echo ' '; \ done && \ echo ' ' && \ - echo '') >git-remote-http/LinkOrCopyRemoteHttp.targets - git add -f git/LinkOrCopyBuiltins.targets git-remote-http/LinkOrCopyRemoteHttp.targets + echo '') >git-remote-http.proj/LinkOrCopyRemoteHttp.targets + git add -f git.proj/LinkOrCopyBuiltins.targets git-remote-http.proj/LinkOrCopyRemoteHttp.targets # Add generated headers $(MAKE) MSVC=1 SKIP_VCPKG=1 prefix=/mingw64 $(GENERATED_H) diff --git a/contrib/buildsystems/Generators/Vcxproj.pm b/contrib/buildsystems/Generators/Vcxproj.pm index b2e68a16715e39..0439b82f55e243 100644 --- a/contrib/buildsystems/Generators/Vcxproj.pm +++ b/contrib/buildsystems/Generators/Vcxproj.pm @@ -58,8 +58,8 @@ sub createProject { my $uuid = generate_guid($name); $$build_structure{"$prefix${target}_GUID"} = $uuid; my $vcxproj = $target; - $vcxproj =~ s/(.*\/)?(.*)/$&\/$2.vcxproj/; - $vcxproj =~ s/([^\/]*)(\/lib)\/(lib.vcxproj)/$1$2\/$1_$3/; + $vcxproj =~ s/(.*\/)?(.*)/$&.proj\/$2.vcxproj/; + $vcxproj =~ s/([^\/]*)(\/lib\.proj)\/(lib.vcxproj)/$1$2\/$1_$3/; $$build_structure{"$prefix${target}_VCXPROJ"} = $vcxproj; my @srcs = sort(map("$rel_dir\\$_", @{$$build_structure{"$prefix${name}_SOURCES"}})); @@ -89,7 +89,9 @@ sub createProject { $defines =~ s/>/>/g; $defines =~ s/\'//g; - die "Could not create the directory $target for $label project!\n" unless (-d "$target" || mkdir "$target"); + my $dir = $vcxproj; + $dir =~ s/\/[^\/]*$//; + die "Could not create the directory $dir for $label project!\n" unless (-d "$dir" || mkdir "$dir"); open F, ">$vcxproj" or die "Could not open $vcxproj for writing!\n"; binmode F, ":crlf :utf8"; @@ -237,7 +239,7 @@ EOM print F << "EOM"; - + $uuid_libgit false @@ -252,7 +254,7 @@ EOM } if (!($name =~ 'xdiff')) { print F << "EOM"; - + $uuid_xdiff_lib false @@ -261,7 +263,7 @@ EOM if ($name =~ /(test-(line-buffer|svn-fe)|^git-remote-testsvn)\.exe$/) { my $uuid_vcs_svn_lib = $$build_structure{"LIBS_vcs-svn/lib_GUID"}; print F << "EOM"; - + $uuid_vcs_svn_lib false @@ -338,7 +340,7 @@ sub createGlueProject { my $vcxproj = $build_structure{"APPS_${appname}_VCXPROJ"}; $vcxproj =~ s/\//\\/g; $appname =~ s/.*\///; - print F "\"${appname}\", \"${vcxproj}\", \"${uuid}\""; + print F "\"${appname}.proj\", \"${vcxproj}\", \"${uuid}\""; print F "$SLN_POST"; } foreach (@libs) { @@ -348,7 +350,7 @@ sub createGlueProject { my $vcxproj = $build_structure{"LIBS_${libname}_VCXPROJ"}; $vcxproj =~ s/\//\\/g; $libname =~ s/\//_/g; - print F "\"${libname}\", \"${vcxproj}\", \"${uuid}\""; + print F "\"${libname}.proj\", \"${vcxproj}\", \"${uuid}\""; print F "$SLN_POST"; } From 234e711b875cbcbcb67d5e29d8183f7b6a853706 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sat, 9 May 2020 16:19:06 +0200 Subject: [PATCH 092/297] t5505/t5516: allow running without `.git/branches/` in the templates When we commit the template directory as part of `make vcxproj`, the `branches/` directory is not actually commited, as it is empty. Two tests were not prepared for that situation. This developer tried to get rid of the support for `.git/branches/` a long time ago, but that effort did not bear fruit, so the best we can do is work around in these here tests. Signed-off-by: Johannes Schindelin --- t/t5505-remote.sh | 4 ++-- t/t5516-fetch-push.sh | 8 ++++---- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/t/t5505-remote.sh b/t/t5505-remote.sh index 08424e878e104c..c3728f2065987e 100755 --- a/t/t5505-remote.sh +++ b/t/t5505-remote.sh @@ -1039,7 +1039,7 @@ test_expect_success 'migrate a remote from named file in $GIT_DIR/branches' ' ( cd six && git remote rm origin && - mkdir .git/branches && + mkdir -p .git/branches && echo "$origin_url#main" >.git/branches/origin && git remote rename origin origin && test_path_is_missing .git/branches/origin && @@ -1054,7 +1054,7 @@ test_expect_success 'migrate a remote from named file in $GIT_DIR/branches (2)' ( cd seven && git remote rm origin && - mkdir .git/branches && + mkdir -p .git/branches && echo "quux#foom" > .git/branches/origin && git remote rename origin origin && test_path_is_missing .git/branches/origin && diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh index 9d693eb57f7790..7827db70b53b06 100755 --- a/t/t5516-fetch-push.sh +++ b/t/t5516-fetch-push.sh @@ -979,7 +979,7 @@ test_expect_success 'fetch with branches' ' mk_empty testrepo && git branch second $the_first_commit && git checkout second && - mkdir testrepo/.git/branches && + mkdir -p testrepo/.git/branches && echo ".." > testrepo/.git/branches/branch1 && ( cd testrepo && @@ -993,7 +993,7 @@ test_expect_success 'fetch with branches' ' test_expect_success 'fetch with branches containing #' ' mk_empty testrepo && - mkdir testrepo/.git/branches && + mkdir -p testrepo/.git/branches && echo "..#second" > testrepo/.git/branches/branch2 && ( cd testrepo && @@ -1010,7 +1010,7 @@ test_expect_success 'push with branches' ' git checkout second && test_when_finished "rm -rf .git/branches" && - mkdir .git/branches && + mkdir -p .git/branches && echo "testrepo" > .git/branches/branch1 && git push branch1 && @@ -1026,7 +1026,7 @@ test_expect_success 'push with branches containing #' ' mk_empty testrepo && test_when_finished "rm -rf .git/branches" && - mkdir .git/branches && + mkdir -p .git/branches && echo "testrepo#branch3" > .git/branches/branch2 && git push branch2 && From e7122008b0161092016554979af38827aba80719 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 4 Mar 2020 21:55:28 +0100 Subject: [PATCH 093/297] http: use new "best effort" strategy for Secure Channel revoke checking MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The native Windows HTTPS backend is based on Secure Channel which lets the caller decide how to handle revocation checking problems caused by missing information in the certificate or offline CRL distribution points. Unfortunately, cURL chose to handle these problems differently than OpenSSL by default: while OpenSSL happily ignores those problems (essentially saying "¯\_(ツ)_/¯"), the Secure Channel backend will error out instead. As a remedy, the "no revoke" mode was introduced, which turns off revocation checking altogether. This is a bit heavy-handed. We support this via the `http.schannelCheckRevoke` setting. In https://github.com/curl/curl/pull/4981, we contributed an opt-in "best effort" strategy that emulates what OpenSSL seems to do. In Git for Windows, we actually want this to be the default. This patch makes it so, introducing it as a new value for the `http.schannelCheckRevoke" setting, which now becmes a tristate: it accepts the values "false", "true" or "best-effort" (defaulting to the last one). Signed-off-by: Johannes Schindelin --- Documentation/config/http.txt | 12 +++++++----- http.c | 26 ++++++++++++++++++++++---- 2 files changed, 29 insertions(+), 9 deletions(-) diff --git a/Documentation/config/http.txt b/Documentation/config/http.txt index 162b33fc52f967..101487cfa43a32 100644 --- a/Documentation/config/http.txt +++ b/Documentation/config/http.txt @@ -218,11 +218,13 @@ http.sslBackend:: http.schannelCheckRevoke:: Used to enforce or disable certificate revocation checks in cURL - when http.sslBackend is set to "schannel". Defaults to `true` if - unset. Only necessary to disable this if Git consistently errors - and the message is about checking the revocation status of a - certificate. This option is ignored if cURL lacks support for - setting the relevant SSL option at runtime. + when http.sslBackend is set to "schannel" via "true" and "false", + respectively. Another accepted value is "best-effort" (the default) + in which case revocation checks are performed, but errors due to + revocation list distribution points that are offline are silently + ignored, as well as errors due to certificates missing revocation + list distribution points. This option is ignored if cURL lacks + support for setting the relevant SSL option at runtime. http.schannelUseSSLCAInfo:: As of cURL v7.60.0, the Secure Channel backend can use the diff --git a/http.c b/http.c index 623ed234891f44..f9c55c5cacff37 100644 --- a/http.c +++ b/http.c @@ -147,7 +147,13 @@ static char *cached_accept_language; static char *http_ssl_backend; -static int http_schannel_check_revoke = 1; +static int http_schannel_check_revoke_mode = +#ifdef CURLSSLOPT_REVOKE_BEST_EFFORT + CURLSSLOPT_REVOKE_BEST_EFFORT; +#else + CURLSSLOPT_NO_REVOKE; +#endif + /* * With the backend being set to `schannel`, setting sslCAinfo would override * the Certificate Store in cURL v7.60.0 and later, which is not what we want @@ -422,7 +428,19 @@ static int http_options(const char *var, const char *value, } if (!strcmp("http.schannelcheckrevoke", var)) { - http_schannel_check_revoke = git_config_bool(var, value); + if (value && !strcmp(value, "best-effort")) { + http_schannel_check_revoke_mode = +#ifdef CURLSSLOPT_REVOKE_BEST_EFFORT + CURLSSLOPT_REVOKE_BEST_EFFORT; +#else + CURLSSLOPT_NO_REVOKE; + warning(_("%s=%s unsupported by current cURL"), + var, value); +#endif + } else + http_schannel_check_revoke_mode = + (git_config_bool(var, value) ? + 0 : CURLSSLOPT_NO_REVOKE); return 0; } @@ -1084,9 +1102,9 @@ static CURL *get_curl_handle(void) #endif if (http_ssl_backend && !strcmp("schannel", http_ssl_backend) && - !http_schannel_check_revoke) { + http_schannel_check_revoke_mode) { #ifdef GIT_CURL_HAVE_CURLSSLOPT_NO_REVOKE - curl_easy_setopt(result, CURLOPT_SSL_OPTIONS, CURLSSLOPT_NO_REVOKE); + curl_easy_setopt(result, CURLOPT_SSL_OPTIONS, http_schannel_check_revoke_mode); #else warning(_("CURLSSLOPT_NO_REVOKE not supported with cURL < 7.44.0")); #endif From df9e510af73ecc68726fa2dcf5d1c0a787dc691c Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sat, 9 May 2020 19:24:23 +0200 Subject: [PATCH 094/297] t5505/t5516: fix white-space around redirectors The convention in Git project's shell scripts is to have white-space _before_, but not _after_ the `>` (or `<`). Signed-off-by: Johannes Schindelin --- t/t5505-remote.sh | 6 +++--- t/t5516-fetch-push.sh | 10 +++++----- 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/t/t5505-remote.sh b/t/t5505-remote.sh index c3728f2065987e..57fdd3f9ab5a48 100755 --- a/t/t5505-remote.sh +++ b/t/t5505-remote.sh @@ -835,8 +835,8 @@ test_expect_success '"remote show" does not show symbolic refs' ' ( cd three && git remote show origin >output && - ! grep "^ *HEAD$" < output && - ! grep -i stale < output + ! grep "^ *HEAD$" .git/branches/origin && + echo "quux#foom" >.git/branches/origin && git remote rename origin origin && test_path_is_missing .git/branches/origin && test "$(git config remote.origin.url)" = "quux" && diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh index 7827db70b53b06..1fc37f2350aa3d 100755 --- a/t/t5516-fetch-push.sh +++ b/t/t5516-fetch-push.sh @@ -980,7 +980,7 @@ test_expect_success 'fetch with branches' ' git branch second $the_first_commit && git checkout second && mkdir -p testrepo/.git/branches && - echo ".." > testrepo/.git/branches/branch1 && + echo ".." >testrepo/.git/branches/branch1 && ( cd testrepo && git fetch branch1 && @@ -994,7 +994,7 @@ test_expect_success 'fetch with branches' ' test_expect_success 'fetch with branches containing #' ' mk_empty testrepo && mkdir -p testrepo/.git/branches && - echo "..#second" > testrepo/.git/branches/branch2 && + echo "..#second" >testrepo/.git/branches/branch2 && ( cd testrepo && git fetch branch2 && @@ -1011,7 +1011,7 @@ test_expect_success 'push with branches' ' test_when_finished "rm -rf .git/branches" && mkdir -p .git/branches && - echo "testrepo" > .git/branches/branch1 && + echo "testrepo" >.git/branches/branch1 && git push branch1 && ( @@ -1027,7 +1027,7 @@ test_expect_success 'push with branches containing #' ' test_when_finished "rm -rf .git/branches" && mkdir -p .git/branches && - echo "testrepo#branch3" > .git/branches/branch2 && + echo "testrepo#branch3" >.git/branches/branch2 && git push branch2 && ( @@ -1556,7 +1556,7 @@ EOF git init no-thin && git --git-dir=no-thin/.git config receive.unpacklimit 0 && git push no-thin/.git refs/heads/main:refs/heads/foo && - echo modified >> path1 && + echo modified >>path1 && git commit -am modified && git repack -adf && rcvpck="git receive-pack --reject-thin-pack-for-testing" && From 8972cd7cc90b2dd1293c433cdef7e71da40f7f15 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sat, 12 Sep 2015 12:25:47 +0200 Subject: [PATCH 095/297] t3701: verify that we can add *lots* of files interactively Signed-off-by: Johannes Schindelin --- t/t3701-add-interactive.sh | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh index 9a48933cecf5ae..947bd21c67a943 100755 --- a/t/t3701-add-interactive.sh +++ b/t/t3701-add-interactive.sh @@ -1119,6 +1119,27 @@ test_expect_success 'checkout -p patch editing of added file' ' ) ' +test_expect_success EXPENSIVE 'add -i with a lot of files' ' + git reset --hard && + x160=0123456789012345678901234567890123456789 && + x160=$x160$x160$x160$x160 && + y= && + i=0 && + while test $i -le 200 + do + name=$(printf "%s%03d" $x160 $i) && + echo $name >$name && + git add -N $name && + y="${y}y$LF" && + i=$(($i+1)) || + exit 1 + done && + echo "$y" | git add -p -- . && + git diff --cached >staged && + test_line_count = 1407 staged && + git reset --hard +' + test_expect_success 'show help from add--helper' ' git reset --hard && cat >expect <<-EOF && From 90d2072fde2f5c65cac8aaa82cedc2c20e14013c Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Thu, 2 Jul 2020 16:35:05 +0200 Subject: [PATCH 096/297] git add -i: handle CR/LF line endings in the interactive input As of Git for Windows v2.27.0, there is an option to use Windows' newly-introduced Pseudo Console support. When running an interactive add operation with this support enabled, Git will receive CR/LF line endings. Therefore, let's not pretend that we are expecting Unix line endings. This fixes https://github.com/git-for-windows/git/issues/2729 Signed-off-by: Johannes Schindelin --- prompt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/prompt.c b/prompt.c index 8935fe4dfb9414..cfce7e15a091a7 100644 --- a/prompt.c +++ b/prompt.c @@ -78,7 +78,7 @@ int git_read_line_interactively(struct strbuf *line) int ret; fflush(stdout); - ret = strbuf_getline_lf(line, stdin); + ret = strbuf_getline(line, stdin); if (ret != EOF) strbuf_trim_trailing_newline(line); From 2dbe355ca6d6eca7ffef984dd1d3624b1bf831cf Mon Sep 17 00:00:00 2001 From: Luke Bonanomi Date: Wed, 24 Jun 2020 07:45:52 -0400 Subject: [PATCH 097/297] commit: accept "scissors" with CR/LF line endings This change enhances `git commit --cleanup=scissors` by detecting scissors lines ending in either LF (UNIX-style) or CR/LF (DOS-style). Regression tests are included to specifically test for trailing comments after a CR/LF-terminated scissors line. Signed-off-by: Luke Bonanomi Signed-off-by: Johannes Schindelin --- t/t7502-commit-porcelain.sh | 42 +++++++++++++++++++++++++++++++++++++ wt-status.c | 13 +++++++++--- 2 files changed, 52 insertions(+), 3 deletions(-) diff --git a/t/t7502-commit-porcelain.sh b/t/t7502-commit-porcelain.sh index b37e2018a74a7b..c38b96b66cd20a 100755 --- a/t/t7502-commit-porcelain.sh +++ b/t/t7502-commit-porcelain.sh @@ -623,6 +623,48 @@ test_expect_success 'cleanup commit messages (scissors option,-F,-e, scissors on test_must_be_empty actual ' +test_expect_success 'helper-editor' ' + + write_script lf-to-crlf.sh <<-\EOF + sed "s/\$/Q/" <"$1" | tr Q "\\015" >"$1".new && + mv -f "$1".new "$1" + EOF +' + +test_expect_success 'cleanup commit messages (scissors option,-F,-e, CR/LF line endings)' ' + + test_config core.editor "\"$PWD/lf-to-crlf.sh\"" && + scissors="# ------------------------ >8 ------------------------" && + + test_write_lines >text \ + "# Keep this comment" "" " $scissors" \ + "# Keep this comment, too" "$scissors" \ + "# Remove this comment" "$scissors" \ + "Remove this comment, too" && + + test_write_lines >expect \ + "# Keep this comment" "" " $scissors" \ + "# Keep this comment, too" && + + git commit --cleanup=scissors -e -F text --allow-empty && + git cat-file -p HEAD >raw && + sed -e "1,/^\$/d" raw >actual && + test_cmp expect actual +' + +test_expect_success 'cleanup commit messages (scissors option,-F,-e, scissors on first line, CR/LF line endings)' ' + + scissors="# ------------------------ >8 ------------------------" && + test_write_lines >text \ + "$scissors" \ + "# Remove this comment and any following lines" && + cp text /tmp/test2-text && + git commit --cleanup=scissors -e -F text --allow-empty --allow-empty-message && + git cat-file -p HEAD >raw && + sed -e "1,/^\$/d" raw >actual && + test_must_be_empty actual +' + test_expect_success 'cleanup commit messages (strip option,-F)' ' echo >>negative && diff --git a/wt-status.c b/wt-status.c index b778eef989d5a8..e9b09675822440 100644 --- a/wt-status.c +++ b/wt-status.c @@ -38,7 +38,7 @@ #define UF_DELAY_WARNING_IN_MS (2 * 1000) static const char cut_line[] = -"------------------------ >8 ------------------------\n"; +"------------------------ >8 ------------------------"; static char default_wt_status_colors[][COLOR_MAXLEN] = { GIT_COLOR_NORMAL, /* WT_STATUS_HEADER */ @@ -1090,15 +1090,22 @@ static void wt_longstatus_print_other(struct wt_status *s, status_printf_ln(s, GIT_COLOR_NORMAL, "%s", ""); } +static inline int starts_with_newline(const char *p) +{ + return *p == '\n' || (*p == '\r' && p[1] == '\n'); +} + size_t wt_status_locate_end(const char *s, size_t len) { const char *p; struct strbuf pattern = STRBUF_INIT; strbuf_addf(&pattern, "\n%s %s", comment_line_str, cut_line); - if (starts_with(s, pattern.buf + 1)) + if (starts_with(s, pattern.buf + 1) && + starts_with_newline(s + pattern.len - 1)) len = 0; - else if ((p = strstr(s, pattern.buf))) { + else if ((p = strstr(s, pattern.buf)) && + starts_with_newline(p + pattern.len)) { size_t newlen = p - s + 1; if (newlen < len) len = newlen; From d42d78a374857895dd2dd148d534be06eda2f53e Mon Sep 17 00:00:00 2001 From: Jens Glathe Date: Tue, 2 Jun 2020 12:12:25 +0200 Subject: [PATCH 098/297] t0014: fix indentation For some reason, this test case was indented with 4 spaces instead of 1 horizontal tab. The other test cases in the same test script are fine. Signed-off-by: Jens Glathe Signed-off-by: Johannes Schindelin --- t/t0014-alias.sh | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/t/t0014-alias.sh b/t/t0014-alias.sh index 854d59ec58c25a..30708146887d19 100755 --- a/t/t0014-alias.sh +++ b/t/t0014-alias.sh @@ -38,10 +38,10 @@ test_expect_success 'looping aliases - internal execution' ' #' test_expect_success 'run-command formats empty args properly' ' - test_must_fail env GIT_TRACE=1 git frotz a "" b " " c 2>actual.raw && - sed -ne "/run_command:/s/.*trace: run_command: //p" actual.raw >actual && - echo "git-frotz a '\'''\'' b '\'' '\'' c" >expect && - test_cmp expect actual + test_must_fail env GIT_TRACE=1 git frotz a "" b " " c 2>actual.raw && + sed -ne "/run_command:/s/.*trace: run_command: //p" actual.raw >actual && + echo "git-frotz a '\'''\'' b '\'' '\'' c" >expect && + test_cmp expect actual ' test_expect_success 'tracing a shell alias with arguments shows trace of prepared command' ' From 7b2e2b9f4d70fb39c9cd54611a117f661fab64ad Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 12 Aug 2020 15:06:17 +0000 Subject: [PATCH 099/297] git-gui: accommodate for intent-to-add files As of Git v2.28.0, the diff for files staged via `git add -N` marks them as new files. Git GUI was ill-prepared for that, and this patch teaches Git GUI about them. Please note that this will not even fix things with v2.28.0, as the `rp/apply-cached-with-i-t-a` patches are required on Git's side, too. This fixes https://github.com/git-for-windows/git/issues/2779 Signed-off-by: Johannes Schindelin Signed-off-by: Pratyush Yadav --- git-gui/git-gui.sh | 2 ++ git-gui/lib/diff.tcl | 12 ++++++++---- 2 files changed, 10 insertions(+), 4 deletions(-) diff --git a/git-gui/git-gui.sh b/git-gui/git-gui.sh index 8fe7538e72084d..9f2ea858edb594 100755 --- a/git-gui/git-gui.sh +++ b/git-gui/git-gui.sh @@ -2079,6 +2079,7 @@ set all_icons(U$ui_index) file_merge set all_icons(T$ui_index) file_statechange set all_icons(_$ui_workdir) file_plain +set all_icons(A$ui_workdir) file_plain set all_icons(M$ui_workdir) file_mod set all_icons(D$ui_workdir) file_question set all_icons(U$ui_workdir) file_merge @@ -2105,6 +2106,7 @@ foreach i { {A_ {mc "Staged for commit"}} {AM {mc "Portions staged for commit"}} {AD {mc "Staged for commit, missing"}} + {AA {mc "Intended to be added"}} {_D {mc "Missing"}} {D_ {mc "Staged for removal"}} diff --git a/git-gui/lib/diff.tcl b/git-gui/lib/diff.tcl index 871ad488c2a1c0..36d3715f7b25a2 100644 --- a/git-gui/lib/diff.tcl +++ b/git-gui/lib/diff.tcl @@ -582,7 +582,8 @@ proc apply_or_revert_hunk {x y revert} { if {$current_diff_side eq $ui_index} { set failed_msg [mc "Failed to unstage selected hunk."] lappend apply_cmd --reverse --cached - if {[string index $mi 0] ne {M}} { + set file_state [string index $mi 0] + if {$file_state ne {M} && $file_state ne {A}} { unlock_index return } @@ -595,7 +596,8 @@ proc apply_or_revert_hunk {x y revert} { lappend apply_cmd --cached } - if {[string index $mi 1] ne {M}} { + set file_state [string index $mi 1] + if {$file_state ne {M} && $file_state ne {A}} { unlock_index return } @@ -687,7 +689,8 @@ proc apply_or_revert_range_or_line {x y revert} { set failed_msg [mc "Failed to unstage selected line."] set to_context {+} lappend apply_cmd --reverse --cached - if {[string index $mi 0] ne {M}} { + set file_state [string index $mi 0] + if {$file_state ne {M} && $file_state ne {A}} { unlock_index return } @@ -702,7 +705,8 @@ proc apply_or_revert_range_or_line {x y revert} { lappend apply_cmd --cached } - if {[string index $mi 1] ne {M}} { + set file_state [string index $mi 1] + if {$file_state ne {M} && $file_state ne {A}} { unlock_index return } From c2deb3c0ca5676a289ca7a108b5b5e8f007becd8 Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Tue, 30 Mar 2021 14:25:31 -0400 Subject: [PATCH 100/297] clink.pl: fix libexpatd.lib link error when using MSVC When building with `make MSVC=1 DEBUG=1`, link to `libexpatd.lib` rather than `libexpat.lib`. It appears that the `vcpkg` package for "libexpat" has changed and now creates `libexpatd.lib` for debug mode builds. Previously, both debug and release builds created a ".lib" with the same basename. Signed-off-by: Jeff Hostetler --- compat/vcbuild/scripts/clink.pl | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/compat/vcbuild/scripts/clink.pl b/compat/vcbuild/scripts/clink.pl index 3bd824154be381..2768ae15f1879f 100755 --- a/compat/vcbuild/scripts/clink.pl +++ b/compat/vcbuild/scripts/clink.pl @@ -66,7 +66,11 @@ } push(@args, $lib); } elsif ("$arg" eq "-lexpat") { + if ($is_debug) { + push(@args, "libexpatd.lib"); + } else { push(@args, "libexpat.lib"); + } } elsif ("$arg" =~ /^-L/ && "$arg" ne "-LTCG") { $arg =~ s/^-L/-LIBPATH:/; push(@lflags, $arg); From af45d36dc2b4cbfb9e0b092e17d8c0b2ffbd96cc Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Mon, 5 Apr 2021 15:27:38 -0400 Subject: [PATCH 101/297] Makefile: clean up .ilk files when MSVC=1 Signed-off-by: Jeff Hostetler --- Makefile | 3 +++ 1 file changed, 3 insertions(+) diff --git a/Makefile b/Makefile index d6479092a0b8df..51ec3ae131873d 100644 --- a/Makefile +++ b/Makefile @@ -3734,12 +3734,15 @@ ifdef MSVC $(RM) $(patsubst %.o,%.o.pdb,$(OBJECTS)) $(RM) headless-git.o.pdb $(RM) $(patsubst %.exe,%.pdb,$(OTHER_PROGRAMS)) + $(RM) $(patsubst %.exe,%.ilk,$(OTHER_PROGRAMS)) $(RM) $(patsubst %.exe,%.iobj,$(OTHER_PROGRAMS)) $(RM) $(patsubst %.exe,%.ipdb,$(OTHER_PROGRAMS)) $(RM) $(patsubst %.exe,%.pdb,$(PROGRAMS)) + $(RM) $(patsubst %.exe,%.ilk,$(PROGRAMS)) $(RM) $(patsubst %.exe,%.iobj,$(PROGRAMS)) $(RM) $(patsubst %.exe,%.ipdb,$(PROGRAMS)) $(RM) $(patsubst %.exe,%.pdb,$(TEST_PROGRAMS)) + $(RM) $(patsubst %.exe,%.ilk,$(TEST_PROGRAMS)) $(RM) $(patsubst %.exe,%.iobj,$(TEST_PROGRAMS)) $(RM) $(patsubst %.exe,%.ipdb,$(TEST_PROGRAMS)) $(RM) compat/vcbuild/MSVC-DEFS-GEN From 96c1f694d4431f89849fa4db57766c07fa1a70ba Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Mon, 5 Apr 2021 14:08:22 -0400 Subject: [PATCH 102/297] vcbuild: add support for compiling Windows resource files Create a wrapper for the Windows Resource Compiler (RC.EXE) for use by the MSVC=1 builds. This is similar to the CL.EXE and LIB.EXE wrappers used for the MSVC=1 builds. Signed-off-by: Jeff Hostetler --- compat/vcbuild/find_vs_env.bat | 7 ++++++ compat/vcbuild/scripts/rc.pl | 46 ++++++++++++++++++++++++++++++++++ config.mak.uname | 3 ++- 3 files changed, 55 insertions(+), 1 deletion(-) create mode 100644 compat/vcbuild/scripts/rc.pl diff --git a/compat/vcbuild/find_vs_env.bat b/compat/vcbuild/find_vs_env.bat index b35d264c0e6bed..379b16296e09c2 100644 --- a/compat/vcbuild/find_vs_env.bat +++ b/compat/vcbuild/find_vs_env.bat @@ -99,6 +99,7 @@ REM ================================================================ SET sdk_dir=%WindowsSdkDir% SET sdk_ver=%WindowsSDKVersion% + SET sdk_ver_bin_dir=%WindowsSdkVerBinPath%%tgt% SET si=%sdk_dir%Include\%sdk_ver% SET sdk_includes=-I"%si%ucrt" -I"%si%um" -I"%si%shared" SET sl=%sdk_dir%lib\%sdk_ver% @@ -130,6 +131,7 @@ REM ================================================================ SET sdk_dir=%WindowsSdkDir% SET sdk_ver=%WindowsSDKVersion% + SET sdk_ver_bin_dir=%WindowsSdkVerBinPath%bin\amd64 SET si=%sdk_dir%Include\%sdk_ver% SET sdk_includes=-I"%si%ucrt" -I"%si%um" -I"%si%shared" -I"%si%winrt" SET sl=%sdk_dir%lib\%sdk_ver% @@ -160,6 +162,11 @@ REM ================================================================ echo msvc_includes=%msvc_includes% echo msvc_libs=%msvc_libs% + echo sdk_ver_bin_dir=%sdk_ver_bin_dir% + SET X1=%sdk_ver_bin_dir:C:=/C% + SET X2=%X1:\=/% + echo sdk_ver_bin_dir_msys=%X2% + echo sdk_includes=%sdk_includes% echo sdk_libs=%sdk_libs% diff --git a/compat/vcbuild/scripts/rc.pl b/compat/vcbuild/scripts/rc.pl new file mode 100644 index 00000000000000..7bca4cd81c6c63 --- /dev/null +++ b/compat/vcbuild/scripts/rc.pl @@ -0,0 +1,46 @@ +#!/usr/bin/perl -w +###################################################################### +# Compile Resources on Windows +# +# This is a wrapper to facilitate the compilation of Git with MSVC +# using GNU Make as the build system. So, instead of manipulating the +# Makefile into something nasty, just to support non-space arguments +# etc, we use this wrapper to fix the command line options +# +###################################################################### +use strict; +my @args = (); +my @input = (); + +while (@ARGV) { + my $arg = shift @ARGV; + if ("$arg" =~ /^-[dD]/) { + # GIT_VERSION gets passed with too many + # layers of dquote escaping. + $arg =~ s/\\"/"/g; + + push(@args, $arg); + + } elsif ("$arg" eq "-i") { + my $arg = shift @ARGV; + # TODO complain if NULL or is dashed ?? + push(@input, $arg); + + } elsif ("$arg" eq "-o") { + my $arg = shift @ARGV; + # TODO complain if NULL or is dashed ?? + push(@args, "-fo$arg"); + + } else { + push(@args, $arg); + } +} + +push(@args, "-nologo"); +push(@args, "-v"); +push(@args, @input); + +unshift(@args, "rc.exe"); +printf("**** @args\n"); + +exit (system(@args) != 0); diff --git a/config.mak.uname b/config.mak.uname index e17fa72a73063e..172c04305d831c 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -440,7 +440,7 @@ ifeq ($(uname_S),Windows) # link.exe next to, and required by, cl.exe, we have to prepend this # onto the existing $PATH. # - SANE_TOOL_PATH ?= $(msvc_bin_dir_msys) + SANE_TOOL_PATH ?= $(msvc_bin_dir_msys):$(sdk_ver_bin_dir_msys) HAVE_ALLOCA_H = YesPlease NO_PREAD = YesPlease NEEDS_CRYPTO_WITH_SSL = YesPlease @@ -508,6 +508,7 @@ endif # See https://msdn.microsoft.com/en-us/library/ms235330.aspx EXTLIBS = user32.lib advapi32.lib shell32.lib wininet.lib ws2_32.lib invalidcontinue.obj kernel32.lib ntdll.lib PTHREAD_LIBS = + RC = compat/vcbuild/scripts/rc.pl lib = BASIC_CFLAGS += $(vcpkg_inc) $(sdk_includes) $(msvc_includes) ifndef DEBUG From d62f09ee013aa62de46a314c363d8e1ac00ce667 Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Mon, 5 Apr 2021 14:12:14 -0400 Subject: [PATCH 103/297] config.mak.uname: add git.rc to MSVC builds Teach MSVC=1 builds to depend on the `git.rc` file so that the resulting executables have Windows-style resources and version number information within them. Signed-off-by: Jeff Hostetler --- config.mak.uname | 1 + 1 file changed, 1 insertion(+) diff --git a/config.mak.uname b/config.mak.uname index 172c04305d831c..a2341e344c7724 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -507,6 +507,7 @@ endif # handle twice, or to access the osfhandle of an already-closed stdout # See https://msdn.microsoft.com/en-us/library/ms235330.aspx EXTLIBS = user32.lib advapi32.lib shell32.lib wininet.lib ws2_32.lib invalidcontinue.obj kernel32.lib ntdll.lib + GITLIBS += git.res PTHREAD_LIBS = RC = compat/vcbuild/scripts/rc.pl lib = From 236daaa2fa484b61f6fa25eb0c74941dde940569 Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Mon, 5 Apr 2021 14:24:52 -0400 Subject: [PATCH 104/297] clink.pl: ignore no-stack-protector arg on MSVC=1 builds Ignore the `-fno-stack-protector` compiler argument when building with MSVC. This will be used in a later commit that needs to build a Win32 GUI app. Signed-off-by: Jeff Hostetler --- compat/vcbuild/scripts/clink.pl | 2 ++ 1 file changed, 2 insertions(+) diff --git a/compat/vcbuild/scripts/clink.pl b/compat/vcbuild/scripts/clink.pl index 2768ae15f1879f..73c8a2b184f38b 100755 --- a/compat/vcbuild/scripts/clink.pl +++ b/compat/vcbuild/scripts/clink.pl @@ -122,6 +122,8 @@ push(@cflags, "-wd4996"); } elsif ("$arg" =~ /^-W[a-z]/) { # let's ignore those + } elsif ("$arg" eq "-fno-stack-protector") { + # eat this } else { push(@args, $arg); } From 5723dd0c4dd49a11b720923f33ce06a5dfec9c20 Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Mon, 5 Apr 2021 14:39:33 -0400 Subject: [PATCH 105/297] clink.pl: move default linker options for MSVC=1 builds Move the default `-ENTRY` and `-SUBSYSTEM` arguments for MSVC=1 builds from `config.mak.uname` into `clink.pl`. These args are constant for console-mode executables. Add support to `clink.pl` for generating a Win32 GUI application using the `-mwindows` argument (to match how GCC does it). This changes the `-ENTRY` and `-SUBSYSTEM` arguments accordingly. Signed-off-by: Jeff Hostetler --- compat/vcbuild/scripts/clink.pl | 11 +++++++++++ config.mak.uname | 2 +- 2 files changed, 12 insertions(+), 1 deletion(-) diff --git a/compat/vcbuild/scripts/clink.pl b/compat/vcbuild/scripts/clink.pl index 73c8a2b184f38b..a38b360015ece9 100755 --- a/compat/vcbuild/scripts/clink.pl +++ b/compat/vcbuild/scripts/clink.pl @@ -15,6 +15,7 @@ my @lflags = (); my $is_linking = 0; my $is_debug = 0; +my $is_gui = 0; while (@ARGV) { my $arg = shift @ARGV; if ("$arg" eq "-DDEBUG") { @@ -124,11 +125,21 @@ # let's ignore those } elsif ("$arg" eq "-fno-stack-protector") { # eat this + } elsif ("$arg" eq "-mwindows") { + $is_gui = 1; } else { push(@args, $arg); } } if ($is_linking) { + if ($is_gui) { + push(@args, "-ENTRY:wWinMainCRTStartup"); + push(@args, "-SUBSYSTEM:WINDOWS"); + } else { + push(@args, "-ENTRY:wmainCRTStartup"); + push(@args, "-SUBSYSTEM:CONSOLE"); + } + push(@args, @lflags); unshift(@args, "link.exe"); } else { diff --git a/config.mak.uname b/config.mak.uname index a2341e344c7724..b722c2b80e919d 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -502,7 +502,7 @@ endif compat/win32/trace2_win32_process_info.o \ compat/win32/dirent.o COMPAT_CFLAGS = -D__USE_MINGW_ACCESS -DDETECT_MSYS_TTY -DENSURE_MSYSTEM_IS_SET -DNOGDI -DHAVE_STRING_H -Icompat -Icompat/regex -Icompat/win32 -DSTRIP_EXTENSION=\".exe\" - BASIC_LDFLAGS = -IGNORE:4217 -IGNORE:4049 -NOLOGO -ENTRY:wmainCRTStartup -SUBSYSTEM:CONSOLE + BASIC_LDFLAGS = -IGNORE:4217 -IGNORE:4049 -NOLOGO # invalidcontinue.obj allows Git's source code to close the same file # handle twice, or to access the osfhandle of an already-closed stdout # See https://msdn.microsoft.com/en-us/library/ms235330.aspx From 1941190e4f736067cf970d1f870b48b2735d5eb0 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 7 Apr 2021 15:29:21 +0200 Subject: [PATCH 106/297] buildsystems: remove duplicate clause This seems to have been there since 259d87c35495 (Add scripts to generate projects for other buildsystems (MSVC vcproj, QMake), 2009-09-16), i.e. since the beginning of that file. Signed-off-by: Johannes Schindelin --- contrib/buildsystems/engine.pl | 1 - 1 file changed, 1 deletion(-) diff --git a/contrib/buildsystems/engine.pl b/contrib/buildsystems/engine.pl index 069be7e4befcd7..4243784e8a6b6e 100755 --- a/contrib/buildsystems/engine.pl +++ b/contrib/buildsystems/engine.pl @@ -265,7 +265,6 @@ sub handleCompileLine shift @parts; } elsif ("$part" eq "-c") { # ignore compile flag - } elsif ("$part" eq "-c") { } elsif ($part =~ /^.?-I/) { push(@incpaths, $part); } elsif ($part =~ /^.?-D/) { From 68dab01540eb6d1a177a1758344cee87ad65bb9c Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 7 Apr 2021 15:15:08 +0200 Subject: [PATCH 107/297] vcxproj: handle resource files, too On Windows, we also compile a "resource" file, which is similar to source code, but contains metadata (such as the program version). So far, we did not compile it in `MSVC` mode, only when compiling Git for Windows with the GNU C Compiler. In preparation for including it also when compiling with MS Visual C, let's teach our `vcxproj` generator to handle those sort of files, too. Signed-off-by: Johannes Schindelin --- contrib/buildsystems/Generators/Vcxproj.pm | 17 ++++++++++++++++- contrib/buildsystems/engine.pl | 9 +++++---- 2 files changed, 21 insertions(+), 5 deletions(-) diff --git a/contrib/buildsystems/Generators/Vcxproj.pm b/contrib/buildsystems/Generators/Vcxproj.pm index 0439b82f55e243..5e97f2ff99d344 100644 --- a/contrib/buildsystems/Generators/Vcxproj.pm +++ b/contrib/buildsystems/Generators/Vcxproj.pm @@ -89,6 +89,9 @@ sub createProject { $defines =~ s/>/>/g; $defines =~ s/\'//g; + my $rcdefines = $defines; + $rcdefines =~ s/(?WIN32;_DEBUG;$defines;%(PreprocessorDefinitions) MultiThreadedDebugDLL + + WIN32;_DEBUG;$rcdefines;%(PreprocessorDefinitions) + true @@ -216,6 +222,9 @@ EOM true Speed + + WIN32;NDEBUG;$rcdefines;%(PreprocessorDefinitions) + true true @@ -225,9 +234,15 @@ EOM EOM foreach(@sources) { - print F << "EOM"; + if (/\.rc$/) { + print F << "EOM"; + +EOM + } else { + print F << "EOM"; EOM + } } print F << "EOM"; diff --git a/contrib/buildsystems/engine.pl b/contrib/buildsystems/engine.pl index 4243784e8a6b6e..82849812212efc 100755 --- a/contrib/buildsystems/engine.pl +++ b/contrib/buildsystems/engine.pl @@ -165,7 +165,7 @@ sub parseMakeOutput next; } - if($text =~ / -c /) { + if($text =~ / -c / || $text =~ / -i \S+\.rc /) { # compilation handleCompileLine($text, $line); @@ -263,7 +263,7 @@ sub handleCompileLine if ("$part" eq "-o") { # ignore object file shift @parts; - } elsif ("$part" eq "-c") { + } elsif ("$part" eq "-c" || "$part" eq "-i") { # ignore compile flag } elsif ($part =~ /^.?-I/) { push(@incpaths, $part); @@ -271,7 +271,7 @@ sub handleCompileLine push(@defines, $part); } elsif ($part =~ /^-/) { push(@cflags, $part); - } elsif ($part =~ /\.(c|cc|cpp)$/) { + } elsif ($part =~ /\.(c|cc|cpp|rc)$/) { $sourcefile = $part; } else { die "Unhandled compiler option @ line $lineno: $part"; @@ -358,7 +358,7 @@ sub handleLinkLine push(@libs, $part); } elsif ($part eq 'invalidcontinue.obj') { # ignore - known to MSVC - } elsif ($part =~ /\.o$/) { + } elsif ($part =~ /\.(o|res)$/) { push(@objfiles, $part); } elsif ($part =~ /\.obj$/) { # do nothing, 'make' should not be producing .obj, only .o files @@ -372,6 +372,7 @@ sub handleLinkLine my $sourcefile = $_; $sourcefile =~ s/^headless-git\.o$/compat\/win32\/headless.c/; $sourcefile =~ s/\.o$/.c/; + $sourcefile =~ s/\.res$/.rc/; push(@sources, $sourcefile); push(@cflags, @{$compile_options{"${sourcefile}_CFLAGS"}}); push(@defines, @{$compile_options{"${sourcefile}_DEFINES"}}); From 7c3c91c762da6c278ccaa7302e9f93244d7d6fb5 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 7 Apr 2021 21:57:31 +0200 Subject: [PATCH 108/297] vcxproj: ignore -fno-stack-protector and -fno-common An upcoming commit will introduce those compile options; MSVC does not understand them, so let's suppress them when generating the Visual Studio project files. Signed-off-by: Johannes Schindelin --- contrib/buildsystems/engine.pl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/contrib/buildsystems/engine.pl b/contrib/buildsystems/engine.pl index 82849812212efc..417ae71d44ccab 100755 --- a/contrib/buildsystems/engine.pl +++ b/contrib/buildsystems/engine.pl @@ -263,7 +263,7 @@ sub handleCompileLine if ("$part" eq "-o") { # ignore object file shift @parts; - } elsif ("$part" eq "-c" || "$part" eq "-i") { + } elsif ("$part" eq "-c" || "$part" eq "-i" || "$part" =~ /^-fno-/) { # ignore compile flag } elsif ($part =~ /^.?-I/) { push(@incpaths, $part); From 5c194dac505e51fe4d7b2667ff01bf6fbe24b607 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 7 Apr 2021 15:48:50 +0200 Subject: [PATCH 109/297] vcxproj: handle GUI programs, too So far, we only built Console programs, but we are about to introduce a program that targets the Windows subsystem (i.e. it is a so-called "GUI" program). Let's handle this preemptively in the script that generates the Visual Studio files. Signed-off-by: Johannes Schindelin --- contrib/buildsystems/Generators/Vcxproj.pm | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/contrib/buildsystems/Generators/Vcxproj.pm b/contrib/buildsystems/Generators/Vcxproj.pm index 5e97f2ff99d344..a6d1c6b8d05682 100644 --- a/contrib/buildsystems/Generators/Vcxproj.pm +++ b/contrib/buildsystems/Generators/Vcxproj.pm @@ -92,6 +92,13 @@ sub createProject { my $rcdefines = $defines; $rcdefines =~ s/(?\$(VCPKGLibDirectory);%(AdditionalLibraryDirectories) \$(VCPKGLibs);\$(AdditionalDependencies) invalidcontinue.obj %(AdditionalOptions) - wmainCRTStartup + $entrypoint $cdup\\compat\\win32\\git.manifest - Console + $subsystem EOM if ($target eq 'libgit') { From f537112ba34e5ebf341d6f4338a66ebbd0ab8ccc Mon Sep 17 00:00:00 2001 From: Ian Bearman Date: Fri, 31 Jan 2020 15:37:27 -0800 Subject: [PATCH 110/297] vcxproj: support building Windows/ARM64 binaries Signed-off-by: Ian Bearman Signed-off-by: Dennis Ameling Signed-off-by: Johannes Schindelin --- contrib/buildsystems/Generators/Vcxproj.pm | 23 ++++++++++++++++++++-- 1 file changed, 21 insertions(+), 2 deletions(-) diff --git a/contrib/buildsystems/Generators/Vcxproj.pm b/contrib/buildsystems/Generators/Vcxproj.pm index b2e68a16715e39..f9db773fdbb6cb 100644 --- a/contrib/buildsystems/Generators/Vcxproj.pm +++ b/contrib/buildsystems/Generators/Vcxproj.pm @@ -114,12 +114,21 @@ sub createProject { Release x64 + + Debug + ARM64 + + + Release + ARM64 + $uuid Win32Proj x86-windows - x64-windows + x64-windows + arm64-windows $cdup\\compat\\vcbuild\\vcpkg\\installed\\\$(VCPKGArch) \$(VCPKGArchDirectory)\\debug\\bin \$(VCPKGArchDirectory)\\debug\\lib @@ -140,7 +149,7 @@ sub createProject { $config_type - v140 + v142 ..\\ @@ -355,8 +364,10 @@ sub createGlueProject { print F << "EOM"; Global GlobalSection(SolutionConfigurationPlatforms) = preSolution + Debug|ARM64 = Debug|ARM64 Debug|x64 = Debug|x64 Debug|x86 = Debug|x86 + Release|ARM64 = Release|ARM64 Release|x64 = Release|x64 Release|x86 = Release|x86 EndGlobalSection @@ -367,10 +378,14 @@ EOM foreach (@apps) { my $appname = $_; my $uuid = $build_structure{"APPS_${appname}_GUID"}; + print F "\t\t${uuid}.Debug|ARM64.ActiveCfg = Debug|ARM64\n"; + print F "\t\t${uuid}.Debug|ARM64.Build.0 = Debug|ARM64\n"; print F "\t\t${uuid}.Debug|x64.ActiveCfg = Debug|x64\n"; print F "\t\t${uuid}.Debug|x64.Build.0 = Debug|x64\n"; print F "\t\t${uuid}.Debug|x86.ActiveCfg = Debug|Win32\n"; print F "\t\t${uuid}.Debug|x86.Build.0 = Debug|Win32\n"; + print F "\t\t${uuid}.Release|ARM64.ActiveCfg = Release|ARM64\n"; + print F "\t\t${uuid}.Release|ARM64.Build.0 = Release|ARM64\n"; print F "\t\t${uuid}.Release|x64.ActiveCfg = Release|x64\n"; print F "\t\t${uuid}.Release|x64.Build.0 = Release|x64\n"; print F "\t\t${uuid}.Release|x86.ActiveCfg = Release|Win32\n"; @@ -379,10 +394,14 @@ EOM foreach (@libs) { my $libname = $_; my $uuid = $build_structure{"LIBS_${libname}_GUID"}; + print F "\t\t${uuid}.Debug|ARM64.ActiveCfg = Debug|ARM64\n"; + print F "\t\t${uuid}.Debug|ARM64.Build.0 = Debug|ARM64\n"; print F "\t\t${uuid}.Debug|x64.ActiveCfg = Debug|x64\n"; print F "\t\t${uuid}.Debug|x64.Build.0 = Debug|x64\n"; print F "\t\t${uuid}.Debug|x86.ActiveCfg = Debug|Win32\n"; print F "\t\t${uuid}.Debug|x86.Build.0 = Debug|Win32\n"; + print F "\t\t${uuid}.Release|ARM64.ActiveCfg = Release|ARM64\n"; + print F "\t\t${uuid}.Release|ARM64.Build.0 = Release|ARM64\n"; print F "\t\t${uuid}.Release|x64.ActiveCfg = Release|x64\n"; print F "\t\t${uuid}.Release|x64.Build.0 = Release|x64\n"; print F "\t\t${uuid}.Release|x86.ActiveCfg = Release|Win32\n"; From 4c775c04622ca270c03e37df0c98559849b2b0b9 Mon Sep 17 00:00:00 2001 From: Ian Bearman Date: Fri, 31 Jan 2020 16:00:25 -0800 Subject: [PATCH 111/297] vcbuild: install ARM64 dependencies when building ARM64 binaries Co-authored-by: Dennis Ameling Signed-off-by: Ian Bearman Signed-off-by: Dennis Ameling Signed-off-by: Johannes Schindelin --- compat/vcbuild/README | 6 +++++- compat/vcbuild/vcpkg_copy_dlls.bat | 7 ++++++- compat/vcbuild/vcpkg_install.bat | 9 +++++++-- contrib/buildsystems/Generators/Vcxproj.pm | 2 +- 4 files changed, 19 insertions(+), 5 deletions(-) diff --git a/compat/vcbuild/README b/compat/vcbuild/README index 29ec1d0f104b80..1df1cabb1ebbbd 100644 --- a/compat/vcbuild/README +++ b/compat/vcbuild/README @@ -6,7 +6,11 @@ The Steps to Build Git with VS2015 or VS2017 from the command line. Prompt or from an SDK bash window: $ cd - $ ./compat/vcbuild/vcpkg_install.bat + $ ./compat/vcbuild/vcpkg_install.bat x64-windows + + or + + $ ./compat/vcbuild/vcpkg_install.bat arm64-windows The vcpkg tools and all of the third-party sources will be installed in this folder: diff --git a/compat/vcbuild/vcpkg_copy_dlls.bat b/compat/vcbuild/vcpkg_copy_dlls.bat index 13661c14f8705c..8bea0cbf83b6cf 100644 --- a/compat/vcbuild/vcpkg_copy_dlls.bat +++ b/compat/vcbuild/vcpkg_copy_dlls.bat @@ -15,7 +15,12 @@ REM ================================================================ @FOR /F "delims=" %%D IN ("%~dp0") DO @SET cwd=%%~fD cd %cwd% - SET arch=x64-windows + SET arch=%2 + IF NOT DEFINED arch ( + echo defaulting to 'x64-windows`. Invoke %0 with 'x86-windows', 'x64-windows', or 'arm64-windows' + set arch=x64-windows + ) + SET inst=%cwd%vcpkg\installed\%arch% IF [%1]==[release] ( diff --git a/compat/vcbuild/vcpkg_install.bat b/compat/vcbuild/vcpkg_install.bat index 8330d8120fb511..cacef18c11dc79 100644 --- a/compat/vcbuild/vcpkg_install.bat +++ b/compat/vcbuild/vcpkg_install.bat @@ -31,6 +31,12 @@ REM ================================================================ SETLOCAL EnableDelayedExpansion + SET arch=%1 + IF NOT DEFINED arch ( + echo defaulting to 'x64-windows`. Invoke %0 with 'x86-windows', 'x64-windows', or 'arm64-windows' + set arch=x64-windows + ) + @FOR /F "delims=" %%D IN ("%~dp0") DO @SET cwd=%%~fD cd %cwd% @@ -55,9 +61,8 @@ REM ================================================================ echo Successfully installed %cwd%vcpkg\vcpkg.exe :install_libraries - SET arch=x64-windows - echo Installing third-party libraries... + echo Installing third-party libraries(%arch%)... FOR %%i IN (zlib expat libiconv openssl libssh2 curl) DO ( cd %cwd%vcpkg IF NOT EXIST "packages\%%i_%arch%" CALL :sub__install_one %%i diff --git a/contrib/buildsystems/Generators/Vcxproj.pm b/contrib/buildsystems/Generators/Vcxproj.pm index f9db773fdbb6cb..dc32493f91bf7f 100644 --- a/contrib/buildsystems/Generators/Vcxproj.pm +++ b/contrib/buildsystems/Generators/Vcxproj.pm @@ -193,7 +193,7 @@ EOM Initialize VCPKG del "$cdup\\compat\\vcbuild\\vcpkg" - call "$cdup\\compat\\vcbuild\\vcpkg_install.bat" + call "$cdup\\compat\\vcbuild\\vcpkg_install.bat" \$(VCPKGArch) EOM } From 8456f8c3d2288124dd06bd01232225f2eaa76531 Mon Sep 17 00:00:00 2001 From: Ian Bearman Date: Tue, 4 Feb 2020 10:34:40 -0800 Subject: [PATCH 112/297] vcbuild: add an option to install individual 'features' In this context, a "feature" is a dependency combined with its own dependencies. Signed-off-by: Ian Bearman Signed-off-by: Johannes Schindelin --- compat/vcbuild/vcpkg_install.bat | 35 +++++++++++++++++++++++++++++++- 1 file changed, 34 insertions(+), 1 deletion(-) diff --git a/compat/vcbuild/vcpkg_install.bat b/compat/vcbuild/vcpkg_install.bat index cacef18c11dc79..8da212487ae97d 100644 --- a/compat/vcbuild/vcpkg_install.bat +++ b/compat/vcbuild/vcpkg_install.bat @@ -85,14 +85,47 @@ REM ================================================================ :sub__install_one echo Installing package %1... + call :%1_features + REM vcpkg may not be reliable on slow, intermittent or proxy REM connections, see e.g. REM https://social.msdn.microsoft.com/Forums/windowsdesktop/en-US/4a8f7be5-5e15-4213-a7bb-ddf424a954e6/winhttpsendrequest-ends-with-12002-errorhttptimeout-after-21-seconds-no-matter-what-timeout?forum=windowssdk REM which explains the hidden 21 second timeout REM (last post by Dave : Microsoft - Windows Networking team) - .\vcpkg.exe install %1:%arch% + .\vcpkg.exe install %1%features%:%arch% IF ERRORLEVEL 1 ( EXIT /B 1 ) echo Finished %1 goto :EOF + +:: +:: features for each vcpkg to install +:: there should be an entry here for each package to install +:: 'set features=' means use the default otherwise +:: 'set features=[comma-delimited-feature-set]' is the syntax +:: + +:zlib_features +set features= +goto :EOF + +:expat_features +set features= +goto :EOF + +:libiconv_features +set features= +goto :EOF + +:openssl_features +set features= +goto :EOF + +:libssh2_features +set features= +goto :EOF + +:curl_features +set features=[core,openssl] +goto :EOF From fb77fca476dd33d33e2071a2b75c4a2c53633b76 Mon Sep 17 00:00:00 2001 From: Dennis Ameling Date: Fri, 4 Dec 2020 14:11:34 +0100 Subject: [PATCH 113/297] cmake: allow building for Windows/ARM64 Signed-off-by: Dennis Ameling Signed-off-by: Johannes Schindelin --- contrib/buildsystems/CMakeLists.txt | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt index 832f46b316b764..fb1abbea219bde 100644 --- a/contrib/buildsystems/CMakeLists.txt +++ b/contrib/buildsystems/CMakeLists.txt @@ -65,9 +65,9 @@ if(USE_VCPKG) set(VCPKG_DIR "${CMAKE_SOURCE_DIR}/compat/vcbuild/vcpkg") if(NOT EXISTS ${VCPKG_DIR}) message("Initializing vcpkg and building the Git's dependencies (this will take a while...)") - execute_process(COMMAND ${CMAKE_SOURCE_DIR}/compat/vcbuild/vcpkg_install.bat) + execute_process(COMMAND ${CMAKE_SOURCE_DIR}/compat/vcbuild/vcpkg_install.bat ${VCPKG_ARCH}) endif() - list(APPEND CMAKE_PREFIX_PATH "${VCPKG_DIR}/installed/x64-windows") + list(APPEND CMAKE_PREFIX_PATH "${VCPKG_DIR}/installed/${VCPKG_ARCH}") # In the vcpkg edition, we need this to be able to link to libcurl set(CURL_NO_CURL_CMAKE ON) @@ -1106,7 +1106,7 @@ file(APPEND ${CMAKE_BINARY_DIR}/GIT-BUILD-OPTIONS "RUNTIME_PREFIX='${RUNTIME_PRE file(APPEND ${CMAKE_BINARY_DIR}/GIT-BUILD-OPTIONS "NO_PYTHON='${NO_PYTHON}'\n") file(APPEND ${CMAKE_BINARY_DIR}/GIT-BUILD-OPTIONS "SUPPORTS_SIMPLE_IPC='${SUPPORTS_SIMPLE_IPC}'\n") if(USE_VCPKG) - file(APPEND ${CMAKE_BINARY_DIR}/GIT-BUILD-OPTIONS "PATH=\"$PATH:$TEST_DIRECTORY/../compat/vcbuild/vcpkg/installed/x64-windows/bin\"\n") + file(APPEND ${CMAKE_BINARY_DIR}/GIT-BUILD-OPTIONS "PATH=\"$PATH:$TEST_DIRECTORY/../compat/vcbuild/vcpkg/installed/${VCPKG_ARCH}/bin\"\n") endif() #Make the tests work when building out of the source tree From bfde5e9cc1878e70d1fee0b9509f68e2d83885d8 Mon Sep 17 00:00:00 2001 From: Philip Oakley Date: Sun, 6 Oct 2019 18:40:55 +0100 Subject: [PATCH 114/297] vcpkg_install: detect lack of Git The vcpkg_install batch file depends on the availability of a working Git on the CMD path. This may not be present if the user has selected the 'bash only' option during Git-for-Windows install. Detect and tell the user about their lack of a working Git in the CMD window. Fixes #2348. A separate PR https://github.com/git-for-windows/build-extra/pull/258 now highlights the recommended path setting during install. Signed-off-by: Philip Oakley --- compat/vcbuild/vcpkg_install.bat | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/compat/vcbuild/vcpkg_install.bat b/compat/vcbuild/vcpkg_install.bat index ebd0bad242a8ca..bcbbf536af3141 100644 --- a/compat/vcbuild/vcpkg_install.bat +++ b/compat/vcbuild/vcpkg_install.bat @@ -36,6 +36,13 @@ REM ================================================================ dir vcpkg\vcpkg.exe >nul 2>nul && GOTO :install_libraries + git.exe version 2>nul + IF ERRORLEVEL 1 ( + echo "***" + echo "Git not found. Please adjust your CMD path or Git install option." + echo "***" + EXIT /B 1 ) + echo Fetching vcpkg in %cwd%vcpkg git.exe clone https://github.com/Microsoft/vcpkg vcpkg IF ERRORLEVEL 1 ( EXIT /B 1 ) From ce1f802bec33e9ccb45958d992609c5bfe5678eb Mon Sep 17 00:00:00 2001 From: Dennis Ameling Date: Sun, 29 Nov 2020 00:12:26 +0100 Subject: [PATCH 115/297] ci(vs-build) also build Windows/ARM64 artifacts There are no Windows/ARM64 agents in GitHub Actions yet, therefore we just skip adjusting the `vs-test` job for now. Signed-off-by: Dennis Ameling Signed-off-by: Johannes Schindelin --- .github/workflows/main.yml | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index 1ee0433acc14a2..183fb5d7d90db1 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -169,8 +169,11 @@ jobs: NO_PERL: 1 GIT_CONFIG_PARAMETERS: "'user.name=CI' 'user.email=ci@git'" runs-on: windows-latest + strategy: + matrix: + arch: [x64, arm64] concurrency: - group: vs-build-${{ github.ref }} + group: vs-build-${{ github.ref }}-${{ matrix.arch }} cancel-in-progress: ${{ needs.ci-config.outputs.skip_concurrent == 'yes' }} steps: - uses: actions/checkout@v4 @@ -189,14 +192,14 @@ jobs: uses: microsoft/setup-msbuild@v2 - name: copy dlls to root shell: cmd - run: compat\vcbuild\vcpkg_copy_dlls.bat release + run: compat\vcbuild\vcpkg_copy_dlls.bat release ${{ matrix.arch }}-windows - name: generate Visual Studio solution shell: bash run: | - cmake `pwd`/contrib/buildsystems/ -DCMAKE_PREFIX_PATH=`pwd`/compat/vcbuild/vcpkg/installed/x64-windows \ - -DNO_GETTEXT=YesPlease -DPERL_TESTS=OFF -DPYTHON_TESTS=OFF -DCURL_NO_CURL_CMAKE=ON + cmake `pwd`/contrib/buildsystems/ -DCMAKE_PREFIX_PATH=`pwd`/compat/vcbuild/vcpkg/installed/${{ matrix.arch }}-windows \ + -DNO_GETTEXT=YesPlease -DPERL_TESTS=OFF -DPYTHON_TESTS=OFF -DCURL_NO_CURL_CMAKE=ON -DCMAKE_GENERATOR_PLATFORM=${{ matrix.arch }} -DVCPKG_ARCH=${{ matrix.arch }}-windows - name: MSBuild - run: msbuild git.sln -property:Configuration=Release -property:Platform=x64 -maxCpuCount:4 -property:PlatformToolset=v142 + run: msbuild git.sln -property:Configuration=Release -property:Platform=${{ matrix.arch }} -maxCpuCount:4 -property:PlatformToolset=v142 - name: bundle artifact tar shell: bash env: @@ -210,7 +213,7 @@ jobs: - name: upload tracked files and build artifacts uses: actions/upload-artifact@v4 with: - name: vs-artifacts + name: vs-artifacts-${{ matrix.arch }} path: artifacts vs-test: name: win+VS test @@ -228,7 +231,7 @@ jobs: - name: download tracked files and build artifacts uses: actions/download-artifact@v4 with: - name: vs-artifacts + name: vs-artifacts-x64 path: ${{github.workspace}} - name: extract tracked files and build artifacts shell: bash From 0b15fede8f4cdbd33eff7cd7ba759fadb34d698a Mon Sep 17 00:00:00 2001 From: Yuyi Wang Date: Sat, 11 Mar 2023 17:51:18 +0800 Subject: [PATCH 116/297] cmake: install headless-git. headless-git is a git executable without opening a console window. It is useful when other GUI executables want to call git. We should install it together with git on Windows. Signed-off-by: Yuyi Wang --- contrib/buildsystems/CMakeLists.txt | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt index 832f46b316b764..5775c1c7bbdfad 100644 --- a/contrib/buildsystems/CMakeLists.txt +++ b/contrib/buildsystems/CMakeLists.txt @@ -740,6 +740,7 @@ if(WIN32) endif() add_executable(headless-git ${CMAKE_SOURCE_DIR}/compat/win32/headless.c) + list(APPEND PROGRAMS_BUILT headless-git) if(CMAKE_C_COMPILER_ID STREQUAL "GNU" OR CMAKE_C_COMPILER_ID STREQUAL "Clang") target_link_options(headless-git PUBLIC -municode -Wl,-subsystem,windows) elseif(CMAKE_C_COMPILER_ID STREQUAL "MSVC") @@ -919,7 +920,7 @@ list(TRANSFORM git_perl_scripts PREPEND "${CMAKE_BINARY_DIR}/") #install foreach(program ${PROGRAMS_BUILT}) -if(program MATCHES "^(git|git-shell|scalar)$") +if(program MATCHES "^(git|git-shell|headless-git|scalar)$") install(TARGETS ${program} RUNTIME DESTINATION bin) else() From 7175027087ef6a5e210d3926f8a925e23dbf1e03 Mon Sep 17 00:00:00 2001 From: Philip Oakley Date: Sun, 6 Oct 2019 18:43:57 +0100 Subject: [PATCH 117/297] vcpkg_install: add comment regarding slow network connections The vcpkg downloads may not succeed. Warn careful readers of the time out. A simple retry will usually resolve the issue. Signed-off-by: Philip Oakley Signed-off-by: Johannes Schindelin --- compat/vcbuild/vcpkg_install.bat | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/compat/vcbuild/vcpkg_install.bat b/compat/vcbuild/vcpkg_install.bat index bcbbf536af3141..8330d8120fb511 100644 --- a/compat/vcbuild/vcpkg_install.bat +++ b/compat/vcbuild/vcpkg_install.bat @@ -80,6 +80,12 @@ REM ================================================================ :sub__install_one echo Installing package %1... + REM vcpkg may not be reliable on slow, intermittent or proxy + REM connections, see e.g. + REM https://social.msdn.microsoft.com/Forums/windowsdesktop/en-US/4a8f7be5-5e15-4213-a7bb-ddf424a954e6/winhttpsendrequest-ends-with-12002-errorhttptimeout-after-21-seconds-no-matter-what-timeout?forum=windowssdk + REM which explains the hidden 21 second timeout + REM (last post by Dave : Microsoft - Windows Networking team) + .\vcpkg.exe install %1:%arch% IF ERRORLEVEL 1 ( EXIT /B 1 ) From b5ae0a0bd9d751097bb4709e259ac12feb445a96 Mon Sep 17 00:00:00 2001 From: Dennis Ameling Date: Sun, 6 Dec 2020 18:39:26 +0100 Subject: [PATCH 118/297] Add schannel to curl installation Signed-off-by: Dennis Ameling --- compat/vcbuild/vcpkg_install.bat | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/compat/vcbuild/vcpkg_install.bat b/compat/vcbuild/vcpkg_install.bat index 8da212487ae97d..575c65c20ba307 100644 --- a/compat/vcbuild/vcpkg_install.bat +++ b/compat/vcbuild/vcpkg_install.bat @@ -127,5 +127,5 @@ set features= goto :EOF :curl_features -set features=[core,openssl] +set features=[core,openssl,schannel] goto :EOF From 7ea5028c167a7322b35946e00219173fa9768682 Mon Sep 17 00:00:00 2001 From: Dennis Ameling Date: Mon, 19 Jul 2021 13:02:16 +0200 Subject: [PATCH 119/297] cmake(): allow setting HOST_CPU for cross-compilation Git's regular Makefile mentions that HOST_CPU should be defined when cross-compiling Git: https://github.com/git-for-windows/git/blob/37796bca76ef4180c39ee508ca3e42c0777ba444/Makefile#L438-L439 This is then used to set the GIT_HOST_CPU variable when compiling Git: https://github.com/git-for-windows/git/blob/37796bca76ef4180c39ee508ca3e42c0777ba444/Makefile#L1337-L1341 Then, when the user runs `git version --build-options`, it returns that value: https://github.com/git-for-windows/git/blob/37796bca76ef4180c39ee508ca3e42c0777ba444/help.c#L658 This commit adds the same functionality to the CMake configuration. Users can now set -DHOST_CPU= to set the target architecture. Signed-off-by: Dennis Ameling --- .github/workflows/main.yml | 2 +- contrib/buildsystems/CMakeLists.txt | 9 ++++++++- 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index 183fb5d7d90db1..8e29c351720ab8 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -197,7 +197,7 @@ jobs: shell: bash run: | cmake `pwd`/contrib/buildsystems/ -DCMAKE_PREFIX_PATH=`pwd`/compat/vcbuild/vcpkg/installed/${{ matrix.arch }}-windows \ - -DNO_GETTEXT=YesPlease -DPERL_TESTS=OFF -DPYTHON_TESTS=OFF -DCURL_NO_CURL_CMAKE=ON -DCMAKE_GENERATOR_PLATFORM=${{ matrix.arch }} -DVCPKG_ARCH=${{ matrix.arch }}-windows + -DNO_GETTEXT=YesPlease -DPERL_TESTS=OFF -DPYTHON_TESTS=OFF -DCURL_NO_CURL_CMAKE=ON -DCMAKE_GENERATOR_PLATFORM=${{ matrix.arch }} -DVCPKG_ARCH=${{ matrix.arch }}-windows -DHOST_CPU=${{ matrix.arch }} - name: MSBuild run: msbuild git.sln -property:Configuration=Release -property:Platform=${{ matrix.arch }} -maxCpuCount:4 -property:PlatformToolset=v142 - name: bundle artifact tar diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt index fb1abbea219bde..d3d14501b437e2 100644 --- a/contrib/buildsystems/CMakeLists.txt +++ b/contrib/buildsystems/CMakeLists.txt @@ -223,7 +223,14 @@ endif() #default behaviour include_directories(${CMAKE_SOURCE_DIR}) -add_compile_definitions(GIT_HOST_CPU="${CMAKE_SYSTEM_PROCESSOR}") + +# When cross-compiling, define HOST_CPU as the canonical name of the CPU on +# which the built Git will run (for instance "x86_64"). +if(NOT HOST_CPU) + add_compile_definitions(GIT_HOST_CPU="${CMAKE_SYSTEM_PROCESSOR}") +else() + add_compile_definitions(GIT_HOST_CPU="${HOST_CPU}") +endif() add_compile_definitions(SHA256_BLK INTERNAL_QSORT RUNTIME_PREFIX) add_compile_definitions(NO_OPENSSL SHA1_DC SHA1DC_NO_STANDARD_INCLUDES SHA1DC_INIT_SAFE_HASH_DEFAULT=0 From c038faf3ed15fc95a5a183f20c196f858b5f5ebf Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 2 Apr 2021 22:50:54 +0200 Subject: [PATCH 120/297] mingw: allow for longer paths in `parse_interpreter()` MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit As reported in https://github.com/newren/git-filter-repo/pull/225, it looks like 99 bytes is not really sufficient to represent e.g. the full path to Python when installed via Windows Store (and this path is used in the hasb bang line when installing scripts via `pip`). Let's increase it to what is probably the maximum sensible path size: MAX_PATH. This makes `parse_interpreter()` in line with what `lookup_prog()` handles. Signed-off-by: Johannes Schindelin Signed-off-by: Vilius Šumskas --- compat/mingw.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/compat/mingw.c b/compat/mingw.c index 29d3f09768c29a..afb6ccb95ef92a 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -1269,7 +1269,7 @@ static const char *quote_arg_msys2(const char *arg) static const char *parse_interpreter(const char *cmd) { - static char buf[100]; + static char buf[MAX_PATH]; char *p, *opt; int n, fd; From e52d8ed5684b5a7863b4647d90e3b0d5923da191 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 17 May 2021 10:46:52 +0200 Subject: [PATCH 121/297] compat/vcbuild: document preferred way to build in Visual Studio We used to have that `make vcxproj` hack, but a hack it is. In the meantime, we have a much cleaner solution: using CMake, either explicitly, or even more conveniently via Visual Studio's built-in CMake support (simply open Git's top-level directory via File>Open>Folder...). Let's let the `README` reflect this. Signed-off-by: Johannes Schindelin --- compat/vcbuild/README | 28 +++++++++------------------- 1 file changed, 9 insertions(+), 19 deletions(-) diff --git a/compat/vcbuild/README b/compat/vcbuild/README index 29ec1d0f104b80..5c71ea2daa4017 100644 --- a/compat/vcbuild/README +++ b/compat/vcbuild/README @@ -37,27 +37,17 @@ The Steps to Build Git with VS2015 or VS2017 from the command line. ================================================================ -Alternatively, run `make vcxproj` and then load the generated `git.sln` in -Visual Studio. The initial build will install the vcpkg system and build the +Alternatively, just open Git's top-level directory in Visual Studio, via +`File>Open>Folder...`. This will use CMake internally to generate the +project definitions. It will also install the vcpkg system and build the dependencies automatically. This will take a while. -Instead of generating the `git.sln` file yourself (which requires a full Git -for Windows SDK), you may want to consider fetching the `vs/master` branch of -https://github.com/git-for-windows/git instead (which is updated automatically -via CI running `make vcxproj`). The `vs/master` branch does not require a Git -for Windows to build, but you can run the test scripts in a regular Git Bash. - -Note that `make vcxproj` will automatically add and commit the generated `.sln` -and `.vcxproj` files to the repo. This is necessary to allow building a -fully-testable Git in Visual Studio, where a regular Git Bash can be used to -run the test scripts (as opposed to a full Git for Windows SDK): a number of -build targets, such as Git commands implemented as Unix shell scripts (where -`@@SHELL_PATH@@` and other placeholders are interpolated) require a full-blown -Git for Windows SDK (which is about 10x the size of a regular Git for Windows -installation). - -If your plan is to open a Pull Request with Git for Windows, it is a good idea -to drop this commit before submitting. +You can also generate the Visual Studio solution manually by downloading +and running CMake explicitly rather than letting Visual Studio doing +that implicitly. + +Another, deprecated option is to run `make vcxproj`. This option is +superseded by the CMake-based build, and will be removed at some point. ================================================================ The Steps of Build Git with VS2008 From 866758a6a12539ac7d769dcb0f9e018ae590307b Mon Sep 17 00:00:00 2001 From: Pascal Muller Date: Wed, 23 Jun 2021 21:21:10 +0200 Subject: [PATCH 122/297] http: optionally send SSL client certificate This adds support for a new http.sslAutoClientCert config value. In cURL 7.77 or later the schannel backend does not automatically send client certificates from the Windows Certificate Store anymore. This config value is only used if http.sslBackend is set to "schannel", and can be used to opt in to the old behavior and force cURL to send client certificates. This fixes https://github.com/git-for-windows/git/issues/3292 Signed-off-by: Pascal Muller --- Documentation/config/http.txt | 5 +++++ git-curl-compat.h | 8 ++++++++ http.c | 26 ++++++++++++++++++++++---- 3 files changed, 35 insertions(+), 4 deletions(-) diff --git a/Documentation/config/http.txt b/Documentation/config/http.txt index 101487cfa43a32..dcd3dd90e34013 100644 --- a/Documentation/config/http.txt +++ b/Documentation/config/http.txt @@ -234,6 +234,11 @@ http.schannelUseSSLCAInfo:: when the `schannel` backend was configured via `http.sslBackend`, unless `http.schannelUseSSLCAInfo` overrides this behavior. +http.sslAutoClientCert:: + As of cURL v7.77.0, the Secure Channel backend won't automatically + send client certificates from the Windows Certificate Store anymore. + To opt in to the old behavior, http.sslAutoClientCert can be set. + http.pinnedPubkey:: Public key of the https service. It may either be the filename of a PEM or DER encoded public key file or a string starting with diff --git a/git-curl-compat.h b/git-curl-compat.h index e1d0bdd273501f..d4755194ae0940 100644 --- a/git-curl-compat.h +++ b/git-curl-compat.h @@ -143,4 +143,12 @@ #define GIT_CURL_HAVE_CURLOPT_PROTOCOLS_STR 1 #endif +/** + * CURLSSLOPT_AUTO_CLIENT_CERT was added in 7.77.0, released in May + * 2021. + */ +#if LIBCURL_VERSION_NUM >= 0x074d00 +#define GIT_CURL_HAVE_CURLSSLOPT_AUTO_CLIENT_CERT +#endif + #endif diff --git a/http.c b/http.c index f9c55c5cacff37..c89f96cebab17b 100644 --- a/http.c +++ b/http.c @@ -161,6 +161,8 @@ static int http_schannel_check_revoke_mode = */ static int http_schannel_use_ssl_cainfo; +static int http_auto_client_cert; + static int always_auth_proactively(void) { return http_proactive_auth != PROACTIVE_AUTH_NONE && @@ -449,6 +451,11 @@ static int http_options(const char *var, const char *value, return 0; } + if (!strcmp("http.sslautoclientcert", var)) { + http_auto_client_cert = git_config_bool(var, value); + return 0; + } + if (!strcmp("http.minsessions", var)) { min_curl_sessions = git_config_int(var, value, ctx->kvi); if (min_curl_sessions > 1) @@ -1101,13 +1108,24 @@ static CURL *get_curl_handle(void) } #endif - if (http_ssl_backend && !strcmp("schannel", http_ssl_backend) && - http_schannel_check_revoke_mode) { + if (http_ssl_backend && !strcmp("schannel", http_ssl_backend)) { + long ssl_options = 0; + if (http_schannel_check_revoke_mode) { #ifdef GIT_CURL_HAVE_CURLSSLOPT_NO_REVOKE - curl_easy_setopt(result, CURLOPT_SSL_OPTIONS, http_schannel_check_revoke_mode); + ssl_options |= http_schannel_check_revoke_mode; #else - warning(_("CURLSSLOPT_NO_REVOKE not supported with cURL < 7.44.0")); + warning(_("CURLSSLOPT_NO_REVOKE not supported with cURL < 7.44.0")); #endif + } + + if (http_auto_client_cert) { +#ifdef GIT_CURL_HAVE_CURLSSLOPT_AUTO_CLIENT_CERT + ssl_options |= CURLSSLOPT_AUTO_CLIENT_CERT; +#endif + } + + if (ssl_options) + curl_easy_setopt(result, CURLOPT_SSL_OPTIONS, ssl_options); } if (http_proactive_auth != PROACTIVE_AUTH_NONE) From 1a39b65765d2fbf5bad51932ac298c2b69d2c712 Mon Sep 17 00:00:00 2001 From: Philip Oakley Date: Fri, 2 Jul 2021 00:30:24 +0100 Subject: [PATCH 123/297] CMake: default Visual Studio generator has changed Correct some wording and inform users regarding the Visual Studio changes (from V16.6) to the default generator. Subsequent commits ensure that Git for Windows can be directly opened in modern Visual Studio without needing special configuration of the CMakeLists settings. It appeares that internally Visual Studio creates it's own version of the .sln file (etc.) for extension tools that expect them. The large number of references below document the shifting of Visual Studio default and CMake setting options. refs: https://docs.microsoft.com/en-us/search/?scope=C%2B%2B&view=msvc-150&terms=Ninja 1. https://docs.microsoft.com/en-us/cpp/linux/cmake-linux-configure?view=msvc-160 (note the linux bit) "In Visual Studio 2019 version 16.6 or later ***, Ninja is the default generator for configurations targeting a remote system or WSL. For more information, see this post on the C++ Team Blog [https://devblogs.microsoft.com/cppblog/linux-development-with-visual-studio-first-class-support-for-gdbserver-improved-build-times-with-ninja-and-updates-to-the-connection-manager/]. For more information about these settings, see CMakeSettings.json reference [https://docs.microsoft.com/en-us/cpp/build/cmakesettings-reference?view=msvc-160]." 2. https://docs.microsoft.com/en-us/cpp/build/cmake-presets-vs?view=msvc-160 "CMake supports two files that allow users to specify common configure, build, and test options and share them with others: CMakePresets.json and CMakeUserPresets.json." " Both files are supported in Visual Studio 2019 version 16.10 or later. ***" 3. https://devblogs.microsoft.com/cppblog/linux-development-with-visual-studio-first-class-support-for-gdbserver-improved-build-times-with-ninja-and-updates-to-the-connection-manager/ " Ninja has been the default generator (underlying build system) for CMake configurations targeting Windows for some time***, but in Visual Studio 2019 version 16.6 Preview 3*** we added support for Ninja on Linux." 4. https://docs.microsoft.com/en-us/cpp/build/cmakesettings-reference?view=msvc-160 " `generator`: specifies CMake generator to use for this configuration. May be one of: Visual Studio 2019 only: Visual Studio 16 2019 Visual Studio 16 2019 Win64 Visual Studio 16 2019 ARM Visual Studio 2017 and later: Visual Studio 15 2017 Visual Studio 15 2017 Win64 Visual Studio 15 2017 ARM Visual Studio 14 2015 Visual Studio 14 2015 Win64 Visual Studio 14 2015 ARM Unix Makefiles Ninja Because Ninja is designed for fast build speeds instead of flexibility and function, it is set as the default. However, some CMake projects may be unable to correctly build using Ninja. If this occurs, you can instruct CMake to generate Visual Studio projects instead. To specify a Visual Studio generator in Visual Studio 2017, open the settings editor from the main menu by choosing CMake | Change CMake Settings. Delete "Ninja" and type "V". This activates IntelliSense, which enables you to choose the generator you want." "To specify a Visual Studio generator in Visual Studio 2019, right-click on the CMakeLists.txt file in Solution Explorer and choose CMake Settings for project > Show Advanced Settings > CMake Generator. When the active configuration specifies a Visual Studio generator, by default MSBuild.exe is invoked with` -m -v:minimal` arguments." 5. https://docs.microsoft.com/en-us/cpp/build/cmake-presets-vs?view=msvc-160#enable-cmakepresetsjson-integration-in-visual-studio-2019 "Enable CMakePresets.json integration in Visual Studio 2019 CMakePresets.json integration isn't enabled by default in Visual Studio 2019. You can enable it for all CMake projects in Tools > Options > CMake > General: (tick a box)" ... see more. 6. https://docs.microsoft.com/en-us/cpp/build/cmakesettings-reference?view=msvc-140 (whichever v140 is..) "CMake projects are supported in Visual Studio 2017 and later." 7. https://docs.microsoft.com/en-us/cpp/overview/what-s-new-for-cpp-2017?view=msvc-150 "Support added for the CMake Ninja generator." 8. https://docs.microsoft.com/en-us/cpp/overview/what-s-new-for-cpp-2017?view=msvc-150#cmake-support-via-open-folder "CMake support via Open Folder Visual Studio 2017 introduces support for using CMake projects without converting to MSBuild project files (.vcxproj). For more information, see CMake projects in Visual Studio[https://docs.microsoft.com/en-us/cpp/build/cmake-projects-in-visual-studio?view=msvc-150]. Opening CMake projects with Open Folder automatically configures the environment for C++ editing, building, and debugging." ... +more! 9. https://docs.microsoft.com/en-us/cpp/build/cmake-presets-vs?view=msvc-160#supported-cmake-and-cmakepresetsjson-versions "Visual Studio reads and evaluates CMakePresets.json and CMakeUserPresets.json itself and doesn't invoke CMake directly with the --preset option. So, CMake version 3.20 or later isn't strictly required when you're building with CMakePresets.json inside Visual Studio. We recommend using CMake version 3.14 or later." 10. https://docs.microsoft.com/en-us/cpp/build/cmake-presets-vs?view=msvc-160#enable-cmakepresetsjson-integration-in-visual-studio-2019 "If you don't want to enable CMakePresets.json integration for all CMake projects, you can enable CMakePresets.json integration for a single CMake project by adding a CMakePresets.json file to the root of the open folder. You must close and reopen the folder in Visual Studio to activate the integration. 11. https://docs.microsoft.com/en-us/cpp/build/cmake-presets-vs?view=msvc-160#default-configure-presets ***(doesn't actually say which version..) "Default Configure Presets If no CMakePresets.json or CMakeUserPresets.json file exists, or if CMakePresets.json or CMakeUserPresets.json is invalid, Visual Studio will fall back*** on the following default Configure Presets: Windows example JSON { "name": "windows-default", "displayName": "Windows x64 Debug", "description": "Sets Ninja generator, compilers, x64 architecture, build and install directory, debug build type", "generator": "Ninja", "binaryDir": "${sourceDir}/out/build/${presetName}", "architecture": { "value": "x64", "strategy": "external" }, "cacheVariables": { "CMAKE_BUILD_TYPE": "Debug", "CMAKE_INSTALL_PREFIX": "${sourceDir}/out/install/${presetName}" }, "vendor": { "microsoft.com/VisualStudioSettings/CMake/1.0": { "hostOS": [ "Windows" ] } } }, " Signed-off-by: Philip Oakley --- contrib/buildsystems/CMakeLists.txt | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt index d3d14501b437e2..03e63ec4feec7f 100644 --- a/contrib/buildsystems/CMakeLists.txt +++ b/contrib/buildsystems/CMakeLists.txt @@ -14,6 +14,11 @@ Note: Visual Studio also has the option of opening `CMakeLists.txt` directly; Using this option, Visual Studio will not find the source code, though, therefore the `File>Open>Folder...` option is preferred. +Visual Studio does not produce a .sln solution file nor the .vcxproj files +that may be required by VS extension tools. + +To generate the .sln/.vcxproj files run CMake manually, as described below. + Instructions to run CMake manually: mkdir -p contrib/buildsystems/out @@ -22,7 +27,7 @@ Instructions to run CMake manually: This will build the git binaries in contrib/buildsystems/out directory (our top-level .gitignore file knows to ignore contents of -this directory). +this directory). The project .sln and .vcxproj files are also generated. Possible build configurations(-DCMAKE_BUILD_TYPE) with corresponding compiler flags @@ -35,17 +40,16 @@ empty(default) : NOTE: -DCMAKE_BUILD_TYPE is optional. For multi-config generators like Visual Studio this option is ignored -This process generates a Makefile(Linux/*BSD/MacOS) , Visual Studio solution(Windows) by default. +This process generates a Makefile(Linux/*BSD/MacOS), Visual Studio solution(Windows) by default. Run `make` to build Git on Linux/*BSD/MacOS. Open git.sln on Windows and build Git. -NOTE: By default CMake uses Makefile as the build tool on Linux and Visual Studio in Windows, -to use another tool say `ninja` add this to the command line when configuring. -`-G Ninja` - NOTE: By default CMake will install vcpkg locally to your source tree on configuration, to avoid this, add `-DNO_VCPKG=TRUE` to the command line when configuring. +The Visual Studio default generator changed in v16.6 from its Visual Studio +implemenation to `Ninja` This required changes to many CMake scripts. + ]] cmake_minimum_required(VERSION 3.14) From d99334cab11296bdc137d5e2c45fe9f0c9c579fe Mon Sep 17 00:00:00 2001 From: Philip Oakley Date: Sat, 24 Apr 2021 11:09:58 +0100 Subject: [PATCH 124/297] .gitignore: add Visual Studio CMakeSetting.json file The CMakeSettings.json file is tool generated. Developers may track it should they provide additional settings. Signed-off-by: Philip Oakley --- .gitignore | 1 + 1 file changed, 1 insertion(+) diff --git a/.gitignore b/.gitignore index 8caf3700c2305d..bf97276163b19b 100644 --- a/.gitignore +++ b/.gitignore @@ -248,3 +248,4 @@ Release/ /git.VC.db *.dSYM /contrib/buildsystems/out +CMakeSettings.json From 2819b4f892324abd2204b4b278f2537c275ac869 Mon Sep 17 00:00:00 2001 From: Victoria Dye Date: Thu, 5 Aug 2021 19:04:13 -0400 Subject: [PATCH 125/297] subtree: update `contrib/subtree` `test` target The intention of this change is to align with how the top-level git `Makefile` defines its own test target (which also internally calls `$(MAKE) -C t/ all`). This change also ensures the consistency of `make -C contrib/subtree test` with other testing in CI executions (which rely on `$DEFAULT_TEST_TARGET` being defined as `prove`). Signed-off-by: Victoria Dye --- contrib/subtree/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/contrib/subtree/Makefile b/contrib/subtree/Makefile index 6fa7496bfdb3fd..6f6e90c4cb49b6 100644 --- a/contrib/subtree/Makefile +++ b/contrib/subtree/Makefile @@ -94,7 +94,7 @@ $(GIT_SUBTREE_TEST): $(GIT_SUBTREE) cp $< $@ test: $(GIT_SUBTREE_TEST) - $(MAKE) -C t/ test + $(MAKE) -C t/ all clean: $(RM) $(GIT_SUBTREE) From 80c9b3fffe0c5926825602ceb74481092b0644d5 Mon Sep 17 00:00:00 2001 From: Philip Oakley Date: Thu, 22 Apr 2021 11:11:38 +0100 Subject: [PATCH 126/297] CMakeLists: add default "x64-windows" arch for Visual Studio In Git-for-Windows, work on using ARM64 has progressed. The commit 2d94b77b27 (cmake: allow building for Windows/ARM64, 2020-12-04) failed to notice that /compat/vcbuild/vcpkg_install.bat will default to using the "x64-windows" architecture for the vcpkg installation if not set, but CMake is not told of this default. Commit 635b6d99b3 (vcbuild: install ARM64 dependencies when building ARM64 binaries, 2020-01-31) later updated vcpkg_install.bat to accept an arch (%1) parameter, but retained the default. This default is neccessary for the use case where the project directory is opened directly in Visual Studio, which will find and build a CMakeLists.txt file without any parameters, thus expecting use of the default setting. Also Visual studio will generate internal .sln solution and .vcxproj project files needed for some extension tools. Inform users of the additional .sln/.vcxproj generation. ** How to test: rm -rf '.vs' # remove old visual studio settings rm -rf 'compat/vcbuild/vcpkg' # remove any vcpkg downloads rm -rf 'contrib/buildsystems/out' # remove builds & CMake artifacts with a fresh Visual Studio Community Edition, File>>Open>>(git *folder*) to load the project (which will take some time!). check for successful compilation. The implicit .sln (etc.) are in the hidden .vs directory created by Visual Studio. Signed-off-by: Philip Oakley --- contrib/buildsystems/CMakeLists.txt | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt index 03e63ec4feec7f..951473c76e046b 100644 --- a/contrib/buildsystems/CMakeLists.txt +++ b/contrib/buildsystems/CMakeLists.txt @@ -71,6 +71,10 @@ if(USE_VCPKG) message("Initializing vcpkg and building the Git's dependencies (this will take a while...)") execute_process(COMMAND ${CMAKE_SOURCE_DIR}/compat/vcbuild/vcpkg_install.bat ${VCPKG_ARCH}) endif() + if(NOT EXISTS ${VCPKG_ARCH}) + message("VCPKG_ARCH: unset, using 'x64-windows'") + set(VCPKG_ARCH "x64-windows") # default from vcpkg_install.bat + endif() list(APPEND CMAKE_PREFIX_PATH "${VCPKG_DIR}/installed/${VCPKG_ARCH}") # In the vcpkg edition, we need this to be able to link to libcurl From 150fc9ec383e72ff0f68ac37c1a00753470a49c4 Mon Sep 17 00:00:00 2001 From: Philip Oakley Date: Sun, 31 Oct 2021 23:15:13 +0000 Subject: [PATCH 127/297] hash-object: demonstrate a >4GB/LLP64 problem On LLP64 systems, such as Windows, the size of `long`, `int`, etc. is only 32 bits (for backward compatibility). Git's use of `unsigned long` for file memory sizes in many places, rather than size_t, limits the handling of large files on LLP64 systems (commonly given as `>4GB`). Provide a minimum test for handling a >4GB file. The `hash-object` command, with the `--literally` and without `-w` option avoids writing the object, either loose or packed. This avoids the code paths hitting the `bigFileThreshold` config test code, the zlib code, and the pack code. Subsequent patches will walk the test's call chain, converting types to `size_t` (which is larger in LLP64 data models) where appropriate. Signed-off-by: Philip Oakley Signed-off-by: Johannes Schindelin --- t/t1007-hash-object.sh | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh index d73a5cc23705e2..6299612850e574 100755 --- a/t/t1007-hash-object.sh +++ b/t/t1007-hash-object.sh @@ -50,6 +50,9 @@ test_expect_success 'setup' ' example sha1:ddd3f836d3e3fbb7ae289aa9ae83536f76956399 example sha256:b44fe1fe65589848253737db859bd490453510719d7424daab03daf0767b85ae + + large5GB sha1:0be2be10a4c8764f32c4bf372a98edc731a4b204 + large5GB sha256:dc18ca621300c8d3cfa505a275641ebab00de189859e022a975056882d313e64 EOF ' @@ -266,4 +269,12 @@ test_expect_success '--stdin outside of repository (uses SHA-1)' ' test_cmp expect actual ' +test_expect_failure EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \ + 'files over 4GB hash literally' ' + test-tool genzeros $((5*1024*1024*1024)) >big && + test_oid large5GB >expect && + git hash-object --stdin --literally actual && + test_cmp expect actual +' + test_done From 3800a28d861e4a83ad59592caf762415fd356fbe Mon Sep 17 00:00:00 2001 From: Victoria Dye Date: Thu, 5 Aug 2021 19:11:59 -0400 Subject: [PATCH 128/297] ci: run `contrib/subtree` tests in CI builds Because `git subtree` (unlike most other `contrib` modules) is included as part of the standard release of Git for Windows, its stability should be verified as consistently as it is for the rest of git. By including the `git subtree` tests in the CI workflow, these tests are as much of a gate to merging and indicator of stability as the standard test suite. Signed-off-by: Victoria Dye --- ci/run-build-and-tests.sh | 4 ++++ ci/run-test-slice.sh | 3 +++ 2 files changed, 7 insertions(+) diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh index 98dda42045d3f3..18ffedba05081b 100755 --- a/ci/run-build-and-tests.sh +++ b/ci/run-build-and-tests.sh @@ -56,4 +56,8 @@ then fi check_unignored_build_artifacts +case " $MAKE_TARGETS " in +*" all "*) make -C contrib/subtree test;; +esac + save_good_tree diff --git a/ci/run-test-slice.sh b/ci/run-test-slice.sh index e167e646f79e3d..8cb0038d07ca64 100755 --- a/ci/run-test-slice.sh +++ b/ci/run-test-slice.sh @@ -20,4 +20,7 @@ if [ "$1" == "0" ] ; then group "Run unit tests" make --quiet -C t unit-tests-test-tool fi +# Run the git subtree tests only if main tests succeeded +test 0 != "$1" || make -C contrib/subtree test + check_unignored_build_artifacts From 30158eb617cda8cd9d346d427775ac5174a300bd Mon Sep 17 00:00:00 2001 From: Philip Oakley Date: Mon, 10 May 2021 16:47:40 +0100 Subject: [PATCH 129/297] CMake: show Win32 and Generator_platform build-option values Ensure key CMake option values are part of the CMake output to facilitate user support when tool updates impact the wider CMake actions, particularly ongoing 'improvements' in Visual Studio. These CMake displays perform the same function as the build-options.txt provided in the main Git for Windows. CMake is already chatty. The setting of CMAKE_EXPORT_COMPILE_COMMANDS is also reported. Include the environment's CMAKE_EXPORT_COMPILE_COMMANDS value which may have been propogated to CMake's internal value. Testing the CMAKE_EXPORT_COMPILE_COMMANDS processing can be difficult in the Visual Studio environment, as it may be cached in many places. The 'environment' may include the OS, the user shell, CMake's own environment, along with the Visual Studio presets and caches. See previous commit for arefacts that need removing for a clean test. Signed-off-by: Philip Oakley --- contrib/buildsystems/CMakeLists.txt | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt index 951473c76e046b..5100383ede92e0 100644 --- a/contrib/buildsystems/CMakeLists.txt +++ b/contrib/buildsystems/CMakeLists.txt @@ -63,10 +63,20 @@ endif() if(NOT DEFINED CMAKE_EXPORT_COMPILE_COMMANDS) set(CMAKE_EXPORT_COMPILE_COMMANDS TRUE) + message("settting CMAKE_EXPORT_COMPILE_COMMANDS: ${CMAKE_EXPORT_COMPILE_COMMANDS}") endif() if(USE_VCPKG) set(VCPKG_DIR "${CMAKE_SOURCE_DIR}/compat/vcbuild/vcpkg") + message("WIN32: ${WIN32}") # show its underlying text values + message("VCPKG_DIR: ${VCPKG_DIR}") + message("VCPKG_ARCH: ${VCPKG_ARCH}") # maybe unset + message("MSVC: ${MSVC}") + message("CMAKE_GENERATOR: ${CMAKE_GENERATOR}") + message("CMAKE_CXX_COMPILER_ID: ${CMAKE_CXX_COMPILER_ID}") + message("CMAKE_GENERATOR_PLATFORM: ${CMAKE_GENERATOR_PLATFORM}") + message("CMAKE_EXPORT_COMPILE_COMMANDS: ${CMAKE_EXPORT_COMPILE_COMMANDS}") + message("ENV(CMAKE_EXPORT_COMPILE_COMMANDS): $ENV{CMAKE_EXPORT_COMPILE_COMMANDS}") if(NOT EXISTS ${VCPKG_DIR}) message("Initializing vcpkg and building the Git's dependencies (this will take a while...)") execute_process(COMMAND ${CMAKE_SOURCE_DIR}/compat/vcbuild/vcpkg_install.bat ${VCPKG_ARCH}) From 30c9806d8156f08a522a265092e76fed7af59930 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 8 Sep 2021 13:05:42 +0200 Subject: [PATCH 130/297] init: do parse _all_ core.* settings early In Git for Windows, `has_symlinks` is set to 0 by default. Therefore, we need to parse the config setting `core.symlinks` to know if it has been set to `true`. In `git init`, we must do that before copying the templates because they might contain symbolic links. Even if the support for symbolic links on Windows has not made it to upstream Git yet, we really should make sure that all the `core.*` settings are parsed before proceeding, as they might very well change the behavior of `git init` in a way the user intended. This fixes https://github.com/git-for-windows/git/issues/3414 Signed-off-by: Johannes Schindelin --- config.c | 4 ++-- config.h | 2 ++ setup.c | 2 +- 3 files changed, 5 insertions(+), 3 deletions(-) diff --git a/config.c b/config.c index 05f369ec0d8876..0a8d6901c83608 100644 --- a/config.c +++ b/config.c @@ -1387,8 +1387,8 @@ int git_config_color(char *dest, const char *var, const char *value) return 0; } -static int git_default_core_config(const char *var, const char *value, - const struct config_context *ctx, void *cb) +int git_default_core_config(const char *var, const char *value, + const struct config_context *ctx, void *cb) { /* This needs a better name */ if (!strcmp(var, "core.filemode")) { diff --git a/config.h b/config.h index 54b47dec9e2709..418384646b04d3 100644 --- a/config.h +++ b/config.h @@ -167,6 +167,8 @@ typedef int (*config_fn_t)(const char *, const char *, int git_default_config(const char *, const char *, const struct config_context *, void *); +int git_default_core_config(const char *var, const char *value, + const struct config_context *ctx, void *cb); /** * Read a specific file in git-config format. diff --git a/setup.c b/setup.c index 5f81d9fac0d7e8..79c8f52e59bd83 100644 --- a/setup.c +++ b/setup.c @@ -2411,7 +2411,7 @@ int init_db(const char *git_dir, const char *real_git_dir, * have set up the repository format such that we can evaluate * includeIf conditions correctly in the case of re-initialization. */ - git_config(platform_core_config, NULL); + git_config(git_default_core_config, NULL); safe_create_dir(git_dir, 0); From b6675e06531b71aa44ffc149f63ad157681eed91 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 5 Mar 2021 23:12:11 +0100 Subject: [PATCH 131/297] Enable the built-in FSMonitor as an experimental feature If `feature.experimental` and `feature.manyFiles` are set and the user has not explicitly turned off the builtin FSMonitor, we now start the built-in FSMonitor by default. Only forcing it when UNSET matches the behavior of UPDATE_DEFAULT_BOOL() used for other repo settings. Signed-off-by: Johannes Schindelin Signed-off-by: Jeff Hostetler --- repo-settings.c | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/repo-settings.c b/repo-settings.c index 2b4e68731bedce..f104e6fc73c38f 100644 --- a/repo-settings.c +++ b/repo-settings.c @@ -2,6 +2,8 @@ #include "config.h" #include "repository.h" #include "midx.h" +#include "fsmonitor-ipc.h" +#include "fsmonitor-settings.h" static void repo_cfg_bool(struct repository *r, const char *key, int *dest, int def) @@ -45,6 +47,30 @@ void prepare_repo_settings(struct repository *r) r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_SKIPPING; r->settings.pack_use_bitmap_boundary_traversal = 1; r->settings.pack_use_multi_pack_reuse = 1; + + /* + * Force enable the builtin FSMonitor (unless the repo + * is incompatible or they've already selected it or + * the hook version). But only if they haven't + * explicitly turned it off -- so only if our config + * value is UNSET. + * + * lookup_fsmonitor_settings() and check_for_ipc() do + * not distinguish between explicitly set FALSE and + * UNSET, so we re-test for an UNSET config key here. + * + * I'm not sure I want to fix fsmonitor-settings.c to + * have more than one _DISABLED state since our usage + * here is only to support this experimental period + * (and I don't want to overload the _reason field + * because it describes incompabilities). + */ + if (manyfiles && + fsmonitor_ipc__is_supported() && + fsm_settings__get_mode(r) == FSMONITOR_MODE_DISABLED && + repo_config_get_maybe_bool(r, "core.fsmonitor", &value) > 0 && + repo_config_get_bool(r, "core.useBuiltinFSMonitor", &value)) + fsm_settings__set_ipc(r); } if (manyfiles) { r->settings.index_version = 4; From 00b803786a6993d4a39f90cada60f1324e9e2535 Mon Sep 17 00:00:00 2001 From: Philip Oakley Date: Fri, 12 Nov 2021 21:07:03 +0000 Subject: [PATCH 132/297] write_object_file_literally(): use size_t The previous commit adds a test that demonstrates a problem in the `hash-object --literally` command, manifesting in an unnecessary file size limit on systems using the LLP64 data model (which includes Windows). Walking the affected code path is `cmd_hash_object()` >> `hash_fd()` >> `hash_literally()` >> `hash_object_file_literally()`. The function `hash_object_file_literally()` is the first with a file length parameter (via a mem buffer). This commit changes the type of that parameter to the LLP64 compatible `size_t` type. There are no other uses of the function. The `strbuf` type is already `size_t` compatible. Note: The hash-object test does not yet pass. Subsequent commits will continue to walk the call tree's lower level functions to identify further fixes. Signed-off-by: Philip Oakley Signed-off-by: Johannes Schindelin --- object-file.c | 4 ++-- object-store-ll.h | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/object-file.c b/object-file.c index 065103be3ea923..737a4902d5f183 100644 --- a/object-file.c +++ b/object-file.c @@ -1889,7 +1889,7 @@ static void write_object_file_prepare(const struct git_hash_algo *algo, } static void write_object_file_prepare_literally(const struct git_hash_algo *algo, - const void *buf, unsigned long len, + const void *buf, size_t len, const char *type, struct object_id *oid, char *hdr, int *hdrlen) { @@ -2357,7 +2357,7 @@ int write_object_file_flags(const void *buf, unsigned long len, return 0; } -int write_object_file_literally(const void *buf, unsigned long len, +int write_object_file_literally(const void *buf, size_t len, const char *type, struct object_id *oid, unsigned flags) { diff --git a/object-store-ll.h b/object-store-ll.h index c5f2bb2fc2fe6e..e1d61d15ffccda 100644 --- a/object-store-ll.h +++ b/object-store-ll.h @@ -262,7 +262,7 @@ static inline int write_object_file(const void *buf, unsigned long len, return write_object_file_flags(buf, len, type, oid, NULL, 0); } -int write_object_file_literally(const void *buf, unsigned long len, +int write_object_file_literally(const void *buf, size_t len, const char *type, struct object_id *oid, unsigned flags); int stream_loose_object(struct input_stream *in_stream, size_t len, From d5dc8b2a689804150d4dab091ed3ec1703c07fe3 Mon Sep 17 00:00:00 2001 From: Philip Oakley Date: Fri, 12 Nov 2021 21:14:50 +0000 Subject: [PATCH 133/297] object-file.c: use size_t for header lengths Continue walking the code path for the >4GB `hash-object --literally` test. The `hash_object_file_literally()` function internally uses both `hash_object_file()` and `write_object_file_prepare()`. Both function signatures use `unsigned long` rather than `size_t` for the mem buffer sizes. Use `size_t` instead, for LLP64 compatibility. While at it, convert those function's object's header buffer length to `size_t` for consistency. The value is already upcast to `uintmax_t` for print format compatibility. Note: The hash-object test still does not pass. A subsequent commit continues to walk the call tree's lower level hash functions to identify further fixes. Signed-off-by: Philip Oakley Signed-off-by: Johannes Schindelin --- object-file.c | 22 +++++++++++----------- object-store-ll.h | 4 ++-- 2 files changed, 13 insertions(+), 13 deletions(-) diff --git a/object-file.c b/object-file.c index 737a4902d5f183..4b0f0d4a6749bf 100644 --- a/object-file.c +++ b/object-file.c @@ -1866,7 +1866,7 @@ void *read_object_with_reference(struct repository *r, static void hash_object_body(const struct git_hash_algo *algo, git_hash_ctx *c, const void *buf, unsigned long len, struct object_id *oid, - char *hdr, int *hdrlen) + char *hdr, size_t *hdrlen) { algo->init_fn(c); algo->update_fn(c, hdr, *hdrlen); @@ -1875,9 +1875,9 @@ static void hash_object_body(const struct git_hash_algo *algo, git_hash_ctx *c, } static void write_object_file_prepare(const struct git_hash_algo *algo, - const void *buf, unsigned long len, + const void *buf, size_t len, enum object_type type, struct object_id *oid, - char *hdr, int *hdrlen) + char *hdr, size_t *hdrlen) { git_hash_ctx c; @@ -1891,7 +1891,7 @@ static void write_object_file_prepare(const struct git_hash_algo *algo, static void write_object_file_prepare_literally(const struct git_hash_algo *algo, const void *buf, size_t len, const char *type, struct object_id *oid, - char *hdr, int *hdrlen) + char *hdr, size_t *hdrlen) { git_hash_ctx c; @@ -1943,17 +1943,17 @@ int finalize_object_file(const char *tmpfile, const char *filename) } static void hash_object_file_literally(const struct git_hash_algo *algo, - const void *buf, unsigned long len, + const void *buf, size_t len, const char *type, struct object_id *oid) { char hdr[MAX_HEADER_LEN]; - int hdrlen = sizeof(hdr); + size_t hdrlen = sizeof(hdr); write_object_file_prepare_literally(algo, buf, len, type, oid, hdr, &hdrlen); } void hash_object_file(const struct git_hash_algo *algo, const void *buf, - unsigned long len, enum object_type type, + size_t len, enum object_type type, struct object_id *oid) { hash_object_file_literally(algo, buf, len, type_name(type), oid); @@ -2317,7 +2317,7 @@ int stream_loose_object(struct input_stream *in_stream, size_t len, return err; } -int write_object_file_flags(const void *buf, unsigned long len, +int write_object_file_flags(const void *buf, size_t len, enum object_type type, struct object_id *oid, struct object_id *compat_oid_in, unsigned flags) { @@ -2326,7 +2326,7 @@ int write_object_file_flags(const void *buf, unsigned long len, const struct git_hash_algo *compat = repo->compat_hash_algo; struct object_id compat_oid; char hdr[MAX_HEADER_LEN]; - int hdrlen = sizeof(hdr); + size_t hdrlen = sizeof(hdr); /* Generate compat_oid */ if (compat) { @@ -2366,8 +2366,8 @@ int write_object_file_literally(const void *buf, size_t len, const struct git_hash_algo *algo = repo->hash_algo; const struct git_hash_algo *compat = repo->compat_hash_algo; struct object_id compat_oid; - int hdrlen, status = 0; - int compat_type = -1; + size_t hdrlen; + int status = 0, compat_type = -1; if (compat) { compat_type = type_from_string_gently(type, -1, 1); diff --git a/object-store-ll.h b/object-store-ll.h index e1d61d15ffccda..ccade7117f8e37 100644 --- a/object-store-ll.h +++ b/object-store-ll.h @@ -250,10 +250,10 @@ void *repo_read_object_file(struct repository *r, int oid_object_info(struct repository *r, const struct object_id *, unsigned long *); void hash_object_file(const struct git_hash_algo *algo, const void *buf, - unsigned long len, enum object_type type, + size_t len, enum object_type type, struct object_id *oid); -int write_object_file_flags(const void *buf, unsigned long len, +int write_object_file_flags(const void *buf, size_t len, enum object_type type, struct object_id *oid, struct object_id *comapt_oid_in, unsigned flags); static inline int write_object_file(const void *buf, unsigned long len, From 3b810959e6d1ca4963554fdc07107cb9365703b6 Mon Sep 17 00:00:00 2001 From: Philip Oakley Date: Fri, 12 Nov 2021 21:16:51 +0000 Subject: [PATCH 134/297] hash algorithms: use size_t for section lengths Continue walking the code path for the >4GB `hash-object --literally` test to the hash algorithm step for LLP64 systems. This patch lets the SHA1DC code use `size_t`, making it compatible with LLP64 data models (as used e.g. by Windows). The interested reader of this patch will note that we adjust the signature of the `git_SHA1DCUpdate()` function without updating _any_ call site. This certainly puzzled at least one reviewer already, so here is an explanation: This function is never called directly, but always via the macro `platform_SHA1_Update`, which is usually called via the macro `git_SHA1_Update`. However, we never call `git_SHA1_Update()` directly in `struct git_hash_algo`. Instead, we call `git_hash_sha1_update()`, which is defined thusly: static void git_hash_sha1_update(git_hash_ctx *ctx, const void *data, size_t len) { git_SHA1_Update(&ctx->sha1, data, len); } i.e. it contains an implicit downcast from `size_t` to `unsigned long` (before this here patch). With this patch, there is no downcast anymore. With this patch, finally, the t1007-hash-object.sh "files over 4GB hash literally" test case is fixed. Signed-off-by: Philip Oakley Signed-off-by: Johannes Schindelin --- object-file.c | 4 ++-- sha1dc_git.c | 3 +-- sha1dc_git.h | 2 +- t/t1007-hash-object.sh | 2 +- 4 files changed, 5 insertions(+), 6 deletions(-) diff --git a/object-file.c b/object-file.c index 4b0f0d4a6749bf..c672adc74ca791 100644 --- a/object-file.c +++ b/object-file.c @@ -1864,7 +1864,7 @@ void *read_object_with_reference(struct repository *r, } static void hash_object_body(const struct git_hash_algo *algo, git_hash_ctx *c, - const void *buf, unsigned long len, + const void *buf, size_t len, struct object_id *oid, char *hdr, size_t *hdrlen) { @@ -1884,7 +1884,7 @@ static void write_object_file_prepare(const struct git_hash_algo *algo, /* Generate the header */ *hdrlen = format_object_header(hdr, *hdrlen, type, len); - /* Sha1.. */ + /* Hash (function pointers) computation */ hash_object_body(algo, &c, buf, len, oid, hdr, hdrlen); } diff --git a/sha1dc_git.c b/sha1dc_git.c index 9b675a046ee699..fe58d7962a30c9 100644 --- a/sha1dc_git.c +++ b/sha1dc_git.c @@ -27,10 +27,9 @@ void git_SHA1DCFinal(unsigned char hash[20], SHA1_CTX *ctx) /* * Same as SHA1DCUpdate, but adjust types to match git's usual interface. */ -void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *vdata, unsigned long len) +void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *vdata, size_t len) { const char *data = vdata; - /* We expect an unsigned long, but sha1dc only takes an int */ while (len > INT_MAX) { SHA1DCUpdate(ctx, data, INT_MAX); data += INT_MAX; diff --git a/sha1dc_git.h b/sha1dc_git.h index 60e3ce84395cea..159540e13a382a 100644 --- a/sha1dc_git.h +++ b/sha1dc_git.h @@ -15,7 +15,7 @@ void git_SHA1DCInit(SHA1_CTX *); #endif void git_SHA1DCFinal(unsigned char [20], SHA1_CTX *); -void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, unsigned long len); +void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, size_t len); #define platform_SHA_IS_SHA1DC /* used by "test-tool sha1-is-sha1dc" */ #define platform_SHA_CTX SHA1_CTX diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh index 6299612850e574..62166f5c057779 100755 --- a/t/t1007-hash-object.sh +++ b/t/t1007-hash-object.sh @@ -269,7 +269,7 @@ test_expect_success '--stdin outside of repository (uses SHA-1)' ' test_cmp expect actual ' -test_expect_failure EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \ +test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \ 'files over 4GB hash literally' ' test-tool genzeros $((5*1024*1024*1024)) >big && test_oid large5GB >expect && From 138987c68c22cb0db73cac6d4b4497cc82d084e4 Mon Sep 17 00:00:00 2001 From: Philip Oakley Date: Mon, 6 Dec 2021 22:26:50 +0000 Subject: [PATCH 135/297] hash-object --stdin: verify that it works with >4GB/LLP64 Just like the `hash-object --literally` code path, the `--stdin` code path also needs to use `size_t` instead of `unsigned long` to represent memory sizes, otherwise it would cause problems on platforms using the LLP64 data model (such as Windows). To limit the scope of the test case, the object is explicitly not written to the object store, nor are any filters applied. The `big` file from the previous test case is reused to save setup time; To avoid relying on that side effect, it is generated if it does not exist (e.g. when running via `sh t1007-*.sh --long --run=1,41`). Signed-off-by: Philip Oakley Signed-off-by: Johannes Schindelin --- t/t1007-hash-object.sh | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh index 62166f5c057779..03cd0f8ba83532 100755 --- a/t/t1007-hash-object.sh +++ b/t/t1007-hash-object.sh @@ -277,4 +277,12 @@ test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \ test_cmp expect actual ' +test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \ + 'files over 4GB hash correctly via --stdin' ' + { test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } && + test_oid large5GB >expect && + git hash-object --stdin actual && + test_cmp expect actual +' + test_done From 03b4d549c24011fa9c8c15f2f69f26a0146b41bf Mon Sep 17 00:00:00 2001 From: Philip Oakley Date: Mon, 6 Dec 2021 22:42:46 +0000 Subject: [PATCH 136/297] hash-object: add another >4GB/LLP64 test case To complement the `--stdin` and `--literally` test cases that verify that we can hash files larger than 4GB on 64-bit platforms using the LLP64 data model, here is a test case that exercises `hash-object` _without_ any options. Just as before, we use the `big` file from the previous test case if it exists to save on setup time, otherwise generate it. Signed-off-by: Philip Oakley Signed-off-by: Johannes Schindelin --- t/t1007-hash-object.sh | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh index 03cd0f8ba83532..3ece6a9349ac53 100755 --- a/t/t1007-hash-object.sh +++ b/t/t1007-hash-object.sh @@ -285,4 +285,12 @@ test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \ test_cmp expect actual ' +test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \ + 'files over 4GB hash correctly' ' + { test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } && + test_oid large5GB >expect && + git hash-object -- big >actual && + test_cmp expect actual +' + test_done From 6caf7c9f9ec6427c14db37635346643aba297790 Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Wed, 13 Apr 2022 14:49:17 -0400 Subject: [PATCH 137/297] setup: properly use "%(prefix)/" when in WSL Signed-off-by: Derrick Stolee --- setup.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/setup.c b/setup.c index 5f81d9fac0d7e8..a6abc27334834f 100644 --- a/setup.c +++ b/setup.c @@ -1671,10 +1671,19 @@ const char *setup_git_directory_gently(int *nongit_ok) break; case GIT_DIR_INVALID_OWNERSHIP: if (!nongit_ok) { + struct strbuf prequoted = STRBUF_INIT; struct strbuf quoted = STRBUF_INIT; strbuf_complete(&report, '\n'); - sq_quote_buf_pretty("ed, dir.buf); + +#ifdef __MINGW32__ + if (dir.buf[0] == '/') + strbuf_addstr(&prequoted, "%(prefix)/"); +#endif + + strbuf_add(&prequoted, dir.buf, dir.len); + sq_quote_buf_pretty("ed, prequoted.buf); + die(_("detected dubious ownership in repository at '%s'\n" "%s" "To add an exception for this directory, call:\n" From 065337c4cf798b70a50dab784d27c2095ddf17aa Mon Sep 17 00:00:00 2001 From: Philip Oakley Date: Tue, 7 Dec 2021 09:53:41 +0000 Subject: [PATCH 138/297] hash-object: add a >4GB/LLP64 test case using filtered input To verify that the `clean` side of the `clean`/`smudge` filter code is correct with regards to LLP64 (read: to ensure that `size_t` is used instead of `unsigned long`), here is a test case using a trivial filter, specifically _not_ writing anything to the object store to limit the scope of the test case. As in previous commits, the `big` file from previous test cases is reused if available, to save setup time, otherwise re-generated. Signed-off-by: Philip Oakley Signed-off-by: Johannes Schindelin --- t/t1007-hash-object.sh | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh index 3ece6a9349ac53..3035e612c0f42b 100755 --- a/t/t1007-hash-object.sh +++ b/t/t1007-hash-object.sh @@ -293,4 +293,16 @@ test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \ test_cmp expect actual ' +# This clean filter does nothing, other than excercising the interface. +# We ensure that cleaning doesn't mangle large files on 64-bit Windows. +test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \ + 'hash filtered files over 4GB correctly' ' + { test -f big || test-tool genzeros $((5*1024*1024*1024)) >big; } && + test_oid large5GB >expect && + test_config filter.null-filter.clean "cat" && + echo "big filter=null-filter" >.gitattributes && + git hash-object -- big >actual && + test_cmp expect actual +' + test_done From 301048a694dd51ee3bb045d57a83452a4a5c908c Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Wed, 13 Apr 2022 14:54:43 -0400 Subject: [PATCH 139/297] compat/mingw.c: do not warn when failing to get owner In the case of Git for Windows (say, in a Git Bash window) running in a Windows Subsystem for Linux (WSL) directory, the GetNamedSecurityInfoW() call in is_path_owned_By_current_side() returns an error code other than ERROR_SUCCESS. This is consistent behavior across this boundary. In these cases, the owner would always be different because the WSL owner is a different entity than the Windows user. The change here is to suppress the error message that looks like this: error: failed to get owner for '//wsl.localhost/...' (1) Before this change, this warning happens for every Git command, regardless of whether the directory is marked with safe.directory. Signed-off-by: Derrick Stolee --- compat/mingw.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 29d3f09768c29a..6c2bb00cfacfe7 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -2771,9 +2771,7 @@ int is_path_owned_by_current_sid(const char *path, struct strbuf *report) DACL_SECURITY_INFORMATION, &sid, NULL, NULL, NULL, &descriptor); - if (err != ERROR_SUCCESS) - error(_("failed to get owner for '%s' (%ld)"), path, err); - else if (sid && IsValidSid(sid)) { + if (err == ERROR_SUCCESS && sid && IsValidSid(sid)) { /* Now, verify that the SID matches the current user's */ static PSID current_user_sid; BOOL is_member; From 4d925f54ed3c6243c3e75289b8a42eeba55f03f7 Mon Sep 17 00:00:00 2001 From: Rafael Kitover Date: Tue, 12 Apr 2022 19:53:33 +0000 Subject: [PATCH 140/297] mingw: $env:TERM="xterm-256color" for newer OSes For Windows builds >= 15063 set $env:TERM to "xterm-256color" instead of "cygwin" because they have a more capable console system that supports this. Also set $env:COLORTERM="truecolor" if unset. $env:TERM is initialized so that ANSI colors in color.c work, see 29a3963484 (Win32: patch Windows environment on startup, 2012-01-15). See git-for-windows/git#3629 regarding problems caused by always setting $env:TERM="cygwin". This is the same heuristic used by the Cygwin runtime. Signed-off-by: Rafael Kitover Signed-off-by: Johannes Schindelin --- compat/mingw.c | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 29d3f09768c29a..0824849f209e4a 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -2642,9 +2642,20 @@ static void setup_windows_environment(void) convert_slashes(tmp); } - /* simulate TERM to enable auto-color (see color.c) */ - if (!getenv("TERM")) - setenv("TERM", "cygwin", 1); + + /* + * Make sure TERM is set up correctly to enable auto-color + * (see color.c .) Use "cygwin" for older OS releases which + * works correctly with MSYS2 utilities on older consoles. + */ + if (!getenv("TERM")) { + if ((GetVersion() >> 16) < 15063) + setenv("TERM", "cygwin", 0); + else { + setenv("TERM", "xterm-256color", 0); + setenv("COLORTERM", "truecolor", 0); + } + } /* calculate HOME if not set */ if (!getenv("HOME")) { From a4dc7358cb651cb61643033a782300c26edc0b49 Mon Sep 17 00:00:00 2001 From: Christopher Degawa Date: Sat, 28 May 2022 14:53:54 -0500 Subject: [PATCH 141/297] winansi: check result and Buffer before using Name NtQueryObject under Wine can return a success but fill out no name. In those situations, Wine will set Buffer to NULL, and set result to the sizeof(OBJECT_NAME_INFORMATION). Running a command such as echo "$(git.exe --version 2>/dev/null)" will crash due to a NULL pointer dereference when the code attempts to null terminate the buffer, although, weirdly, removing the subshell or redirecting stdout to a file will not trigger the crash. Code has been added to also check Buffer and Length to ensure the check is as robust as possible due to the current behavior being fragile at best, and could potentially change in the future This code is based on the behavior of NtQueryObject under wine and reactos. Signed-off-by: Christopher Degawa --- compat/winansi.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/compat/winansi.c b/compat/winansi.c index 575813bde8e806..993c9a7945f8a6 100644 --- a/compat/winansi.c +++ b/compat/winansi.c @@ -573,6 +573,9 @@ static void detect_msys_tty(int fd) if (!NT_SUCCESS(NtQueryObject(h, ObjectNameInformation, buffer, sizeof(buffer) - 2, &result))) return; + if (result < sizeof(*nameinfo) || !nameinfo->Name.Buffer || + !nameinfo->Name.Length) + return; name = nameinfo->Name.Buffer; name[nameinfo->Name.Length / sizeof(*name)] = 0; From 38a219eaf2f96144f665926b62d54a9fc3ce2a95 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 29 Mar 2022 12:05:18 +0200 Subject: [PATCH 142/297] vcxproj: allow building with `NO_PERL` again This is another fall-out of the recent refactoring flurry. Signed-off-by: Johannes Schindelin --- config.mak.uname | 2 ++ 1 file changed, 2 insertions(+) diff --git a/config.mak.uname b/config.mak.uname index b722c2b80e919d..3482fd93cecfd6 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -801,9 +801,11 @@ vcxproj: sed -i '/^git_broken_path_fix ".*/d' git-sh-setup git add -f $(SCRIPT_LIB) $(SCRIPTS) +ifndef NO_PERL # Add Perl module $(MAKE) $(LIB_PERL_GEN) git add -f perl/build +endif # Add bin-wrappers, for testing rm -rf bin-wrappers/ From 0cd5262504f67254b4547aaf2007c4f49e7190bf Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 28 Jun 2022 16:35:04 +0200 Subject: [PATCH 143/297] vcxproj: require C11 This fixes the build after 7bc341e21b (git-compat-util: add a test balloon for C99 support, 2021-12-01). Signed-off-by: Johannes Schindelin --- contrib/buildsystems/Generators/Vcxproj.pm | 1 + 1 file changed, 1 insertion(+) diff --git a/contrib/buildsystems/Generators/Vcxproj.pm b/contrib/buildsystems/Generators/Vcxproj.pm index a6d1c6b8d05682..1858107378396a 100644 --- a/contrib/buildsystems/Generators/Vcxproj.pm +++ b/contrib/buildsystems/Generators/Vcxproj.pm @@ -178,6 +178,7 @@ sub createProject { OnlyExplicitInline ProgramDatabase + stdc11 true From 1d0780b112827c8adfdb1af9f27b5277aa35921d Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 28 Jun 2022 16:38:12 +0200 Subject: [PATCH 144/297] vcxproj: ignore the `-pedantic` option This is now passed by default, ever since 6a8cbc41ba (developer: enable pedantic by default, 2021-09-03). Signed-off-by: Johannes Schindelin --- contrib/buildsystems/engine.pl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/contrib/buildsystems/engine.pl b/contrib/buildsystems/engine.pl index 417ae71d44ccab..ee4fca200cc506 100755 --- a/contrib/buildsystems/engine.pl +++ b/contrib/buildsystems/engine.pl @@ -263,7 +263,7 @@ sub handleCompileLine if ("$part" eq "-o") { # ignore object file shift @parts; - } elsif ("$part" eq "-c" || "$part" eq "-i" || "$part" =~ /^-fno-/) { + } elsif ("$part" eq "-c" || "$part" eq "-i" || "$part" =~ /^-fno-/ || "$part" eq '-pedantic') { # ignore compile flag } elsif ($part =~ /^.?-I/) { push(@incpaths, $part); From 4c51d47c47e9c4d6f9e471fbe635e2e7fe732b3a Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 28 Jun 2022 17:00:59 +0200 Subject: [PATCH 145/297] vcxproj: include reftable when committing `.vcxproj` files Signed-off-by: Johannes Schindelin --- config.mak.uname | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/config.mak.uname b/config.mak.uname index 3482fd93cecfd6..1f8b09e1501f70 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -769,7 +769,7 @@ vcxproj: # Make .vcxproj files and add them perl contrib/buildsystems/generate -g Vcxproj - git add -f git.sln {*,*/lib.proj,t/helper/*}/*.vcxproj + git add -f git.sln {*,*/lib.proj,t/helper/*,reftable/libreftable{,_test}.proj}/*.vcxproj # Generate the LinkOrCopyBuiltins.targets and LinkOrCopyRemoteHttp.targets file (echo '' && \ From ba05b17696afedcccf2f04eee563c6f15a62763f Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 28 Jun 2022 18:04:01 +0200 Subject: [PATCH 146/297] vcxproj: handle libreftable_test, too Since ef8a6c6268 (reftable: utility functions, 2021-10-07) we not only have a libreftable, but also a libreftable_test. Signed-off-by: Johannes Schindelin --- contrib/buildsystems/Generators/Vcxproj.pm | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/contrib/buildsystems/Generators/Vcxproj.pm b/contrib/buildsystems/Generators/Vcxproj.pm index 1858107378396a..20d91ea84bfd44 100644 --- a/contrib/buildsystems/Generators/Vcxproj.pm +++ b/contrib/buildsystems/Generators/Vcxproj.pm @@ -77,7 +77,7 @@ sub createProject { my $libs_release = "\n "; my $libs_debug = "\n "; if (!$static_library && $name ne 'headless-git') { - $libs_release = join(";", sort(grep /^(?!libgit\.lib|xdiff\/lib\.lib|vcs-svn\/lib\.lib|reftable\/libreftable\.lib)/, @{$$build_structure{"$prefix${name}_LIBS"}})); + $libs_release = join(";", sort(grep /^(?!libgit\.lib|xdiff\/lib\.lib|vcs-svn\/lib\.lib|reftable\/libreftable(_test)?\.lib)/, @{$$build_structure{"$prefix${name}_LIBS"}})); $libs_debug = $libs_release; $libs_debug =~ s/zlib\.lib/zlibd\.lib/g; $libs_debug =~ s/libexpat\.lib/libexpatd\.lib/g; @@ -258,6 +258,7 @@ EOM if ((!$static_library || $target =~ 'vcs-svn' || $target =~ 'xdiff') && !($name =~ /headless-git/)) { my $uuid_libgit = $$build_structure{"LIBS_libgit_GUID"}; my $uuid_libreftable = $$build_structure{"LIBS_reftable/libreftable_GUID"}; + my $uuid_libreftable_test = $$build_structure{"LIBS_reftable/libreftable_test_GUID"}; my $uuid_xdiff_lib = $$build_structure{"LIBS_xdiff/lib_GUID"}; print F << "EOM"; @@ -269,10 +270,14 @@ EOM EOM if (!($name =~ /xdiff|libreftable/)) { print F << "EOM"; - + $uuid_libreftable false + + $uuid_libreftable_test + false + EOM } if (!($name =~ 'xdiff')) { From 85874a10f59ed5f082d86bddc4364f0e8a8cab14 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 28 Jun 2022 17:36:21 +0200 Subject: [PATCH 147/297] vcxproj: avoid escaping double quotes in the defines Visual Studio 2022 does not like that at all. Signed-off-by: Johannes Schindelin --- contrib/buildsystems/Generators/Vcxproj.pm | 1 + 1 file changed, 1 insertion(+) diff --git a/contrib/buildsystems/Generators/Vcxproj.pm b/contrib/buildsystems/Generators/Vcxproj.pm index 20d91ea84bfd44..bf77a44e11f463 100644 --- a/contrib/buildsystems/Generators/Vcxproj.pm +++ b/contrib/buildsystems/Generators/Vcxproj.pm @@ -88,6 +88,7 @@ sub createProject { $defines =~ s//>/g; $defines =~ s/\'//g; + $defines =~ s/\\"/"/g; my $rcdefines = $defines; $rcdefines =~ s/(? Date: Sun, 10 Jul 2022 00:39:32 +0200 Subject: [PATCH 148/297] ci: adjust Azure Pipeline for `runs_on_pool` These refactorings are really gifts that keep on giving. Signed-off-by: Johannes Schindelin --- ci/lib.sh | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/ci/lib.sh b/ci/lib.sh index 51f8f59a296132..e75108a72f9fbc 100755 --- a/ci/lib.sh +++ b/ci/lib.sh @@ -224,6 +224,12 @@ then GIT_TEST_OPTS="--write-junit-xml" JOBS=10 + case "$CI_OS_NAME" in + linux) runs_on_pool=ubuntu-latest;; + macos|osx) runs_on_pool=macos-latest;; + windows_nt) runs_on_pool=windows-latest;; + *) echo "Unhandled OS: $CI_OS_NAME" >&2; exit 1;; + esac elif test true = "$GITHUB_ACTIONS" then CI_TYPE=github-actions From 60e1dc4e6f201f777b553db6eda915a6d7ef0459 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sun, 10 Jul 2022 01:15:08 +0200 Subject: [PATCH 149/297] ci: stop linking the `prove` cache It is not useful because we do not have any persisted directory anymore, not since dropping our Travis CI support. Signed-off-by: Johannes Schindelin --- ci/run-build-and-tests.sh | 5 ----- ci/run-test-slice.sh | 5 ----- 2 files changed, 10 deletions(-) diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh index 98dda42045d3f3..2df089e69d6b30 100755 --- a/ci/run-build-and-tests.sh +++ b/ci/run-build-and-tests.sh @@ -5,11 +5,6 @@ . ${0%/*}/lib.sh -case "$CI_OS_NAME" in -windows*) cmd //c mklink //j t\\.prove "$(cygpath -aw "$cache_dir/.prove")";; -*) ln -s "$cache_dir/.prove" t/.prove;; -esac - run_tests=t case "$jobname" in diff --git a/ci/run-test-slice.sh b/ci/run-test-slice.sh index e167e646f79e3d..0444c79c023c82 100755 --- a/ci/run-test-slice.sh +++ b/ci/run-test-slice.sh @@ -5,11 +5,6 @@ . ${0%/*}/lib.sh -case "$CI_OS_NAME" in -windows*) cmd //c mklink //j t\\.prove "$(cygpath -aw "$cache_dir/.prove")";; -*) ln -s "$cache_dir/.prove" t/.prove;; -esac - group "Run tests" make --quiet -C t T="$(cd t && ./helper/test-tool path-utils slice-tests "$1" "$2" t[0-9]*.sh | tr '\n' ' ')" || From d33fb7d805ff497585d8fb9a0299de640d53388b Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Thu, 11 Feb 2021 15:09:57 +0100 Subject: [PATCH 150/297] ci: reinstate Azure Pipelines support ... so that we can test a MinGit backport in a private repository (with GitHub Actions, minutes and parallel jobs are limited way more than with Azure Pipelines in private repositories). In this commit, we reinstate the exact version of `azure-pipelines.yml` as 6081d3898fe (ci: retire the Azure Pipelines definition, 2020-04-11) deleted. Naturally, many adjustments are required to make it work again. Some of the changes are actually outside of that file (such as the `runs_on_pool` changes that are needed in the Azure Pipelines part of `ci/lib.sh`) and they were made in the commits leading up to this here commit. However, other adjustments are required in the `azure-pipelines.yml` file itself, and for ease of review (read: to build confidence in those changes) they will be made in subsequent, individual commits that explain the intent, context, implementation and justification like every good commit message should do. Signed-off-by: Johannes Schindelin --- azure-pipelines.yml | 558 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 558 insertions(+) create mode 100644 azure-pipelines.yml diff --git a/azure-pipelines.yml b/azure-pipelines.yml new file mode 100644 index 00000000000000..11413f66f89662 --- /dev/null +++ b/azure-pipelines.yml @@ -0,0 +1,558 @@ +variables: + Agent.Source.Git.ShallowFetchDepth: 1 + +jobs: +- job: windows_build + displayName: Windows Build + condition: succeeded() + pool: + vmImage: windows-latest + timeoutInMinutes: 240 + steps: + - powershell: | + if ("$GITFILESHAREPWD" -ne "" -and "$GITFILESHAREPWD" -ne "`$`(gitfileshare.pwd)") { + net use s: \\gitfileshare.file.core.windows.net\test-cache "$GITFILESHAREPWD" /user:AZURE\gitfileshare /persistent:no + cmd /c mklink /d "$(Build.SourcesDirectory)\test-cache" S:\ + } + displayName: 'Mount test-cache' + env: + GITFILESHAREPWD: $(gitfileshare.pwd) + - powershell: | + $urlbase = "https://dev.azure.com/git-for-windows/git/_apis/build/builds" + $id = ((Invoke-WebRequest -UseBasicParsing "${urlbase}?definitions=22&statusFilter=completed&resultFilter=succeeded&`$top=1").content | ConvertFrom-JSON).value[0].id + $downloadUrl = ((Invoke-WebRequest -UseBasicParsing "${urlbase}/$id/artifacts").content | ConvertFrom-JSON).value[1].resource.downloadUrl + (New-Object Net.WebClient).DownloadFile($downloadUrl,"git-sdk-64-minimal.zip") + Expand-Archive git-sdk-64-minimal.zip -DestinationPath . -Force + Remove-Item git-sdk-64-minimal.zip + + # Let Git ignore the SDK and the test-cache + "/git-sdk-64-minimal/`n/test-cache/`n" | Out-File -NoNewLine -Encoding ascii -Append "$(Build.SourcesDirectory)\.git\info\exclude" + displayName: 'Download git-sdk-64-minimal' + - powershell: | + & git-sdk-64-minimal\usr\bin\bash.exe -lc @" + ci/make-test-artifacts.sh artifacts + "@ + if (!$?) { exit(1) } + displayName: Build + env: + HOME: $(Build.SourcesDirectory) + MSYSTEM: MINGW64 + DEVELOPER: 1 + NO_PERL: 1 + - task: PublishPipelineArtifact@0 + displayName: 'Publish Pipeline Artifact: test artifacts' + inputs: + artifactName: 'windows-artifacts' + targetPath: '$(Build.SourcesDirectory)\artifacts' + - task: PublishPipelineArtifact@0 + displayName: 'Publish Pipeline Artifact: git-sdk-64-minimal' + inputs: + artifactName: 'git-sdk-64-minimal' + targetPath: '$(Build.SourcesDirectory)\git-sdk-64-minimal' + - powershell: | + if ("$GITFILESHAREPWD" -ne "" -and "$GITFILESHAREPWD" -ne "`$`(gitfileshare.pwd)") { + cmd /c rmdir "$(Build.SourcesDirectory)\test-cache" + } + displayName: 'Unmount test-cache' + condition: true + env: + GITFILESHAREPWD: $(gitfileshare.pwd) + +- job: windows_test + displayName: Windows Test + dependsOn: windows_build + condition: succeeded() + pool: + vmImage: windows-latest + timeoutInMinutes: 240 + strategy: + parallel: 10 + steps: + - powershell: | + if ("$GITFILESHAREPWD" -ne "" -and "$GITFILESHAREPWD" -ne "`$`(gitfileshare.pwd)") { + net use s: \\gitfileshare.file.core.windows.net\test-cache "$GITFILESHAREPWD" /user:AZURE\gitfileshare /persistent:no + cmd /c mklink /d "$(Build.SourcesDirectory)\test-cache" S:\ + } + displayName: 'Mount test-cache' + env: + GITFILESHAREPWD: $(gitfileshare.pwd) + - task: DownloadPipelineArtifact@0 + displayName: 'Download Pipeline Artifact: test artifacts' + inputs: + artifactName: 'windows-artifacts' + targetPath: '$(Build.SourcesDirectory)' + - task: DownloadPipelineArtifact@0 + displayName: 'Download Pipeline Artifact: git-sdk-64-minimal' + inputs: + artifactName: 'git-sdk-64-minimal' + targetPath: '$(Build.SourcesDirectory)\git-sdk-64-minimal' + - powershell: | + & git-sdk-64-minimal\usr\bin\bash.exe -lc @" + test -f artifacts.tar.gz || { + echo No test artifacts found\; skipping >&2 + exit 0 + } + tar xf artifacts.tar.gz || exit 1 + + # Let Git ignore the SDK and the test-cache + printf '%s\n' /git-sdk-64-minimal/ /test-cache/ >>.git/info/exclude + + ci/run-test-slice.sh `$SYSTEM_JOBPOSITIONINPHASE `$SYSTEM_TOTALJOBSINPHASE || { + ci/print-test-failures.sh + exit 1 + } + "@ + if (!$?) { exit(1) } + displayName: 'Test (parallel)' + env: + HOME: $(Build.SourcesDirectory) + MSYSTEM: MINGW64 + NO_SVN_TESTS: 1 + GIT_TEST_SKIP_REBASE_P: 1 + - powershell: | + if ("$GITFILESHAREPWD" -ne "" -and "$GITFILESHAREPWD" -ne "`$`(gitfileshare.pwd)") { + cmd /c rmdir "$(Build.SourcesDirectory)\test-cache" + } + displayName: 'Unmount test-cache' + condition: true + env: + GITFILESHAREPWD: $(gitfileshare.pwd) + - task: PublishTestResults@2 + displayName: 'Publish Test Results **/TEST-*.xml' + inputs: + mergeTestResults: true + testRunTitle: 'windows' + platform: Windows + publishRunAttachments: false + condition: succeededOrFailed() + - task: PublishBuildArtifacts@1 + displayName: 'Publish trash directories of failed tests' + condition: failed() + inputs: + PathtoPublish: t/failed-test-artifacts + ArtifactName: failed-test-artifacts + +- job: vs_build + displayName: Visual Studio Build + condition: succeeded() + pool: + vmImage: windows-latest + timeoutInMinutes: 240 + steps: + - powershell: | + if ("$GITFILESHAREPWD" -ne "" -and "$GITFILESHAREPWD" -ne "`$`(gitfileshare.pwd)") { + net use s: \\gitfileshare.file.core.windows.net\test-cache "$GITFILESHAREPWD" /user:AZURE\gitfileshare /persistent:no + cmd /c mklink /d "$(Build.SourcesDirectory)\test-cache" S:\ + } + displayName: 'Mount test-cache' + env: + GITFILESHAREPWD: $(gitfileshare.pwd) + - powershell: | + $urlbase = "https://dev.azure.com/git-for-windows/git/_apis/build/builds" + $id = ((Invoke-WebRequest -UseBasicParsing "${urlbase}?definitions=22&statusFilter=completed&resultFilter=succeeded&`$top=1").content | ConvertFrom-JSON).value[0].id + $downloadUrl = ((Invoke-WebRequest -UseBasicParsing "${urlbase}/$id/artifacts").content | ConvertFrom-JSON).value[1].resource.downloadUrl + (New-Object Net.WebClient).DownloadFile($downloadUrl,"git-sdk-64-minimal.zip") + Expand-Archive git-sdk-64-minimal.zip -DestinationPath . -Force + Remove-Item git-sdk-64-minimal.zip + + # Let Git ignore the SDK and the test-cache + "/git-sdk-64-minimal/`n/test-cache/`n" | Out-File -NoNewLine -Encoding ascii -Append "$(Build.SourcesDirectory)\.git\info\exclude" + displayName: 'Download git-sdk-64-minimal' + - powershell: | + & git-sdk-64-minimal\usr\bin\bash.exe -lc @" + make NDEBUG=1 DEVELOPER=1 vcxproj + "@ + if (!$?) { exit(1) } + displayName: Generate Visual Studio Solution + env: + HOME: $(Build.SourcesDirectory) + MSYSTEM: MINGW64 + DEVELOPER: 1 + NO_PERL: 1 + GIT_CONFIG_PARAMETERS: "'user.name=CI' 'user.email=ci@git'" + - powershell: | + $urlbase = "https://dev.azure.com/git/git/_apis/build/builds" + $id = ((Invoke-WebRequest -UseBasicParsing "${urlbase}?definitions=9&statusFilter=completed&resultFilter=succeeded&`$top=1").content | ConvertFrom-JSON).value[0].id + $downloadUrl = ((Invoke-WebRequest -UseBasicParsing "${urlbase}/$id/artifacts").content | ConvertFrom-JSON).value[0].resource.downloadUrl + (New-Object Net.WebClient).DownloadFile($downloadUrl, "compat.zip") + Expand-Archive compat.zip -DestinationPath . -Force + Remove-Item compat.zip + displayName: 'Download vcpkg artifacts' + - task: MSBuild@1 + inputs: + solution: git.sln + platform: x64 + configuration: Release + maximumCpuCount: 4 + msbuildArguments: /p:PlatformToolset=v142 + - powershell: | + & compat\vcbuild\vcpkg_copy_dlls.bat release + if (!$?) { exit(1) } + & git-sdk-64-minimal\usr\bin\bash.exe -lc @" + mkdir -p artifacts && + eval \"`$(make -n artifacts-tar INCLUDE_DLLS_IN_ARTIFACTS=YesPlease ARTIFACTS_DIRECTORY=artifacts | grep ^tar)\" + "@ + if (!$?) { exit(1) } + displayName: Bundle artifact tar + env: + HOME: $(Build.SourcesDirectory) + MSYSTEM: MINGW64 + DEVELOPER: 1 + NO_PERL: 1 + MSVC: 1 + VCPKG_ROOT: $(Build.SourcesDirectory)\compat\vcbuild\vcpkg + - powershell: | + $tag = (Invoke-WebRequest -UseBasicParsing "https://gitforwindows.org/latest-tag.txt").content + $version = (Invoke-WebRequest -UseBasicParsing "https://gitforwindows.org/latest-version.txt").content + $url = "https://github.com/git-for-windows/git/releases/download/${tag}/PortableGit-${version}-64-bit.7z.exe" + (New-Object Net.WebClient).DownloadFile($url,"PortableGit.exe") + & .\PortableGit.exe -y -oartifacts\PortableGit + # Wait until it is unpacked + while (-not @(Remove-Item -ErrorAction SilentlyContinue PortableGit.exe; $?)) { sleep 1 } + displayName: Download & extract portable Git + - task: PublishPipelineArtifact@0 + displayName: 'Publish Pipeline Artifact: MSVC test artifacts' + inputs: + artifactName: 'vs-artifacts' + targetPath: '$(Build.SourcesDirectory)\artifacts' + - powershell: | + if ("$GITFILESHAREPWD" -ne "" -and "$GITFILESHAREPWD" -ne "`$`(gitfileshare.pwd)") { + cmd /c rmdir "$(Build.SourcesDirectory)\test-cache" + } + displayName: 'Unmount test-cache' + condition: true + env: + GITFILESHAREPWD: $(gitfileshare.pwd) + +- job: vs_test + displayName: Visual Studio Test + dependsOn: vs_build + condition: succeeded() + pool: + vmImage: windows-latest + timeoutInMinutes: 240 + strategy: + parallel: 10 + steps: + - powershell: | + if ("$GITFILESHAREPWD" -ne "" -and "$GITFILESHAREPWD" -ne "`$`(gitfileshare.pwd)") { + net use s: \\gitfileshare.file.core.windows.net\test-cache "$GITFILESHAREPWD" /user:AZURE\gitfileshare /persistent:no + cmd /c mklink /d "$(Build.SourcesDirectory)\test-cache" S:\ + } + displayName: 'Mount test-cache' + env: + GITFILESHAREPWD: $(gitfileshare.pwd) + - task: DownloadPipelineArtifact@0 + displayName: 'Download Pipeline Artifact: VS test artifacts' + inputs: + artifactName: 'vs-artifacts' + targetPath: '$(Build.SourcesDirectory)' + - powershell: | + & PortableGit\git-cmd.exe --command=usr\bin\bash.exe -lc @" + test -f artifacts.tar.gz || { + echo No test artifacts found\; skipping >&2 + exit 0 + } + tar xf artifacts.tar.gz || exit 1 + + # Let Git ignore the SDK and the test-cache + printf '%s\n' /PortableGit/ /test-cache/ >>.git/info/exclude + + cd t && + PATH=\"`$PWD/helper:`$PATH\" && + test-tool.exe run-command testsuite --jobs=10 -V -x --write-junit-xml \ + `$(test-tool.exe path-utils slice-tests \ + `$SYSTEM_JOBPOSITIONINPHASE `$SYSTEM_TOTALJOBSINPHASE t[0-9]*.sh) + "@ + if (!$?) { exit(1) } + displayName: 'Test (parallel)' + env: + HOME: $(Build.SourcesDirectory) + MSYSTEM: MINGW64 + NO_SVN_TESTS: 1 + GIT_TEST_SKIP_REBASE_P: 1 + - powershell: | + if ("$GITFILESHAREPWD" -ne "" -and "$GITFILESHAREPWD" -ne "`$`(gitfileshare.pwd)") { + cmd /c rmdir "$(Build.SourcesDirectory)\test-cache" + } + displayName: 'Unmount test-cache' + condition: true + env: + GITFILESHAREPWD: $(gitfileshare.pwd) + - task: PublishTestResults@2 + displayName: 'Publish Test Results **/TEST-*.xml' + inputs: + mergeTestResults: true + testRunTitle: 'vs' + platform: Windows + publishRunAttachments: false + condition: succeededOrFailed() + - task: PublishBuildArtifacts@1 + displayName: 'Publish trash directories of failed tests' + condition: failed() + inputs: + PathtoPublish: t/failed-test-artifacts + ArtifactName: failed-vs-test-artifacts + +- job: linux_clang + displayName: linux-clang + condition: succeeded() + pool: + vmImage: ubuntu-latest + steps: + - bash: | + test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1 + + sudo apt-get update && + sudo apt-get -y install git gcc make libssl-dev libcurl4-openssl-dev libexpat-dev tcl tk gettext git-email zlib1g-dev apache2-bin && + + export CC=clang || exit 1 + + ci/install-dependencies.sh || exit 1 + ci/run-build-and-tests.sh || { + ci/print-test-failures.sh + exit 1 + } + + test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount "$HOME/test-cache" || exit 1 + displayName: 'ci/run-build-and-tests.sh' + env: + GITFILESHAREPWD: $(gitfileshare.pwd) + - task: PublishTestResults@2 + displayName: 'Publish Test Results **/TEST-*.xml' + inputs: + mergeTestResults: true + testRunTitle: 'linux-clang' + platform: Linux + publishRunAttachments: false + condition: succeededOrFailed() + - task: PublishBuildArtifacts@1 + displayName: 'Publish trash directories of failed tests' + condition: failed() + inputs: + PathtoPublish: t/failed-test-artifacts + ArtifactName: failed-test-artifacts + +- job: linux_gcc + displayName: linux-gcc + condition: succeeded() + pool: + vmImage: ubuntu-latest + steps: + - bash: | + test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1 + + sudo add-apt-repository ppa:ubuntu-toolchain-r/test && + sudo apt-get update && + sudo apt-get -y install git gcc make libssl-dev libcurl4-openssl-dev libexpat-dev tcl tk gettext git-email zlib1g-dev apache2 language-pack-is git-svn gcc-8 || exit 1 + + ci/install-dependencies.sh || exit 1 + ci/run-build-and-tests.sh || { + ci/print-test-failures.sh + exit 1 + } + + test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount "$HOME/test-cache" || exit 1 + displayName: 'ci/run-build-and-tests.sh' + env: + GITFILESHAREPWD: $(gitfileshare.pwd) + - task: PublishTestResults@2 + displayName: 'Publish Test Results **/TEST-*.xml' + inputs: + mergeTestResults: true + testRunTitle: 'linux-gcc' + platform: Linux + publishRunAttachments: false + condition: succeededOrFailed() + - task: PublishBuildArtifacts@1 + displayName: 'Publish trash directories of failed tests' + condition: failed() + inputs: + PathtoPublish: t/failed-test-artifacts + ArtifactName: failed-test-artifacts + +- job: osx_clang + displayName: osx-clang + condition: succeeded() + pool: + vmImage: macOS-latest + steps: + - bash: | + test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1 + + export CC=clang + + ci/install-dependencies.sh || exit 1 + ci/run-build-and-tests.sh || { + ci/print-test-failures.sh + exit 1 + } + + test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || umount "$HOME/test-cache" || exit 1 + displayName: 'ci/run-build-and-tests.sh' + env: + GITFILESHAREPWD: $(gitfileshare.pwd) + - task: PublishTestResults@2 + displayName: 'Publish Test Results **/TEST-*.xml' + inputs: + mergeTestResults: true + testRunTitle: 'osx-clang' + platform: macOS + publishRunAttachments: false + condition: succeededOrFailed() + - task: PublishBuildArtifacts@1 + displayName: 'Publish trash directories of failed tests' + condition: failed() + inputs: + PathtoPublish: t/failed-test-artifacts + ArtifactName: failed-test-artifacts + +- job: osx_gcc + displayName: osx-gcc + condition: succeeded() + pool: + vmImage: macOS-latest + steps: + - bash: | + test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1 + + ci/install-dependencies.sh || exit 1 + ci/run-build-and-tests.sh || { + ci/print-test-failures.sh + exit 1 + } + + test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || umount "$HOME/test-cache" || exit 1 + displayName: 'ci/run-build-and-tests.sh' + env: + GITFILESHAREPWD: $(gitfileshare.pwd) + - task: PublishTestResults@2 + displayName: 'Publish Test Results **/TEST-*.xml' + inputs: + mergeTestResults: true + testRunTitle: 'osx-gcc' + platform: macOS + publishRunAttachments: false + condition: succeededOrFailed() + - task: PublishBuildArtifacts@1 + displayName: 'Publish trash directories of failed tests' + condition: failed() + inputs: + PathtoPublish: t/failed-test-artifacts + ArtifactName: failed-test-artifacts + +- job: gettext_poison + displayName: GETTEXT_POISON + condition: succeeded() + pool: + vmImage: ubuntu-latest + steps: + - bash: | + test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1 + + sudo apt-get update && + sudo apt-get -y install git gcc make libssl-dev libcurl4-openssl-dev libexpat-dev tcl tk gettext git-email zlib1g-dev && + + export jobname=GETTEXT_POISON || exit 1 + + ci/run-build-and-tests.sh || { + ci/print-test-failures.sh + exit 1 + } + + test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount "$HOME/test-cache" || exit 1 + displayName: 'ci/run-build-and-tests.sh' + env: + GITFILESHAREPWD: $(gitfileshare.pwd) + - task: PublishTestResults@2 + displayName: 'Publish Test Results **/TEST-*.xml' + inputs: + mergeTestResults: true + testRunTitle: 'gettext-poison' + platform: Linux + publishRunAttachments: false + condition: succeededOrFailed() + - task: PublishBuildArtifacts@1 + displayName: 'Publish trash directories of failed tests' + condition: failed() + inputs: + PathtoPublish: t/failed-test-artifacts + ArtifactName: failed-test-artifacts + +- job: linux32 + displayName: Linux32 + condition: succeeded() + pool: + vmImage: ubuntu-latest + steps: + - bash: | + test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1 + + res=0 + sudo AGENT_OS="$AGENT_OS" BUILD_BUILDNUMBER="$BUILD_BUILDNUMBER" BUILD_REPOSITORY_URI="$BUILD_REPOSITORY_URI" BUILD_SOURCEBRANCH="$BUILD_SOURCEBRANCH" BUILD_SOURCEVERSION="$BUILD_SOURCEVERSION" SYSTEM_PHASENAME="$SYSTEM_PHASENAME" SYSTEM_TASKDEFINITIONSURI="$SYSTEM_TASKDEFINITIONSURI" SYSTEM_TEAMPROJECT="$SYSTEM_TEAMPROJECT" CC=$CC MAKEFLAGS="$MAKEFLAGS" jobname=Linux32 bash -lxc ci/run-docker.sh || res=1 + + sudo chmod a+r t/out/TEST-*.xml + test ! -d t/failed-test-artifacts || sudo chmod a+r t/failed-test-artifacts + + test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount "$HOME/test-cache" || res=1 + exit $res + displayName: 'jobname=Linux32 ci/run-docker.sh' + env: + GITFILESHAREPWD: $(gitfileshare.pwd) + - task: PublishTestResults@2 + displayName: 'Publish Test Results **/TEST-*.xml' + inputs: + mergeTestResults: true + testRunTitle: 'linux32' + platform: Linux + publishRunAttachments: false + condition: succeededOrFailed() + - task: PublishBuildArtifacts@1 + displayName: 'Publish trash directories of failed tests' + condition: failed() + inputs: + PathtoPublish: t/failed-test-artifacts + ArtifactName: failed-test-artifacts + +- job: static_analysis + displayName: StaticAnalysis + condition: succeeded() + pool: + vmImage: ubuntu-latest + steps: + - bash: | + test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1 + + sudo apt-get update && + sudo apt-get install -y coccinelle libcurl4-openssl-dev libssl-dev libexpat-dev gettext && + + export jobname=StaticAnalysis && + + ci/run-static-analysis.sh || exit 1 + + test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount "$HOME/test-cache" || exit 1 + displayName: 'ci/run-static-analysis.sh' + env: + GITFILESHAREPWD: $(gitfileshare.pwd) + +- job: documentation + displayName: Documentation + condition: succeeded() + pool: + vmImage: ubuntu-latest + steps: + - bash: | + test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1 + + sudo apt-get update && + sudo apt-get install -y asciidoc xmlto asciidoctor docbook-xsl-ns && + + export ALREADY_HAVE_ASCIIDOCTOR=yes. && + export jobname=Documentation && + + ci/test-documentation.sh || exit 1 + + test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount "$HOME/test-cache" || exit 1 + displayName: 'ci/test-documentation.sh' + env: + GITFILESHAREPWD: $(gitfileshare.pwd) From 074f5917c873d58951a4aa1b3d9b17abd7f55b16 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 29 Mar 2022 13:42:19 +0200 Subject: [PATCH 151/297] azure-pipeline: drop the `GETTEXT_POISON` job This is a follow-up to 6c280b4142 (ci: remove GETTEXT_POISON jobs, 2021-01-20) after reinstating the Azure Pipeline. Signed-off-by: Johannes Schindelin --- azure-pipelines.yml | 38 -------------------------------------- 1 file changed, 38 deletions(-) diff --git a/azure-pipelines.yml b/azure-pipelines.yml index 11413f66f89662..7b20ad2667fdc9 100644 --- a/azure-pipelines.yml +++ b/azure-pipelines.yml @@ -441,44 +441,6 @@ jobs: PathtoPublish: t/failed-test-artifacts ArtifactName: failed-test-artifacts -- job: gettext_poison - displayName: GETTEXT_POISON - condition: succeeded() - pool: - vmImage: ubuntu-latest - steps: - - bash: | - test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1 - - sudo apt-get update && - sudo apt-get -y install git gcc make libssl-dev libcurl4-openssl-dev libexpat-dev tcl tk gettext git-email zlib1g-dev && - - export jobname=GETTEXT_POISON || exit 1 - - ci/run-build-and-tests.sh || { - ci/print-test-failures.sh - exit 1 - } - - test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount "$HOME/test-cache" || exit 1 - displayName: 'ci/run-build-and-tests.sh' - env: - GITFILESHAREPWD: $(gitfileshare.pwd) - - task: PublishTestResults@2 - displayName: 'Publish Test Results **/TEST-*.xml' - inputs: - mergeTestResults: true - testRunTitle: 'gettext-poison' - platform: Linux - publishRunAttachments: false - condition: succeededOrFailed() - - task: PublishBuildArtifacts@1 - displayName: 'Publish trash directories of failed tests' - condition: failed() - inputs: - PathtoPublish: t/failed-test-artifacts - ArtifactName: failed-test-artifacts - - job: linux32 displayName: Linux32 condition: succeeded() From 06e23810713a9ea841133e2cbf6644c1bdbd8fad Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 29 Mar 2022 12:28:12 +0200 Subject: [PATCH 152/297] azure-pipeline: stop hard-coding `apt-get` calls We have `ci/install-dependencies.sh` for that. Incidentally, this avoids the following error in the linux-* jobs: The following packages have unmet dependencies: git-email : Depends: git (< 1:2.25.1-.) but 1:2.35.1-0ppa1~ubuntu20.04.1 is to be installed Recommends: libemail-valid-perl but it is not going to be installed Signed-off-by: Johannes Schindelin --- azure-pipelines.yml | 7 ------- 1 file changed, 7 deletions(-) diff --git a/azure-pipelines.yml b/azure-pipelines.yml index 7b20ad2667fdc9..e311d3055e5eca 100644 --- a/azure-pipelines.yml +++ b/azure-pipelines.yml @@ -303,9 +303,6 @@ jobs: - bash: | test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1 - sudo apt-get update && - sudo apt-get -y install git gcc make libssl-dev libcurl4-openssl-dev libexpat-dev tcl tk gettext git-email zlib1g-dev apache2-bin && - export CC=clang || exit 1 ci/install-dependencies.sh || exit 1 @@ -342,10 +339,6 @@ jobs: - bash: | test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1 - sudo add-apt-repository ppa:ubuntu-toolchain-r/test && - sudo apt-get update && - sudo apt-get -y install git gcc make libssl-dev libcurl4-openssl-dev libexpat-dev tcl tk gettext git-email zlib1g-dev apache2 language-pack-is git-svn gcc-8 || exit 1 - ci/install-dependencies.sh || exit 1 ci/run-build-and-tests.sh || { ci/print-test-failures.sh From fae12b0d261fb21196a433db84d7f03fbc8d639a Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E5=AD=99=E5=8D=93=E8=AF=86?= Date: Sun, 16 Jan 2022 03:38:33 +0800 Subject: [PATCH 153/297] Add config option `windows.appendAtomically` MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Atomic append on windows is only supported on local disk files, and it may cause errors in other situations, e.g. network file system. If that is the case, this config option should be used to turn atomic append off. Co-Authored-By: Johannes Schindelin Signed-off-by: 孙卓识 Signed-off-by: Johannes Schindelin --- Documentation/config.txt | 2 ++ Documentation/config/windows.txt | 4 ++++ compat/mingw.c | 37 +++++++++++++++++++++++++++++--- 3 files changed, 40 insertions(+), 3 deletions(-) create mode 100644 Documentation/config/windows.txt diff --git a/Documentation/config.txt b/Documentation/config.txt index 8c0b3ed8075214..f1510a85205487 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -554,4 +554,6 @@ include::config/versionsort.txt[] include::config/web.txt[] +include::config/windows.txt[] + include::config/worktree.txt[] diff --git a/Documentation/config/windows.txt b/Documentation/config/windows.txt new file mode 100644 index 00000000000000..fdaaf1c65504f3 --- /dev/null +++ b/Documentation/config/windows.txt @@ -0,0 +1,4 @@ +windows.appendAtomically:: + By default, append atomic API is used on windows. But it works only with + local disk files, if you're working on a network file system, you should + set it false to turn it off. diff --git a/compat/mingw.c b/compat/mingw.c index 29d3f09768c29a..d9477f9e5a4e64 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -1,3 +1,4 @@ +#define USE_THE_REPOSITORY_VARIABLE #include "../git-compat-util.h" #include "win32.h" #include @@ -18,6 +19,7 @@ #include "gettext.h" #define SECURITY_WIN32 #include +#include "../repository.h" #define HCAST(type, handle) ((type)(intptr_t)handle) @@ -546,6 +548,7 @@ static int is_local_named_pipe_path(const char *filename) int mingw_open (const char *filename, int oflags, ...) { + static int append_atomically = -1; typedef int (*open_fn_t)(wchar_t const *wfilename, int oflags, ...); va_list args; unsigned mode; @@ -562,7 +565,16 @@ int mingw_open (const char *filename, int oflags, ...) return -1; } - if ((oflags & O_APPEND) && !is_local_named_pipe_path(filename)) + /* + * Only set append_atomically to default value(1) when repo is initialized + * and fail to get config value + */ + if (append_atomically < 0 && the_repository && the_repository->commondir && + git_config_get_bool("windows.appendatomically", &append_atomically)) + append_atomically = 1; + + if (append_atomically && (oflags & O_APPEND) && + !is_local_named_pipe_path(filename)) open_fn = mingw_open_append; else open_fn = _wopen; @@ -711,9 +723,28 @@ ssize_t mingw_write(int fd, const void *buf, size_t len) /* check if fd is a pipe */ HANDLE h = (HANDLE) _get_osfhandle(fd); - if (GetFileType(h) != FILE_TYPE_PIPE) + if (GetFileType(h) != FILE_TYPE_PIPE) { + if (orig == EINVAL) { + wchar_t path[MAX_PATH]; + DWORD ret = GetFinalPathNameByHandleW(h, path, + ARRAY_SIZE(path), 0); + UINT drive_type = ret > 0 && ret < ARRAY_SIZE(path) ? + GetDriveTypeW(path) : DRIVE_UNKNOWN; + + /* + * The default atomic append causes such an error on + * network file systems, in such a case, it should be + * turned off via config. + * + * `drive_type` of UNC path: DRIVE_NO_ROOT_DIR + */ + if (DRIVE_NO_ROOT_DIR == drive_type || DRIVE_REMOTE == drive_type) + warning("invalid write operation detected; you may try:\n" + "\n\tgit config windows.appendAtomically false"); + } + errno = orig; - else if (orig == EINVAL) + } else if (orig == EINVAL) errno = EPIPE; else { DWORD buf_size; From 30899bf8f53e6d335ed02b08534c4805f596e2c4 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 4 Sep 2017 11:59:45 +0200 Subject: [PATCH 154/297] mingw: change core.fsyncObjectFiles = 1 by default MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit From the documentation of said setting: This boolean will enable fsync() when writing object files. This is a total waste of time and effort on a filesystem that orders data writes properly, but can be useful for filesystems that do not use journalling (traditional UNIX filesystems) or that only journal metadata and not file contents (OS X’s HFS+, or Linux ext3 with "data=writeback"). The most common file system on Windows (NTFS) does not guarantee that order, therefore a sudden loss of power (or any other event causing an unclean shutdown) would cause corrupt files (i.e. files filled with NULs). Therefore we need to change the default. Note that the documentation makes it sound as if this causes really bad performance. In reality, writing loose objects is something that is done only rarely, and only a handful of files at a time. Signed-off-by: Johannes Schindelin --- compat/mingw.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/compat/mingw.c b/compat/mingw.c index d9477f9e5a4e64..47efa7008c227b 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -19,6 +19,7 @@ #include "gettext.h" #define SECURITY_WIN32 #include +#include "../write-or-die.h" #include "../repository.h" #define HCAST(type, handle) ((type)(intptr_t)handle) @@ -3130,6 +3131,7 @@ int wmain(int argc, const wchar_t **wargv) #endif maybe_redirect_std_handles(); + fsync_object_files = 1; /* determine size of argv and environ conversion buffer */ maxlen = wcslen(wargv[0]); From dd7821030cc915925ca2f054d1464c40cfb403d8 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sun, 10 Jul 2022 00:02:30 +0200 Subject: [PATCH 155/297] azure-pipeline: drop the code to write to/read from a file share We haven't used this feature in ages, we don't actually need to. Signed-off-by: Johannes Schindelin --- azure-pipelines.yml | 105 -------------------------------------------- 1 file changed, 105 deletions(-) diff --git a/azure-pipelines.yml b/azure-pipelines.yml index e311d3055e5eca..94bc7ea1f51b47 100644 --- a/azure-pipelines.yml +++ b/azure-pipelines.yml @@ -9,14 +9,6 @@ jobs: vmImage: windows-latest timeoutInMinutes: 240 steps: - - powershell: | - if ("$GITFILESHAREPWD" -ne "" -and "$GITFILESHAREPWD" -ne "`$`(gitfileshare.pwd)") { - net use s: \\gitfileshare.file.core.windows.net\test-cache "$GITFILESHAREPWD" /user:AZURE\gitfileshare /persistent:no - cmd /c mklink /d "$(Build.SourcesDirectory)\test-cache" S:\ - } - displayName: 'Mount test-cache' - env: - GITFILESHAREPWD: $(gitfileshare.pwd) - powershell: | $urlbase = "https://dev.azure.com/git-for-windows/git/_apis/build/builds" $id = ((Invoke-WebRequest -UseBasicParsing "${urlbase}?definitions=22&statusFilter=completed&resultFilter=succeeded&`$top=1").content | ConvertFrom-JSON).value[0].id @@ -49,14 +41,6 @@ jobs: inputs: artifactName: 'git-sdk-64-minimal' targetPath: '$(Build.SourcesDirectory)\git-sdk-64-minimal' - - powershell: | - if ("$GITFILESHAREPWD" -ne "" -and "$GITFILESHAREPWD" -ne "`$`(gitfileshare.pwd)") { - cmd /c rmdir "$(Build.SourcesDirectory)\test-cache" - } - displayName: 'Unmount test-cache' - condition: true - env: - GITFILESHAREPWD: $(gitfileshare.pwd) - job: windows_test displayName: Windows Test @@ -68,14 +52,6 @@ jobs: strategy: parallel: 10 steps: - - powershell: | - if ("$GITFILESHAREPWD" -ne "" -and "$GITFILESHAREPWD" -ne "`$`(gitfileshare.pwd)") { - net use s: \\gitfileshare.file.core.windows.net\test-cache "$GITFILESHAREPWD" /user:AZURE\gitfileshare /persistent:no - cmd /c mklink /d "$(Build.SourcesDirectory)\test-cache" S:\ - } - displayName: 'Mount test-cache' - env: - GITFILESHAREPWD: $(gitfileshare.pwd) - task: DownloadPipelineArtifact@0 displayName: 'Download Pipeline Artifact: test artifacts' inputs: @@ -109,14 +85,6 @@ jobs: MSYSTEM: MINGW64 NO_SVN_TESTS: 1 GIT_TEST_SKIP_REBASE_P: 1 - - powershell: | - if ("$GITFILESHAREPWD" -ne "" -and "$GITFILESHAREPWD" -ne "`$`(gitfileshare.pwd)") { - cmd /c rmdir "$(Build.SourcesDirectory)\test-cache" - } - displayName: 'Unmount test-cache' - condition: true - env: - GITFILESHAREPWD: $(gitfileshare.pwd) - task: PublishTestResults@2 displayName: 'Publish Test Results **/TEST-*.xml' inputs: @@ -139,14 +107,6 @@ jobs: vmImage: windows-latest timeoutInMinutes: 240 steps: - - powershell: | - if ("$GITFILESHAREPWD" -ne "" -and "$GITFILESHAREPWD" -ne "`$`(gitfileshare.pwd)") { - net use s: \\gitfileshare.file.core.windows.net\test-cache "$GITFILESHAREPWD" /user:AZURE\gitfileshare /persistent:no - cmd /c mklink /d "$(Build.SourcesDirectory)\test-cache" S:\ - } - displayName: 'Mount test-cache' - env: - GITFILESHAREPWD: $(gitfileshare.pwd) - powershell: | $urlbase = "https://dev.azure.com/git-for-windows/git/_apis/build/builds" $id = ((Invoke-WebRequest -UseBasicParsing "${urlbase}?definitions=22&statusFilter=completed&resultFilter=succeeded&`$top=1").content | ConvertFrom-JSON).value[0].id @@ -215,14 +175,6 @@ jobs: inputs: artifactName: 'vs-artifacts' targetPath: '$(Build.SourcesDirectory)\artifacts' - - powershell: | - if ("$GITFILESHAREPWD" -ne "" -and "$GITFILESHAREPWD" -ne "`$`(gitfileshare.pwd)") { - cmd /c rmdir "$(Build.SourcesDirectory)\test-cache" - } - displayName: 'Unmount test-cache' - condition: true - env: - GITFILESHAREPWD: $(gitfileshare.pwd) - job: vs_test displayName: Visual Studio Test @@ -234,14 +186,6 @@ jobs: strategy: parallel: 10 steps: - - powershell: | - if ("$GITFILESHAREPWD" -ne "" -and "$GITFILESHAREPWD" -ne "`$`(gitfileshare.pwd)") { - net use s: \\gitfileshare.file.core.windows.net\test-cache "$GITFILESHAREPWD" /user:AZURE\gitfileshare /persistent:no - cmd /c mklink /d "$(Build.SourcesDirectory)\test-cache" S:\ - } - displayName: 'Mount test-cache' - env: - GITFILESHAREPWD: $(gitfileshare.pwd) - task: DownloadPipelineArtifact@0 displayName: 'Download Pipeline Artifact: VS test artifacts' inputs: @@ -271,14 +215,6 @@ jobs: MSYSTEM: MINGW64 NO_SVN_TESTS: 1 GIT_TEST_SKIP_REBASE_P: 1 - - powershell: | - if ("$GITFILESHAREPWD" -ne "" -and "$GITFILESHAREPWD" -ne "`$`(gitfileshare.pwd)") { - cmd /c rmdir "$(Build.SourcesDirectory)\test-cache" - } - displayName: 'Unmount test-cache' - condition: true - env: - GITFILESHAREPWD: $(gitfileshare.pwd) - task: PublishTestResults@2 displayName: 'Publish Test Results **/TEST-*.xml' inputs: @@ -301,8 +237,6 @@ jobs: vmImage: ubuntu-latest steps: - bash: | - test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1 - export CC=clang || exit 1 ci/install-dependencies.sh || exit 1 @@ -310,11 +244,7 @@ jobs: ci/print-test-failures.sh exit 1 } - - test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount "$HOME/test-cache" || exit 1 displayName: 'ci/run-build-and-tests.sh' - env: - GITFILESHAREPWD: $(gitfileshare.pwd) - task: PublishTestResults@2 displayName: 'Publish Test Results **/TEST-*.xml' inputs: @@ -337,18 +267,12 @@ jobs: vmImage: ubuntu-latest steps: - bash: | - test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1 - ci/install-dependencies.sh || exit 1 ci/run-build-and-tests.sh || { ci/print-test-failures.sh exit 1 } - - test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount "$HOME/test-cache" || exit 1 displayName: 'ci/run-build-and-tests.sh' - env: - GITFILESHAREPWD: $(gitfileshare.pwd) - task: PublishTestResults@2 displayName: 'Publish Test Results **/TEST-*.xml' inputs: @@ -371,8 +295,6 @@ jobs: vmImage: macOS-latest steps: - bash: | - test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1 - export CC=clang ci/install-dependencies.sh || exit 1 @@ -380,11 +302,7 @@ jobs: ci/print-test-failures.sh exit 1 } - - test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || umount "$HOME/test-cache" || exit 1 displayName: 'ci/run-build-and-tests.sh' - env: - GITFILESHAREPWD: $(gitfileshare.pwd) - task: PublishTestResults@2 displayName: 'Publish Test Results **/TEST-*.xml' inputs: @@ -407,18 +325,12 @@ jobs: vmImage: macOS-latest steps: - bash: | - test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1 - ci/install-dependencies.sh || exit 1 ci/run-build-and-tests.sh || { ci/print-test-failures.sh exit 1 } - - test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || umount "$HOME/test-cache" || exit 1 displayName: 'ci/run-build-and-tests.sh' - env: - GITFILESHAREPWD: $(gitfileshare.pwd) - task: PublishTestResults@2 displayName: 'Publish Test Results **/TEST-*.xml' inputs: @@ -441,19 +353,14 @@ jobs: vmImage: ubuntu-latest steps: - bash: | - test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1 - res=0 sudo AGENT_OS="$AGENT_OS" BUILD_BUILDNUMBER="$BUILD_BUILDNUMBER" BUILD_REPOSITORY_URI="$BUILD_REPOSITORY_URI" BUILD_SOURCEBRANCH="$BUILD_SOURCEBRANCH" BUILD_SOURCEVERSION="$BUILD_SOURCEVERSION" SYSTEM_PHASENAME="$SYSTEM_PHASENAME" SYSTEM_TASKDEFINITIONSURI="$SYSTEM_TASKDEFINITIONSURI" SYSTEM_TEAMPROJECT="$SYSTEM_TEAMPROJECT" CC=$CC MAKEFLAGS="$MAKEFLAGS" jobname=Linux32 bash -lxc ci/run-docker.sh || res=1 sudo chmod a+r t/out/TEST-*.xml test ! -d t/failed-test-artifacts || sudo chmod a+r t/failed-test-artifacts - test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount "$HOME/test-cache" || res=1 exit $res displayName: 'jobname=Linux32 ci/run-docker.sh' - env: - GITFILESHAREPWD: $(gitfileshare.pwd) - task: PublishTestResults@2 displayName: 'Publish Test Results **/TEST-*.xml' inputs: @@ -476,19 +383,13 @@ jobs: vmImage: ubuntu-latest steps: - bash: | - test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1 - sudo apt-get update && sudo apt-get install -y coccinelle libcurl4-openssl-dev libssl-dev libexpat-dev gettext && export jobname=StaticAnalysis && ci/run-static-analysis.sh || exit 1 - - test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount "$HOME/test-cache" || exit 1 displayName: 'ci/run-static-analysis.sh' - env: - GITFILESHAREPWD: $(gitfileshare.pwd) - job: documentation displayName: Documentation @@ -497,8 +398,6 @@ jobs: vmImage: ubuntu-latest steps: - bash: | - test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || ci/mount-fileshare.sh //gitfileshare.file.core.windows.net/test-cache gitfileshare "$GITFILESHAREPWD" "$HOME/test-cache" || exit 1 - sudo apt-get update && sudo apt-get install -y asciidoc xmlto asciidoctor docbook-xsl-ns && @@ -506,8 +405,4 @@ jobs: export jobname=Documentation && ci/test-documentation.sh || exit 1 - - test "$GITFILESHAREPWD" = '$(gitfileshare.pwd)' || sudo umount "$HOME/test-cache" || exit 1 displayName: 'ci/test-documentation.sh' - env: - GITFILESHAREPWD: $(gitfileshare.pwd) From ae18a1c14aae9ef4eecc2460c1b82a79e7a84e39 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sun, 10 Jul 2022 00:14:53 +0200 Subject: [PATCH 156/297] azure-pipeline: use partial clone/parallel checkout to initialize minimal-sdk The Azure Pipeline `git-sdk-64-minimal` was retired... Signed-off-by: Johannes Schindelin --- azure-pipelines.yml | 125 +++++++++++++++++++------------------------- 1 file changed, 55 insertions(+), 70 deletions(-) diff --git a/azure-pipelines.yml b/azure-pipelines.yml index 94bc7ea1f51b47..f11f1342cd080b 100644 --- a/azure-pipelines.yml +++ b/azure-pipelines.yml @@ -1,5 +1,6 @@ variables: Agent.Source.Git.ShallowFetchDepth: 1 + GIT_CONFIG_PARAMETERS: "'checkout.workers=56' 'user.name=CI' 'user.email=ci@git'" jobs: - job: windows_build @@ -9,28 +10,24 @@ jobs: vmImage: windows-latest timeoutInMinutes: 240 steps: - - powershell: | - $urlbase = "https://dev.azure.com/git-for-windows/git/_apis/build/builds" - $id = ((Invoke-WebRequest -UseBasicParsing "${urlbase}?definitions=22&statusFilter=completed&resultFilter=succeeded&`$top=1").content | ConvertFrom-JSON).value[0].id - $downloadUrl = ((Invoke-WebRequest -UseBasicParsing "${urlbase}/$id/artifacts").content | ConvertFrom-JSON).value[1].resource.downloadUrl - (New-Object Net.WebClient).DownloadFile($downloadUrl,"git-sdk-64-minimal.zip") - Expand-Archive git-sdk-64-minimal.zip -DestinationPath . -Force - Remove-Item git-sdk-64-minimal.zip - + - bash: git clone --bare --depth=1 --filter=blob:none --single-branch -b main https://github.com/git-for-windows/git-sdk-64 + displayName: 'clone git-sdk-64' + - bash: git clone --depth=1 --single-branch -b main https://github.com/git-for-windows/build-extra + displayName: 'clone build-extra' + - bash: sh -x ./build-extra/please.sh create-sdk-artifact --sdk=git-sdk-64.git --out=git-sdk-64-minimal minimal-sdk + displayName: 'build git-sdk-64-minimal-sdk' + - bash: | # Let Git ignore the SDK and the test-cache - "/git-sdk-64-minimal/`n/test-cache/`n" | Out-File -NoNewLine -Encoding ascii -Append "$(Build.SourcesDirectory)\.git\info\exclude" - displayName: 'Download git-sdk-64-minimal' - - powershell: | - & git-sdk-64-minimal\usr\bin\bash.exe -lc @" - ci/make-test-artifacts.sh artifacts - "@ - if (!$?) { exit(1) } + printf "%s\n" /git-sdk-64.git/ /build-extra/ /git-sdk-64-minimal/ /test-cache/ >>'.git/info/exclude' + displayName: 'Ignore untracked directories' + - bash: ci/make-test-artifacts.sh artifacts displayName: Build env: HOME: $(Build.SourcesDirectory) MSYSTEM: MINGW64 DEVELOPER: 1 NO_PERL: 1 + PATH: "$(Build.SourcesDirectory)\\git-sdk-64-minimal\\mingw64\\bin;$(Build.SourcesDirectory)\\git-sdk-64-minimal\\usr\\bin;C:\\Windows\\system32;C:\\Windows;C:\\Windows\\system32\\wbem" - task: PublishPipelineArtifact@0 displayName: 'Publish Pipeline Artifact: test artifacts' inputs: @@ -62,29 +59,27 @@ jobs: inputs: artifactName: 'git-sdk-64-minimal' targetPath: '$(Build.SourcesDirectory)\git-sdk-64-minimal' - - powershell: | - & git-sdk-64-minimal\usr\bin\bash.exe -lc @" - test -f artifacts.tar.gz || { - echo No test artifacts found\; skipping >&2 - exit 0 - } - tar xf artifacts.tar.gz || exit 1 + - bash: | + test -f artifacts.tar.gz || { + echo No test artifacts found\; skipping >&2 + exit 0 + } + tar xf artifacts.tar.gz || exit 1 - # Let Git ignore the SDK and the test-cache - printf '%s\n' /git-sdk-64-minimal/ /test-cache/ >>.git/info/exclude + # Let Git ignore the SDK and the test-cache + printf '%s\n' /git-sdk-64.git/ /build-extra/ /git-sdk-64-minimal/ /test-cache/ >>.git/info/exclude - ci/run-test-slice.sh `$SYSTEM_JOBPOSITIONINPHASE `$SYSTEM_TOTALJOBSINPHASE || { - ci/print-test-failures.sh - exit 1 - } - "@ - if (!$?) { exit(1) } + ci/run-test-slice.sh $SYSTEM_JOBPOSITIONINPHASE $SYSTEM_TOTALJOBSINPHASE || { + ci/print-test-failures.sh + exit 1 + } displayName: 'Test (parallel)' env: HOME: $(Build.SourcesDirectory) MSYSTEM: MINGW64 NO_SVN_TESTS: 1 GIT_TEST_SKIP_REBASE_P: 1 + PATH: "$(Build.SourcesDirectory)\\git-sdk-64-minimal\\mingw64\\bin;$(Build.SourcesDirectory)\\git-sdk-64-minimal\\usr\\bin\\core_perl;$(Build.SourcesDirectory)\\git-sdk-64-minimal\\usr\\bin;C:\\Windows\\system32;C:\\Windows;C:\\Windows\\system32\\wbem" - task: PublishTestResults@2 displayName: 'Publish Test Results **/TEST-*.xml' inputs: @@ -107,29 +102,24 @@ jobs: vmImage: windows-latest timeoutInMinutes: 240 steps: - - powershell: | - $urlbase = "https://dev.azure.com/git-for-windows/git/_apis/build/builds" - $id = ((Invoke-WebRequest -UseBasicParsing "${urlbase}?definitions=22&statusFilter=completed&resultFilter=succeeded&`$top=1").content | ConvertFrom-JSON).value[0].id - $downloadUrl = ((Invoke-WebRequest -UseBasicParsing "${urlbase}/$id/artifacts").content | ConvertFrom-JSON).value[1].resource.downloadUrl - (New-Object Net.WebClient).DownloadFile($downloadUrl,"git-sdk-64-minimal.zip") - Expand-Archive git-sdk-64-minimal.zip -DestinationPath . -Force - Remove-Item git-sdk-64-minimal.zip - + - bash: git clone --bare --depth=1 --filter=blob:none --single-branch -b main https://github.com/git-for-windows/git-sdk-64 + displayName: 'clone git-sdk-64' + - bash: git clone --depth=1 --single-branch -b main https://github.com/git-for-windows/build-extra + displayName: 'clone build-extra' + - bash: sh -x ./build-extra/please.sh create-sdk-artifact --sdk=git-sdk-64.git --out=git-sdk-64-minimal minimal-sdk + displayName: 'build git-sdk-64-minimal-sdk' + - bash: | # Let Git ignore the SDK and the test-cache - "/git-sdk-64-minimal/`n/test-cache/`n" | Out-File -NoNewLine -Encoding ascii -Append "$(Build.SourcesDirectory)\.git\info\exclude" - displayName: 'Download git-sdk-64-minimal' - - powershell: | - & git-sdk-64-minimal\usr\bin\bash.exe -lc @" - make NDEBUG=1 DEVELOPER=1 vcxproj - "@ - if (!$?) { exit(1) } + printf "%s\n" /git-sdk-64-minimal/ /test-cache/ >>'.git/info/exclude' + displayName: 'Ignore untracked directories' + - bash: make NDEBUG=1 DEVELOPER=1 vcxproj displayName: Generate Visual Studio Solution env: HOME: $(Build.SourcesDirectory) MSYSTEM: MINGW64 DEVELOPER: 1 NO_PERL: 1 - GIT_CONFIG_PARAMETERS: "'user.name=CI' 'user.email=ci@git'" + PATH: "$(Build.SourcesDirectory)\\git-sdk-64-minimal\\mingw64\\bin;$(Build.SourcesDirectory)\\git-sdk-64-minimal\\usr\\bin;C:\\Windows\\system32;C:\\Windows;C:\\Windows\\system32\\wbem" - powershell: | $urlbase = "https://dev.azure.com/git/git/_apis/build/builds" $id = ((Invoke-WebRequest -UseBasicParsing "${urlbase}?definitions=9&statusFilter=completed&resultFilter=succeeded&`$top=1").content | ConvertFrom-JSON).value[0].id @@ -145,14 +135,10 @@ jobs: configuration: Release maximumCpuCount: 4 msbuildArguments: /p:PlatformToolset=v142 - - powershell: | - & compat\vcbuild\vcpkg_copy_dlls.bat release - if (!$?) { exit(1) } - & git-sdk-64-minimal\usr\bin\bash.exe -lc @" - mkdir -p artifacts && - eval \"`$(make -n artifacts-tar INCLUDE_DLLS_IN_ARTIFACTS=YesPlease ARTIFACTS_DIRECTORY=artifacts | grep ^tar)\" - "@ - if (!$?) { exit(1) } + - bash: | + ./compat/vcbuild/vcpkg_copy_dlls.bat release && + mkdir -p artifacts && + eval "$(make -n artifacts-tar INCLUDE_DLLS_IN_ARTIFACTS=YesPlease ARTIFACTS_DIRECTORY=artifacts | grep ^tar)" displayName: Bundle artifact tar env: HOME: $(Build.SourcesDirectory) @@ -161,6 +147,7 @@ jobs: NO_PERL: 1 MSVC: 1 VCPKG_ROOT: $(Build.SourcesDirectory)\compat\vcbuild\vcpkg + PATH: "$(Build.SourcesDirectory)\\git-sdk-64-minimal\\mingw64\\bin;$(Build.SourcesDirectory)\\git-sdk-64-minimal\\usr\\bin;C:\\Windows\\system32;C:\\Windows;C:\\Windows\\system32\\wbem" - powershell: | $tag = (Invoke-WebRequest -UseBasicParsing "https://gitforwindows.org/latest-tag.txt").content $version = (Invoke-WebRequest -UseBasicParsing "https://gitforwindows.org/latest-version.txt").content @@ -191,30 +178,28 @@ jobs: inputs: artifactName: 'vs-artifacts' targetPath: '$(Build.SourcesDirectory)' - - powershell: | - & PortableGit\git-cmd.exe --command=usr\bin\bash.exe -lc @" - test -f artifacts.tar.gz || { - echo No test artifacts found\; skipping >&2 - exit 0 - } - tar xf artifacts.tar.gz || exit 1 + - bash: | + test -f artifacts.tar.gz || { + echo No test artifacts found\; skipping >&2 + exit 0 + } + tar xf artifacts.tar.gz || exit 1 - # Let Git ignore the SDK and the test-cache - printf '%s\n' /PortableGit/ /test-cache/ >>.git/info/exclude + # Let Git ignore the SDK and the test-cache + printf '%s\n' /PortableGit/ /test-cache/ >>.git/info/exclude - cd t && - PATH=\"`$PWD/helper:`$PATH\" && - test-tool.exe run-command testsuite --jobs=10 -V -x --write-junit-xml \ - `$(test-tool.exe path-utils slice-tests \ - `$SYSTEM_JOBPOSITIONINPHASE `$SYSTEM_TOTALJOBSINPHASE t[0-9]*.sh) - "@ - if (!$?) { exit(1) } + cd t && + PATH="$PWD/helper:$PATH" && + test-tool.exe run-command testsuite --jobs=10 -V -x --write-junit-xml \ + $(test-tool.exe path-utils slice-tests \ + $SYSTEM_JOBPOSITIONINPHASE $SYSTEM_TOTALJOBSINPHASE t[0-9]*.sh) displayName: 'Test (parallel)' env: HOME: $(Build.SourcesDirectory) MSYSTEM: MINGW64 NO_SVN_TESTS: 1 GIT_TEST_SKIP_REBASE_P: 1 + PATH: "$(Build.SourcesDirectory)\\PortableGit\\mingw64\\bin;$(Build.SourcesDirectory)\\PortableGit\\usr\\bin;C:\\Windows\\system32;C:\\Windows;C:\\Windows\\system32\\wbem" - task: PublishTestResults@2 displayName: 'Publish Test Results **/TEST-*.xml' inputs: From 57cbc426e1f2c5a51c812540dbca895be6541759 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sun, 10 Jul 2022 00:52:40 +0200 Subject: [PATCH 157/297] azure-pipeline: downcase the job name of the `Linux32` job These many refactorings in Git sure are gifts that keep on giving. Signed-off-by: Johannes Schindelin --- azure-pipelines.yml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/azure-pipelines.yml b/azure-pipelines.yml index f11f1342cd080b..21ee5a463380d6 100644 --- a/azure-pipelines.yml +++ b/azure-pipelines.yml @@ -339,13 +339,13 @@ jobs: steps: - bash: | res=0 - sudo AGENT_OS="$AGENT_OS" BUILD_BUILDNUMBER="$BUILD_BUILDNUMBER" BUILD_REPOSITORY_URI="$BUILD_REPOSITORY_URI" BUILD_SOURCEBRANCH="$BUILD_SOURCEBRANCH" BUILD_SOURCEVERSION="$BUILD_SOURCEVERSION" SYSTEM_PHASENAME="$SYSTEM_PHASENAME" SYSTEM_TASKDEFINITIONSURI="$SYSTEM_TASKDEFINITIONSURI" SYSTEM_TEAMPROJECT="$SYSTEM_TEAMPROJECT" CC=$CC MAKEFLAGS="$MAKEFLAGS" jobname=Linux32 bash -lxc ci/run-docker.sh || res=1 + sudo AGENT_OS="$AGENT_OS" BUILD_BUILDNUMBER="$BUILD_BUILDNUMBER" BUILD_REPOSITORY_URI="$BUILD_REPOSITORY_URI" BUILD_SOURCEBRANCH="$BUILD_SOURCEBRANCH" BUILD_SOURCEVERSION="$BUILD_SOURCEVERSION" SYSTEM_PHASENAME="$SYSTEM_PHASENAME" SYSTEM_TASKDEFINITIONSURI="$SYSTEM_TASKDEFINITIONSURI" SYSTEM_TEAMPROJECT="$SYSTEM_TEAMPROJECT" CC=$CC MAKEFLAGS="$MAKEFLAGS" jobname=linux32 bash -lxc ci/run-docker.sh || res=1 sudo chmod a+r t/out/TEST-*.xml test ! -d t/failed-test-artifacts || sudo chmod a+r t/failed-test-artifacts exit $res - displayName: 'jobname=Linux32 ci/run-docker.sh' + displayName: 'jobname=linux32 ci/run-docker.sh' - task: PublishTestResults@2 displayName: 'Publish Test Results **/TEST-*.xml' inputs: From 9c686c9c3785b26467adecec9baeec1b9fca3ae4 Mon Sep 17 00:00:00 2001 From: Dennis Ameling Date: Tue, 4 Oct 2022 09:59:32 +0200 Subject: [PATCH 158/297] config.mak.uname: add support for clangarm64 CLANGARM64 is a relatively new MSYSTEM added by the MSYS2 team. In order to have Git build correctly for this platform, let's add some configuration for it to config.mak.uname. Signed-off-by: Dennis Ameling --- config.mak.uname | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/config.mak.uname b/config.mak.uname index 85d63821ec95f6..8f7166bfc8a22a 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -723,6 +723,10 @@ ifeq ($(uname_S),MINGW) prefix = /mingw64 HOST_CPU = x86_64 BASIC_LDFLAGS += -Wl,--pic-executable,-e,mainCRTStartup + else ifeq (CLANGARM64,$(MSYSTEM)) + prefix = /clangarm64 + HOST_CPU = aarch64 + BASIC_LDFLAGS += -Wl,--pic-executable,-e,mainCRTStartup else COMPAT_CFLAGS += -D_USE_32BIT_TIME_T BASIC_LDFLAGS += -Wl,--large-address-aware From a4f614160051cbc3873c8afae63a32b84dd954b8 Mon Sep 17 00:00:00 2001 From: Taylor Blau Date: Mon, 8 Feb 2021 16:22:34 -0500 Subject: [PATCH 159/297] azure-pipeline: run static-analysis on jammy This is inspired by d051ed77ee6 (.github/workflows/main.yml: run static-analysis on bionic, 2021-02-08) and by ef46584831 (ci: update 'static-analysis' to Ubuntu 22.04, 2022-08-23), adapted to the Azure Pipeline. When Azure Pipelines' build agents transitioned 'ubuntu-latest' from 18.04 to 20.04, it broke our `static-analysis` job, since Coccinelle was not madeavailable on Ubuntu focal (it is only available in the universe suite). This is not an issue with Ubuntu 22.04, but we will only know whether it is an issue with 24.04 when _that_ comes out. So let's play it safe and pin the `static_analysis` job to the latest Ubuntu version that we know to offer a working Coccinelle package. Signed-off-by: Taylor Blau Signed-off-by: Junio C Hamano Signed-off-by: Johannes Schindelin --- azure-pipelines.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/azure-pipelines.yml b/azure-pipelines.yml index 21ee5a463380d6..fed26341d49a4a 100644 --- a/azure-pipelines.yml +++ b/azure-pipelines.yml @@ -365,7 +365,7 @@ jobs: displayName: StaticAnalysis condition: succeeded() pool: - vmImage: ubuntu-latest + vmImage: ubuntu-22.04 steps: - bash: | sudo apt-get update && From be6056f6489fa43814b1070960f35d37448d4cea Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Matthias=20A=C3=9Fhauer?= Date: Sun, 10 Jul 2022 11:27:25 +0200 Subject: [PATCH 160/297] MinGW: link as terminal server aware MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Whith Windows 2000, Microsoft introduced a flag to the PE header to mark executables as "terminal server aware". Windows terminal servers provide a redirected Windows directory and redirected registry hives when launching legacy applications without this flag set. Since we do not use any INI files in the Windows directory and don't write to the registry, we don't need this additional preparation. Telling the OS that we don't need this should provide slightly improved startup times in terminal server environments. When building for supported Windows Versions with MSVC the /TSAWARE linker flag is automatically set, but MinGW requires us to set the --tsaware flag manually. This partially addresses https://github.com/git-for-windows/git/issues/3935. Signed-off-by: Matthias Aßhauer --- config.mak.uname | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/config.mak.uname b/config.mak.uname index 85d63821ec95f6..2fa115cb940fed 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -690,7 +690,7 @@ ifeq ($(uname_S),MINGW) DEFAULT_HELP_FORMAT = html HAVE_PLATFORM_PROCINFO = YesPlease CSPRNG_METHOD = rtlgenrandom - BASIC_LDFLAGS += -municode + BASIC_LDFLAGS += -municode -Wl,--tsaware COMPAT_CFLAGS += -DNOGDI -Icompat -Icompat/win32 COMPAT_CFLAGS += -DSTRIP_EXTENSION=\".exe\" COMPAT_OBJS += compat/mingw.o compat/winansi.o \ From 40fe63356ac69690d71ed0472faf931d934f0279 Mon Sep 17 00:00:00 2001 From: Kiel Hurley Date: Wed, 2 Nov 2022 22:56:16 +1300 Subject: [PATCH 161/297] Fix Windows version resources Add FileVersion, which is a required field As not all required fields were present, none were being included Fixes #4090 Signed-off-by: Kiel Hurley --- git.rc | 1 + 1 file changed, 1 insertion(+) diff --git a/git.rc b/git.rc index cc3fdc6cc6cb83..9c08cb214ac4a9 100644 --- a/git.rc +++ b/git.rc @@ -12,6 +12,7 @@ BEGIN VALUE "OriginalFilename", "git.exe\0" VALUE "ProductName", "Git\0" VALUE "ProductVersion", GIT_VERSION "\0" + VALUE "FileVersion", GIT_VERSION "\0" END END From 3c8827b72d978831d142c30098e6f57903829928 Mon Sep 17 00:00:00 2001 From: Dennis Ameling Date: Tue, 4 Oct 2022 09:58:10 +0200 Subject: [PATCH 162/297] bswap.h: add support for built-in bswap functions Newer compiler versions, like GCC 10 and Clang 12, have built-in functions for bswap32 and bswap64. This comes in handy, for example, when targeting CLANGARM64 on Windows, which would not be supported without this logic. Signed-off-by: Dennis Ameling --- compat/bswap.h | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/compat/bswap.h b/compat/bswap.h index 512f6f4b9937c8..a443e99eef2f1c 100644 --- a/compat/bswap.h +++ b/compat/bswap.h @@ -35,7 +35,19 @@ static inline uint64_t default_bswap64(uint64_t val) #undef bswap32 #undef bswap64 -#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__)) +/** + * __has_builtin is available since Clang 10 and GCC 10. + * Below is a fallback for older compilers. + */ +#ifndef __has_builtin + #define __has_builtin(x) 0 +#endif + +#if __has_builtin(__builtin_bswap32) && __has_builtin(__builtin_bswap64) +#define bswap32(x) __builtin_bswap32((x)) +#define bswap64(x) __builtin_bswap64((x)) + +#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__)) #define bswap32 git_bswap32 static inline uint32_t git_bswap32(uint32_t x) From f3d1c551b3ca748f45abedf90d866f8d54a8caca Mon Sep 17 00:00:00 2001 From: Dennis Ameling Date: Tue, 4 Oct 2022 10:09:10 +0200 Subject: [PATCH 163/297] ci: create clangarm64-build.yml No GitHub-hosted ARM64 runners are available at the moment of writing, but folks can leverage self-hosted runners of this architecture. This CI pipeline comes in handy for forks of the git-for-windows/git project that have such runners available. The pipeline can be kicked off manually through a workflow_dispatch. Signed-off-by: Dennis Ameling --- .github/workflows/clangarm64-build.yml | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) create mode 100644 .github/workflows/clangarm64-build.yml diff --git a/.github/workflows/clangarm64-build.yml b/.github/workflows/clangarm64-build.yml new file mode 100644 index 00000000000000..79d1c7589d8033 --- /dev/null +++ b/.github/workflows/clangarm64-build.yml @@ -0,0 +1,25 @@ +name: CLANG build ARM64 + +on: + workflow_dispatch: + +defaults: + run: + shell: bash + +jobs: + clang-build: + runs-on: [Windows, ARM64] + env: + NO_PERL: 1 + steps: + - uses: actions/checkout@v4 + - uses: git-for-windows/setup-git-for-windows-sdk@v1 + with: + flavor: makepkg-git + architecture: aarch64 + # This assumes that the job is running on a self-hosted runner, + # in which case we need to cleanup SDK files. + cleanup: true + - name: Build Git CLANGARM64 + run: make -j`nproc` From 7eeb802785abb0fa88f775eba8ea761d37f2e9ff Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sat, 6 May 2023 22:26:15 +0200 Subject: [PATCH 164/297] http: optionally load libcurl lazily This compile-time option allows to ask Git to load libcurl dynamically at runtime. Together with a follow-up patch that optionally overrides the file name depending on the `http.sslBackend` setting, this kicks open the door for installing multiple libcurl flavors side by side, and load the one corresponding to the (runtime-)configured SSL/TLS backend. Signed-off-by: Johannes Schindelin --- Makefile | 28 +++- compat/lazyload-curl.c | 354 +++++++++++++++++++++++++++++++++++++++++ 2 files changed, 375 insertions(+), 7 deletions(-) create mode 100644 compat/lazyload-curl.c diff --git a/Makefile b/Makefile index d6479092a0b8df..61212a3ed25618 100644 --- a/Makefile +++ b/Makefile @@ -467,6 +467,11 @@ include shared.mak # # CURL_LDFLAGS=-lcurl # +# Define LAZYLOAD_LIBCURL to dynamically load the libcurl; This can be useful +# if Multiple libcurl versions exist (with different file names) that link to +# various SSL/TLS backends, to support the `http.sslBackend` runtime switch in +# such a scenario. +# # === Optional library: libpcre2 === # # Define USE_LIBPCRE if you have and want to use libpcre. Various @@ -1616,10 +1621,19 @@ else CURL_LIBCURL = endif - ifndef CURL_LDFLAGS - CURL_LDFLAGS = $(eval CURL_LDFLAGS := $$(shell $$(CURL_CONFIG) --libs))$(CURL_LDFLAGS) + ifdef LAZYLOAD_LIBCURL + LAZYLOAD_LIBCURL_OBJ = compat/lazyload-curl.o + OBJECTS += $(LAZYLOAD_LIBCURL_OBJ) + # The `CURL_STATICLIB` constant must be defined to avoid seeing the functions + # declared as DLL imports + CURL_CFLAGS = -DCURL_STATICLIB + CURL_LIBCURL = -ldl + else + ifndef CURL_LDFLAGS + CURL_LDFLAGS = $(eval CURL_LDFLAGS := $$(shell $$(CURL_CONFIG) --libs))$(CURL_LDFLAGS) + endif + CURL_LIBCURL += $(CURL_LDFLAGS) endif - CURL_LIBCURL += $(CURL_LDFLAGS) ifndef CURL_CFLAGS CURL_CFLAGS = $(eval CURL_CFLAGS := $$(shell $$(CURL_CONFIG) --cflags))$(CURL_CFLAGS) @@ -1640,7 +1654,7 @@ else endif ifdef USE_CURL_FOR_IMAP_SEND BASIC_CFLAGS += -DUSE_CURL_FOR_IMAP_SEND - IMAP_SEND_BUILDDEPS = http.o + IMAP_SEND_BUILDDEPS = http.o $(LAZYLOAD_LIBCURL_OBJ) IMAP_SEND_LDFLAGS += $(CURL_LIBCURL) endif ifndef NO_EXPAT @@ -2831,10 +2845,10 @@ git-imap-send$X: imap-send.o $(IMAP_SEND_BUILDDEPS) GIT-LDFLAGS $(GITLIBS) $(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \ $(IMAP_SEND_LDFLAGS) $(LIBS) -git-http-fetch$X: http.o http-walker.o http-fetch.o GIT-LDFLAGS $(GITLIBS) +git-http-fetch$X: http.o http-walker.o http-fetch.o $(LAZYLOAD_LIBCURL_OBJ) GIT-LDFLAGS $(GITLIBS) $(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \ $(CURL_LIBCURL) $(LIBS) -git-http-push$X: http.o http-push.o GIT-LDFLAGS $(GITLIBS) +git-http-push$X: http.o http-push.o $(LAZYLOAD_LIBCURL_OBJ) GIT-LDFLAGS $(GITLIBS) $(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \ $(CURL_LIBCURL) $(EXPAT_LIBEXPAT) $(LIBS) @@ -2844,7 +2858,7 @@ $(REMOTE_CURL_ALIASES): $(REMOTE_CURL_PRIMARY) ln -s $< $@ 2>/dev/null || \ cp $< $@ -$(REMOTE_CURL_PRIMARY): remote-curl.o http.o http-walker.o GIT-LDFLAGS $(GITLIBS) +$(REMOTE_CURL_PRIMARY): remote-curl.o http.o http-walker.o $(LAZYLOAD_LIBCURL_OBJ) GIT-LDFLAGS $(GITLIBS) $(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \ $(CURL_LIBCURL) $(EXPAT_LIBEXPAT) $(LIBS) diff --git a/compat/lazyload-curl.c b/compat/lazyload-curl.c new file mode 100644 index 00000000000000..19aa2b6d4b6942 --- /dev/null +++ b/compat/lazyload-curl.c @@ -0,0 +1,354 @@ +#include "../git-compat-util.h" +#include "../git-curl-compat.h" +#include + +/* + * The ABI version of libcurl is encoded in its shared libraries' file names. + * This ABI version has not changed since October 2006 and is unlikely to be + * changed in the future. See https://curl.se/libcurl/abi.html for details. + */ +#define LIBCURL_ABI_VERSION "4" + +typedef void (*func_t)(void); + +#ifdef __APPLE__ +#define LIBCURL_FILE_NAME(base) base "." LIBCURL_ABI_VERSION ".dylib" +#else +#define LIBCURL_FILE_NAME(base) base ".so." LIBCURL_ABI_VERSION +#endif + +static void *load_library(const char *name) +{ + return dlopen(name, RTLD_LAZY); +} + +static func_t load_function(void *handle, const char *name) +{ + /* + * Casting the return value of `dlsym()` to a function pointer is + * explicitly allowed in recent POSIX standards, but GCC complains + * about this in pedantic mode nevertheless. For more about this issue, + * see https://stackoverflow.com/q/31526876/1860823 and + * http://stackoverflow.com/a/36385690/1905491. + */ + func_t f; + *(void **)&f = dlsym(handle, name); + return f; +} + +typedef struct curl_version_info_data *(*curl_version_info_type)(CURLversion version); +static curl_version_info_type curl_version_info_func; + +typedef char *(*curl_easy_escape_type)(CURL *handle, const char *string, int length); +static curl_easy_escape_type curl_easy_escape_func; + +typedef void (*curl_free_type)(void *p); +static curl_free_type curl_free_func; + +typedef CURLcode (*curl_global_init_type)(long flags); +static curl_global_init_type curl_global_init_func; + +typedef CURLsslset (*curl_global_sslset_type)(curl_sslbackend id, const char *name, const curl_ssl_backend ***avail); +static curl_global_sslset_type curl_global_sslset_func; + +typedef void (*curl_global_cleanup_type)(void); +static curl_global_cleanup_type curl_global_cleanup_func; + +typedef struct curl_slist *(*curl_slist_append_type)(struct curl_slist *list, const char *data); +static curl_slist_append_type curl_slist_append_func; + +typedef void (*curl_slist_free_all_type)(struct curl_slist *list); +static curl_slist_free_all_type curl_slist_free_all_func; + +typedef const char *(*curl_easy_strerror_type)(CURLcode error); +static curl_easy_strerror_type curl_easy_strerror_func; + +typedef CURLM *(*curl_multi_init_type)(void); +static curl_multi_init_type curl_multi_init_func; + +typedef CURLMcode (*curl_multi_add_handle_type)(CURLM *multi_handle, CURL *curl_handle); +static curl_multi_add_handle_type curl_multi_add_handle_func; + +typedef CURLMcode (*curl_multi_remove_handle_type)(CURLM *multi_handle, CURL *curl_handle); +static curl_multi_remove_handle_type curl_multi_remove_handle_func; + +typedef CURLMcode (*curl_multi_fdset_type)(CURLM *multi_handle, fd_set *read_fd_set, fd_set *write_fd_set, fd_set *exc_fd_set, int *max_fd); +static curl_multi_fdset_type curl_multi_fdset_func; + +typedef CURLMcode (*curl_multi_perform_type)(CURLM *multi_handle, int *running_handles); +static curl_multi_perform_type curl_multi_perform_func; + +typedef CURLMcode (*curl_multi_cleanup_type)(CURLM *multi_handle); +static curl_multi_cleanup_type curl_multi_cleanup_func; + +typedef CURLMsg *(*curl_multi_info_read_type)(CURLM *multi_handle, int *msgs_in_queue); +static curl_multi_info_read_type curl_multi_info_read_func; + +typedef const char *(*curl_multi_strerror_type)(CURLMcode error); +static curl_multi_strerror_type curl_multi_strerror_func; + +typedef CURLMcode (*curl_multi_timeout_type)(CURLM *multi_handle, long *milliseconds); +static curl_multi_timeout_type curl_multi_timeout_func; + +typedef CURL *(*curl_easy_init_type)(void); +static curl_easy_init_type curl_easy_init_func; + +typedef CURLcode (*curl_easy_perform_type)(CURL *curl); +static curl_easy_perform_type curl_easy_perform_func; + +typedef void (*curl_easy_cleanup_type)(CURL *curl); +static curl_easy_cleanup_type curl_easy_cleanup_func; + +typedef CURL *(*curl_easy_duphandle_type)(CURL *curl); +static curl_easy_duphandle_type curl_easy_duphandle_func; + +typedef CURLcode (*curl_easy_getinfo_long_type)(CURL *curl, CURLINFO info, long *value); +static curl_easy_getinfo_long_type curl_easy_getinfo_long_func; + +typedef CURLcode (*curl_easy_getinfo_pointer_type)(CURL *curl, CURLINFO info, void **value); +static curl_easy_getinfo_pointer_type curl_easy_getinfo_pointer_func; + +typedef CURLcode (*curl_easy_getinfo_off_t_type)(CURL *curl, CURLINFO info, curl_off_t *value); +static curl_easy_getinfo_off_t_type curl_easy_getinfo_off_t_func; + +typedef CURLcode (*curl_easy_setopt_long_type)(CURL *curl, CURLoption opt, long value); +static curl_easy_setopt_long_type curl_easy_setopt_long_func; + +typedef CURLcode (*curl_easy_setopt_pointer_type)(CURL *curl, CURLoption opt, void *value); +static curl_easy_setopt_pointer_type curl_easy_setopt_pointer_func; + +typedef CURLcode (*curl_easy_setopt_off_t_type)(CURL *curl, CURLoption opt, curl_off_t value); +static curl_easy_setopt_off_t_type curl_easy_setopt_off_t_func; + +static void lazy_load_curl(void) +{ + static int initialized; + void *libcurl; + func_t curl_easy_getinfo_func, curl_easy_setopt_func; + + if (initialized) + return; + + initialized = 1; + libcurl = load_library(LIBCURL_FILE_NAME("libcurl")); + if (!libcurl) + die("failed to load library '%s'", LIBCURL_FILE_NAME("libcurl")); + + curl_version_info_func = (curl_version_info_type)load_function(libcurl, "curl_version_info"); + curl_easy_escape_func = (curl_easy_escape_type)load_function(libcurl, "curl_easy_escape"); + curl_free_func = (curl_free_type)load_function(libcurl, "curl_free"); + curl_global_init_func = (curl_global_init_type)load_function(libcurl, "curl_global_init"); + curl_global_sslset_func = (curl_global_sslset_type)load_function(libcurl, "curl_global_sslset"); + curl_global_cleanup_func = (curl_global_cleanup_type)load_function(libcurl, "curl_global_cleanup"); + curl_slist_append_func = (curl_slist_append_type)load_function(libcurl, "curl_slist_append"); + curl_slist_free_all_func = (curl_slist_free_all_type)load_function(libcurl, "curl_slist_free_all"); + curl_easy_strerror_func = (curl_easy_strerror_type)load_function(libcurl, "curl_easy_strerror"); + curl_multi_init_func = (curl_multi_init_type)load_function(libcurl, "curl_multi_init"); + curl_multi_add_handle_func = (curl_multi_add_handle_type)load_function(libcurl, "curl_multi_add_handle"); + curl_multi_remove_handle_func = (curl_multi_remove_handle_type)load_function(libcurl, "curl_multi_remove_handle"); + curl_multi_fdset_func = (curl_multi_fdset_type)load_function(libcurl, "curl_multi_fdset"); + curl_multi_perform_func = (curl_multi_perform_type)load_function(libcurl, "curl_multi_perform"); + curl_multi_cleanup_func = (curl_multi_cleanup_type)load_function(libcurl, "curl_multi_cleanup"); + curl_multi_info_read_func = (curl_multi_info_read_type)load_function(libcurl, "curl_multi_info_read"); + curl_multi_strerror_func = (curl_multi_strerror_type)load_function(libcurl, "curl_multi_strerror"); + curl_multi_timeout_func = (curl_multi_timeout_type)load_function(libcurl, "curl_multi_timeout"); + curl_easy_init_func = (curl_easy_init_type)load_function(libcurl, "curl_easy_init"); + curl_easy_perform_func = (curl_easy_perform_type)load_function(libcurl, "curl_easy_perform"); + curl_easy_cleanup_func = (curl_easy_cleanup_type)load_function(libcurl, "curl_easy_cleanup"); + curl_easy_duphandle_func = (curl_easy_duphandle_type)load_function(libcurl, "curl_easy_duphandle"); + + curl_easy_getinfo_func = load_function(libcurl, "curl_easy_getinfo"); + curl_easy_getinfo_long_func = (curl_easy_getinfo_long_type)curl_easy_getinfo_func; + curl_easy_getinfo_pointer_func = (curl_easy_getinfo_pointer_type)curl_easy_getinfo_func; + curl_easy_getinfo_off_t_func = (curl_easy_getinfo_off_t_type)curl_easy_getinfo_func; + + curl_easy_setopt_func = load_function(libcurl, "curl_easy_setopt"); + curl_easy_setopt_long_func = (curl_easy_setopt_long_type)curl_easy_setopt_func; + curl_easy_setopt_pointer_func = (curl_easy_setopt_pointer_type)curl_easy_setopt_func; + curl_easy_setopt_off_t_func = (curl_easy_setopt_off_t_type)curl_easy_setopt_func; +} + +struct curl_version_info_data *curl_version_info(CURLversion version) +{ + lazy_load_curl(); + return curl_version_info_func(version); +} + +char *curl_easy_escape(CURL *handle, const char *string, int length) +{ + lazy_load_curl(); + return curl_easy_escape_func(handle, string, length); +} + +void curl_free(void *p) +{ + lazy_load_curl(); + curl_free_func(p); +} + +CURLcode curl_global_init(long flags) +{ + lazy_load_curl(); + return curl_global_init_func(flags); +} + +CURLsslset curl_global_sslset(curl_sslbackend id, const char *name, const curl_ssl_backend ***avail) +{ + lazy_load_curl(); + return curl_global_sslset_func(id, name, avail); +} + +void curl_global_cleanup(void) +{ + lazy_load_curl(); + curl_global_cleanup_func(); +} + +struct curl_slist *curl_slist_append(struct curl_slist *list, const char *data) +{ + lazy_load_curl(); + return curl_slist_append_func(list, data); +} + +void curl_slist_free_all(struct curl_slist *list) +{ + lazy_load_curl(); + curl_slist_free_all_func(list); +} + +const char *curl_easy_strerror(CURLcode error) +{ + lazy_load_curl(); + return curl_easy_strerror_func(error); +} + +CURLM *curl_multi_init(void) +{ + lazy_load_curl(); + return curl_multi_init_func(); +} + +CURLMcode curl_multi_add_handle(CURLM *multi_handle, CURL *curl_handle) +{ + lazy_load_curl(); + return curl_multi_add_handle_func(multi_handle, curl_handle); +} + +CURLMcode curl_multi_remove_handle(CURLM *multi_handle, CURL *curl_handle) +{ + lazy_load_curl(); + return curl_multi_remove_handle_func(multi_handle, curl_handle); +} + +CURLMcode curl_multi_fdset(CURLM *multi_handle, fd_set *read_fd_set, fd_set *write_fd_set, fd_set *exc_fd_set, int *max_fd) +{ + lazy_load_curl(); + return curl_multi_fdset_func(multi_handle, read_fd_set, write_fd_set, exc_fd_set, max_fd); +} + +CURLMcode curl_multi_perform(CURLM *multi_handle, int *running_handles) +{ + lazy_load_curl(); + return curl_multi_perform_func(multi_handle, running_handles); +} + +CURLMcode curl_multi_cleanup(CURLM *multi_handle) +{ + lazy_load_curl(); + return curl_multi_cleanup_func(multi_handle); +} + +CURLMsg *curl_multi_info_read(CURLM *multi_handle, int *msgs_in_queue) +{ + lazy_load_curl(); + return curl_multi_info_read_func(multi_handle, msgs_in_queue); +} + +const char *curl_multi_strerror(CURLMcode error) +{ + lazy_load_curl(); + return curl_multi_strerror_func(error); +} + +CURLMcode curl_multi_timeout(CURLM *multi_handle, long *milliseconds) +{ + lazy_load_curl(); + return curl_multi_timeout_func(multi_handle, milliseconds); +} + +CURL *curl_easy_init(void) +{ + lazy_load_curl(); + return curl_easy_init_func(); +} + +CURLcode curl_easy_perform(CURL *curl) +{ + lazy_load_curl(); + return curl_easy_perform_func(curl); +} + +void curl_easy_cleanup(CURL *curl) +{ + lazy_load_curl(); + curl_easy_cleanup_func(curl); +} + +CURL *curl_easy_duphandle(CURL *curl) +{ + lazy_load_curl(); + return curl_easy_duphandle_func(curl); +} + +#ifndef CURL_IGNORE_DEPRECATION +#define CURL_IGNORE_DEPRECATION(x) x +#endif + +#ifndef CURLOPTTYPE_BLOB +#define CURLOPTTYPE_BLOB 40000 +#endif + +#undef curl_easy_getinfo +CURLcode curl_easy_getinfo(CURL *curl, CURLINFO info, ...) +{ + va_list ap; + CURLcode res; + + va_start(ap, info); + lazy_load_curl(); + CURL_IGNORE_DEPRECATION( + if (info >= CURLINFO_LONG && info < CURLINFO_DOUBLE) + res = curl_easy_getinfo_long_func(curl, info, va_arg(ap, long *)); + else if ((info >= CURLINFO_STRING && info < CURLINFO_LONG) || + (info >= CURLINFO_SLIST && info < CURLINFO_SOCKET)) + res = curl_easy_getinfo_pointer_func(curl, info, va_arg(ap, void **)); + else if (info >= CURLINFO_OFF_T) + res = curl_easy_getinfo_off_t_func(curl, info, va_arg(ap, curl_off_t *)); + else + die("%s:%d: TODO (info: %d)!", __FILE__, __LINE__, info); + ) + va_end(ap); + return res; +} + +#undef curl_easy_setopt +CURLcode curl_easy_setopt(CURL *curl, CURLoption opt, ...) +{ + va_list ap; + CURLcode res; + + va_start(ap, opt); + lazy_load_curl(); + CURL_IGNORE_DEPRECATION( + if (opt >= CURLOPTTYPE_LONG && opt < CURLOPTTYPE_OBJECTPOINT) + res = curl_easy_setopt_long_func(curl, opt, va_arg(ap, long)); + else if (opt >= CURLOPTTYPE_OBJECTPOINT && opt < CURLOPTTYPE_OFF_T) + res = curl_easy_setopt_pointer_func(curl, opt, va_arg(ap, void *)); + else if (opt >= CURLOPTTYPE_OFF_T && opt < CURLOPTTYPE_BLOB) + res = curl_easy_setopt_off_t_func(curl, opt, va_arg(ap, curl_off_t)); + else + die("%s:%d: TODO (opt: %d)!", __FILE__, __LINE__, opt); + ) + va_end(ap); + return res; +} From 00f10889015f366d7c93f8cf2162a1d884f9e62f Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sun, 7 May 2023 22:51:52 +0200 Subject: [PATCH 165/297] http: support lazy-loading libcurl also on Windows This implements the Windows-specific support code, because everything is slightly different on Windows, even loading shared libraries. Note: I specifically do _not_ use the code from `compat/win32/lazyload.h` here because that code is optimized for loading individual functions from various system DLLs, while we specifically want to load _many_ functions from _one_ DLL here, and distinctly not a system DLL (we expect libcurl to be located outside `C:\Windows\system32`, something `INIT_PROC_ADDR` refuses to work with). Also, the `curl_easy_getinfo()`/`curl_easy_setopt()` functions are declared as vararg functions, which `lazyload.h` cannot handle. Finally, we are about to optionally override the exact file name that is to be loaded, which is a goal contrary to `lazyload.h`'s design. Signed-off-by: Johannes Schindelin --- Makefile | 4 ++++ compat/lazyload-curl.c | 52 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 56 insertions(+) diff --git a/Makefile b/Makefile index 61212a3ed25618..ee583473708a05 100644 --- a/Makefile +++ b/Makefile @@ -1627,7 +1627,11 @@ else # The `CURL_STATICLIB` constant must be defined to avoid seeing the functions # declared as DLL imports CURL_CFLAGS = -DCURL_STATICLIB +ifneq ($(uname_S),MINGW) +ifneq ($(uname_S),Windows) CURL_LIBCURL = -ldl +endif +endif else ifndef CURL_LDFLAGS CURL_LDFLAGS = $(eval CURL_LDFLAGS := $$(shell $$(CURL_CONFIG) --libs))$(CURL_LDFLAGS) diff --git a/compat/lazyload-curl.c b/compat/lazyload-curl.c index 19aa2b6d4b6942..98d73fb0f2a66f 100644 --- a/compat/lazyload-curl.c +++ b/compat/lazyload-curl.c @@ -1,6 +1,8 @@ #include "../git-compat-util.h" #include "../git-curl-compat.h" +#ifndef WIN32 #include +#endif /* * The ABI version of libcurl is encoded in its shared libraries' file names. @@ -11,6 +13,7 @@ typedef void (*func_t)(void); +#ifndef WIN32 #ifdef __APPLE__ #define LIBCURL_FILE_NAME(base) base "." LIBCURL_ABI_VERSION ".dylib" #else @@ -35,6 +38,55 @@ static func_t load_function(void *handle, const char *name) *(void **)&f = dlsym(handle, name); return f; } +#else +#define LIBCURL_FILE_NAME(base) base "-" LIBCURL_ABI_VERSION ".dll" + +static void *load_library(const char *name) +{ + size_t name_size = strlen(name) + 1; + const char *path = getenv("PATH"); + char dll_path[MAX_PATH]; + + while (path && *path) { + const char *sep = strchrnul(path, ';'); + size_t len = sep - path; + + if (len && len + name_size < sizeof(dll_path)) { + memcpy(dll_path, path, len); + dll_path[len] = '/'; + memcpy(dll_path + len + 1, name, name_size); + + if (!access(dll_path, R_OK)) { + wchar_t wpath[MAX_PATH]; + int wlen = MultiByteToWideChar(CP_UTF8, 0, dll_path, -1, wpath, ARRAY_SIZE(wpath)); + void *res = wlen ? (void *)LoadLibraryExW(wpath, NULL, 0) : NULL; + if (!res) { + DWORD err = GetLastError(); + char buf[1024]; + + if (!FormatMessageA(FORMAT_MESSAGE_FROM_SYSTEM | + FORMAT_MESSAGE_ARGUMENT_ARRAY | + FORMAT_MESSAGE_IGNORE_INSERTS, + NULL, err, LANG_NEUTRAL, + buf, sizeof(buf) - 1, NULL)) + xsnprintf(buf, sizeof(buf), "last error: %ld", err); + error("LoadLibraryExW() failed with: %s", buf); + } + return res; + } + } + + path = *sep ? sep + 1 : NULL; + } + + return NULL; +} + +static func_t load_function(void *handle, const char *name) +{ + return (func_t)GetProcAddress((HANDLE)handle, name); +} +#endif typedef struct curl_version_info_data *(*curl_version_info_type)(CURLversion version); static curl_version_info_type curl_version_info_func; From 6d8e07c02d92302e0568871bb93bc77782e17b5f Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sun, 7 May 2023 22:05:33 +0200 Subject: [PATCH 166/297] http: when loading libcurl lazily, allow for multiple SSL backends The previous commits introduced a compile-time option to load libcurl lazily, but it uses the hard-coded name "libcurl-4.dll" (or equivalent on platforms other than Windows). To allow for installing multiple libcurl flavors side by side, where each supports one specific SSL/TLS backend, let's first look whether `libcurl--4.dll` exists, and only use `libcurl-4.dll` as a fall back. That will allow us to ship with a libcurl by default that only supports the Secure Channel backend for the `https://` protocol. This libcurl won't suffer from any dependency problem when upgrading OpenSSL to a new major version (which will change the DLL name, and hence break every program and library that depends on it). This is crucial because Git for Windows relies on libcurl to keep working when building and deploying a new OpenSSL package because that library is used by `git fetch` and `git clone`. Note that this feature is by no means specific to Windows. On Ubuntu, for example, a `git` built using `LAZY_LOAD_LIBCURL` will use `libcurl.so.4` for `http.sslbackend=openssl` and `libcurl-gnutls.so.4` for `http.sslbackend=gnutls`. Signed-off-by: Johannes Schindelin --- compat/lazyload-curl.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/compat/lazyload-curl.c b/compat/lazyload-curl.c index 98d73fb0f2a66f..2f2b2ebc27993e 100644 --- a/compat/lazyload-curl.c +++ b/compat/lazyload-curl.c @@ -172,17 +172,26 @@ static curl_easy_setopt_pointer_type curl_easy_setopt_pointer_func; typedef CURLcode (*curl_easy_setopt_off_t_type)(CURL *curl, CURLoption opt, curl_off_t value); static curl_easy_setopt_off_t_type curl_easy_setopt_off_t_func; +static char ssl_backend[64]; + static void lazy_load_curl(void) { static int initialized; - void *libcurl; + void *libcurl = NULL; func_t curl_easy_getinfo_func, curl_easy_setopt_func; if (initialized) return; initialized = 1; - libcurl = load_library(LIBCURL_FILE_NAME("libcurl")); + if (ssl_backend[0]) { + char dll_name[64 + 16]; + snprintf(dll_name, sizeof(dll_name) - 1, + LIBCURL_FILE_NAME("libcurl-%s"), ssl_backend); + libcurl = load_library(dll_name); + } + if (!libcurl) + libcurl = load_library(LIBCURL_FILE_NAME("libcurl")); if (!libcurl) die("failed to load library '%s'", LIBCURL_FILE_NAME("libcurl")); @@ -246,6 +255,9 @@ CURLcode curl_global_init(long flags) CURLsslset curl_global_sslset(curl_sslbackend id, const char *name, const curl_ssl_backend ***avail) { + if (name && strlen(name) < sizeof(ssl_backend)) + strlcpy(ssl_backend, name, sizeof(ssl_backend)); + lazy_load_curl(); return curl_global_sslset_func(id, name, avail); } From 1aa651e4042c53c1a9383424a8c06c1f58a0f71e Mon Sep 17 00:00:00 2001 From: Andrey Zabavnikov Date: Fri, 28 Oct 2022 17:12:06 +0300 Subject: [PATCH 167/297] status: fix for old-style submodules with commondir In f9b7573f6b00 (repository: free fields before overwriting them, 2017-09-05), Git was taught to release memory before overwriting it, but 357a03ebe9e0 (repository.c: move env-related setup code back to environment.c, 2018-03-03) changed the code so that it would not _always_ be overwritten. As a consequence, the `commondir` attribute would point to already-free()d memory. This seems not to cause problems in core Git, but there are add-on patches in Git for Windows where the `commondir` attribute is subsequently used and causing invalid memory accesses e.g. in setups containing old-style submodules (i.e. the ones with a `.git` directory within theirs worktrees) that have `commondir` configured. This fixes https://github.com/git-for-windows/git/pull/4083. Signed-off-by: Andrey Zabavnikov Signed-off-by: Johannes Schindelin --- repository.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/repository.c b/repository.c index 9825a3089932ed..4d7f6d5f25c50b 100644 --- a/repository.c +++ b/repository.c @@ -96,7 +96,7 @@ static void repo_set_commondir(struct repository *repo, { struct strbuf sb = STRBUF_INIT; - free(repo->commondir); + FREE_AND_NULL(repo->commondir); if (commondir) { repo->different_commondir = 1; From f5fc17d4319d912f396a1f6eb591acf634f9ee0d Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 27 Jan 2023 08:55:21 +0100 Subject: [PATCH 168/297] windows: skip linking `git-` for built-ins It is merely a historical wart that, say, `git-commit` exists in the `libexec/git-core/` directory, a tribute to the original idea to let Git be essentially a bunch of Unix shell scripts revolving around very few "plumbing" (AKA low-level) commands. Git has evolved a lot from there. These days, most of Git's functionality is contained within the `git` executable, in the form of "built-in" commands. To accommodate for scripts that use the "dashed" form of Git commands, even today, Git provides hard-links that make the `git` executable available as, say, `git-commit`, just in case that an old script has not been updated to invoke `git commit`. Those hard-links do not come cheap: they take about half a minute for every build of Git on Windows, they are mistaken for taking up huge amounts of space by some Windows Explorer versions that do not understand hard-links, and therefore many a "bug" report had to be addressed. The "dashed form" has been officially deprecated in Git version 1.5.4, which was released on February 2nd, 2008, i.e. a very long time ago. This deprecation was never finalized by skipping these hard-links, but we can start the process now, in Git for Windows. Signed-off-by: Johannes Schindelin --- config.mak.uname | 2 ++ 1 file changed, 2 insertions(+) diff --git a/config.mak.uname b/config.mak.uname index 85d63821ec95f6..4b41fae0a4db20 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -485,6 +485,7 @@ ifeq ($(uname_S),Windows) NO_POSIX_GOODIES = UnfortunatelyYes NATIVE_CRLF = YesPlease DEFAULT_HELP_FORMAT = html + SKIP_DASHED_BUILT_INS = YabbaDabbaDoo ifeq (/mingw64,$(subst 32,64,$(prefix))) # Move system config into top-level /etc/ ETC_GITCONFIG = ../etc/gitconfig @@ -676,6 +677,7 @@ ifeq ($(uname_S),MINGW) FSMONITOR_DAEMON_BACKEND = win32 FSMONITOR_OS_SETTINGS = win32 + SKIP_DASHED_BUILT_INS = YabbaDabbaDoo RUNTIME_PREFIX = YesPlease HAVE_WPGMPTR = YesWeDo NO_ST_BLOCKS_IN_STRUCT_STAT = YesPlease From d65654002a23dc225c33a125652e3899e0626756 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 21 Mar 2023 16:14:44 +0100 Subject: [PATCH 169/297] windows: fix Repository>Explore Working Copy Since Git v2.39.1, we are a bit more stringent in searching the PATH. In particular, we specifically require the `.exe` suffix. However, the `Repository>Explore Working Copy` command asks for `explorer.exe` to be found on the `PATH`, which _already_ has that suffix. Let's unstartle the PATH-finding logic about this scenario. This fixes https://github.com/git-for-windows/git/issues/4356 Signed-off-by: Johannes Schindelin --- git-gui/git-gui.sh | 3 +++ 1 file changed, 3 insertions(+) diff --git a/git-gui/git-gui.sh b/git-gui/git-gui.sh index 8fe7538e72084d..c299b38a64fb1e 100755 --- a/git-gui/git-gui.sh +++ b/git-gui/git-gui.sh @@ -101,6 +101,9 @@ proc _which {what args} { if {[is_Windows] && [lsearch -exact $args -script] >= 0} { set suffix {} + } elseif {[string match *$_search_exe $what]} { + # The search string already has the file extension + set suffix {} } else { set suffix $_search_exe } From 9780d3a7d7a7761ed054dd633af7d7d934351972 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Sun, 7 May 2023 22:43:37 +0200 Subject: [PATCH 170/297] mingw: do load libcurl dynamically by default This will help with Git for Windows' maintenance going forward: It allows Git for Windows to switch its primary libcurl to a variant without the OpenSSL backend, while still loading an alternate when setting `http.sslBackend = openssl`. This is necessary to avoid maintenance headaches with upgrading OpenSSL: its major version name is encoded in the shared library's file name and hence major version updates (temporarily) break libraries that are linked against the OpenSSL library. Signed-off-by: Johannes Schindelin --- config.mak.uname | 1 + 1 file changed, 1 insertion(+) diff --git a/config.mak.uname b/config.mak.uname index 2fa115cb940fed..84af922f944041 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -691,6 +691,7 @@ ifeq ($(uname_S),MINGW) HAVE_PLATFORM_PROCINFO = YesPlease CSPRNG_METHOD = rtlgenrandom BASIC_LDFLAGS += -municode -Wl,--tsaware + LAZYLOAD_LIBCURL = YesDoThatPlease COMPAT_CFLAGS += -DNOGDI -Icompat -Icompat/win32 COMPAT_CFLAGS += -DSTRIP_EXTENSION=\".exe\" COMPAT_OBJS += compat/mingw.o compat/winansi.o \ From 432588511fc0ad09cf359bfc486516a13a80932e Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 2 Nov 2022 16:23:58 +0100 Subject: [PATCH 171/297] Add a GitHub workflow to verify that Git/Scalar work in Nano Server In Git for Windows v2.39.0, we fixed a regression where `git.exe` would no longer work in Windows Nano Server (frequently used in Docker containers). This GitHub workflow can be used to verify manually that the Git/Scalar executables work in Nano Server. Signed-off-by: Johannes Schindelin --- .github/workflows/nano-server.yml | 76 +++++++++++++++++++++++++++++++ 1 file changed, 76 insertions(+) create mode 100644 .github/workflows/nano-server.yml diff --git a/.github/workflows/nano-server.yml b/.github/workflows/nano-server.yml new file mode 100644 index 00000000000000..3d943c23d2616d --- /dev/null +++ b/.github/workflows/nano-server.yml @@ -0,0 +1,76 @@ +name: Windows Nano Server tests + +on: + workflow_dispatch: + +env: + DEVELOPER: 1 + +jobs: + test-nano-server: + runs-on: windows-2022 + env: + WINDBG_DIR: "C:/Program Files (x86)/Windows Kits/10/Debuggers/x64" + IMAGE: mcr.microsoft.com/powershell:nanoserver-ltsc2022 + + steps: + - uses: actions/checkout@v4 + - uses: git-for-windows/setup-git-for-windows-sdk@v1 + - name: build Git + shell: bash + run: make -j15 + - name: pull nanoserver image + shell: bash + run: docker pull $IMAGE + - name: run nano-server test + shell: bash + run: | + docker run \ + --user "ContainerAdministrator" \ + -v "$WINDBG_DIR:C:/dbg" \ + -v "$(cygpath -aw /mingw64/bin):C:/mingw64-bin" \ + -v "$(cygpath -aw .):C:/test" \ + $IMAGE pwsh.exe -Command ' + # Extend the PATH to include the `.dll` files in /mingw64/bin/ + $env:PATH += ";C:\mingw64-bin" + + # For each executable to test pick some no-operation set of + # flags/subcommands or something that should quickly result in an + # error with known exit code that is not a negative 32-bit + # number, and set the expected return code appropriately. + # + # Only test executables that could be expected to run in a UI + # less environment. + # + # ( Executable path, arguments, expected return code ) + # also note space is required before close parenthesis (a + # powershell quirk when defining nested arrays like this) + + $executables_to_test = @( + ("C:\test\git.exe", "", 1 ), + ("C:\test\scalar.exe", "version", 0 ) + ) + + foreach ($executable in $executables_to_test) + { + Write-Output "Now testing $($executable[0])" + &$executable[0] $executable[1] + if ($LASTEXITCODE -ne $executable[2]) { + # if we failed, run the debugger to find out what function + # or DLL could not be found and then exit the script with + # failure The missing DLL or EXE will be referenced near + # the end of the output + + # Set a flag to have the debugger show loader stub + # diagnostics. This requires running as administrator, + # otherwise the flag will be ignored. + C:\dbg\gflags -i $executable[0] +SLS + + C:\dbg\cdb.exe -c "g" -c "q" $executable[0] $executable[1] + + exit 1 + } + } + + exit 0 + ' From 6f70ef4445ddfcbcad1d5420b493f9ee6f9c9b97 Mon Sep 17 00:00:00 2001 From: David Lomas Date: Fri, 28 Jul 2023 15:31:25 +0100 Subject: [PATCH 172/297] mingw: suggest `windows.appendAtomically` in more cases When running Git for Windows on a remote APFS filesystem, it would appear that the `mingw_open_append()`/`write()` combination would fail almost exactly like on some CIFS-mounted shares as had been reported in https://github.com/git-for-windows/git/issues/2753, albeit with a different `errno` value. Let's handle that `errno` value just the same, by suggesting to set `windows.appendAtomically=false`. Signed-off-by: David Lomas Signed-off-by: Johannes Schindelin --- compat/mingw.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index d9477f9e5a4e64..8e64881f60c2d6 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -718,7 +718,7 @@ ssize_t mingw_write(int fd, const void *buf, size_t len) { ssize_t result = write(fd, buf, len); - if (result < 0 && (errno == EINVAL || errno == ENOSPC) && buf) { + if (result < 0 && (errno == EINVAL || errno == EBADF || errno == ENOSPC) && buf) { int orig = errno; /* check if fd is a pipe */ @@ -744,7 +744,7 @@ ssize_t mingw_write(int fd, const void *buf, size_t len) } errno = orig; - } else if (orig == EINVAL) + } else if (orig == EINVAL || errno == EBADF) errno = EPIPE; else { DWORD buf_size; From f1898a0cf0ed6adcbc21da089e26e2959a766ba9 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 22 Nov 2023 22:57:38 +0100 Subject: [PATCH 173/297] win32: use native ANSI sequence processing, if possible Windows 10 version 1511 (also known as Anniversary Update), according to https://learn.microsoft.com/en-us/windows/console/console-virtual-terminal-sequences introduced native support for ANSI sequence processing. This allows using colors from the entire 24-bit color range. All we need to do is test whether the console's "virtual processing support" can be enabled. If it can, we do not even need to start the `console_thread` to handle ANSI sequences. Or, almost all we need to do: When `console_thread()` does its work, it uses the Unicode-aware `write_console()` function to write to the Win32 Console, which supports Git for Windows' implicit convention that all text that is written is encoded in UTF-8. The same is not necessarily true if native ANSI sequence processing is used, as the output is then subject to the current code page. Let's ensure that the code page is set to `CP_UTF8` as long as Git writes to it. Signed-off-by: Johannes Schindelin --- compat/winansi.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 46 insertions(+) diff --git a/compat/winansi.c b/compat/winansi.c index 575813bde8e806..95647a9520941e 100644 --- a/compat/winansi.c +++ b/compat/winansi.c @@ -591,6 +591,49 @@ static void detect_msys_tty(int fd) #endif +static HANDLE std_console_handle; +static DWORD std_console_mode = ENABLE_VIRTUAL_TERMINAL_PROCESSING; +static UINT std_console_code_page = CP_UTF8; + +static void reset_std_console(void) +{ + if (std_console_mode != ENABLE_VIRTUAL_TERMINAL_PROCESSING) + SetConsoleMode(std_console_handle, std_console_mode); + if (std_console_code_page != CP_UTF8) + SetConsoleOutputCP(std_console_code_page); +} + +static int enable_virtual_processing(void) +{ + std_console_handle = GetStdHandle(STD_OUTPUT_HANDLE); + if (std_console_handle == INVALID_HANDLE_VALUE || + !GetConsoleMode(std_console_handle, &std_console_mode)) { + std_console_handle = GetStdHandle(STD_ERROR_HANDLE); + if (std_console_handle == INVALID_HANDLE_VALUE || + !GetConsoleMode(std_console_handle, &std_console_mode)) + return 0; + } + + std_console_code_page = GetConsoleOutputCP(); + if (std_console_code_page != CP_UTF8) + SetConsoleOutputCP(CP_UTF8); + if (!std_console_code_page) + std_console_code_page = CP_UTF8; + + atexit(reset_std_console); + + if (std_console_mode & ENABLE_VIRTUAL_TERMINAL_PROCESSING) + return 1; + + if (!SetConsoleMode(std_console_handle, + std_console_mode | + ENABLE_PROCESSED_OUTPUT | + ENABLE_VIRTUAL_TERMINAL_PROCESSING)) + return 0; + + return 1; +} + /* * Wrapper for isatty(). Most calls in the main git code * call isatty(1 or 2) to see if the instance is interactive @@ -629,6 +672,9 @@ void winansi_init(void) return; } + if (enable_virtual_processing()) + return; + /* create a named pipe to communicate with the console thread */ if (swprintf(name, ARRAY_SIZE(name) - 1, L"\\\\.\\pipe\\winansi%lu", GetCurrentProcessId()) < 0) From 9b44ed4c7138b01e2588fa37b2d27a781ab1ce44 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Matthias=20A=C3=9Fhauer?= Date: Sat, 2 Dec 2023 12:10:00 +0100 Subject: [PATCH 174/297] git.rc: include winuser.h MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit winuser.h contains the definition of RT_MANIFEST that our LLVM based toolchain needs to understand that we want to embed compat/win32/git.manifest as an application manifest. It currently just embeds it as additional data that Windows doesn't understand. This also helps our GCC based toolchain understand that we only want one copy embedded. It currently embeds one working assembly manifest and one nearly identical, but useless copy as additional data. This also teaches our Visual Studio based buildsystems to pick up the manifest file from git.rc. This means we don't have to explicitly specify it in contrib/buildsystems/Generators/Vcxproj.pm anymore. Slightly counter-intuitively this also means we have to explicitly tell Cmake not to embed a default manifest. This fixes https://github.com/git-for-windows/git/issues/4707 Signed-off-by: Matthias Aßhauer Signed-off-by: Johannes Schindelin --- contrib/buildsystems/CMakeLists.txt | 1 + contrib/buildsystems/Generators/Vcxproj.pm | 1 - git.rc | 1 + 3 files changed, 2 insertions(+), 1 deletion(-) diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt index 832f46b316b764..26aac0702be3c9 100644 --- a/contrib/buildsystems/CMakeLists.txt +++ b/contrib/buildsystems/CMakeLists.txt @@ -219,6 +219,7 @@ if(CMAKE_C_COMPILER_ID STREQUAL "MSVC") set(CMAKE_RUNTIME_OUTPUT_DIRECTORY_DEBUG ${CMAKE_BINARY_DIR}) set(CMAKE_RUNTIME_OUTPUT_DIRECTORY_RELEASE ${CMAKE_BINARY_DIR}) add_compile_options(/MP /std:c11) + add_link_options(/MANIFEST:NO) endif() #default behaviour diff --git a/contrib/buildsystems/Generators/Vcxproj.pm b/contrib/buildsystems/Generators/Vcxproj.pm index a6d1c6b8d05682..7a62542eed152b 100644 --- a/contrib/buildsystems/Generators/Vcxproj.pm +++ b/contrib/buildsystems/Generators/Vcxproj.pm @@ -187,7 +187,6 @@ sub createProject { \$(VCPKGLibs);\$(AdditionalDependencies) invalidcontinue.obj %(AdditionalOptions) $entrypoint - $cdup\\compat\\win32\\git.manifest $subsystem EOM diff --git a/git.rc b/git.rc index cc3fdc6cc6cb83..a318a53c12af58 100644 --- a/git.rc +++ b/git.rc @@ -1,3 +1,4 @@ +#include 1 VERSIONINFO FILEVERSION MAJOR,MINOR,MICRO,PATCHLEVEL PRODUCTVERSION MAJOR,MINOR,MICRO,PATCHLEVEL From afecd4aa387400fb15e3c62512afd3b1398b16cc Mon Sep 17 00:00:00 2001 From: MinarKotonoha Date: Mon, 8 Apr 2024 16:41:10 +0800 Subject: [PATCH 175/297] common-main.c: fflush stdout buffer upon exit By default, the buffer type of Windows' `stdout` is unbuffered (_IONBF), and there is no need to manually fflush `stdout`. But some programs, such as the Windows Filtering Platform driver provided by the security software, may change the buffer type of `stdout` to full buffering. This nees `fflush(stdout)` to be called manually, otherwise there will be no output to `stdout`. Signed-off-by: MinarKotonoha Signed-off-by: Johannes Schindelin --- common-main.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/common-main.c b/common-main.c index 8e68ac9e42993d..75cf58abf645f0 100644 --- a/common-main.c +++ b/common-main.c @@ -77,6 +77,13 @@ static void check_bug_if_BUG(void) /* We wrap exit() to call common_exit() in git-compat-util.h */ int common_exit(const char *file, int line, int code) { + /* + * Windows Filtering Platform driver provided by the security software + * may change buffer type of stdout from _IONBF to _IOFBF. + * It will no output without fflush manually. + */ + fflush(stdout); + /* * For non-POSIX systems: Take the lowest 8 bits of the "code" * to e.g. turn -1 into 255. On a POSIX system this is From 377cb331f4dbf9e538020124dd86d339ea1045bc Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 9 Apr 2024 16:50:56 +0200 Subject: [PATCH 176/297] t5601/t7406(mingw): do run tests with symlink support A long time ago, we decided to run tests in Git for Windows' SDK with the default `winsymlinks` mode: copying instead of linking. This is still the default mode of MSYS2 to this day. However, this is not how most users run Git for Windows: As the majority of Git for Windows' users seem to be on Windows 10 and newer, likely having enabled Developer Mode (which allows creating symbolic links without administrator privileges), they will run with symlink support enabled. This is the reason why it is crucial to get the fixes for CVE-2024-? to the users, and also why it is crucial to ensure that the test suite exercises the related test cases. This commit ensures the latter. Signed-off-by: Johannes Schindelin --- t/t5601-clone.sh | 10 ++++++++++ t/t7406-submodule-update.sh | 9 +++++++++ 2 files changed, 19 insertions(+) diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh index 5d7ea147f1abc5..128c8f12468f17 100755 --- a/t/t5601-clone.sh +++ b/t/t5601-clone.sh @@ -7,6 +7,16 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME . ./test-lib.sh +# This test script contains test cases that need to create symbolic links. To +# make sure that these test cases are exercised in Git for Windows, where (for +# historical reasons) `ln -s` creates copies by default, let's specifically ask +# for `ln -s` to create symbolic links whenever possible. +if test_have_prereq MINGW +then + MSYS=${MSYS+$MSYS }winsymlinks:nativestrict + export MSYS +fi + X= test_have_prereq !MINGW || X=.exe diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh index 297c6c3b5cc4b8..80647e440f7093 100755 --- a/t/t7406-submodule-update.sh +++ b/t/t7406-submodule-update.sh @@ -14,6 +14,15 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME . ./test-lib.sh +# This test script contains test cases that need to create symbolic links. To +# make sure that these test cases are exercised in Git for Windows, where (for +# historical reasons) `ln -s` creates copies by default, let's specifically ask +# for `ln -s` to create symbolic links whenever possible. +if test_have_prereq MINGW +then + MSYS=${MSYS+$MSYS }winsymlinks:nativestrict + export MSYS +fi compare_head() { From 5042e2d32adf3c7db177432c555b326a0873e9e6 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 21 May 2024 13:55:26 +0200 Subject: [PATCH 177/297] win32: ensure that `localtime_r()` is declared even in i686 builds The `__MINGW64__` constant is defined, surprise, surprise, only when building for a 64-bit CPU architecture. Therefore using it as a guard to define `_POSIX_C_SOURCE` (so that `localtime_r()` is declared, among other functions) is not enough, we also need to check `__MINGW32__`. Technically, the latter constant is defined even for 64-bit builds. But let's make things a bit easier to understand by testing for both constants. Making it so fixes this compile warning (turned error in GCC v14.1): archive-zip.c: In function 'dos_time': archive-zip.c:612:9: error: implicit declaration of function 'localtime_r'; did you mean 'localtime_s'? [-Wimplicit-function-declaration] 612 | localtime_r(&time, &tm); | ^~~~~~~~~~~ | localtime_s Signed-off-by: Johannes Schindelin --- git-compat-util.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/git-compat-util.h b/git-compat-util.h index 71b4d23f0386b2..96dcca54416b75 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -166,7 +166,7 @@ struct strbuf; /* Approximation of the length of the decimal representation of this type. */ #define decimal_length(x) ((int)(sizeof(x) * 2.56 + 0.5) + 1) -#ifdef __MINGW64__ +#if defined(__MINGW32__) || defined(__MINGW64__) #define _POSIX_C_SOURCE 1 #elif defined(__sun__) /* From a56147b96edcda818b5d57fe4d0864e5e2609f61 Mon Sep 17 00:00:00 2001 From: Ariel Lourenco Date: Tue, 2 Jul 2024 18:09:43 -0300 Subject: [PATCH 178/297] Fallback to AppData if XDG_CONFIG_HOME is unset In order to be a better Windows citizenship, Git should save its configuration files on AppData folder. This can enables git configuration files be replicated between machines using the same Microsoft account logon which would reduce the friction of setting up Git on new systems. Therefore, if %APPDATA%\Git\config exists, we use it; otherwise $HOME/.config/git/config is used. Signed-off-by: Ariel Lourenco --- path.c | 23 ++++++++++++++++++++--- 1 file changed, 20 insertions(+), 3 deletions(-) diff --git a/path.c b/path.c index 19f7684f3876bd..0e307b0a9b734b 100644 --- a/path.c +++ b/path.c @@ -1484,6 +1484,7 @@ int looks_like_command_line_option(const char *str) char *xdg_config_home_for(const char *subdir, const char *filename) { const char *home, *config_home; + char *home_config = NULL; assert(subdir); assert(filename); @@ -1492,10 +1493,26 @@ char *xdg_config_home_for(const char *subdir, const char *filename) return mkpathdup("%s/%s/%s", config_home, subdir, filename); home = getenv("HOME"); - if (home) - return mkpathdup("%s/.config/%s/%s", home, subdir, filename); + if (home && *home) + home_config = mkpathdup("%s/.config/%s/%s", home, subdir, filename); + + #ifdef WIN32 + { + const char *appdata = getenv("APPDATA"); + if (appdata && *appdata) { + char *appdata_config = mkpathdup("%s/Git/%s", appdata, filename); + if (file_exists(appdata_config)) { + if (home_config && file_exists(home_config)) + warning("'%s' was ignored because '%s' exists.", home_config, appdata_config); + free(home_config); + return appdata_config; + } + free(appdata_config); + } + } + #endif - return NULL; + return home_config; } char *xdg_config_home(const char *filename) From 112efbabc60fd01b262318c965e0f12dcfabb7c8 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Thu, 4 Jul 2024 22:41:56 +0200 Subject: [PATCH 179/297] run-command: be helpful with Git LFS fails on Windows 7 Git LFS is now built with Go 1.21 which no longer supports Windows 7. However, Git for Windows still wants to support Windows 7. Ideally, Git LFS would re-introduce Windows 7 support until Git for Windows drops support for Windows 7, but that's not going to happen: https://github.com/git-for-windows/git/issues/4996#issuecomment-2176152565 The next best thing we can do is to let the users know what is happening, and how to get out of their fix, at least. This is not quite as easy as it would first seem because programs compiled with Go 1.21 or newer will simply throw an exception and fail with an Access Violation on Windows 7. The only way I found to address this is to replicate the logic from Go's very own `version` command (which can determine the Go version with which a given executable was built) to detect the situation, and in that case offer a helpful error message. This addresses https://github.com/git-for-windows/git/issues/4996. Signed-off-by: Johannes Schindelin --- compat/win32/path-utils.c | 199 ++++++++++++++++++++++++++++++++++++++ compat/win32/path-utils.h | 3 + git-compat-util.h | 6 ++ run-command.c | 1 + 4 files changed, 209 insertions(+) diff --git a/compat/win32/path-utils.c b/compat/win32/path-utils.c index b658ca3f811717..be43d328db7b3a 100644 --- a/compat/win32/path-utils.c +++ b/compat/win32/path-utils.c @@ -1,5 +1,8 @@ #include "../../git-compat-util.h" #include "../../environment.h" +#include "../../wrapper.h" +#include "../../strbuf.h" +#include "../../versioncmp.h" int win32_has_dos_drive_prefix(const char *path) { @@ -87,3 +90,199 @@ int win32_fspathcmp(const char *a, const char *b) { return win32_fspathncmp(a, b, (size_t)-1); } + +static int read_at(int fd, char *buffer, size_t offset, size_t size) +{ + if (lseek(fd, offset, SEEK_SET) < 0) { + fprintf(stderr, "could not seek to 0x%x\n", (unsigned int)offset); + return -1; + } + + return read_in_full(fd, buffer, size); +} + +static size_t le16(const char *buffer) +{ + unsigned char *u = (unsigned char *)buffer; + return u[0] | (u[1] << 8); +} + +static size_t le32(const char *buffer) +{ + return le16(buffer) | (le16(buffer + 2) << 16); +} + +/* + * Determine the Go version of a given executable, if it was built with Go. + * + * This recapitulates the logic from + * https://github.com/golang/go/blob/master/src/cmd/go/internal/version/version.go + * (without requiring the user to install `go.exe` to find out). + */ +static ssize_t get_go_version(const char *path, char *go_version, size_t go_version_size) +{ + int fd = open(path, O_RDONLY); + char buffer[1024]; + off_t offset; + size_t num_sections, opt_header_size, i; + char *p = NULL, *q; + ssize_t res = -1; + + if (fd < 0) + return -1; + + if (read_in_full(fd, buffer, 2) < 0) + goto fail; + + /* + * Parse the PE file format, for more details, see + * https://en.wikipedia.org/wiki/Portable_Executable#Layout and + * https://learn.microsoft.com/en-us/windows/win32/debug/pe-format + */ + if (buffer[0] != 'M' || buffer[1] != 'Z') + goto fail; + + if (read_at(fd, buffer, 0x3c, 4) < 0) + goto fail; + + /* Read the `PE\0\0` signature and the COFF file header */ + offset = le32(buffer); + if (read_at(fd, buffer, offset, 24) < 0) + goto fail; + + if (buffer[0] != 'P' || buffer[1] != 'E' || buffer[2] != '\0' || buffer[3] != '\0') + goto fail; + + num_sections = le16(buffer + 6); + opt_header_size = le16(buffer + 20); + offset += 24; /* skip file header */ + + /* + * Validate magic number 0x10b or 0x20b, for full details see + * https://learn.microsoft.com/en-us/windows/win32/debug/pe-format#optional-header-standard-fields-image-only + */ + if (read_at(fd, buffer, offset, 2) < 0 || + ((i = le16(buffer)) != 0x10b && i != 0x20b)) + goto fail; + + offset += opt_header_size; + + for (i = 0; i < num_sections; i++) { + if (read_at(fd, buffer, offset + i * 40, 40) < 0) + goto fail; + + /* + * For full details about the section headers, see + * https://learn.microsoft.com/en-us/windows/win32/debug/pe-format#section-table-section-headers + */ + if ((le32(buffer + 36) /* characteristics */ & ~0x600000) /* IMAGE_SCN_ALIGN_32BYTES */ == + (/* IMAGE_SCN_CNT_INITIALIZED_DATA */ 0x00000040 | + /* IMAGE_SCN_MEM_READ */ 0x40000000 | + /* IMAGE_SCN_MEM_WRITE */ 0x80000000)) { + size_t size = le32(buffer + 16); /* "SizeOfRawData " */ + size_t pointer = le32(buffer + 20); /* "PointerToRawData " */ + + /* + * Skip the section if either size or pointer is 0, see + * https://github.com/golang/go/blob/go1.21.0/src/debug/buildinfo/buildinfo.go#L333 + * for full details. + * + * Merely seeing a non-zero size will not actually do, + * though: he size must be at least `buildInfoSize`, + * i.e. 32, and we expect a UVarint (at least another + * byte) _and_ the bytes representing the string, + * which we expect to start with the letters "go" and + * continue with the Go version number. + */ + if (size < 32 + 1 + 2 + 1 || !pointer) + continue; + + p = malloc(size); + + if (!p || read_at(fd, p, pointer, size) < 0) + goto fail; + + /* + * Look for the build information embedded by Go, see + * https://github.com/golang/go/blob/go1.21.0/src/debug/buildinfo/buildinfo.go#L165-L175 + * for full details. + * + * Note: Go contains code to enforce alignment along a + * 16-byte boundary. In practice, no `.exe` has been + * observed that required any adjustment, therefore + * this here code skips that logic for simplicity. + */ + q = memmem(p, size - 18, "\xff Go buildinf:", 14); + if (!q) + goto fail; + /* + * Decode the build blob. For full details, see + * https://github.com/golang/go/blob/go1.21.0/src/debug/buildinfo/buildinfo.go#L177-L191 + * + * Note: The `endianness` values observed in practice + * were always 2, therefore the complex logic to handle + * any other value is skipped for simplicty. + */ + if ((q[14] == 8 || q[14] == 4) && q[15] == 2) { + /* + * Only handle a Go version string with fewer + * than 128 characters, so the Go UVarint at + * q[32] that indicates the string's length must + * be only one byte (without the high bit set). + */ + if ((q[32] & 0x80) || + !q[32] || + (q + 33 + q[32] - p) > size || + q[32] + 1 > go_version_size) + goto fail; + res = q[32]; + memcpy(go_version, q + 33, res); + go_version[res] = '\0'; + break; + } + } + } + +fail: + free(p); + close(fd); + return res; +} + +void win32_warn_about_git_lfs_on_windows7(int exit_code, const char *argv0) +{ + char buffer[128], *git_lfs = NULL; + const char *p; + + /* + * Git LFS v3.5.1 fails with an Access Violation on Windows 7; That + * would usually show up as an exit code 0xc0000005. For some reason + * (probably because at this point, we no longer have the _original_ + * HANDLE that was returned by `CreateProcess()`) we observe other + * values like 0xb00 and 0x2 instead. Since the exact exit code + * seems to be inconsistent, we check for a non-zero exit status. + */ + if (exit_code == 0) + return; + if (GetVersion() >> 16 > 7601) + return; /* Warn only on Windows 7 or older */ + if (!istarts_with(argv0, "git-lfs ") && + strcasecmp(argv0, "git-lfs")) + return; + if (!(git_lfs = locate_in_PATH("git-lfs"))) + return; + if (get_go_version(git_lfs, buffer, sizeof(buffer)) > 0 && + skip_prefix(buffer, "go", &p) && + versioncmp("1.21.0", p) <= 0) + warning("This program was built with Go v%s\n" + "i.e. without support for this Windows version:\n" + "\n\t%s\n" + "\n" + "To work around this, you can download and install a " + "working version from\n" + "\n" + "\thttps://github.com/git-lfs/git-lfs/releases/tag/" + "v3.4.1\n", + p, git_lfs); + free(git_lfs); +} diff --git a/compat/win32/path-utils.h b/compat/win32/path-utils.h index a561c700e75713..a69483c332c1a7 100644 --- a/compat/win32/path-utils.h +++ b/compat/win32/path-utils.h @@ -34,4 +34,7 @@ int win32_fspathcmp(const char *a, const char *b); int win32_fspathncmp(const char *a, const char *b, size_t count); #define fspathncmp win32_fspathncmp +void win32_warn_about_git_lfs_on_windows7(int exit_code, const char *argv0); +#define warn_about_git_lfs_on_windows7 win32_warn_about_git_lfs_on_windows7 + #endif diff --git a/git-compat-util.h b/git-compat-util.h index 71b4d23f0386b2..50982de087fb89 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -514,6 +514,12 @@ static inline int git_offset_1st_component(const char *path) #define fspathncmp git_fspathncmp #endif +#ifndef warn_about_git_lfs_on_windows7 +static inline void warn_about_git_lfs_on_windows7(int exit_code, const char *argv0) +{ +} +#endif + #ifndef is_valid_path #define is_valid_path(path) 1 #endif diff --git a/run-command.c b/run-command.c index 45ba5449323064..de6089d7332880 100644 --- a/run-command.c +++ b/run-command.c @@ -575,6 +575,7 @@ static int wait_or_whine(pid_t pid, const char *argv0, int in_signal) */ code += 128; } else if (WIFEXITED(status)) { + warn_about_git_lfs_on_windows7(status, argv0); code = WEXITSTATUS(status); } else { if (!in_signal) From a7dcca5bd9ca9ea7f92940514be3608a2c3f6705 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Mon, 9 Sep 2024 16:00:20 -0700 Subject: [PATCH 180/297] ci: remove 'Upload failed tests' directories' step from linux32 jobs Linux32 jobs seem to be getting: Error: This request has been automatically failed because it uses a deprecated version of `actions/upload-artifact: v1`. Learn more: https://github.blog/changelog/2024-02-13-deprecation-notice-v1-and-v2-of-the-artifact-actions/ before doing anything useful. For now, disable the step. Ever since actions/upload-artifact@v1 got disabled, mentioning the offending version of it seems to stop anything from happening. At least this should run the same build and test. See https://github.com/git/git/actions/runs/10780030750/job/29894867249 for example. Cherry-picked-from: 90f2c7240ccc (ci: remove 'Upload failed tests' directories' step from linux32 jobs, 2024-09-09) Signed-off-by: Junio C Hamano Signed-off-by: Johannes Schindelin --- .github/workflows/main.yml | 6 ------ 1 file changed, 6 deletions(-) diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index 1ee0433acc14a2..97f9b063109411 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -365,12 +365,6 @@ jobs: with: name: failed-tests-${{matrix.vector.jobname}} path: ${{env.FAILED_TEST_ARTIFACTS}} - - name: Upload failed tests' directories - if: failure() && env.FAILED_TEST_ARTIFACTS != '' && matrix.vector.jobname == 'linux32' - uses: actions/upload-artifact@v1 # cannot be upgraded because Node.js Actions aren't supported in this container - with: - name: failed-tests-${{matrix.vector.jobname}} - path: ${{env.FAILED_TEST_ARTIFACTS}} static-analysis: needs: ci-config if: needs.ci-config.outputs.enabled == 'yes' From cec19bb7b51b36797faed068b10ba60e500d5e28 Mon Sep 17 00:00:00 2001 From: Karsten Blees Date: Sat, 6 Jul 2013 02:09:35 +0200 Subject: [PATCH 181/297] Win32: make FILETIME conversion functions public We will use them in the upcoming "FSCache" patches (to accelerate sequential lstat() calls). Signed-off-by: Karsten Blees Signed-off-by: Johannes Schindelin --- compat/mingw.c | 18 ------------------ compat/mingw.h | 18 ++++++++++++++++++ 2 files changed, 18 insertions(+), 18 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index c63f6b3a9bd727..1345e3ea2c5ff2 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -788,24 +788,6 @@ int mingw_chmod(const char *filename, int mode) return _wchmod(wfilename, mode); } -/* - * The unit of FILETIME is 100-nanoseconds since January 1, 1601, UTC. - * Returns the 100-nanoseconds ("hekto nanoseconds") since the epoch. - */ -static inline long long filetime_to_hnsec(const FILETIME *ft) -{ - long long winTime = ((long long)ft->dwHighDateTime << 32) + ft->dwLowDateTime; - /* Windows to Unix Epoch conversion */ - return winTime - 116444736000000000LL; -} - -static inline void filetime_to_timespec(const FILETIME *ft, struct timespec *ts) -{ - long long hnsec = filetime_to_hnsec(ft); - ts->tv_sec = (time_t)(hnsec / 10000000); - ts->tv_nsec = (hnsec % 10000000) * 100; -} - /** * Verifies that safe_create_leading_directories() would succeed. */ diff --git a/compat/mingw.h b/compat/mingw.h index 61f98ec39cca2b..f59f7524690cf0 100644 --- a/compat/mingw.h +++ b/compat/mingw.h @@ -356,6 +356,17 @@ static inline int getrlimit(int resource, struct rlimit *rlp) return 0; } +/* + * The unit of FILETIME is 100-nanoseconds since January 1, 1601, UTC. + * Returns the 100-nanoseconds ("hekto nanoseconds") since the epoch. + */ +static inline long long filetime_to_hnsec(const FILETIME *ft) +{ + long long winTime = ((long long)ft->dwHighDateTime << 32) + ft->dwLowDateTime; + /* Windows to Unix Epoch conversion */ + return winTime - 116444736000000000LL; +} + /* * Use mingw specific stat()/lstat()/fstat() implementations on Windows, * including our own struct stat with 64 bit st_size and nanosecond-precision @@ -372,6 +383,13 @@ struct timespec { #endif #endif +static inline void filetime_to_timespec(const FILETIME *ft, struct timespec *ts) +{ + long long hnsec = filetime_to_hnsec(ft); + ts->tv_sec = (time_t)(hnsec / 10000000); + ts->tv_nsec = (hnsec % 10000000) * 100; +} + struct mingw_stat { _dev_t st_dev; _ino_t st_ino; From 237f9dd7d9498d2df402e579e0d45d81e27178f5 Mon Sep 17 00:00:00 2001 From: Karsten Blees Date: Sun, 8 Sep 2013 14:17:31 +0200 Subject: [PATCH 182/297] Win32: dirent.c: Move opendir down Move opendir down in preparation for the next patch. Signed-off-by: Karsten Blees --- compat/win32/dirent.c | 68 +++++++++++++++++++++---------------------- 1 file changed, 34 insertions(+), 34 deletions(-) diff --git a/compat/win32/dirent.c b/compat/win32/dirent.c index 52420ec7d4dad7..2603a0fa39f45a 100644 --- a/compat/win32/dirent.c +++ b/compat/win32/dirent.c @@ -18,40 +18,6 @@ static inline void finddata2dirent(struct dirent *ent, WIN32_FIND_DATAW *fdata) ent->d_type = DT_REG; } -DIR *opendir(const char *name) -{ - wchar_t pattern[MAX_PATH + 2]; /* + 2 for '/' '*' */ - WIN32_FIND_DATAW fdata; - HANDLE h; - int len; - DIR *dir; - - /* convert name to UTF-16 and check length < MAX_PATH */ - if ((len = xutftowcs_path(pattern, name)) < 0) - return NULL; - - /* append optional '/' and wildcard '*' */ - if (len && !is_dir_sep(pattern[len - 1])) - pattern[len++] = '/'; - pattern[len++] = '*'; - pattern[len] = 0; - - /* open find handle */ - h = FindFirstFileW(pattern, &fdata); - if (h == INVALID_HANDLE_VALUE) { - DWORD err = GetLastError(); - errno = (err == ERROR_DIRECTORY) ? ENOTDIR : err_win_to_posix(err); - return NULL; - } - - /* initialize DIR structure and copy first dir entry */ - dir = xmalloc(sizeof(DIR)); - dir->dd_handle = h; - dir->dd_stat = 0; - finddata2dirent(&dir->dd_dir, &fdata); - return dir; -} - struct dirent *readdir(DIR *dir) { if (!dir) { @@ -90,3 +56,37 @@ int closedir(DIR *dir) free(dir); return 0; } + +DIR *opendir(const char *name) +{ + wchar_t pattern[MAX_PATH + 2]; /* + 2 for '/' '*' */ + WIN32_FIND_DATAW fdata; + HANDLE h; + int len; + DIR *dir; + + /* convert name to UTF-16 and check length < MAX_PATH */ + if ((len = xutftowcs_path(pattern, name)) < 0) + return NULL; + + /* append optional '/' and wildcard '*' */ + if (len && !is_dir_sep(pattern[len - 1])) + pattern[len++] = '/'; + pattern[len++] = '*'; + pattern[len] = 0; + + /* open find handle */ + h = FindFirstFileW(pattern, &fdata); + if (h == INVALID_HANDLE_VALUE) { + DWORD err = GetLastError(); + errno = (err == ERROR_DIRECTORY) ? ENOTDIR : err_win_to_posix(err); + return NULL; + } + + /* initialize DIR structure and copy first dir entry */ + dir = xmalloc(sizeof(DIR)); + dir->dd_handle = h; + dir->dd_stat = 0; + finddata2dirent(&dir->dd_dir, &fdata); + return dir; +} From dc626be2c6bd162b70ff42e4aaae71082c82efa4 Mon Sep 17 00:00:00 2001 From: Karsten Blees Date: Sun, 8 Sep 2013 14:18:40 +0200 Subject: [PATCH 183/297] mingw: make the dirent implementation pluggable Emulating the POSIX `dirent` API on Windows via `FindFirstFile()`/`FindNextFile()` is pretty staightforward, however, most of the information provided in the `WIN32_FIND_DATA` structure is thrown away in the process. A more sophisticated implementation may cache this data, e.g. for later reuse in calls to `lstat()`. Make the `dirent` implementation pluggable so that it can be switched at runtime, e.g. based on a config option. Define a base DIR structure with pointers to `readdir()`/`closedir()` that match the `opendir()` implementation (similar to vtable pointers in Object-Oriented Programming). Define `readdir()`/`closedir()` so that they call the function pointers in the `DIR` structure. This allows to choose the `opendir()` implementation on a call-by-call basis. Make the fixed-size `dirent.d_name` buffer a flex array, as `d_name` may be implementation specific (e.g. a caching implementation may allocate a `struct dirent` with _just_ the size needed to hold the `d_name` in question). Signed-off-by: Karsten Blees Signed-off-by: Johannes Schindelin --- compat/win32/dirent.c | 30 +++++++++++++++++++----------- compat/win32/dirent.h | 26 +++++++++++++++++++------- 2 files changed, 38 insertions(+), 18 deletions(-) diff --git a/compat/win32/dirent.c b/compat/win32/dirent.c index 2603a0fa39f45a..139d2ba3c4da34 100644 --- a/compat/win32/dirent.c +++ b/compat/win32/dirent.c @@ -1,15 +1,21 @@ #include "../../git-compat-util.h" -struct DIR { - struct dirent dd_dir; /* includes d_type */ +#pragma GCC diagnostic push +#pragma GCC diagnostic ignored "-Wpedantic" +typedef struct dirent_DIR { + struct DIR base_dir; /* extend base struct DIR */ HANDLE dd_handle; /* FindFirstFile handle */ int dd_stat; /* 0-based index */ -}; + struct dirent dd_dir; /* includes d_type */ +} dirent_DIR; +#pragma GCC diagnostic pop + +DIR *(*opendir)(const char *dirname) = dirent_opendir; static inline void finddata2dirent(struct dirent *ent, WIN32_FIND_DATAW *fdata) { - /* convert UTF-16 name to UTF-8 */ - xwcstoutf(ent->d_name, fdata->cFileName, sizeof(ent->d_name)); + /* convert UTF-16 name to UTF-8 (d_name points to dirent_DIR.dd_name) */ + xwcstoutf(ent->d_name, fdata->cFileName, MAX_PATH * 3); /* Set file type, based on WIN32_FIND_DATA */ if (fdata->dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) @@ -18,7 +24,7 @@ static inline void finddata2dirent(struct dirent *ent, WIN32_FIND_DATAW *fdata) ent->d_type = DT_REG; } -struct dirent *readdir(DIR *dir) +static struct dirent *dirent_readdir(dirent_DIR *dir) { if (!dir) { errno = EBADF; /* No set_errno for mingw */ @@ -45,7 +51,7 @@ struct dirent *readdir(DIR *dir) return &dir->dd_dir; } -int closedir(DIR *dir) +static int dirent_closedir(dirent_DIR *dir) { if (!dir) { errno = EBADF; @@ -57,13 +63,13 @@ int closedir(DIR *dir) return 0; } -DIR *opendir(const char *name) +DIR *dirent_opendir(const char *name) { wchar_t pattern[MAX_PATH + 2]; /* + 2 for '/' '*' */ WIN32_FIND_DATAW fdata; HANDLE h; int len; - DIR *dir; + dirent_DIR *dir; /* convert name to UTF-16 and check length < MAX_PATH */ if ((len = xutftowcs_path(pattern, name)) < 0) @@ -84,9 +90,11 @@ DIR *opendir(const char *name) } /* initialize DIR structure and copy first dir entry */ - dir = xmalloc(sizeof(DIR)); + dir = xmalloc(sizeof(dirent_DIR) + MAX_PATH); + dir->base_dir.preaddir = (struct dirent *(*)(DIR *dir)) dirent_readdir; + dir->base_dir.pclosedir = (int (*)(DIR *dir)) dirent_closedir; dir->dd_handle = h; dir->dd_stat = 0; finddata2dirent(&dir->dd_dir, &fdata); - return dir; + return (DIR*) dir; } diff --git a/compat/win32/dirent.h b/compat/win32/dirent.h index 058207e4bfed62..e0e0e1700f64d1 100644 --- a/compat/win32/dirent.h +++ b/compat/win32/dirent.h @@ -1,20 +1,32 @@ #ifndef DIRENT_H #define DIRENT_H -typedef struct DIR DIR; - #define DT_UNKNOWN 0 #define DT_DIR 1 #define DT_REG 2 #define DT_LNK 3 struct dirent { - unsigned char d_type; /* file type to prevent lstat after readdir */ - char d_name[MAX_PATH * 3]; /* file name (* 3 for UTF-8 conversion) */ + unsigned char d_type; /* file type to prevent lstat after readdir */ + char d_name[FLEX_ARRAY]; /* file name */ }; -DIR *opendir(const char *dirname); -struct dirent *readdir(DIR *dir); -int closedir(DIR *dir); +/* + * Base DIR structure, contains pointers to readdir/closedir implementations so + * that opendir may choose a concrete implementation on a call-by-call basis. + */ +typedef struct DIR { + struct dirent *(*preaddir)(struct DIR *dir); + int (*pclosedir)(struct DIR *dir); +} DIR; + +/* default dirent implementation */ +extern DIR *dirent_opendir(const char *dirname); + +/* current dirent implementation */ +extern DIR *(*opendir)(const char *dirname); + +#define readdir(dir) (dir->preaddir(dir)) +#define closedir(dir) (dir->pclosedir(dir)) #endif /* DIRENT_H */ From cc740004641ec893e5e8ee3f59d2941adeef0c3f Mon Sep 17 00:00:00 2001 From: Karsten Blees Date: Sun, 8 Sep 2013 14:21:30 +0200 Subject: [PATCH 184/297] Win32: make the lstat implementation pluggable Emulating the POSIX lstat API on Windows via GetFileAttributes[Ex] is quite slow. Windows operating system APIs seem to be much better at scanning the status of entire directories than checking single files. A caching implementation may improve performance by bulk-reading entire directories or reusing data obtained via opendir / readdir. Make the lstat implementation pluggable so that it can be switched at runtime, e.g. based on a config option. Signed-off-by: Karsten Blees Signed-off-by: Johannes Schindelin --- compat/mingw.c | 2 ++ compat/mingw.h | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/compat/mingw.c b/compat/mingw.c index 1345e3ea2c5ff2..a68141b17cf558 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -927,6 +927,8 @@ static int do_stat_internal(int follow, const char *file_name, struct stat *buf) return do_lstat(follow, alt_name, buf); } +int (*lstat)(const char *file_name, struct stat *buf) = mingw_lstat; + static int get_file_info_by_handle(HANDLE hnd, struct stat *buf) { BY_HANDLE_FILE_INFORMATION fdata; diff --git a/compat/mingw.h b/compat/mingw.h index f59f7524690cf0..f982cb98bf9385 100644 --- a/compat/mingw.h +++ b/compat/mingw.h @@ -422,7 +422,7 @@ int mingw_fstat(int fd, struct stat *buf); #ifdef lstat #undef lstat #endif -#define lstat mingw_lstat +extern int (*lstat)(const char *file_name, struct stat *buf); int mingw_utime(const char *file_name, const struct utimbuf *times); From 711da7fa93f6ccf437e40049cc1d21ce1775fc9e Mon Sep 17 00:00:00 2001 From: Karsten Blees Date: Sun, 8 Sep 2013 14:23:27 +0200 Subject: [PATCH 185/297] mingw: add infrastructure for read-only file system level caches Add a macro to mark code sections that only read from the file system, along with a config option and documentation. This facilitates implementation of relatively simple file system level caches without the need to synchronize with the file system. Enable read-only sections for 'git status' and preload_index. Signed-off-by: Karsten Blees --- Documentation/config/core.txt | 6 ++++++ builtin/commit.c | 1 + compat/mingw.c | 6 ++++++ compat/mingw.h | 2 ++ git-compat-util.h | 15 +++++++++++++++ preload-index.c | 3 +++ 6 files changed, 33 insertions(+) diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt index 60ca9f2b686106..b894c7e9cc8a57 100644 --- a/Documentation/config/core.txt +++ b/Documentation/config/core.txt @@ -685,6 +685,12 @@ relatively high IO latencies. When enabled, Git will do the index comparison to the filesystem data in parallel, allowing overlapping IO's. Defaults to true. +core.fscache:: + Enable additional caching of file system data for some operations. ++ +Git for Windows uses this to bulk-read and cache lstat data of entire +directories (instead of doing lstat file by file). + core.unsetenvvars:: Windows-only: comma-separated list of environment variables' names that need to be unset before spawning any other process. diff --git a/builtin/commit.c b/builtin/commit.c index 66427ba82d5b9b..794ea24d9ab9a0 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -1568,6 +1568,7 @@ int cmd_status(int argc, const char **argv, const char *prefix) PATHSPEC_PREFER_FULL, prefix, argv); + enable_fscache(1); if (status_format != STATUS_FORMAT_PORCELAIN && status_format != STATUS_FORMAT_PORCELAIN_V2) progress_flag = REFRESH_PROGRESS; diff --git a/compat/mingw.c b/compat/mingw.c index a68141b17cf558..ca8b116190b3af 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -244,6 +244,7 @@ enum hide_dotfiles_type { static int core_restrict_inherited_handles = -1; static enum hide_dotfiles_type hide_dotfiles = HIDE_DOTFILES_DOTGITONLY; static char *unset_environment_variables; +int core_fscache; int mingw_core_config(const char *var, const char *value, const struct config_context *ctx, void *cb) @@ -256,6 +257,11 @@ int mingw_core_config(const char *var, const char *value, return 0; } + if (!strcmp(var, "core.fscache")) { + core_fscache = git_config_bool(var, value); + return 0; + } + if (!strcmp(var, "core.unsetenvvars")) { if (!value) return config_error_nonbool(var); diff --git a/compat/mingw.h b/compat/mingw.h index f982cb98bf9385..946ff707327b4e 100644 --- a/compat/mingw.h +++ b/compat/mingw.h @@ -11,6 +11,8 @@ typedef _sigset_t sigset_t; #undef _POSIX_THREAD_SAFE_FUNCTIONS #endif +extern int core_fscache; + struct config_context; int mingw_core_config(const char *var, const char *value, const struct config_context *ctx, void *cb); diff --git a/git-compat-util.h b/git-compat-util.h index 710230d7321395..c8429efc22c7f1 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -1520,6 +1520,21 @@ static inline int is_missing_file_error(int errno_) return (errno_ == ENOENT || errno_ == ENOTDIR); } +/* + * Enable/disable a read-only cache for file system data on platforms that + * support it. + * + * Implementing a live-cache is complicated and requires special platform + * support (inotify, ReadDirectoryChangesW...). enable_fscache shall be used + * to mark sections of git code that extensively read from the file system + * without modifying anything. Implementations can use this to cache e.g. stat + * data or even file content without the need to synchronize with the file + * system. + */ +#ifndef enable_fscache +#define enable_fscache(x) /* noop */ +#endif + int cmd_main(int, const char **); /* diff --git a/preload-index.c b/preload-index.c index 63fd35d64b141f..ec1d7b46d55fc0 100644 --- a/preload-index.c +++ b/preload-index.c @@ -132,6 +132,7 @@ void preload_index(struct index_state *index, pthread_mutex_init(&pd.mutex, NULL); } + enable_fscache(1); for (i = 0; i < threads; i++) { struct thread_data *p = data+i; int err; @@ -167,6 +168,8 @@ void preload_index(struct index_state *index, trace2_data_intmax("index", NULL, "preload/sum_lstat", t2_sum_lstat); trace2_region_leave("index", "preload", NULL); + + enable_fscache(0); } int repo_read_index_preload(struct repository *repo, From 3fb0e0b94ee861ce7c9a9f0e1f3b483c5230e7a4 Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Tue, 24 Jan 2017 15:12:13 -0500 Subject: [PATCH 186/297] fscache: add key for GIT_TRACE_FSCACHE Signed-off-by: Jeff Hostetler Signed-off-by: Johannes Schindelin --- compat/win32/fscache.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index 8c7a37c6c0cfce..9f495ac384c925 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -4,11 +4,13 @@ #include "fscache.h" #include "../../dir.h" #include "../../abspath.h" +#include "../../trace.h" static int initialized; static volatile long enabled; static struct hashmap map; static CRITICAL_SECTION mutex; +static struct trace_key trace_fscache = TRACE_KEY_INIT(FSCACHE); /* * An entry in the file system cache. Used for both entire directory listings @@ -207,6 +209,8 @@ static struct fsentry *fsentry_create_list(const struct fsentry *dir) if (h == INVALID_HANDLE_VALUE) { err = GetLastError(); errno = (err == ERROR_DIRECTORY) ? ENOTDIR : err_win_to_posix(err); + trace_printf_key(&trace_fscache, "fscache: error(%d) '%s'\n", + errno, dir->dirent.d_name); return NULL; } @@ -392,6 +396,7 @@ int fscache_enable(int enable) fscache_clear(); LeaveCriticalSection(&mutex); } + trace_printf_key(&trace_fscache, "fscache: enable(%d)\n", enable); return result; } From 6cc16496a7eeeca844213e41b522910419a9ad88 Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Wed, 1 Nov 2017 15:05:44 -0400 Subject: [PATCH 187/297] dir.c: make add_excludes aware of fscache during status Teach read_directory_recursive() and add_excludes() to be aware of optional fscache and avoid trying to open() and fstat() non-existant ".gitignore" files in every directory in the worktree. The current code in add_excludes() calls open() and then fstat() for a ".gitignore" file in each directory present in the worktree. Change that when fscache is enabled to call lstat() first and if present, call open(). This seems backwards because both lstat needs to do more work than fstat. But when fscache is enabled, fscache will already know if the .gitignore file exists and can completely avoid the IO calls. This works because of the lstat diversion to mingw_lstat when fscache is enabled. This reduced status times on a 350K file enlistment of the Windows repo on a NVMe SSD by 0.25 seconds. Signed-off-by: Jeff Hostetler --- compat/win32/fscache.c | 5 +++++ compat/win32/fscache.h | 3 +++ dir.c | 33 ++++++++++++++++++++++++--------- git-compat-util.h | 4 ++++ 4 files changed, 36 insertions(+), 9 deletions(-) diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index a79096839e5c31..e22660237b955f 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -12,6 +12,11 @@ static struct hashmap map; static CRITICAL_SECTION mutex; static struct trace_key trace_fscache = TRACE_KEY_INIT(FSCACHE); +int fscache_is_enabled(void) +{ + return enabled; +} + /* * An entry in the file system cache. Used for both entire directory listings * and file entries. diff --git a/compat/win32/fscache.h b/compat/win32/fscache.h index ed518b422d705e..9a21fd5709c5bc 100644 --- a/compat/win32/fscache.h +++ b/compat/win32/fscache.h @@ -4,6 +4,9 @@ int fscache_enable(int enable); #define enable_fscache(x) fscache_enable(x) +int fscache_is_enabled(void); +#define is_fscache_enabled() (fscache_is_enabled()) + DIR *fscache_opendir(const char *dir); int fscache_lstat(const char *file_name, struct stat *buf); diff --git a/dir.c b/dir.c index 5a23376bdaec3a..7e8a0dff88965d 100644 --- a/dir.c +++ b/dir.c @@ -1113,16 +1113,31 @@ static int add_patterns(const char *fname, const char *base, int baselen, size_t size = 0; char *buf; - if (flags & PATTERN_NOFOLLOW) - fd = open_nofollow(fname, O_RDONLY); - else - fd = open(fname, O_RDONLY); - - if (fd < 0 || fstat(fd, &st) < 0) { - if (fd < 0) - warn_on_fopen_errors(fname); + if (is_fscache_enabled()) { + if (lstat(fname, &st) < 0) { + fd = -1; + } else { + fd = open(fname, O_RDONLY); + if (fd < 0) + warn_on_fopen_errors(fname); + } + } else { + if (flags & PATTERN_NOFOLLOW) + fd = open_nofollow(fname, O_RDONLY); else - close(fd); + fd = open(fname, O_RDONLY); + + if (fd < 0 || fstat(fd, &st) < 0) { + if (fd < 0) + warn_on_fopen_errors(fname); + else { + close(fd); + fd = -1; + } + } + } + + if (fd < 0) { if (!istate) return -1; r = read_skip_worktree_file_from_index(istate, fname, diff --git a/git-compat-util.h b/git-compat-util.h index 9b00e08d3e4a8d..79d9bba2c2f692 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -1537,6 +1537,10 @@ static inline int is_missing_file_error(int errno_) #define enable_fscache(x) /* noop */ #endif +#ifndef is_fscache_enabled +#define is_fscache_enabled() (0) +#endif + int cmd_main(int, const char **); /* From 91d204000d7b1a9de12c4a7800b053710975a1d9 Mon Sep 17 00:00:00 2001 From: Karsten Blees Date: Tue, 1 Oct 2013 12:51:54 +0200 Subject: [PATCH 188/297] mingw: add a cache below mingw's lstat and dirent implementations Checking the work tree status is quite slow on Windows, due to slow `lstat()` emulation (git calls `lstat()` once for each file in the index). Windows operating system APIs seem to be much better at scanning the status of entire directories than checking single files. Add an `lstat()` implementation that uses a cache for lstat data. Cache misses read the entire parent directory and add it to the cache. Subsequent `lstat()` calls for the same directory are served directly from the cache. Also implement `opendir()`/`readdir()`/`closedir()` so that they create and use directory listings in the cache. The cache doesn't track file system changes and doesn't plug into any modifying file APIs, so it has to be explicitly enabled for git functions that don't modify the working copy. Note: in an earlier version of this patch, the cache was always active and tracked file system changes via ReadDirectoryChangesW. However, this was much more complex and had negative impact on the performance of modifying git commands such as 'git checkout'. Signed-off-by: Karsten Blees Signed-off-by: Johannes Schindelin --- compat/win32/fscache.c | 463 ++++++++++++++++++++++++++++ compat/win32/fscache.h | 10 + config.mak.uname | 4 +- contrib/buildsystems/CMakeLists.txt | 3 +- git-compat-util.h | 2 + 5 files changed, 479 insertions(+), 3 deletions(-) create mode 100644 compat/win32/fscache.c create mode 100644 compat/win32/fscache.h diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c new file mode 100644 index 00000000000000..16e556274614f6 --- /dev/null +++ b/compat/win32/fscache.c @@ -0,0 +1,463 @@ +#include "../../git-compat-util.h" +#include "../../hashmap.h" +#include "../win32.h" +#include "fscache.h" +#include "../../dir.h" +#include "../../abspath.h" + +static int initialized; +static volatile long enabled; +static struct hashmap map; +static CRITICAL_SECTION mutex; + +/* + * An entry in the file system cache. Used for both entire directory listings + * and file entries. + */ +#pragma GCC diagnostic push +#pragma GCC diagnostic ignored "-Wpedantic" +struct fsentry { + struct hashmap_entry ent; + mode_t st_mode; + /* Pointer to the directory listing, or NULL for the listing itself. */ + struct fsentry *list; + /* Pointer to the next file entry of the list. */ + struct fsentry *next; + + union { + /* Reference count of the directory listing. */ + volatile long refcnt; + struct { + /* More stat members (only used for file entries). */ + off64_t st_size; + struct timespec st_atim; + struct timespec st_mtim; + struct timespec st_ctim; + } s; + } u; + + /* Length of name. */ + unsigned short len; + /* + * Name of the entry. For directory listings: relative path of the + * directory, without trailing '/' (empty for cwd()). For file entries: + * name of the file. Typically points to the end of the structure if + * the fsentry is allocated on the heap (see fsentry_alloc), or to a + * local variable if on the stack (see fsentry_init). + */ + struct dirent dirent; +}; +#pragma GCC diagnostic pop + +struct heap_fsentry { + union { + struct fsentry ent; + char dummy[sizeof(struct fsentry) + MAX_PATH]; + } u; +}; + +/* + * Compares the paths of two fsentry structures for equality. + */ +static int fsentry_cmp(void *unused_cmp_data, + const struct fsentry *fse1, const struct fsentry *fse2, + void *unused_keydata) +{ + int res; + if (fse1 == fse2) + return 0; + + /* compare the list parts first */ + if (fse1->list != fse2->list && + (res = fsentry_cmp(NULL, fse1->list ? fse1->list : fse1, + fse2->list ? fse2->list : fse2, NULL))) + return res; + + /* if list parts are equal, compare len and name */ + if (fse1->len != fse2->len) + return fse1->len - fse2->len; + return fspathncmp(fse1->dirent.d_name, fse2->dirent.d_name, fse1->len); +} + +/* + * Calculates the hash code of an fsentry structure's path. + */ +static unsigned int fsentry_hash(const struct fsentry *fse) +{ + unsigned int hash = fse->list ? fse->list->ent.hash : 0; + return hash ^ memihash(fse->dirent.d_name, fse->len); +} + +/* + * Initialize an fsentry structure for use by fsentry_hash and fsentry_cmp. + */ +static void fsentry_init(struct fsentry *fse, struct fsentry *list, + const char *name, size_t len) +{ + fse->list = list; + if (len > MAX_PATH) + BUG("Trying to allocate fsentry for long path '%.*s'", + (int)len, name); + memcpy(fse->dirent.d_name, name, len); + fse->dirent.d_name[len] = 0; + fse->len = len; + hashmap_entry_init(&fse->ent, fsentry_hash(fse)); +} + +/* + * Allocate an fsentry structure on the heap. + */ +static struct fsentry *fsentry_alloc(struct fsentry *list, const char *name, + size_t len) +{ + /* overallocate fsentry and copy the name to the end */ + struct fsentry *fse = xmalloc(sizeof(struct fsentry) + len + 1); + /* init the rest of the structure */ + fsentry_init(fse, list, name, len); + fse->next = NULL; + fse->u.refcnt = 1; + return fse; +} + +/* + * Add a reference to an fsentry. + */ +inline static void fsentry_addref(struct fsentry *fse) +{ + if (fse->list) + fse = fse->list; + + InterlockedIncrement(&(fse->u.refcnt)); +} + +/* + * Release the reference to an fsentry, frees the memory if its the last ref. + */ +static void fsentry_release(struct fsentry *fse) +{ + if (fse->list) + fse = fse->list; + + if (InterlockedDecrement(&(fse->u.refcnt))) + return; + + while (fse) { + struct fsentry *next = fse->next; + free(fse); + fse = next; + } +} + +/* + * Allocate and initialize an fsentry from a WIN32_FIND_DATA structure. + */ +static struct fsentry *fseentry_create_entry(struct fsentry *list, + const WIN32_FIND_DATAW *fdata) +{ + char buf[MAX_PATH * 3]; + int len; + struct fsentry *fse; + len = xwcstoutf(buf, fdata->cFileName, ARRAY_SIZE(buf)); + + fse = fsentry_alloc(list, buf, len); + + fse->st_mode = file_attr_to_st_mode(fdata->dwFileAttributes); + fse->dirent.d_type = S_ISDIR(fse->st_mode) ? DT_DIR : DT_REG; + fse->u.s.st_size = (((off64_t) (fdata->nFileSizeHigh)) << 32) + | fdata->nFileSizeLow; + filetime_to_timespec(&(fdata->ftLastAccessTime), &(fse->u.s.st_atim)); + filetime_to_timespec(&(fdata->ftLastWriteTime), &(fse->u.s.st_mtim)); + filetime_to_timespec(&(fdata->ftCreationTime), &(fse->u.s.st_ctim)); + + return fse; +} + +/* + * Create an fsentry-based directory listing (similar to opendir / readdir). + * Dir should not contain trailing '/'. Use an empty string for the current + * directory (not "."!). + */ +static struct fsentry *fsentry_create_list(const struct fsentry *dir) +{ + wchar_t pattern[MAX_PATH + 2]; /* + 2 for '/' '*' */ + WIN32_FIND_DATAW fdata; + HANDLE h; + int wlen; + struct fsentry *list, **phead; + DWORD err; + + /* convert name to UTF-16 and check length < MAX_PATH */ + if ((wlen = xutftowcsn(pattern, dir->dirent.d_name, MAX_PATH, + dir->len)) < 0) { + if (errno == ERANGE) + errno = ENAMETOOLONG; + return NULL; + } + + /* append optional '/' and wildcard '*' */ + if (wlen) + pattern[wlen++] = '/'; + pattern[wlen++] = '*'; + pattern[wlen] = 0; + + /* open find handle */ + h = FindFirstFileW(pattern, &fdata); + if (h == INVALID_HANDLE_VALUE) { + err = GetLastError(); + errno = (err == ERROR_DIRECTORY) ? ENOTDIR : err_win_to_posix(err); + return NULL; + } + + /* allocate object to hold directory listing */ + list = fsentry_alloc(NULL, dir->dirent.d_name, dir->len); + + /* walk directory and build linked list of fsentry structures */ + phead = &list->next; + do { + *phead = fseentry_create_entry(list, &fdata); + phead = &(*phead)->next; + } while (FindNextFileW(h, &fdata)); + + /* remember result of last FindNextFile, then close find handle */ + err = GetLastError(); + FindClose(h); + + /* return the list if we've got all the files */ + if (err == ERROR_NO_MORE_FILES) + return list; + + /* otherwise free the list and return error */ + fsentry_release(list); + errno = err_win_to_posix(err); + return NULL; +} + +/* + * Adds a directory listing to the cache. + */ +static void fscache_add(struct fsentry *fse) +{ + if (fse->list) + fse = fse->list; + + for (; fse; fse = fse->next) + hashmap_add(&map, &fse->ent); +} + +/* + * Clears the cache. + */ +static void fscache_clear(void) +{ + hashmap_clear_and_free(&map, struct fsentry, ent); + hashmap_init(&map, (hashmap_cmp_fn)fsentry_cmp, NULL, 0); +} + +/* + * Checks if the cache is enabled for the given path. + */ +static inline int fscache_enabled(const char *path) +{ + return enabled > 0 && !is_absolute_path(path); +} + +/* + * Looks up or creates a cache entry for the specified key. + */ +static struct fsentry *fscache_get(struct fsentry *key) +{ + struct fsentry *fse; + + EnterCriticalSection(&mutex); + /* check if entry is in cache */ + fse = hashmap_get_entry(&map, key, ent, NULL); + if (fse) { + fsentry_addref(fse); + LeaveCriticalSection(&mutex); + return fse; + } + /* if looking for a file, check if directory listing is in cache */ + if (!fse && key->list) { + fse = hashmap_get_entry(&map, key->list, ent, NULL); + if (fse) { + LeaveCriticalSection(&mutex); + /* dir entry without file entry -> file doesn't exist */ + errno = ENOENT; + return NULL; + } + } + + /* create the directory listing (outside mutex!) */ + LeaveCriticalSection(&mutex); + fse = fsentry_create_list(key->list ? key->list : key); + if (!fse) + return NULL; + + EnterCriticalSection(&mutex); + /* add directory listing if it hasn't been added by some other thread */ + if (!hashmap_get_entry(&map, key, ent, NULL)) + fscache_add(fse); + + /* lookup file entry if requested (fse already points to directory) */ + if (key->list) + fse = hashmap_get_entry(&map, key, ent, NULL); + + /* return entry or ENOENT */ + if (fse) + fsentry_addref(fse); + else + errno = ENOENT; + + LeaveCriticalSection(&mutex); + return fse; +} + +/* + * Enables or disables the cache. Note that the cache is read-only, changes to + * the working directory are NOT reflected in the cache while enabled. + */ +int fscache_enable(int enable) +{ + int result; + + if (!initialized) { + /* allow the cache to be disabled entirely */ + if (!core_fscache) + return 0; + + InitializeCriticalSection(&mutex); + hashmap_init(&map, (hashmap_cmp_fn) fsentry_cmp, NULL, 0); + initialized = 1; + } + + result = enable ? InterlockedIncrement(&enabled) + : InterlockedDecrement(&enabled); + + if (enable && result == 1) { + /* redirect opendir and lstat to the fscache implementations */ + opendir = fscache_opendir; + lstat = fscache_lstat; + } else if (!enable && !result) { + /* reset opendir and lstat to the original implementations */ + opendir = dirent_opendir; + lstat = mingw_lstat; + EnterCriticalSection(&mutex); + fscache_clear(); + LeaveCriticalSection(&mutex); + } + return result; +} + +/* + * Lstat replacement, uses the cache if enabled, otherwise redirects to + * mingw_lstat. + */ +int fscache_lstat(const char *filename, struct stat *st) +{ + int dirlen, base, len; + struct heap_fsentry key[2]; + struct fsentry *fse; + + if (!fscache_enabled(filename)) + return mingw_lstat(filename, st); + + /* split filename into path + name */ + len = strlen(filename); + if (len && is_dir_sep(filename[len - 1])) + len--; + base = len; + while (base && !is_dir_sep(filename[base - 1])) + base--; + dirlen = base ? base - 1 : 0; + + /* lookup entry for path + name in cache */ + fsentry_init(&key[0].u.ent, NULL, filename, dirlen); + fsentry_init(&key[1].u.ent, &key[0].u.ent, filename + base, len - base); + fse = fscache_get(&key[1].u.ent); + if (!fse) { + errno = ENOENT; + return -1; + } + + /* copy stat data */ + st->st_ino = 0; + st->st_gid = 0; + st->st_uid = 0; + st->st_dev = 0; + st->st_rdev = 0; + st->st_nlink = 1; + st->st_mode = fse->st_mode; + st->st_size = fse->u.s.st_size; + st->st_atim = fse->u.s.st_atim; + st->st_mtim = fse->u.s.st_mtim; + st->st_ctim = fse->u.s.st_ctim; + + /* don't forget to release fsentry */ + fsentry_release(fse); + return 0; +} + +typedef struct fscache_DIR { + struct DIR base_dir; /* extend base struct DIR */ + struct fsentry *pfsentry; + struct dirent *dirent; +} fscache_DIR; + +/* + * Readdir replacement. + */ +static struct dirent *fscache_readdir(DIR *base_dir) +{ + fscache_DIR *dir = (fscache_DIR*) base_dir; + struct fsentry *next = dir->pfsentry->next; + if (!next) + return NULL; + dir->pfsentry = next; + dir->dirent = &next->dirent; + return dir->dirent; +} + +/* + * Closedir replacement. + */ +static int fscache_closedir(DIR *base_dir) +{ + fscache_DIR *dir = (fscache_DIR*) base_dir; + fsentry_release(dir->pfsentry); + free(dir); + return 0; +} + +/* + * Opendir replacement, uses a directory listing from the cache if enabled, + * otherwise calls original dirent implementation. + */ +DIR *fscache_opendir(const char *dirname) +{ + struct heap_fsentry key; + struct fsentry *list; + fscache_DIR *dir; + int len; + + if (!fscache_enabled(dirname)) + return dirent_opendir(dirname); + + /* prepare name (strip trailing '/', replace '.') */ + len = strlen(dirname); + if ((len == 1 && dirname[0] == '.') || + (len && is_dir_sep(dirname[len - 1]))) + len--; + + /* get directory listing from cache */ + fsentry_init(&key.u.ent, NULL, dirname, len); + list = fscache_get(&key.u.ent); + if (!list) + return NULL; + + /* alloc and return DIR structure */ + dir = (fscache_DIR*) xmalloc(sizeof(fscache_DIR)); + dir->base_dir.preaddir = fscache_readdir; + dir->base_dir.pclosedir = fscache_closedir; + dir->pfsentry = list; + return (DIR*) dir; +} diff --git a/compat/win32/fscache.h b/compat/win32/fscache.h new file mode 100644 index 00000000000000..ed518b422d705e --- /dev/null +++ b/compat/win32/fscache.h @@ -0,0 +1,10 @@ +#ifndef FSCACHE_H +#define FSCACHE_H + +int fscache_enable(int enable); +#define enable_fscache(x) fscache_enable(x) + +DIR *fscache_opendir(const char *dir); +int fscache_lstat(const char *file_name, struct stat *buf); + +#endif diff --git a/config.mak.uname b/config.mak.uname index 75adc5139c57fb..7515d87270e4f2 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -501,7 +501,7 @@ endif compat/win32/path-utils.o \ compat/win32/pthread.o compat/win32/syslog.o \ compat/win32/trace2_win32_process_info.o \ - compat/win32/dirent.o + compat/win32/dirent.o compat/win32/fscache.o COMPAT_CFLAGS = -D__USE_MINGW_ACCESS -DDETECT_MSYS_TTY -DENSURE_MSYSTEM_IS_SET -DNOGDI -DHAVE_STRING_H -Icompat -Icompat/regex -Icompat/win32 -DSTRIP_EXTENSION=\".exe\" BASIC_LDFLAGS = -IGNORE:4217 -IGNORE:4049 -NOLOGO # invalidcontinue.obj allows Git's source code to close the same file @@ -703,7 +703,7 @@ ifeq ($(uname_S),MINGW) compat/win32/flush.o \ compat/win32/path-utils.o \ compat/win32/pthread.o compat/win32/syslog.o \ - compat/win32/dirent.o + compat/win32/dirent.o compat/win32/fscache.o BASIC_CFLAGS += -DWIN32 EXTLIBS += -lws2_32 GITLIBS += git.res diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt index 6e7624467f8c91..9540631852a028 100644 --- a/contrib/buildsystems/CMakeLists.txt +++ b/contrib/buildsystems/CMakeLists.txt @@ -308,7 +308,8 @@ if(CMAKE_SYSTEM_NAME STREQUAL "Windows") compat/win32/trace2_win32_process_info.c compat/win32/dirent.c compat/nedmalloc/nedmalloc.c - compat/strdup.c) + compat/strdup.c + compat/win32/fscache.c) set(NO_UNIX_SOCKETS 1) elseif(CMAKE_SYSTEM_NAME STREQUAL "Linux") diff --git a/git-compat-util.h b/git-compat-util.h index c8429efc22c7f1..9b00e08d3e4a8d 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -284,9 +284,11 @@ static inline int is_xplatform_dir_sep(int c) /* pull in Windows compatibility stuff */ #include "compat/win32/path-utils.h" #include "compat/mingw.h" +#include "compat/win32/fscache.h" #elif defined(_MSC_VER) #include "compat/win32/path-utils.h" #include "compat/msvc.h" +#include "compat/win32/fscache.h" #else #include #include From db1763d6e60ce27a9ed431b3936ef16f4e65bf91 Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Tue, 13 Dec 2016 14:05:32 -0500 Subject: [PATCH 189/297] fscache: remember not-found directories Teach FSCACHE to remember "not found" directories. This is a performance optimization. FSCACHE is a performance optimization available for Windows. It intercepts Posix-style lstat() calls into an in-memory directory using FindFirst/FindNext. It improves performance on Windows by catching the first lstat() call in a directory, using FindFirst/ FindNext to read the list of files (and attribute data) for the entire directory into the cache, and short-cut subsequent lstat() calls in the same directory. This gives a major performance boost on Windows. However, it does not remember "not found" directories. When STATUS runs and there are missing directories, the lstat() interception fails to find the parent directory and simply return ENOENT for the file -- it does not remember that the FindFirst on the directory failed. Thus subsequent lstat() calls in the same directory, each re-attempt the FindFirst. This completely defeats any performance gains. This can be seen by doing a sparse-checkout on a large repo and then doing a read-tree to reset the skip-worktree bits and then running status. This change reduced status times for my very large repo by 60%. Signed-off-by: Jeff Hostetler Signed-off-by: Johannes Schindelin --- compat/win32/fscache.c | 36 ++++++++++++++++++++++++++++++++---- 1 file changed, 32 insertions(+), 4 deletions(-) diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index 9f495ac384c925..a79096839e5c31 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -181,7 +181,8 @@ static struct fsentry *fseentry_create_entry(struct fsentry *list, * Dir should not contain trailing '/'. Use an empty string for the current * directory (not "."!). */ -static struct fsentry *fsentry_create_list(const struct fsentry *dir) +static struct fsentry *fsentry_create_list(const struct fsentry *dir, + int *dir_not_found) { wchar_t pattern[MAX_PATH + 2]; /* + 2 for '/' '*' */ WIN32_FIND_DATAW fdata; @@ -190,6 +191,8 @@ static struct fsentry *fsentry_create_list(const struct fsentry *dir) struct fsentry *list, **phead; DWORD err; + *dir_not_found = 0; + /* convert name to UTF-16 and check length < MAX_PATH */ if ((wlen = xutftowcsn(pattern, dir->dirent.d_name, MAX_PATH, dir->len)) < 0) { @@ -208,6 +211,7 @@ static struct fsentry *fsentry_create_list(const struct fsentry *dir) h = FindFirstFileW(pattern, &fdata); if (h == INVALID_HANDLE_VALUE) { err = GetLastError(); + *dir_not_found = 1; /* or empty directory */ errno = (err == ERROR_DIRECTORY) ? ENOTDIR : err_win_to_posix(err); trace_printf_key(&trace_fscache, "fscache: error(%d) '%s'\n", errno, dir->dirent.d_name); @@ -216,6 +220,8 @@ static struct fsentry *fsentry_create_list(const struct fsentry *dir) /* allocate object to hold directory listing */ list = fsentry_alloc(NULL, dir->dirent.d_name, dir->len); + list->st_mode = S_IFDIR; + list->dirent.d_type = DT_DIR; /* walk directory and build linked list of fsentry structures */ phead = &list->next; @@ -300,12 +306,16 @@ static struct fsentry *fscache_get_wait(struct fsentry *key) static struct fsentry *fscache_get(struct fsentry *key) { struct fsentry *fse, *future, *waiter; + int dir_not_found; EnterCriticalSection(&mutex); /* check if entry is in cache */ fse = fscache_get_wait(key); if (fse) { - fsentry_addref(fse); + if (fse->st_mode) + fsentry_addref(fse); + else + fse = NULL; /* non-existing directory */ LeaveCriticalSection(&mutex); return fse; } @@ -314,7 +324,10 @@ static struct fsentry *fscache_get(struct fsentry *key) fse = fscache_get_wait(key->list); if (fse) { LeaveCriticalSection(&mutex); - /* dir entry without file entry -> file doesn't exist */ + /* + * dir entry without file entry, or dir does not + * exist -> file doesn't exist + */ errno = ENOENT; return NULL; } @@ -328,7 +341,7 @@ static struct fsentry *fscache_get(struct fsentry *key) /* create the directory listing (outside mutex!) */ LeaveCriticalSection(&mutex); - fse = fsentry_create_list(future); + fse = fsentry_create_list(future, &dir_not_found); EnterCriticalSection(&mutex); /* remove future entry and signal waiting threads */ @@ -342,6 +355,18 @@ static struct fsentry *fscache_get(struct fsentry *key) /* leave on error (errno set by fsentry_create_list) */ if (!fse) { + if (dir_not_found && key->list) { + /* + * Record that the directory does not exist (or is + * empty, which for all practical matters is the same + * thing as far as fscache is concerned). + */ + fse = fsentry_alloc(key->list->list, + key->list->dirent.d_name, + key->list->len); + fse->st_mode = 0; + hashmap_add(&map, &fse->ent); + } LeaveCriticalSection(&mutex); return NULL; } @@ -353,6 +378,9 @@ static struct fsentry *fscache_get(struct fsentry *key) if (key->list) fse = hashmap_get_entry(&map, key, ent, NULL); + if (fse && !fse->st_mode) + fse = NULL; /* non-existing directory */ + /* return entry or ENOENT */ if (fse) fsentry_addref(fse); From 5a2c256f6ba04cb5242f07f5f8cc65b94a925737 Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Wed, 20 Dec 2017 10:43:41 -0500 Subject: [PATCH 190/297] fscache: make fscache_enabled() public Make fscache_enabled() function public rather than static. Remove unneeded fscache_is_enabled() function. Change is_fscache_enabled() macro to call fscache_enabled(). is_fscache_enabled() now takes a pathname so that the answer is more precise and mean "is fscache enabled for this pathname", since fscache only stores repo-relative paths and not absolute paths, we can avoid attempting lookups for absolute paths. Signed-off-by: Jeff Hostetler --- compat/win32/fscache.c | 7 +------ compat/win32/fscache.h | 4 ++-- dir.c | 2 +- git-compat-util.h | 2 +- 4 files changed, 5 insertions(+), 10 deletions(-) diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index e22660237b955f..88907836890af1 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -12,11 +12,6 @@ static struct hashmap map; static CRITICAL_SECTION mutex; static struct trace_key trace_fscache = TRACE_KEY_INIT(FSCACHE); -int fscache_is_enabled(void) -{ - return enabled; -} - /* * An entry in the file system cache. Used for both entire directory listings * and file entries. @@ -273,7 +268,7 @@ static void fscache_clear(void) /* * Checks if the cache is enabled for the given path. */ -static inline int fscache_enabled(const char *path) +int fscache_enabled(const char *path) { return enabled > 0 && !is_absolute_path(path); } diff --git a/compat/win32/fscache.h b/compat/win32/fscache.h index 9a21fd5709c5bc..660ada053b4309 100644 --- a/compat/win32/fscache.h +++ b/compat/win32/fscache.h @@ -4,8 +4,8 @@ int fscache_enable(int enable); #define enable_fscache(x) fscache_enable(x) -int fscache_is_enabled(void); -#define is_fscache_enabled() (fscache_is_enabled()) +int fscache_enabled(const char *path); +#define is_fscache_enabled(path) fscache_enabled(path) DIR *fscache_opendir(const char *dir); int fscache_lstat(const char *file_name, struct stat *buf); diff --git a/dir.c b/dir.c index 7e8a0dff88965d..40d4186de467fc 100644 --- a/dir.c +++ b/dir.c @@ -1113,7 +1113,7 @@ static int add_patterns(const char *fname, const char *base, int baselen, size_t size = 0; char *buf; - if (is_fscache_enabled()) { + if (is_fscache_enabled(fname)) { if (lstat(fname, &st) < 0) { fd = -1; } else { diff --git a/git-compat-util.h b/git-compat-util.h index 79d9bba2c2f692..3bd5804b26930a 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -1538,7 +1538,7 @@ static inline int is_missing_file_error(int errno_) #endif #ifndef is_fscache_enabled -#define is_fscache_enabled() (0) +#define is_fscache_enabled(path) (0) #endif int cmd_main(int, const char **); From da8579d5a204b30f7334b672aa36c8bc715237b7 Mon Sep 17 00:00:00 2001 From: Karsten Blees Date: Tue, 24 Jun 2014 13:22:35 +0200 Subject: [PATCH 191/297] fscache: load directories only once If multiple threads access a directory that is not yet in the cache, the directory will be loaded by each thread. Only one of the results is added to the cache, all others are leaked. This wastes performance and memory. On cache miss, add a future object to the cache to indicate that the directory is currently being loaded. Subsequent threads register themselves with the future object and wait. When the first thread has loaded the directory, it replaces the future object with the result and notifies waiting threads. Signed-off-by: Karsten Blees --- compat/win32/fscache.c | 65 ++++++++++++++++++++++++++++++++++++------ 1 file changed, 56 insertions(+), 9 deletions(-) diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index 16e556274614f6..8c7a37c6c0cfce 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -27,6 +27,8 @@ struct fsentry { union { /* Reference count of the directory listing. */ volatile long refcnt; + /* Handle to wait on the loading thread. */ + HANDLE hwait; struct { /* More stat members (only used for file entries). */ off64_t st_size; @@ -261,16 +263,43 @@ static inline int fscache_enabled(const char *path) return enabled > 0 && !is_absolute_path(path); } +/* + * Looks up a cache entry, waits if its being loaded by another thread. + * The mutex must be owned by the calling thread. + */ +static struct fsentry *fscache_get_wait(struct fsentry *key) +{ + struct fsentry *fse = hashmap_get_entry(&map, key, ent, NULL); + + /* return if its a 'real' entry (future entries have refcnt == 0) */ + if (!fse || fse->list || fse->u.refcnt) + return fse; + + /* create an event and link our key to the future entry */ + key->u.hwait = CreateEvent(NULL, TRUE, FALSE, NULL); + key->next = fse->next; + fse->next = key; + + /* wait for the loading thread to signal us */ + LeaveCriticalSection(&mutex); + WaitForSingleObject(key->u.hwait, INFINITE); + CloseHandle(key->u.hwait); + EnterCriticalSection(&mutex); + + /* repeat cache lookup */ + return hashmap_get_entry(&map, key, ent, NULL); +} + /* * Looks up or creates a cache entry for the specified key. */ static struct fsentry *fscache_get(struct fsentry *key) { - struct fsentry *fse; + struct fsentry *fse, *future, *waiter; EnterCriticalSection(&mutex); /* check if entry is in cache */ - fse = hashmap_get_entry(&map, key, ent, NULL); + fse = fscache_get_wait(key); if (fse) { fsentry_addref(fse); LeaveCriticalSection(&mutex); @@ -278,7 +307,7 @@ static struct fsentry *fscache_get(struct fsentry *key) } /* if looking for a file, check if directory listing is in cache */ if (!fse && key->list) { - fse = hashmap_get_entry(&map, key->list, ent, NULL); + fse = fscache_get_wait(key->list); if (fse) { LeaveCriticalSection(&mutex); /* dir entry without file entry -> file doesn't exist */ @@ -287,16 +316,34 @@ static struct fsentry *fscache_get(struct fsentry *key) } } + /* add future entry to indicate that we're loading it */ + future = key->list ? key->list : key; + future->next = NULL; + future->u.refcnt = 0; + hashmap_add(&map, &future->ent); + /* create the directory listing (outside mutex!) */ LeaveCriticalSection(&mutex); - fse = fsentry_create_list(key->list ? key->list : key); - if (!fse) + fse = fsentry_create_list(future); + EnterCriticalSection(&mutex); + + /* remove future entry and signal waiting threads */ + hashmap_remove(&map, &future->ent, NULL); + waiter = future->next; + while (waiter) { + HANDLE h = waiter->u.hwait; + waiter = waiter->next; + SetEvent(h); + } + + /* leave on error (errno set by fsentry_create_list) */ + if (!fse) { + LeaveCriticalSection(&mutex); return NULL; + } - EnterCriticalSection(&mutex); - /* add directory listing if it hasn't been added by some other thread */ - if (!hashmap_get_entry(&map, key, ent, NULL)) - fscache_add(fse); + /* add directory listing to the cache */ + fscache_add(fse); /* lookup file entry if requested (fse already points to directory) */ if (key->list) From 3d532938664aa9bdc61458a98516b67ac579d56b Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 25 Jan 2017 18:39:16 +0100 Subject: [PATCH 192/297] fscache: add a test for the dir-not-found optimization Signed-off-by: Johannes Schindelin --- t/t1090-sparse-checkout-scope.sh | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/t/t1090-sparse-checkout-scope.sh b/t/t1090-sparse-checkout-scope.sh index da0e7714d5ac5c..b6298e7e7d8ab1 100755 --- a/t/t1090-sparse-checkout-scope.sh +++ b/t/t1090-sparse-checkout-scope.sh @@ -107,4 +107,24 @@ test_expect_success 'in partial clone, sparse checkout only fetches needed blobs test_cmp expect actual ' +test_expect_success MINGW 'no unnecessary opendir() with fscache' ' + git clone . fscache-test && + ( + cd fscache-test && + git config core.fscache 1 && + echo "/excluded/*" >.git/info/sparse-checkout && + for f in $(test_seq 10) + do + sha1=$(echo $f | git hash-object -w --stdin) && + git update-index --add \ + --cacheinfo 100644,$sha1,excluded/$f || exit 1 + done && + test_tick && + git commit -m excluded && + GIT_TRACE_FSCACHE=1 git status >out 2>err && + grep excluded err >grep.out && + test_line_count = 1 grep.out + ) +' + test_done From a59884654484c18f121cc753a12b49e2d74757b6 Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Tue, 22 Nov 2016 11:26:38 -0500 Subject: [PATCH 193/297] add: use preload-index and fscache for performance Teach "add" to use preload-index and fscache features to improve performance on very large repositories. During an "add", a call is made to run_diff_files() which calls check_remove() for each index-entry. This calls lstat(). On Windows, the fscache code intercepts the lstat() calls and builds a private cache using the FindFirst/FindNext routines, which are much faster. Somewhat independent of this, is the preload-index code which distributes some of the start-up costs across multiple threads. We need to keep the call to read_cache() before parsing the pathspecs (and hence cannot use the pathspecs to limit any preload) because parse_pathspec() is using the index to determine whether a pathspec is, in fact, in a submodule. If we would not read the index first, parse_pathspec() would not error out on a path that is inside a submodule, and t7400-submodule-basic.sh would fail with not ok 47 - do not add files from a submodule We still want the nice preload performance boost, though, so we simply call read_cache_preload(&pathspecs) after parsing the pathspecs. Signed-off-by: Jeff Hostetler Signed-off-by: Johannes Schindelin --- builtin/add.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/builtin/add.c b/builtin/add.c index 40b61ef90d98ca..a17484ba4980aa 100644 --- a/builtin/add.c +++ b/builtin/add.c @@ -460,6 +460,10 @@ int cmd_add(int argc, const char **argv, const char *prefix) die_in_unpopulated_submodule(the_repository->index, prefix); die_path_inside_submodule(the_repository->index, &pathspec); + enable_fscache(1); + /* We do not really re-read the index but update the up-to-date flags */ + preload_index(the_repository->index, &pathspec, 0); + if (add_new_files) { int baselen; @@ -572,5 +576,6 @@ int cmd_add(int argc, const char **argv, const char *prefix) free(ps_matched); dir_clear(&dir); clear_pathspec(&pathspec); + enable_fscache(0); return exit_status; } From 07cec994bad7a4c6c666b940e7f5a6c5cfabfd46 Mon Sep 17 00:00:00 2001 From: Jeff Hostetler Date: Wed, 20 Dec 2017 11:19:27 -0500 Subject: [PATCH 194/297] dir.c: regression fix for add_excludes with fscache Fix regression described in: https://github.com/git-for-windows/git/issues/1392 which was introduced in: https://github.com/git-for-windows/git/commit/b2353379bba414e6c00dde913497cc9c827366f2 Problem Symptoms ================ When the user has a .gitignore file that is a symlink, the fscache optimization introduced above caused the stat-data from the symlink, rather that of the target file, to be returned. Later when the ignore file was read, the buffer length did not match the stat.st_size field and we called die("cannot use as an exclude file") Optimization Rationale ====================== The above optimization calls lstat() before open() primarily to ask fscache if the file exists. It gets the current stat-data as a side effect essentially for free (since we already have it in memory). If the file does not exist, it does not need to call open(). And since very few directories have .gitignore files, we can greatly reduce time spent in the filesystem. Discussion of Fix ================= The above optimization calls lstat() rather than stat() because the fscache only intercepts lstat() calls. Calls to stat() stay directed to the mingw_stat() completly bypassing fscache. Furthermore, calls to mingw_stat() always call {open, fstat, close} so that symlinks are properly dereferenced, which adds *additional* open/close calls on top of what the original code in dir.c is doing. Since the problem only manifests for symlinks, we add code to overwrite the stat-data when the path is a symlink. This preserves the effect of the performance gains provided by the fscache in the normal case. Signed-off-by: Jeff Hostetler --- dir.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) diff --git a/dir.c b/dir.c index 40d4186de467fc..837c812a149e08 100644 --- a/dir.c +++ b/dir.c @@ -1113,6 +1113,29 @@ static int add_patterns(const char *fname, const char *base, int baselen, size_t size = 0; char *buf; + /* + * A performance optimization for status. + * + * During a status scan, git looks in each directory for a .gitignore + * file before scanning the directory. Since .gitignore files are not + * that common, we can waste a lot of time looking for files that are + * not there. Fortunately, the fscache already knows if the directory + * contains a .gitignore file, since it has already read the directory + * and it already has the stat-data. + * + * If the fscache is enabled, use the fscache-lstat() interlude to see + * if the file exists (in the fscache hash maps) before trying to open() + * it. + * + * This causes problem when the .gitignore file is a symlink, because + * we call lstat() rather than stat() on the symlnk and the resulting + * stat-data is for the symlink itself rather than the target file. + * We CANNOT use stat() here because the fscache DOES NOT install an + * interlude for stat() and mingw_stat() always calls "open-fstat-close" + * on the file and defeats the purpose of the optimization here. Since + * symlinks are even more rare than .gitignore files, we force a fstat() + * after our open() to get stat-data for the target file. + */ if (is_fscache_enabled(fname)) { if (lstat(fname, &st) < 0) { fd = -1; @@ -1120,6 +1143,11 @@ static int add_patterns(const char *fname, const char *base, int baselen, fd = open(fname, O_RDONLY); if (fd < 0) warn_on_fopen_errors(fname); + else if (S_ISLNK(st.st_mode) && fstat(fd, &st) < 0) { + warn_on_fopen_errors(fname); + close(fd); + fd = -1; + } } } else { if (flags & PATTERN_NOFOLLOW) From 2b1802f1f3d6c27525d9a3a7f7327d0e9d22b0d1 Mon Sep 17 00:00:00 2001 From: Ben Peart Date: Thu, 4 Oct 2018 18:10:21 -0400 Subject: [PATCH 195/297] mem_pool: add GIT_TRACE_MEMPOOL support Add tracing around initializing and discarding mempools. In discard report on the amount of memory unused in the current block to help tune setting the initial_size. Signed-off-by: Ben Peart --- mem-pool.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/mem-pool.c b/mem-pool.c index a3ba38831d94d6..f1264bcf9ddfa1 100644 --- a/mem-pool.c +++ b/mem-pool.c @@ -5,7 +5,9 @@ #include "git-compat-util.h" #include "mem-pool.h" #include "gettext.h" +#include "trace.h" +static struct trace_key trace_mem_pool = TRACE_KEY_INIT(MEMPOOL); #define BLOCK_GROWTH_SIZE (1024 * 1024 - sizeof(struct mp_block)) /* @@ -63,12 +65,20 @@ void mem_pool_init(struct mem_pool *pool, size_t initial_size) if (initial_size > 0) mem_pool_alloc_block(pool, initial_size, NULL); + + trace_printf_key(&trace_mem_pool, + "mem_pool (%p): init (%"PRIuMAX") initial size\n", + (void *)pool, (uintmax_t)initial_size); } void mem_pool_discard(struct mem_pool *pool, int invalidate_memory) { struct mp_block *block, *block_to_free; + trace_printf_key(&trace_mem_pool, + "mem_pool (%p): discard (%"PRIuMAX") unused\n", + (void *)pool, + (uintmax_t)(pool->mp_block->end - pool->mp_block->next_free)); block = pool->mp_block; while (block) { From 550556326d94e2786b6a75a30f4ef5aeb0aaa994 Mon Sep 17 00:00:00 2001 From: Ben Peart Date: Fri, 2 Nov 2018 11:19:10 -0400 Subject: [PATCH 196/297] fscache: fscache takes an initial size Update enable_fscache() to take an optional initial size parameter which is used to initialize the hashmap so that it can avoid having to rehash as additional entries are added. Add a separate disable_fscache() macro to make the code clearer and easier to read. Signed-off-by: Ben Peart Signed-off-by: Johannes Schindelin --- builtin/add.c | 2 +- builtin/checkout.c | 4 ++-- builtin/commit.c | 4 ++-- compat/win32/fscache.c | 8 ++++++-- compat/win32/fscache.h | 5 +++-- fetch-pack.c | 4 ++-- git-compat-util.h | 4 ++++ preload-index.c | 4 ++-- read-cache.c | 4 ++-- 9 files changed, 24 insertions(+), 15 deletions(-) diff --git a/builtin/add.c b/builtin/add.c index a17484ba4980aa..077f5c171a90b5 100644 --- a/builtin/add.c +++ b/builtin/add.c @@ -460,7 +460,7 @@ int cmd_add(int argc, const char **argv, const char *prefix) die_in_unpopulated_submodule(the_repository->index, prefix); die_path_inside_submodule(the_repository->index, &pathspec); - enable_fscache(1); + enable_fscache(0); /* We do not really re-read the index but update the up-to-date flags */ preload_index(the_repository->index, &pathspec, 0); diff --git a/builtin/checkout.c b/builtin/checkout.c index e19edfc0507f2f..44db2bae72a2f5 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -403,7 +403,7 @@ static int checkout_worktree(const struct checkout_opts *opts, if (pc_workers > 1) init_parallel_checkout(); - enable_fscache(1); + enable_fscache(the_repository->index->cache_nr); for (pos = 0; pos < the_repository->index->cache_nr; pos++) { struct cache_entry *ce = the_repository->index->cache[pos]; if (ce->ce_flags & CE_MATCHED) { @@ -429,7 +429,7 @@ static int checkout_worktree(const struct checkout_opts *opts, errs |= run_parallel_checkout(&state, pc_workers, pc_threshold, NULL, NULL); mem_pool_discard(&ce_mem_pool, should_validate_cache_entries()); - enable_fscache(0); + disable_fscache(); remove_marked_cache_entries(the_repository->index, 1); remove_scheduled_dirs(); errs |= finish_delayed_checkout(&state, opts->show_progress); diff --git a/builtin/commit.c b/builtin/commit.c index 1f61c538faf84e..6a5266075a8ede 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -1568,7 +1568,7 @@ int cmd_status(int argc, const char **argv, const char *prefix) PATHSPEC_PREFER_FULL, prefix, argv); - enable_fscache(1); + enable_fscache(0); if (status_format != STATUS_FORMAT_PORCELAIN && status_format != STATUS_FORMAT_PORCELAIN_V2) progress_flag = REFRESH_PROGRESS; @@ -1609,7 +1609,7 @@ int cmd_status(int argc, const char **argv, const char *prefix) wt_status_print(&s); wt_status_collect_free_buffers(&s); - enable_fscache(0); + disable_fscache(); return 0; } diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index 0f0225981ef4a4..bb485fad483ccf 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -405,7 +405,7 @@ static struct fsentry *fscache_get(struct fsentry *key) * Enables or disables the cache. Note that the cache is read-only, changes to * the working directory are NOT reflected in the cache while enabled. */ -int fscache_enable(int enable) +int fscache_enable(int enable, size_t initial_size) { int result; @@ -421,7 +421,11 @@ int fscache_enable(int enable) InitializeCriticalSection(&mutex); lstat_requests = opendir_requests = 0; fscache_misses = fscache_requests = 0; - hashmap_init(&map, (hashmap_cmp_fn) fsentry_cmp, NULL, 0); + /* + * avoid having to rehash by leaving room for the parent dirs. + * '4' was determined empirically by testing several repos + */ + hashmap_init(&map, (hashmap_cmp_fn) fsentry_cmp, NULL, initial_size * 4); initialized = 1; } diff --git a/compat/win32/fscache.h b/compat/win32/fscache.h index 2f06f8df97dcd0..d49c9381114da6 100644 --- a/compat/win32/fscache.h +++ b/compat/win32/fscache.h @@ -1,8 +1,9 @@ #ifndef FSCACHE_H #define FSCACHE_H -int fscache_enable(int enable); -#define enable_fscache(x) fscache_enable(x) +int fscache_enable(int enable, size_t initial_size); +#define enable_fscache(initial_size) fscache_enable(1, initial_size) +#define disable_fscache() fscache_enable(0, 0) int fscache_enabled(const char *path); #define is_fscache_enabled(path) fscache_enabled(path) diff --git a/fetch-pack.c b/fetch-pack.c index 89279c73d6b05b..0af92b55882bfc 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -757,7 +757,7 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator, save_commit_buffer = 0; trace2_region_enter("fetch-pack", "parse_remote_refs_and_find_cutoff", NULL); - enable_fscache(1); + enable_fscache(0); for (ref = *refs; ref; ref = ref->next) { struct commit *commit; @@ -784,7 +784,7 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator, if (!cutoff || cutoff < commit->date) cutoff = commit->date; } - enable_fscache(0); + disable_fscache(); trace2_region_leave("fetch-pack", "parse_remote_refs_and_find_cutoff", NULL); /* diff --git a/git-compat-util.h b/git-compat-util.h index 529ab6a9834f7f..7c5a95bc55403e 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -1537,6 +1537,10 @@ static inline int is_missing_file_error(int errno_) #define enable_fscache(x) /* noop */ #endif +#ifndef disable_fscache +#define disable_fscache() /* noop */ +#endif + #ifndef is_fscache_enabled #define is_fscache_enabled(path) (0) #endif diff --git a/preload-index.c b/preload-index.c index ec1d7b46d55fc0..93f74d44e2a023 100644 --- a/preload-index.c +++ b/preload-index.c @@ -132,7 +132,7 @@ void preload_index(struct index_state *index, pthread_mutex_init(&pd.mutex, NULL); } - enable_fscache(1); + enable_fscache(index->cache_nr); for (i = 0; i < threads; i++) { struct thread_data *p = data+i; int err; @@ -169,7 +169,7 @@ void preload_index(struct index_state *index, trace2_data_intmax("index", NULL, "preload/sum_lstat", t2_sum_lstat); trace2_region_leave("index", "preload", NULL); - enable_fscache(0); + disable_fscache(); } int repo_read_index_preload(struct repository *repo, diff --git a/read-cache.c b/read-cache.c index 009343eda25273..33ae3c30b68728 100644 --- a/read-cache.c +++ b/read-cache.c @@ -1530,7 +1530,7 @@ int refresh_index(struct index_state *istate, unsigned int flags, typechange_fmt = in_porcelain ? "T\t%s\n" : "%s: needs update\n"; added_fmt = in_porcelain ? "A\t%s\n" : "%s: needs update\n"; unmerged_fmt = in_porcelain ? "U\t%s\n" : "%s: needs merge\n"; - enable_fscache(1); + enable_fscache(0); /* * Use the multi-threaded preload_index() to refresh most of the * cache entries quickly then in the single threaded loop below, @@ -1625,7 +1625,7 @@ int refresh_index(struct index_state *istate, unsigned int flags, display_progress(progress, istate->cache_nr); stop_progress(&progress); trace_performance_leave("refresh index"); - enable_fscache(0); + disable_fscache(); return has_errors; } From 3972d4c924084fd5c9366d032b1678040fb42062 Mon Sep 17 00:00:00 2001 From: Ben Peart Date: Thu, 4 Oct 2018 15:38:08 -0400 Subject: [PATCH 197/297] fscache: update fscache to be thread specific instead of global The threading model for fscache has been to have a single, global cache. This puts requirements on it to be thread safe so that callers like preload-index can call it from multiple threads. This was implemented with a single mutex and completion events which introduces contention between the calling threads. Simplify the threading model by making fscache thread specific. This allows us to remove the global mutex and synchronization events entirely and instead associate a fscache with every thread that requests one. This works well with the current multi-threading which divides the cache entries into blocks with a separate thread processing each block. At the end of each worker thread, if there is a fscache on the primary thread, merge the cached results from the worker into the primary thread cache. This enables us to reuse the cache later especially when scanning for untracked files. In testing, this reduced the time spent in preload_index() by about 25% and also reduced the CPU utilization significantly. On a repo with ~200K files, it reduced overall status times by ~12%. Signed-off-by: Ben Peart --- compat/win32/fscache.c | 294 +++++++++++++++++++++++++---------------- compat/win32/fscache.h | 22 ++- git-compat-util.h | 12 ++ preload-index.c | 8 +- 4 files changed, 215 insertions(+), 121 deletions(-) diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index bb485fad483ccf..b2513e6932e1e1 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -7,14 +7,24 @@ #include "../../trace.h" #include "config.h" -static int initialized; -static volatile long enabled; -static struct hashmap map; +static volatile long initialized; +static DWORD dwTlsIndex; static CRITICAL_SECTION mutex; -static unsigned int lstat_requests; -static unsigned int opendir_requests; -static unsigned int fscache_requests; -static unsigned int fscache_misses; + +/* + * Store one fscache per thread to avoid thread contention and locking. + * This is ok because multi-threaded access is 1) uncommon and 2) always + * splitting up the cache entries across multiple threads so there isn't + * any overlap between threads anyway. + */ +struct fscache { + volatile long enabled; + struct hashmap map; + unsigned int lstat_requests; + unsigned int opendir_requests; + unsigned int fscache_requests; + unsigned int fscache_misses; +}; static struct trace_key trace_fscache = TRACE_KEY_INIT(FSCACHE); /* @@ -34,8 +44,6 @@ struct fsentry { union { /* Reference count of the directory listing. */ volatile long refcnt; - /* Handle to wait on the loading thread. */ - HANDLE hwait; struct { /* More stat members (only used for file entries). */ off64_t st_size; @@ -253,86 +261,63 @@ static struct fsentry *fsentry_create_list(const struct fsentry *dir, /* * Adds a directory listing to the cache. */ -static void fscache_add(struct fsentry *fse) +static void fscache_add(struct fscache *cache, struct fsentry *fse) { if (fse->list) fse = fse->list; for (; fse; fse = fse->next) - hashmap_add(&map, &fse->ent); + hashmap_add(&cache->map, &fse->ent); } /* * Clears the cache. */ -static void fscache_clear(void) +static void fscache_clear(struct fscache *cache) { - hashmap_clear_and_free(&map, struct fsentry, ent); - hashmap_init(&map, (hashmap_cmp_fn)fsentry_cmp, NULL, 0); - lstat_requests = opendir_requests = 0; - fscache_misses = fscache_requests = 0; + hashmap_clear_and_free(&cache->map, struct fsentry, ent); + hashmap_init(&cache->map, (hashmap_cmp_fn)fsentry_cmp, NULL, 0); + cache->lstat_requests = cache->opendir_requests = 0; + cache->fscache_misses = cache->fscache_requests = 0; } /* * Checks if the cache is enabled for the given path. */ -int fscache_enabled(const char *path) +static int do_fscache_enabled(struct fscache *cache, const char *path) { - return enabled > 0 && !is_absolute_path(path); + return cache->enabled > 0 && !is_absolute_path(path); } -/* - * Looks up a cache entry, waits if its being loaded by another thread. - * The mutex must be owned by the calling thread. - */ -static struct fsentry *fscache_get_wait(struct fsentry *key) +int fscache_enabled(const char *path) { - struct fsentry *fse = hashmap_get_entry(&map, key, ent, NULL); - - /* return if its a 'real' entry (future entries have refcnt == 0) */ - if (!fse || fse->list || fse->u.refcnt) - return fse; - - /* create an event and link our key to the future entry */ - key->u.hwait = CreateEvent(NULL, TRUE, FALSE, NULL); - key->next = fse->next; - fse->next = key; - - /* wait for the loading thread to signal us */ - LeaveCriticalSection(&mutex); - WaitForSingleObject(key->u.hwait, INFINITE); - CloseHandle(key->u.hwait); - EnterCriticalSection(&mutex); + struct fscache *cache = fscache_getcache(); - /* repeat cache lookup */ - return hashmap_get_entry(&map, key, ent, NULL); + return cache ? do_fscache_enabled(cache, path) : 0; } /* * Looks up or creates a cache entry for the specified key. */ -static struct fsentry *fscache_get(struct fsentry *key) +static struct fsentry *fscache_get(struct fscache *cache, struct fsentry *key) { - struct fsentry *fse, *future, *waiter; + struct fsentry *fse; int dir_not_found; - EnterCriticalSection(&mutex); - fscache_requests++; + cache->fscache_requests++; /* check if entry is in cache */ - fse = fscache_get_wait(key); + fse = hashmap_get_entry(&cache->map, key, ent, NULL); if (fse) { if (fse->st_mode) fsentry_addref(fse); else fse = NULL; /* non-existing directory */ - LeaveCriticalSection(&mutex); return fse; } /* if looking for a file, check if directory listing is in cache */ if (!fse && key->list) { - fse = fscache_get_wait(key->list); + fse = hashmap_get_entry(&cache->map, key->list, ent, NULL); if (fse) { - LeaveCriticalSection(&mutex); /* * dir entry without file entry, or dir does not * exist -> file doesn't exist @@ -342,25 +327,8 @@ static struct fsentry *fscache_get(struct fsentry *key) } } - /* add future entry to indicate that we're loading it */ - future = key->list ? key->list : key; - future->next = NULL; - future->u.refcnt = 0; - hashmap_add(&map, &future->ent); - - /* create the directory listing (outside mutex!) */ - LeaveCriticalSection(&mutex); - fse = fsentry_create_list(future, &dir_not_found); - EnterCriticalSection(&mutex); - - /* remove future entry and signal waiting threads */ - hashmap_remove(&map, &future->ent, NULL); - waiter = future->next; - while (waiter) { - HANDLE h = waiter->u.hwait; - waiter = waiter->next; - SetEvent(h); - } + /* create the directory listing */ + fse = fsentry_create_list(key->list ? key->list : key, &dir_not_found); /* leave on error (errno set by fsentry_create_list) */ if (!fse) { @@ -374,19 +342,18 @@ static struct fsentry *fscache_get(struct fsentry *key) key->list->dirent.d_name, key->list->len); fse->st_mode = 0; - hashmap_add(&map, &fse->ent); + hashmap_add(&cache->map, &fse->ent); } - LeaveCriticalSection(&mutex); return NULL; } /* add directory listing to the cache */ - fscache_misses++; - fscache_add(fse); + cache->fscache_misses++; + fscache_add(cache, fse); /* lookup file entry if requested (fse already points to directory) */ if (key->list) - fse = hashmap_get_entry(&map, key, ent, NULL); + fse = hashmap_get_entry(&cache->map, key, ent, NULL); if (fse && !fse->st_mode) fse = NULL; /* non-existing directory */ @@ -397,59 +364,104 @@ static struct fsentry *fscache_get(struct fsentry *key) else errno = ENOENT; - LeaveCriticalSection(&mutex); return fse; } /* - * Enables or disables the cache. Note that the cache is read-only, changes to + * Enables the cache. Note that the cache is read-only, changes to * the working directory are NOT reflected in the cache while enabled. */ -int fscache_enable(int enable, size_t initial_size) +int fscache_enable(size_t initial_size) { - int result; + int fscache; + struct fscache *cache; + int result = 0; + + /* allow the cache to be disabled entirely */ + fscache = git_env_bool("GIT_TEST_FSCACHE", -1); + if (fscache != -1) + core_fscache = fscache; + if (!core_fscache) + return 0; + /* + * refcount the global fscache initialization so that the + * opendir and lstat function pointers are redirected if + * any threads are using the fscache. + */ if (!initialized) { - int fscache = git_env_bool("GIT_TEST_FSCACHE", -1); - - /* allow the cache to be disabled entirely */ - if (fscache != -1) - core_fscache = fscache; - if (!core_fscache) - return 0; - InitializeCriticalSection(&mutex); - lstat_requests = opendir_requests = 0; - fscache_misses = fscache_requests = 0; + if (!dwTlsIndex) { + dwTlsIndex = TlsAlloc(); + if (dwTlsIndex == TLS_OUT_OF_INDEXES) { + LeaveCriticalSection(&mutex); + return 0; + } + } + + /* redirect opendir and lstat to the fscache implementations */ + opendir = fscache_opendir; + lstat = fscache_lstat; + } + InterlockedIncrement(&initialized); + + /* refcount the thread specific initialization */ + cache = fscache_getcache(); + if (cache) { + InterlockedIncrement(&cache->enabled); + } else { + cache = (struct fscache *)xcalloc(1, sizeof(*cache)); + cache->enabled = 1; /* * avoid having to rehash by leaving room for the parent dirs. * '4' was determined empirically by testing several repos */ - hashmap_init(&map, (hashmap_cmp_fn) fsentry_cmp, NULL, initial_size * 4); - initialized = 1; + hashmap_init(&cache->map, (hashmap_cmp_fn)fsentry_cmp, NULL, initial_size * 4); + if (!TlsSetValue(dwTlsIndex, cache)) + BUG("TlsSetValue error"); } - result = enable ? InterlockedIncrement(&enabled) - : InterlockedDecrement(&enabled); + trace_printf_key(&trace_fscache, "fscache: enable\n"); + return result; +} - if (enable && result == 1) { - /* redirect opendir and lstat to the fscache implementations */ - opendir = fscache_opendir; - lstat = fscache_lstat; - } else if (!enable && !result) { +/* + * Disables the cache. + */ +void fscache_disable(void) +{ + struct fscache *cache; + + if (!core_fscache) + return; + + /* update the thread specific fscache initialization */ + cache = fscache_getcache(); + if (!cache) + BUG("fscache_disable() called on a thread where fscache has not been initialized"); + if (!cache->enabled) + BUG("fscache_disable() called on an fscache that is already disabled"); + InterlockedDecrement(&cache->enabled); + if (!cache->enabled) { + TlsSetValue(dwTlsIndex, NULL); + trace_printf_key(&trace_fscache, "fscache_disable: lstat %u, opendir %u, " + "total requests/misses %u/%u\n", + cache->lstat_requests, cache->opendir_requests, + cache->fscache_requests, cache->fscache_misses); + fscache_clear(cache); + free(cache); + } + + /* update the global fscache initialization */ + InterlockedDecrement(&initialized); + if (!initialized) { /* reset opendir and lstat to the original implementations */ opendir = dirent_opendir; lstat = mingw_lstat; - EnterCriticalSection(&mutex); - trace_printf_key(&trace_fscache, "fscache: lstat %u, opendir %u, " - "total requests/misses %u/%u\n", - lstat_requests, opendir_requests, - fscache_requests, fscache_misses); - fscache_clear(); - LeaveCriticalSection(&mutex); } - trace_printf_key(&trace_fscache, "fscache: enable(%d)\n", enable); - return result; + + trace_printf_key(&trace_fscache, "fscache: disable\n"); + return; } /* @@ -457,10 +469,10 @@ int fscache_enable(int enable, size_t initial_size) */ void fscache_flush(void) { - if (enabled) { - EnterCriticalSection(&mutex); - fscache_clear(); - LeaveCriticalSection(&mutex); + struct fscache *cache = fscache_getcache(); + + if (cache && cache->enabled) { + fscache_clear(cache); } } @@ -473,11 +485,12 @@ int fscache_lstat(const char *filename, struct stat *st) int dirlen, base, len; struct heap_fsentry key[2]; struct fsentry *fse; + struct fscache *cache = fscache_getcache(); - if (!fscache_enabled(filename)) + if (!cache || !do_fscache_enabled(cache, filename)) return mingw_lstat(filename, st); - lstat_requests++; + cache->lstat_requests++; /* split filename into path + name */ len = strlen(filename); if (len && is_dir_sep(filename[len - 1])) @@ -490,7 +503,7 @@ int fscache_lstat(const char *filename, struct stat *st) /* lookup entry for path + name in cache */ fsentry_init(&key[0].u.ent, NULL, filename, dirlen); fsentry_init(&key[1].u.ent, &key[0].u.ent, filename + base, len - base); - fse = fscache_get(&key[1].u.ent); + fse = fscache_get(cache, &key[1].u.ent); if (!fse) { errno = ENOENT; return -1; @@ -555,11 +568,12 @@ DIR *fscache_opendir(const char *dirname) struct fsentry *list; fscache_DIR *dir; int len; + struct fscache *cache = fscache_getcache(); - if (!fscache_enabled(dirname)) + if (!cache || !do_fscache_enabled(cache, dirname)) return dirent_opendir(dirname); - opendir_requests++; + cache->opendir_requests++; /* prepare name (strip trailing '/', replace '.') */ len = strlen(dirname); if ((len == 1 && dirname[0] == '.') || @@ -568,7 +582,7 @@ DIR *fscache_opendir(const char *dirname) /* get directory listing from cache */ fsentry_init(&key.u.ent, NULL, dirname, len); - list = fscache_get(&key.u.ent); + list = fscache_get(cache, &key.u.ent); if (!list) return NULL; @@ -579,3 +593,53 @@ DIR *fscache_opendir(const char *dirname) dir->pfsentry = list; return (DIR*) dir; } + +struct fscache *fscache_getcache(void) +{ + return (struct fscache *)TlsGetValue(dwTlsIndex); +} + +void fscache_merge(struct fscache *dest) +{ + struct hashmap_iter iter; + struct hashmap_entry *e; + struct fscache *cache = fscache_getcache(); + + /* + * Only do the merge if fscache was enabled and we have a dest + * cache to merge into. + */ + if (!dest) { + fscache_enable(0); + return; + } + if (!cache) + BUG("fscache_merge() called on a thread where fscache has not been initialized"); + + TlsSetValue(dwTlsIndex, NULL); + trace_printf_key(&trace_fscache, "fscache_merge: lstat %u, opendir %u, " + "total requests/misses %u/%u\n", + cache->lstat_requests, cache->opendir_requests, + cache->fscache_requests, cache->fscache_misses); + + /* + * This is only safe because the primary thread we're merging into + * isn't being used so the critical section only needs to prevent + * the the child threads from stomping on each other. + */ + EnterCriticalSection(&mutex); + + hashmap_iter_init(&cache->map, &iter); + while ((e = hashmap_iter_next(&iter))) + hashmap_add(&dest->map, e); + + dest->lstat_requests += cache->lstat_requests; + dest->opendir_requests += cache->opendir_requests; + dest->fscache_requests += cache->fscache_requests; + dest->fscache_misses += cache->fscache_misses; + LeaveCriticalSection(&mutex); + + free(cache); + + InterlockedDecrement(&initialized); +} diff --git a/compat/win32/fscache.h b/compat/win32/fscache.h index d49c9381114da6..2eb8bf3f5cfee8 100644 --- a/compat/win32/fscache.h +++ b/compat/win32/fscache.h @@ -1,9 +1,16 @@ #ifndef FSCACHE_H #define FSCACHE_H -int fscache_enable(int enable, size_t initial_size); -#define enable_fscache(initial_size) fscache_enable(1, initial_size) -#define disable_fscache() fscache_enable(0, 0) +/* + * The fscache is thread specific. enable_fscache() must be called + * for each thread where caching is desired. + */ + +int fscache_enable(size_t initial_size); +#define enable_fscache(initial_size) fscache_enable(initial_size) + +void fscache_disable(void); +#define disable_fscache() fscache_disable() int fscache_enabled(const char *path); #define is_fscache_enabled(path) fscache_enabled(path) @@ -14,4 +21,13 @@ void fscache_flush(void); DIR *fscache_opendir(const char *dir); int fscache_lstat(const char *file_name, struct stat *buf); +/* opaque fscache structure */ +struct fscache; + +struct fscache *fscache_getcache(void); +#define getcache_fscache() fscache_getcache() + +void fscache_merge(struct fscache *dest); +#define merge_fscache(dest) fscache_merge(dest) + #endif diff --git a/git-compat-util.h b/git-compat-util.h index 7c5a95bc55403e..c9b4513c124b5a 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -1533,6 +1533,10 @@ static inline int is_missing_file_error(int errno_) * data or even file content without the need to synchronize with the file * system. */ + + /* opaque fscache structure */ +struct fscache; + #ifndef enable_fscache #define enable_fscache(x) /* noop */ #endif @@ -1549,6 +1553,14 @@ static inline int is_missing_file_error(int errno_) #define flush_fscache() /* noop */ #endif +#ifndef getcache_fscache +#define getcache_fscache() (NULL) /* noop */ +#endif + +#ifndef merge_fscache +#define merge_fscache(dest) /* noop */ +#endif + int cmd_main(int, const char **); /* diff --git a/preload-index.c b/preload-index.c index 93f74d44e2a023..24bb9d0e1ded61 100644 --- a/preload-index.c +++ b/preload-index.c @@ -16,6 +16,8 @@ #include "symlinks.h" #include "trace2.h" +static struct fscache *fscache; + /* * Mostly randomly chosen maximum thread counts: we * cap the parallelism to 20 threads, and we want @@ -53,6 +55,7 @@ static void *preload_thread(void *_data) nr = index->cache_nr - p->offset; last_nr = nr; + enable_fscache(nr); do { struct cache_entry *ce = *cep++; struct stat st; @@ -96,6 +99,7 @@ static void *preload_thread(void *_data) pthread_mutex_unlock(&pd->mutex); } cache_def_clear(&cache); + merge_fscache(fscache); return NULL; } @@ -111,6 +115,7 @@ void preload_index(struct index_state *index, if (!HAVE_THREADS || !core_preload_index) return; + fscache = getcache_fscache(); threads = index->cache_nr / THREAD_COST; if ((index->cache_nr > 1) && (threads < 2) && git_env_bool("GIT_TEST_PRELOAD_INDEX", 0)) threads = 2; @@ -132,7 +137,6 @@ void preload_index(struct index_state *index, pthread_mutex_init(&pd.mutex, NULL); } - enable_fscache(index->cache_nr); for (i = 0; i < threads; i++) { struct thread_data *p = data+i; int err; @@ -168,8 +172,6 @@ void preload_index(struct index_state *index, trace2_data_intmax("index", NULL, "preload/sum_lstat", t2_sum_lstat); trace2_region_leave("index", "preload", NULL); - - disable_fscache(); } int repo_read_index_preload(struct repository *repo, From f16ef3592653d4d7bce34f8b9764d1f3ee965424 Mon Sep 17 00:00:00 2001 From: Takuto Ikuta Date: Wed, 22 Nov 2017 20:39:38 +0900 Subject: [PATCH 198/297] fetch-pack.c: enable fscache for stats under .git/objects When I do git fetch, git call file stats under .git/objects for each refs. This takes time when there are many refs. By enabling fscache, git takes file stats by directory traversing and that improved the speed of fetch-pack for repository having large number of refs. In my windows workstation, this improves the time of `git fetch` for chromium repository like below. I took stats 3 times. * With this patch TotalSeconds: 9.9825165 TotalSeconds: 9.1862075 TotalSeconds: 10.1956256 Avg: 9.78811653333333 * Without this patch TotalSeconds: 15.8406702 TotalSeconds: 15.6248053 TotalSeconds: 15.2085938 Avg: 15.5580231 Signed-off-by: Takuto Ikuta Signed-off-by: Johannes Schindelin --- fetch-pack.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fetch-pack.c b/fetch-pack.c index 732511604b1a6c..89279c73d6b05b 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -757,6 +757,7 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator, save_commit_buffer = 0; trace2_region_enter("fetch-pack", "parse_remote_refs_and_find_cutoff", NULL); + enable_fscache(1); for (ref = *refs; ref; ref = ref->next) { struct commit *commit; @@ -783,6 +784,7 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator, if (!cutoff || cutoff < commit->date) cutoff = commit->date; } + enable_fscache(0); trace2_region_leave("fetch-pack", "parse_remote_refs_and_find_cutoff", NULL); /* From 0b3cb8680a722919eac615eb0581ee9ea1f36d0b Mon Sep 17 00:00:00 2001 From: Takuto Ikuta Date: Tue, 30 Jan 2018 22:42:58 +0900 Subject: [PATCH 199/297] checkout.c: enable fscache for checkout again This is retry of #1419. I added flush_fscache macro to flush cached stats after disk writing with tests for regression reported in #1438 and #1442. git checkout checks each file path in sorted order, so cache flushing does not make performance worse unless we have large number of modified files in a directory containing many files. Using chromium repository, I tested `git checkout .` performance when I delete 10 files in different directories. With this patch: TotalSeconds: 4.307272 TotalSeconds: 4.4863595 TotalSeconds: 4.2975562 Avg: 4.36372923333333 Without this patch: TotalSeconds: 20.9705431 TotalSeconds: 22.4867685 TotalSeconds: 18.8968292 Avg: 20.7847136 I confirmed this patch passed all tests in t/ with core_fscache=1. Signed-off-by: Takuto Ikuta --- builtin/checkout.c | 2 ++ compat/win32/fscache.c | 12 ++++++++++++ compat/win32/fscache.h | 3 +++ entry.c | 3 +++ git-compat-util.h | 4 ++++ parallel-checkout.c | 1 + t/t7201-co.sh | 36 ++++++++++++++++++++++++++++++++++++ 7 files changed, 61 insertions(+) diff --git a/builtin/checkout.c b/builtin/checkout.c index 1748d68c961f6b..e19edfc0507f2f 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -403,6 +403,7 @@ static int checkout_worktree(const struct checkout_opts *opts, if (pc_workers > 1) init_parallel_checkout(); + enable_fscache(1); for (pos = 0; pos < the_repository->index->cache_nr; pos++) { struct cache_entry *ce = the_repository->index->cache[pos]; if (ce->ce_flags & CE_MATCHED) { @@ -428,6 +429,7 @@ static int checkout_worktree(const struct checkout_opts *opts, errs |= run_parallel_checkout(&state, pc_workers, pc_threshold, NULL, NULL); mem_pool_discard(&ce_mem_pool, should_validate_cache_entries()); + enable_fscache(0); remove_marked_cache_entries(the_repository->index, 1); remove_scheduled_dirs(); errs |= finish_delayed_checkout(&state, opts->show_progress); diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index 88907836890af1..932e90ad4da4f7 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -428,6 +428,18 @@ int fscache_enable(int enable) return result; } +/* + * Flush cached stats result when fscache is enabled. + */ +void fscache_flush(void) +{ + if (enabled) { + EnterCriticalSection(&mutex); + fscache_clear(); + LeaveCriticalSection(&mutex); + } +} + /* * Lstat replacement, uses the cache if enabled, otherwise redirects to * mingw_lstat. diff --git a/compat/win32/fscache.h b/compat/win32/fscache.h index 660ada053b4309..2f06f8df97dcd0 100644 --- a/compat/win32/fscache.h +++ b/compat/win32/fscache.h @@ -7,6 +7,9 @@ int fscache_enable(int enable); int fscache_enabled(const char *path); #define is_fscache_enabled(path) fscache_enabled(path) +void fscache_flush(void); +#define flush_fscache() fscache_flush() + DIR *fscache_opendir(const char *dir); int fscache_lstat(const char *file_name, struct stat *buf); diff --git a/entry.c b/entry.c index e7ed440ce2277e..75fdd39ff7db40 100644 --- a/entry.c +++ b/entry.c @@ -407,6 +407,9 @@ static int write_entry(struct cache_entry *ce, char *path, struct conv_attrs *ca } finish: + /* Flush cached lstat in fscache after writing to disk. */ + flush_fscache(); + if (state->refresh_cache) { if (!fstat_done && lstat(ce->name, &st) < 0) return error_errno("unable to stat just-written file %s", diff --git a/git-compat-util.h b/git-compat-util.h index 3bd5804b26930a..529ab6a9834f7f 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -1541,6 +1541,10 @@ static inline int is_missing_file_error(int errno_) #define is_fscache_enabled(path) (0) #endif +#ifndef flush_fscache +#define flush_fscache() /* noop */ +#endif + int cmd_main(int, const char **); /* diff --git a/parallel-checkout.c b/parallel-checkout.c index 08b960aac84b0d..1b5f904f7b7d9f 100644 --- a/parallel-checkout.c +++ b/parallel-checkout.c @@ -636,6 +636,7 @@ static void write_items_sequentially(struct checkout *state) { size_t i; + flush_fscache(); for (i = 0; i < parallel_checkout.nr; i++) { struct parallel_checkout_item *pc_item = ¶llel_checkout.items[i]; write_pc_item(pc_item, state); diff --git a/t/t7201-co.sh b/t/t7201-co.sh index 2d984eb4c6a167..3bbc195c2a3169 100755 --- a/t/t7201-co.sh +++ b/t/t7201-co.sh @@ -36,6 +36,42 @@ fill () { } +test_expect_success MINGW 'fscache flush cache' ' + + git init fscache-test && + cd fscache-test && + git config core.fscache 1 && + echo A > test.txt && + git add test.txt && + git commit -m A && + echo B >> test.txt && + git checkout . && + test -z "$(git status -s)" && + echo A > expect.txt && + test_cmp expect.txt test.txt && + cd .. && + rm -rf fscache-test +' + +test_expect_success MINGW 'fscache flush cache dir' ' + + git init fscache-test && + cd fscache-test && + git config core.fscache 1 && + echo A > test.txt && + git add test.txt && + git commit -m A && + rm test.txt && + mkdir test.txt && + touch test.txt/test.txt && + git checkout . && + test -z "$(git status -s)" && + echo A > expect.txt && + test_cmp expect.txt test.txt && + cd .. && + rm -rf fscache-test +' + test_expect_success setup ' fill x y z >same && fill 1 2 3 4 5 6 7 8 >one && From 56cf1702b62e1d6787fdc7f34745ec8451a4d3d1 Mon Sep 17 00:00:00 2001 From: Ben Peart Date: Fri, 7 Sep 2018 11:39:57 -0400 Subject: [PATCH 200/297] Enable the filesystem cache (fscache) in refresh_index(). On file systems that support it, this can dramatically speed up operations like add, commit, describe, rebase, reset, rm that would otherwise have to lstat() every file to "re-match" the stat information in the index to that of the file system. On a synthetic repo with 1M files, "git reset" dropped from 52.02 seconds to 14.42 seconds for a savings of 72%. Signed-off-by: Ben Peart --- read-cache.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/read-cache.c b/read-cache.c index 1f67bb755b1cd1..009343eda25273 100644 --- a/read-cache.c +++ b/read-cache.c @@ -1530,6 +1530,7 @@ int refresh_index(struct index_state *istate, unsigned int flags, typechange_fmt = in_porcelain ? "T\t%s\n" : "%s: needs update\n"; added_fmt = in_porcelain ? "A\t%s\n" : "%s: needs update\n"; unmerged_fmt = in_porcelain ? "U\t%s\n" : "%s: needs merge\n"; + enable_fscache(1); /* * Use the multi-threaded preload_index() to refresh most of the * cache entries quickly then in the single threaded loop below, @@ -1624,6 +1625,7 @@ int refresh_index(struct index_state *istate, unsigned int flags, display_progress(progress, istate->cache_nr); stop_progress(&progress); trace_performance_leave("refresh index"); + enable_fscache(0); return has_errors; } From 747bfe0ea461df921a31b22a9d6ceffa75c831c7 Mon Sep 17 00:00:00 2001 From: Ben Peart Date: Tue, 23 Oct 2018 11:42:06 -0400 Subject: [PATCH 201/297] fscache: use FindFirstFileExW to avoid retrieving the short name Use FindFirstFileExW with FindExInfoBasic to avoid forcing NTFS to look up the short name. Also switch to a larger (64K vs 4K) buffer using FIND_FIRST_EX_LARGE_FETCH to minimize round trips to the kernel. In a repo with ~200K files, this drops warm cache status times from 3.19 seconds to 2.67 seconds for a 16% savings. Signed-off-by: Ben Peart --- compat/win32/fscache.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index 932e90ad4da4f7..7bf44e0f3a76d2 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -208,7 +208,8 @@ static struct fsentry *fsentry_create_list(const struct fsentry *dir, pattern[wlen] = 0; /* open find handle */ - h = FindFirstFileW(pattern, &fdata); + h = FindFirstFileExW(pattern, FindExInfoBasic, &fdata, FindExSearchNameMatch, + NULL, FIND_FIRST_EX_LARGE_FETCH); if (h == INVALID_HANDLE_VALUE) { err = GetLastError(); *dir_not_found = 1; /* or empty directory */ From d45c22b661bb2660d6f30aed0278eceaf6b51671 Mon Sep 17 00:00:00 2001 From: Ben Peart Date: Thu, 1 Nov 2018 11:40:51 -0400 Subject: [PATCH 202/297] status: disable and free fscache at the end of the status command At the end of the status command, disable and free the fscache so that we don't leak the memory and so that we can dump the fscache statistics. Signed-off-by: Ben Peart --- builtin/commit.c | 1 + 1 file changed, 1 insertion(+) diff --git a/builtin/commit.c b/builtin/commit.c index 794ea24d9ab9a0..1f61c538faf84e 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -1609,6 +1609,7 @@ int cmd_status(int argc, const char **argv, const char *prefix) wt_status_print(&s); wt_status_collect_free_buffers(&s); + enable_fscache(0); return 0; } From fc7b9bfc8e17a8132f193a04cfa0f7c1793ac9ae Mon Sep 17 00:00:00 2001 From: Ben Peart Date: Thu, 4 Oct 2018 18:10:21 -0400 Subject: [PATCH 203/297] fscache: add GIT_TEST_FSCACHE support Add support to fscache to enable running the entire test suite with the fscache enabled. Signed-off-by: Ben Peart --- compat/win32/fscache.c | 5 +++++ t/README | 3 +++ 2 files changed, 8 insertions(+) diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index 7bf44e0f3a76d2..029c1fb1693911 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -5,6 +5,7 @@ #include "../../dir.h" #include "../../abspath.h" #include "../../trace.h" +#include "config.h" static int initialized; static volatile long enabled; @@ -401,7 +402,11 @@ int fscache_enable(int enable) int result; if (!initialized) { + int fscache = git_env_bool("GIT_TEST_FSCACHE", -1); + /* allow the cache to be disabled entirely */ + if (fscache != -1) + core_fscache = fscache; if (!core_fscache) return 0; diff --git a/t/README b/t/README index 724ee58195dfae..cd884fdcd4dfe2 100644 --- a/t/README +++ b/t/README @@ -488,6 +488,9 @@ a test and then fails then the whole test run will abort. This can help to make sure the expected tests are executed and not silently skipped when their dependency breaks or is simply not present in a new environment. +GIT_TEST_FSCACHE= exercises the uncommon fscache code path +which adds a cache below mingw's lstat and dirent implementations. + Naming Tests ------------ From 284eb3e9686d813485c19a7e1f524f867ce2cdd1 Mon Sep 17 00:00:00 2001 From: Ben Peart Date: Tue, 25 Sep 2018 16:28:16 -0400 Subject: [PATCH 204/297] fscache: add fscache hit statistics Track fscache hits and misses for lstat and opendir requests. Reporting of statistics is done when the cache is disabled for the last time and freed and is only reported if GIT_TRACE_FSCACHE is set. Sample output is: 11:33:11.836428 compat/win32/fscache.c:433 fscache: lstat 3775, opendir 263, total requests/misses 4052/269 Signed-off-by: Ben Peart --- compat/win32/fscache.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index 029c1fb1693911..0f0225981ef4a4 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -11,6 +11,10 @@ static int initialized; static volatile long enabled; static struct hashmap map; static CRITICAL_SECTION mutex; +static unsigned int lstat_requests; +static unsigned int opendir_requests; +static unsigned int fscache_requests; +static unsigned int fscache_misses; static struct trace_key trace_fscache = TRACE_KEY_INIT(FSCACHE); /* @@ -265,6 +269,8 @@ static void fscache_clear(void) { hashmap_clear_and_free(&map, struct fsentry, ent); hashmap_init(&map, (hashmap_cmp_fn)fsentry_cmp, NULL, 0); + lstat_requests = opendir_requests = 0; + fscache_misses = fscache_requests = 0; } /* @@ -311,6 +317,7 @@ static struct fsentry *fscache_get(struct fsentry *key) int dir_not_found; EnterCriticalSection(&mutex); + fscache_requests++; /* check if entry is in cache */ fse = fscache_get_wait(key); if (fse) { @@ -374,6 +381,7 @@ static struct fsentry *fscache_get(struct fsentry *key) } /* add directory listing to the cache */ + fscache_misses++; fscache_add(fse); /* lookup file entry if requested (fse already points to directory) */ @@ -411,6 +419,8 @@ int fscache_enable(int enable) return 0; InitializeCriticalSection(&mutex); + lstat_requests = opendir_requests = 0; + fscache_misses = fscache_requests = 0; hashmap_init(&map, (hashmap_cmp_fn) fsentry_cmp, NULL, 0); initialized = 1; } @@ -427,6 +437,10 @@ int fscache_enable(int enable) opendir = dirent_opendir; lstat = mingw_lstat; EnterCriticalSection(&mutex); + trace_printf_key(&trace_fscache, "fscache: lstat %u, opendir %u, " + "total requests/misses %u/%u\n", + lstat_requests, opendir_requests, + fscache_requests, fscache_misses); fscache_clear(); LeaveCriticalSection(&mutex); } @@ -459,6 +473,7 @@ int fscache_lstat(const char *filename, struct stat *st) if (!fscache_enabled(filename)) return mingw_lstat(filename, st); + lstat_requests++; /* split filename into path + name */ len = strlen(filename); if (len && is_dir_sep(filename[len - 1])) @@ -540,6 +555,7 @@ DIR *fscache_opendir(const char *dirname) if (!fscache_enabled(dirname)) return dirent_opendir(dirname); + opendir_requests++; /* prepare name (strip trailing '/', replace '.') */ len = strlen(dirname); if ((len == 1 && dirname[0] == '.') || From dd188facdea2ecef88ab483b67f92dc8a707e0e6 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 11 Dec 2018 12:59:29 +0100 Subject: [PATCH 205/297] fscache: remember the reparse tag for each entry We will use this in the next commit to implement an FSCache-aware version of is_mount_point(). Signed-off-by: Johannes Schindelin --- compat/win32/fscache.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index be701880743224..4e550e4811757e 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -46,6 +46,7 @@ static struct trace_key trace_fscache = TRACE_KEY_INIT(FSCACHE); struct fsentry { struct hashmap_entry ent; mode_t st_mode; + ULONG reparse_tag; /* Pointer to the directory listing, or NULL for the listing itself. */ struct fsentry *list; /* Pointer to the next file entry of the list. */ @@ -197,6 +198,10 @@ static struct fsentry *fseentry_create_entry(struct fscache *cache, fse = fsentry_alloc(cache, list, buf, len); + fse->reparse_tag = + fdata->FileAttributes & FILE_ATTRIBUTE_REPARSE_POINT ? + fdata->EaSize : 0; + fse->st_mode = file_attr_to_st_mode(fdata->FileAttributes); fse->dirent.d_type = S_ISDIR(fse->st_mode) ? DT_DIR : DT_REG; fse->u.s.st_size = fdata->EndOfFile.LowPart | From 949503d82384c083638a308f8280b0180049384b Mon Sep 17 00:00:00 2001 From: Heiko Voigt Date: Sun, 21 Feb 2010 21:05:04 +0100 Subject: [PATCH 206/297] git-gui: provide question helper for retry fallback on Windows Make use of the new environment variable GIT_ASK_YESNO to support the recently implemented fallback in case unlink, rename or rmdir fail for files in use on Windows. The added dialog will present a yes/no question to the the user which will currently be used by the windows compat layer to let the user retry a failed file operation. Signed-off-by: Heiko Voigt --- git-gui/Makefile | 2 ++ git-gui/git-gui--askyesno | 51 +++++++++++++++++++++++++++++++++++++++ git-gui/git-gui.sh | 3 +++ 3 files changed, 56 insertions(+) create mode 100755 git-gui/git-gui--askyesno diff --git a/git-gui/Makefile b/git-gui/Makefile index 667c39ed564a55..39a330efb67469 100644 --- a/git-gui/Makefile +++ b/git-gui/Makefile @@ -280,6 +280,7 @@ install: all $(QUIET)$(INSTALL_D0)'$(DESTDIR_SQ)$(gitexecdir_SQ)' $(INSTALL_D1) $(QUIET)$(INSTALL_X0)git-gui $(INSTALL_X1) '$(DESTDIR_SQ)$(gitexecdir_SQ)' $(QUIET)$(INSTALL_X0)git-gui--askpass $(INSTALL_X1) '$(DESTDIR_SQ)$(gitexecdir_SQ)' + $(QUIET)$(INSTALL_X0)git-gui--askyesno $(INSTALL_X1) '$(DESTDIR_SQ)$(gitexecdir_SQ)' $(QUIET)$(foreach p,$(GITGUI_BUILT_INS), $(INSTALL_L0)'$(DESTDIR_SQ)$(gitexecdir_SQ)/$p' $(INSTALL_L1)'$(DESTDIR_SQ)$(gitexecdir_SQ)/git-gui' $(INSTALL_L2)'$(DESTDIR_SQ)$(gitexecdir_SQ)/$p' $(INSTALL_L3) &&) true ifdef GITGUI_WINDOWS_WRAPPER $(QUIET)$(INSTALL_R0)git-gui.tcl $(INSTALL_R1) '$(DESTDIR_SQ)$(gitexecdir_SQ)' @@ -298,6 +299,7 @@ uninstall: $(QUIET)$(CLEAN_DST) '$(DESTDIR_SQ)$(gitexecdir_SQ)' $(QUIET)$(REMOVE_F0)'$(DESTDIR_SQ)$(gitexecdir_SQ)'/git-gui $(REMOVE_F1) $(QUIET)$(REMOVE_F0)'$(DESTDIR_SQ)$(gitexecdir_SQ)'/git-gui--askpass $(REMOVE_F1) + $(QUIET)$(REMOVE_F0)'$(DESTDIR_SQ)$(gitexecdir_SQ)'/git-gui--askyesno $(REMOVE_F1) $(QUIET)$(foreach p,$(GITGUI_BUILT_INS), $(REMOVE_F0)'$(DESTDIR_SQ)$(gitexecdir_SQ)'/$p $(REMOVE_F1) &&) true ifdef GITGUI_WINDOWS_WRAPPER $(QUIET)$(REMOVE_F0)'$(DESTDIR_SQ)$(gitexecdir_SQ)'/git-gui.tcl $(REMOVE_F1) diff --git a/git-gui/git-gui--askyesno b/git-gui/git-gui--askyesno new file mode 100755 index 00000000000000..2a6e6fd11122f5 --- /dev/null +++ b/git-gui/git-gui--askyesno @@ -0,0 +1,51 @@ +#!/bin/sh +# Tcl ignores the next line -*- tcl -*- \ +exec wish "$0" -- "$@" + +# This is an implementation of a simple yes no dialog +# which is injected into the git commandline by git gui +# in case a yesno question needs to be answered. + +set NS {} +set use_ttk [package vsatisfies [package provide Tk] 8.5] +if {$use_ttk} { + set NS ttk +} + +if {$argc < 1} { + puts stderr "Usage: $argv0 " + exit 1 +} else { + set prompt [join $argv " "] +} + +${NS}::frame .t +${NS}::label .t.m -text $prompt -justify center -width 40 +.t.m configure -wraplength 400 +pack .t.m -side top -fill x -padx 20 -pady 20 -expand 1 +pack .t -side top -fill x -ipadx 20 -ipady 20 -expand 1 + +${NS}::frame .b +${NS}::frame .b.left -width 200 +${NS}::button .b.yes -text Yes -command yes +${NS}::button .b.no -text No -command no + + +pack .b.left -side left -expand 1 -fill x +pack .b.yes -side left -expand 1 +pack .b.no -side right -expand 1 -ipadx 5 +pack .b -side bottom -fill x -ipadx 20 -ipady 15 + +bind . {exit 0} +bind . {exit 1} + +proc no {} { + exit 1 +} + +proc yes {} { + exit 0 +} + +wm title . "Question?" +tk::PlaceWindow . diff --git a/git-gui/git-gui.sh b/git-gui/git-gui.sh index 514b7192a0640e..c8d8e1e2caef99 100755 --- a/git-gui/git-gui.sh +++ b/git-gui/git-gui.sh @@ -1248,6 +1248,9 @@ set have_tk85 [expr {[package vcompare $tk_version "8.5"] >= 0}] if {![info exists env(SSH_ASKPASS)]} { set env(SSH_ASKPASS) [gitexec git-gui--askpass] } +if {![info exists env(GIT_ASK_YESNO)]} { + set env(GIT_ASK_YESNO) [gitexec git-gui--askyesno] +} ###################################################################### ## From 87a73ce92e993d597f6207f4fd8fd37b21b12c84 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 11 Dec 2018 12:17:49 +0100 Subject: [PATCH 207/297] fscache: implement an FSCache-aware is_mount_point() When FSCache is active, we can cache the reparse tag and use it directly to determine whether a path refers to an NTFS junction, without any additional, costly I/O. Note: this change only makes a difference with the next commit, which will make use of the FSCache in `git clean` (contingent on `core.fscache` set, of course). Signed-off-by: Johannes Schindelin --- compat/mingw.c | 2 ++ compat/mingw.h | 3 ++- compat/win32/fscache.c | 35 +++++++++++++++++++++++++++++++++++ compat/win32/fscache.h | 1 + 4 files changed, 40 insertions(+), 1 deletion(-) diff --git a/compat/mingw.c b/compat/mingw.c index 75d9e79de68f21..c1ca68aab11a6a 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -2778,6 +2778,8 @@ pid_t waitpid(pid_t pid, int *status, int options) return -1; } +int (*win32_is_mount_point)(struct strbuf *path) = mingw_is_mount_point; + int mingw_is_mount_point(struct strbuf *path) { WIN32_FIND_DATAW findbuf = { 0 }; diff --git a/compat/mingw.h b/compat/mingw.h index 946ff707327b4e..97cdb8d51631ea 100644 --- a/compat/mingw.h +++ b/compat/mingw.h @@ -476,7 +476,8 @@ static inline void convert_slashes(char *path) } struct strbuf; int mingw_is_mount_point(struct strbuf *path); -#define is_mount_point mingw_is_mount_point +extern int (*win32_is_mount_point)(struct strbuf *path); +#define is_mount_point win32_is_mount_point #define CAN_UNLINK_MOUNT_POINTS 1 #define PATH_SEP ';' char *mingw_query_user_email(void); diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index 4e550e4811757e..da19ba0994de58 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -469,6 +469,7 @@ int fscache_enable(size_t initial_size) /* redirect opendir and lstat to the fscache implementations */ opendir = fscache_opendir; lstat = fscache_lstat; + win32_is_mount_point = fscache_is_mount_point; } initialized++; LeaveCriticalSection(&fscache_cs); @@ -529,6 +530,7 @@ void fscache_disable(void) /* reset opendir and lstat to the original implementations */ opendir = dirent_opendir; lstat = mingw_lstat; + win32_is_mount_point = mingw_is_mount_point; } LeaveCriticalSection(&fscache_cs); @@ -599,6 +601,39 @@ int fscache_lstat(const char *filename, struct stat *st) return 0; } +/* + * is_mount_point() replacement, uses cache if enabled, otherwise falls + * back to mingw_is_mount_point(). + */ +int fscache_is_mount_point(struct strbuf *path) +{ + int dirlen, base, len; + struct heap_fsentry key[2]; + struct fsentry *fse; + struct fscache *cache = fscache_getcache(); + + if (!cache || !do_fscache_enabled(cache, path->buf)) + return mingw_is_mount_point(path); + + cache->lstat_requests++; + /* split path into path + name */ + len = path->len; + if (len && is_dir_sep(path->buf[len - 1])) + len--; + base = len; + while (base && !is_dir_sep(path->buf[base - 1])) + base--; + dirlen = base ? base - 1 : 0; + + /* lookup entry for path + name in cache */ + fsentry_init(&key[0].u.ent, NULL, path->buf, dirlen); + fsentry_init(&key[1].u.ent, &key[0].u.ent, path->buf + base, len - base); + fse = fscache_get(cache, &key[1].u.ent); + if (!fse) + return mingw_is_mount_point(path); + return fse->reparse_tag == IO_REPARSE_TAG_MOUNT_POINT; +} + typedef struct fscache_DIR { struct DIR base_dir; /* extend base struct DIR */ struct fsentry *pfsentry; diff --git a/compat/win32/fscache.h b/compat/win32/fscache.h index 042b247a542554..386c770a85d321 100644 --- a/compat/win32/fscache.h +++ b/compat/win32/fscache.h @@ -22,6 +22,7 @@ void fscache_flush(void); DIR *fscache_opendir(const char *dir); int fscache_lstat(const char *file_name, struct stat *buf); +int fscache_is_mount_point(struct strbuf *path); /* opaque fscache structure */ struct fscache; From d74467f9ce87a779a4ff1df2424c453f7f436fec Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 20 Sep 2017 21:52:28 +0200 Subject: [PATCH 208/297] git-gui--askyesno: fix funny text wrapping The text wrapping seems to be aligned to the right side of the Yes button, leaving an awful lot of empty space. Let's try to counter this by using pixel units. Signed-off-by: Johannes Schindelin --- git-gui/git-gui--askyesno | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/git-gui/git-gui--askyesno b/git-gui/git-gui--askyesno index 2a6e6fd11122f5..cf9c990d0919b3 100755 --- a/git-gui/git-gui--askyesno +++ b/git-gui/git-gui--askyesno @@ -20,8 +20,8 @@ if {$argc < 1} { } ${NS}::frame .t -${NS}::label .t.m -text $prompt -justify center -width 40 -.t.m configure -wraplength 400 +${NS}::label .t.m -text $prompt -justify center -width 400px +.t.m configure -wraplength 400px pack .t.m -side top -fill x -padx 20 -pady 20 -expand 1 pack .t -side top -fill x -ipadx 20 -ipady 20 -expand 1 From 26cb4638523a05855411fbdec3dab4ee1d3585cc Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 23 Jul 2010 18:06:05 +0200 Subject: [PATCH 209/297] git gui: set GIT_ASKPASS=git-gui--askpass if not set yet Signed-off-by: Johannes Schindelin --- git-gui/git-gui.sh | 3 +++ 1 file changed, 3 insertions(+) diff --git a/git-gui/git-gui.sh b/git-gui/git-gui.sh index c8d8e1e2caef99..434fd9d844accc 100755 --- a/git-gui/git-gui.sh +++ b/git-gui/git-gui.sh @@ -1248,6 +1248,9 @@ set have_tk85 [expr {[package vcompare $tk_version "8.5"] >= 0}] if {![info exists env(SSH_ASKPASS)]} { set env(SSH_ASKPASS) [gitexec git-gui--askpass] } +if {![info exists env(GIT_ASKPASS)]} { + set env(GIT_ASKPASS) [gitexec git-gui--askpass] +} if {![info exists env(GIT_ASK_YESNO)]} { set env(GIT_ASK_YESNO) [gitexec git-gui--askyesno] } From a6a49065283b253d1aac542f9c7e15e7e52fbd39 Mon Sep 17 00:00:00 2001 From: Ben Peart Date: Fri, 2 Nov 2018 11:19:10 -0400 Subject: [PATCH 210/297] fscache: teach fscache to use mempool Now that the fscache is single threaded, take advantage of the mem_pool as the allocator to significantly reduce the cost of allocations and frees. With the reduced cost of free, in future patches, we can start freeing the fscache at the end of commands instead of just leaking it. Signed-off-by: Ben Peart Signed-off-by: Johannes Schindelin --- compat/win32/fscache.c | 45 ++++++++++++++++++++++-------------------- 1 file changed, 24 insertions(+), 21 deletions(-) diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index b2513e6932e1e1..61181004d8ebf5 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -6,6 +6,7 @@ #include "../../abspath.h" #include "../../trace.h" #include "config.h" +#include "../../mem-pool.h" static volatile long initialized; static DWORD dwTlsIndex; @@ -20,6 +21,7 @@ static CRITICAL_SECTION mutex; struct fscache { volatile long enabled; struct hashmap map; + struct mem_pool mem_pool; unsigned int lstat_requests; unsigned int opendir_requests; unsigned int fscache_requests; @@ -124,11 +126,12 @@ static void fsentry_init(struct fsentry *fse, struct fsentry *list, /* * Allocate an fsentry structure on the heap. */ -static struct fsentry *fsentry_alloc(struct fsentry *list, const char *name, +static struct fsentry *fsentry_alloc(struct fscache *cache, struct fsentry *list, const char *name, size_t len) { /* overallocate fsentry and copy the name to the end */ - struct fsentry *fse = xmalloc(sizeof(struct fsentry) + len + 1); + struct fsentry *fse = + mem_pool_alloc(&cache->mem_pool, sizeof(*fse) + len + 1); /* init the rest of the structure */ fsentry_init(fse, list, name, len); fse->next = NULL; @@ -148,27 +151,21 @@ inline static void fsentry_addref(struct fsentry *fse) } /* - * Release the reference to an fsentry, frees the memory if its the last ref. + * Release the reference to an fsentry. */ static void fsentry_release(struct fsentry *fse) { if (fse->list) fse = fse->list; - if (InterlockedDecrement(&(fse->u.refcnt))) - return; - - while (fse) { - struct fsentry *next = fse->next; - free(fse); - fse = next; - } + InterlockedDecrement(&(fse->u.refcnt)); } /* * Allocate and initialize an fsentry from a WIN32_FIND_DATA structure. */ -static struct fsentry *fseentry_create_entry(struct fsentry *list, +static struct fsentry *fseentry_create_entry(struct fscache *cache, + struct fsentry *list, const WIN32_FIND_DATAW *fdata) { char buf[MAX_PATH * 3]; @@ -176,7 +173,7 @@ static struct fsentry *fseentry_create_entry(struct fsentry *list, struct fsentry *fse; len = xwcstoutf(buf, fdata->cFileName, ARRAY_SIZE(buf)); - fse = fsentry_alloc(list, buf, len); + fse = fsentry_alloc(cache, list, buf, len); fse->st_mode = file_attr_to_st_mode(fdata->dwFileAttributes); fse->dirent.d_type = S_ISDIR(fse->st_mode) ? DT_DIR : DT_REG; @@ -194,7 +191,7 @@ static struct fsentry *fseentry_create_entry(struct fsentry *list, * Dir should not contain trailing '/'. Use an empty string for the current * directory (not "."!). */ -static struct fsentry *fsentry_create_list(const struct fsentry *dir, +static struct fsentry *fsentry_create_list(struct fscache *cache, const struct fsentry *dir, int *dir_not_found) { wchar_t pattern[MAX_PATH + 2]; /* + 2 for '/' '*' */ @@ -233,14 +230,14 @@ static struct fsentry *fsentry_create_list(const struct fsentry *dir, } /* allocate object to hold directory listing */ - list = fsentry_alloc(NULL, dir->dirent.d_name, dir->len); + list = fsentry_alloc(cache, NULL, dir->dirent.d_name, dir->len); list->st_mode = S_IFDIR; list->dirent.d_type = DT_DIR; /* walk directory and build linked list of fsentry structures */ phead = &list->next; do { - *phead = fseentry_create_entry(list, &fdata); + *phead = fseentry_create_entry(cache, list, &fdata); phead = &(*phead)->next; } while (FindNextFileW(h, &fdata)); @@ -252,7 +249,7 @@ static struct fsentry *fsentry_create_list(const struct fsentry *dir, if (err == ERROR_NO_MORE_FILES) return list; - /* otherwise free the list and return error */ + /* otherwise release the list and return error */ fsentry_release(list); errno = err_win_to_posix(err); return NULL; @@ -275,7 +272,9 @@ static void fscache_add(struct fscache *cache, struct fsentry *fse) */ static void fscache_clear(struct fscache *cache) { - hashmap_clear_and_free(&cache->map, struct fsentry, ent); + mem_pool_discard(&cache->mem_pool, 0); + mem_pool_init(&cache->mem_pool, 0); + hashmap_clear(&cache->map); hashmap_init(&cache->map, (hashmap_cmp_fn)fsentry_cmp, NULL, 0); cache->lstat_requests = cache->opendir_requests = 0; cache->fscache_misses = cache->fscache_requests = 0; @@ -328,7 +327,7 @@ static struct fsentry *fscache_get(struct fscache *cache, struct fsentry *key) } /* create the directory listing */ - fse = fsentry_create_list(key->list ? key->list : key, &dir_not_found); + fse = fsentry_create_list(cache, key->list ? key->list : key, &dir_not_found); /* leave on error (errno set by fsentry_create_list) */ if (!fse) { @@ -338,7 +337,7 @@ static struct fsentry *fscache_get(struct fscache *cache, struct fsentry *key) * empty, which for all practical matters is the same * thing as far as fscache is concerned). */ - fse = fsentry_alloc(key->list->list, + fse = fsentry_alloc(cache, key->list->list, key->list->dirent.d_name, key->list->len); fse->st_mode = 0; @@ -417,6 +416,7 @@ int fscache_enable(size_t initial_size) * '4' was determined empirically by testing several repos */ hashmap_init(&cache->map, (hashmap_cmp_fn)fsentry_cmp, NULL, initial_size * 4); + mem_pool_init(&cache->mem_pool, 0); if (!TlsSetValue(dwTlsIndex, cache)) BUG("TlsSetValue error"); } @@ -448,7 +448,8 @@ void fscache_disable(void) "total requests/misses %u/%u\n", cache->lstat_requests, cache->opendir_requests, cache->fscache_requests, cache->fscache_misses); - fscache_clear(cache); + mem_pool_discard(&cache->mem_pool, 0); + hashmap_clear(&cache->map); free(cache); } @@ -633,6 +634,8 @@ void fscache_merge(struct fscache *dest) while ((e = hashmap_iter_next(&iter))) hashmap_add(&dest->map, e); + mem_pool_combine(&dest->mem_pool, &cache->mem_pool); + dest->lstat_requests += cache->lstat_requests; dest->opendir_requests += cache->opendir_requests; dest->fscache_requests += cache->fscache_requests; From 29054d2a19030a9cd36b1ab5396ebcbc08b71043 Mon Sep 17 00:00:00 2001 From: Ben Peart Date: Fri, 16 Nov 2018 10:59:18 -0500 Subject: [PATCH 211/297] fscache: make fscache_enable() thread safe The recent change to make fscache thread specific relied on fscache_enable() being called first from the primary thread before being called in parallel from worker threads. Make that more robust and protect it with a critical section to avoid any issues. Helped-by: Johannes Schindelin Signed-off-by: Ben Peart --- compat/mingw.c | 4 ++++ compat/win32/fscache.c | 23 +++++++++++++---------- compat/win32/fscache.h | 2 ++ 3 files changed, 19 insertions(+), 10 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index ca8b116190b3af..75d9e79de68f21 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -21,6 +21,7 @@ #include #include "../write-or-die.h" #include "../repository.h" +#include "win32/fscache.h" #define HCAST(type, handle) ((type)(intptr_t)handle) @@ -3490,6 +3491,9 @@ int wmain(int argc, const wchar_t **wargv) /* initialize critical section for waitpid pinfo_t list */ InitializeCriticalSection(&pinfo_cs); + /* initialize critical section for fscache */ + InitializeCriticalSection(&fscache_cs); + /* set up default file mode and file modes for stdin/out/err */ _fmode = _O_BINARY; _setmode(_fileno(stdin), _O_BINARY); diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index 61181004d8ebf5..a07decbb435246 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -10,7 +10,7 @@ static volatile long initialized; static DWORD dwTlsIndex; -static CRITICAL_SECTION mutex; +CRITICAL_SECTION fscache_cs; /* * Store one fscache per thread to avoid thread contention and locking. @@ -388,12 +388,12 @@ int fscache_enable(size_t initial_size) * opendir and lstat function pointers are redirected if * any threads are using the fscache. */ + EnterCriticalSection(&fscache_cs); if (!initialized) { - InitializeCriticalSection(&mutex); if (!dwTlsIndex) { dwTlsIndex = TlsAlloc(); if (dwTlsIndex == TLS_OUT_OF_INDEXES) { - LeaveCriticalSection(&mutex); + LeaveCriticalSection(&fscache_cs); return 0; } } @@ -402,12 +402,13 @@ int fscache_enable(size_t initial_size) opendir = fscache_opendir; lstat = fscache_lstat; } - InterlockedIncrement(&initialized); + initialized++; + LeaveCriticalSection(&fscache_cs); /* refcount the thread specific initialization */ cache = fscache_getcache(); if (cache) { - InterlockedIncrement(&cache->enabled); + cache->enabled++; } else { cache = (struct fscache *)xcalloc(1, sizeof(*cache)); cache->enabled = 1; @@ -441,7 +442,7 @@ void fscache_disable(void) BUG("fscache_disable() called on a thread where fscache has not been initialized"); if (!cache->enabled) BUG("fscache_disable() called on an fscache that is already disabled"); - InterlockedDecrement(&cache->enabled); + cache->enabled--; if (!cache->enabled) { TlsSetValue(dwTlsIndex, NULL); trace_printf_key(&trace_fscache, "fscache_disable: lstat %u, opendir %u, " @@ -454,12 +455,14 @@ void fscache_disable(void) } /* update the global fscache initialization */ - InterlockedDecrement(&initialized); + EnterCriticalSection(&fscache_cs); + initialized--; if (!initialized) { /* reset opendir and lstat to the original implementations */ opendir = dirent_opendir; lstat = mingw_lstat; } + LeaveCriticalSection(&fscache_cs); trace_printf_key(&trace_fscache, "fscache: disable\n"); return; @@ -628,7 +631,7 @@ void fscache_merge(struct fscache *dest) * isn't being used so the critical section only needs to prevent * the the child threads from stomping on each other. */ - EnterCriticalSection(&mutex); + EnterCriticalSection(&fscache_cs); hashmap_iter_init(&cache->map, &iter); while ((e = hashmap_iter_next(&iter))) @@ -640,9 +643,9 @@ void fscache_merge(struct fscache *dest) dest->opendir_requests += cache->opendir_requests; dest->fscache_requests += cache->fscache_requests; dest->fscache_misses += cache->fscache_misses; - LeaveCriticalSection(&mutex); + initialized--; + LeaveCriticalSection(&fscache_cs); free(cache); - InterlockedDecrement(&initialized); } diff --git a/compat/win32/fscache.h b/compat/win32/fscache.h index 2eb8bf3f5cfee8..042b247a542554 100644 --- a/compat/win32/fscache.h +++ b/compat/win32/fscache.h @@ -6,6 +6,8 @@ * for each thread where caching is desired. */ +extern CRITICAL_SECTION fscache_cs; + int fscache_enable(size_t initial_size); #define enable_fscache(initial_size) fscache_enable(initial_size) From 8be5c62d49da6a2bfe564d3a0651dfabe0359b8c Mon Sep 17 00:00:00 2001 From: Ben Peart Date: Thu, 15 Nov 2018 14:15:40 -0500 Subject: [PATCH 212/297] fscache: teach fscache to use NtQueryDirectoryFile Using FindFirstFileExW() requires the OS to allocate a 64K buffer for each directory and then free it when we call FindClose(). Update fscache to call the underlying kernel API NtQueryDirectoryFile so that we can do the buffer management ourselves. That allows us to allocate a single buffer for the lifetime of the cache and reuse it for each directory. This change improves performance of 'git status' by 18% in a repo with ~200K files and 30k folders. Documentation for NtQueryDirectoryFile can be found at: https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/content/ntifs/nf-ntifs-ntquerydirectoryfile https://docs.microsoft.com/en-us/windows/desktop/FileIO/file-attribute-constants https://docs.microsoft.com/en-us/windows/desktop/fileio/reparse-point-tags To determine if the specified directory is a symbolic link, inspect the FileAttributes member to see if the FILE_ATTRIBUTE_REPARSE_POINT flag is set. If so, EaSize will contain the reparse tag (this is a so far undocumented feature, but confirmed by the NTFS developers). To determine if the reparse point is a symbolic link (and not some other form of reparse point), test whether the tag value equals the value IO_REPARSE_TAG_SYMLINK. The NtQueryDirectoryFile() call works best (and on Windows 8.1 and earlier, it works *only*) with buffer sizes up to 64kB. Which is 32k wide characters, so let's use that as our buffer size. Signed-off-by: Ben Peart Signed-off-by: Johannes Schindelin --- compat/win32/fscache.c | 123 ++++++++++++++++++++++++++++---------- compat/win32/ntifs.h | 131 +++++++++++++++++++++++++++++++++++++++++ 2 files changed, 224 insertions(+), 30 deletions(-) create mode 100644 compat/win32/ntifs.h diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index a07decbb435246..be701880743224 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -7,6 +7,7 @@ #include "../../trace.h" #include "config.h" #include "../../mem-pool.h" +#include "ntifs.h" static volatile long initialized; static DWORD dwTlsIndex; @@ -26,6 +27,13 @@ struct fscache { unsigned int opendir_requests; unsigned int fscache_requests; unsigned int fscache_misses; + /* + * 32k wide characters translates to 64kB, which is the maximum that + * Windows 8.1 and earlier can handle. On network drives, not only + * the client's Windows version matters, but also the server's, + * therefore we need to keep this to 64kB. + */ + WCHAR buffer[32 * 1024]; }; static struct trace_key trace_fscache = TRACE_KEY_INIT(FSCACHE); @@ -161,27 +169,44 @@ static void fsentry_release(struct fsentry *fse) InterlockedDecrement(&(fse->u.refcnt)); } +static int xwcstoutfn(char *utf, int utflen, const wchar_t *wcs, int wcslen) +{ + if (!wcs || !utf || utflen < 1) { + errno = EINVAL; + return -1; + } + utflen = WideCharToMultiByte(CP_UTF8, 0, wcs, wcslen, utf, utflen, NULL, NULL); + if (utflen) + return utflen; + errno = ERANGE; + return -1; +} + /* - * Allocate and initialize an fsentry from a WIN32_FIND_DATA structure. + * Allocate and initialize an fsentry from a FILE_FULL_DIR_INFORMATION structure. */ static struct fsentry *fseentry_create_entry(struct fscache *cache, struct fsentry *list, - const WIN32_FIND_DATAW *fdata) + PFILE_FULL_DIR_INFORMATION fdata) { char buf[MAX_PATH * 3]; int len; struct fsentry *fse; - len = xwcstoutf(buf, fdata->cFileName, ARRAY_SIZE(buf)); + + len = xwcstoutfn(buf, ARRAY_SIZE(buf), fdata->FileName, fdata->FileNameLength / sizeof(wchar_t)); fse = fsentry_alloc(cache, list, buf, len); - fse->st_mode = file_attr_to_st_mode(fdata->dwFileAttributes); + fse->st_mode = file_attr_to_st_mode(fdata->FileAttributes); fse->dirent.d_type = S_ISDIR(fse->st_mode) ? DT_DIR : DT_REG; - fse->u.s.st_size = (((off64_t) (fdata->nFileSizeHigh)) << 32) - | fdata->nFileSizeLow; - filetime_to_timespec(&(fdata->ftLastAccessTime), &(fse->u.s.st_atim)); - filetime_to_timespec(&(fdata->ftLastWriteTime), &(fse->u.s.st_mtim)); - filetime_to_timespec(&(fdata->ftCreationTime), &(fse->u.s.st_ctim)); + fse->u.s.st_size = fdata->EndOfFile.LowPart | + (((off_t)fdata->EndOfFile.HighPart) << 32); + filetime_to_timespec((FILETIME *)&(fdata->LastAccessTime), + &(fse->u.s.st_atim)); + filetime_to_timespec((FILETIME *)&(fdata->LastWriteTime), + &(fse->u.s.st_mtim)); + filetime_to_timespec((FILETIME *)&(fdata->CreationTime), + &(fse->u.s.st_ctim)); return fse; } @@ -194,8 +219,10 @@ static struct fsentry *fseentry_create_entry(struct fscache *cache, static struct fsentry *fsentry_create_list(struct fscache *cache, const struct fsentry *dir, int *dir_not_found) { - wchar_t pattern[MAX_PATH + 2]; /* + 2 for '/' '*' */ - WIN32_FIND_DATAW fdata; + wchar_t pattern[MAX_PATH]; + NTSTATUS status; + IO_STATUS_BLOCK iosb; + PFILE_FULL_DIR_INFORMATION di; HANDLE h; int wlen; struct fsentry *list, **phead; @@ -211,15 +238,18 @@ static struct fsentry *fsentry_create_list(struct fscache *cache, const struct f return NULL; } - /* append optional '/' and wildcard '*' */ - if (wlen) - pattern[wlen++] = '/'; - pattern[wlen++] = '*'; - pattern[wlen] = 0; + /* handle CWD */ + if (!wlen) { + wlen = GetCurrentDirectoryW(ARRAY_SIZE(pattern), pattern); + if (!wlen || wlen >= ARRAY_SIZE(pattern)) { + errno = wlen ? ENAMETOOLONG : err_win_to_posix(GetLastError()); + return NULL; + } + } - /* open find handle */ - h = FindFirstFileExW(pattern, FindExInfoBasic, &fdata, FindExSearchNameMatch, - NULL, FIND_FIRST_EX_LARGE_FETCH); + h = CreateFileW(pattern, FILE_LIST_DIRECTORY, + FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, + NULL, OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL); if (h == INVALID_HANDLE_VALUE) { err = GetLastError(); *dir_not_found = 1; /* or empty directory */ @@ -236,22 +266,55 @@ static struct fsentry *fsentry_create_list(struct fscache *cache, const struct f /* walk directory and build linked list of fsentry structures */ phead = &list->next; - do { - *phead = fseentry_create_entry(cache, list, &fdata); + status = NtQueryDirectoryFile(h, NULL, 0, 0, &iosb, cache->buffer, + sizeof(cache->buffer), FileFullDirectoryInformation, FALSE, NULL, FALSE); + if (!NT_SUCCESS(status)) { + /* + * NtQueryDirectoryFile returns STATUS_INVALID_PARAMETER when + * asked to enumerate an invalid directory (ie it is a file + * instead of a directory). Verify that is the actual cause + * of the error. + */ + if (status == STATUS_INVALID_PARAMETER) { + DWORD attributes = GetFileAttributesW(pattern); + if (!(attributes & FILE_ATTRIBUTE_DIRECTORY)) + status = ERROR_DIRECTORY; + } + goto Error; + } + di = (PFILE_FULL_DIR_INFORMATION)(cache->buffer); + for (;;) { + + *phead = fseentry_create_entry(cache, list, di); phead = &(*phead)->next; - } while (FindNextFileW(h, &fdata)); - /* remember result of last FindNextFile, then close find handle */ - err = GetLastError(); - FindClose(h); + /* If there is no offset in the entry, the buffer has been exhausted. */ + if (di->NextEntryOffset == 0) { + status = NtQueryDirectoryFile(h, NULL, 0, 0, &iosb, cache->buffer, + sizeof(cache->buffer), FileFullDirectoryInformation, FALSE, NULL, FALSE); + if (!NT_SUCCESS(status)) { + if (status == STATUS_NO_MORE_FILES) + break; + goto Error; + } + + di = (PFILE_FULL_DIR_INFORMATION)(cache->buffer); + continue; + } + + /* Advance to the next entry. */ + di = (PFILE_FULL_DIR_INFORMATION)(((PUCHAR)di) + di->NextEntryOffset); + } - /* return the list if we've got all the files */ - if (err == ERROR_NO_MORE_FILES) - return list; + CloseHandle(h); + return list; - /* otherwise release the list and return error */ +Error: + trace_printf_key(&trace_fscache, + "fscache: status(%ld) unable to query directory " + "contents '%s'\n", status, dir->dirent.d_name); + CloseHandle(h); fsentry_release(list); - errno = err_win_to_posix(err); return NULL; } diff --git a/compat/win32/ntifs.h b/compat/win32/ntifs.h new file mode 100644 index 00000000000000..64ed792c52f352 --- /dev/null +++ b/compat/win32/ntifs.h @@ -0,0 +1,131 @@ +#ifndef _NTIFS_ +#define _NTIFS_ + +/* + * Copy necessary structures and definitions out of the Windows DDK + * to enable calling NtQueryDirectoryFile() + */ + +typedef _Return_type_success_(return >= 0) LONG NTSTATUS; +#define NT_SUCCESS(Status) (((NTSTATUS)(Status)) >= 0) + +#if !defined(_NTSECAPI_) && !defined(_WINTERNL_) && \ + !defined(__UNICODE_STRING_DEFINED) +#define __UNICODE_STRING_DEFINED +typedef struct _UNICODE_STRING { + USHORT Length; + USHORT MaximumLength; + PWSTR Buffer; +} UNICODE_STRING; +typedef UNICODE_STRING *PUNICODE_STRING; +typedef const UNICODE_STRING *PCUNICODE_STRING; +#endif /* !_NTSECAPI_ && !_WINTERNL_ && !__UNICODE_STRING_DEFINED */ + +typedef enum _FILE_INFORMATION_CLASS { + FileDirectoryInformation = 1, + FileFullDirectoryInformation, + FileBothDirectoryInformation, + FileBasicInformation, + FileStandardInformation, + FileInternalInformation, + FileEaInformation, + FileAccessInformation, + FileNameInformation, + FileRenameInformation, + FileLinkInformation, + FileNamesInformation, + FileDispositionInformation, + FilePositionInformation, + FileFullEaInformation, + FileModeInformation, + FileAlignmentInformation, + FileAllInformation, + FileAllocationInformation, + FileEndOfFileInformation, + FileAlternateNameInformation, + FileStreamInformation, + FilePipeInformation, + FilePipeLocalInformation, + FilePipeRemoteInformation, + FileMailslotQueryInformation, + FileMailslotSetInformation, + FileCompressionInformation, + FileObjectIdInformation, + FileCompletionInformation, + FileMoveClusterInformation, + FileQuotaInformation, + FileReparsePointInformation, + FileNetworkOpenInformation, + FileAttributeTagInformation, + FileTrackingInformation, + FileIdBothDirectoryInformation, + FileIdFullDirectoryInformation, + FileValidDataLengthInformation, + FileShortNameInformation, + FileIoCompletionNotificationInformation, + FileIoStatusBlockRangeInformation, + FileIoPriorityHintInformation, + FileSfioReserveInformation, + FileSfioVolumeInformation, + FileHardLinkInformation, + FileProcessIdsUsingFileInformation, + FileNormalizedNameInformation, + FileNetworkPhysicalNameInformation, + FileIdGlobalTxDirectoryInformation, + FileIsRemoteDeviceInformation, + FileAttributeCacheInformation, + FileNumaNodeInformation, + FileStandardLinkInformation, + FileRemoteProtocolInformation, + FileMaximumInformation +} FILE_INFORMATION_CLASS, *PFILE_INFORMATION_CLASS; + +typedef struct _FILE_FULL_DIR_INFORMATION { + ULONG NextEntryOffset; + ULONG FileIndex; + LARGE_INTEGER CreationTime; + LARGE_INTEGER LastAccessTime; + LARGE_INTEGER LastWriteTime; + LARGE_INTEGER ChangeTime; + LARGE_INTEGER EndOfFile; + LARGE_INTEGER AllocationSize; + ULONG FileAttributes; + ULONG FileNameLength; + ULONG EaSize; + WCHAR FileName[1]; +} FILE_FULL_DIR_INFORMATION, *PFILE_FULL_DIR_INFORMATION; + +typedef struct _IO_STATUS_BLOCK { + union { + NTSTATUS Status; + PVOID Pointer; + } u; + ULONG_PTR Information; +} IO_STATUS_BLOCK, *PIO_STATUS_BLOCK; + +typedef VOID +(NTAPI *PIO_APC_ROUTINE)( + IN PVOID ApcContext, + IN PIO_STATUS_BLOCK IoStatusBlock, + IN ULONG Reserved); + +NTSYSCALLAPI +NTSTATUS +NTAPI +NtQueryDirectoryFile( + _In_ HANDLE FileHandle, + _In_opt_ HANDLE Event, + _In_opt_ PIO_APC_ROUTINE ApcRoutine, + _In_opt_ PVOID ApcContext, + _Out_ PIO_STATUS_BLOCK IoStatusBlock, + _Out_writes_bytes_(Length) PVOID FileInformation, + _In_ ULONG Length, + _In_ FILE_INFORMATION_CLASS FileInformationClass, + _In_ BOOLEAN ReturnSingleEntry, + _In_opt_ PUNICODE_STRING FileName, + _In_ BOOLEAN RestartScan +); + +#define STATUS_NO_MORE_FILES ((NTSTATUS)0x80000006L) + +#endif From 8b5b156302f8b5b1cc1e50fac108fc9f4a37086e Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Wed, 12 Jun 2019 00:58:49 +0000 Subject: [PATCH 213/297] unpack-trees: enable fscache for sparse-checkout When updating the skip-worktree bits in the index to align with new values in a sparse-checkout file, Git scans the entire working directory with lstat() calls. In a sparse-checkout, many of these lstat() calls are for paths that do not exist. Enable the fscache feature during this scan. Since enable_fscache() calls nest, the disable_fscache() method decrements a counter and would only clear the cache if that counter reaches zero. In a local test of a repo with ~2.2 million paths, updating the index with git read-tree -m -u HEAD with a sparse-checkout file containing only /.gitattributes improved from 2-3 minutes to ~6 seconds. Signed-off-by: Derrick Stolee Signed-off-by: Johannes Schindelin --- unpack-trees.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/unpack-trees.c b/unpack-trees.c index 7dc884fafd30a3..1cff7ca762374f 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -1817,7 +1817,9 @@ static void mark_new_skip_worktree(struct pattern_list *pl, * 2. Widen worktree according to sparse-checkout file. * Matched entries will have skip_wt_flag cleared (i.e. "in") */ + enable_fscache(istate->cache_nr); clear_ce_flags(istate, select_flag, skip_wt_flag, pl, show_progress); + disable_fscache(); } static void populate_from_existing_patterns(struct unpack_trees_options *o, From 7d7c8d2d243e264cd317190f9576ecd5001387bf Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 11 Dec 2018 12:17:49 +0100 Subject: [PATCH 214/297] clean: make use of FSCache The `git clean` command needs to enumerate plenty of files and directories, and can therefore benefit from the FSCache. Signed-off-by: Johannes Schindelin --- builtin/clean.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/builtin/clean.c b/builtin/clean.c index 9b416b42a96dfe..e50e1bb39d8fe1 100644 --- a/builtin/clean.c +++ b/builtin/clean.c @@ -1038,6 +1038,7 @@ int cmd_clean(int argc, const char **argv, const char *prefix) if (repo_read_index(the_repository) < 0) die(_("index file corrupt")); + enable_fscache(the_repository->index->cache_nr); pl = add_pattern_list(&dir, EXC_CMDL, "--exclude option"); for (i = 0; i < exclude_list.nr; i++) @@ -1112,6 +1113,7 @@ int cmd_clean(int argc, const char **argv, const char *prefix) } } + disable_fscache(); strbuf_release(&abs_path); strbuf_release(&buf); string_list_clear(&del_list, 0); From 193fea2a7613b93b970f0b81658a443aa9ab1bfc Mon Sep 17 00:00:00 2001 From: Karsten Blees Date: Sat, 4 Feb 2012 21:54:36 +0100 Subject: [PATCH 215/297] gitk: Unicode file name support Assumes file names in git tree objects are UTF-8 encoded. On most unix systems, the system encoding (and thus the TCL system encoding) will be UTF-8, so file names will be displayed correctly. On Windows, it is impossible to set the system encoding to UTF-8. Changing the TCL system encoding (via 'encoding system ...', e.g. in the startup code) is explicitly discouraged by the TCL docs. Change gitk functions dealing with file names to always convert from and to UTF-8. Signed-off-by: Karsten Blees Signed-off-by: Johannes Schindelin --- gitk-git/gitk | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/gitk-git/gitk b/gitk-git/gitk index 076ca3687d7653..a792f48e5a1a47 100755 --- a/gitk-git/gitk +++ b/gitk-git/gitk @@ -7847,7 +7847,7 @@ proc gettreeline {gtf id} { if {[string index $fname 0] eq "\""} { set fname [lindex $fname 0] } - set fname [encoding convertfrom $fname] + set fname [encoding convertfrom utf-8 $fname] lappend treefilelist($id) $fname } if {![eof $gtf]} { @@ -8109,7 +8109,7 @@ proc gettreediffline {gdtf ids} { if {[string index $file 0] eq "\""} { set file [lindex $file 0] } - set file [encoding convertfrom $file] + set file [encoding convertfrom utf-8 $file] if {$file ne [lindex $treediff end]} { lappend treediff $file lappend sublist $file @@ -8254,7 +8254,7 @@ proc makediffhdr {fname ids} { global ctext curdiffstart treediffs diffencoding global ctext_file_names jump_to_here targetline diffline - set fname [encoding convertfrom $fname] + set fname [encoding convertfrom utf-8 $fname] set diffencoding [get_path_encoding $fname] set i [lsearch -exact $treediffs($ids) $fname] if {$i >= 0} { @@ -8316,7 +8316,7 @@ proc parseblobdiffline {ids line} { if {![string compare -length 5 "diff " $line]} { if {![regexp {^diff (--cc|--git) } $line m type]} { - set line [encoding convertfrom $line] + set line [encoding convertfrom utf-8 $line] $ctext insert end "$line\n" hunksep continue } @@ -8365,7 +8365,7 @@ proc parseblobdiffline {ids line} { makediffhdr $fname $ids } elseif {![string compare -length 16 "* Unmerged path " $line]} { - set fname [encoding convertfrom [string range $line 16 end]] + set fname [encoding convertfrom utf-8 [string range $line 16 end]] $ctext insert end "\n" set curdiffstart [$ctext index "end - 1c"] lappend ctext_file_names $fname @@ -8418,7 +8418,7 @@ proc parseblobdiffline {ids line} { if {[string index $fname 0] eq "\""} { set fname [lindex $fname 0] } - set fname [encoding convertfrom $fname] + set fname [encoding convertfrom utf-8 $fname] set i [lsearch -exact $treediffs($ids) $fname] if {$i >= 0} { setinlist difffilestart $i $curdiffstart @@ -8437,6 +8437,7 @@ proc parseblobdiffline {ids line} { set diffinhdr 0 return } + set line [encoding convertfrom utf-8 $line] $ctext insert end "$line\n" filesep } else { @@ -12405,7 +12406,7 @@ proc cache_gitattr {attr pathlist} { foreach row [split $rlist "\n"] { if {[regexp "(.*): $attr: (.*)" $row m path value]} { if {[string index $path 0] eq "\""} { - set path [encoding convertfrom [lindex $path 0]] + set path [encoding convertfrom utf-8 [lindex $path 0]] } set path_attr_cache($attr,$path) $value } From edebad24180ad3df27e2daa66c20cdfdf724cbdb Mon Sep 17 00:00:00 2001 From: Sebastian Schuberth Date: Sun, 22 Jul 2012 23:19:24 +0200 Subject: [PATCH 216/297] gitk: Use an external icon file on Windows Git for Windows now ships with the new Git icon from git-scm.com. Use that icon file if it exists instead of the old procedurally drawn one. This patch was sent upstream but so far no decision on its inclusion was made, so commit it to our fork. Signed-off-by: Sebastian Schuberth --- gitk-git/gitk | 49 ++++++++++++++++++++++++++----------------------- 1 file changed, 26 insertions(+), 23 deletions(-) diff --git a/gitk-git/gitk b/gitk-git/gitk index a792f48e5a1a47..6b84ee13273e4a 100755 --- a/gitk-git/gitk +++ b/gitk-git/gitk @@ -12436,7 +12436,6 @@ if { [info exists ::env(GITK_MSGSDIR)] } { set gitk_prefix [file dirname [file dirname [file normalize $argv0]]] set gitk_libdir [file join $gitk_prefix share gitk lib] set gitk_msgsdir [file join $gitk_libdir msgs] - unset gitk_prefix } ## Internationalization (i18n) through msgcat and gettext. See @@ -12799,28 +12798,32 @@ if {[expr {[exec git rev-parse --is-inside-work-tree] == "true"}]} { set worktree [gitworktree] setcoords makewindow -catch { - image create photo gitlogo -width 16 -height 16 - - image create photo gitlogominus -width 4 -height 2 - gitlogominus put #C00000 -to 0 0 4 2 - gitlogo copy gitlogominus -to 1 5 - gitlogo copy gitlogominus -to 6 5 - gitlogo copy gitlogominus -to 11 5 - image delete gitlogominus - - image create photo gitlogoplus -width 4 -height 4 - gitlogoplus put #008000 -to 1 0 3 4 - gitlogoplus put #008000 -to 0 1 4 3 - gitlogo copy gitlogoplus -to 1 9 - gitlogo copy gitlogoplus -to 6 9 - gitlogo copy gitlogoplus -to 11 9 - image delete gitlogoplus - - image create photo gitlogo32 -width 32 -height 32 - gitlogo32 copy gitlogo -zoom 2 2 - - wm iconphoto . -default gitlogo gitlogo32 +if {$::tcl_platform(platform) eq {windows} && [file exists $gitk_prefix/etc/git.ico]} { + wm iconbitmap . -default $gitk_prefix/etc/git.ico +} else { + catch { + image create photo gitlogo -width 16 -height 16 + + image create photo gitlogominus -width 4 -height 2 + gitlogominus put #C00000 -to 0 0 4 2 + gitlogo copy gitlogominus -to 1 5 + gitlogo copy gitlogominus -to 6 5 + gitlogo copy gitlogominus -to 11 5 + image delete gitlogominus + + image create photo gitlogoplus -width 4 -height 4 + gitlogoplus put #008000 -to 1 0 3 4 + gitlogoplus put #008000 -to 0 1 4 3 + gitlogo copy gitlogoplus -to 1 9 + gitlogo copy gitlogoplus -to 6 9 + gitlogo copy gitlogoplus -to 11 9 + image delete gitlogoplus + + image create photo gitlogo32 -width 32 -height 32 + gitlogo32 copy gitlogo -zoom 2 2 + + wm iconphoto . -default gitlogo gitlogo32 + } } # wait for the window to become visible tkwait visibility . From 079248710081701b47b2c0041fd09d32a4f43c80 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 20 Sep 2017 21:53:45 +0200 Subject: [PATCH 217/297] git-gui--askyesno: allow overriding the window title "Question?" is maybe not the most informative thing to ask. In the absence of better information, it is the best we can do, of course. However, Git for Windows' auto updater just learned the trick to use git-gui--askyesno to ask the user whether to update now or not. And in this scripted scenario, we can easily pass a command-line option to change the window title. So let's support that with the new `--title ` option. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- git-gui/git-gui--askyesno | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/git-gui/git-gui--askyesno b/git-gui/git-gui--askyesno index cf9c990d0919b3..45b0260eff8145 100755 --- a/git-gui/git-gui--askyesno +++ b/git-gui/git-gui--askyesno @@ -12,10 +12,15 @@ if {$use_ttk} { set NS ttk } +set title "Question?" if {$argc < 1} { puts stderr "Usage: $argv0 <question>" exit 1 } else { + if {$argc > 2 && [lindex $argv 0] == "--title"} { + set title [lindex $argv 1] + set argv [lreplace $argv 0 1] + } set prompt [join $argv " "] } @@ -47,5 +52,5 @@ proc yes {} { exit 0 } -wm title . "Question?" +wm title . $title tk::PlaceWindow . From 5040e8b6a31aa80ef012fd808e08ec8b3b56b64e Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Tue, 16 Feb 2016 16:42:06 +0100 Subject: [PATCH 218/297] gitk: fix arrow keys in input fields with Tcl/Tk >= 8.6 Tcl/Tk 8.6 introduced new events for the cursor left/right keys and apparently changed the behavior of the previous event. Let's work around that by using the new events when we are running with Tcl/Tk 8.6 or later. This fixes https://github.com/git-for-windows/git/issues/495 Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- gitk-git/gitk | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/gitk-git/gitk b/gitk-git/gitk index 6b84ee13273e4a..980e3e1b83f8ea 100755 --- a/gitk-git/gitk +++ b/gitk-git/gitk @@ -2234,7 +2234,7 @@ proc makewindow {} { global headctxmenu progresscanv progressitem progresscoords statusw global fprogitem fprogcoord lastprogupdate progupdatepending global rprogitem rprogcoord rownumsel numcommits - global have_tk85 use_ttk NS + global have_tk85 have_tk86 use_ttk NS global git_version global worddiff @@ -2732,8 +2732,13 @@ proc makewindow {} { bind . <Key-Down> "selnextline 1" bind . <Shift-Key-Up> "dofind -1 0" bind . <Shift-Key-Down> "dofind 1 0" - bindkey <Key-Right> "goforw" - bindkey <Key-Left> "goback" + if {$have_tk86} { + bindkey <<NextChar>> "goforw" + bindkey <<PrevChar>> "goback" + } else { + bindkey <Key-Right> "goforw" + bindkey <Key-Left> "goback" + } bind . <Key-Prior> "selnextpage -1" bind . <Key-Next> "selnextpage 1" bind . <$M1B-Home> "allcanvs yview moveto 0.0" @@ -12734,6 +12739,7 @@ set nullid2 "0000000000000000000000000000000000000001" set nullfile "/dev/null" set have_tk85 [expr {[package vcompare $tk_version "8.5"] >= 0}] +set have_tk86 [expr {[package vcompare $tk_version "8.6"] >= 0}] if {![info exists have_ttk]} { set have_ttk [llength [info commands ::ttk::style]] } From f57f1f197d66365ae7d05c644443e3bf9ff220b5 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Wed, 20 Sep 2017 21:55:45 +0200 Subject: [PATCH 219/297] git-gui--askyesno (mingw): use Git for Windows' icon, if available For additional GUI goodness. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- git-gui/git-gui--askyesno | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/git-gui/git-gui--askyesno b/git-gui/git-gui--askyesno index 45b0260eff8145..c0c82e7cbd01d6 100755 --- a/git-gui/git-gui--askyesno +++ b/git-gui/git-gui--askyesno @@ -52,5 +52,17 @@ proc yes {} { exit 0 } +if {$::tcl_platform(platform) eq {windows}} { + set icopath [file dirname [file normalize $argv0]] + if {[file tail $icopath] eq {git-core}} { + set icopath [file dirname $icopath] + } + set icopath [file dirname $icopath] + set icopath [file join $icopath share git git-for-windows.ico] + if {[file exists $icopath]} { + wm iconbitmap . -default $icopath + } +} + wm title . $title tk::PlaceWindow . From 2dfff151750ef1803943260ad369ddc4ae6eebf5 Mon Sep 17 00:00:00 2001 From: "James J. Raden" <james.raden@gmail.com> Date: Thu, 21 Jan 2016 12:07:47 -0500 Subject: [PATCH 220/297] gitk: make the "list references" default window width wider When using remotes (with git-flow especially), the remote reference names are almost always wordwrapped in the "list references" window because it's somewhat narrow by default. It's possible to resize it with a mouse, but it's annoying to have to do this every time, especially on Windows 10, where the window border seems to be only one (1) pixel wide, thus making the grabbing of the window border tricky. Signed-off-by: James J. Raden <james.raden@gmail.com> --- gitk-git/gitk | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gitk-git/gitk b/gitk-git/gitk index 980e3e1b83f8ea..26294152dbf540 100755 --- a/gitk-git/gitk +++ b/gitk-git/gitk @@ -10201,7 +10201,7 @@ proc showrefs {} { text $top.list -background $bgcolor -foreground $fgcolor \ -selectbackground $selectbgcolor -font mainfont \ -xscrollcommand "$top.xsb set" -yscrollcommand "$top.ysb set" \ - -width 30 -height 20 -cursor $maincursor \ + -width 60 -height 20 -cursor $maincursor \ -spacing1 1 -spacing3 1 -state disabled $top.list tag configure highlight -background $selectbgcolor if {![lsearch -exact $bglist $top.list]} { From ed4cdb7392bb00d838b19f52c54b91e5eff5baa3 Mon Sep 17 00:00:00 2001 From: Doug Kelly <dougk.ff7@gmail.com> Date: Wed, 8 Jan 2014 20:28:15 -0600 Subject: [PATCH 221/297] pack-objects (mingw): demonstrate a segmentation fault with large deltas There is a problem in the way 9ac3f0e5b3e4 (pack-objects: fix performance issues on packing large deltas, 2018-07-22) initializes that mutex in the `packing_data` struct. The problem manifests in a segmentation fault on Windows, when a mutex (AKA critical section) is accessed without being initialized. (With pthreads, you apparently do not really have to initialize them?) This was reported in https://github.com/git-for-windows/git/issues/1839. Signed-off-by: Doug Kelly <dougk.ff7@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/t7429-submodule-long-path.sh | 106 +++++++++++++++++++++++++++++++++ 1 file changed, 106 insertions(+) create mode 100755 t/t7429-submodule-long-path.sh diff --git a/t/t7429-submodule-long-path.sh b/t/t7429-submodule-long-path.sh new file mode 100755 index 00000000000000..f692cedbff7ff8 --- /dev/null +++ b/t/t7429-submodule-long-path.sh @@ -0,0 +1,106 @@ +#!/bin/sh +# +# Copyright (c) 2013 Doug Kelly +# + +test_description='Test submodules with a path near PATH_MAX + +This test verifies that "git submodule" initialization, update and clones work, including with recursive submodules and paths approaching PATH_MAX (260 characters on Windows) +' + +TEST_NO_CREATE_REPO=1 +. ./test-lib.sh + +longpath="" +for (( i=0; i<4; i++ )); do + longpath="0123456789abcdefghijklmnopqrstuvwxyz$longpath" +done +# Pick a substring maximum of 90 characters +# This should be good, since we'll add on a lot for temp directories +longpath=${longpath:0:90}; export longpath + +test_expect_failure 'submodule with a long path' ' + git config --global protocol.file.allow always && + GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \ + git -c init.defaultBranch=long init --bare remote && + test_create_repo bundle1 && + ( + cd bundle1 && + test_commit "shoot" && + git rev-parse --verify HEAD >../expect + ) && + mkdir home && + ( + cd home && + git clone ../remote test && + cd test && + git checkout -B long && + git submodule add ../bundle1 $longpath && + test_commit "sogood" && + ( + cd $longpath && + git rev-parse --verify HEAD >actual && + test_cmp ../../../expect actual + ) && + git push origin long + ) && + mkdir home2 && + ( + cd home2 && + git clone ../remote test && + cd test && + git checkout long && + git submodule update --init && + ( + cd $longpath && + git rev-parse --verify HEAD >actual && + test_cmp ../../../expect actual + ) + ) +' + +test_expect_failure 'recursive submodule with a long path' ' + GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \ + git -c init.defaultBranch=long init --bare super && + test_create_repo child && + ( + cd child && + test_commit "shoot" && + git rev-parse --verify HEAD >../expect + ) && + test_create_repo parent && + ( + cd parent && + git submodule add ../child $longpath && + test_commit "aim" + ) && + mkdir home3 && + ( + cd home3 && + git clone ../super test && + cd test && + git checkout -B long && + git submodule add ../parent foo && + git submodule update --init --recursive && + test_commit "sogood" && + ( + cd foo/$longpath && + git rev-parse --verify HEAD >actual && + test_cmp ../../../../expect actual + ) && + git push origin long + ) && + mkdir home4 && + ( + cd home4 && + git clone ../super test --recursive && + ( + cd test/foo/$longpath && + git rev-parse --verify HEAD >actual && + test_cmp ../../../../expect actual + ) + ) +' +unset longpath + +test_done From b1eb679572a417de3b9ff63397885c7c5ce367b3 Mon Sep 17 00:00:00 2001 From: Karsten Blees <blees@dcon.de> Date: Thu, 19 Mar 2015 16:33:44 +0100 Subject: [PATCH 222/297] mingw: Support `git_terminal_prompt` with more terminals The `git_terminal_prompt()` function expects the terminal window to be attached to a Win32 Console. However, this is not the case with terminal windows other than `cmd.exe`'s, e.g. with MSys2's own `mintty`. Non-cmd terminals such as `mintty` still have to have a Win32 Console to be proper console programs, but have to hide the Win32 Console to be able to provide more flexibility (such as being resizeable not only vertically but also horizontally). By writing to that Win32 Console, `git_terminal_prompt()` manages only to send the prompt to nowhere and to wait for input from a Console to which the user has no access. This commit introduces a function specifically to support `mintty` -- or other terminals that are compatible with MSys2's `/dev/tty` emulation. We use the `TERM` environment variable as an indicator for that: if the value starts with "xterm" (such as `mintty`'s "xterm_256color"), we prefer to let `xterm_prompt()` handle the user interaction. The most prominent user of `git_terminal_prompt()` is certainly `git-remote-https.exe`. It is an interesting use case because both `stdin` and `stdout` are redirected when Git calls said executable, yet it still wants to access the terminal. When running inside a `mintty`, the terminal is not accessible to the `git-remote-https.exe` program, though, because it is a MinGW program and the `mintty` terminal is not backed by a Win32 console. To solve that problem, we simply call out to the shell -- which is an *MSys2* program and can therefore access `/dev/tty`. Helped-by: nalla <nalla@hamal.uberspace.de> Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- compat/terminal.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 54 insertions(+) diff --git a/compat/terminal.c b/compat/terminal.c index d54efa1c5d6ebf..2a2611855b877f 100644 --- a/compat/terminal.c +++ b/compat/terminal.c @@ -419,6 +419,54 @@ static int getchar_with_timeout(int timeout) return getchar(); } +static char *shell_prompt(const char *prompt, int echo) +{ + const char *read_input[] = { + /* Note: call 'bash' explicitly, as 'read -s' is bash-specific */ + "bash", "-c", echo ? + "cat >/dev/tty && read -r line </dev/tty && echo \"$line\"" : + "cat >/dev/tty && read -r -s line </dev/tty && echo \"$line\" && echo >/dev/tty", + NULL + }; + struct child_process child = CHILD_PROCESS_INIT; + static struct strbuf buffer = STRBUF_INIT; + int prompt_len = strlen(prompt), len = -1, code; + + strvec_pushv(&child.args, read_input); + child.in = -1; + child.out = -1; + + if (start_command(&child)) + return NULL; + + if (write_in_full(child.in, prompt, prompt_len) != prompt_len) { + error("could not write to prompt script"); + close(child.in); + goto ret; + } + close(child.in); + + strbuf_reset(&buffer); + len = strbuf_read(&buffer, child.out, 1024); + if (len < 0) { + error("could not read from prompt script"); + goto ret; + } + + strbuf_strip_suffix(&buffer, "\n"); + strbuf_strip_suffix(&buffer, "\r"); + +ret: + close(child.out); + code = finish_command(&child); + if (code) { + error("failed to execute prompt script (exit code %d)", code); + return NULL; + } + + return len < 0 ? NULL : buffer.buf; +} + #endif #ifndef FORCE_TEXT @@ -430,6 +478,12 @@ char *git_terminal_prompt(const char *prompt, int echo) static struct strbuf buf = STRBUF_INIT; int r; FILE *input_fh, *output_fh; +#ifdef GIT_WINDOWS_NATIVE + const char *term = getenv("TERM"); + + if (term && starts_with(term, "xterm")) + return shell_prompt(prompt, echo); +#endif input_fh = fopen(INPUT_PATH, "r" FORCE_TEXT); if (!input_fh) From 90a0bd5642699b75a0680b694ebff53203293fe7 Mon Sep 17 00:00:00 2001 From: Karsten Blees <blees@dcon.de> Date: Tue, 28 Jul 2015 21:07:41 +0200 Subject: [PATCH 223/297] mingw: support long paths Windows paths are typically limited to MAX_PATH = 260 characters, even though the underlying NTFS file system supports paths up to 32,767 chars. This limitation is also evident in Windows Explorer, cmd.exe and many other applications (including IDEs). Particularly annoying is that most Windows APIs return bogus error codes if a relative path only barely exceeds MAX_PATH in conjunction with the current directory, e.g. ERROR_PATH_NOT_FOUND / ENOENT instead of the infinitely more helpful ERROR_FILENAME_EXCED_RANGE / ENAMETOOLONG. Many Windows wide char APIs support longer than MAX_PATH paths through the file namespace prefix ('\\?\' or '\\?\UNC\') followed by an absolute path. Notable exceptions include functions dealing with executables and the current directory (CreateProcess, LoadLibrary, Get/SetCurrentDirectory) as well as the entire shell API (ShellExecute, SHGetSpecialFolderPath...). Introduce a handle_long_path function to check the length of a specified path properly (and fail with ENAMETOOLONG), and to optionally expand long paths using the '\\?\' file namespace prefix. Short paths will not be modified, so we don't need to worry about device names (NUL, CON, AUX). Contrary to MSDN docs, the GetFullPathNameW function doesn't seem to be limited to MAX_PATH (at least not on Win7), so we can use it to do the heavy lifting of the conversion (translate '/' to '\', eliminate '.' and '..', and make an absolute path). Add long path error checking to xutftowcs_path for APIs with hard MAX_PATH limit. Add a new MAX_LONG_PATH constant and xutftowcs_long_path function for APIs that support long paths. While improved error checking is always active, long paths support must be explicitly enabled via 'core.longpaths' option. This is to prevent end users to shoot themselves in the foot by checking out files that Windows Explorer, cmd/bash or their favorite IDE cannot handle. Test suite: Test the case is when the full pathname length of a dir is close to 260 (MAX_PATH). Bug report and an original reproducer by Andrey Rogozhnikov: https://github.com/msysgit/git/pull/122#issuecomment-43604199 [jes: adjusted test number to avoid conflicts, added support for chdir(), etc] Thanks-to: Martin W. Kirst <maki@bitkings.de> Thanks-to: Doug Kelly <dougk.ff7@gmail.com> Original-test-by: Andrey Rogozhnikov <rogozhnikov.andrey@gmail.com> Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Stepan Kasal <kasal@ucw.cz> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- Documentation/config/core.txt | 7 ++ compat/mingw.c | 169 ++++++++++++++++++++++++++------- compat/mingw.h | 75 ++++++++++++++- compat/win32/dirent.c | 17 ++-- compat/win32/fscache.c | 16 ++-- t/t2031-checkout-long-paths.sh | 102 ++++++++++++++++++++ t/t7429-submodule-long-path.sh | 24 +++-- 7 files changed, 346 insertions(+), 64 deletions(-) create mode 100755 t/t2031-checkout-long-paths.sh diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt index b894c7e9cc8a57..55c8939452a7a4 100644 --- a/Documentation/config/core.txt +++ b/Documentation/config/core.txt @@ -691,6 +691,13 @@ core.fscache:: Git for Windows uses this to bulk-read and cache lstat data of entire directories (instead of doing lstat file by file). +core.longpaths:: + Enable long path (> 260) support for builtin commands in Git for + Windows. This is disabled by default, as long paths are not supported + by Windows Explorer, cmd.exe and the Git for Windows tool chain + (msys, bash, tcl, perl...). Only enable this if you know what you're + doing and are prepared to live with a few quirks. + core.unsetenvvars:: Windows-only: comma-separated list of environment variables' names that need to be unset before spawning any other process. diff --git a/compat/mingw.c b/compat/mingw.c index c1ca68aab11a6a..6b85015f73ab6c 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -247,6 +247,27 @@ static enum hide_dotfiles_type hide_dotfiles = HIDE_DOTFILES_DOTGITONLY; static char *unset_environment_variables; int core_fscache; +int are_long_paths_enabled(void) +{ + /* default to `false` during initialization */ + static const int fallback = 0; + + static int enabled = -1; + + if (enabled < 0) { + /* avoid infinite recursion */ + if (!the_repository) + return fallback; + + if (the_repository->config && + the_repository->config->hash_initialized && + git_config_get_bool("core.longpaths", &enabled) < 0) + enabled = 0; + } + + return enabled < 0 ? fallback : enabled; +} + int mingw_core_config(const char *var, const char *value, const struct config_context *ctx, void *cb) { @@ -311,8 +332,8 @@ static wchar_t *normalize_ntpath(wchar_t *wbuf) int mingw_unlink(const char *pathname) { int ret, tries = 0; - wchar_t wpathname[MAX_PATH]; - if (xutftowcs_path(wpathname, pathname) < 0) + wchar_t wpathname[MAX_LONG_PATH]; + if (xutftowcs_long_path(wpathname, pathname) < 0) return -1; if (DeleteFileW(wpathname)) @@ -344,7 +365,7 @@ static int is_dir_empty(const wchar_t *wpath) { WIN32_FIND_DATAW findbuf; HANDLE handle; - wchar_t wbuf[MAX_PATH + 2]; + wchar_t wbuf[MAX_LONG_PATH + 2]; wcscpy(wbuf, wpath); wcscat(wbuf, L"\\*"); handle = FindFirstFileW(wbuf, &findbuf); @@ -365,7 +386,7 @@ static int is_dir_empty(const wchar_t *wpath) int mingw_rmdir(const char *pathname) { int ret, tries = 0; - wchar_t wpathname[MAX_PATH]; + wchar_t wpathname[MAX_LONG_PATH]; struct stat st; /* @@ -387,7 +408,7 @@ int mingw_rmdir(const char *pathname) return -1; } - if (xutftowcs_path(wpathname, pathname) < 0) + if (xutftowcs_long_path(wpathname, pathname) < 0) return -1; while ((ret = _wrmdir(wpathname)) == -1 && tries < ARRAY_SIZE(delay)) { @@ -466,15 +487,18 @@ static int set_hidden_flag(const wchar_t *path, int set) int mingw_mkdir(const char *path, int mode) { int ret; - wchar_t wpath[MAX_PATH]; + wchar_t wpath[MAX_LONG_PATH]; if (!is_valid_win32_path(path, 0)) { errno = EINVAL; return -1; } - if (xutftowcs_path(wpath, path) < 0) + /* CreateDirectoryW path limit is 248 (MAX_PATH - 8.3 file name) */ + if (xutftowcs_path_ex(wpath, path, MAX_LONG_PATH, -1, 248, + are_long_paths_enabled()) < 0) return -1; + ret = _wmkdir(wpath); if (!ret && needs_hiding(path)) return set_hidden_flag(wpath, 1); @@ -561,7 +585,7 @@ int mingw_open (const char *filename, int oflags, ...) va_list args; unsigned mode; int fd, create = (oflags & (O_CREAT | O_EXCL)) == (O_CREAT | O_EXCL); - wchar_t wfilename[MAX_PATH]; + wchar_t wfilename[MAX_LONG_PATH]; open_fn_t open_fn; va_start(args, oflags); @@ -589,7 +613,7 @@ int mingw_open (const char *filename, int oflags, ...) if (filename && !strcmp(filename, "/dev/null")) wcscpy(wfilename, L"nul"); - else if (xutftowcs_path(wfilename, filename) < 0) + else if (xutftowcs_long_path(wfilename, filename) < 0) return -1; fd = open_fn(wfilename, oflags, mode); @@ -647,14 +671,14 @@ FILE *mingw_fopen (const char *filename, const char *otype) { int hide = needs_hiding(filename); FILE *file; - wchar_t wfilename[MAX_PATH], wotype[4]; + wchar_t wfilename[MAX_LONG_PATH], wotype[4]; if (filename && !strcmp(filename, "/dev/null")) wcscpy(wfilename, L"nul"); else if (!is_valid_win32_path(filename, 1)) { int create = otype && strchr(otype, 'w'); errno = create ? EINVAL : ENOENT; return NULL; - } else if (xutftowcs_path(wfilename, filename) < 0) + } else if (xutftowcs_long_path(wfilename, filename) < 0) return NULL; if (xutftowcs(wotype, otype, ARRAY_SIZE(wotype)) < 0) @@ -676,14 +700,14 @@ FILE *mingw_freopen (const char *filename, const char *otype, FILE *stream) { int hide = needs_hiding(filename); FILE *file; - wchar_t wfilename[MAX_PATH], wotype[4]; + wchar_t wfilename[MAX_LONG_PATH], wotype[4]; if (filename && !strcmp(filename, "/dev/null")) wcscpy(wfilename, L"nul"); else if (!is_valid_win32_path(filename, 1)) { int create = otype && strchr(otype, 'w'); errno = create ? EINVAL : ENOENT; return NULL; - } else if (xutftowcs_path(wfilename, filename) < 0) + } else if (xutftowcs_long_path(wfilename, filename) < 0) return NULL; if (xutftowcs(wotype, otype, ARRAY_SIZE(wotype)) < 0) @@ -733,7 +757,7 @@ ssize_t mingw_write(int fd, const void *buf, size_t len) HANDLE h = (HANDLE) _get_osfhandle(fd); if (GetFileType(h) != FILE_TYPE_PIPE) { if (orig == EINVAL) { - wchar_t path[MAX_PATH]; + wchar_t path[MAX_LONG_PATH]; DWORD ret = GetFinalPathNameByHandleW(h, path, ARRAY_SIZE(path), 0); UINT drive_type = ret > 0 && ret < ARRAY_SIZE(path) ? @@ -770,27 +794,33 @@ ssize_t mingw_write(int fd, const void *buf, size_t len) int mingw_access(const char *filename, int mode) { - wchar_t wfilename[MAX_PATH]; + wchar_t wfilename[MAX_LONG_PATH]; if (!strcmp("nul", filename) || !strcmp("/dev/null", filename)) return 0; - if (xutftowcs_path(wfilename, filename) < 0) + if (xutftowcs_long_path(wfilename, filename) < 0) return -1; /* X_OK is not supported by the MSVCRT version */ return _waccess(wfilename, mode & ~X_OK); } +/* cached length of current directory for handle_long_path */ +static int current_directory_len = 0; + int mingw_chdir(const char *dirname) { - wchar_t wdirname[MAX_PATH]; - if (xutftowcs_path(wdirname, dirname) < 0) + int result; + wchar_t wdirname[MAX_LONG_PATH]; + if (xutftowcs_long_path(wdirname, dirname) < 0) return -1; - return _wchdir(wdirname); + result = _wchdir(wdirname); + current_directory_len = GetCurrentDirectoryW(0, NULL); + return result; } int mingw_chmod(const char *filename, int mode) { - wchar_t wfilename[MAX_PATH]; - if (xutftowcs_path(wfilename, filename) < 0) + wchar_t wfilename[MAX_LONG_PATH]; + if (xutftowcs_long_path(wfilename, filename) < 0) return -1; return _wchmod(wfilename, mode); } @@ -838,8 +868,8 @@ static int has_valid_directory_prefix(wchar_t *wfilename) static int do_lstat(int follow, const char *file_name, struct stat *buf) { WIN32_FILE_ATTRIBUTE_DATA fdata; - wchar_t wfilename[MAX_PATH]; - if (xutftowcs_path(wfilename, file_name) < 0) + wchar_t wfilename[MAX_LONG_PATH]; + if (xutftowcs_long_path(wfilename, file_name) < 0) return -1; if (GetFileAttributesExW(wfilename, GetFileExInfoStandard, &fdata)) { @@ -1010,10 +1040,10 @@ int mingw_utime (const char *file_name, const struct utimbuf *times) FILETIME mft, aft; int rc; DWORD attrs; - wchar_t wfilename[MAX_PATH]; + wchar_t wfilename[MAX_LONG_PATH]; HANDLE osfilehandle; - if (xutftowcs_path(wfilename, file_name) < 0) + if (xutftowcs_long_path(wfilename, file_name) < 0) return -1; /* must have write permission */ @@ -1096,6 +1126,7 @@ char *mingw_mktemp(char *template) wchar_t wtemplate[MAX_PATH]; int offset = 0; + /* we need to return the path, thus no long paths here! */ if (xutftowcs_path(wtemplate, template) < 0) return NULL; @@ -1747,6 +1778,10 @@ static pid_t mingw_spawnve_fd(const char *cmd, const char **argv, char **deltaen if (*argv && !strcmp(cmd, *argv)) wcmd[0] = L'\0'; + /* + * Paths to executables and to the current directory do not support + * long paths, therefore we cannot use xutftowcs_long_path() here. + */ else if (xutftowcs_path(wcmd, cmd) < 0) return -1; if (dir && xutftowcs_path(wdir, dir) < 0) @@ -2395,8 +2430,9 @@ int mingw_rename(const char *pold, const char *pnew) { DWORD attrs, gle; int tries = 0; - wchar_t wpold[MAX_PATH], wpnew[MAX_PATH]; - if (xutftowcs_path(wpold, pold) < 0 || xutftowcs_path(wpnew, pnew) < 0) + wchar_t wpold[MAX_LONG_PATH], wpnew[MAX_LONG_PATH]; + if (xutftowcs_long_path(wpold, pold) < 0 || + xutftowcs_long_path(wpnew, pnew) < 0) return -1; /* @@ -2714,9 +2750,9 @@ int mingw_raise(int sig) int link(const char *oldpath, const char *newpath) { - wchar_t woldpath[MAX_PATH], wnewpath[MAX_PATH]; - if (xutftowcs_path(woldpath, oldpath) < 0 || - xutftowcs_path(wnewpath, newpath) < 0) + wchar_t woldpath[MAX_LONG_PATH], wnewpath[MAX_LONG_PATH]; + if (xutftowcs_long_path(woldpath, oldpath) < 0 || + xutftowcs_long_path(wnewpath, newpath) < 0) return -1; if (!CreateHardLinkW(wnewpath, woldpath, NULL)) { @@ -2784,8 +2820,8 @@ int mingw_is_mount_point(struct strbuf *path) { WIN32_FIND_DATAW findbuf = { 0 }; HANDLE handle; - wchar_t wfilename[MAX_PATH]; - int wlen = xutftowcs_path(wfilename, path->buf); + wchar_t wfilename[MAX_LONG_PATH]; + int wlen = xutftowcs_long_path(wfilename, path->buf); if (wlen < 0) die(_("could not get long path for '%s'"), path->buf); @@ -2930,9 +2966,9 @@ static size_t append_system_bin_dirs(char *path, size_t size) static int is_system32_path(const char *path) { - WCHAR system32[MAX_PATH], wpath[MAX_PATH]; + WCHAR system32[MAX_LONG_PATH], wpath[MAX_LONG_PATH]; - if (xutftowcs_path(wpath, path) < 0 || + if (xutftowcs_long_path(wpath, path) < 0 || !GetSystemDirectoryW(system32, ARRAY_SIZE(system32)) || _wcsicmp(system32, wpath)) return 0; @@ -3344,6 +3380,68 @@ int is_valid_win32_path(const char *path, int allow_literal_nul) } } +int handle_long_path(wchar_t *path, int len, int max_path, int expand) +{ + int result; + wchar_t buf[MAX_LONG_PATH]; + + /* + * we don't need special handling if path is relative to the current + * directory, and current directory + path don't exceed the desired + * max_path limit. This should cover > 99 % of cases with minimal + * performance impact (git almost always uses relative paths). + */ + if ((len < 2 || (!is_dir_sep(path[0]) && path[1] != ':')) && + (current_directory_len + len < max_path)) + return len; + + /* + * handle everything else: + * - absolute paths: "C:\dir\file" + * - absolute UNC paths: "\\server\share\dir\file" + * - absolute paths on current drive: "\dir\file" + * - relative paths on other drive: "X:file" + * - prefixed paths: "\\?\...", "\\.\..." + */ + + /* convert to absolute path using GetFullPathNameW */ + result = GetFullPathNameW(path, MAX_LONG_PATH, buf, NULL); + if (!result) { + errno = err_win_to_posix(GetLastError()); + return -1; + } + + /* + * return absolute path if it fits within max_path (even if + * "cwd + path" doesn't due to '..' components) + */ + if (result < max_path) { + wcscpy(path, buf); + return result; + } + + /* error out if we shouldn't expand the path or buf is too small */ + if (!expand || result >= MAX_LONG_PATH - 6) { + errno = ENAMETOOLONG; + return -1; + } + + /* prefix full path with "\\?\" or "\\?\UNC\" */ + if (buf[0] == '\\') { + /* ...unless already prefixed */ + if (buf[1] == '\\' && (buf[2] == '?' || buf[2] == '.')) + return len; + + wcscpy(path, L"\\\\?\\UNC\\"); + wcscpy(path + 8, buf + 2); + return result + 6; + } else { + wcscpy(path, L"\\\\?\\"); + wcscpy(path + 4, buf); + return result + 4; + } +} + #if !defined(_MSC_VER) /* * Disable MSVCRT command line wildcard expansion (__getmainargs called from @@ -3505,6 +3603,9 @@ int wmain(int argc, const wchar_t **wargv) /* initialize Unicode console */ winansi_init(); + /* init length of current directory for handle_long_path */ + current_directory_len = GetCurrentDirectoryW(0, NULL); + /* invoke the real main() using our utf8 version of argv. */ exit_status = main(argc, argv); diff --git a/compat/mingw.h b/compat/mingw.h index 97cdb8d51631ea..bc1d36b49d5a17 100644 --- a/compat/mingw.h +++ b/compat/mingw.h @@ -12,6 +12,7 @@ typedef _sigset_t sigset_t; #endif extern int core_fscache; +int are_long_paths_enabled(void); struct config_context; int mingw_core_config(const char *var, const char *value, @@ -520,6 +521,42 @@ int is_path_owned_by_current_sid(const char *path, struct strbuf *report); int is_valid_win32_path(const char *path, int allow_literal_nul); #define is_valid_path(path) is_valid_win32_path(path, 0) +/** + * Max length of long paths (exceeding MAX_PATH). The actual maximum supported + * by NTFS is 32,767 (* sizeof(wchar_t)), but we choose an arbitrary smaller + * value to limit required stack memory. + */ +#define MAX_LONG_PATH 4096 + +/** + * Handles paths that would exceed the MAX_PATH limit of Windows Unicode APIs. + * + * With expand == false, the function checks for over-long paths and fails + * with ENAMETOOLONG. The path parameter is not modified, except if cwd + path + * exceeds max_path, but the resulting absolute path doesn't (e.g. due to + * eliminating '..' components). The path parameter must point to a buffer + * of max_path wide characters. + * + * With expand == true, an over-long path is automatically converted in place + * to an absolute path prefixed with '\\?\', and the new length is returned. + * The path parameter must point to a buffer of MAX_LONG_PATH wide characters. + * + * Parameters: + * path: path to check and / or convert + * len: size of path on input (number of wide chars without \0) + * max_path: max short path length to check (usually MAX_PATH = 260, but just + * 248 for CreateDirectoryW) + * expand: false to only check the length, true to expand the path to a + * '\\?\'-prefixed absolute path + * + * Return: + * length of the resulting path, or -1 on failure + * + * Errors: + * ENAMETOOLONG if path is too long + */ +int handle_long_path(wchar_t *path, int len, int max_path, int expand); + /** * Converts UTF-8 encoded string to UTF-16LE. * @@ -578,18 +615,46 @@ static inline int xutftowcs(wchar_t *wcs, const char *utf, size_t wcslen) } /** - * Simplified file system specific variant of xutftowcsn, assumes output - * buffer size is MAX_PATH wide chars and input string is \0-terminated, - * fails with ENAMETOOLONG if input string is too long. + * Simplified file system specific wrapper of xutftowcsn and handle_long_path. + * Converts ERANGE to ENAMETOOLONG. If expand is true, wcs must be at least + * MAX_LONG_PATH wide chars (see handle_long_path). */ -static inline int xutftowcs_path(wchar_t *wcs, const char *utf) +static inline int xutftowcs_path_ex(wchar_t *wcs, const char *utf, + size_t wcslen, int utflen, int max_path, int expand) { - int result = xutftowcsn(wcs, utf, MAX_PATH, -1); + int result = xutftowcsn(wcs, utf, wcslen, utflen); if (result < 0 && errno == ERANGE) errno = ENAMETOOLONG; + if (result >= 0) + result = handle_long_path(wcs, result, max_path, expand); return result; } +/** + * Simplified file system specific variant of xutftowcsn, assumes output + * buffer size is MAX_PATH wide chars and input string is \0-terminated, + * fails with ENAMETOOLONG if input string is too long. Typically used for + * Windows APIs that don't support long paths, e.g. SetCurrentDirectory, + * LoadLibrary, CreateProcess... + */ +static inline int xutftowcs_path(wchar_t *wcs, const char *utf) +{ + return xutftowcs_path_ex(wcs, utf, MAX_PATH, -1, MAX_PATH, 0); +} + +/** + * Simplified file system specific variant of xutftowcsn for Windows APIs + * that support long paths via '\\?\'-prefix, assumes output buffer size is + * MAX_LONG_PATH wide chars, fails with ENAMETOOLONG if input string is too + * long. The 'core.longpaths' git-config option controls whether the path + * is only checked or expanded to a long path. + */ +static inline int xutftowcs_long_path(wchar_t *wcs, const char *utf) +{ + return xutftowcs_path_ex(wcs, utf, MAX_LONG_PATH, -1, MAX_PATH, + are_long_paths_enabled()); +} + /** * Converts UTF-16LE encoded string to UTF-8. * diff --git a/compat/win32/dirent.c b/compat/win32/dirent.c index 139d2ba3c4da34..c9fe2454efc01c 100644 --- a/compat/win32/dirent.c +++ b/compat/win32/dirent.c @@ -65,19 +65,24 @@ static int dirent_closedir(dirent_DIR *dir) DIR *dirent_opendir(const char *name) { - wchar_t pattern[MAX_PATH + 2]; /* + 2 for '/' '*' */ + wchar_t pattern[MAX_LONG_PATH + 2]; /* + 2 for "\*" */ WIN32_FIND_DATAW fdata; HANDLE h; int len; dirent_DIR *dir; - /* convert name to UTF-16 and check length < MAX_PATH */ - if ((len = xutftowcs_path(pattern, name)) < 0) + /* convert name to UTF-16 and check length */ + if ((len = xutftowcs_path_ex(pattern, name, MAX_LONG_PATH, -1, + MAX_PATH - 2, + are_long_paths_enabled())) < 0) return NULL; - /* append optional '/' and wildcard '*' */ + /* + * append optional '\' and wildcard '*'. Note: we need to use '\' as + * Windows doesn't translate '/' to '\' for "\\?\"-prefixed paths. + */ if (len && !is_dir_sep(pattern[len - 1])) - pattern[len++] = '/'; + pattern[len++] = '\\'; pattern[len++] = '*'; pattern[len] = 0; @@ -90,7 +95,7 @@ DIR *dirent_opendir(const char *name) } /* initialize DIR structure and copy first dir entry */ - dir = xmalloc(sizeof(dirent_DIR) + MAX_PATH); + dir = xmalloc(sizeof(dirent_DIR) + MAX_LONG_PATH); dir->base_dir.preaddir = (struct dirent *(*)(DIR *dir)) dirent_readdir; dir->base_dir.pclosedir = (int (*)(DIR *dir)) dirent_closedir; dir->dd_handle = h; diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index da19ba0994de58..581808a64b6cd6 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -80,7 +80,7 @@ struct fsentry { struct heap_fsentry { union { struct fsentry ent; - char dummy[sizeof(struct fsentry) + MAX_PATH]; + char dummy[sizeof(struct fsentry) + MAX_LONG_PATH]; } u; }; @@ -123,7 +123,7 @@ static void fsentry_init(struct fsentry *fse, struct fsentry *list, const char *name, size_t len) { fse->list = list; - if (len > MAX_PATH) + if (len > MAX_LONG_PATH) BUG("Trying to allocate fsentry for long path '%.*s'", (int)len, name); memcpy(fse->dirent.d_name, name, len); @@ -224,7 +224,7 @@ static struct fsentry *fseentry_create_entry(struct fscache *cache, static struct fsentry *fsentry_create_list(struct fscache *cache, const struct fsentry *dir, int *dir_not_found) { - wchar_t pattern[MAX_PATH]; + wchar_t pattern[MAX_LONG_PATH]; NTSTATUS status; IO_STATUS_BLOCK iosb; PFILE_FULL_DIR_INFORMATION di; @@ -235,13 +235,11 @@ static struct fsentry *fsentry_create_list(struct fscache *cache, const struct f *dir_not_found = 0; - /* convert name to UTF-16 and check length < MAX_PATH */ - if ((wlen = xutftowcsn(pattern, dir->dirent.d_name, MAX_PATH, - dir->len)) < 0) { - if (errno == ERANGE) - errno = ENAMETOOLONG; + /* convert name to UTF-16 and check length */ + if ((wlen = xutftowcs_path_ex(pattern, dir->dirent.d_name, + MAX_LONG_PATH, dir->len, MAX_PATH - 2, + are_long_paths_enabled())) < 0) return NULL; - } /* handle CWD */ if (!wlen) { diff --git a/t/t2031-checkout-long-paths.sh b/t/t2031-checkout-long-paths.sh new file mode 100755 index 00000000000000..f30f8920ca689c --- /dev/null +++ b/t/t2031-checkout-long-paths.sh @@ -0,0 +1,102 @@ +#!/bin/sh + +test_description='checkout long paths on Windows + +Ensures that Git for Windows can deal with long paths (>260) enabled via core.longpaths' + +. ./test-lib.sh + +if test_have_prereq !MINGW +then + skip_all='skipping MINGW specific long paths test' + test_done +fi + +test_expect_success setup ' + p=longpathxx && # -> 10 + p=$p$p$p$p$p && # -> 50 + p=$p$p$p$p$p && # -> 250 + + path=${p}/longtestfile && # -> 263 (MAX_PATH = 260) + + blob=$(echo foobar | git hash-object -w --stdin) && + + printf "100644 %s 0\t%s\n" "$blob" "$path" | + git update-index --add --index-info && + git commit -m initial -q +' + +test_expect_success 'checkout of long paths without core.longpaths fails' ' + git config core.longpaths false && + test_must_fail git checkout -f 2>error && + grep -q "Filename too long" error && + test ! -d longpa* +' + +test_expect_success 'checkout of long paths with core.longpaths works' ' + git config core.longpaths true && + git checkout -f && + test_path_is_file longpa*/longtestfile +' + +test_expect_success 'update of long paths' ' + echo frotz >>$(ls longpa*/longtestfile) && + echo $path > expect && + git ls-files -m > actual && + test_cmp expect actual && + git add $path && + git commit -m second && + git grep "frotz" HEAD -- $path +' + +test_expect_success cleanup ' + # bash cannot delete the trash dir if it contains a long path + # lets help cleaning up (unless in debug mode) + if test -z "$debug" + then + rm -rf longpa~1 + fi +' + +# check that the template used in the test won't be too long: +abspath="$(pwd)"/testdir +test ${#abspath} -gt 230 || +test_set_prereq SHORTABSPATH + +test_expect_success SHORTABSPATH 'clean up path close to MAX_PATH' ' + p=/123456789abcdef/123456789abcdef/123456789abcdef/123456789abc/ef && + p=y$p$p$p$p && + subdir="x$(echo "$p" | tail -c $((253 - ${#abspath})) - )" && + # Now, $abspath/$subdir has exactly 254 characters, and is inside CWD + p2="$abspath/$subdir" && + test 254 = ${#p2} && + + # Be careful to overcome path limitations of the MSys tools and split + # the $subdir into two parts. ($subdir2 has to contain 16 chars and a + # slash somewhere following; that is why we asked for abspath <= 230 and + # why we placed a slash near the end of the $subdir template.) + subdir2=${subdir#????????????????*/} && + subdir1=testdir/${subdir%/$subdir2} && + mkdir -p "$subdir1" && + i=0 && + # The most important case is when absolute path is 258 characters long, + # and that will be when i == 4. + while test $i -le 7 + do + mkdir -p $subdir2 && + touch $subdir2/one-file && + mv ${subdir2%%/*} "$subdir1/" && + subdir2=z${subdir2} && + i=$(($i+1)) || + exit 1 + done && + + # now check that git is able to clear the tree: + (cd testdir && + git init && + git config core.longpaths yes && + git clean -fdx) && + test ! -d "$subdir1" +' + +test_done diff --git a/t/t7429-submodule-long-path.sh b/t/t7429-submodule-long-path.sh index f692cedbff7ff8..458519eafd6f03 100755 --- a/t/t7429-submodule-long-path.sh +++ b/t/t7429-submodule-long-path.sh @@ -11,15 +11,20 @@ This test verifies that "git submodule" initialization, update and clones work, TEST_NO_CREATE_REPO=1 . ./test-lib.sh -longpath="" -for (( i=0; i<4; i++ )); do - longpath="0123456789abcdefghijklmnopqrstuvwxyz$longpath" -done -# Pick a substring maximum of 90 characters -# This should be good, since we'll add on a lot for temp directories -longpath=${longpath:0:90}; export longpath +# cloning a submodule calls is_git_directory("$path/../.git/modules/$path"), +# which effectively limits the maximum length to PATH_MAX / 2 minus some +# overhead; start with 3 * 36 = 108 chars (test 2 fails if >= 110) +longpath36=0123456789abcdefghijklmnopqrstuvwxyz +longpath180=$longpath36$longpath36$longpath36$longpath36$longpath36 -test_expect_failure 'submodule with a long path' ' +# the git database must fit within PATH_MAX, which limits the submodule name +# to PATH_MAX - len(pwd) - ~90 (= len("/objects//") + 40-byte sha1 + some +# overhead from the test case) +pwd=$(pwd) +pwdlen=$(echo "$pwd" | wc -c) +longpath=$(echo $longpath180 | cut -c 1-$((170-$pwdlen))) + +test_expect_success 'submodule with a long path' ' git config --global protocol.file.allow always && GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \ git -c init.defaultBranch=long init --bare remote && @@ -59,7 +64,7 @@ test_expect_failure 'submodule with a long path' ' ) ' -test_expect_failure 'recursive submodule with a long path' ' +test_expect_success 'recursive submodule with a long path' ' GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME= \ git -c init.defaultBranch=long init --bare super && test_create_repo child && @@ -101,6 +106,5 @@ test_expect_failure 'recursive submodule with a long path' ' ) ) ' -unset longpath test_done From 6e351ecd8ea938ce72a153744c5bc5785e72b336 Mon Sep 17 00:00:00 2001 From: Karsten Blees <blees@dcon.de> Date: Sat, 9 May 2015 02:11:48 +0200 Subject: [PATCH 224/297] compat/terminal.c: only use the Windows console if bash 'read -r' fails MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Accessing the Windows console through the special CONIN$ / CONOUT$ devices doesn't work properly for non-ASCII usernames an passwords. It also doesn't work for terminal emulators that hide the native console window (such as mintty), and 'TERM=xterm*' is not necessarily a reliable indicator for such terminals. The new shell_prompt() function, on the other hand, works fine for both MSys1 and MSys2, in native console windows as well as mintty, and properly supports Unicode. It just needs bash on the path (for 'read -s', which is bash-specific). On Windows, try to use the shell to read from the terminal. If that fails with ENOENT (i.e. bash was not found), use CONIN/OUT as fallback. Note: To test this, create a UTF-8 credential file with non-ASCII chars, e.g. in git-bash: 'echo url=http://täst.com > cred.txt'. Then in git-cmd, 'git credential fill <cred.txt' works (shell version), while calling git without the git-wrapper (i.e. 'mingw64\bin\git credential fill <cred.txt') mangles non-ASCII chars in both console output and input. Signed-off-by: Karsten Blees <blees@dcon.de> --- compat/terminal.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/compat/terminal.c b/compat/terminal.c index 2a2611855b877f..e8994aaf0ed9bc 100644 --- a/compat/terminal.c +++ b/compat/terminal.c @@ -435,6 +435,7 @@ static char *shell_prompt(const char *prompt, int echo) strvec_pushv(&child.args, read_input); child.in = -1; child.out = -1; + child.silent_exec_failure = 1; if (start_command(&child)) return NULL; @@ -478,11 +479,14 @@ char *git_terminal_prompt(const char *prompt, int echo) static struct strbuf buf = STRBUF_INIT; int r; FILE *input_fh, *output_fh; + #ifdef GIT_WINDOWS_NATIVE - const char *term = getenv("TERM"); - if (term && starts_with(term, "xterm")) - return shell_prompt(prompt, echo); + /* try shell_prompt first, fall back to CONIN/OUT if bash is missing */ + char *result = shell_prompt(prompt, echo); + if (result || errno != ENOENT) + return result; + #endif input_fh = fopen(INPUT_PATH, "r" FORCE_TEXT); From f0c7c00a1e2abca70a8dea8f78ceeaad343c8d6b Mon Sep 17 00:00:00 2001 From: Karsten Blees <blees@dcon.de> Date: Sat, 5 Jul 2014 00:00:36 +0200 Subject: [PATCH 225/297] Win32: fix 'lstat("dir/")' with long paths Use a suffciently large buffer to strip the trailing slash. Signed-off-by: Karsten Blees <blees@dcon.de> --- compat/mingw.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 6b85015f73ab6c..d4af216d7b5064 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -940,7 +940,7 @@ static int do_lstat(int follow, const char *file_name, struct stat *buf) static int do_stat_internal(int follow, const char *file_name, struct stat *buf) { int namelen; - char alt_name[PATH_MAX]; + char alt_name[MAX_LONG_PATH]; if (!do_lstat(follow, file_name, buf)) return 0; @@ -956,7 +956,7 @@ static int do_stat_internal(int follow, const char *file_name, struct stat *buf) return -1; while (namelen && file_name[namelen-1] == '/') --namelen; - if (!namelen || namelen >= PATH_MAX) + if (!namelen || namelen >= MAX_LONG_PATH) return -1; memcpy(alt_name, file_name, namelen); From 3dab29b4ae801f853be2f1761ca268f3be403888 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Fri, 23 Feb 2018 02:50:03 +0100 Subject: [PATCH 226/297] mingw (git_terminal_prompt): do fall back to CONIN$/CONOUT$ method To support Git Bash running in a MinTTY, we use a dirty trick to access the MSYS2 pseudo terminal: we execute a Bash snippet that accesses /dev/tty. The idea was to fall back to writing to/reading from CONOUT$/CONIN$ if that Bash call failed because Bash was not found. However, we should fall back even in other error conditions, because we have not successfully read the user input. Let's make it so. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- compat/terminal.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/compat/terminal.c b/compat/terminal.c index e8994aaf0ed9bc..66bf4b55d0b49f 100644 --- a/compat/terminal.c +++ b/compat/terminal.c @@ -484,7 +484,7 @@ char *git_terminal_prompt(const char *prompt, int echo) /* try shell_prompt first, fall back to CONIN/OUT if bash is missing */ char *result = shell_prompt(prompt, echo); - if (result || errno != ENOENT) + if (result) return result; #endif From 2a1b191e70645cc51a9fb438a75b9fbb8935bd0c Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Wed, 6 Sep 2023 09:14:47 +0200 Subject: [PATCH 227/297] win32(long path support): leave drive-less absolute paths intact When trying to ensure that long paths are handled correctly, we first normalize absolute paths as we encounter them. However, if the path is a so-called "drive-less" absolute path, i.e. if it is relative to the current drive but _does_ start with a directory separator, we would want the normalized path to be such a drive-less absolute path, too. Let's do that, being careful to still include the drive prefix when we need to go through the `\\?\` dance (because there, the drive prefix is absolutely required). This fixes https://github.com/git-for-windows/git/issues/4586. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- compat/mingw.c | 12 +++++++++++- t/t2031-checkout-long-paths.sh | 9 +++++++++ 2 files changed, 20 insertions(+), 1 deletion(-) diff --git a/compat/mingw.c b/compat/mingw.c index d4af216d7b5064..92c8140841d6f2 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -304,6 +304,11 @@ int mingw_core_config(const char *var, const char *value, return 0; } +static inline int is_wdir_sep(wchar_t wchar) +{ + return wchar == L'/' || wchar == L'\\'; +} + /* Normalizes NT paths as returned by some low-level APIs. */ static wchar_t *normalize_ntpath(wchar_t *wbuf) { @@ -3416,7 +3421,12 @@ int handle_long_path(wchar_t *path, int len, int max_path, int expand) * "cwd + path" doesn't due to '..' components) */ if (result < max_path) { - wcscpy(path, buf); + /* Be careful not to add a drive prefix if there was none */ + if (is_wdir_sep(path[0]) && + !is_wdir_sep(buf[0]) && buf[1] == L':' && is_wdir_sep(buf[2])) + wcscpy(path, buf + 2); + else + wcscpy(path, buf); return result; } diff --git a/t/t2031-checkout-long-paths.sh b/t/t2031-checkout-long-paths.sh index f30f8920ca689c..15416a1d6ee8c7 100755 --- a/t/t2031-checkout-long-paths.sh +++ b/t/t2031-checkout-long-paths.sh @@ -99,4 +99,13 @@ test_expect_success SHORTABSPATH 'clean up path close to MAX_PATH' ' test ! -d "$subdir1" ' +test_expect_success SYMLINKS_WINDOWS 'leave drive-less, short paths intact' ' + printf "/Program Files" >symlink-target && + symlink_target_oid="$(git hash-object -w --stdin <symlink-target)" && + git update-index --add --cacheinfo 120000,$symlink_target_oid,PF && + git -c core.symlinks=true checkout -- PF && + cmd //c dir >actual && + grep "<SYMLINKD\\?> *PF *\\[\\\\Program Files\\]" actual +' + test_done From 12a0a6fd0d26258955244b853d3b40d24f4bbc2b Mon Sep 17 00:00:00 2001 From: Jeff Hostetler <jeffhost@microsoft.com> Date: Fri, 25 Mar 2022 16:56:04 -0400 Subject: [PATCH 228/297] compat/fsmonitor/fsm-*-win32: support long paths Update wchar_t buffers to use MAX_LONG_PATH instead of MAX_PATH and call xutftowcs_long_path() in the Win32 backend source files. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> --- compat/fsmonitor/fsm-health-win32.c | 6 +++--- compat/fsmonitor/fsm-listen-win32.c | 18 +++++++++--------- compat/fsmonitor/fsm-path-utils-win32.c | 8 ++++---- 3 files changed, 16 insertions(+), 16 deletions(-) diff --git a/compat/fsmonitor/fsm-health-win32.c b/compat/fsmonitor/fsm-health-win32.c index 2aa8c219acee4d..4b53360d194105 100644 --- a/compat/fsmonitor/fsm-health-win32.c +++ b/compat/fsmonitor/fsm-health-win32.c @@ -34,7 +34,7 @@ struct fsm_health_data struct wt_moved { - wchar_t wpath[MAX_PATH + 1]; + wchar_t wpath[MAX_LONG_PATH + 1]; BY_HANDLE_FILE_INFORMATION bhfi; } wt_moved; }; @@ -143,8 +143,8 @@ static int has_worktree_moved(struct fsmonitor_daemon_state *state, return 0; case CTX_INIT: - if (xutftowcs_path(data->wt_moved.wpath, - state->path_worktree_watch.buf) < 0) { + if (xutftowcs_long_path(data->wt_moved.wpath, + state->path_worktree_watch.buf) < 0) { error(_("could not convert to wide characters: '%s'"), state->path_worktree_watch.buf); return -1; diff --git a/compat/fsmonitor/fsm-listen-win32.c b/compat/fsmonitor/fsm-listen-win32.c index 5a21dade7b8659..98b73f63170b16 100644 --- a/compat/fsmonitor/fsm-listen-win32.c +++ b/compat/fsmonitor/fsm-listen-win32.c @@ -28,7 +28,7 @@ struct one_watch DWORD count; struct strbuf path; - wchar_t wpath_longname[MAX_PATH + 1]; + wchar_t wpath_longname[MAX_LONG_PATH + 1]; DWORD wpath_longname_len; HANDLE hDir; @@ -131,8 +131,8 @@ static int normalize_path_in_utf8(wchar_t *wpath, DWORD wpath_len, */ static void check_for_shortnames(struct one_watch *watch) { - wchar_t buf_in[MAX_PATH + 1]; - wchar_t buf_out[MAX_PATH + 1]; + wchar_t buf_in[MAX_LONG_PATH + 1]; + wchar_t buf_out[MAX_LONG_PATH + 1]; wchar_t *last; wchar_t *p; @@ -197,8 +197,8 @@ static enum get_relative_result get_relative_longname( const wchar_t *wpath, DWORD wpath_len, wchar_t *wpath_longname, size_t bufsize_wpath_longname) { - wchar_t buf_in[2 * MAX_PATH + 1]; - wchar_t buf_out[MAX_PATH + 1]; + wchar_t buf_in[2 * MAX_LONG_PATH + 1]; + wchar_t buf_out[MAX_LONG_PATH + 1]; DWORD root_len; DWORD out_len; @@ -298,10 +298,10 @@ static struct one_watch *create_watch(const char *path) FILE_SHARE_WRITE | FILE_SHARE_READ | FILE_SHARE_DELETE; HANDLE hDir; DWORD len_longname; - wchar_t wpath[MAX_PATH + 1]; - wchar_t wpath_longname[MAX_PATH + 1]; + wchar_t wpath[MAX_LONG_PATH + 1]; + wchar_t wpath_longname[MAX_LONG_PATH + 1]; - if (xutftowcs_path(wpath, path) < 0) { + if (xutftowcs_long_path(wpath, path) < 0) { error(_("could not convert to wide characters: '%s'"), path); return NULL; } @@ -545,7 +545,7 @@ static int process_worktree_events(struct fsmonitor_daemon_state *state) struct string_list cookie_list = STRING_LIST_INIT_DUP; struct fsmonitor_batch *batch = NULL; const char *p = watch->buffer; - wchar_t wpath_longname[MAX_PATH + 1]; + wchar_t wpath_longname[MAX_LONG_PATH + 1]; /* * If the kernel gets more events than will fit in the kernel diff --git a/compat/fsmonitor/fsm-path-utils-win32.c b/compat/fsmonitor/fsm-path-utils-win32.c index f4f9cc1f336720..c6eb065bde48b4 100644 --- a/compat/fsmonitor/fsm-path-utils-win32.c +++ b/compat/fsmonitor/fsm-path-utils-win32.c @@ -69,8 +69,8 @@ static int check_remote_protocol(wchar_t *wpath) */ int fsmonitor__get_fs_info(const char *path, struct fs_info *fs_info) { - wchar_t wpath[MAX_PATH]; - wchar_t wfullpath[MAX_PATH]; + wchar_t wpath[MAX_LONG_PATH]; + wchar_t wfullpath[MAX_LONG_PATH]; size_t wlen; UINT driveType; @@ -78,7 +78,7 @@ int fsmonitor__get_fs_info(const char *path, struct fs_info *fs_info) * Do everything in wide chars because the drive letter might be * a multi-byte sequence. See win32_has_dos_drive_prefix(). */ - if (xutftowcs_path(wpath, path) < 0) { + if (xutftowcs_long_path(wpath, path) < 0) { return -1; } @@ -97,7 +97,7 @@ int fsmonitor__get_fs_info(const char *path, struct fs_info *fs_info) * slashes to backslashes. This is essential to get GetDriveTypeW() * correctly handle some UNC "\\server\share\..." paths. */ - if (!GetFullPathNameW(wpath, MAX_PATH, wfullpath, NULL)) { + if (!GetFullPathNameW(wpath, MAX_LONG_PATH, wfullpath, NULL)) { return -1; } From caae38a27fb46f39f1303931f95d09b6351c6bb8 Mon Sep 17 00:00:00 2001 From: Ben Boeckel <mathstuf@gmail.com> Date: Fri, 22 Apr 2022 09:06:23 -0400 Subject: [PATCH 229/297] clean: suggest using `core.longPaths` if paths are too long to remove On Windows, git repositories may have extra files which need cleaned (e.g., a build directory) that may be arbitrarily deep. Suggest using `core.longPaths` if such situations are encountered. Fixes: #2715 Signed-off-by: Ben Boeckel <mathstuf@gmail.com> --- Documentation/config/advice.txt | 3 +++ advice.c | 1 + advice.h | 1 + builtin/clean.c | 13 +++++++++++++ 4 files changed, 18 insertions(+) diff --git a/Documentation/config/advice.txt b/Documentation/config/advice.txt index 0ba89898207f0c..5d716c6b2871b6 100644 --- a/Documentation/config/advice.txt +++ b/Documentation/config/advice.txt @@ -58,6 +58,9 @@ advice.*:: set their identity configuration. mergeConflict:: Shown when various commands stop because of conflicts. + nameTooLong:: + Advice shown if a filepath operation is attempted where the + path was too long. nestedTag:: Shown when a user attempts to recursively tag a tag object. pushAlreadyExists:: diff --git a/advice.c b/advice.c index 6b879d805c0304..3f4fed422f1e8d 100644 --- a/advice.c +++ b/advice.c @@ -59,6 +59,7 @@ static struct { [ADVICE_IGNORED_HOOK] = { "ignoredHook" }, [ADVICE_IMPLICIT_IDENTITY] = { "implicitIdentity" }, [ADVICE_MERGE_CONFLICT] = { "mergeConflict" }, + [ADVICE_NAME_TOO_LONG] = { "nameTooLong" }, [ADVICE_NESTED_TAG] = { "nestedTag" }, [ADVICE_OBJECT_NAME_WARNING] = { "objectNameWarning" }, [ADVICE_PUSH_ALREADY_EXISTS] = { "pushAlreadyExists" }, diff --git a/advice.h b/advice.h index d7466bc0ef2026..972c46c3c434e6 100644 --- a/advice.h +++ b/advice.h @@ -26,6 +26,7 @@ enum advice_type { ADVICE_IGNORED_HOOK, ADVICE_IMPLICIT_IDENTITY, ADVICE_MERGE_CONFLICT, + ADVICE_NAME_TOO_LONG, ADVICE_NESTED_TAG, ADVICE_OBJECT_NAME_WARNING, ADVICE_PUSH_ALREADY_EXISTS, diff --git a/builtin/clean.c b/builtin/clean.c index e50e1bb39d8fe1..5f6ba1cd7891a9 100644 --- a/builtin/clean.c +++ b/builtin/clean.c @@ -23,6 +23,7 @@ #include "pathspec.h" #include "help.h" #include "prompt.h" +#include "advice.h" static int require_force = -1; /* unset */ static int interactive; @@ -218,6 +219,9 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag, quote_path(path->buf, prefix, "ed, 0); errno = saved_errno; warning_errno(_(msg_warn_remove_failed), quoted.buf); + if (saved_errno == ENAMETOOLONG) { + advise_if_enabled(ADVICE_NAME_TOO_LONG, _("Setting `core.longPaths` may allow the deletion to succeed.")); + } *dir_gone = 0; } ret = res; @@ -253,6 +257,9 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag, quote_path(path->buf, prefix, "ed, 0); errno = saved_errno; warning_errno(_(msg_warn_remove_failed), quoted.buf); + if (saved_errno == ENAMETOOLONG) { + advise_if_enabled(ADVICE_NAME_TOO_LONG, _("Setting `core.longPaths` may allow the deletion to succeed.")); + } *dir_gone = 0; ret = 1; } @@ -296,6 +303,9 @@ static int remove_dirs(struct strbuf *path, const char *prefix, int force_flag, quote_path(path->buf, prefix, "ed, 0); errno = saved_errno; warning_errno(_(msg_warn_remove_failed), quoted.buf); + if (saved_errno == ENAMETOOLONG) { + advise_if_enabled(ADVICE_NAME_TOO_LONG, _("Setting `core.longPaths` may allow the deletion to succeed.")); + } *dir_gone = 0; ret = 1; } @@ -1105,6 +1115,9 @@ int cmd_clean(int argc, const char **argv, const char *prefix) qname = quote_path(item->string, NULL, &buf, 0); errno = saved_errno; warning_errno(_(msg_warn_remove_failed), qname); + if (saved_errno == ENAMETOOLONG) { + advise_if_enabled(ADVICE_NAME_TOO_LONG, _("Setting `core.longPaths` may allow the deletion to succeed.")); + } errors++; } else if (!quiet) { qname = quote_path(item->string, NULL, &buf, 0); From 08d520df739106c8007a279d0953d3c42d77309c Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Tue, 6 Sep 2016 09:50:33 +0200 Subject: [PATCH 230/297] Unbreak interactive GPG prompt upon signing With the recent update in efee955 (gpg-interface: check gpg signature creation status, 2016-06-17), we ask GPG to send all status updates to stderr, and then catch the stderr in an strbuf. But GPG might fail, and send error messages to stderr. And we simply do not show them to the user. Even worse: this swallows any interactive prompt for a passphrase. And detaches stderr from the tty so that the passphrase cannot be read. So while the first problem could be fixed (by printing the captured stderr upon error), the second problem cannot be easily fixed, and presents a major regression. So let's just revert commit efee9553a4f97b2ecd8f49be19606dd4cf7d9c28. This fixes https://github.com/git-for-windows/git/issues/871 Cc: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- gpg-interface.c | 21 +++------------------ t/t7004-tag.sh | 31 ------------------------------- 2 files changed, 3 insertions(+), 49 deletions(-) diff --git a/gpg-interface.c b/gpg-interface.c index 5c824aeb257cb9..ffa3ce0787a359 100644 --- a/gpg-interface.c +++ b/gpg-interface.c @@ -972,12 +972,9 @@ static int sign_buffer_gpg(struct strbuf *buffer, struct strbuf *signature, struct child_process gpg = CHILD_PROCESS_INIT; int ret; size_t bottom; - const char *cp; - struct strbuf gpg_status = STRBUF_INIT; strvec_pushl(&gpg.args, use_format->program, - "--status-fd=2", "-bsau", signing_key, NULL); @@ -989,23 +986,11 @@ static int sign_buffer_gpg(struct strbuf *buffer, struct strbuf *signature, */ sigchain_push(SIGPIPE, SIG_IGN); ret = pipe_command(&gpg, buffer->buf, buffer->len, - signature, 1024, &gpg_status, 0); + signature, 1024, NULL, 0); sigchain_pop(SIGPIPE); - for (cp = gpg_status.buf; - cp && (cp = strstr(cp, "[GNUPG:] SIG_CREATED ")); - cp++) { - if (cp == gpg_status.buf || cp[-1] == '\n') - break; /* found */ - } - ret |= !cp; - if (ret) { - error(_("gpg failed to sign the data:\n%s"), - gpg_status.len ? gpg_status.buf : "(no gpg output)"); - strbuf_release(&gpg_status); - return -1; - } - strbuf_release(&gpg_status); + if (ret || signature->len == bottom) + return error(_("gpg failed to sign the data")); /* Strip CR from the line endings, in case we are on Windows. */ remove_cr_after(signature, bottom); diff --git a/t/t7004-tag.sh b/t/t7004-tag.sh index fa6336edf98350..e253b520f237a4 100755 --- a/t/t7004-tag.sh +++ b/t/t7004-tag.sh @@ -1517,30 +1517,6 @@ test_expect_success GPG \ 'test_config user.signingkey BobTheMouse && test_must_fail git tag -s -m tail tag-gpg-failure' -# try to produce invalid signature -test_expect_success GPG \ - 'git tag -s fails if gpg is misconfigured (bad signature format)' \ - 'test_config gpg.program echo && - test_must_fail git tag -s -m tail tag-gpg-failure' - -# try to produce invalid signature -test_expect_success GPG 'git verifies tag is valid with double signature' ' - git tag -s -m tail tag-gpg-double-sig && - git cat-file tag tag-gpg-double-sig >tag && - othersigheader=$(test_oid othersigheader) && - sed -ne "/^\$/q;p" tag >new-tag && - cat <<-EOM >>new-tag && - $othersigheader -----BEGIN PGP SIGNATURE----- - someinvaliddata - -----END PGP SIGNATURE----- - EOM - sed -e "1,/^tagger/d" tag >>new-tag && - new_tag=$(git hash-object -t tag -w new-tag) && - git update-ref refs/tags/tag-gpg-double-sig $new_tag && - git verify-tag tag-gpg-double-sig && - git fsck -' - # try to sign with bad user.signingkey test_expect_success GPGSM \ 'git tag -s fails if gpgsm is misconfigured (bad key)' \ @@ -1548,13 +1524,6 @@ test_expect_success GPGSM \ test_config gpg.format x509 && test_must_fail git tag -s -m tail tag-gpg-failure' -# try to produce invalid signature -test_expect_success GPGSM \ - 'git tag -s fails if gpgsm is misconfigured (bad signature format)' \ - 'test_config gpg.x509.program echo && - test_config gpg.format x509 && - test_must_fail git tag -s -m tail tag-gpg-failure' - # try to verify without gpg: rm -rf gpghome From a1f7c570b52801675e118ef84ba4308289685cc0 Mon Sep 17 00:00:00 2001 From: Karsten Blees <blees@dcon.de> Date: Mon, 11 May 2015 19:54:23 +0200 Subject: [PATCH 231/297] strbuf_readlink: don't call readlink twice if hint is the exact link size strbuf_readlink() calls readlink() twice if the hint argument specifies the exact size of the link target (e.g. by passing stat.st_size as returned by lstat()). This is necessary because 'readlink(..., hint) == hint' could mean that the buffer was too small. Use hint + 1 as buffer size to prevent this. Signed-off-by: Karsten Blees <blees@dcon.de> --- strbuf.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/strbuf.c b/strbuf.c index 3d2189a7f648dc..96779071e03343 100644 --- a/strbuf.c +++ b/strbuf.c @@ -574,12 +574,12 @@ int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint) while (hint < STRBUF_MAXLINK) { ssize_t len; - strbuf_grow(sb, hint); - len = readlink(path, sb->buf, hint); + strbuf_grow(sb, hint + 1); + len = readlink(path, sb->buf, hint + 1); if (len < 0) { if (errno != ERANGE) break; - } else if (len < hint) { + } else if (len <= hint) { strbuf_setlen(sb, len); return 0; } From 5bcdd25640baed24b1af707dc95ed792cdb8174b Mon Sep 17 00:00:00 2001 From: Karsten Blees <blees@dcon.de> Date: Mon, 11 May 2015 22:15:40 +0200 Subject: [PATCH 232/297] strbuf_readlink: support link targets that exceed PATH_MAX strbuf_readlink() refuses to read link targets that exceed PATH_MAX (even if a sufficient size was specified by the caller). As some platforms support longer paths, remove this restriction (similar to strbuf_getcwd()). Signed-off-by: Karsten Blees <blees@dcon.de> --- strbuf.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/strbuf.c b/strbuf.c index 96779071e03343..9b427403c46c2b 100644 --- a/strbuf.c +++ b/strbuf.c @@ -562,8 +562,6 @@ ssize_t strbuf_write(struct strbuf *sb, FILE *f) return sb->len ? fwrite(sb->buf, 1, sb->len, f) : 0; } -#define STRBUF_MAXLINK (2*PATH_MAX) - int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint) { size_t oldalloc = sb->alloc; @@ -571,7 +569,7 @@ int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint) if (hint < 32) hint = 32; - while (hint < STRBUF_MAXLINK) { + for (;;) { ssize_t len; strbuf_grow(sb, hint + 1); From 0ccd0dd1d3e25ff2bfdbb8864d258302d08855c1 Mon Sep 17 00:00:00 2001 From: Karsten Blees <blees@dcon.de> Date: Mon, 11 May 2015 19:58:14 +0200 Subject: [PATCH 233/297] lockfile.c: use is_dir_sep() instead of hardcoded '/' checks Signed-off-by: Karsten Blees <blees@dcon.de> --- lockfile.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/lockfile.c b/lockfile.c index 1d5ed016828746..67082a9caaeb18 100644 --- a/lockfile.c +++ b/lockfile.c @@ -19,14 +19,14 @@ static void trim_last_path_component(struct strbuf *path) int i = path->len; /* back up past trailing slashes, if any */ - while (i && path->buf[i - 1] == '/') + while (i && is_dir_sep(path->buf[i - 1])) i--; /* * then go backwards until a slash, or the beginning of the * string */ - while (i && path->buf[i - 1] != '/') + while (i && !is_dir_sep(path->buf[i - 1])) i--; strbuf_setlen(path, i); From a1ba0b3a8524440c51c48a3ec020db0e24e7a3ea Mon Sep 17 00:00:00 2001 From: Karsten Blees <blees@dcon.de> Date: Tue, 12 May 2015 11:09:01 +0200 Subject: [PATCH 234/297] Win32: don't call GetFileAttributes twice in mingw_lstat() GetFileAttributes cannot handle paths with trailing dir separator. The current [l]stat implementation calls GetFileAttributes twice if the path has trailing slashes (first with the original path passed to [l]stat, and and a second time with a path copy with trailing '/' removed). With Unicode conversion, we get the length of the path for free and also have a (wide char) buffer that can be modified. Remove trailing directory separators before calling the Win32 API. Signed-off-by: Karsten Blees <blees@dcon.de> --- compat/mingw.c | 48 ++++++++++++------------------------------------ 1 file changed, 12 insertions(+), 36 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 92c8140841d6f2..bacecc82d1427e 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -874,8 +874,17 @@ static int do_lstat(int follow, const char *file_name, struct stat *buf) { WIN32_FILE_ATTRIBUTE_DATA fdata; wchar_t wfilename[MAX_LONG_PATH]; - if (xutftowcs_long_path(wfilename, file_name) < 0) + int wlen = xutftowcs_long_path(wfilename, file_name); + if (wlen < 0) + return -1; + + /* strip trailing '/', or GetFileAttributes will fail */ + while (wlen && is_dir_sep(wfilename[wlen - 1])) + wfilename[--wlen] = 0; + if (!wlen) { + errno = ENOENT; return -1; + } if (GetFileAttributesExW(wfilename, GetFileExInfoStandard, &fdata)) { buf->st_ino = 0; @@ -936,39 +945,6 @@ static int do_lstat(int follow, const char *file_name, struct stat *buf) return -1; } -/* We provide our own lstat/fstat functions, since the provided - * lstat/fstat functions are so slow. These stat functions are - * tailored for Git's usage (read: fast), and are not meant to be - * complete. Note that Git stat()s are redirected to mingw_lstat() - * too, since Windows doesn't really handle symlinks that well. - */ -static int do_stat_internal(int follow, const char *file_name, struct stat *buf) -{ - int namelen; - char alt_name[MAX_LONG_PATH]; - - if (!do_lstat(follow, file_name, buf)) - return 0; - - /* if file_name ended in a '/', Windows returned ENOENT; - * try again without trailing slashes - */ - if (errno != ENOENT) - return -1; - - namelen = strlen(file_name); - if (namelen && file_name[namelen-1] != '/') - return -1; - while (namelen && file_name[namelen-1] == '/') - --namelen; - if (!namelen || namelen >= MAX_LONG_PATH) - return -1; - - memcpy(alt_name, file_name, namelen); - alt_name[namelen] = 0; - return do_lstat(follow, alt_name, buf); -} - int (*lstat)(const char *file_name, struct stat *buf) = mingw_lstat; static int get_file_info_by_handle(HANDLE hnd, struct stat *buf) @@ -996,11 +972,11 @@ static int get_file_info_by_handle(HANDLE hnd, struct stat *buf) int mingw_lstat(const char *file_name, struct stat *buf) { - return do_stat_internal(0, file_name, buf); + return do_lstat(0, file_name, buf); } int mingw_stat(const char *file_name, struct stat *buf) { - return do_stat_internal(1, file_name, buf); + return do_lstat(1, file_name, buf); } int mingw_fstat(int fd, struct stat *buf) From 6cc554fdd19838a9d39c1e4ed7ea73d050f5f1c2 Mon Sep 17 00:00:00 2001 From: Karsten Blees <blees@dcon.de> Date: Sat, 16 May 2015 01:18:14 +0200 Subject: [PATCH 235/297] Win32: implement stat() with symlink support With respect to symlinks, the current stat() implementation is almost the same as lstat(): except for the file type (st_mode & S_IFMT), it returns information about the link rather than the target. Implement stat by opening the file with as little permissions as possible and calling GetFileInformationByHandle on it. This way, all link resoltion is handled by the Windows file system layer. If symlinks are disabled, use lstat() as before, but fail with ELOOP if a symlink would have to be resolved. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- compat/mingw.c | 19 ++++++++++++++++++- 1 file changed, 18 insertions(+), 1 deletion(-) diff --git a/compat/mingw.c b/compat/mingw.c index bacecc82d1427e..52fb729ada4b3d 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -974,9 +974,26 @@ int mingw_lstat(const char *file_name, struct stat *buf) { return do_lstat(0, file_name, buf); } + int mingw_stat(const char *file_name, struct stat *buf) { - return do_lstat(1, file_name, buf); + wchar_t wfile_name[MAX_LONG_PATH]; + HANDLE hnd; + int result; + + /* open the file and let Windows resolve the links */ + if (xutftowcs_long_path(wfile_name, file_name) < 0) + return -1; + hnd = CreateFileW(wfile_name, 0, + FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, NULL, + OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL); + if (hnd == INVALID_HANDLE_VALUE) { + errno = err_win_to_posix(GetLastError()); + return -1; + } + result = get_file_info_by_handle(hnd, buf); + CloseHandle(hnd); + return result; } int mingw_fstat(int fd, struct stat *buf) From bfb6c707c8008edbf75d4bf013c0a38f1510b545 Mon Sep 17 00:00:00 2001 From: Karsten Blees <blees@dcon.de> Date: Tue, 12 May 2015 00:58:39 +0200 Subject: [PATCH 236/297] Win32: remove separate do_lstat() function With the new mingw_stat() implementation, do_lstat() is only called from mingw_lstat() (with follow == 0). Remove the extra function and the old mingw_stat()-specific (follow == 1) logic. Signed-off-by: Karsten Blees <blees@dcon.de> --- compat/mingw.c | 22 ++-------------------- 1 file changed, 2 insertions(+), 20 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 52fb729ada4b3d..3bd08565626897 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -863,14 +863,7 @@ static int has_valid_directory_prefix(wchar_t *wfilename) return 1; } -/* We keep the do_lstat code in a separate function to avoid recursion. - * When a path ends with a slash, the stat will fail with ENOENT. In - * this case, we strip the trailing slashes and stat again. - * - * If follow is true then act like stat() and report on the link - * target. Otherwise report on the link itself. - */ -static int do_lstat(int follow, const char *file_name, struct stat *buf) +int mingw_lstat(const char *file_name, struct stat *buf) { WIN32_FILE_ATTRIBUTE_DATA fdata; wchar_t wfilename[MAX_LONG_PATH]; @@ -904,13 +897,7 @@ static int do_lstat(int follow, const char *file_name, struct stat *buf) if (handle != INVALID_HANDLE_VALUE) { if ((findbuf.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) && (findbuf.dwReserved0 == IO_REPARSE_TAG_SYMLINK)) { - if (follow) { - char buffer[MAXIMUM_REPARSE_DATA_BUFFER_SIZE]; - buf->st_size = readlink(file_name, buffer, MAXIMUM_REPARSE_DATA_BUFFER_SIZE); - } else { - buf->st_mode = S_IFLNK; - } - buf->st_mode |= S_IREAD; + buf->st_mode = S_IFLNK | S_IREAD; if (!(findbuf.dwFileAttributes & FILE_ATTRIBUTE_READONLY)) buf->st_mode |= S_IWRITE; } @@ -970,11 +957,6 @@ static int get_file_info_by_handle(HANDLE hnd, struct stat *buf) return 0; } -int mingw_lstat(const char *file_name, struct stat *buf) -{ - return do_lstat(0, file_name, buf); -} - int mingw_stat(const char *file_name, struct stat *buf) { wchar_t wfile_name[MAX_LONG_PATH]; From f027e0c1d7333469fb5f4a882f11d23870cf23bd Mon Sep 17 00:00:00 2001 From: Karsten Blees <blees@dcon.de> Date: Sun, 24 May 2015 00:17:56 +0200 Subject: [PATCH 237/297] Win32: let mingw_lstat() error early upon problems with reparse points When obtaining lstat information for reparse points, we need to call FindFirstFile() in addition to GetFileInformationEx() to obtain the type of the reparse point (symlink, mount point etc.). However, currently there is no error handling whatsoever if FindFirstFile() fails. Call FindFirstFile() before modifying the stat *buf output parameter and error out if the call fails. Note: The FindFirstFile() return value includes all the data that we get from GetFileAttributesEx(), so we could replace GetFileAttributesEx() with FindFirstFile(). We don't do that because GetFileAttributesEx() is about twice as fast for single files. I.e. we only pay the extra cost of calling FindFirstFile() in the rare case that we encounter a reparse point. Note: The indentation of the remaining reparse point code will be fixed in the next patch. Signed-off-by: Karsten Blees <blees@dcon.de> --- compat/mingw.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 3bd08565626897..548b2d1b4e93b2 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -866,6 +866,7 @@ static int has_valid_directory_prefix(wchar_t *wfilename) int mingw_lstat(const char *file_name, struct stat *buf) { WIN32_FILE_ATTRIBUTE_DATA fdata; + WIN32_FIND_DATAW findbuf = { 0 }; wchar_t wfilename[MAX_LONG_PATH]; int wlen = xutftowcs_long_path(wfilename, file_name); if (wlen < 0) @@ -880,6 +881,13 @@ int mingw_lstat(const char *file_name, struct stat *buf) } if (GetFileAttributesExW(wfilename, GetFileExInfoStandard, &fdata)) { + /* for reparse points, use FindFirstFile to get the reparse tag */ + if (fdata.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) { + HANDLE handle = FindFirstFileW(wfilename, &findbuf); + if (handle == INVALID_HANDLE_VALUE) + goto error; + FindClose(handle); + } buf->st_ino = 0; buf->st_gid = 0; buf->st_uid = 0; @@ -892,20 +900,16 @@ int mingw_lstat(const char *file_name, struct stat *buf) filetime_to_timespec(&(fdata.ftLastWriteTime), &(buf->st_mtim)); filetime_to_timespec(&(fdata.ftCreationTime), &(buf->st_ctim)); if (fdata.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) { - WIN32_FIND_DATAW findbuf; - HANDLE handle = FindFirstFileW(wfilename, &findbuf); - if (handle != INVALID_HANDLE_VALUE) { if ((findbuf.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) && (findbuf.dwReserved0 == IO_REPARSE_TAG_SYMLINK)) { buf->st_mode = S_IFLNK | S_IREAD; if (!(findbuf.dwFileAttributes & FILE_ATTRIBUTE_READONLY)) buf->st_mode |= S_IWRITE; } - FindClose(handle); - } } return 0; } +error: switch (GetLastError()) { case ERROR_ACCESS_DENIED: case ERROR_SHARING_VIOLATION: From 793e31806f44d8296eaa2523d14267bd9a9a5430 Mon Sep 17 00:00:00 2001 From: Karsten Blees <blees@dcon.de> Date: Tue, 10 Jan 2017 23:21:56 +0100 Subject: [PATCH 238/297] mingw: teach fscache and dirent about symlinks Move S_IFLNK detection to file_attr_to_st_mode() and reuse it in fscache. Implement DT_LNK detection in dirent.c and the fscache readdir version. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- compat/mingw.c | 13 +++---------- compat/win32.h | 6 ++++-- compat/win32/dirent.c | 5 ++++- compat/win32/fscache.c | 11 +++++++---- 4 files changed, 18 insertions(+), 17 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 548b2d1b4e93b2..68d53367015ccc 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -892,21 +892,14 @@ int mingw_lstat(const char *file_name, struct stat *buf) buf->st_gid = 0; buf->st_uid = 0; buf->st_nlink = 1; - buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes); + buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes, + findbuf.dwReserved0); buf->st_size = fdata.nFileSizeLow | (((off_t)fdata.nFileSizeHigh)<<32); buf->st_dev = buf->st_rdev = 0; /* not used by Git */ filetime_to_timespec(&(fdata.ftLastAccessTime), &(buf->st_atim)); filetime_to_timespec(&(fdata.ftLastWriteTime), &(buf->st_mtim)); filetime_to_timespec(&(fdata.ftCreationTime), &(buf->st_ctim)); - if (fdata.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) { - if ((findbuf.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) && - (findbuf.dwReserved0 == IO_REPARSE_TAG_SYMLINK)) { - buf->st_mode = S_IFLNK | S_IREAD; - if (!(findbuf.dwFileAttributes & FILE_ATTRIBUTE_READONLY)) - buf->st_mode |= S_IWRITE; - } - } return 0; } error: @@ -951,7 +944,7 @@ static int get_file_info_by_handle(HANDLE hnd, struct stat *buf) buf->st_gid = 0; buf->st_uid = 0; buf->st_nlink = 1; - buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes); + buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes, 0); buf->st_size = fdata.nFileSizeLow | (((off_t)fdata.nFileSizeHigh)<<32); buf->st_dev = buf->st_rdev = 0; /* not used by Git */ diff --git a/compat/win32.h b/compat/win32.h index a97e880757b6f1..671bcc81f93351 100644 --- a/compat/win32.h +++ b/compat/win32.h @@ -6,10 +6,12 @@ #include <windows.h> #endif -static inline int file_attr_to_st_mode (DWORD attr) +static inline int file_attr_to_st_mode (DWORD attr, DWORD tag) { int fMode = S_IREAD; - if (attr & FILE_ATTRIBUTE_DIRECTORY) + if ((attr & FILE_ATTRIBUTE_REPARSE_POINT) && tag == IO_REPARSE_TAG_SYMLINK) + fMode |= S_IFLNK; + else if (attr & FILE_ATTRIBUTE_DIRECTORY) fMode |= S_IFDIR; else fMode |= S_IFREG; diff --git a/compat/win32/dirent.c b/compat/win32/dirent.c index c9fe2454efc01c..87063101f57202 100644 --- a/compat/win32/dirent.c +++ b/compat/win32/dirent.c @@ -18,7 +18,10 @@ static inline void finddata2dirent(struct dirent *ent, WIN32_FIND_DATAW *fdata) xwcstoutf(ent->d_name, fdata->cFileName, MAX_PATH * 3); /* Set file type, based on WIN32_FIND_DATA */ - if (fdata->dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) + if ((fdata->dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) + && fdata->dwReserved0 == IO_REPARSE_TAG_SYMLINK) + ent->d_type = DT_LNK; + else if (fdata->dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) ent->d_type = DT_DIR; else ent->d_type = DT_REG; diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index 581808a64b6cd6..3016ae61dee5e0 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -202,10 +202,13 @@ static struct fsentry *fseentry_create_entry(struct fscache *cache, fdata->FileAttributes & FILE_ATTRIBUTE_REPARSE_POINT ? fdata->EaSize : 0; - fse->st_mode = file_attr_to_st_mode(fdata->FileAttributes); - fse->dirent.d_type = S_ISDIR(fse->st_mode) ? DT_DIR : DT_REG; - fse->u.s.st_size = fdata->EndOfFile.LowPart | - (((off_t)fdata->EndOfFile.HighPart) << 32); + fse->st_mode = file_attr_to_st_mode(fdata->FileAttributes, + fdata->EaSize); + fse->dirent.d_type = S_ISREG(fse->st_mode) ? DT_REG : + S_ISDIR(fse->st_mode) ? DT_DIR : DT_LNK; + fse->u.s.st_size = S_ISLNK(fse->st_mode) ? MAX_LONG_PATH : + fdata->EndOfFile.LowPart | + (((off_t)fdata->EndOfFile.HighPart) << 32); filetime_to_timespec((FILETIME *)&(fdata->LastAccessTime), &(fse->u.s.st_atim)); filetime_to_timespec((FILETIME *)&(fdata->LastWriteTime), From ba6cfb4b7f3b826ffb76b1bd089f57a0deddc7cb Mon Sep 17 00:00:00 2001 From: Karsten Blees <blees@dcon.de> Date: Sat, 16 May 2015 01:11:37 +0200 Subject: [PATCH 239/297] Win32: lstat(): return adequate stat.st_size for symlinks Git typically doesn't trust the stat.st_size member of symlinks (e.g. see strbuf_readlink()). However, some functions take shortcuts if st_size is 0 (e.g. diff_populate_filespec()). In mingw_lstat() and fscache_lstat(), make sure to return an adequate size. The extra overhead of opening and reading the reparse point to calculate the exact size is not necessary, as git doesn't rely on the value anyway. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- compat/mingw.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 68d53367015ccc..8a8aeb8daca52f 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -894,8 +894,8 @@ int mingw_lstat(const char *file_name, struct stat *buf) buf->st_nlink = 1; buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes, findbuf.dwReserved0); - buf->st_size = fdata.nFileSizeLow | - (((off_t)fdata.nFileSizeHigh)<<32); + buf->st_size = S_ISLNK(buf->st_mode) ? MAX_LONG_PATH : + fdata.nFileSizeLow | (((off_t) fdata.nFileSizeHigh) << 32); buf->st_dev = buf->st_rdev = 0; /* not used by Git */ filetime_to_timespec(&(fdata.ftLastAccessTime), &(buf->st_atim)); filetime_to_timespec(&(fdata.ftLastWriteTime), &(buf->st_mtim)); From 4cac428600c7659d23f074c4594df31beb6d14aa Mon Sep 17 00:00:00 2001 From: Karsten Blees <blees@dcon.de> Date: Tue, 19 May 2015 21:48:55 +0200 Subject: [PATCH 240/297] Win32: factor out retry logic The retry pattern is duplicated in three places. It also seems to be too hard to use: mingw_unlink() and mingw_rmdir() duplicate the code to retry, and both of them do so incompletely. They also do not restore errno if the user answers 'no'. Introduce a retry_ask_yes_no() helper function that handles retry with small delay, asking the user, and restoring errno. mingw_unlink: include _wchmod in the retry loop (which may fail if the file is locked exclusively). mingw_rmdir: include special error handling in the retry loop. Signed-off-by: Karsten Blees <blees@dcon.de> --- compat/mingw.c | 102 ++++++++++++++++++++++--------------------------- 1 file changed, 45 insertions(+), 57 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 8a8aeb8daca52f..5c1f94c3a23ee7 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -25,8 +25,6 @@ #define HCAST(type, handle) ((type)(intptr_t)handle) -static const int delay[] = { 0, 1, 10, 20, 40 }; - void open_in_gdb(void) { static struct child_process cp = CHILD_PROCESS_INIT; @@ -202,15 +200,12 @@ static int read_yes_no_answer(void) return -1; } -static int ask_yes_no_if_possible(const char *format, ...) +static int ask_yes_no_if_possible(const char *format, va_list args) { char question[4096]; const char *retry_hook; - va_list args; - va_start(args, format); vsnprintf(question, sizeof(question), format, args); - va_end(args); retry_hook = mingw_getenv("GIT_ASK_YESNO"); if (retry_hook) { @@ -235,6 +230,31 @@ static int ask_yes_no_if_possible(const char *format, ...) } } +static int retry_ask_yes_no(int *tries, const char *format, ...) +{ + static const int delay[] = { 0, 1, 10, 20, 40 }; + va_list args; + int result, saved_errno = errno; + + if ((*tries) < ARRAY_SIZE(delay)) { + /* + * We assume that some other process had the file open at the wrong + * moment and retry. In order to give the other process a higher + * chance to complete its operation, we give up our time slice now. + * If we have to retry again, we do sleep a bit. + */ + Sleep(delay[*tries]); + (*tries)++; + return 1; + } + + va_start(args, format); + result = ask_yes_no_if_possible(format, args); + va_end(args); + errno = saved_errno; + return result; +} + /* Windows only */ enum hide_dotfiles_type { HIDE_DOTFILES_FALSE = 0, @@ -336,7 +356,7 @@ static wchar_t *normalize_ntpath(wchar_t *wbuf) int mingw_unlink(const char *pathname) { - int ret, tries = 0; + int tries = 0; wchar_t wpathname[MAX_LONG_PATH]; if (xutftowcs_long_path(wpathname, pathname) < 0) return -1; @@ -344,26 +364,16 @@ int mingw_unlink(const char *pathname) if (DeleteFileW(wpathname)) return 0; - /* read-only files cannot be removed */ - _wchmod(wpathname, 0666); - while ((ret = _wunlink(wpathname)) == -1 && tries < ARRAY_SIZE(delay)) { + do { + /* read-only files cannot be removed */ + _wchmod(wpathname, 0666); + if (!_wunlink(wpathname)) + return 0; if (!is_file_in_use_error(GetLastError())) break; - /* - * We assume that some other process had the source or - * destination file open at the wrong moment and retry. - * In order to give the other process a higher chance to - * complete its operation, we give up our time slice now. - * If we have to retry again, we do sleep a bit. - */ - Sleep(delay[tries]); - tries++; - } - while (ret == -1 && is_file_in_use_error(GetLastError()) && - ask_yes_no_if_possible("Unlink of file '%s' failed. " - "Should I try again?", pathname)) - ret = _wunlink(wpathname); - return ret; + } while (retry_ask_yes_no(&tries, "Unlink of file '%s' failed. " + "Should I try again?", pathname)); + return -1; } static int is_dir_empty(const wchar_t *wpath) @@ -390,7 +400,7 @@ static int is_dir_empty(const wchar_t *wpath) int mingw_rmdir(const char *pathname) { - int ret, tries = 0; + int tries = 0; wchar_t wpathname[MAX_LONG_PATH]; struct stat st; @@ -416,7 +426,11 @@ int mingw_rmdir(const char *pathname) if (xutftowcs_long_path(wpathname, pathname) < 0) return -1; - while ((ret = _wrmdir(wpathname)) == -1 && tries < ARRAY_SIZE(delay)) { + do { + if (!_wrmdir(wpathname)) { + invalidate_lstat_cache(); + return 0; + } if (!is_file_in_use_error(GetLastError())) errno = err_win_to_posix(GetLastError()); if (errno != EACCES) @@ -425,23 +439,9 @@ int mingw_rmdir(const char *pathname) errno = ENOTEMPTY; break; } - /* - * We assume that some other process had the source or - * destination file open at the wrong moment and retry. - * In order to give the other process a higher chance to - * complete its operation, we give up our time slice now. - * If we have to retry again, we do sleep a bit. - */ - Sleep(delay[tries]); - tries++; - } - while (ret == -1 && errno == EACCES && is_file_in_use_error(GetLastError()) && - ask_yes_no_if_possible("Deletion of directory '%s' failed. " - "Should I try again?", pathname)) - ret = _wrmdir(wpathname); - if (!ret) - invalidate_lstat_cache(); - return ret; + } while (retry_ask_yes_no(&tries, "Deletion of directory '%s' failed. " + "Should I try again?", pathname)); + return -1; } static inline int needs_hiding(const char *path) @@ -2445,20 +2445,8 @@ int mingw_rename(const char *pold, const char *pnew) SetFileAttributesW(wpnew, attrs); } } - if (tries < ARRAY_SIZE(delay) && gle == ERROR_ACCESS_DENIED) { - /* - * We assume that some other process had the source or - * destination file open at the wrong moment and retry. - * In order to give the other process a higher chance to - * complete its operation, we give up our time slice now. - * If we have to retry again, we do sleep a bit. - */ - Sleep(delay[tries]); - tries++; - goto repeat; - } if (gle == ERROR_ACCESS_DENIED && - ask_yes_no_if_possible("Rename from '%s' to '%s' failed. " + retry_ask_yes_no(&tries, "Rename from '%s' to '%s' failed. " "Should I try again?", pold, pnew)) goto repeat; From a2d3f150561c8e2bf96cabbab7c3752ec5d3a2ba Mon Sep 17 00:00:00 2001 From: Karsten Blees <blees@dcon.de> Date: Sun, 24 May 2015 01:55:05 +0200 Subject: [PATCH 241/297] Win32: change default of 'core.symlinks' to false Symlinks on Windows don't work the same way as on Unix systems. E.g. there are different types of symlinks for directories and files, creating symlinks requires administrative privileges etc. By default, disable symlink support on Windows. I.e. users explicitly have to enable it with 'git config [--system|--global] core.symlinks true'. The test suite ignores system / global config files. Allow testing *with* symlink support by checking if native symlinks are enabled in MSys2 (via 'MSYS=winsymlinks:nativestrict'). Reminder: This would need to be changed if / when we find a way to run the test suite in a non-MSys-based shell (e.g. dash). Signed-off-by: Karsten Blees <blees@dcon.de> --- compat/mingw.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/compat/mingw.c b/compat/mingw.c index 5c1f94c3a23ee7..6e4e1488e14392 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -3035,6 +3035,15 @@ static void setup_windows_environment(void) if (!getenv("LC_ALL") && !getenv("LC_CTYPE") && !getenv("LANG")) setenv("LC_CTYPE", "C.UTF-8", 1); + + /* + * Change 'core.symlinks' default to false, unless native symlinks are + * enabled in MSys2 (via 'MSYS=winsymlinks:nativestrict'). Thus we can + * run the test suite (which doesn't obey config files) with or without + * symlink support. + */ + if (!(tmp = getenv("MSYS")) || !strstr(tmp, "winsymlinks:nativestrict")) + has_symlinks = 0; } static PSID get_current_user_sid(void) From 9bc14e243fbb8a113cc04de9d7dc237bf2ccaef5 Mon Sep 17 00:00:00 2001 From: Karsten Blees <blees@dcon.de> Date: Sat, 16 May 2015 00:32:03 +0200 Subject: [PATCH 242/297] Win32: add symlink-specific error codes Signed-off-by: Karsten Blees <blees@dcon.de> --- compat/mingw.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/compat/mingw.c b/compat/mingw.c index 6e4e1488e14392..1ffc5d537756aa 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -99,6 +99,7 @@ int err_win_to_posix(DWORD winerr) case ERROR_INVALID_PARAMETER: error = EINVAL; break; case ERROR_INVALID_PASSWORD: error = EPERM; break; case ERROR_INVALID_PRIMARY_GROUP: error = EINVAL; break; + case ERROR_INVALID_REPARSE_DATA: error = EINVAL; break; case ERROR_INVALID_SIGNAL_NUMBER: error = EINVAL; break; case ERROR_INVALID_TARGET_HANDLE: error = EIO; break; case ERROR_INVALID_WORKSTATION: error = EACCES; break; @@ -113,6 +114,7 @@ int err_win_to_posix(DWORD winerr) case ERROR_NEGATIVE_SEEK: error = ESPIPE; break; case ERROR_NOACCESS: error = EFAULT; break; case ERROR_NONE_MAPPED: error = EINVAL; break; + case ERROR_NOT_A_REPARSE_POINT: error = EINVAL; break; case ERROR_NOT_ENOUGH_MEMORY: error = ENOMEM; break; case ERROR_NOT_READY: error = EAGAIN; break; case ERROR_NOT_SAME_DEVICE: error = EXDEV; break; @@ -133,6 +135,9 @@ int err_win_to_posix(DWORD winerr) case ERROR_PIPE_NOT_CONNECTED: error = EPIPE; break; case ERROR_PRIVILEGE_NOT_HELD: error = EACCES; break; case ERROR_READ_FAULT: error = EIO; break; + case ERROR_REPARSE_ATTRIBUTE_CONFLICT: error = EINVAL; break; + case ERROR_REPARSE_TAG_INVALID: error = EINVAL; break; + case ERROR_REPARSE_TAG_MISMATCH: error = EINVAL; break; case ERROR_SEEK: error = EIO; break; case ERROR_SEEK_ON_DEVICE: error = ESPIPE; break; case ERROR_SHARING_BUFFER_EXCEEDED: error = ENFILE; break; From d34eeb7e62f00b6c6087b1ff0a0de53519bafb68 Mon Sep 17 00:00:00 2001 From: Karsten Blees <blees@dcon.de> Date: Sun, 24 May 2015 01:06:10 +0200 Subject: [PATCH 243/297] Win32: mingw_unlink: support symlinks to directories _wunlink() / DeleteFileW() refuses to delete symlinks to directories. If _wunlink() fails with ERROR_ACCESS_DENIED, try _wrmdir() as well. Signed-off-by: Karsten Blees <blees@dcon.de> --- compat/mingw.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/compat/mingw.c b/compat/mingw.c index 1ffc5d537756aa..f474a4254d1897 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -376,6 +376,13 @@ int mingw_unlink(const char *pathname) return 0; if (!is_file_in_use_error(GetLastError())) break; + /* + * _wunlink() / DeleteFileW() for directory symlinks fails with + * ERROR_ACCESS_DENIED (EACCES), so try _wrmdir() as well. This is the + * same error we get if a file is in use (already checked above). + */ + if (!_wrmdir(wpathname)) + return 0; } while (retry_ask_yes_no(&tries, "Unlink of file '%s' failed. " "Should I try again?", pathname)); return -1; From ab63096a9bbd15c6baace7dc84f68b7112f7c15c Mon Sep 17 00:00:00 2001 From: Karsten Blees <blees@dcon.de> Date: Tue, 19 May 2015 22:42:48 +0200 Subject: [PATCH 244/297] Win32: mingw_rename: support renaming symlinks MSVCRT's _wrename() cannot rename symlinks over existing files: it returns success without doing anything. Newer MSVCR*.dll versions probably do not have this problem: according to CRT sources, they just call MoveFileEx() with the MOVEFILE_COPY_ALLOWED flag. Get rid of _wrename() and call MoveFileEx() with proper error handling. Signed-off-by: Karsten Blees <blees@dcon.de> --- compat/mingw.c | 38 +++++++++++++++++--------------------- 1 file changed, 17 insertions(+), 21 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index f474a4254d1897..879beb11449726 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -2417,27 +2417,29 @@ int mingw_accept(int sockfd1, struct sockaddr *sa, socklen_t *sz) #undef rename int mingw_rename(const char *pold, const char *pnew) { - DWORD attrs, gle; + DWORD attrs = INVALID_FILE_ATTRIBUTES, gle; int tries = 0; wchar_t wpold[MAX_LONG_PATH], wpnew[MAX_LONG_PATH]; if (xutftowcs_long_path(wpold, pold) < 0 || xutftowcs_long_path(wpnew, pnew) < 0) return -1; - /* - * Try native rename() first to get errno right. - * It is based on MoveFile(), which cannot overwrite existing files. - */ - if (!_wrename(wpold, wpnew)) - return 0; - if (errno != EEXIST) - return -1; repeat: - if (MoveFileExW(wpold, wpnew, MOVEFILE_REPLACE_EXISTING)) + if (MoveFileExW(wpold, wpnew, + MOVEFILE_REPLACE_EXISTING | MOVEFILE_COPY_ALLOWED)) return 0; - /* TODO: translate more errors */ gle = GetLastError(); - if (gle == ERROR_ACCESS_DENIED && + + /* revert file attributes on failure */ + if (attrs != INVALID_FILE_ATTRIBUTES) + SetFileAttributesW(wpnew, attrs); + + if (!is_file_in_use_error(gle)) { + errno = err_win_to_posix(gle); + return -1; + } + + if (attrs == INVALID_FILE_ATTRIBUTES && (attrs = GetFileAttributesW(wpnew)) != INVALID_FILE_ATTRIBUTES) { if (attrs & FILE_ATTRIBUTE_DIRECTORY) { DWORD attrsold = GetFileAttributesW(wpold); @@ -2449,16 +2451,10 @@ int mingw_rename(const char *pold, const char *pnew) return -1; } if ((attrs & FILE_ATTRIBUTE_READONLY) && - SetFileAttributesW(wpnew, attrs & ~FILE_ATTRIBUTE_READONLY)) { - if (MoveFileExW(wpold, wpnew, MOVEFILE_REPLACE_EXISTING)) - return 0; - gle = GetLastError(); - /* revert file attributes on failure */ - SetFileAttributesW(wpnew, attrs); - } + SetFileAttributesW(wpnew, attrs & ~FILE_ATTRIBUTE_READONLY)) + goto repeat; } - if (gle == ERROR_ACCESS_DENIED && - retry_ask_yes_no(&tries, "Rename from '%s' to '%s' failed. " + if (retry_ask_yes_no(&tries, "Rename from '%s' to '%s' failed. " "Should I try again?", pold, pnew)) goto repeat; From d40f2b0e66d1026f502f6a54b9fc391eaaad39bd Mon Sep 17 00:00:00 2001 From: Karsten Blees <blees@dcon.de> Date: Sun, 24 May 2015 01:17:31 +0200 Subject: [PATCH 245/297] Win32: mingw_chdir: change to symlink-resolved directory If symlinks are enabled, resolve all symlinks when changing directories, as required by POSIX. Note: Git's real_path() function bases its link resolution algorithm on this property of chdir(). Unfortunately, the current directory on Windows is limited to only MAX_PATH (260) characters. Therefore using symlinks and long paths in combination may be problematic. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- compat/mingw.c | 19 ++++++++++++++++++- 1 file changed, 18 insertions(+), 1 deletion(-) diff --git a/compat/mingw.c b/compat/mingw.c index 879beb11449726..939299212045b1 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -829,7 +829,24 @@ int mingw_chdir(const char *dirname) wchar_t wdirname[MAX_LONG_PATH]; if (xutftowcs_long_path(wdirname, dirname) < 0) return -1; - result = _wchdir(wdirname); + + if (has_symlinks) { + HANDLE hnd = CreateFileW(wdirname, 0, + FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, NULL, + OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL); + if (hnd == INVALID_HANDLE_VALUE) { + errno = err_win_to_posix(GetLastError()); + return -1; + } + if (!GetFinalPathNameByHandleW(hnd, wdirname, ARRAY_SIZE(wdirname), 0)) { + errno = err_win_to_posix(GetLastError()); + CloseHandle(hnd); + return -1; + } + CloseHandle(hnd); + } + + result = _wchdir(normalize_ntpath(wdirname)); current_directory_len = GetCurrentDirectoryW(0, NULL); return result; } From fd2b6ed436b651eeee9ab0550bf5bc3b2b4c67b5 Mon Sep 17 00:00:00 2001 From: Karsten Blees <blees@dcon.de> Date: Sun, 24 May 2015 01:24:41 +0200 Subject: [PATCH 246/297] Win32: implement readlink() Implement readlink() by reading NTFS reparse points. Works for symlinks and directory junctions. If symlinks are disabled, fail with ENOSYS. Signed-off-by: Karsten Blees <blees@dcon.de> --- compat/mingw.c | 98 ++++++++++++++++++++++++++++++++++++++++++++++++++ compat/mingw.h | 3 +- 2 files changed, 99 insertions(+), 2 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 939299212045b1..530ef747032d23 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -5,6 +5,7 @@ #include <sddl.h> #include <conio.h> #include <wchar.h> +#include <winioctl.h> #include "../strbuf.h" #include "../run-command.h" #include "../abspath.h" @@ -2752,6 +2753,103 @@ int link(const char *oldpath, const char *newpath) return 0; } +#ifndef _WINNT_H +/* + * The REPARSE_DATA_BUFFER structure is defined in the Windows DDK (in + * ntifs.h) and in MSYS1's winnt.h (which defines _WINNT_H). So define + * it ourselves if we are on MSYS2 (whose winnt.h defines _WINNT_). + */ +typedef struct _REPARSE_DATA_BUFFER { + DWORD ReparseTag; + WORD ReparseDataLength; + WORD Reserved; +#ifndef _MSC_VER + _ANONYMOUS_UNION +#endif + union { + struct { + WORD SubstituteNameOffset; + WORD SubstituteNameLength; + WORD PrintNameOffset; + WORD PrintNameLength; + ULONG Flags; + WCHAR PathBuffer[1]; + } SymbolicLinkReparseBuffer; + struct { + WORD SubstituteNameOffset; + WORD SubstituteNameLength; + WORD PrintNameOffset; + WORD PrintNameLength; + WCHAR PathBuffer[1]; + } MountPointReparseBuffer; + struct { + BYTE DataBuffer[1]; + } GenericReparseBuffer; + } DUMMYUNIONNAME; +} REPARSE_DATA_BUFFER, *PREPARSE_DATA_BUFFER; +#endif + +int readlink(const char *path, char *buf, size_t bufsiz) +{ + HANDLE handle; + WCHAR wpath[MAX_LONG_PATH], *wbuf; + REPARSE_DATA_BUFFER *b = alloca(MAXIMUM_REPARSE_DATA_BUFFER_SIZE); + DWORD dummy; + char tmpbuf[MAX_LONG_PATH]; + int len; + + if (xutftowcs_long_path(wpath, path) < 0) + return -1; + + /* read reparse point data */ + handle = CreateFileW(wpath, 0, + FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, NULL, + OPEN_EXISTING, + FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OPEN_REPARSE_POINT, NULL); + if (handle == INVALID_HANDLE_VALUE) { + errno = err_win_to_posix(GetLastError()); + return -1; + } + if (!DeviceIoControl(handle, FSCTL_GET_REPARSE_POINT, NULL, 0, b, + MAXIMUM_REPARSE_DATA_BUFFER_SIZE, &dummy, NULL)) { + errno = err_win_to_posix(GetLastError()); + CloseHandle(handle); + return -1; + } + CloseHandle(handle); + + /* get target path for symlinks or mount points (aka 'junctions') */ + switch (b->ReparseTag) { + case IO_REPARSE_TAG_SYMLINK: + wbuf = (WCHAR*) (((char*) b->SymbolicLinkReparseBuffer.PathBuffer) + + b->SymbolicLinkReparseBuffer.SubstituteNameOffset); + *(WCHAR*) (((char*) wbuf) + + b->SymbolicLinkReparseBuffer.SubstituteNameLength) = 0; + break; + case IO_REPARSE_TAG_MOUNT_POINT: + wbuf = (WCHAR*) (((char*) b->MountPointReparseBuffer.PathBuffer) + + b->MountPointReparseBuffer.SubstituteNameOffset); + *(WCHAR*) (((char*) wbuf) + + b->MountPointReparseBuffer.SubstituteNameLength) = 0; + break; + default: + errno = EINVAL; + return -1; + } + + /* + * Adapt to strange readlink() API: Copy up to bufsiz *bytes*, potentially + * cutting off a UTF-8 sequence. Insufficient bufsize is *not* a failure + * condition. There is no conversion function that produces invalid UTF-8, + * so convert to a (hopefully large enough) temporary buffer, then memcpy + * the requested number of bytes (including '\0' for robustness). + */ + if ((len = xwcstoutf(tmpbuf, normalize_ntpath(wbuf), MAX_LONG_PATH)) < 0) + return -1; + memcpy(buf, tmpbuf, min(bufsiz, len + 1)); + return min(bufsiz, len); +} + pid_t waitpid(pid_t pid, int *status, int options) { HANDLE h = OpenProcess(SYNCHRONIZE | PROCESS_QUERY_INFORMATION, diff --git a/compat/mingw.h b/compat/mingw.h index bc1d36b49d5a17..45d1615a9665d7 100644 --- a/compat/mingw.h +++ b/compat/mingw.h @@ -125,8 +125,6 @@ struct utsname { * trivial stubs */ -static inline int readlink(const char *path, char *buf, size_t bufsiz) -{ errno = ENOSYS; return -1; } static inline int symlink(const char *oldpath, const char *newpath) { errno = ENOSYS; return -1; } static inline int fchmod(int fildes, mode_t mode) @@ -222,6 +220,7 @@ int setitimer(int type, struct itimerval *in, struct itimerval *out); int sigaction(int sig, struct sigaction *in, struct sigaction *out); int link(const char *oldpath, const char *newpath); int uname(struct utsname *buf); +int readlink(const char *path, char *buf, size_t bufsiz); /* * replacements of existing functions From c75bdec2e9dac5315cbdafaf2830057caaeafb51 Mon Sep 17 00:00:00 2001 From: Bill Zissimopoulos <billziss@navimatics.com> Date: Thu, 28 May 2020 16:35:57 -0700 Subject: [PATCH 247/297] mingw: lstat: compute correct size for symlinks This commit fixes mingw_lstat by computing the proper size for symlinks according to POSIX. POSIX specifies that upon successful return from lstat: "the value of the st_size member shall be set to the length of the pathname contained in the symbolic link not including any terminating null byte". Prior to this commit the mingw_lstat function returned a fixed size of 4096. This caused problems in git repositories that were accessed by git for Cygwin or git for WSL. For example, doing `git reset --hard` using git for Windows would update the size of symlinks in the index to be 4096; at a later time git for Cygwin or git for WSL would find that symlinks have changed size during `git status`. Vice versa doing `git reset --hard` in git for Cygwin or git for WSL would update the size of symlinks in the index with the correct value, only for git for Windows to find incorrectly at a later time that the size had changed. Signed-off-by: Bill Zissimopoulos <billziss@navimatics.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- compat/mingw.c | 65 ++++++++++++++++++++++++++++-------------- compat/win32/fscache.c | 12 ++++++++ 2 files changed, 56 insertions(+), 21 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 530ef747032d23..f9012d9b726038 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -893,10 +893,14 @@ static int has_valid_directory_prefix(wchar_t *wfilename) return 1; } +static int readlink_1(const WCHAR *wpath, BOOL fail_on_unknown_tag, + char *tmpbuf, int *plen, DWORD *ptag); + int mingw_lstat(const char *file_name, struct stat *buf) { WIN32_FILE_ATTRIBUTE_DATA fdata; - WIN32_FIND_DATAW findbuf = { 0 }; + DWORD reparse_tag = 0; + int link_len = 0; wchar_t wfilename[MAX_LONG_PATH]; int wlen = xutftowcs_long_path(wfilename, file_name); if (wlen < 0) @@ -911,20 +915,21 @@ int mingw_lstat(const char *file_name, struct stat *buf) } if (GetFileAttributesExW(wfilename, GetFileExInfoStandard, &fdata)) { - /* for reparse points, use FindFirstFile to get the reparse tag */ + /* for reparse points, get the link tag and length */ if (fdata.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) { - HANDLE handle = FindFirstFileW(wfilename, &findbuf); - if (handle == INVALID_HANDLE_VALUE) - goto error; - FindClose(handle); + char tmpbuf[MAX_LONG_PATH]; + + if (readlink_1(wfilename, FALSE, tmpbuf, &link_len, + &reparse_tag) < 0) + return -1; } buf->st_ino = 0; buf->st_gid = 0; buf->st_uid = 0; buf->st_nlink = 1; buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes, - findbuf.dwReserved0); - buf->st_size = S_ISLNK(buf->st_mode) ? MAX_LONG_PATH : + reparse_tag); + buf->st_size = S_ISLNK(buf->st_mode) ? link_len : fdata.nFileSizeLow | (((off_t) fdata.nFileSizeHigh) << 32); buf->st_dev = buf->st_rdev = 0; /* not used by Git */ filetime_to_timespec(&(fdata.ftLastAccessTime), &(buf->st_atim)); @@ -932,7 +937,7 @@ int mingw_lstat(const char *file_name, struct stat *buf) filetime_to_timespec(&(fdata.ftCreationTime), &(buf->st_ctim)); return 0; } -error: + switch (GetLastError()) { case ERROR_ACCESS_DENIED: case ERROR_SHARING_VIOLATION: @@ -2789,17 +2794,13 @@ typedef struct _REPARSE_DATA_BUFFER { } REPARSE_DATA_BUFFER, *PREPARSE_DATA_BUFFER; #endif -int readlink(const char *path, char *buf, size_t bufsiz) +static int readlink_1(const WCHAR *wpath, BOOL fail_on_unknown_tag, + char *tmpbuf, int *plen, DWORD *ptag) { HANDLE handle; - WCHAR wpath[MAX_LONG_PATH], *wbuf; + WCHAR *wbuf; REPARSE_DATA_BUFFER *b = alloca(MAXIMUM_REPARSE_DATA_BUFFER_SIZE); DWORD dummy; - char tmpbuf[MAX_LONG_PATH]; - int len; - - if (xutftowcs_long_path(wpath, path) < 0) - return -1; /* read reparse point data */ handle = CreateFileW(wpath, 0, @@ -2819,7 +2820,7 @@ int readlink(const char *path, char *buf, size_t bufsiz) CloseHandle(handle); /* get target path for symlinks or mount points (aka 'junctions') */ - switch (b->ReparseTag) { + switch ((*ptag = b->ReparseTag)) { case IO_REPARSE_TAG_SYMLINK: wbuf = (WCHAR*) (((char*) b->SymbolicLinkReparseBuffer.PathBuffer) + b->SymbolicLinkReparseBuffer.SubstituteNameOffset); @@ -2833,10 +2834,34 @@ int readlink(const char *path, char *buf, size_t bufsiz) + b->MountPointReparseBuffer.SubstituteNameLength) = 0; break; default: - errno = EINVAL; - return -1; + if (fail_on_unknown_tag) { + errno = EINVAL; + return -1; + } else { + *plen = MAX_LONG_PATH; + return 0; + } } + if ((*plen = + xwcstoutf(tmpbuf, normalize_ntpath(wbuf), MAX_LONG_PATH)) < 0) + return -1; + return 0; +} + +int readlink(const char *path, char *buf, size_t bufsiz) +{ + WCHAR wpath[MAX_LONG_PATH]; + char tmpbuf[MAX_LONG_PATH]; + int len; + DWORD tag; + + if (xutftowcs_long_path(wpath, path) < 0) + return -1; + + if (readlink_1(wpath, TRUE, tmpbuf, &len, &tag) < 0) + return -1; + /* * Adapt to strange readlink() API: Copy up to bufsiz *bytes*, potentially * cutting off a UTF-8 sequence. Insufficient bufsize is *not* a failure @@ -2844,8 +2869,6 @@ int readlink(const char *path, char *buf, size_t bufsiz) * so convert to a (hopefully large enough) temporary buffer, then memcpy * the requested number of bytes (including '\0' for robustness). */ - if ((len = xwcstoutf(tmpbuf, normalize_ntpath(wbuf), MAX_LONG_PATH)) < 0) - return -1; memcpy(buf, tmpbuf, min(bufsiz, len + 1)); return min(bufsiz, len); } diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index 3016ae61dee5e0..d3819aa7f86466 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -584,6 +584,18 @@ int fscache_lstat(const char *filename, struct stat *st) return -1; } + /* + * Special case symbolic links: FindFirstFile()/FindNextFile() did not + * provide us with the length of the target path. + */ + if (fse->u.s.st_size == MAX_LONG_PATH && S_ISLNK(fse->st_mode)) { + char buf[MAX_LONG_PATH]; + int len = readlink(filename, buf, sizeof(buf) - 1); + + if (len > 0) + fse->u.s.st_size = len; + } + /* copy stat data */ st->st_ino = 0; st->st_gid = 0; From 5e2eb17a012a996f117420c66bfbbe32ea85ebde Mon Sep 17 00:00:00 2001 From: Karsten Blees <blees@dcon.de> Date: Sun, 24 May 2015 01:32:03 +0200 Subject: [PATCH 248/297] Win32: implement basic symlink() functionality (file symlinks only) Implement symlink() that always creates file symlinks. Fails with ENOSYS if symlinks are disabled or unsupported. Note: CreateSymbolicLinkW() was introduced with symlink support in Windows Vista. For compatibility with Windows XP, we need to load it dynamically and fail gracefully if it isnt's available. Signed-off-by: Karsten Blees <blees@dcon.de> --- compat/mingw.c | 28 ++++++++++++++++++++++++++++ compat/mingw.h | 3 +-- 2 files changed, 29 insertions(+), 2 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index f9012d9b726038..0ac2687beab38f 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -2758,6 +2758,34 @@ int link(const char *oldpath, const char *newpath) return 0; } +int symlink(const char *target, const char *link) +{ + wchar_t wtarget[MAX_LONG_PATH], wlink[MAX_LONG_PATH]; + int len; + + /* fail if symlinks are disabled or API is not supported (WinXP) */ + if (!has_symlinks) { + errno = ENOSYS; + return -1; + } + + if ((len = xutftowcs_long_path(wtarget, target)) < 0 + || xutftowcs_long_path(wlink, link) < 0) + return -1; + + /* convert target dir separators to backslashes */ + while (len--) + if (wtarget[len] == '/') + wtarget[len] = '\\'; + + /* create file symlink */ + if (!CreateSymbolicLinkW(wlink, wtarget, 0)) { + errno = err_win_to_posix(GetLastError()); + return -1; + } + return 0; +} + #ifndef _WINNT_H /* * The REPARSE_DATA_BUFFER structure is defined in the Windows DDK (in diff --git a/compat/mingw.h b/compat/mingw.h index 45d1615a9665d7..a74024f8b060cd 100644 --- a/compat/mingw.h +++ b/compat/mingw.h @@ -125,8 +125,6 @@ struct utsname { * trivial stubs */ -static inline int symlink(const char *oldpath, const char *newpath) -{ errno = ENOSYS; return -1; } static inline int fchmod(int fildes, mode_t mode) { errno = ENOSYS; return -1; } #ifndef __MINGW64_VERSION_MAJOR @@ -220,6 +218,7 @@ int setitimer(int type, struct itimerval *in, struct itimerval *out); int sigaction(int sig, struct sigaction *in, struct sigaction *out); int link(const char *oldpath, const char *newpath); int uname(struct utsname *buf); +int symlink(const char *target, const char *link); int readlink(const char *path, char *buf, size_t bufsiz); /* From 102cf11eccf31575a52de131a0814531aadc9b30 Mon Sep 17 00:00:00 2001 From: Karsten Blees <blees@dcon.de> Date: Sun, 24 May 2015 01:48:35 +0200 Subject: [PATCH 249/297] Win32: symlink: add support for symlinks to directories Symlinks on Windows have a flag that indicates whether the target is a file or a directory. Symlinks of wrong type simply don't work. This even affects core Win32 APIs (e.g. DeleteFile() refuses to delete directory symlinks). However, CreateFile() with FILE_FLAG_BACKUP_SEMANTICS doesn't seem to care. Check the target type by first creating a tentative file symlink, opening it, and checking the type of the resulting handle. If it is a directory, recreate the symlink with the directory flag set. It is possible to create symlinks before the target exists (or in case of symlinks to symlinks: before the target type is known). If this happens, create a tentative file symlink and postpone the directory decision: keep a list of phantom symlinks to be processed whenever a new directory is created in mingw_mkdir(). Limitations: This algorithm may fail if a link target changes from file to directory or vice versa, or if the target directory is created in another process. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- compat/mingw.c | 159 +++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 159 insertions(+) diff --git a/compat/mingw.c b/compat/mingw.c index 0ac2687beab38f..9f0bf4b3a058d9 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -335,6 +335,126 @@ static inline int is_wdir_sep(wchar_t wchar) return wchar == L'/' || wchar == L'\\'; } +static const wchar_t *make_relative_to(const wchar_t *path, + const wchar_t *relative_to, wchar_t *out, + size_t size) +{ + size_t i = wcslen(relative_to), len; + + /* Is `path` already absolute? */ + if (is_wdir_sep(path[0]) || + (iswalpha(path[0]) && path[1] == L':' && is_wdir_sep(path[2]))) + return path; + + while (i > 0 && !is_wdir_sep(relative_to[i - 1])) + i--; + + /* Is `relative_to` in the current directory? */ + if (!i) + return path; + + len = wcslen(path); + if (i + len + 1 > size) { + error("Could not make '%ls' relative to '%ls' (too large)", + path, relative_to); + return NULL; + } + + memcpy(out, relative_to, i * sizeof(wchar_t)); + wcscpy(out + i, path); + return out; +} + +enum phantom_symlink_result { + PHANTOM_SYMLINK_RETRY, + PHANTOM_SYMLINK_DONE, + PHANTOM_SYMLINK_DIRECTORY +}; + +/* + * Changes a file symlink to a directory symlink if the target exists and is a + * directory. + */ +static enum phantom_symlink_result +process_phantom_symlink(const wchar_t *wtarget, const wchar_t *wlink) +{ + HANDLE hnd; + BY_HANDLE_FILE_INFORMATION fdata; + wchar_t relative[MAX_LONG_PATH]; + const wchar_t *rel; + + /* check that wlink is still a file symlink */ + if ((GetFileAttributesW(wlink) + & (FILE_ATTRIBUTE_REPARSE_POINT | FILE_ATTRIBUTE_DIRECTORY)) + != FILE_ATTRIBUTE_REPARSE_POINT) + return PHANTOM_SYMLINK_DONE; + + /* make it relative, if necessary */ + rel = make_relative_to(wtarget, wlink, relative, ARRAY_SIZE(relative)); + if (!rel) + return PHANTOM_SYMLINK_DONE; + + /* let Windows resolve the link by opening it */ + hnd = CreateFileW(rel, 0, + FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, NULL, + OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL); + if (hnd == INVALID_HANDLE_VALUE) { + errno = err_win_to_posix(GetLastError()); + return PHANTOM_SYMLINK_RETRY; + } + + if (!GetFileInformationByHandle(hnd, &fdata)) { + errno = err_win_to_posix(GetLastError()); + CloseHandle(hnd); + return PHANTOM_SYMLINK_RETRY; + } + CloseHandle(hnd); + + /* if target exists and is a file, we're done */ + if (!(fdata.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)) + return PHANTOM_SYMLINK_DONE; + + /* otherwise recreate the symlink with directory flag */ + if (DeleteFileW(wlink) && CreateSymbolicLinkW(wlink, wtarget, 1)) + return PHANTOM_SYMLINK_DIRECTORY; + + errno = err_win_to_posix(GetLastError()); + return PHANTOM_SYMLINK_RETRY; +} + +/* keep track of newly created symlinks to non-existing targets */ +struct phantom_symlink_info { + struct phantom_symlink_info *next; + wchar_t *wlink; + wchar_t *wtarget; +}; + +static struct phantom_symlink_info *phantom_symlinks = NULL; +static CRITICAL_SECTION phantom_symlinks_cs; + +static void process_phantom_symlinks(void) +{ + struct phantom_symlink_info *current, **psi; + EnterCriticalSection(&phantom_symlinks_cs); + /* process phantom symlinks list */ + psi = &phantom_symlinks; + while ((current = *psi)) { + enum phantom_symlink_result result = process_phantom_symlink( + current->wtarget, current->wlink); + if (result == PHANTOM_SYMLINK_RETRY) { + psi = ¤t->next; + } else { + /* symlink was processed, remove from list */ + *psi = current->next; + free(current); + /* if symlink was a directory, start over */ + if (result == PHANTOM_SYMLINK_DIRECTORY) + psi = &phantom_symlinks; + } + } + LeaveCriticalSection(&phantom_symlinks_cs); +} + /* Normalizes NT paths as returned by some low-level APIs. */ static wchar_t *normalize_ntpath(wchar_t *wbuf) { @@ -518,6 +638,8 @@ int mingw_mkdir(const char *path, int mode) return -1; ret = _wmkdir(wpath); + if (!ret) + process_phantom_symlinks(); if (!ret && needs_hiding(path)) return set_hidden_flag(wpath, 1); return ret; @@ -2783,6 +2905,42 @@ int symlink(const char *target, const char *link) errno = err_win_to_posix(GetLastError()); return -1; } + + /* convert to directory symlink if target exists */ + switch (process_phantom_symlink(wtarget, wlink)) { + case PHANTOM_SYMLINK_RETRY: { + /* if target doesn't exist, add to phantom symlinks list */ + wchar_t wfullpath[MAX_LONG_PATH]; + struct phantom_symlink_info *psi; + + /* convert to absolute path to be independent of cwd */ + len = GetFullPathNameW(wlink, MAX_LONG_PATH, wfullpath, NULL); + if (!len || len >= MAX_LONG_PATH) { + errno = err_win_to_posix(GetLastError()); + return -1; + } + + /* over-allocate and fill phantom_symlink_info structure */ + psi = xmalloc(sizeof(struct phantom_symlink_info) + + sizeof(wchar_t) * (len + wcslen(wtarget) + 2)); + psi->wlink = (wchar_t *)(psi + 1); + wcscpy(psi->wlink, wfullpath); + psi->wtarget = psi->wlink + len + 1; + wcscpy(psi->wtarget, wtarget); + + EnterCriticalSection(&phantom_symlinks_cs); + psi->next = phantom_symlinks; + phantom_symlinks = psi; + LeaveCriticalSection(&phantom_symlinks_cs); + break; + } + case PHANTOM_SYMLINK_DIRECTORY: + /* if we created a dir symlink, process other phantom symlinks */ + process_phantom_symlinks(); + break; + default: + break; + } return 0; } @@ -3743,6 +3901,7 @@ int wmain(int argc, const wchar_t **wargv) /* initialize critical section for waitpid pinfo_t list */ InitializeCriticalSection(&pinfo_cs); + InitializeCriticalSection(&phantom_symlinks_cs); /* initialize critical section for fscache */ InitializeCriticalSection(&fscache_cs); From f763ec72a058d8c85d41947a39de2b72c64a4ca3 Mon Sep 17 00:00:00 2001 From: JiSeop Moon <zcube@zcube.kr> Date: Mon, 23 Apr 2018 22:30:18 +0900 Subject: [PATCH 250/297] mingw: introduce code to detect whether we're inside a Windows container This will come in handy in the next commit. Signed-off-by: JiSeop Moon <zcube@zcube.kr> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- compat/mingw.c | 32 ++++++++++++++++++++++++++++++++ compat/mingw.h | 5 +++++ 2 files changed, 37 insertions(+) diff --git a/compat/mingw.c b/compat/mingw.c index 4626f98e522058..fbbbe86f8e21c4 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -3996,3 +3996,35 @@ int mingw_have_unix_sockets(void) return ret; } #endif + +/* + * Based on https://stackoverflow.com/questions/43002803 + * + * [HKLM\SYSTEM\CurrentControlSet\Services\cexecsvc] + * "DisplayName"="@%systemroot%\\system32\\cexecsvc.exe,-100" + * "ErrorControl"=dword:00000001 + * "ImagePath"=hex(2):25,00,73,00,79,00,73,00,74,00,65,00,6d,00,72,00,6f,00, + * 6f,00,74,00,25,00,5c,00,73,00,79,00,73,00,74,00,65,00,6d,00,33,00,32,00, + * 5c,00,63,00,65,00,78,00,65,00,63,00,73,00,76,00,63,00,2e,00,65,00,78,00, + * 65,00,00,00 + * "Start"=dword:00000002 + * "Type"=dword:00000010 + * "Description"="@%systemroot%\\system32\\cexecsvc.exe,-101" + * "ObjectName"="LocalSystem" + * "ServiceSidType"=dword:00000001 + */ +int is_inside_windows_container(void) +{ + static int inside_container = -1; /* -1 uninitialized */ + const char *key = "SYSTEM\\CurrentControlSet\\Services\\cexecsvc"; + HKEY handle = NULL; + + if (inside_container != -1) + return inside_container; + + inside_container = ERROR_SUCCESS == + RegOpenKeyExA(HKEY_LOCAL_MACHINE, key, 0, KEY_READ, &handle); + RegCloseKey(handle); + + return inside_container; +} diff --git a/compat/mingw.h b/compat/mingw.h index a74024f8b060cd..f085a0806941ad 100644 --- a/compat/mingw.h +++ b/compat/mingw.h @@ -728,3 +728,8 @@ int mingw_have_unix_sockets(void); #undef have_unix_sockets #define have_unix_sockets mingw_have_unix_sockets #endif + +/* + * Check current process is inside Windows Container. + */ +int is_inside_windows_container(void); From c96e81681304eec92f0fbc34526e8ec948d589d2 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Tue, 30 May 2017 21:50:57 +0200 Subject: [PATCH 251/297] mingw: try to create symlinks without elevated permissions With Windows 10 Build 14972 in Developer Mode, a new flag is supported by CreateSymbolicLink() to create symbolic links even when running outside of an elevated session (which was previously required). This new flag is called SYMBOLIC_LINK_FLAG_ALLOW_UNPRIVILEGED_CREATE and has the numeric value 0x02. Previous Windows 10 versions will not understand that flag and return an ERROR_INVALID_PARAMETER, therefore we have to be careful to try passing that flag only when the build number indicates that it is supported. For more information about the new flag, see this blog post: https://blogs.windows.com/buildingapps/2016/12/02/symlinks-windows-10/ This patch is loosely based on the patch submitted by Samuel D. Leslie as https://github.com/git-for-windows/git/pull/1184. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- compat/mingw.c | 26 ++++++++++++++++++++++++-- 1 file changed, 24 insertions(+), 2 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 9f0bf4b3a058d9..0c39289654520c 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -365,6 +365,8 @@ static const wchar_t *make_relative_to(const wchar_t *path, return out; } +static DWORD symlink_file_flags = 0, symlink_directory_flags = 1; + enum phantom_symlink_result { PHANTOM_SYMLINK_RETRY, PHANTOM_SYMLINK_DONE, @@ -415,7 +417,8 @@ process_phantom_symlink(const wchar_t *wtarget, const wchar_t *wlink) return PHANTOM_SYMLINK_DONE; /* otherwise recreate the symlink with directory flag */ - if (DeleteFileW(wlink) && CreateSymbolicLinkW(wlink, wtarget, 1)) + if (DeleteFileW(wlink) && + CreateSymbolicLinkW(wlink, wtarget, symlink_directory_flags)) return PHANTOM_SYMLINK_DIRECTORY; errno = err_win_to_posix(GetLastError()); @@ -2901,7 +2904,7 @@ int symlink(const char *target, const char *link) wtarget[len] = '\\'; /* create file symlink */ - if (!CreateSymbolicLinkW(wlink, wtarget, 0)) { + if (!CreateSymbolicLinkW(wlink, wtarget, symlink_file_flags)) { errno = err_win_to_posix(GetLastError()); return -1; } @@ -3837,6 +3840,24 @@ static void maybe_redirect_std_handles(void) GENERIC_WRITE, FILE_FLAG_NO_BUFFERING); } +static void adjust_symlink_flags(void) +{ + /* + * Starting with Windows 10 Build 14972, symbolic links can be created + * using CreateSymbolicLink() without elevation by passing the flag + * SYMBOLIC_LINK_FLAG_ALLOW_UNPRIVILEGED_CREATE (0x02) as last + * parameter, provided the Developer Mode has been enabled. Some + * earlier Windows versions complain about this flag with an + * ERROR_INVALID_PARAMETER, hence we have to test the build number + * specifically. + */ + if (GetVersion() >= 14972 << 16) { + symlink_file_flags |= 2; + symlink_directory_flags |= 2; + } + +} + #ifdef _MSC_VER #ifdef _DEBUG #include <crtdbg.h> @@ -3871,6 +3892,7 @@ int wmain(int argc, const wchar_t **wargv) #endif maybe_redirect_std_handles(); + adjust_symlink_flags(); fsync_object_files = 1; /* determine size of argv and environ conversion buffer */ From b1a2f3cc60bf4c231aa27982c3d779c9785ada8c Mon Sep 17 00:00:00 2001 From: JiSeop Moon <zcube@zcube.kr> Date: Mon, 23 Apr 2018 22:31:42 +0200 Subject: [PATCH 252/297] mingw: when running in a Windows container, try to rename() harder It is a known issue that a rename() can fail with an "Access denied" error at times, when copying followed by deleting the original file works. Let's just fall back to that behavior. Signed-off-by: JiSeop Moon <zcube@zcube.kr> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- compat/mingw.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/compat/mingw.c b/compat/mingw.c index fbbbe86f8e21c4..8df2e9740d7b42 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -2590,6 +2590,13 @@ int mingw_rename(const char *pold, const char *pnew) return 0; gle = GetLastError(); + if (gle == ERROR_ACCESS_DENIED && is_inside_windows_container()) { + /* Fall back to copy to destination & remove source */ + if (CopyFileW(wpold, wpnew, FALSE) && !mingw_unlink(pold)) + return 0; + gle = GetLastError(); + } + /* revert file attributes on failure */ if (attrs != INVALID_FILE_ATTRIBUTES) SetFileAttributesW(wpnew, attrs); From c582bd8e81cd7e280981e1515d2c410bbf28ed3f Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Mon, 2 Mar 2020 21:54:29 +0100 Subject: [PATCH 253/297] mingw: emulate stat() a little more faithfully When creating directories via `safe_create_leading_directories()`, we might encounter an already-existing directory which is not readable by the current user. To handle that situation, Git's code calls `stat()` to determine whether we're looking at a directory. In such a case, `CreateFile()` will fail, though, no matter what, and consequently `mingw_stat()` will fail, too. But POSIX semantics seem to still allow `stat()` to go forward. So let's call `mingw_lstat()` for the rescue if we fail to get a file handle due to denied permission in `mingw_stat()`, and fill the stat info that way. We need to be careful to not allow this to go forward in case that we're looking at a symbolic link: to resolve the link, we would still have to create a file handle, and we just found out that we cannot. Therefore, `stat()` still needs to fail with `EACCES` in that case. This fixes https://github.com/git-for-windows/git/issues/2531. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- compat/mingw.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/compat/mingw.c b/compat/mingw.c index 0c39289654520c..4626f98e522058 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -1127,7 +1127,19 @@ int mingw_stat(const char *file_name, struct stat *buf) FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, NULL, OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL); if (hnd == INVALID_HANDLE_VALUE) { - errno = err_win_to_posix(GetLastError()); + DWORD err = GetLastError(); + + if (err == ERROR_ACCESS_DENIED && + !mingw_lstat(file_name, buf) && + !S_ISLNK(buf->st_mode)) + /* + * POSIX semantics state to still try to fill + * information, even if permission is denied to create + * a file handle. + */ + return 0; + + errno = err_win_to_posix(err); return -1; } result = get_file_info_by_handle(hnd, buf); From 38d49a63803b63d5acee54d4896440cc8feeda16 Mon Sep 17 00:00:00 2001 From: JiSeop Moon <zcube@zcube.kr> Date: Mon, 23 Apr 2018 22:35:26 +0200 Subject: [PATCH 254/297] mingw: move the file_attr_to_st_mode() function definition In preparation for making this function a bit more complicated (to allow for special-casing the `ContainerMappedDirectories` in Windows containers, which look like a symbolic link, but are not), let's move it out of the header. Signed-off-by: JiSeop Moon <zcube@zcube.kr> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- compat/mingw.c | 14 ++++++++++++++ compat/win32.h | 14 +------------- 2 files changed, 15 insertions(+), 13 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 8df2e9740d7b42..2a1bba67560225 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -4035,3 +4035,17 @@ int is_inside_windows_container(void) return inside_container; } + +int file_attr_to_st_mode (DWORD attr, DWORD tag) +{ + int fMode = S_IREAD; + if ((attr & FILE_ATTRIBUTE_REPARSE_POINT) && tag == IO_REPARSE_TAG_SYMLINK) + fMode |= S_IFLNK; + else if (attr & FILE_ATTRIBUTE_DIRECTORY) + fMode |= S_IFDIR; + else + fMode |= S_IFREG; + if (!(attr & FILE_ATTRIBUTE_READONLY)) + fMode |= S_IWRITE; + return fMode; +} diff --git a/compat/win32.h b/compat/win32.h index 671bcc81f93351..52169ae19f4371 100644 --- a/compat/win32.h +++ b/compat/win32.h @@ -6,19 +6,7 @@ #include <windows.h> #endif -static inline int file_attr_to_st_mode (DWORD attr, DWORD tag) -{ - int fMode = S_IREAD; - if ((attr & FILE_ATTRIBUTE_REPARSE_POINT) && tag == IO_REPARSE_TAG_SYMLINK) - fMode |= S_IFLNK; - else if (attr & FILE_ATTRIBUTE_DIRECTORY) - fMode |= S_IFDIR; - else - fMode |= S_IFREG; - if (!(attr & FILE_ATTRIBUTE_READONLY)) - fMode |= S_IWRITE; - return fMode; -} +extern int file_attr_to_st_mode (DWORD attr, DWORD tag); static inline int get_file_attr(const char *fname, WIN32_FILE_ATTRIBUTE_DATA *fdata) { From 97d618ec5310b3c31cb144c2c73a208c524402aa Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Thu, 4 Jun 2020 23:16:07 +0200 Subject: [PATCH 255/297] mingw: special-case index entries for symlinks with buggy size In https://github.com/git-for-windows/git/pull/2637, we fixed a bug where symbolic links' target path sizes were recorded incorrectly in the index. The downside of this fix was that every user with tracked symbolic links in their checkouts would see them as modified in `git status`, but not in `git diff`, and only a `git add <path>` (or `git add -u`) would "fix" this. Let's do better than that: we can detect that situation and simply pretend that a symbolic link with a known bad size (or a size that just happens to be that bad size, a _very_ unlikely scenario because it would overflow our buffers due to the trailing NUL byte) means that it needs to be re-checked as if we had just checked it out. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- read-cache.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/read-cache.c b/read-cache.c index 33ae3c30b68728..4699180724c0f2 100644 --- a/read-cache.c +++ b/read-cache.c @@ -469,6 +469,17 @@ int ie_modified(struct index_state *istate, * then we know it is. */ if ((changed & DATA_CHANGED) && +#ifdef GIT_WINDOWS_NATIVE + /* + * Work around Git for Windows v2.27.0 fixing a bug where symlinks' + * target path lengths were not read at all, and instead recorded + * as 4096: now, all symlinks would appear as modified. + * + * So let's just special-case symlinks with a target path length + * (i.e. `sd_size`) of 4096 and force them to be re-checked. + */ + (!S_ISLNK(st->st_mode) || ce->ce_stat_data.sd_size != MAX_LONG_PATH) && +#endif (S_ISGITLINK(ce->ce_mode) || ce->ce_stat_data.sd_size != 0)) return changed; From 6b1d24d06182e13bc125ca818634174c7e67a8b4 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Thu, 20 Jul 2017 22:45:01 +0200 Subject: [PATCH 256/297] mingw: explicitly specify with which cmd to prefix the cmdline The main idea of this patch is that even if we have to look up the absolute path of the script, if only the basename was specified as argv[0], then we should use that basename on the command line, too, not the absolute path. This patch will also help with the upcoming patch where we automatically substitute "sh ..." by "busybox sh ..." if "sh" is not in the PATH but "busybox" is: we will do that by substituting the actual executable, but still keep prepending "sh" to the command line. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- compat/mingw.c | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 6b3b62333ec13e..4c964d38280b2e 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -1896,8 +1896,8 @@ static int is_msys2_sh(const char *cmd) } static pid_t mingw_spawnve_fd(const char *cmd, const char **argv, char **deltaenv, - const char *dir, - int prepend_cmd, int fhin, int fhout, int fherr) + const char *dir, const char *prepend_cmd, + int fhin, int fhout, int fherr) { static int restrict_handle_inheritance = -1; STARTUPINFOEXW si; @@ -1988,9 +1988,9 @@ static pid_t mingw_spawnve_fd(const char *cmd, const char **argv, char **deltaen /* concatenate argv, quoting args as we go */ strbuf_init(&args, 0); if (prepend_cmd) { - char *quoted = (char *)quote_arg(cmd); + char *quoted = (char *)quote_arg(prepend_cmd); strbuf_addstr(&args, quoted); - if (quoted != cmd) + if (quoted != prepend_cmd) free(quoted); } for (; *argv; argv++) { @@ -2149,7 +2149,8 @@ static pid_t mingw_spawnve_fd(const char *cmd, const char **argv, char **deltaen return (pid_t)pi.dwProcessId; } -static pid_t mingw_spawnv(const char *cmd, const char **argv, int prepend_cmd) +static pid_t mingw_spawnv(const char *cmd, const char **argv, + const char *prepend_cmd) { return mingw_spawnve_fd(cmd, argv, NULL, NULL, prepend_cmd, 0, 1, 2); } @@ -2177,14 +2178,14 @@ pid_t mingw_spawnvpe(const char *cmd, const char **argv, char **deltaenv, pid = -1; } else { - pid = mingw_spawnve_fd(iprog, argv, deltaenv, dir, 1, + pid = mingw_spawnve_fd(iprog, argv, deltaenv, dir, interpr, fhin, fhout, fherr); free(iprog); } argv[0] = argv0; } else - pid = mingw_spawnve_fd(prog, argv, deltaenv, dir, 0, + pid = mingw_spawnve_fd(prog, argv, deltaenv, dir, NULL, fhin, fhout, fherr); free(prog); } @@ -2209,7 +2210,7 @@ static int try_shell_exec(const char *cmd, char *const *argv) argv2[0] = (char *)cmd; /* full path to the script file */ COPY_ARRAY(&argv2[1], &argv[1], argc); exec_id = trace2_exec(prog, (const char **)argv2); - pid = mingw_spawnv(prog, (const char **)argv2, 1); + pid = mingw_spawnv(prog, (const char **)argv2, interpr); if (pid >= 0) { int status; if (waitpid(pid, &status, 0) < 0) @@ -2233,7 +2234,7 @@ int mingw_execv(const char *cmd, char *const *argv) int exec_id; exec_id = trace2_exec(cmd, (const char **)argv); - pid = mingw_spawnv(cmd, (const char **)argv, 0); + pid = mingw_spawnv(cmd, (const char **)argv, NULL); if (pid < 0) { trace2_exec_result(exec_id, -1); return -1; From a850248ca37eba729a09b0a91503ef41565ee04c Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Thu, 20 Jul 2017 20:41:29 +0200 Subject: [PATCH 257/297] mingw: when path_lookup() failed, try BusyBox BusyBox comes with a ton of applets ("applet" being the identical concept to Git's "builtins"). And similar to Git's builtins, the applets can be called via `busybox <command>`, or the BusyBox executable can be copied/hard-linked to the command name. The similarities do not end here. Just as with Git's builtins, it is problematic that BusyBox' hard-linked applets cannot easily be put into a .zip file: .zip archives have no concept of hard-links and therefore would store identical copies (and also extract identical copies, "inflating" the archive unnecessarily). To counteract that issue, MinGit already ships without hard-linked copies of the builtins, and the plan is to do the same with BusyBox' applets: simply ship busybox.exe as single executable, without hard-linked applets. To accommodate that, Git is being taught by this commit a very special trick, exploiting the fact that it is possible to call an executable with a command-line whose argv[0] is different from the executable's name: when `sh` is to be spawned, and no `sh` is found in the PATH, but busybox.exe is, use that executable (with unchanged argv). Likewise, if any executable to be spawned is not on the PATH, but busybox.exe is found, parse the output of `busybox.exe --help` to find out what applets are included, and if the command matches an included applet name, use busybox.exe to execute it. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- compat/mingw.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++++ t/t0014-alias.sh | 2 +- 2 files changed, 64 insertions(+), 1 deletion(-) diff --git a/compat/mingw.c b/compat/mingw.c index 4c964d38280b2e..e61981cdaafa10 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -24,6 +24,7 @@ #include "../repository.h" #include "win32/fscache.h" #include "../attr.h" +#include "../string-list.h" #define HCAST(type, handle) ((type)(intptr_t)handle) @@ -1665,6 +1666,65 @@ static char *lookup_prog(const char *dir, int dirlen, const char *cmd, return NULL; } +static char *path_lookup(const char *cmd, int exe_only); + +static char *is_busybox_applet(const char *cmd) +{ + static struct string_list applets = STRING_LIST_INIT_DUP; + static char *busybox_path; + static int busybox_path_initialized; + + /* Avoid infinite loop */ + if (!strncasecmp(cmd, "busybox", 7) && + (!cmd[7] || !strcasecmp(cmd + 7, ".exe"))) + return NULL; + + if (!busybox_path_initialized) { + busybox_path = path_lookup("busybox.exe", 1); + busybox_path_initialized = 1; + } + + /* Assume that sh is compiled in... */ + if (!busybox_path || !strcasecmp(cmd, "sh")) + return xstrdup_or_null(busybox_path); + + if (!applets.nr) { + struct child_process cp = CHILD_PROCESS_INIT; + struct strbuf buf = STRBUF_INIT; + char *p; + + strvec_pushl(&cp.args, busybox_path, "--help", NULL); + + if (capture_command(&cp, &buf, 2048)) { + string_list_append(&applets, ""); + return NULL; + } + + /* parse output */ + p = strstr(buf.buf, "Currently defined functions:\n"); + if (!p) { + warning("Could not parse output of busybox --help"); + string_list_append(&applets, ""); + return NULL; + } + p = strchrnul(p, '\n'); + for (;;) { + size_t len; + + p += strspn(p, "\n\t ,"); + len = strcspn(p, "\n\t ,"); + if (!len) + break; + p[len] = '\0'; + string_list_insert(&applets, p); + p = p + len + 1; + } + } + + return string_list_has_string(&applets, cmd) ? + xstrdup(busybox_path) : NULL; +} + /* * Determines the absolute path of cmd using the split path in path. * If cmd contains a slash or backslash, no lookup is performed. @@ -1693,6 +1753,9 @@ static char *path_lookup(const char *cmd, int exe_only) path = sep + 1; } + if (!prog && !isexe) + prog = is_busybox_applet(cmd); + return prog; } diff --git a/t/t0014-alias.sh b/t/t0014-alias.sh index 30708146887d19..56d70b73e1c165 100755 --- a/t/t0014-alias.sh +++ b/t/t0014-alias.sh @@ -39,7 +39,7 @@ test_expect_success 'looping aliases - internal execution' ' test_expect_success 'run-command formats empty args properly' ' test_must_fail env GIT_TRACE=1 git frotz a "" b " " c 2>actual.raw && - sed -ne "/run_command:/s/.*trace: run_command: //p" actual.raw >actual && + sed -ne "/run_command: git-frotz/s/.*trace: run_command: //p" actual.raw >actual && echo "git-frotz a '\'''\'' b '\'' '\'' c" >expect && test_cmp expect actual ' From 41f035fd56c0500d4125d669654020101c0eaa22 Mon Sep 17 00:00:00 2001 From: Bert Belder <bertbelder@gmail.com> Date: Fri, 26 Oct 2018 11:13:45 +0200 Subject: [PATCH 258/297] Win32: symlink: move phantom symlink creation to a separate function Signed-off-by: Bert Belder <bertbelder@gmail.com> --- compat/mingw.c | 91 +++++++++++++++++++++++++++----------------------- 1 file changed, 49 insertions(+), 42 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 4626f98e522058..081cf9250c14df 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -458,6 +458,54 @@ static void process_phantom_symlinks(void) LeaveCriticalSection(&phantom_symlinks_cs); } +static int create_phantom_symlink(wchar_t *wtarget, wchar_t *wlink) +{ + int len; + + /* create file symlink */ + if (!CreateSymbolicLinkW(wlink, wtarget, symlink_file_flags)) { + errno = err_win_to_posix(GetLastError()); + return -1; + } + + /* convert to directory symlink if target exists */ + switch (process_phantom_symlink(wtarget, wlink)) { + case PHANTOM_SYMLINK_RETRY: { + /* if target doesn't exist, add to phantom symlinks list */ + wchar_t wfullpath[MAX_LONG_PATH]; + struct phantom_symlink_info *psi; + + /* convert to absolute path to be independent of cwd */ + len = GetFullPathNameW(wlink, MAX_LONG_PATH, wfullpath, NULL); + if (!len || len >= MAX_LONG_PATH) { + errno = err_win_to_posix(GetLastError()); + return -1; + } + + /* over-allocate and fill phantom_symlink_info structure */ + psi = xmalloc(sizeof(struct phantom_symlink_info) + + sizeof(wchar_t) * (len + wcslen(wtarget) + 2)); + psi->wlink = (wchar_t *)(psi + 1); + wcscpy(psi->wlink, wfullpath); + psi->wtarget = psi->wlink + len + 1; + wcscpy(psi->wtarget, wtarget); + + EnterCriticalSection(&phantom_symlinks_cs); + psi->next = phantom_symlinks; + phantom_symlinks = psi; + LeaveCriticalSection(&phantom_symlinks_cs); + break; + } + case PHANTOM_SYMLINK_DIRECTORY: + /* if we created a dir symlink, process other phantom symlinks */ + process_phantom_symlinks(); + break; + default: + break; + } + return 0; +} + /* Normalizes NT paths as returned by some low-level APIs. */ static wchar_t *normalize_ntpath(wchar_t *wbuf) { @@ -2915,48 +2963,7 @@ int symlink(const char *target, const char *link) if (wtarget[len] == '/') wtarget[len] = '\\'; - /* create file symlink */ - if (!CreateSymbolicLinkW(wlink, wtarget, symlink_file_flags)) { - errno = err_win_to_posix(GetLastError()); - return -1; - } - - /* convert to directory symlink if target exists */ - switch (process_phantom_symlink(wtarget, wlink)) { - case PHANTOM_SYMLINK_RETRY: { - /* if target doesn't exist, add to phantom symlinks list */ - wchar_t wfullpath[MAX_LONG_PATH]; - struct phantom_symlink_info *psi; - - /* convert to absolute path to be independent of cwd */ - len = GetFullPathNameW(wlink, MAX_LONG_PATH, wfullpath, NULL); - if (!len || len >= MAX_LONG_PATH) { - errno = err_win_to_posix(GetLastError()); - return -1; - } - - /* over-allocate and fill phantom_symlink_info structure */ - psi = xmalloc(sizeof(struct phantom_symlink_info) - + sizeof(wchar_t) * (len + wcslen(wtarget) + 2)); - psi->wlink = (wchar_t *)(psi + 1); - wcscpy(psi->wlink, wfullpath); - psi->wtarget = psi->wlink + len + 1; - wcscpy(psi->wtarget, wtarget); - - EnterCriticalSection(&phantom_symlinks_cs); - psi->next = phantom_symlinks; - phantom_symlinks = psi; - LeaveCriticalSection(&phantom_symlinks_cs); - break; - } - case PHANTOM_SYMLINK_DIRECTORY: - /* if we created a dir symlink, process other phantom symlinks */ - process_phantom_symlinks(); - break; - default: - break; - } - return 0; + return create_phantom_symlink(wtarget, wlink); } #ifndef _WINNT_H From 78a45cd6420cf149ae98b24911da59b69e133a43 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Sat, 5 Aug 2017 22:23:36 +0200 Subject: [PATCH 259/297] test-lib: avoid unnecessary Perl invocation It is a bit strange, and even undesirable, to require Perl just to run the test suite even when NO_PERL was set. This patch does not fix this problem by any stretch of imagination. However, it fixes *the* Perl invocation that *every single* test script has to run. While at it, it makes the source code also more grep'able, as the code that unsets some, but not all, GIT_* environment variables just became a *lot* more explicit. And all that while still reducing the total number of lines. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/test-lib.sh | 29 ++++++++++++----------------- 1 file changed, 12 insertions(+), 17 deletions(-) diff --git a/t/test-lib.sh b/t/test-lib.sh index 54247604cbc76e..36a9e26938a42e 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -500,23 +500,18 @@ EDITOR=: # /usr/xpg4/bin/sh and /bin/ksh to bail out. So keep the unsets # deriving from the command substitution clustered with the other # ones. -unset VISUAL EMAIL LANGUAGE $("$PERL_PATH" -e ' - my @env = keys %ENV; - my $ok = join("|", qw( - TRACE - DEBUG - TEST - .*_TEST - PROVE - VALGRIND - UNZIP - PERF_ - CURL_VERBOSE - TRACE_CURL - )); - my @vars = grep(/^GIT_/ && !/^GIT_($ok)/o, @env); - print join("\n", @vars); -') +unset VISUAL EMAIL LANGUAGE $(env | sed -n \ + -e '/^GIT_TRACE/d' \ + -e '/^GIT_DEBUG/d' \ + -e '/^GIT_TEST/d' \ + -e '/^GIT_.*_TEST/d' \ + -e '/^GIT_PROVE/d' \ + -e '/^GIT_VALGRIND/d' \ + -e '/^GIT_UNZIP/d' \ + -e '/^GIT_PERF_/d' \ + -e '/^GIT_CURL_VERBOSE/d' \ + -e '/^GIT_TRACE_CURL/d' \ + -e 's/^\(GIT_[^=]*\)=.*/\1/p') unset XDG_CACHE_HOME unset XDG_CONFIG_HOME unset GITPERLLIB From ceee7f1c16a38f189b0431cebb3e0f6ef0434f09 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Mon, 11 Feb 2019 14:19:18 +0100 Subject: [PATCH 260/297] Introduce helper to create symlinks that knows about index_state On Windows, symbolic links actually have a type depending on the target: it can be a file or a directory. In certain circumstances, this poses problems, e.g. when a symbolic link is supposed to point into a submodule that is not checked out, so there is no way for Git to auto-detect the type. To help with that, we will add support over the course of the next commits to specify that symlink type via the Git attributes. This requires an index_state, though, something that Git for Windows' `symlink()` replacement cannot know about because the function signature is defined by the POSIX standard and not ours to change. So let's introduce a helper function to create symbolic links that *does* know about the index_state. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- apply.c | 2 +- builtin/difftool.c | 2 +- compat/mingw.c | 2 +- compat/mingw.h | 4 +++- entry.c | 2 +- git-compat-util.h | 9 +++++++++ merge-recursive.c | 2 +- refs/files-backend.c | 2 +- setup.c | 4 ++-- 9 files changed, 20 insertions(+), 9 deletions(-) diff --git a/apply.c b/apply.c index 6e1060a952c317..d2a34abafe480e 100644 --- a/apply.c +++ b/apply.c @@ -4440,7 +4440,7 @@ static int try_create_file(struct apply_state *state, const char *path, /* Although buf:size is counted string, it also is NUL * terminated. */ - return !!symlink(buf, path); + return !!create_symlink(state && state->repo ? state->repo->index : NULL, buf, path); fd = open(path, O_CREAT | O_EXCL | O_WRONLY, (mode & 0100) ? 0777 : 0666); if (fd < 0) diff --git a/builtin/difftool.c b/builtin/difftool.c index dcc68e190c25b4..5cca20276a06c2 100644 --- a/builtin/difftool.c +++ b/builtin/difftool.c @@ -523,7 +523,7 @@ static int run_dir_diff(const char *extcmd, int symlinks, const char *prefix, } add_path(&wtdir, wtdir_len, dst_path); if (symlinks) { - if (symlink(wtdir.buf, rdir.buf)) { + if (create_symlink(lstate.istate, wtdir.buf, rdir.buf)) { ret = error_errno("could not symlink '%s' to '%s'", wtdir.buf, rdir.buf); goto finish; } diff --git a/compat/mingw.c b/compat/mingw.c index 081cf9250c14df..51821d5311f149 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -2943,7 +2943,7 @@ int link(const char *oldpath, const char *newpath) return 0; } -int symlink(const char *target, const char *link) +int mingw_create_symlink(struct index_state *index, const char *target, const char *link) { wchar_t wtarget[MAX_LONG_PATH], wlink[MAX_LONG_PATH]; int len; diff --git a/compat/mingw.h b/compat/mingw.h index a74024f8b060cd..151f26674d185d 100644 --- a/compat/mingw.h +++ b/compat/mingw.h @@ -218,8 +218,10 @@ int setitimer(int type, struct itimerval *in, struct itimerval *out); int sigaction(int sig, struct sigaction *in, struct sigaction *out); int link(const char *oldpath, const char *newpath); int uname(struct utsname *buf); -int symlink(const char *target, const char *link); int readlink(const char *path, char *buf, size_t bufsiz); +struct index_state; +int mingw_create_symlink(struct index_state *index, const char *target, const char *link); +#define create_symlink mingw_create_symlink /* * replacements of existing functions diff --git a/entry.c b/entry.c index 75fdd39ff7db40..2f92fdb2fe069e 100644 --- a/entry.c +++ b/entry.c @@ -320,7 +320,7 @@ static int write_entry(struct cache_entry *ce, char *path, struct conv_attrs *ca if (!has_symlinks || to_tempfile) goto write_file_entry; - ret = symlink(new_blob, path); + ret = create_symlink(state->istate, new_blob, path); free(new_blob); if (ret) return error_errno("unable to create symlink %s", path); diff --git a/git-compat-util.h b/git-compat-util.h index c9b4513c124b5a..53347307ad533a 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -627,6 +627,15 @@ static inline int git_has_dir_sep(const char *path) #define is_mount_point is_mount_point_via_stat #endif +#ifndef create_symlink +struct index_state; +static inline int git_create_symlink(struct index_state *index, const char *target, const char *link) +{ + return symlink(target, link); +} +#define create_symlink git_create_symlink +#endif + #ifndef query_user_email #define query_user_email() NULL #endif diff --git a/merge-recursive.c b/merge-recursive.c index 5cc638066af599..f606324dee3bdc 100644 --- a/merge-recursive.c +++ b/merge-recursive.c @@ -1004,7 +1004,7 @@ static int update_file_flags(struct merge_options *opt, char *lnk = xmemdupz(buf, size); safe_create_leading_directories_const(path); unlink(path); - if (symlink(lnk, path)) + if (create_symlink(&opt->priv->orig_index, lnk, path)) ret = err(opt, _("failed to symlink '%s': %s"), path, strerror(errno)); free(lnk); diff --git a/refs/files-backend.c b/refs/files-backend.c index 11551de8f84554..e2fdf8a1eb8a3e 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1942,7 +1942,7 @@ static int create_ref_symlink(struct ref_lock *lock, const char *target) #ifndef NO_SYMLINK_HEAD char *ref_path = get_locked_file_path(&lock->lk); unlink(ref_path); - ret = symlink(target, ref_path); + ret = create_symlink(NULL, target, ref_path); free(ref_path); if (ret) diff --git a/setup.c b/setup.c index 2b44421660c1af..bd303e5de6b01f 100644 --- a/setup.c +++ b/setup.c @@ -2021,7 +2021,7 @@ static void copy_templates_1(struct strbuf *path, struct strbuf *template_path, if (strbuf_readlink(&lnk, template_path->buf, st_template.st_size) < 0) die_errno(_("cannot readlink '%s'"), template_path->buf); - if (symlink(lnk.buf, path->buf)) + if (create_symlink(NULL, lnk.buf, path->buf)) die_errno(_("cannot symlink '%s' '%s'"), lnk.buf, path->buf); strbuf_release(&lnk); @@ -2268,7 +2268,7 @@ static int create_default_files(const char *template_path, path = git_path_buf(&buf, "tXXXXXX"); if (!close(xmkstemp(path)) && !unlink(path) && - !symlink("testing", path) && + !create_symlink(NULL, "testing", path) && !lstat(path, &st1) && S_ISLNK(st1.st_mode)) unlink(path); /* good */ From 4bcaf18995f2479a11291c257a473ac479738abd Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Thu, 20 Jul 2017 22:18:56 +0200 Subject: [PATCH 261/297] test-tool: learn to act as a drop-in replacement for `iconv` It is convenient to assume that everybody who wants to build & test Git has access to a working `iconv` executable (after all, we already pretty much require libiconv). However, that limits esoteric test scenarios such as Git for Windows', where an end user installation has to ship with `iconv` for the sole purpose of being testable. That payload serves no other purpose. So let's just have a test helper (to be able to test Git, the test helpers have to be available, after all) to act as `iconv` replacement. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- Makefile | 1 + t/helper/test-iconv.c | 47 +++++++++++++++++++++++++++++++++++++++++++ t/helper/test-tool.c | 1 + t/helper/test-tool.h | 1 + 4 files changed, 50 insertions(+) create mode 100644 t/helper/test-iconv.c diff --git a/Makefile b/Makefile index d97d96c2fed10a..b72d0730199cbc 100644 --- a/Makefile +++ b/Makefile @@ -808,6 +808,7 @@ TEST_BUILTINS_OBJS += test-hash-speed.o TEST_BUILTINS_OBJS += test-hash.o TEST_BUILTINS_OBJS += test-hashmap.o TEST_BUILTINS_OBJS += test-hexdump.o +TEST_BUILTINS_OBJS += test-iconv.o TEST_BUILTINS_OBJS += test-json-writer.o TEST_BUILTINS_OBJS += test-lazy-init-name-hash.o TEST_BUILTINS_OBJS += test-match-trees.o diff --git a/t/helper/test-iconv.c b/t/helper/test-iconv.c new file mode 100644 index 00000000000000..d3c772fddf990b --- /dev/null +++ b/t/helper/test-iconv.c @@ -0,0 +1,47 @@ +#include "test-tool.h" +#include "git-compat-util.h" +#include "strbuf.h" +#include "gettext.h" +#include "parse-options.h" +#include "utf8.h" + +int cmd__iconv(int argc, const char **argv) +{ + struct strbuf buf = STRBUF_INIT; + char *from = NULL, *to = NULL, *p; + size_t len; + int ret = 0; + const char * const iconv_usage[] = { + N_("test-helper --iconv [<options>]"), + NULL + }; + struct option options[] = { + OPT_STRING('f', "from-code", &from, "encoding", "from"), + OPT_STRING('t', "to-code", &to, "encoding", "to"), + OPT_END() + }; + + argc = parse_options(argc, argv, NULL, options, + iconv_usage, 0); + + if (argc > 1 || !from || !to) + usage_with_options(iconv_usage, options); + + if (!argc) { + if (strbuf_read(&buf, 0, 2048) < 0) + die_errno("Could not read from stdin"); + } else if (strbuf_read_file(&buf, argv[0], 2048) < 0) + die_errno("Could not read from '%s'", argv[0]); + + p = reencode_string_len(buf.buf, buf.len, to, from, &len); + if (!p) + die_errno("Could not reencode"); + if (write(1, p, len) < 0) + ret = !!error_errno("Could not write %"PRIuMAX" bytes", + (uintmax_t)len); + + strbuf_release(&buf); + free(p); + + return ret; +} diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c index da3e69128a976a..af909330f0eab6 100644 --- a/t/helper/test-tool.c +++ b/t/helper/test-tool.c @@ -38,6 +38,7 @@ static struct test_cmd cmds[] = { { "hashmap", cmd__hashmap }, { "hash-speed", cmd__hash_speed }, { "hexdump", cmd__hexdump }, + { "iconv", cmd__iconv }, { "json-writer", cmd__json_writer }, { "lazy-init-name-hash", cmd__lazy_init_name_hash }, { "match-trees", cmd__match_trees }, diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h index 642a34578c0047..6b6a514323ff10 100644 --- a/t/helper/test-tool.h +++ b/t/helper/test-tool.h @@ -32,6 +32,7 @@ int cmd__getcwd(int argc, const char **argv); int cmd__hashmap(int argc, const char **argv); int cmd__hash_speed(int argc, const char **argv); int cmd__hexdump(int argc, const char **argv); +int cmd__iconv(int argc, const char **argv); int cmd__json_writer(int argc, const char **argv); int cmd__lazy_init_name_hash(int argc, const char **argv); int cmd__match_trees(int argc, const char **argv); From 9f760168dba89eab8868ff0854bbdae6df66632c Mon Sep 17 00:00:00 2001 From: Bert Belder <bertbelder@gmail.com> Date: Fri, 26 Oct 2018 11:51:51 +0200 Subject: [PATCH 262/297] mingw: allow to specify the symlink type in .gitattributes On Windows, symbolic links have a type: a "file symlink" must point at a file, and a "directory symlink" must point at a directory. If the type of symlink does not match its target, it doesn't work. Git does not record the type of symlink in the index or in a tree. On checkout it'll guess the type, which only works if the target exists at the time the symlink is created. This may often not be the case, for example when the link points at a directory inside a submodule. By specifying `symlink=file` or `symlink=dir` the user can specify what type of symlink Git should create, so Git doesn't have to rely on unreliable heuristics. Signed-off-by: Bert Belder <bertbelder@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- Documentation/gitattributes.txt | 30 +++++++++++++++++ compat/mingw.c | 58 ++++++++++++++++++++++++++++++++- 2 files changed, 87 insertions(+), 1 deletion(-) diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt index e6150595af8210..f2c07fa1b72be7 100644 --- a/Documentation/gitattributes.txt +++ b/Documentation/gitattributes.txt @@ -403,6 +403,36 @@ sign `$` upon checkout. Any byte sequence that begins with with `$Id$` upon check-in. +`symlink` +^^^^^^^^^ + +On Windows, symbolic links have a type: a "file symlink" must point at +a file, and a "directory symlink" must point at a directory. If the +type of symlink does not match its target, it doesn't work. + +Git does not record the type of symlink in the index or in a tree. On +checkout it'll guess the type, which only works if the target exists +at the time the symlink is created. This may often not be the case, +for example when the link points at a directory inside a submodule. + +The `symlink` attribute allows you to explicitly set the type of symlink +to `file` or `dir`, so Git doesn't have to guess. If you have a set of +symlinks that point at other files, you can do: + +------------------------ +*.gif symlink=file +------------------------ + +To tell Git that a symlink points at a directory, use: + +------------------------ +tools_folder symlink=dir +------------------------ + +The `symlink` attribute is ignored on platforms other than Windows, +since they don't distinguish between different types of symlinks. + + `filter` ^^^^^^^^ diff --git a/compat/mingw.c b/compat/mingw.c index 51821d5311f149..6b3b62333ec13e 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -23,6 +23,7 @@ #include "../write-or-die.h" #include "../repository.h" #include "win32/fscache.h" +#include "../attr.h" #define HCAST(type, handle) ((type)(intptr_t)handle) @@ -2943,6 +2944,37 @@ int link(const char *oldpath, const char *newpath) return 0; } +enum symlink_type { + SYMLINK_TYPE_UNSPECIFIED = 0, + SYMLINK_TYPE_FILE, + SYMLINK_TYPE_DIRECTORY, +}; + +static enum symlink_type check_symlink_attr(struct index_state *index, const char *link) +{ + static struct attr_check *check; + const char *value; + + if (!index) + return SYMLINK_TYPE_UNSPECIFIED; + + if (!check) + check = attr_check_initl("symlink", NULL); + + git_check_attr(index, link, check); + + value = check->items[0].value; + if (ATTR_UNSET(value)) + return SYMLINK_TYPE_UNSPECIFIED; + if (!strcmp(value, "file")) + return SYMLINK_TYPE_FILE; + if (!strcmp(value, "dir") || !strcmp(value, "directory")) + return SYMLINK_TYPE_DIRECTORY; + + warning(_("ignoring invalid symlink type '%s' for '%s'"), value, link); + return SYMLINK_TYPE_UNSPECIFIED; +} + int mingw_create_symlink(struct index_state *index, const char *target, const char *link) { wchar_t wtarget[MAX_LONG_PATH], wlink[MAX_LONG_PATH]; @@ -2963,7 +2995,31 @@ int mingw_create_symlink(struct index_state *index, const char *target, const ch if (wtarget[len] == '/') wtarget[len] = '\\'; - return create_phantom_symlink(wtarget, wlink); + switch (check_symlink_attr(index, link)) { + case SYMLINK_TYPE_UNSPECIFIED: + /* Create a phantom symlink: it is initially created as a file + * symlink, but may change to a directory symlink later if/when + * the target exists. */ + return create_phantom_symlink(wtarget, wlink); + case SYMLINK_TYPE_FILE: + if (!CreateSymbolicLinkW(wlink, wtarget, symlink_file_flags)) + break; + return 0; + case SYMLINK_TYPE_DIRECTORY: + if (!CreateSymbolicLinkW(wlink, wtarget, + symlink_directory_flags)) + break; + /* There may be dangling phantom symlinks that point at this + * one, which should now morph into directory symlinks. */ + process_phantom_symlinks(); + return 0; + default: + BUG("unhandled symlink type"); + } + + /* CreateSymbolicLinkW failed. */ + errno = err_win_to_posix(GetLastError()); + return -1; } #ifndef _WINNT_H From b72eed65c216838546124bc354b3f89a878246e7 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Thu, 20 Jul 2017 22:25:21 +0200 Subject: [PATCH 263/297] tests(mingw): if `iconv` is unavailable, use `test-helper --iconv` Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/test-lib.sh | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/t/test-lib.sh b/t/test-lib.sh index 36a9e26938a42e..ee91f6c3b23f2e 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -1721,6 +1721,12 @@ case $uname_s in test_set_prereq GREP_STRIPS_CR test_set_prereq WINDOWS GIT_TEST_CMP="GIT_DIR=/dev/null git diff --no-index --ignore-cr-at-eol --" + if ! type iconv >/dev/null 2>&1 + then + iconv () { + test-tool iconv "$@" + } + fi ;; *CYGWIN*) test_set_prereq POSIXPERM From e86fe24de415be6adfe21baeb6564bf224f49ebe Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Mon, 23 Apr 2018 23:20:00 +0200 Subject: [PATCH 264/297] mingw: Windows Docker volumes are *not* symbolic links ... even if they may look like them. As looking up the target of the "symbolic link" (just to see whether it starts with `/ContainerMappedDirectories/`) is pretty expensive, we do it when we can be *really* sure that there is a possibility that this might be the case. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: JiSeop Moon <zcube@zcube.kr> --- compat/mingw.c | 25 +++++++++++++++++++------ compat/win32.h | 2 +- compat/win32/fscache.c | 24 +++++++++++++++++++++++- 3 files changed, 43 insertions(+), 8 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 2a1bba67560225..765e5824552f5d 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -1053,7 +1053,7 @@ int mingw_lstat(const char *file_name, struct stat *buf) buf->st_uid = 0; buf->st_nlink = 1; buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes, - reparse_tag); + reparse_tag, file_name); buf->st_size = S_ISLNK(buf->st_mode) ? link_len : fdata.nFileSizeLow | (((off_t) fdata.nFileSizeHigh) << 32); buf->st_dev = buf->st_rdev = 0; /* not used by Git */ @@ -1104,7 +1104,7 @@ static int get_file_info_by_handle(HANDLE hnd, struct stat *buf) buf->st_gid = 0; buf->st_uid = 0; buf->st_nlink = 1; - buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes, 0); + buf->st_mode = file_attr_to_st_mode(fdata.dwFileAttributes, 0, NULL); buf->st_size = fdata.nFileSizeLow | (((off_t)fdata.nFileSizeHigh)<<32); buf->st_dev = buf->st_rdev = 0; /* not used by Git */ @@ -4036,12 +4036,25 @@ int is_inside_windows_container(void) return inside_container; } -int file_attr_to_st_mode (DWORD attr, DWORD tag) +int file_attr_to_st_mode (DWORD attr, DWORD tag, const char *path) { int fMode = S_IREAD; - if ((attr & FILE_ATTRIBUTE_REPARSE_POINT) && tag == IO_REPARSE_TAG_SYMLINK) - fMode |= S_IFLNK; - else if (attr & FILE_ATTRIBUTE_DIRECTORY) + if ((attr & FILE_ATTRIBUTE_REPARSE_POINT) && + tag == IO_REPARSE_TAG_SYMLINK) { + int flag = S_IFLNK; + char buf[MAX_LONG_PATH]; + + /* + * Windows containers' mapped volumes are marked as reparse + * points and look like symbolic links, but they are not. + */ + if (path && is_inside_windows_container() && + readlink(path, buf, sizeof(buf)) > 27 && + starts_with(buf, "/ContainerMappedDirectories/")) + flag = S_IFDIR; + + fMode |= flag; + } else if (attr & FILE_ATTRIBUTE_DIRECTORY) fMode |= S_IFDIR; else fMode |= S_IFREG; diff --git a/compat/win32.h b/compat/win32.h index 52169ae19f4371..299f01bdf0f5a4 100644 --- a/compat/win32.h +++ b/compat/win32.h @@ -6,7 +6,7 @@ #include <windows.h> #endif -extern int file_attr_to_st_mode (DWORD attr, DWORD tag); +extern int file_attr_to_st_mode (DWORD attr, DWORD tag, const char *path); static inline int get_file_attr(const char *fname, WIN32_FILE_ATTRIBUTE_DATA *fdata) { diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index d3819aa7f86466..e2a6fe272f5740 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -202,8 +202,30 @@ static struct fsentry *fseentry_create_entry(struct fscache *cache, fdata->FileAttributes & FILE_ATTRIBUTE_REPARSE_POINT ? fdata->EaSize : 0; + /* + * On certain Windows versions, host directories mapped into + * Windows Containers ("Volumes", see https://docs.docker.com/storage/volumes/) + * look like symbolic links, but their targets are paths that + * are valid only in kernel mode. + * + * Let's work around this by detecting that situation and + * telling Git that these are *not* symbolic links. + */ + if (fse->reparse_tag == IO_REPARSE_TAG_SYMLINK && + sizeof(buf) > (list ? list->len + 1 : 0) + fse->len + 1 && + is_inside_windows_container()) { + size_t off = 0; + if (list) { + memcpy(buf, list->dirent.d_name, list->len); + buf[list->len] = '/'; + off = list->len + 1; + } + memcpy(buf + off, fse->dirent.d_name, fse->len); + buf[off + fse->len] = '\0'; + } + fse->st_mode = file_attr_to_st_mode(fdata->FileAttributes, - fdata->EaSize); + fdata->EaSize, buf); fse->dirent.d_type = S_ISREG(fse->st_mode) ? DT_REG : S_ISDIR(fse->st_mode) ? DT_DIR : DT_LNK; fse->u.s.st_size = S_ISLNK(fse->st_mode) ? MAX_LONG_PATH : From 0becea9b78aca3ae554597a6dee361146db873a3 Mon Sep 17 00:00:00 2001 From: David Lomas <dl3@pale-eds.co.uk> Date: Fri, 28 Jul 2023 15:20:43 +0100 Subject: [PATCH 265/297] mingw: work around rename() failing on a read-only file At least on _some_ APFS network shares, Git fails to rename the object files because they are marked as read-only, because that has the effect of setting the uchg flag on APFS, which then means the file can't be renamed or deleted. To work around that, when a rename failed, and the read-only flag is set, try to turn it off and on again. This fixes https://github.com/git-for-windows/git/issues/4482 Signed-off-by: David Lomas <dl3@pale-eds.co.uk> Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> --- compat/mingw.c | 25 +++++++++++++++++++------ 1 file changed, 19 insertions(+), 6 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 765e5824552f5d..45e99182014299 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -2577,7 +2577,7 @@ int mingw_accept(int sockfd1, struct sockaddr *sa, socklen_t *sz) #undef rename int mingw_rename(const char *pold, const char *pnew) { - DWORD attrs = INVALID_FILE_ATTRIBUTES, gle; + DWORD attrs = INVALID_FILE_ATTRIBUTES, gle, attrsold; int tries = 0; wchar_t wpold[MAX_LONG_PATH], wpnew[MAX_LONG_PATH]; if (xutftowcs_long_path(wpold, pold) < 0 || @@ -2590,11 +2590,24 @@ int mingw_rename(const char *pold, const char *pnew) return 0; gle = GetLastError(); - if (gle == ERROR_ACCESS_DENIED && is_inside_windows_container()) { - /* Fall back to copy to destination & remove source */ - if (CopyFileW(wpold, wpnew, FALSE) && !mingw_unlink(pold)) - return 0; - gle = GetLastError(); + if (gle == ERROR_ACCESS_DENIED) { + if (is_inside_windows_container()) { + /* Fall back to copy to destination & remove source */ + if (CopyFileW(wpold, wpnew, FALSE) && !mingw_unlink(pold)) + return 0; + gle = GetLastError(); + } else if ((attrsold = GetFileAttributesW(wpold)) & FILE_ATTRIBUTE_READONLY) { + /* if file is read-only, change and retry */ + SetFileAttributesW(wpold, attrsold & ~FILE_ATTRIBUTE_READONLY); + if (MoveFileExW(wpold, wpnew, + MOVEFILE_REPLACE_EXISTING | MOVEFILE_COPY_ALLOWED)) { + SetFileAttributesW(wpnew, attrsold); + return 0; + } + gle = GetLastError(); + /* revert attribute change on failure */ + SetFileAttributesW(wpold, attrsold); + } } /* revert file attributes on failure */ From 1cf49e571a3a92dca0abfd64c5391ecb114b5db4 Mon Sep 17 00:00:00 2001 From: Bert Belder <bertbelder@gmail.com> Date: Fri, 26 Oct 2018 23:42:09 +0200 Subject: [PATCH 266/297] Win32: symlink: add test for `symlink` attribute To verify that the symlink is resolved correctly, we use the fact that `git.exe` is a native Win32 program, and that `git.exe config -f <path>` therefore uses the native symlink resolution. Signed-off-by: Bert Belder <bertbelder@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/t2040-checkout-symlink-attr.sh | 46 ++++++++++++++++++++++++++++++++ 1 file changed, 46 insertions(+) create mode 100755 t/t2040-checkout-symlink-attr.sh diff --git a/t/t2040-checkout-symlink-attr.sh b/t/t2040-checkout-symlink-attr.sh new file mode 100755 index 00000000000000..e00c31d096ce88 --- /dev/null +++ b/t/t2040-checkout-symlink-attr.sh @@ -0,0 +1,46 @@ +#!/bin/sh + +test_description='checkout symlinks with `symlink` attribute on Windows + +Ensures that Git for Windows creates symlinks of the right type, +as specified by the `symlink` attribute in `.gitattributes`.' + +# Tell MSYS to create native symlinks. Without this flag test-lib's +# prerequisite detection for SYMLINKS doesn't detect the right thing. +MSYS=winsymlinks:nativestrict && export MSYS + +. ./test-lib.sh + +if ! test_have_prereq MINGW,SYMLINKS +then + skip_all='skipping $0: MinGW-only test, which requires symlink support.' + test_done +fi + +# Adds a symlink to the index without clobbering the work tree. +cache_symlink () { + sha=$(printf '%s' "$1" | git hash-object --stdin -w) && + git update-index --add --cacheinfo 120000,$sha,"$2" +} + +test_expect_success 'checkout symlinks with attr' ' + cache_symlink file1 file-link && + cache_symlink dir dir-link && + + printf "file-link symlink=file\ndir-link symlink=dir\n" >.gitattributes && + git add .gitattributes && + + git checkout . && + + mkdir dir && + echo "[a]b=c" >file1 && + echo "[x]y=z" >dir/file2 && + + # MSYS2 is very forgiving, it will resolve symlinks even if the + # symlink type is incorrect. To make this test meaningful, try + # them with a native, non-MSYS executable, such as `git config`. + test "$(git config -f file-link a.b)" = "c" && + test "$(git config -f dir-link/file2 x.y)" = "z" +' + +test_done From f6a5a9ac0db39f2d462a0995d82116014472db22 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Thu, 11 Oct 2018 23:55:44 +0200 Subject: [PATCH 267/297] gitattributes: mark .png files as binary Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- .gitattributes | 1 + 1 file changed, 1 insertion(+) diff --git a/.gitattributes b/.gitattributes index 158c3d45c4c10c..60d486390b91df 100644 --- a/.gitattributes +++ b/.gitattributes @@ -6,6 +6,7 @@ *.pm text eol=lf diff=perl *.py text eol=lf diff=python *.bat text eol=crlf +*.png binary CODE_OF_CONDUCT.md -whitespace /Documentation/**/*.txt text eol=lf /command-list.txt text eol=lf From ab6ffcfbcb56faed4d0d051397fb383d5bc8b7a3 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Sat, 5 Aug 2017 20:28:37 +0200 Subject: [PATCH 268/297] tests: move test PNGs into t/lib-diff/ We already have a directory where we store files intended for use by multiple test scripts. The same directory is a better home for the test-binary-*.png files than t/. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/{ => lib-diff}/test-binary-1.png | Bin t/{ => lib-diff}/test-binary-2.png | Bin t/t3307-notes-man.sh | 2 +- t/t3903-stash.sh | 2 +- t/t4012-diff-binary.sh | 2 +- t/t4049-diff-stat-count.sh | 2 +- t/t4108-apply-threeway.sh | 12 ++++++------ t/t6403-merge-file.sh | 4 ++-- t/t6407-merge-binary.sh | 2 +- t/t9200-git-cvsexportcommit.sh | 14 +++++++------- 10 files changed, 20 insertions(+), 20 deletions(-) rename t/{ => lib-diff}/test-binary-1.png (100%) rename t/{ => lib-diff}/test-binary-2.png (100%) diff --git a/t/test-binary-1.png b/t/lib-diff/test-binary-1.png similarity index 100% rename from t/test-binary-1.png rename to t/lib-diff/test-binary-1.png diff --git a/t/test-binary-2.png b/t/lib-diff/test-binary-2.png similarity index 100% rename from t/test-binary-2.png rename to t/lib-diff/test-binary-2.png diff --git a/t/t3307-notes-man.sh b/t/t3307-notes-man.sh index ae316502c4531b..2b6894df0a247a 100755 --- a/t/t3307-notes-man.sh +++ b/t/t3307-notes-man.sh @@ -27,7 +27,7 @@ test_expect_success 'example 1: notes to add an Acked-by line' ' ' test_expect_success 'example 2: binary notes' ' - cp "$TEST_DIRECTORY"/test-binary-1.png . && + cp "$TEST_DIRECTORY"/lib-diff/test-binary-1.png . && git checkout B && blob=$(git hash-object -w test-binary-1.png) && git notes --ref=logo add -C "$blob" && diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh index 74666ff3e4b2b8..b18113d303edca 100755 --- a/t/t3903-stash.sh +++ b/t/t3903-stash.sh @@ -1345,7 +1345,7 @@ test_expect_success 'stash -- <subdir> works with binary files' ' git reset && >subdir/untracked && >subdir/tracked && - cp "$TEST_DIRECTORY"/test-binary-1.png subdir/tracked-binary && + cp "$TEST_DIRECTORY"/lib-diff/test-binary-1.png subdir/tracked-binary && git add subdir/tracked* && git stash -- subdir/ && test_path_is_missing subdir/tracked && diff --git a/t/t4012-diff-binary.sh b/t/t4012-diff-binary.sh index c64d9d2f405e1e..b3b9adc3c034fd 100755 --- a/t/t4012-diff-binary.sh +++ b/t/t4012-diff-binary.sh @@ -20,7 +20,7 @@ test_expect_success 'prepare repository' ' echo AIT >a && echo BIT >b && echo CIT >c && echo DIT >d && git update-index --add a b c d && echo git >a && - cat "$TEST_DIRECTORY"/test-binary-1.png >b && + cat "$TEST_DIRECTORY"/lib-diff/test-binary-1.png >b && echo git >c && cat b b >d ' diff --git a/t/t4049-diff-stat-count.sh b/t/t4049-diff-stat-count.sh index 0a4fc735d44ad5..6c028646485971 100755 --- a/t/t4049-diff-stat-count.sh +++ b/t/t4049-diff-stat-count.sh @@ -34,7 +34,7 @@ test_expect_success 'binary changes do not count in lines' ' git reset --hard && echo a >a && echo c >c && - cat "$TEST_DIRECTORY"/test-binary-1.png >d && + cat "$TEST_DIRECTORY"/lib-diff/test-binary-1.png >d && cat >expect <<-\EOF && a | 1 + c | 1 + diff --git a/t/t4108-apply-threeway.sh b/t/t4108-apply-threeway.sh index c558282bc09475..1c85e7051d5f72 100755 --- a/t/t4108-apply-threeway.sh +++ b/t/t4108-apply-threeway.sh @@ -232,11 +232,11 @@ test_expect_success 'apply with --3way --cached and conflicts' ' test_expect_success 'apply binary file patch' ' git reset --hard main && - cp "$TEST_DIRECTORY/test-binary-1.png" bin.png && + cp "$TEST_DIRECTORY/lib-diff/test-binary-1.png" bin.png && git add bin.png && git commit -m "add binary file" && - cp "$TEST_DIRECTORY/test-binary-2.png" bin.png && + cp "$TEST_DIRECTORY/lib-diff/test-binary-2.png" bin.png && git diff --binary >bin.diff && git reset --hard && @@ -247,11 +247,11 @@ test_expect_success 'apply binary file patch' ' test_expect_success 'apply binary file patch with 3way' ' git reset --hard main && - cp "$TEST_DIRECTORY/test-binary-1.png" bin.png && + cp "$TEST_DIRECTORY/lib-diff/test-binary-1.png" bin.png && git add bin.png && git commit -m "add binary file" && - cp "$TEST_DIRECTORY/test-binary-2.png" bin.png && + cp "$TEST_DIRECTORY/lib-diff/test-binary-2.png" bin.png && git diff --binary >bin.diff && git reset --hard && @@ -262,11 +262,11 @@ test_expect_success 'apply binary file patch with 3way' ' test_expect_success 'apply full-index patch with 3way' ' git reset --hard main && - cp "$TEST_DIRECTORY/test-binary-1.png" bin.png && + cp "$TEST_DIRECTORY/lib-diff/test-binary-1.png" bin.png && git add bin.png && git commit -m "add binary file" && - cp "$TEST_DIRECTORY/test-binary-2.png" bin.png && + cp "$TEST_DIRECTORY/lib-diff/test-binary-2.png" bin.png && git diff --full-index >bin.diff && git reset --hard && diff --git a/t/t6403-merge-file.sh b/t/t6403-merge-file.sh index fb872c5a1136fc..6308d3b85f0111 100755 --- a/t/t6403-merge-file.sh +++ b/t/t6403-merge-file.sh @@ -356,12 +356,12 @@ test_expect_success "expected conflict markers" ' test_expect_success 'binary files cannot be merged' ' test_must_fail git merge-file -p \ - orig.txt "$TEST_DIRECTORY"/test-binary-1.png new1.txt 2> merge.err && + orig.txt "$TEST_DIRECTORY"/lib-diff/test-binary-1.png new1.txt 2> merge.err && grep "Cannot merge binary files" merge.err ' test_expect_success 'binary files cannot be merged with --object-id' ' - cp "$TEST_DIRECTORY"/test-binary-1.png . && + cp "$TEST_DIRECTORY"/lib-diff/test-binary-1.png . && git add orig.txt new1.txt test-binary-1.png && test_must_fail git merge-file --object-id \ :orig.txt :test-binary-1.png :new1.txt 2> merge.err && diff --git a/t/t6407-merge-binary.sh b/t/t6407-merge-binary.sh index 0753fc95f45efb..914014e1a9c551 100755 --- a/t/t6407-merge-binary.sh +++ b/t/t6407-merge-binary.sh @@ -10,7 +10,7 @@ TEST_PASSES_SANITIZE_LEAK=true test_expect_success setup ' - cat "$TEST_DIRECTORY"/test-binary-1.png >m && + cat "$TEST_DIRECTORY"/lib-diff/test-binary-1.png >m && git add m && git ls-files -s | sed -e "s/ 0 / 1 /" >E1 && test_tick && diff --git a/t/t9200-git-cvsexportcommit.sh b/t/t9200-git-cvsexportcommit.sh index 3d4842164c8915..3dc8d77433f6ea 100755 --- a/t/t9200-git-cvsexportcommit.sh +++ b/t/t9200-git-cvsexportcommit.sh @@ -55,8 +55,8 @@ test_expect_success 'New file' ' mkdir A B C D E F && echo hello1 >A/newfile1.txt && echo hello2 >B/newfile2.txt && - cp "$TEST_DIRECTORY"/test-binary-1.png C/newfile3.png && - cp "$TEST_DIRECTORY"/test-binary-1.png D/newfile4.png && + cp "$TEST_DIRECTORY"/lib-diff/test-binary-1.png C/newfile3.png && + cp "$TEST_DIRECTORY"/lib-diff/test-binary-1.png D/newfile4.png && git add A/newfile1.txt && git add B/newfile2.txt && git add C/newfile3.png && @@ -81,8 +81,8 @@ test_expect_success 'Remove two files, add two and update two' ' rm -f B/newfile2.txt && rm -f C/newfile3.png && echo Hello5 >E/newfile5.txt && - cp "$TEST_DIRECTORY"/test-binary-2.png D/newfile4.png && - cp "$TEST_DIRECTORY"/test-binary-1.png F/newfile6.png && + cp "$TEST_DIRECTORY"/lib-diff/test-binary-2.png D/newfile4.png && + cp "$TEST_DIRECTORY"/lib-diff/test-binary-1.png F/newfile6.png && git add E/newfile5.txt && git add F/newfile6.png && git commit -a -m "Test: Remove, add and update" && @@ -170,7 +170,7 @@ test_expect_success 'New file with spaces in file name' ' mkdir "G g" && echo ok then >"G g/with spaces.txt" && git add "G g/with spaces.txt" && \ - cp "$TEST_DIRECTORY"/test-binary-1.png "G g/with spaces.png" && \ + cp "$TEST_DIRECTORY"/lib-diff/test-binary-1.png "G g/with spaces.png" && \ git add "G g/with spaces.png" && git commit -a -m "With spaces" && id=$(git rev-list --max-count=1 HEAD) && @@ -182,7 +182,7 @@ test_expect_success 'New file with spaces in file name' ' test_expect_success 'Update file with spaces in file name' ' echo Ok then >>"G g/with spaces.txt" && - cat "$TEST_DIRECTORY"/test-binary-1.png >>"G g/with spaces.png" && \ + cat "$TEST_DIRECTORY"/lib-diff/test-binary-1.png >>"G g/with spaces.png" && \ git add "G g/with spaces.png" && git commit -a -m "Update with spaces" && id=$(git rev-list --max-count=1 HEAD) && @@ -207,7 +207,7 @@ test_expect_success !MINGW 'File with non-ascii file name' ' mkdir -p Å/goo/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/u/v/w/x/y/z/å/ä/ö && echo Foo >Å/goo/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/u/v/w/x/y/z/å/ä/ö/gårdetsågårdet.txt && git add Å/goo/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/u/v/w/x/y/z/å/ä/ö/gårdetsågårdet.txt && - cp "$TEST_DIRECTORY"/test-binary-1.png Å/goo/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/u/v/w/x/y/z/å/ä/ö/gårdetsågårdet.png && + cp "$TEST_DIRECTORY"/lib-diff/test-binary-1.png Å/goo/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/u/v/w/x/y/z/å/ä/ö/gårdetsågårdet.png && git add Å/goo/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/u/v/w/x/y/z/å/ä/ö/gårdetsågårdet.png && git commit -a -m "Går det så går det" && \ id=$(git rev-list --max-count=1 HEAD) && From a07c026baa6c900fcd73536f8902de29c4c8071c Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Tue, 18 Jul 2017 01:15:40 +0200 Subject: [PATCH 269/297] tests: only override sort & find if there are usable ones in /usr/bin/ The idea is to allow running the test suite on MinGit with BusyBox installed in /mingw64/bin/sh.exe. In that case, we will want to exclude sort & find (and other Unix utilities) from being bundled. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- git-sh-setup.sh | 21 ++++++++++++++------- t/test-lib.sh | 21 ++++++++++++++------- 2 files changed, 28 insertions(+), 14 deletions(-) diff --git a/git-sh-setup.sh b/git-sh-setup.sh index ce273fe0e48d99..785a4cbfa9c65e 100644 --- a/git-sh-setup.sh +++ b/git-sh-setup.sh @@ -292,13 +292,20 @@ create_virtual_base() { # Platform specific tweaks to work around some commands case $(uname -s) in *MINGW*) - # Windows has its own (incompatible) sort and find - sort () { - /usr/bin/sort "$@" - } - find () { - /usr/bin/find "$@" - } + if test -x /usr/bin/sort + then + # Windows has its own (incompatible) sort; override + sort () { + /usr/bin/sort "$@" + } + fi + if test -x /usr/bin/find + then + # Windows has its own (incompatible) find; override + find () { + /usr/bin/find "$@" + } + fi # git sees Windows-style pwd pwd () { builtin pwd -W diff --git a/t/test-lib.sh b/t/test-lib.sh index ee91f6c3b23f2e..c501acf7666044 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -1701,13 +1701,20 @@ fi uname_s=$(uname -s) case $uname_s in *MINGW*) - # Windows has its own (incompatible) sort and find - sort () { - /usr/bin/sort "$@" - } - find () { - /usr/bin/find "$@" - } + if test -x /usr/bin/sort + then + # Windows has its own (incompatible) sort; override + sort () { + /usr/bin/sort "$@" + } + fi + if test -x /usr/bin/find + then + # Windows has its own (incompatible) find; override + find () { + /usr/bin/find "$@" + } + fi # git sees Windows-style pwd pwd () { builtin pwd -W From b4cfcbc150aca42120832af0e317192f5ab15a6a Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Mon, 19 Nov 2018 20:34:13 +0100 Subject: [PATCH 270/297] tests: use the correct path separator with BusyBox BusyBox-w32 is a true Win32 application, i.e. it does not come with a POSIX emulation layer. That also means that it does *not* use the Unix convention of separating the entries in the PATH variable using colons, but semicolons. However, there are also BusyBox ports to Windows which use a POSIX emulation layer such as Cygwin's or MSYS2's runtime, i.e. using colons as PATH separators. As a tell-tale, let's use the presence of semicolons in the PATH variable: on Unix, it is highly unlikely that it contains semicolons, and on Windows (without POSIX emulation), it is virtually guaranteed, as everybody should have both $SYSTEMROOT and $SYSTEMROOT/system32 in their PATH. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/interop/interop-lib.sh | 8 ++++++-- t/lib-proto-disable.sh | 2 +- t/t0021-conversion.sh | 2 +- t/t0060-path-utils.sh | 24 ++++++++++++------------ t/t0061-run-command.sh | 6 +++--- t/t0300-credentials.sh | 2 +- t/t1504-ceiling-dirs.sh | 10 +++++----- t/t2300-cd-to-toplevel.sh | 2 +- t/t3418-rebase-continue.sh | 4 ++-- t/t5615-alternate-env.sh | 4 ++-- t/t5802-connect-helper.sh | 2 +- t/t7006-pager.sh | 4 ++-- t/t7606-merge-custom.sh | 2 +- t/t7811-grep-open.sh | 2 +- t/t9003-help-autocorrect.sh | 2 +- t/t9800-git-p4-basic.sh | 2 +- t/test-lib.sh | 17 +++++++++++++---- 17 files changed, 54 insertions(+), 41 deletions(-) diff --git a/t/interop/interop-lib.sh b/t/interop/interop-lib.sh index 62f4481b6e4097..c4f8c8e2d8ff92 100644 --- a/t/interop/interop-lib.sh +++ b/t/interop/interop-lib.sh @@ -4,6 +4,10 @@ . ../../GIT-BUILD-OPTIONS INTEROP_ROOT=$(pwd) BUILD_ROOT=$INTEROP_ROOT/build +case "$PATH" in +*\;*) PATH_SEP=\; ;; +*) PATH_SEP=: ;; +esac build_version () { if test -z "$1" @@ -57,7 +61,7 @@ wrap_git () { write_script "$1" <<-EOF GIT_EXEC_PATH="$2" export GIT_EXEC_PATH - PATH="$2:\$PATH" + PATH="$2$PATH_SEP\$PATH" export GIT_EXEC_PATH exec git "\$@" EOF @@ -71,7 +75,7 @@ generate_wrappers () { echo >&2 fatal: test tried to run generic git: $* exit 1 EOF - PATH=$(pwd)/.bin:$PATH + PATH=$(pwd)/.bin$PATH_SEP$PATH } VERSION_A=${GIT_TEST_VERSION_A:-$VERSION_A} diff --git a/t/lib-proto-disable.sh b/t/lib-proto-disable.sh index 890622be81642b..9db481e1be15b2 100644 --- a/t/lib-proto-disable.sh +++ b/t/lib-proto-disable.sh @@ -214,7 +214,7 @@ setup_ext_wrapper () { cd "$TRASH_DIRECTORY/remote" && eval "$*" EOF - PATH=$TRASH_DIRECTORY:$PATH && + PATH=$TRASH_DIRECTORY$PATH_SEP$PATH && export TRASH_DIRECTORY ' } diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh index 0b4997022bf88a..8ba840cdc138cd 100755 --- a/t/t0021-conversion.sh +++ b/t/t0021-conversion.sh @@ -8,7 +8,7 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME . ./test-lib.sh . "$TEST_DIRECTORY"/lib-terminal.sh -PATH=$PWD:$PATH +PATH=$PWD$PATH_SEP$PATH TEST_ROOT="$(pwd)" write_script <<\EOF "$TEST_ROOT/rot13.sh" diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh index 54123c5ab63a6a..61927b98a21523 100755 --- a/t/t0060-path-utils.sh +++ b/t/t0060-path-utils.sh @@ -148,25 +148,25 @@ ancestor /foo /fo -1 ancestor /foo /foo -1 ancestor /foo /bar -1 ancestor /foo /foo/bar -1 -ancestor /foo /foo:/bar -1 -ancestor /foo /:/foo:/bar 0 -ancestor /foo /foo:/:/bar 0 -ancestor /foo /:/bar:/foo 0 +ancestor /foo "/foo$PATH_SEP/bar" -1 +ancestor /foo "/$PATH_SEP/foo$PATH_SEP/bar" 0 +ancestor /foo "/foo$PATH_SEP/$PATH_SEP/bar" 0 +ancestor /foo "/$PATH_SEP/bar$PATH_SEP/foo" 0 ancestor /foo/bar / 0 ancestor /foo/bar /fo -1 ancestor /foo/bar /foo 4 ancestor /foo/bar /foo/ba -1 -ancestor /foo/bar /:/fo 0 -ancestor /foo/bar /foo:/foo/ba 4 +ancestor /foo/bar "/$PATH_SEP/fo" 0 +ancestor /foo/bar "/foo$PATH_SEP/foo/ba" 4 ancestor /foo/bar /bar -1 ancestor /foo/bar /fo -1 -ancestor /foo/bar /foo:/bar 4 -ancestor /foo/bar /:/foo:/bar 4 -ancestor /foo/bar /foo:/:/bar 4 -ancestor /foo/bar /:/bar:/fo 0 -ancestor /foo/bar /:/bar 0 +ancestor /foo/bar "/foo$PATH_SEP/bar" 4 +ancestor /foo/bar "/$PATH_SEP/foo$PATH_SEP/bar" 4 +ancestor /foo/bar "/foo$PATH_SEP/$PATH_SEP/bar" 4 +ancestor /foo/bar "/$PATH_SEP/bar$PATH_SEP/fo" 0 +ancestor /foo/bar "/$PATH_SEP/bar" 0 ancestor /foo/bar /foo 4 -ancestor /foo/bar /foo:/bar 4 +ancestor /foo/bar "/foo$PATH_SEP/bar" 4 ancestor /foo/bar /bar -1 # Windows-specific: DOS drives, network shares diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh index 20986b693cfbe8..0a9926c50c421f 100755 --- a/t/t0061-run-command.sh +++ b/t/t0061-run-command.sh @@ -70,7 +70,7 @@ test_expect_success 'run_command does not try to execute a directory' ' cat bin2/greet EOF - PATH=$PWD/bin1:$PWD/bin2:$PATH \ + PATH=$PWD/bin1$PATH_SEP$PWD/bin2$PATH_SEP$PATH \ test-tool run-command run-command greet >actual 2>err && test_cmp bin2/greet actual && test_must_be_empty err @@ -87,7 +87,7 @@ test_expect_success POSIXPERM 'run_command passes over non-executable file' ' cat bin2/greet EOF - PATH=$PWD/bin1:$PWD/bin2:$PATH \ + PATH=$PWD/bin1$PATH_SEP$PWD/bin2$PATH_SEP$PATH \ test-tool run-command run-command greet >actual 2>err && test_cmp bin2/greet actual && test_must_be_empty err @@ -107,7 +107,7 @@ test_expect_success POSIXPERM,SANITY 'unreadable directory in PATH' ' git config alias.nitfol "!echo frotz" && chmod a-rx local-command && ( - PATH=./local-command:$PATH && + PATH=./local-command$PATH_SEP$PATH && git nitfol >actual ) && echo frotz >expect && diff --git a/t/t0300-credentials.sh b/t/t0300-credentials.sh index 6a76b7fdbd4557..14b1d2f4937e81 100755 --- a/t/t0300-credentials.sh +++ b/t/t0300-credentials.sh @@ -77,7 +77,7 @@ test_expect_success 'setup helper scripts' ' test -z "$pexpiry" || echo password_expiry_utc=$pexpiry EOF - PATH="$PWD:$PATH" + PATH="$PWD$PATH_SEP$PATH" ' test_expect_success 'credential_fill invokes helper' ' diff --git a/t/t1504-ceiling-dirs.sh b/t/t1504-ceiling-dirs.sh index c1679e31d8afca..b47a23f8a9d6b1 100755 --- a/t/t1504-ceiling-dirs.sh +++ b/t/t1504-ceiling-dirs.sh @@ -85,9 +85,9 @@ then GIT_CEILING_DIRECTORIES="$TRASH_ROOT/top/" test_fail subdir_ceil_at_top_slash - GIT_CEILING_DIRECTORIES=":$TRASH_ROOT/top" + GIT_CEILING_DIRECTORIES="$PATH_SEP$TRASH_ROOT/top" test_prefix subdir_ceil_at_top_no_resolve "sub/dir/" - GIT_CEILING_DIRECTORIES=":$TRASH_ROOT/top/" + GIT_CEILING_DIRECTORIES="$PATH_SEP$TRASH_ROOT/top/" test_prefix subdir_ceil_at_top_slash_no_resolve "sub/dir/" fi @@ -117,13 +117,13 @@ GIT_CEILING_DIRECTORIES="$TRASH_ROOT/subdi" test_prefix subdir_ceil_at_subdi_slash "sub/dir/" -GIT_CEILING_DIRECTORIES="/foo:$TRASH_ROOT/sub" +GIT_CEILING_DIRECTORIES="/foo$PATH_SEP$TRASH_ROOT/sub" test_fail second_of_two -GIT_CEILING_DIRECTORIES="$TRASH_ROOT/sub:/bar" +GIT_CEILING_DIRECTORIES="$TRASH_ROOT/sub$PATH_SEP/bar" test_fail first_of_two -GIT_CEILING_DIRECTORIES="/foo:$TRASH_ROOT/sub:/bar" +GIT_CEILING_DIRECTORIES="/foo$PATH_SEP$TRASH_ROOT/sub$PATH_SEP/bar" test_fail second_of_three diff --git a/t/t2300-cd-to-toplevel.sh b/t/t2300-cd-to-toplevel.sh index b40eeb263fe896..4328b9172a5e6e 100755 --- a/t/t2300-cd-to-toplevel.sh +++ b/t/t2300-cd-to-toplevel.sh @@ -17,7 +17,7 @@ test_cd_to_toplevel () { test_expect_success $3 "$2" ' ( cd '"'$1'"' && - PATH="$EXEC_PATH:$PATH" && + PATH="$EXEC_PATH$PATH_SEP$PATH" && . git-sh-setup && cd_to_toplevel && [ "$(pwd -P)" = "$TOPLEVEL" ] diff --git a/t/t3418-rebase-continue.sh b/t/t3418-rebase-continue.sh index c0d29c2154f0b4..b4d25f03435254 100755 --- a/t/t3418-rebase-continue.sh +++ b/t/t3418-rebase-continue.sh @@ -83,7 +83,7 @@ test_expect_success 'rebase --continue remembers merge strategy and options' ' rm -f actual && ( - PATH=./test-bin:$PATH && + PATH=./test-bin$PATH_SEP$PATH && test_must_fail git rebase -s funny -X"option=arg with space" \ -Xop\"tion\\ -X"new${LF}line " main topic ) && @@ -92,7 +92,7 @@ test_expect_success 'rebase --continue remembers merge strategy and options' ' echo "Resolved" >F2 && git add F2 && ( - PATH=./test-bin:$PATH && + PATH=./test-bin$PATH_SEP$PATH && git rebase --continue ) && test_cmp expect actual diff --git a/t/t5615-alternate-env.sh b/t/t5615-alternate-env.sh index 83513e46a3556b..fa2dd08084d21c 100755 --- a/t/t5615-alternate-env.sh +++ b/t/t5615-alternate-env.sh @@ -40,7 +40,7 @@ test_expect_success 'access alternate via absolute path' ' ' test_expect_success 'access multiple alternates' ' - check_obj "$PWD/one.git/objects:$PWD/two.git/objects" <<-EOF + check_obj "$PWD/one.git/objects$PATH_SEP$PWD/two.git/objects" <<-EOF $one blob $two blob EOF @@ -76,7 +76,7 @@ test_expect_success 'access alternate via relative path (subdir)' ' quoted='"one.git\057objects"' unquoted='two.git/objects' test_expect_success 'mix of quoted and unquoted alternates' ' - check_obj "$quoted:$unquoted" <<-EOF + check_obj "$quoted$PATH_SEP$unquoted" <<-EOF $one blob $two blob EOF diff --git a/t/t5802-connect-helper.sh b/t/t5802-connect-helper.sh index c6c2661878c0ca..a096eeeeb427cf 100755 --- a/t/t5802-connect-helper.sh +++ b/t/t5802-connect-helper.sh @@ -85,7 +85,7 @@ test_expect_success 'set up fake git-daemon' ' "$TRASH_DIRECTORY/remote" EOF export TRASH_DIRECTORY && - PATH=$TRASH_DIRECTORY:$PATH + PATH=$TRASH_DIRECTORY$PATH_SEP$PATH ' test_expect_success 'ext command can connect to git daemon (no vhost)' ' diff --git a/t/t7006-pager.sh b/t/t7006-pager.sh index a0296d6ca4019e..6d9140fd90c368 100755 --- a/t/t7006-pager.sh +++ b/t/t7006-pager.sh @@ -55,7 +55,7 @@ test_expect_success !MINGW,TTY 'LESS and LV envvars set by git-sh-setup' ' sane_unset LESS LV && PAGER="env >pager-env.out; wc" && export PAGER && - PATH="$(git --exec-path):$PATH" && + PATH="$(git --exec-path)$PATH_SEP$PATH" && export PATH && test_terminal sh -c ". git-sh-setup && git_pager" ) && @@ -389,7 +389,7 @@ test_default_pager() { EOF chmod +x \$less && ( - PATH=.:\$PATH && + PATH=.$PATH_SEP\$PATH && export PATH && $full_command ) && diff --git a/t/t7606-merge-custom.sh b/t/t7606-merge-custom.sh index 135cb230856533..c2c5f8a20c596d 100755 --- a/t/t7606-merge-custom.sh +++ b/t/t7606-merge-custom.sh @@ -24,7 +24,7 @@ test_expect_success 'set up custom strategy' ' EOF chmod +x git-merge-theirs && - PATH=.:$PATH && + PATH=.$PATH_SEP$PATH && export PATH ' diff --git a/t/t7811-grep-open.sh b/t/t7811-grep-open.sh index fe38d88a1a6ab1..e07a9eaae4662b 100755 --- a/t/t7811-grep-open.sh +++ b/t/t7811-grep-open.sh @@ -53,7 +53,7 @@ test_expect_success SIMPLEPAGER 'git grep -O' ' EOF echo grep.h >expect.notless && - PATH=.:$PATH git grep -O GREP_PATTERN >out && + PATH=.$PATH_SEP$PATH git grep -O GREP_PATTERN >out && { test_cmp expect.less pager-args || test_cmp expect.notless pager-args diff --git a/t/t9003-help-autocorrect.sh b/t/t9003-help-autocorrect.sh index 14a704d0a8c311..f32a4d0e5eeda7 100755 --- a/t/t9003-help-autocorrect.sh +++ b/t/t9003-help-autocorrect.sh @@ -14,7 +14,7 @@ test_expect_success 'setup' ' echo distimdistim was called EOF - PATH="$PATH:." && + PATH="$PATH$PATH_SEP." && export PATH && git commit --allow-empty -m "a single log entry" && diff --git a/t/t9800-git-p4-basic.sh b/t/t9800-git-p4-basic.sh index 3e6dfce24833a0..cede9f5566f4cc 100755 --- a/t/t9800-git-p4-basic.sh +++ b/t/t9800-git-p4-basic.sh @@ -287,7 +287,7 @@ test_expect_success 'exit when p4 fails to produce marshaled output' ' EOF chmod 755 badp4dir/p4 && ( - PATH="$TRASH_DIRECTORY/badp4dir:$PATH" && + PATH="$TRASH_DIRECTORY/badp4dir$PATH_SEP$PATH" && export PATH && test_expect_code 1 git p4 clone --dest="$git" //depot >errs 2>&1 ) && diff --git a/t/test-lib.sh b/t/test-lib.sh index c501acf7666044..84eba18020032a 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -15,6 +15,15 @@ # You should have received a copy of the GNU General Public License # along with this program. If not, see https://www.gnu.org/licenses/ . +# On Unix/Linux, the path separator is the colon, on other systems it +# may be different, though. On Windows, for example, it is a semicolon. +# If the PATH variable contains semicolons, it is pretty safe to assume +# that the path separator is a semicolon. +case "$PATH" in +*\;*) PATH_SEP=\; ;; +*) PATH_SEP=: ;; +esac + # Test the binaries we have just built. The tests are kept in # t/ subdirectory and are run in 'trash directory' subdirectory. if test -z "$TEST_DIRECTORY" @@ -1462,7 +1471,7 @@ then done done IFS=$OLDIFS - PATH=$GIT_VALGRIND/bin:$PATH + PATH=$GIT_VALGRIND/bin$PATH_SEP$PATH GIT_EXEC_PATH=$GIT_VALGRIND/bin export GIT_VALGRIND GIT_VALGRIND_MODE="$valgrind" @@ -1474,7 +1483,7 @@ elif test -n "$GIT_TEST_INSTALLED" then GIT_EXEC_PATH=$($GIT_TEST_INSTALLED/git --exec-path) || error "Cannot run git from $GIT_TEST_INSTALLED." - PATH=$GIT_TEST_INSTALLED:$GIT_BUILD_DIR/t/helper:$PATH + PATH=$GIT_TEST_INSTALLED$PATH_SEP$GIT_BUILD_DIR/t/helper$PATH_SEP$PATH GIT_EXEC_PATH=${GIT_TEST_EXEC_PATH:-$GIT_EXEC_PATH} else # normal case, use ../bin-wrappers only unless $with_dashes: if test -n "$no_bin_wrappers" @@ -1490,12 +1499,12 @@ else # normal case, use ../bin-wrappers only unless $with_dashes: fi with_dashes=t fi - PATH="$git_bin_dir:$PATH" + PATH="$git_bin_dir$PATH_SEP$PATH" fi GIT_EXEC_PATH=$GIT_BUILD_DIR if test -n "$with_dashes" then - PATH="$GIT_BUILD_DIR:$GIT_BUILD_DIR/t/helper:$PATH" + PATH="$GIT_BUILD_DIR$PATH_SEP$GIT_BUILD_DIR/t/helper$PATH_SEP$PATH" fi fi GIT_TEMPLATE_DIR="$GIT_BUILD_DIR"/templates/blt From 68ff010350c8eebb5a40ff6bdf0114928eb849f6 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Fri, 30 Jun 2017 00:35:40 +0200 Subject: [PATCH 271/297] mingw: only use Bash-ism `builtin pwd -W` when available Traditionally, Git for Windows' SDK uses Bash as its default shell. However, other Unix shells are available, too. Most notably, the Win32 port of BusyBox comes with `ash` whose `pwd` command already prints Windows paths as Git for Windows wants them, while there is not even a `builtin` command. Therefore, let's be careful not to override `pwd` unless we know that the `builtin` command is available. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- git-sh-setup.sh | 14 ++++++++++---- t/test-lib.sh | 14 ++++++++++---- 2 files changed, 20 insertions(+), 8 deletions(-) diff --git a/git-sh-setup.sh b/git-sh-setup.sh index 785a4cbfa9c65e..b53539cd0e556a 100644 --- a/git-sh-setup.sh +++ b/git-sh-setup.sh @@ -306,10 +306,16 @@ case $(uname -s) in /usr/bin/find "$@" } fi - # git sees Windows-style pwd - pwd () { - builtin pwd -W - } + # On Windows, Git wants Windows paths. But /usr/bin/pwd spits out + # Unix-style paths. At least in Bash, we have a builtin pwd that + # understands the -W option to force "mixed" paths, i.e. with drive + # prefix but still with forward slashes. Let's use that, if available. + if type builtin >/dev/null 2>&1 + then + pwd () { + builtin pwd -W + } + fi is_absolute_path () { case "$1" in [/\\]* | [A-Za-z]:*) diff --git a/t/test-lib.sh b/t/test-lib.sh index 84eba18020032a..68280ae0e4cbfe 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -1724,10 +1724,16 @@ case $uname_s in /usr/bin/find "$@" } fi - # git sees Windows-style pwd - pwd () { - builtin pwd -W - } + # On Windows, Git wants Windows paths. But /usr/bin/pwd spits out + # Unix-style paths. At least in Bash, we have a builtin pwd that + # understands the -W option to force "mixed" paths, i.e. with drive + # prefix but still with forward slashes. Let's use that, if available. + if type builtin >/dev/null 2>&1 + then + pwd () { + builtin pwd -W + } + fi # no POSIX permissions # backslashes in pathspec are converted to '/' # exec does not inherit the PID From cfa1803a7e56443412bbc6628236068f093b031b Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Fri, 30 Jun 2017 22:32:33 +0200 Subject: [PATCH 272/297] tests (mingw): remove Bash-specific pwd option The -W option is only understood by MSYS2 Bash's pwd command. We already make sure to override `pwd` by `builtin pwd -W` for MINGW, so let's not double the effort here. This will also help when switching the shell to another one (such as BusyBox' ash) whose pwd does *not* understand the -W option. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/t9902-completion.sh | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh index 932d5ad7591698..f225313ab3963e 100755 --- a/t/t9902-completion.sh +++ b/t/t9902-completion.sh @@ -139,12 +139,7 @@ invalid_variable_name='${foo.bar}' actual="$TRASH_DIRECTORY/actual" -if test_have_prereq MINGW -then - ROOT="$(pwd -W)" -else - ROOT="$(pwd)" -fi +ROOT="$(pwd)" test_expect_success 'setup for __git_find_repo_path/__gitdir tests' ' mkdir -p subdir/subsubdir && From d48d5f2a0088bc76cc2eb6b3ad5d7b3833f76b39 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Wed, 19 Jul 2017 17:07:56 +0200 Subject: [PATCH 273/297] test-lib: add BUSYBOX prerequisite When running with BusyBox, we will want to avoid calling executables on the PATH that are implemented in BusyBox itself. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/test-lib.sh | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/t/test-lib.sh b/t/test-lib.sh index 68280ae0e4cbfe..8604e1486a1691 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -1915,6 +1915,10 @@ test_lazy_prereq UNZIP ' test $? -ne 127 ' +test_lazy_prereq BUSYBOX ' + case "$($SHELL --help 2>&1)" in *BusyBox*) true;; *) false;; esac +' + run_with_limited_cmdline () { (ulimit -s 128 && "$@") } From da65a4fc19199884a3171fcbe7ef22c13d0440af Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Sat, 5 Aug 2017 21:36:01 +0200 Subject: [PATCH 274/297] t5003: use binary file from t/lib-diff/ At some stage, t5003-archive-zip wants to add a file that is not ASCII. To that end, it uses /bin/sh. But that file may actually not exist (it is too easy to forget that not all the world is Unix/Linux...)! Besides, we already have perfectly fine binary files intended for use solely by the tests. So let's use one of them instead. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/t5003-archive-zip.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh index 961c6aac256135..2c3d5a13ad027f 100755 --- a/t/t5003-archive-zip.sh +++ b/t/t5003-archive-zip.sh @@ -88,7 +88,7 @@ test_expect_success \ 'mkdir a && echo simple textfile >a/a && mkdir a/bin && - cp /bin/sh a/bin && + cp "$TEST_DIRECTORY/lib-diff/test-binary-1.png" a/bin && printf "text\r" >a/text.cr && printf "text\r\n" >a/text.crlf && printf "text\n" >a/text.lf && From b67f7f0c23ab8b4a93b165532e28c277ad5e2ffd Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Fri, 21 Jul 2017 12:48:33 +0200 Subject: [PATCH 275/297] t5532: workaround for BusyBox on Windows While it may seem super convenient to some old Unix hands to simpy require Perl to be available when running the test suite, this is a major hassle on Windows, where we want to verify that Perl is not, actually, required in a NO_PERL build. As a super ugly workaround, we "install" a script into /usr/bin/perl reading like this: #!/bin/sh # We'd much rather avoid requiring Perl altogether when testing # an installed Git. Oh well, that's why we cannot have nice # things. exec c:/git-sdk-64/usr/bin/perl.exe "$@" The problem with that is that BusyBox assumes that the #! line in a script refers to an executable, not to a script. So when it encounters the line #!/usr/bin/perl in t5532's proxy-get-cmd, it barfs. Let's help this situation by simply executing the Perl script with the "interpreter" specified explicitly. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/t5532-fetch-proxy.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/t/t5532-fetch-proxy.sh b/t/t5532-fetch-proxy.sh index d664912799b43a..55bc57c138ddc2 100755 --- a/t/t5532-fetch-proxy.sh +++ b/t/t5532-fetch-proxy.sh @@ -27,7 +27,7 @@ test_expect_success 'setup proxy script' ' write_script proxy <<-\EOF echo >&2 "proxying for $*" - cmd=$(./proxy-get-cmd) + cmd=$("$PERL_PATH" ./proxy-get-cmd) echo >&2 "Running $cmd" exec $cmd EOF From 103d7b3a1b7d19057c2e8d984ed49f050067fb5d Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Fri, 21 Jul 2017 13:24:55 +0200 Subject: [PATCH 276/297] t5605: special-case hardlink test for BusyBox-w32 When t5605 tries to verify that files are hardlinked (or that they are not), it uses the `-links` option of the `find` utility. BusyBox' implementation does not support that option, and BusyBox-w32's lstat() does not even report the number of hard links correctly (for performance reasons). So let's just switch to a different method that actually works on Windows. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/t5605-clone-local.sh | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/t/t5605-clone-local.sh b/t/t5605-clone-local.sh index d9a320abd21657..ed9bfb5e44a42e 100755 --- a/t/t5605-clone-local.sh +++ b/t/t5605-clone-local.sh @@ -12,6 +12,21 @@ repo_is_hardlinked() { test_line_count = 0 output } +if test_have_prereq MINGW,BUSYBOX +then + # BusyBox' `find` does not support `-links`. Besides, BusyBox-w32's + # lstat() does not report hard links, just like Git's mingw_lstat() + # (from where BusyBox-w32 got its initial implementation). + repo_is_hardlinked() { + for f in $(find "$1/objects" -type f) + do + "$SYSTEMROOT"/system32/fsutil.exe \ + hardlink list $f >links && + test_line_count -gt 1 links || return 1 + done + } +fi + test_expect_success 'preparing origin repository' ' : >file && git add . && git commit -m1 && git clone --bare . a.git && From 14c9d2753b9b109523ecde1892d42cb17d6afbcf Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Wed, 5 Jul 2017 15:14:50 +0200 Subject: [PATCH 277/297] t5813: allow for $PWD to be a Windows path Git for Windows uses MSYS2's Bash to run the test suite, which comes with benefits but also at a heavy price: on the plus side, MSYS2's POSIX emulation layer allows us to continue pretending that we are on a Unix system, e.g. use Unix paths instead of Windows ones, yet this is bought at a rather noticeable performance penalty. There *are* some more native ports of Unix shells out there, though, most notably BusyBox-w32's ash. These native ports do not use any POSIX emulation layer (or at most a *very* thin one, choosing to avoid features such as fork() that are expensive to emulate on Windows), and they use native Windows paths (usually with forward slashes instead of backslashes, which is perfectly legal in almost all use cases). And here comes the problem: with a $PWD looking like, say, C:/git-sdk-64/usr/src/git/t/trash directory.t5813-proto-disable-ssh Git's test scripts get quite a bit confused, as their assumptions have been shattered. Not only does this path contain a colon (oh no!), it also does not start with a slash. This is a problem e.g. when constructing a URL as t5813 does it: ssh://remote$PWD. Not only is it impossible to separate the "host" from the path with a $PWD as above, even prefixing $PWD by a slash won't work, as /C:/git-sdk-64/... is not a valid path. As a workaround, detect when $PWD does not start with a slash on Windows, and simply strip the drive prefix, using an obscure feature of Windows paths: if an absolute Windows path starts with a slash, it is implicitly prefixed by the drive prefix of the current directory. As we are talking about the current directory here, anyway, that strategy works. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/t5813-proto-disable-ssh.sh | 19 +++++++++++++++++-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/t/t5813-proto-disable-ssh.sh b/t/t5813-proto-disable-ssh.sh index 2e975dc70ec2c8..9e84f83f2dd100 100755 --- a/t/t5813-proto-disable-ssh.sh +++ b/t/t5813-proto-disable-ssh.sh @@ -16,8 +16,23 @@ test_expect_success 'setup repository to clone' ' ' test_proto "host:path" ssh "remote:repo.git" -test_proto "ssh://" ssh "ssh://remote$PWD/remote/repo.git" -test_proto "git+ssh://" ssh "git+ssh://remote$PWD/remote/repo.git" + +hostdir="$PWD" +if test_have_prereq MINGW && test "/${PWD#/}" != "$PWD" +then + case "$PWD" in + [A-Za-z]:/*) + hostdir="${PWD#?:}" + ;; + *) + skip_all="Unhandled PWD '$PWD'; skipping rest" + test_done + ;; + esac +fi + +test_proto "ssh://" ssh "ssh://remote$hostdir/remote/repo.git" +test_proto "git+ssh://" ssh "git+ssh://remote$hostdir/remote/repo.git" # Don't even bother setting up a "-remote" directory, as ssh would generally # complain about the bogus option rather than completing our request. Our From 3c78f25191c451f6e41720db2ed83fc2fcfafa2e Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Fri, 7 Jul 2017 10:15:36 +0200 Subject: [PATCH 278/297] t9200: skip tests when $PWD contains a colon On Windows, the current working directory is pretty much guaranteed to contain a colon. If we feed that path to CVS, it mistakes it for a separator between host and port, though. This has not been a problem so far because Git for Windows uses MSYS2's Bash using a POSIX emulation layer that also pretends that the current directory is a Unix path (at least as long as we're in a shell script). However, that is rather limiting, as Git for Windows also explores other ports of other Unix shells. One of those is BusyBox-w32's ash, which is a native port (i.e. *not* using any POSIX emulation layer, and certainly not emulating Unix paths). So let's just detect if there is a colon in $PWD and punt in that case. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- t/t9200-git-cvsexportcommit.sh | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/t/t9200-git-cvsexportcommit.sh b/t/t9200-git-cvsexportcommit.sh index 3dc8d77433f6ea..65aea65b55517f 100755 --- a/t/t9200-git-cvsexportcommit.sh +++ b/t/t9200-git-cvsexportcommit.sh @@ -12,6 +12,13 @@ if ! test_have_prereq PERL; then test_done fi +case "$PWD" in +*:*) + skip_all='cvs would get confused by the colon in `pwd`; skipping tests' + test_done + ;; +esac + cvs >/dev/null 2>&1 if test $? -ne 1 then From 66f7a2d05dc77dafa3a83e3732edfb57b7d6d756 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Thu, 20 Jul 2017 00:23:26 +0200 Subject: [PATCH 279/297] mingw: add a Makefile target to copy test artifacts The Makefile target `install-mingit-test-artifacts` simply copies stuff and things directly into a MinGit directory, including an init.bat script to set everything up so that the tests can be run in a cmd window. Sadly, Git's test suite still relies on a Perl interpreter even if compiled with NO_PERL=YesPlease. We punt for now, installing a small script into /usr/bin/perl that hands off to an existing Perl of a Git for Windows SDK. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- config.mak.uname | 56 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 56 insertions(+) diff --git a/config.mak.uname b/config.mak.uname index 7515d87270e4f2..0090afc2012030 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -753,6 +753,62 @@ ifeq ($(uname_S),MINGW) ETC_GITCONFIG = ../etc/gitconfig ETC_GITATTRIBUTES = ../etc/gitattributes endif +ifeq (i686,$(uname_M)) + MINGW_PREFIX := mingw32 +endif +ifeq (x86_64,$(uname_M)) + MINGW_PREFIX := mingw64 +endif + + DESTDIR_WINDOWS = $(shell cygpath -aw '$(DESTDIR_SQ)') + DESTDIR_MIXED = $(shell cygpath -am '$(DESTDIR_SQ)') +install-mingit-test-artifacts: + install -m755 -d '$(DESTDIR_SQ)/usr/bin' + printf '%s\n%s\n' >'$(DESTDIR_SQ)/usr/bin/perl' \ + "#!/mingw64/bin/busybox sh" \ + "exec \"$(shell cygpath -am /usr/bin/perl.exe)\" \"\$$@\"" + + install -m755 -d '$(DESTDIR_SQ)' + printf '%s%s\n%s\n%s\n%s\n%s\n' >'$(DESTDIR_SQ)/init.bat' \ + "PATH=$(DESTDIR_WINDOWS)\\$(MINGW_PREFIX)\\bin;" \ + "C:\\WINDOWS;C:\\WINDOWS\\system32" \ + "@set GIT_TEST_INSTALLED=$(DESTDIR_MIXED)/$(MINGW_PREFIX)/bin" \ + "@`echo "$(DESTDIR_WINDOWS)" | sed 's/:.*/:/'`" \ + "@cd `echo "$(DESTDIR_WINDOWS)" | sed 's/^.://'`\\test-git\\t" \ + "@echo Now, run 'helper\\test-run-command testsuite'" + + install -m755 -d '$(DESTDIR_SQ)/test-git' + sed 's/^\(NO_PERL\|NO_PYTHON\)=.*/\1=YesPlease/' \ + <GIT-BUILD-OPTIONS >'$(DESTDIR_SQ)/test-git/GIT-BUILD-OPTIONS' + + install -m755 -d '$(DESTDIR_SQ)/test-git/t/helper' + install -m755 $(TEST_PROGRAMS) '$(DESTDIR_SQ)/test-git/t/helper' + (cd t && $(TAR) cf - t[0-9][0-9][0-9][0-9] lib-diff) | \ + (cd '$(DESTDIR_SQ)/test-git/t' && $(TAR) xf -) + install -m755 t/t556x_common t/*.sh '$(DESTDIR_SQ)/test-git/t' + + install -m755 -d '$(DESTDIR_SQ)/test-git/templates' + (cd templates && $(TAR) cf - blt) | \ + (cd '$(DESTDIR_SQ)/test-git/templates' && $(TAR) xf -) + + # po/build/locale for t0200 + install -m755 -d '$(DESTDIR_SQ)/test-git/po/build/locale' + (cd po/build/locale && $(TAR) cf - .) | \ + (cd '$(DESTDIR_SQ)/test-git/po/build/locale' && $(TAR) xf -) + + # git-daemon.exe for t5802, git-http-backend.exe for t5560 + install -m755 -d '$(DESTDIR_SQ)/$(MINGW_PREFIX)/bin' + install -m755 git-daemon.exe git-http-backend.exe \ + '$(DESTDIR_SQ)/$(MINGW_PREFIX)/bin' + + # git-upload-archive (dashed) for t5000 + install -m755 -d '$(DESTDIR_SQ)/$(MINGW_PREFIX)/bin' + install -m755 git-upload-archive.exe '$(DESTDIR_SQ)/$(MINGW_PREFIX)/bin' + + # git-difftool--helper for t7800 + install -m755 -d '$(DESTDIR_SQ)/$(MINGW_PREFIX)/libexec/git-core' + install -m755 git-difftool--helper \ + '$(DESTDIR_SQ)/$(MINGW_PREFIX)/libexec/git-core' endif ifeq ($(uname_S),QNX) COMPAT_CFLAGS += -DSA_RESTART=0 From 19430b7b33d4f3255cf2392d7e61293a4a2e66f1 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Wed, 17 May 2017 17:05:09 +0200 Subject: [PATCH 280/297] mingw: kill child processes in a gentler way The TerminateProcess() function does not actually leave the child processes any chance to perform any cleanup operations. This is bad insofar as Git itself expects its signal handlers to run. A symptom is e.g. a left-behind .lock file that would not be left behind if the same operation was run, say, on Linux. To remedy this situation, we use an obscure trick: we inject a thread into the process that needs to be killed and to let that thread run the ExitProcess() function with the desired exit status. Thanks J Wyman for describing this trick. The advantage is that the ExitProcess() function lets the atexit handlers run. While this is still different from what Git expects (i.e. running a signal handler), in practice Git sets up signal handlers and atexit handlers that call the same code to clean up after itself. In case that the gentle method to terminate the process failed, we still fall back to calling TerminateProcess(), but in that case we now also make sure that processes spawned by the spawned process are terminated; TerminateProcess() does not give the spawned process a chance to do so itself. Please note that this change only affects how Git for Windows tries to terminate processes spawned by Git's own executables. Third-party software that *calls* Git and wants to terminate it *still* need to make sure to imitate this gentle method, otherwise this patch will not have any effect. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- compat/mingw.c | 29 +++++-- compat/win32/exit-process.h | 165 ++++++++++++++++++++++++++++++++++++ 2 files changed, 186 insertions(+), 8 deletions(-) create mode 100644 compat/win32/exit-process.h diff --git a/compat/mingw.c b/compat/mingw.c index 4626f98e522058..2a57def36b1af6 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -10,6 +10,7 @@ #include "../run-command.h" #include "../abspath.h" #include "../alloc.h" +#include "win32/exit-process.h" #include "win32/lazyload.h" #include "../config.h" #include "../environment.h" @@ -2213,16 +2214,28 @@ int mingw_execvp(const char *cmd, char *const *argv) int mingw_kill(pid_t pid, int sig) { if (pid > 0 && sig == SIGTERM) { - HANDLE h = OpenProcess(PROCESS_TERMINATE, FALSE, pid); - - if (TerminateProcess(h, -1)) { + HANDLE h = OpenProcess(PROCESS_CREATE_THREAD | + PROCESS_QUERY_INFORMATION | + PROCESS_VM_OPERATION | PROCESS_VM_WRITE | + PROCESS_VM_READ | PROCESS_TERMINATE, + FALSE, pid); + int ret; + + if (h) + ret = exit_process(h, 128 + sig); + else { + h = OpenProcess(PROCESS_TERMINATE, FALSE, pid); + if (!h) { + errno = err_win_to_posix(GetLastError()); + return -1; + } + ret = terminate_process_tree(h, 128 + sig); + } + if (ret) { + errno = err_win_to_posix(GetLastError()); CloseHandle(h); - return 0; } - - errno = err_win_to_posix(GetLastError()); - CloseHandle(h); - return -1; + return ret; } else if (pid > 0 && sig == 0) { HANDLE h = OpenProcess(PROCESS_QUERY_INFORMATION, FALSE, pid); if (h) { diff --git a/compat/win32/exit-process.h b/compat/win32/exit-process.h new file mode 100644 index 00000000000000..d53989884cfb0c --- /dev/null +++ b/compat/win32/exit-process.h @@ -0,0 +1,165 @@ +#ifndef EXIT_PROCESS_H +#define EXIT_PROCESS_H + +/* + * This file contains functions to terminate a Win32 process, as gently as + * possible. + * + * At first, we will attempt to inject a thread that calls ExitProcess(). If + * that fails, we will fall back to terminating the entire process tree. + * + * For simplicity, these functions are marked as file-local. + */ + +#include <tlhelp32.h> + +/* + * Terminates the process corresponding to the process ID and all of its + * directly and indirectly spawned subprocesses. + * + * This way of terminating the processes is not gentle: the processes get + * no chance of cleaning up after themselves (closing file handles, removing + * .lock files, terminating spawned processes (if any), etc). + */ +static int terminate_process_tree(HANDLE main_process, int exit_status) +{ + HANDLE snapshot = CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0); + PROCESSENTRY32 entry; + DWORD pids[16384]; + int max_len = sizeof(pids) / sizeof(*pids), i, len, ret = 0; + pid_t pid = GetProcessId(main_process); + + pids[0] = (DWORD)pid; + len = 1; + + /* + * Even if Process32First()/Process32Next() seem to traverse the + * processes in topological order (i.e. parent processes before + * child processes), there is nothing in the Win32 API documentation + * suggesting that this is guaranteed. + * + * Therefore, run through them at least twice and stop when no more + * process IDs were added to the list. + */ + for (;;) { + int orig_len = len; + + memset(&entry, 0, sizeof(entry)); + entry.dwSize = sizeof(entry); + + if (!Process32First(snapshot, &entry)) + break; + + do { + for (i = len - 1; i >= 0; i--) { + if (pids[i] == entry.th32ProcessID) + break; + if (pids[i] == entry.th32ParentProcessID) + pids[len++] = entry.th32ProcessID; + } + } while (len < max_len && Process32Next(snapshot, &entry)); + + if (orig_len == len || len >= max_len) + break; + } + + for (i = len - 1; i > 0; i--) { + HANDLE process = OpenProcess(PROCESS_TERMINATE, FALSE, pids[i]); + + if (process) { + if (!TerminateProcess(process, exit_status)) + ret = -1; + CloseHandle(process); + } + } + if (!TerminateProcess(main_process, exit_status)) + ret = -1; + CloseHandle(main_process); + + return ret; +} + +/** + * Determine whether a process runs in the same architecture as the current + * one. That test is required before we assume that GetProcAddress() returns + * a valid address *for the target process*. + */ +static inline int process_architecture_matches_current(HANDLE process) +{ + static BOOL current_is_wow = -1; + BOOL is_wow; + + if (current_is_wow == -1 && + !IsWow64Process (GetCurrentProcess(), ¤t_is_wow)) + current_is_wow = -2; + if (current_is_wow == -2) + return 0; /* could not determine current process' WoW-ness */ + if (!IsWow64Process (process, &is_wow)) + return 0; /* cannot determine */ + return is_wow == current_is_wow; +} + +/** + * Inject a thread into the given process that runs ExitProcess(). + * + * Note: as kernel32.dll is loaded before any process, the other process and + * this process will have ExitProcess() at the same address. + * + * This function expects the process handle to have the access rights for + * CreateRemoteThread(): PROCESS_CREATE_THREAD, PROCESS_QUERY_INFORMATION, + * PROCESS_VM_OPERATION, PROCESS_VM_WRITE, and PROCESS_VM_READ. + * + * The idea comes from the Dr Dobb's article "A Safer Alternative to + * TerminateProcess()" by Andrew Tucker (July 1, 1999), + * http://www.drdobbs.com/a-safer-alternative-to-terminateprocess/184416547 + * + * If this method fails, we fall back to running terminate_process_tree(). + */ +static int exit_process(HANDLE process, int exit_code) +{ + DWORD code; + + if (GetExitCodeProcess(process, &code) && code == STILL_ACTIVE) { + static int initialized; + static LPTHREAD_START_ROUTINE exit_process_address; + PVOID arg = (PVOID)(intptr_t)exit_code; + DWORD thread_id; + HANDLE thread = NULL; + + if (!initialized) { + HINSTANCE kernel32 = GetModuleHandleA("kernel32"); + if (!kernel32) + die("BUG: cannot find kernel32"); + exit_process_address = + (LPTHREAD_START_ROUTINE)(void (*)(void)) + GetProcAddress(kernel32, "ExitProcess"); + initialized = 1; + } + if (!exit_process_address || + !process_architecture_matches_current(process)) + return terminate_process_tree(process, exit_code); + + thread = CreateRemoteThread(process, NULL, 0, + exit_process_address, + arg, 0, &thread_id); + if (thread) { + CloseHandle(thread); + /* + * If the process survives for 10 seconds (a completely + * arbitrary value picked from thin air), fall back to + * killing the process tree via TerminateProcess(). + */ + if (WaitForSingleObject(process, 10000) == + WAIT_OBJECT_0) { + CloseHandle(process); + return 0; + } + } + + return terminate_process_tree(process, exit_code); + } + + return 0; +} + +#endif From e315485520c83dc0cdb97dcdff308ef7d6b40a15 Mon Sep 17 00:00:00 2001 From: xungeng li <xungeng@gmail.com> Date: Wed, 7 Jun 2023 20:26:33 +0800 Subject: [PATCH 281/297] mingw: optionally enable wsl compability file mode bits The Windows Subsystem for Linux (WSL) version 2 allows to use `chmod` on NTFS volumes provided that they are mounted with metadata enabled (see https://devblogs.microsoft.com/commandline/chmod-chown-wsl-improvements/ for details), for example: $ chmod 0755 /mnt/d/test/a.sh In order to facilitate better collaboration between the Windows version of Git and the WSL version of Git, we can make the Windows version of Git also support reading and writing NTFS file modes in a manner compatible with WSL. Since this slightly slows down operations where lots of files are created (such as an initial checkout), this feature is only enabled when `core.WSLCompat` is set to true. Note that you also have to set `core.fileMode=true` in repositories that have been initialized without enabling WSL compatibility. There are several ways to enable metadata loading for NTFS volumes in WSL, one of which is to modify `/etc/wsl.conf` by adding: ``` [automount] enabled = true options = "metadata,umask=027,fmask=117" ``` And reboot WSL. It can also be enabled temporarily by this incantation: $ sudo umount /mnt/c && sudo mount -t drvfs C: /mnt/c -o metadata,uid=1000,gid=1000,umask=22,fmask=111 It's important to note that this modification is compatible with, but does not depend on WSL. The helper functions in this commit can operate independently and functions normally on devices where WSL is not installed or properly configured. Signed-off-by: xungeng li <xungeng@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- Documentation/config/core.txt | 6 ++ compat/mingw.c | 13 +++ compat/win32/fscache.c | 16 ++++ compat/win32/wsl.c | 142 ++++++++++++++++++++++++++++ compat/win32/wsl.h | 12 +++ config.mak.uname | 4 +- contrib/buildsystems/CMakeLists.txt | 1 + 7 files changed, 192 insertions(+), 2 deletions(-) create mode 100644 compat/win32/wsl.c create mode 100644 compat/win32/wsl.h diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt index 55c8939452a7a4..458a3f4f7f5d80 100644 --- a/Documentation/config/core.txt +++ b/Documentation/config/core.txt @@ -771,3 +771,9 @@ core.maxTreeDepth:: to allow Git to abort cleanly, and should not generally need to be adjusted. When Git is compiled with MSVC, the default is 512. Otherwise, the default is 2048. + +core.WSLCompat:: + Tells Git whether to enable wsl compatibility mode. + The default value is false. When set to true, Git will set the mode + bits of the file in the way of wsl, so that the executable flag of + files can be set or read correctly. diff --git a/compat/mingw.c b/compat/mingw.c index e61981cdaafa10..6e29111f73486a 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -25,6 +25,7 @@ #include "win32/fscache.h" #include "../attr.h" #include "../string-list.h" +#include "win32/wsl.h" #define HCAST(type, handle) ((type)(intptr_t)handle) @@ -811,6 +812,11 @@ int mingw_open (const char *filename, int oflags, ...) fd = open_fn(wfilename, oflags, mode); + if ((oflags & O_CREAT) && fd >= 0 && are_wsl_compatible_mode_bits_enabled()) { + _mode_t wsl_mode = S_IFREG | (mode&0777); + set_wsl_mode_bits_by_handle((HANDLE)_get_osfhandle(fd), wsl_mode); + } + if (fd < 0 && (oflags & O_ACCMODE) != O_RDONLY && errno == EACCES) { DWORD attrs = GetFileAttributesW(wfilename); if (attrs != INVALID_FILE_ATTRIBUTES && (attrs & FILE_ATTRIBUTE_DIRECTORY)) @@ -1110,6 +1116,11 @@ int mingw_lstat(const char *file_name, struct stat *buf) filetime_to_timespec(&(fdata.ftLastAccessTime), &(buf->st_atim)); filetime_to_timespec(&(fdata.ftLastWriteTime), &(buf->st_mtim)); filetime_to_timespec(&(fdata.ftCreationTime), &(buf->st_ctim)); + if (S_ISREG(buf->st_mode) && + are_wsl_compatible_mode_bits_enabled()) { + copy_wsl_mode_bits_from_disk(wfilename, -1, + &buf->st_mode); + } return 0; } @@ -1161,6 +1172,8 @@ static int get_file_info_by_handle(HANDLE hnd, struct stat *buf) filetime_to_timespec(&(fdata.ftLastAccessTime), &(buf->st_atim)); filetime_to_timespec(&(fdata.ftLastWriteTime), &(buf->st_mtim)); filetime_to_timespec(&(fdata.ftCreationTime), &(buf->st_ctim)); + if (are_wsl_compatible_mode_bits_enabled()) + get_wsl_mode_bits_by_handle(hnd, &buf->st_mode); return 0; } diff --git a/compat/win32/fscache.c b/compat/win32/fscache.c index d3819aa7f86466..a4f85ac9b77be2 100644 --- a/compat/win32/fscache.c +++ b/compat/win32/fscache.c @@ -8,6 +8,7 @@ #include "config.h" #include "../../mem-pool.h" #include "ntifs.h" +#include "wsl.h" static volatile long initialized; static DWORD dwTlsIndex; @@ -215,6 +216,21 @@ static struct fsentry *fseentry_create_entry(struct fscache *cache, &(fse->u.s.st_mtim)); filetime_to_timespec((FILETIME *)&(fdata->CreationTime), &(fse->u.s.st_ctim)); + if (fdata->EaSize > 0 && + sizeof(buf) >= (list ? list->len+1 : 0) + fse->len+1 && + are_wsl_compatible_mode_bits_enabled()) { + size_t off = 0; + wchar_t wpath[MAX_LONG_PATH]; + if (list && list->len) { + memcpy(buf, list->dirent.d_name, list->len); + buf[list->len] = '/'; + off = list->len + 1; + } + memcpy(buf + off, fse->dirent.d_name, fse->len); + buf[off + fse->len] = '\0'; + if (xutftowcs_long_path(wpath, buf) >= 0) + copy_wsl_mode_bits_from_disk(wpath, -1, &fse->st_mode); + } return fse; } diff --git a/compat/win32/wsl.c b/compat/win32/wsl.c new file mode 100644 index 00000000000000..c6e9f3bfeacb84 --- /dev/null +++ b/compat/win32/wsl.c @@ -0,0 +1,142 @@ +#define USE_THE_REPOSITORY_VARIABLE +#include "../../git-compat-util.h" +#include "../win32.h" +#include "../../repository.h" +#include "config.h" +#include "ntifs.h" +#include "wsl.h" + +int are_wsl_compatible_mode_bits_enabled(void) +{ + /* default to `false` during initialization */ + static const int fallback = 0; + static int enabled = -1; + + if (enabled < 0) { + /* avoid infinite recursion */ + if (!the_repository) + return fallback; + + if (the_repository->config && + the_repository->config->hash_initialized && + git_config_get_bool("core.wslcompat", &enabled) < 0) + enabled = 0; + } + + return enabled < 0 ? fallback : enabled; +} + +int copy_wsl_mode_bits_from_disk(const wchar_t *wpath, ssize_t wpathlen, + _mode_t *mode) +{ + int ret = -1; + HANDLE h; + if (wpathlen >= 0) { + /* + * It's caller's duty to make sure wpathlen is reasonable so + * it does not overflow. + */ + wchar_t *fn2 = (wchar_t*)alloca((wpathlen + 1) * sizeof(wchar_t)); + memcpy(fn2, wpath, wpathlen * sizeof(wchar_t)); + fn2[wpathlen] = 0; + wpath = fn2; + } + h = CreateFileW(wpath, FILE_READ_EA | SYNCHRONIZE, + FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, + NULL, OPEN_EXISTING, + FILE_FLAG_BACKUP_SEMANTICS | + FILE_FLAG_OPEN_REPARSE_POINT, + NULL); + if (h != INVALID_HANDLE_VALUE) { + ret = get_wsl_mode_bits_by_handle(h, mode); + CloseHandle(h); + } + return ret; +} + +#ifndef LX_FILE_METADATA_HAS_UID +#define LX_FILE_METADATA_HAS_UID 0x1 +#define LX_FILE_METADATA_HAS_GID 0x2 +#define LX_FILE_METADATA_HAS_MODE 0x4 +#define LX_FILE_METADATA_HAS_DEVICE_ID 0x8 +#define LX_FILE_CASE_SENSITIVE_DIR 0x10 +typedef struct _FILE_STAT_LX_INFORMATION { + LARGE_INTEGER FileId; + LARGE_INTEGER CreationTime; + LARGE_INTEGER LastAccessTime; + LARGE_INTEGER LastWriteTime; + LARGE_INTEGER ChangeTime; + LARGE_INTEGER AllocationSize; + LARGE_INTEGER EndOfFile; + uint32_t FileAttributes; + uint32_t ReparseTag; + uint32_t NumberOfLinks; + ACCESS_MASK EffectiveAccess; + uint32_t LxFlags; + uint32_t LxUid; + uint32_t LxGid; + uint32_t LxMode; + uint32_t LxDeviceIdMajor; + uint32_t LxDeviceIdMinor; +} FILE_STAT_LX_INFORMATION, *PFILE_STAT_LX_INFORMATION; +#endif + +/* + * This struct is extended from the original FILE_FULL_EA_INFORMATION of + * Microsoft Windows. + */ +struct wsl_full_ea_info_t { + uint32_t NextEntryOffset; + uint8_t Flags; + uint8_t EaNameLength; + uint16_t EaValueLength; + char EaName[7]; + char EaValue[4]; + char Padding[1]; +}; + +enum { + FileStatLxInformation = 70, +}; +__declspec(dllimport) NTSTATUS WINAPI + NtQueryInformationFile(HANDLE FileHandle, + PIO_STATUS_BLOCK IoStatusBlock, + PVOID FileInformation, ULONG Length, + uint32_t FileInformationClass); +__declspec(dllimport) NTSTATUS WINAPI + NtSetInformationFile(HANDLE FileHandle, PIO_STATUS_BLOCK IoStatusBlock, + PVOID FileInformation, ULONG Length, + uint32_t FileInformationClass); +__declspec(dllimport) NTSTATUS WINAPI + NtSetEaFile(HANDLE FileHandle, PIO_STATUS_BLOCK IoStatusBlock, + PVOID EaBuffer, ULONG EaBufferSize); + +int set_wsl_mode_bits_by_handle(HANDLE h, _mode_t mode) +{ + uint32_t value = mode; + struct wsl_full_ea_info_t ea_info; + IO_STATUS_BLOCK iob; + /* mode should be valid to make WSL happy */ + assert(S_ISREG(mode) || S_ISDIR(mode)); + ea_info.NextEntryOffset = 0; + ea_info.Flags = 0; + ea_info.EaNameLength = 6; + ea_info.EaValueLength = sizeof(value); /* 4 */ + strlcpy(ea_info.EaName, "$LXMOD", sizeof(ea_info.EaName)); + memcpy(ea_info.EaValue, &value, sizeof(value)); + ea_info.Padding[0] = 0; + return NtSetEaFile(h, &iob, &ea_info, sizeof(ea_info)); +} + +int get_wsl_mode_bits_by_handle(HANDLE h, _mode_t *mode) +{ + FILE_STAT_LX_INFORMATION fxi; + IO_STATUS_BLOCK iob; + if (NtQueryInformationFile(h, &iob, &fxi, sizeof(fxi), + FileStatLxInformation) == 0) { + if (fxi.LxFlags & LX_FILE_METADATA_HAS_MODE) + *mode = (_mode_t)fxi.LxMode; + return 0; + } + return -1; +} diff --git a/compat/win32/wsl.h b/compat/win32/wsl.h new file mode 100644 index 00000000000000..1f5ad7e67a4fc2 --- /dev/null +++ b/compat/win32/wsl.h @@ -0,0 +1,12 @@ +#ifndef COMPAT_WIN32_WSL_H +#define COMPAT_WIN32_WSL_H + +int are_wsl_compatible_mode_bits_enabled(void); + +int copy_wsl_mode_bits_from_disk(const wchar_t *wpath, ssize_t wpathlen, + _mode_t *mode); + +int get_wsl_mode_bits_by_handle(HANDLE h, _mode_t *mode); +int set_wsl_mode_bits_by_handle(HANDLE h, _mode_t mode); + +#endif diff --git a/config.mak.uname b/config.mak.uname index 0090afc2012030..2a491021a78a5d 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -501,7 +501,7 @@ endif compat/win32/path-utils.o \ compat/win32/pthread.o compat/win32/syslog.o \ compat/win32/trace2_win32_process_info.o \ - compat/win32/dirent.o compat/win32/fscache.o + compat/win32/dirent.o compat/win32/fscache.o compat/win32/wsl.o COMPAT_CFLAGS = -D__USE_MINGW_ACCESS -DDETECT_MSYS_TTY -DENSURE_MSYSTEM_IS_SET -DNOGDI -DHAVE_STRING_H -Icompat -Icompat/regex -Icompat/win32 -DSTRIP_EXTENSION=\".exe\" BASIC_LDFLAGS = -IGNORE:4217 -IGNORE:4049 -NOLOGO # invalidcontinue.obj allows Git's source code to close the same file @@ -703,7 +703,7 @@ ifeq ($(uname_S),MINGW) compat/win32/flush.o \ compat/win32/path-utils.o \ compat/win32/pthread.o compat/win32/syslog.o \ - compat/win32/dirent.o compat/win32/fscache.o + compat/win32/dirent.o compat/win32/fscache.o compat/win32/wsl.o BASIC_CFLAGS += -DWIN32 EXTLIBS += -lws2_32 GITLIBS += git.res diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt index 9540631852a028..90553639179e04 100644 --- a/contrib/buildsystems/CMakeLists.txt +++ b/contrib/buildsystems/CMakeLists.txt @@ -307,6 +307,7 @@ if(CMAKE_SYSTEM_NAME STREQUAL "Windows") compat/win32/syslog.c compat/win32/trace2_win32_process_info.c compat/win32/dirent.c + compat/win32/wsl.c compat/nedmalloc/nedmalloc.c compat/strdup.c compat/win32/fscache.c) From 034eb03940e1909ec1181872830c579de7f8ba28 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Mon, 23 Apr 2018 00:24:29 +0200 Subject: [PATCH 282/297] mingw: really handle SIGINT Previously, we did not install any handler for Ctrl+C, but now we really want to because the MSYS2 runtime learned the trick to call the ConsoleCtrlHandler when Ctrl+C was pressed. With this, hitting Ctrl+C while `git log` is running will only terminate the Git process, but not the pager. This finally matches the behavior on Linux and on macOS. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- compat/mingw.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/compat/mingw.c b/compat/mingw.c index 2a57def36b1af6..4d473043b0ffa3 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -3880,7 +3880,14 @@ static void adjust_symlink_flags(void) symlink_file_flags |= 2; symlink_directory_flags |= 2; } +} +static BOOL WINAPI handle_ctrl_c(DWORD ctrl_type) +{ + if (ctrl_type != CTRL_C_EVENT) + return FALSE; /* we did not handle this */ + mingw_raise(SIGINT); + return TRUE; /* we did handle this */ } #ifdef _MSC_VER @@ -3916,6 +3923,8 @@ int wmain(int argc, const wchar_t **wargv) #endif #endif + SetConsoleCtrlHandler(handle_ctrl_c, TRUE); + maybe_redirect_std_handles(); adjust_symlink_flags(); fsync_object_files = 1; From cfce05167286932d986e2265544841efb8b74907 Mon Sep 17 00:00:00 2001 From: "Neeraj K. Singh" <neerajsi@microsoft.com> Date: Wed, 27 Oct 2021 14:22:42 -0700 Subject: [PATCH 283/297] mingw: do not call xutftowcs_path in mingw_mktemp The `xutftowcs_path` function canonicalizes absolute paths using GetFullPathNameW. This canonicalization may change the length of the string (e.g. getting rid of \.\), which breaks callers that pass the template string in a strbuf and expect the length of the string to remain the same. In my particular case, the tmp-objdir code is passing a strbuf to mkdtemp and is breaking since the strbuf.len is no longer synchronized with strlen(strbuf.buf). Signed-off-by: Neeraj K. Singh <neerajsi@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- compat/mingw.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/compat/mingw.c b/compat/mingw.c index 4626f98e522058..c077bfb9a48043 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -1276,8 +1276,11 @@ char *mingw_mktemp(char *template) int offset = 0; /* we need to return the path, thus no long paths here! */ - if (xutftowcs_path(wtemplate, template) < 0) + if (xutftowcsn(wtemplate, template, MAX_PATH, -1) < 0) { + if (errno == ERANGE) + errno = ENAMETOOLONG; return NULL; + } if (is_dir_sep(template[0]) && !is_dir_sep(template[1]) && iswalpha(wtemplate[0]) && wtemplate[1] == L':') { From d9da48c54e795aba4c353d0f70ca49d77a05559c Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Thu, 25 Nov 2021 11:26:41 +0100 Subject: [PATCH 284/297] Partially un-revert "editor: save and reset terminal after calling EDITOR" In e3f7e01b50be (Revert "editor: save and reset terminal after calling EDITOR", 2021-11-22), we reverted the commit wholesale where the terminal state would be saved and restored before/after calling an editor. The reverted commit was intended to fix a problem with Windows Terminal where simply calling `vi` would cause problems afterwards. To fix the problem addressed by the revert, but _still_ keep the problem with Windows Terminal fixed, let's revert the revert, with a twist: we restrict the save/restore _specifically_ to the case where `vi` (or `vim`) is called, and do not do the same for any other editor. This should still catch the majority of the cases, and will bridge the time until the original patch is re-done in a way that addresses all concerns. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- editor.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/editor.c b/editor.c index d1ba2d7c34fbf4..e46b5af9ab8236 100644 --- a/editor.c +++ b/editor.c @@ -11,6 +11,7 @@ #include "strvec.h" #include "run-command.h" #include "sigchain.h" +#include "compat/terminal.h" #ifndef DEFAULT_EDITOR #define DEFAULT_EDITOR "vi" @@ -62,6 +63,7 @@ static int launch_specified_editor(const char *editor, const char *path, return error("Terminal is dumb, but EDITOR unset"); if (strcmp(editor, ":")) { + int save_and_restore_term = !strcmp(editor, "vi") || !strcmp(editor, "vim"); struct strbuf realpath = STRBUF_INIT; struct child_process p = CHILD_PROCESS_INIT; int ret, sig; @@ -90,7 +92,11 @@ static int launch_specified_editor(const char *editor, const char *path, strvec_pushv(&p.env, (const char **)env); p.use_shell = 1; p.trace2_child_class = "editor"; + if (save_and_restore_term) + save_and_restore_term = !save_term(1); if (start_command(&p) < 0) { + if (save_and_restore_term) + restore_term(); strbuf_release(&realpath); return error("unable to start editor '%s'", editor); } @@ -98,6 +104,8 @@ static int launch_specified_editor(const char *editor, const char *path, sigchain_push(SIGINT, SIG_IGN); sigchain_push(SIGQUIT, SIG_IGN); ret = finish_command(&p); + if (save_and_restore_term) + restore_term(); strbuf_release(&realpath); sig = ret - 128; sigchain_pop(SIGINT); From 69a6f8ed0c503dbd2d4beba1fd0a2b4c6aeab390 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Tue, 10 Dec 2019 21:41:57 +0100 Subject: [PATCH 285/297] reset: reinstate support for the deprecated --stdin option The `--stdin` option was a well-established paradigm in other commands, therefore we implemented it in `git reset` for use by Visual Studio. Unfortunately, upstream Git decided that it is time to introduce `--pathspec-from-file` instead. To keep backwards-compatibility for some grace period, we therefore reinstate the `--stdin` option on top of the `--pathspec-from-file` option, but mark it firmly as deprecated. Helped-by: Victoria Dye <vdye@github.com> Helped-by: Matthew John Cheetham <mjcheetham@outlook.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- Documentation/git-reset.txt | 11 +++++++++++ builtin/reset.c | 16 ++++++++++++++++ t/t7108-reset-stdin.sh | 32 ++++++++++++++++++++++++++++++++ 3 files changed, 59 insertions(+) create mode 100755 t/t7108-reset-stdin.sh diff --git a/Documentation/git-reset.txt b/Documentation/git-reset.txt index 79ad5643eedb82..94fed757630f51 100644 --- a/Documentation/git-reset.txt +++ b/Documentation/git-reset.txt @@ -12,6 +12,7 @@ SYNOPSIS 'git reset' [-q] [--pathspec-from-file=<file> [--pathspec-file-nul]] [<tree-ish>] 'git reset' (--patch | -p) [<tree-ish>] [--] [<pathspec>...] 'git reset' [--soft | --mixed [-N] | --hard | --merge | --keep] [-q] [<commit>] +DEPRECATED: 'git reset' [-q] [--stdin [-z]] [<tree-ish>] DESCRIPTION ----------- @@ -133,6 +134,16 @@ OPTIONS + For more details, see the 'pathspec' entry in linkgit:gitglossary[7]. +--stdin:: + DEPRECATED (use `--pathspec-from-file=-` instead): Instead of taking + list of paths from the command line, read list of paths from the + standard input. Paths are separated by LF (i.e. one path per line) by + default. + +-z:: + DEPRECATED (use `--pathspec-file-nul` instead): Only meaningful with + `--stdin`; paths are separated with NUL character instead of LF. + EXAMPLES -------- diff --git a/builtin/reset.c b/builtin/reset.c index 5f941fb3a29cce..9cba244fb045d0 100644 --- a/builtin/reset.c +++ b/builtin/reset.c @@ -35,6 +35,8 @@ #include "trace2.h" #include "dir.h" #include "add-interactive.h" +#include "strbuf.h" +#include "quote.h" #define REFRESH_INDEX_DELAY_WARNING_IN_MS (2 * 1000) @@ -43,6 +45,7 @@ static const char * const git_reset_usage[] = { N_("git reset [-q] [<tree-ish>] [--] <pathspec>..."), N_("git reset [-q] [--pathspec-from-file [--pathspec-file-nul]] [<tree-ish>]"), N_("git reset --patch [<tree-ish>] [--] [<pathspec>...]"), + N_("DEPRECATED: git reset [-q] [--stdin [-z]] [<tree-ish>]"), NULL }; @@ -340,6 +343,7 @@ int cmd_reset(int argc, const char **argv, const char *prefix) struct object_id oid; struct pathspec pathspec; int intent_to_add = 0; + int nul_term_line = 0, read_from_stdin = 0; const struct option options[] = { OPT__QUIET(&quiet, N_("be quiet, only report errors")), OPT_BOOL(0, "no-refresh", &no_refresh, @@ -368,6 +372,10 @@ int cmd_reset(int argc, const char **argv, const char *prefix) N_("record only the fact that removed paths will be added later")), OPT_PATHSPEC_FROM_FILE(&pathspec_from_file), OPT_PATHSPEC_FILE_NUL(&pathspec_file_nul), + OPT_BOOL('z', NULL, &nul_term_line, + N_("DEPRECATED (use --pathspec-file-nul instead): paths are separated with NUL character")), + OPT_BOOL(0, "stdin", &read_from_stdin, + N_("DEPRECATED (use --pathspec-from-file=- instead): read paths from <stdin>")), OPT_END() }; @@ -377,6 +385,14 @@ int cmd_reset(int argc, const char **argv, const char *prefix) PARSE_OPT_KEEP_DASHDASH); parse_args(&pathspec, argv, prefix, patch_mode, &rev); + if (read_from_stdin) { + warning(_("--stdin is deprecated, please use --pathspec-from-file=- instead")); + free(pathspec_from_file); + pathspec_from_file = xstrdup("-"); + if (nul_term_line) + pathspec_file_nul = 1; + } + if (pathspec_from_file) { if (patch_mode) die(_("options '%s' and '%s' cannot be used together"), "--pathspec-from-file", "--patch"); diff --git a/t/t7108-reset-stdin.sh b/t/t7108-reset-stdin.sh new file mode 100755 index 00000000000000..b7cbcbf869296c --- /dev/null +++ b/t/t7108-reset-stdin.sh @@ -0,0 +1,32 @@ +#!/bin/sh + +test_description='reset --stdin' + +. ./test-lib.sh + +test_expect_success 'reset --stdin' ' + test_commit hello && + git rm hello.t && + test -z "$(git ls-files hello.t)" && + echo hello.t | git reset --stdin && + test hello.t = "$(git ls-files hello.t)" +' + +test_expect_success 'reset --stdin -z' ' + test_commit world && + git rm hello.t world.t && + test -z "$(git ls-files hello.t world.t)" && + printf world.tQworld.tQhello.tQ | q_to_nul | git reset --stdin -z && + printf "hello.t\nworld.t\n" >expect && + git ls-files >actual && + test_cmp expect actual +' + +test_expect_success '--stdin requires --mixed' ' + echo hello.t >list && + test_must_fail git reset --soft --stdin <list && + test_must_fail git reset --hard --stdin <list && + git reset --mixed --stdin <list +' + +test_done From 2d20861d0bfc9312b502448501fcc3ea6085ee17 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Mon, 13 Feb 2023 13:31:35 +0100 Subject: [PATCH 286/297] Describe Git for Windows' architecture [no ci] The Git for Windows project has grown quite complex over the years, certainly much more complex than during the first years where the `msysgit.git` repository was abusing Git for package management purposes and the `git/git` fork was called `4msysgit.git`. Let's describe the status quo in a thorough way. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- ARCHITECTURE.md | 130 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 130 insertions(+) create mode 100644 ARCHITECTURE.md diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md new file mode 100644 index 00000000000000..9225b99a28be13 --- /dev/null +++ b/ARCHITECTURE.md @@ -0,0 +1,130 @@ +# Architecture of Git for Windows + +Git for Windows is a complex project. + +## What _is_ Git for Windows? + +### A fork of `git/git` + +First and foremost, it is a friendly fork of [`git/git`](https://github.com/git/git), aiming to improve Git's Windows support. The [`git-for-windows/git`](https://github.com/git-for-windows/git) repository contains dozens of topics on top of `git/git`, some awaiting to be "upstreamed" (i.e. to be contributed to `git/git`), some still being stabilized, and a few topics are specific to the Git for Windows project and are not intended to be integrated into `git/git` at all. + +### Enhancing and maintaining Git's support for Windows + +On the source code side, Git's Windows support is made a bit more tricky than strictly necessary by the fact that Git does not have any platform abstraction layer (unlike other version control systems, such as Subversion). It relies on the presence of POSIX features such as the `hstrerror()` function, and on platforms lacking that functionality, Git provides shims. That leads to some challenges e.g. with the `stat()` function which is very slow on Windows because it has to collect much more metadata than what e.g. the very quick `GetFileAttributesExW()` Win32 API function provides, even when Git calls `stat()` merely to test for the presence of a file (for which all that gathered metadata is totally irrelevant). + +### Providing more than just source code + +In contrast to the Git project, Git for Windows not only publishes tagged source code versions, but full builds of Git. In fact, Git for Windows' primary purpose, as far as most users are concerned, is to provide a convenient installer that end-users can run to have Git on their computer, without ever having to check out `git-for-windows/git` let alone build it. In essence, Git for Windows has to maintain a separate project altogether in addition to the fork of `git/git`, just to build these release artifacts: [`git-for-windows/build-extra`](https://github.com/git-for-windows/build-extra). This repository also contains the definition for a couple of other release artifacts published by Git for Windows, e.g. the "portable" edition of Git for Windows which is a self-extracting 7-Zip archive that does not need to be installed. + +### A software distribution, really + +Another aspect that contributes to the complexity of Git for Windows is that it is not just building `git.exe` and distributes that. Due to its heritage within the Linux project, Git takes certain things for granted, such as the presence of a Unix shell, or for that matter, a package management system from which dependencies can be fetched and updated independently of Git itself. Things that are distinctly not present in most Windows setups. To accommodate for that, Git for Windows originally relied on the MSys project, a minimal fork of Cygwin providing a Unix shell ("Bash"), a Perl interpreter and similar Unix-like tools, and on the MINGW project, a project to build libraries and executables using a GNU C Compiler that relies only on Win32 API functions. As of Git for Windows v2.x, the project has switched away from [MSys](https://sourceforge.net/projects/mingw/files/MSYS/)/[MinGW](https://osdn.net/projects/mingw/) (due to less-than-active maintenance) to [the MSYS2 project](https://msys2.org). That switch brought along the benefit of a robust package management system based on [Pacman](https://archlinux.org/pacman/) (hailing from Arch Linux). To support Windows users, who are in general unfamiliar with Linux-like package management and the need to update installed packages frequently, Git for Windows bundles a subset of its own fork of MSYS2. To put things in perspective: Git for Windows bundles files from ~170 packages, one of which contains Git, and another one contains Git's help files. In that respect, Git for Windows acts like a distribution more than like a mere single software application. + +Most of MSYS2's packages that are bundled in Git for Windows are consumed directly from MSYS2. Others need forks that are maintained by Git for Windows project, to support Git for Windows better. These forks live in the [`git-for-windows/MSYS2-packages`](https://github.com/git-for-windows/MSYS2-packages) and [`git-for-windows/MINGW-packages`](https://github.com/git-for-windows/MINGW-packages) repositories. There are several reasons justifying these forks. For example, the Git for Windows' flavor of the MSYS2 runtime behaves like Git's test suite expects it while MSYS2's flavor does not. Another example: The Bash executable bundled in Git for Windows is code-signed with the same certificate as `git.exe` to help anti-malware programs get out of the users' way. That is why Git for Windows maintains its own `bash` Pacman package. And since MSYS2 dropped 32-bit support already, Git for Windows has to update the 32-bit Pacman packages itself, which is done in the git-for-windows/MSYS2-packages repository. (Side note: the 32-bit issue is a bit more complicated, actually: MSYS2 _still_ builds _MINGW_ packages targeting i686 processors, but no longer any _MSYS_ packages for said processor architecture, and Git for Windows does not keep all of the 32-bit MSYS packages up to date but instead judiciously decides which packages are vital enough as far as Git is concerned to justify the maintenance cost.) + +### Supporting third-party applications that use Git's functionality + +Since the infrastructure required by Git is non-trivial the installer (or for that matter, the Portable Git) is not exactly light-weight: As of January 2023, both artifacts are over fifty megabytes. This is a problem for third-party applications wishing to bundle a version of Git for Windows, which is often advisable given that applications may depend on features that have been introduced only in recent Git versions and therefore relying on an installed Git for Windows could break things. To help with that, the Git for Windows project also provides MinGit as a release artifact, a zip file that is much smaller than the full installer and that contains only the parts of Git for Windows relevant for third-party applications. It lacks Git GUI, for example, as well as the terminal program MinTTY, or for that matter, the documentation. + +### Supporting `git/git`'s GitHub workflows + +The Git for Windows project is also responsible for keeping the Windows part of `git/git`'s automated builds up and running. On Windows, there is no canonical and easy way to get a build environment necessary to build Git and run its test suite, therefore this is a non-trivial task that comes with its own maintenance cost. Git for Windows provides two GitHub Actions to help with that: [`git-for-windows/setup-git-for-windows-sdk`](https://github.com/git-for-windows/setup-git-for-windows-sdk) to set up a tiny subset of Git for Windows' full SDK (which would require about 500MB to be cloned, as opposed to the ~75MB of that subset) and [`git-for-windows/get-azure-pipelines-artifact`](https://github.com/git-for-windows/get-azure-pipelines-artifact) e.g. to download some regularly pre-built artifacts (for example, when `git/git`'s automated tests ran on an Ubuntu version that did not provide an up to date [Coccinelle](https://coccinelle.gitlabpages.inria.fr/website/) package, this GitHub Action was used to download a pre-built version of that Debian package). + +## Maintaining Git for Windows' components + +Git for Windows uses a combination of [a GitHub App called GitForWindowsHelper](https://github.com/git-for-windows/gfw-helper-github-app) (to listen for so-called [slash commands](https://github.com/git-for-windows/gfw-helper-github-app#slash-commands)) combined with workflows in [the `git-for-windows-automation` repository](https://github.com/git-for-windows/git-for-windows-automation/) (for computationally heavy tasks) to support Git for Windows' repetitive tasks. + +This heavy automation serves two purposes: + +1. Document the knowledge about "how things are done" in the Git for Windows project. +2. Make Git for Windows' maintenance less tedious by off-loading as many tasks onto machines as possible. + +One neat trick of some `git-for-windows-automation` workflows is that they "mirror back" check runs to the targeted PRs in another repository. This essentially allows versioning the source code independently of the workflow definition. + +Here is a diagram showing how the bits and pieces fit together. + +```mermaid +graph LR + A[`monitor-components`] --> |opens| B + B{issues labeled<br />`component-update`} --> |/open pr| C + C((GitForWindowsHelper)) --> |triggers| D + D[`open-pr`] --> |opens| E + E{PR in</br>MINGW-packages<br />MSYS2-packages<br />build-extra} --> |closes| B + E --> |/deploy| F + F((GitForWindowsHelper)) --> |triggers| G + G[`build-and-deploy`] --> |deploys to| H + H{Pacman repository} + C --> |backed by| I + F --> |backed by| I + I[[Azure Function]] + D --> |running in| J + G --> | running in| J + J[[git-for-windows-automation]] + K[[git-sdk-32<br />git-sdk-64<br />git-sdk-arm64]] --> |syncing from| H + B --> |/add release note| L + L[`add-release-note`] +``` + +For the curious mind, here are [detailed instructions how the Azure Function backing the GitForWindowsHelper GitHub App was set up](https://github.com/git-for-windows/gfw-helper-github-app#how-this-github-app-was-set-up). + +### The `monitor-components` workflow + +When new versions of components that Git for Windows builds become available, new Pacman packages have to be built. To this end, [the `monitor-components` workflow](https://github.com/git-for-windows/git/blob/main/.github/workflows/monitor-components.yml) monitors a couple of RSS feeds and opens new tickets labeled `component-update` for such new versions. + +### Opening Pull Requests to update Git for Windows' components + +After determining that such a ticket indeed indicates the need for a new Pacman package build, a Git for Windows maintainer issues the `/open pr` command via an issue comment ([example](https://github.com/git-for-windows/git/issues/4281#issuecomment-1426859787)), which gets picked up by the GitForWindowsHelper GitHub App, which in turn triggers [the `open-pr` workflow](https://github.com/git-for-windows/git-for-windows-automation/blob/main/.github/workflows/open-pr.yml) in the `git-for-windows-automation` repository. + +### Deploying the Pacman packages + +This will open a Pull Request in one of Git for Windows' repositories, and once the PR build passes, a Git for Windows maintainer issues the `/deploy` command ([example](https://github.com/git-for-windows/MINGW-packages/pull/69#issuecomment-1427591890)), which gets picked up by the GitForWindowsHelper GitHub App, which triggers [the `build-and-deploy` workflow](https://github.com/git-for-windows/git-for-windows-automation/blob/main/.github/workflows/build-and-deploy.yml). + +### Adding release notes + +Finally, once the packages have been built and deployed to the Pacman repository (which is hosted in Azure Blob Storage), a Git for Windows maintainer will merge the PR(s), which in turn will close the ticket, and the maintainer then issues an `/add release note` command ([example](https://github.com/git-for-windows/MINGW-packages/pull/69#issuecomment-1427782230)), which again gets picked up by the GitForWindowsHelper GitHub App that triggers [the `add-release-note` workflow](https://github.com/git-for-windows/build-extra/blob/main/.github/workflows/add-release-note.yml) that creates and pushes a new commit to the `ReleaseNotes.md` file in `build-extra` ([example](https://github.com/git-for-windows/build-extra/commit/b39c148ff8dc0e987afdb677d17c46a8e99fd0ef)). + +## Releasing official Git for Windows versions + +A relatively infrequent part of Git for Windows' maintainers' duties, if the most rewarding part, is the task of releasing new versions of Git for Windows. + +Most commonly, this is done in response to the "upstream" Git project releasing a new version. When that happens, a Git for Windows maintainer runs [the helper script](https://github.com/git-for-windows/build-extra/blob/main/shears.sh) to perform a "merging rebase" (i.e. a rebase that starts with a fake-merge of the previous tip commit, to maintain both a clean set of commits as well as a [fast-forwarding](https://git-scm.com/docs/git-merge#Documentation/git-merge.txt---ff-only) commit history). + +Once that is done, the maintainer will open a Pull Request to benefit from the automated builds and tests ([example](https://github.com/git-for-windows/git/pull/4160)) as well as from reviews of the [`range-diff`](https://git-scm.com/docs/git-range-diff) relative to the current `main` branch. + +Once everything looks good, the maintainer will issue the `/git-artifacts` command ([example](https://github.com/git-for-windows/git/pull/4160#issuecomment-1346801735)). This will trigger an automated workflow that builds all of the release artifacts: installers, Portable Git, MinGit, `.tar.xz` archive and a NuGet package. Apart from the NuGet package, two sets of artifacts are built: targeting 32-bit ("x86") and 64-bit ("amd64"). + +Once these artifacts are built, the maintainer will download the installer and run [the "pre-flight checklist"](https://github.com/git-for-windows/build-extra/blob/main/installer/checklist.txt). + +If everything looks good, a `/release` command will be issued, which triggers yet another workflow that will download the just-built-and-verified release artifacts, publish them as a new GitHub release, publish the NuGet packages, deploy the Pacman packages to the Pacman repository, send out an announcement mail, and update the respective repositories including [Git for Windows' website](https://gitforwindows.org/). + +As mentioned [before](#architecture-of-git-for-windows), the `/git-artifacts` and `/release` commands are picked up by the GitForWindowsHelper GitHub App which subsequently triggers the respective workflows in the `git-for-windows-automation` repository. Here is a diagram: + +```mermaid +graph LR + A{Pull Request<br />updating to<br />new Git version} --> |/git-artifacts| B + B((GitForWindowsHelper)) --> |triggers| C + C[`tag-git`] --> |upon successful build<br />triggers| D + D((GitForWindowsHelper)) --> |triggers| E + E[`git-artifacts`] + E --> |maintainer verifies artifacts| E + A --> |upon verified `git-artifacts`<br />/release| F + F[`release-git`] + C --> |running in| J + E --> | running in| J + F --> | running in| J + J[[git-for-windows-automation]] +``` + +## Managing Windows/ARM64 builds + +The GitForWindowsHelper comes in real handy for Git for Windows' Pacman packages for the `aarch64` architecture, i.e. for Windows/ARM64. These packages cannot be built in regular hosted GitHub Actions runners because there are none of that architecture. To help with that, the respective workflows in `git-for-windows-automation` use the label `runs-on: ["Windows", "ARM64"]` to indicate that they need a self-hosted Windows/ARM64 runner. + +It would not be cost-effective to have a VM running permanently, hosting such a self-hosted runner: Git for Windows does not build such packages often enough (usually once or twice per week is more the norm). + +Therefore, VMs providing self-hosted GitHub Actions runners are spun up and torn down as needed. This job is done by the GitForWindowsHelper: + +- When a job is queued asking for above-mentioned labels, [the `create-self-hosted-runner` workflow](https://github.com/git-for-windows/git-for-windows-automation/blob/09ec165f44a0a3d84d8f0e26a4939667b4522635/.github/workflows/create-azure-self-hosted-runners.yml) is started. This deploys an Azure Resource Management template that creates an ephemeral self-hosted runner (i.e. a runner that will pick up one job and then is immediately unregistered). + +- When a job with above-mentioned labels has finished, the GitForWindowsHelper triggers [the `delete-self-hosted-runner` workflow](https://github.com/git-for-windows/git-for-windows-automation/blob/09ec165f44a0a3d84d8f0e26a4939667b4522635/.github/workflows/delete-self-hosted-runner.yml) that tears down the now no longer used VM. + +The GitForWindowsHelper GitHub App will also detect when a job is queued for a PR from a forked repository. This is considered unauthorized use, and the job will be canceled immediately by the GitHub App instead of spinning up a self-hosted runner for it. \ No newline at end of file From 79d88388d16060a67a835057231975d0d6516e97 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Fri, 11 Oct 2019 13:22:24 +0200 Subject: [PATCH 287/297] Modify the Code of Conduct for Git for Windows The Git project followed Git for Windows' lead and added their Code of Conduct, based on the Contributor Covenant v1.4, later updated to v2.0. We adapt it slightly to Git for Windows. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- CODE_OF_CONDUCT.md | 58 +++++++++++++++++++++------------------------- 1 file changed, 26 insertions(+), 32 deletions(-) diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md index e58917c50a96dc..4daef7e3ce9196 100644 --- a/CODE_OF_CONDUCT.md +++ b/CODE_OF_CONDUCT.md @@ -1,9 +1,9 @@ -# Git Code of Conduct +# Git for Windows Code of Conduct This code of conduct outlines our expectations for participants within -the Git community, as well as steps for reporting unacceptable behavior. -We are committed to providing a welcoming and inspiring community for -all and expect our code of conduct to be honored. Anyone who violates +the **Git for Windows** community, as well as steps for reporting unacceptable +behavior. We are committed to providing a welcoming and inspiring community +for all and expect our code of conduct to be honored. Anyone who violates this code of conduct may be banned from the community. ## Our Pledge @@ -12,8 +12,8 @@ We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, -nationality, personal appearance, race, religion, or sexual identity -and orientation. +nationality, personal appearance, race, caste, color, religion, or sexual +identity and orientation. We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community. @@ -28,17 +28,17 @@ community include: * Giving and gracefully accepting constructive feedback * Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience -* Focusing on what is best not just for us as individuals, but for the - overall community +* Focusing on what is best not just for us as individuals, but for the overall + community Examples of unacceptable behavior include: -* The use of sexualized language or imagery, and sexual attention or - advances of any kind +* The use of sexualized language or imagery, and sexual attention or advances of + any kind * Trolling, insulting or derogatory comments, and personal or political attacks * Public or private harassment -* Publishing others' private information, such as a physical or email - address, without their explicit permission +* Publishing others' private information, such as a physical or email address, + without their explicit permission * Other conduct which could reasonably be considered inappropriate in a professional setting @@ -58,20 +58,14 @@ decisions when appropriate. This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. -Examples of representing our community include using an official e-mail address, +Examples of representing our community include using an official email address, posting via an official social media account, or acting as an appointed representative at an online or offline event. ## Enforcement Instances of abusive, harassing, or otherwise unacceptable behavior may be -reported to the community leaders responsible for enforcement at -git@sfconservancy.org, or individually: - - - Ævar Arnfjörð Bjarmason <avarab@gmail.com> - - Christian Couder <christian.couder@gmail.com> - - Junio C Hamano <gitster@pobox.com> - - Taylor Blau <me@ttaylorr.com> +reported by contacting the Git for Windows maintainer. All complaints will be reviewed and investigated promptly and fairly. @@ -94,15 +88,15 @@ behavior was inappropriate. A public apology may be requested. ### 2. Warning -**Community Impact**: A violation through a single incident or series -of actions. +**Community Impact**: A violation through a single incident or series of +actions. **Consequence**: A warning with consequences for continued behavior. No interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, for a specified period of time. This includes avoiding interactions in community spaces as well as external channels -like social media. Violating these terms may lead to a temporary or -permanent ban. +like social media. Violating these terms may lead to a temporary or permanent +ban. ### 3. Temporary Ban @@ -118,27 +112,27 @@ Violating these terms may lead to a permanent ban. ### 4. Permanent Ban **Community Impact**: Demonstrating a pattern of violation of community -standards, including sustained inappropriate behavior, harassment of an +standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals. -**Consequence**: A permanent ban from any sort of public interaction within -the community. +**Consequence**: A permanent ban from any sort of public interaction within the +community. ## Attribution This Code of Conduct is adapted from the [Contributor Covenant][homepage], -version 2.0, available at -[https://www.contributor-covenant.org/version/2/0/code_of_conduct.html][v2.0]. +version 2.1, available at +[https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1]. Community Impact Guidelines were inspired by [Mozilla's code of conduct enforcement ladder][Mozilla CoC]. For answers to common questions about this code of conduct, see the FAQ at -[https://www.contributor-covenant.org/faq][FAQ]. Translations are available -at [https://www.contributor-covenant.org/translations][translations]. +[https://www.contributor-covenant.org/faq][FAQ]. Translations are available at +[https://www.contributor-covenant.org/translations][translations]. [homepage]: https://www.contributor-covenant.org -[v2.0]: https://www.contributor-covenant.org/version/2/0/code_of_conduct.html +[v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html [Mozilla CoC]: https://github.com/mozilla/diversity [FAQ]: https://www.contributor-covenant.org/faq [translations]: https://www.contributor-covenant.org/translations From 56ce15a2ac9df0b7b6406afbff3b557e48662b78 Mon Sep 17 00:00:00 2001 From: Derrick Stolee <dstolee@microsoft.com> Date: Thu, 1 Mar 2018 12:10:14 -0500 Subject: [PATCH 288/297] CONTRIBUTING.md: add guide for first-time contributors Getting started contributing to Git can be difficult on a Windows machine. CONTRIBUTING.md contains a guide to getting started, including detailed steps for setting up build tools, running tests, and submitting patches to upstream. [includes an example by Pratik Karki how to submit v2, v3, v4, etc.] Signed-off-by: Derrick Stolee <dstolee@microsoft.com> --- CONTRIBUTING.md | 417 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 417 insertions(+) create mode 100644 CONTRIBUTING.md diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 00000000000000..ca9770df8c4808 --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,417 @@ +How to Contribute to Git for Windows +==================================== + +Git was originally designed for Unix systems and still today, all the build tools for the Git +codebase assume you have standard Unix tools available in your path. If you have an open-source +mindset and want to start contributing to Git, but primarily use a Windows machine, then you may +have trouble getting started. This guide is for you. + +Get the Source +-------------- + +Clone the [GitForWindows repository on GitHub](https://github.com/git-for-windows/git). +It is helpful to create your own fork for storing your development branches. + +Windows uses different line endings than Unix systems. See +[this GitHub article on working with line endings](https://help.github.com/articles/dealing-with-line-endings/#refreshing-a-repository-after-changing-line-endings) +if you have trouble with line endings. + +Build the Source +---------------- + +First, download and install the latest [Git for Windows SDK (64-bit)](https://github.com/git-for-windows/build-extra/releases/latest). +When complete, you can run the Git SDK, which creates a new Git Bash terminal window with +the additional development commands, such as `make`. + + As of time of writing, the SDK uses a different credential manager, so you may still want to use normal Git + Bash for interacting with your remotes. Alternatively, use SSH rather than HTTPS and + avoid credential manager problems. + +You should now be ready to type `make` from the root of your `git` source directory. +Here are some helpful variations: + +* `make -j[N] DEVELOPER=1`: Compile new sources using up to N concurrent processes. + The `DEVELOPER` flag turns on all warnings; code failing these warnings will not be + accepted upstream ("upstream" = "the core Git project"). +* `make clean`: Delete all compiled files. + +When running `make`, you can use `-j$(nproc)` to automatically use the number of processors +on your machine as the number of concurrent build processes. + +You can go deeper on the Windows-specific build process by reading the +[technical overview](https://github.com/git-for-windows/git/wiki/Technical-overview) or the +[guide to compiling Git with Visual Studio](https://github.com/git-for-windows/git/wiki/Compiling-Git-with-Visual-Studio). + +## Building `git` on Windows with Visual Studio + +The typical approach to building `git` is to use the standard `Makefile` with GCC, as +above. Developers working in a Windows environment may want to instead build with the +[Microsoft Visual C++ compiler and libraries toolset (MSVC)](https://blogs.msdn.microsoft.com/vcblog/2017/03/07/msvc-the-best-choice-for-windows/). +There are a few benefits to using MSVC over GCC during your development, including creating +symbols for debugging and [performance tracing](https://github.com/Microsoft/perfview#perfview-overview). + +There are two ways to build Git for Windows using MSVC. Each have their own merits. + +### Using SDK Command Line + +Use one of the following commands from the SDK Bash window to build Git for Windows: + +``` + make MSVC=1 -j12 + make MSVC=1 DEBUG=1 -j12 +``` + +The first form produces release-mode binaries; the second produces debug-mode binaries. +Both forms produce PDB files and can be debugged. However, the first is best for perf +tracing and the second is best for single-stepping. + +You can then open Visual Studio and select File -> Open -> Project/Solution and select +the compiled `git.exe` file. This creates a basic solution and you can use the debugging +and performance tracing tools in Visual Studio to monitor a Git process. Use the Debug +Properties page to set the working directory and command line arguments. + +Be sure to clean up before switching back to GCC (or to switch between debug and +release MSVC builds): + +``` + make MSVC=1 -j12 clean + make MSVC=1 DEBUG=1 -j12 clean +``` + +### Using the IDE + +If you prefer working in Visual Studio with a solution full of projects, then you can use +CMake, either by letting Visual Studio configure it automatically (simply open Git's +top-level directory via `File>Open>Folder...`) or by (downloading and) running +[CMake](https://cmake.org) manually. + +What to Change? +--------------- + +Many new contributors ask: What should I start working on? + +One way to win big with the open-source community is to look at the +[issues page](https://github.com/git-for-windows/git/issues) and see if there are any issues that +you can fix quickly, or if anything catches your eye. + +You can also look at [the unofficial Chromium issues page](https://crbug.com/git) for +multi-platform issues. You can look at recent user questions on +[the Git mailing list](https://public-inbox.org/git). + +Or you can "scratch your own itch", i.e. address an issue you have with Git. The team at Microsoft where the Git for Windows maintainer works, for example, is focused almost entirely on [improving performance](https://blogs.msdn.microsoft.com/devops/2018/01/11/microsofts-performance-contributions-to-git-in-2017/). +We approach our work by finding something that is slow and try to speed it up. We start our +investigation by reliably reproducing the slow behavior, then running that example using +the MSVC build and tracing the results in PerfView. + +You could also think of something you wish Git could do, and make it do that thing! The +only concern I would have with this approach is whether or not that feature is something +the community also wants. If this excites you though, go for it! Don't be afraid to +[get involved in the mailing list](http://vger.kernel.org/vger-lists.html#git) early for +feedback on the idea. + +Test Your Changes +----------------- + +After you make your changes, it is important that you test your changes. Manual testing is +important, but checking and extending the existing test suite is even more important. You +want to run the functional tests to see if you broke something else during your change, and +you want to extend the functional tests to be sure no one breaks your feature in the future. + +### Functional Tests + +Navigate to the `t/` directory and type `make` to run all tests or use `prove` as +[described in the Git for Windows wiki](https://github.com/git-for-windows/git/wiki/Building-Git): + +``` +prove -j12 --state=failed,save ./t[0-9]*.sh +``` + +You can also run each test directly by running the corresponding shell script with a name +like `tNNNN-descriptor.sh`. + +If you are adding new functionality, you may need to create unit tests by creating +helper commands that test a very limited action. These commands are stored in `t/helpers`. +When adding a helper, be sure to add a line to `t/Makefile` and to the `.gitignore` for the +binary file you add. The Git community prefers functional tests using the full `git` +executable, so try to exercise your new code using `git` commands before creating a test +helper. + +To find out why a test failed, repeat the test with the `-x -v -d -i` options and then +navigate to the appropriate "trash" directory to see the data shape that was used for the +test failed step. + +Read [`t/README`](t/README) for more details. + +### Performance Tests + +If you are working on improving performance, you will need to be acquainted with the +performance tests in `t/perf`. There are not too many performance tests yet, but adding one +as your first commit in a patch series helps to communicate the boost your change provides. + +To check the change in performance across multiple versions of `git`, you can use the +`t/perf/run` script. For example, to compare the performance of `git rev-list` across the +`core/master` and `core/next` branches compared to a `topic` branch, you can run + +``` +cd t/perf +./run core/master core/next topic -- p0001-rev-list.sh +``` + +You can also set certain environment variables to help test the performance on different +repositories or with more repetitions. The full list is available in +[the `t/perf/README` file](t/perf/README), +but here are a few important ones: + +``` +GIT_PERF_REPO=/path/to/repo +GIT_PERF_LARGE_REPO=/path/to/large/repo +GIT_PERF_REPEAT_COUNT=10 +``` + +When running the performance tests on Linux, you may see a message "Can't locate JSON.pm in +@INC" and that means you need to run `sudo cpanm install JSON` to get the JSON perl package. + +For running performance tests, it can be helpful to set up a few repositories with strange +data shapes, such as: + +**Many objects:** Clone repos such as [Kotlin](https://github.com/jetbrains/kotlin), [Linux](https://github.com/torvalds/linux), or [Android](https://source.android.com/setup/downloading). + +**Many pack-files:** You can split a fresh clone into multiple pack-files of size at most +16MB by running `git repack -adfF --max-pack-size=16m`. See the +[`git repack` documentation](https://git-scm.com/docs/git-repack) for more information. +You can count the number of pack-files using `ls .git/objects/pack/*.pack | wc -l`. + +**Many loose objects:** If you already split your repository into multiple pack-files, then +you can pick one to split into loose objects using `cat .git/objects/pack/[id].pack | git unpack-objects`; +delete the `[id].pack` and `[id].idx` files after this. You can count the number of loose +bjects using `ls .git/objects/??/* | wc -l`. + +**Deep history:** Usually large repositories also have deep histories, but you can use the +[test-many-commits-1m repo](https://github.com/cirosantilli/test-many-commits-1m/) to +target deep histories without the overhead of many objects. One issue with this repository: +there are no merge commits, so you will need to use a different repository to test a "wide" +commit history. + +**Large Index:** You can generate a large index and repo by using the scripts in +`t/perf/repos`. There are two scripts. `many-files.sh` which will generate a repo with +same tree and blobs but different paths. Using `many-files.sh -d 5 -w 10 -f 9` will create +a repo with ~1 million entries in the index. `inflate-repo.sh` will use an existing repo +and copy the current work tree until it is a specified size. + +Test Your Changes on Linux +-------------------------- + +It can be important to work directly on the [core Git codebase](https://github.com/git/git), +such as a recent commit into the `master` or `next` branch that has not been incorporated +into Git for Windows. Also, it can help to run functional and performance tests on your +code in Linux before submitting patches to the mailing list, which focuses on many platforms. +The differences between Windows and Linux are usually enough to catch most cross-platform +issues. + +### Using the Windows Subsystem for Linux + +The [Windows Subsystem for Linux (WSL)](https://docs.microsoft.com/en-us/windows/wsl/install-win10) +allows you to [install Ubuntu Linux as an app](https://www.microsoft.com/en-us/store/p/ubuntu/9nblggh4msv6) +that can run Linux executables on top of the Windows kernel. Internally, +Linux syscalls are interpreted by the WSL, everything else is plain Ubuntu. + +First, open WSL (either type "Bash" in Cortana, or execute "bash.exe" in a CMD window). +Then install the prerequisites, and `git` for the initial clone: + +``` +sudo apt-get update +sudo apt-get install git gcc make libssl-dev libcurl4-openssl-dev \ + libexpat-dev tcl tk gettext git-email zlib1g-dev +``` + +Then, clone and build: + +``` +git clone https://github.com/git-for-windows/git +cd git +git remote add -f upstream https://github.com/git/git +make +``` + +Be sure to clone into `/home/[user]/` and not into any folder under `/mnt/?/` or your build +will fail due to colons in file names. + +### Using a Linux Virtual Machine with Hyper-V + +If you prefer, you can use a virtual machine (VM) to run Linux and test your changes in the +full environment. The test suite runs a lot faster on Linux than on Windows or with the WSL. +You can connect to the VM using an SSH terminal like +[PuTTY](https://www.chiark.greenend.org.uk/~sgtatham/putty/). + +The following instructions are for using Hyper-V, which is available in some versions of Windows. +There are many virtual machine alternatives available, if you do not have such a version installed. + +* [Download an Ubuntu Server ISO](https://www.ubuntu.com/download/server). +* Open [Hyper-V Manager](https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/quick-start/enable-hyper-v). +* [Set up a virtual switch](https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/quick-start/connect-to-network) + so your VM can reach the network. +* Select "Quick Create", name your machine, select the ISO as installation source, and un-check + "This virtual machine will run Windows." +* Go through the Ubuntu install process, being sure to select to install OpenSSH Server. +* When install is complete, log in and check the SSH server status with `sudo service ssh status`. + * If the service is not found, install with `sudo apt-get install openssh-server`. + * If the service is not running, then use `sudo service ssh start`. +* Use `shutdown -h now` to shutdown the VM, go to the Hyper-V settings for the VM, expand Network Adapter + to select "Advanced Features", and set the MAC address to be static (this can save your VM from losing + network if shut down incorrectly). +* Provide as many cores to your VM as you can (for parallel builds). +* Restart your VM, but do not connect. +* Use `ssh` in Git Bash, download [PuTTY](http://www.putty.org/), or use your favorite SSH client to connect to the VM through SSH. + +In order to build and use `git`, you will need the following libraries via `apt-get`: + +``` +sudo apt-get update +sudo apt-get install git gcc make libssl-dev libcurl4-openssl-dev \ + libexpat-dev tcl tk gettext git-email zlib1g-dev +``` + +To get your code from your Windows machine to the Linux VM, it is easiest to push the branch to your fork of Git and clone your fork in the Linux VM. + +Don't forget to set your `git` config with your preferred name, email, and editor. + +Polish Your Commits +------------------- + +Before submitting your patch, be sure to read the [coding guidelines](https://github.com/git/git/blob/master/Documentation/CodingGuidelines) +and check your code to match as best you can. This can be a lot of effort, but it saves +time during review to avoid style issues. + +The other possibly major difference between the mailing list submissions and GitHub PR workflows +is that each commit will be reviewed independently. Even if you are submitting a +patch series with multiple commits, each commit must stand on it's own and be reviewable +by itself. Make sure the commit message clearly explain the why of the commit not the how. +Describe what is wrong with the current code and how your changes have made the code better. + +When preparing your patch, it is important to put yourself in the shoes of the Git community. +Accepting a patch requires more justification than approving a pull request from someone on +your team. The community has a stable product and is responsible for keeping it stable. If +you introduce a bug, then they cannot count on you being around to fix it. When you decided +to start work on a new feature, they were not part of the design discussion and may not +even believe the feature is worth introducing. + +Questions to answer in your patch message (and commit messages) may include: +* Why is this patch necessary? +* How does the current behavior cause pain for users? +* What kinds of repositories are necessary for noticing a difference? +* What design options did you consider before writing this version? Do you have links to + code for those alternate designs? +* Is this a performance fix? Provide clear performance numbers for various well-known repos. + +Here are some other tips that we use when cleaning up our commits: + +* Commit messages should be wrapped at 76 columns per line (or less; 72 is also a + common choice). +* Make sure the commits are signed off using `git commit (-s|--signoff)`. See + [SubmittingPatches](https://github.com/git/git/blob/v2.8.1/Documentation/SubmittingPatches#L234-L286) + for more details about what this sign-off means. +* Check for whitespace errors using `git diff --check [base]...HEAD` or `git log --check`. +* Run `git rebase --whitespace=fix` to correct upstream issues with whitespace. +* Become familiar with interactive rebase (`git rebase -i`) because you will be reordering, + squashing, and editing commits as your patch or series of patches is reviewed. +* Make sure any shell scripts that you add have the executable bit set on them. This is + usually for test files that you add in the `/t` directory. You can use + `git add --chmod=+x [file]` to update it. You can test whether a file is marked as executable + using `git ls-files --stage \*.sh`; the first number is 100755 for executable files. +* Your commit titles should match the "area: change description" format. Rules of thumb: + * Choose "<area>: " prefix appropriately. + * Keep the description short and to the point. + * The word that follows the "<area>: " prefix is not capitalized. + * Do not include a full-stop at the end of the title. + * Read a few commit messages -- using `git log origin/master`, for instance -- to + become acquainted with the preferred commit message style. +* Build source using `make DEVELOPER=1` for extra-strict compiler warnings. + +Submit Your Patch +----------------- + +Git for Windows [accepts pull requests on GitHub](https://github.com/git-for-windows/git/pulls), but +these are reserved for Windows-specific improvements. For core Git, submissions are accepted on +[the Git mailing list](https://public-inbox.org/git). + +### Configure Git to Send Emails + +There are a bunch of options for configuring the `git send-email` command. These options can +be found in the documentation for +[`git config`](https://git-scm.com/docs/git-config) and +[`git send-email`](https://git-scm.com/docs/git-send-email). + +``` +git config --global sendemail.smtpserver <smtp server> +git config --global sendemail.smtpserverport 587 +git config --global sendemail.smtpencryption tls +git config --global sendemail.smtpuser <email address> +``` + +To avoid storing your password in the config file, store it in the Git credential manager: + +``` +$ git credential fill +protocol=smtp +host=<stmp server> +username=<email address> +password=password +``` + +Before submitting a patch, read the [Git documentation on submitting patches](https://github.com/git/git/blob/master/Documentation/SubmittingPatches). + +To construct a patch set, use the `git format-patch` command. There are three important options: + +* `--cover-letter`: If specified, create a `[v#-]0000-cover-letter.patch` file that can be + edited to describe the patch as a whole. If you previously added a branch description using + `git branch --edit-description`, you will end up with a 0/N mail with that description and + a nice overall diffstat. +* `--in-reply-to=[Message-ID]`: This will mark your cover letter as replying to the given + message (which should correspond to your previous iteration). To determine the correct Message-ID, + find the message you are replying to on [public-inbox.org/git](https://public-inbox.org/git) and take + the ID from between the angle brackets. + +* `--subject-prefix=[prefix]`: This defaults to [PATCH]. For subsequent iterations, you will want to + override it like `--subject-prefix="[PATCH v2]"`. You can also use the `-v` option to have it + automatically generate the version number in the patches. + +If you have multiple commits and use the `--cover-letter` option be sure to open the +`0000-cover-letter.patch` file to update the subject and add some details about the overall purpose +of the patch series. + +### Examples + +To generate a single commit patch file: +``` +git format-patch -s -o [dir] -1 +``` +To generate four patch files from the last three commits with a cover letter: +``` +git format-patch --cover-letter -s -o [dir] HEAD~4 +``` +To generate version 3 with four patch files from the last four commits with a cover letter: +``` +git format-patch --cover-letter -s -o [dir] -v 3 HEAD~4 +``` + +### Submit the Patch + +Run [`git send-email`](https://git-scm.com/docs/git-send-email), starting with a test email: + +``` +git send-email --to=yourself@address.com [dir with patches]/*.patch +``` + +After checking the receipt of your test email, you can send to the list and to any +potentially interested reviewers. + +``` +git send-email --to=git@vger.kernel.org --cc=<email1> --cc=<email2> [dir with patches]/*.patch +``` + +To submit a nth version patch (say version 3): + +``` +git send-email --to=git@vger.kernel.org --cc=<email1> --cc=<email2> \ + --in-reply-to=<the message id of cover letter of patch v2> [dir with patches]/*.patch +``` From 46a6c34a9d21b6f52d33b608b885dd415b3311d9 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Fri, 10 Jan 2014 16:16:03 -0600 Subject: [PATCH 289/297] README.md: Add a Windows-specific preamble MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Includes touch-ups by 마누엘, Philip Oakley and 孙卓识. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- README.md | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 76 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 665ce5f5a83647..4eabce53d89e1f 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,77 @@ -[![Build status](https://github.com/git/git/workflows/CI/badge.svg)](https://github.com/git/git/actions?query=branch%3Amaster+event%3Apush) +Git for Windows +=============== + +[![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-2.1-4baaaa.svg)](CODE_OF_CONDUCT.md) +[![Open in Visual Studio Code](https://img.shields.io/static/v1?logo=visualstudiocode&label=&message=Open%20in%20Visual%20Studio%20Code&labelColor=2c2c32&color=007acc&logoColor=007acc)](https://open.vscode.dev/git-for-windows/git) +[![Build status](https://github.com/git-for-windows/git/workflows/CI/badge.svg)](https://github.com/git-for-windows/git/actions?query=branch%3Amain+event%3Apush) +[![Join the chat at https://gitter.im/git-for-windows/git](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/git-for-windows/git?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) + +This is [Git for Windows](http://git-for-windows.github.io/), the Windows port +of [Git](http://git-scm.com/). + +The Git for Windows project is run using a [governance +model](http://git-for-windows.github.io/governance-model.html). If you +encounter problems, you can report them as [GitHub +issues](https://github.com/git-for-windows/git/issues), discuss them on Git +for Windows' [Google Group](http://groups.google.com/group/git-for-windows), +and [contribute bug +fixes](https://github.com/git-for-windows/git/wiki/How-to-participate). + +To build Git for Windows, please either install [Git for Windows' +SDK](https://gitforwindows.org/#download-sdk), start its `git-bash.exe`, `cd` +to your Git worktree and run `make`, or open the Git worktree as a folder in +Visual Studio. + +To verify that your build works, use one of the following methods: + +- If you want to test the built executables within Git for Windows' SDK, + prepend `<worktree>/bin-wrappers` to the `PATH`. +- Alternatively, run `make install` in the Git worktree. +- If you need to test this in a full installer, run `sdk build + git-and-installer`. +- You can also "install" Git into an existing portable Git via `make install + DESTDIR=<dir>` where `<dir>` refers to the top-level directory of the + portable Git. In this instance, you will want to prepend that portable Git's + `/cmd` directory to the `PATH`, or test by running that portable Git's + `git-bash.exe` or `git-cmd.exe`. +- If you built using a recent Visual Studio, you can use the menu item + `Build>Install git` (you will want to click on `Project>CMake Settings for + Git` first, then click on `Edit JSON` and then point `installRoot` to the + `mingw64` directory of an already-unpacked portable Git). + + As in the previous bullet point, you will then prepend `/cmd` to the `PATH` + or run using the portable Git's `git-bash.exe` or `git-cmd.exe`. +- If you want to run the built executables in-place, but in a CMD instead of + inside a Bash, you can run a snippet like this in the `git-bash.exe` window + where Git was built (ensure that the `EOF` line has no leading spaces), and + then paste into the CMD window what was put in the clipboard: + + ```sh + clip.exe <<EOF + set GIT_EXEC_PATH=$(cygpath -aw .) + set PATH=$(cygpath -awp ".:contrib/scalar:/mingw64/bin:/usr/bin:$PATH") + set GIT_TEMPLATE_DIR=$(cygpath -aw templates/blt) + set GITPERLLIB=$(cygpath -aw perl/build/lib) + EOF + ``` +- If you want to run the built executables in-place, but outside of Git for + Windows' SDK, and without an option to set/override any environment + variables (e.g. in Visual Studio's debugger), you can call the Git executable + by its absolute path and use the `--exec-path` option, like so: + + ```cmd + C:\git-sdk-64\usr\src\git\git.exe --exec-path=C:\git-sdk-64\usr\src\git help + ``` + + Note: for this to work, you have to hard-link (or copy) the `.dll` files from + the `/mingw64/bin` directory to the Git worktree, or add the `/mingw64/bin` + directory to the `PATH` somehow or other. + +To make sure that you are testing the correct binary, call `./git.exe version` +in the Git worktree, and then call `git version` in a directory/window where +you want to test Git, and verify that they refer to the same version (you may +even want to pass the command-line option `--build-options` to look at the +exact commit from which the Git version was built). Git - fast, scalable, distributed revision control system ========================================================= @@ -29,7 +102,7 @@ CVS users may also want to read [Documentation/gitcvs-migration.txt][] (`man gitcvs-migration` or `git help cvs-migration` if git is installed). -The user discussion and development of Git take place on the Git +The user discussion and development of core Git take place on the Git mailing list -- everyone is welcome to post bug reports, feature requests, comments and patches to git@vger.kernel.org (read [Documentation/SubmittingPatches][] for instructions on patch submission @@ -43,6 +116,7 @@ To subscribe to the list, send an email to <git+subscribe@vger.kernel.org> (see https://subspace.kernel.org/subscribing.html for details). The mailing list archives are available at <https://lore.kernel.org/git/>, <https://marc.info/?l=git> and other archival sites. +The core git mailing list is plain text (no HTML!). Issues which are security relevant should be disclosed privately to the Git Security mailing list <git-security@googlegroups.com>. From 34bbb92a1d53f3d64c9c56654b6933438c8aaae1 Mon Sep 17 00:00:00 2001 From: Brendan Forster <brendan@github.com> Date: Thu, 18 Feb 2016 21:29:50 +1100 Subject: [PATCH 290/297] Add an issue template With improvements by Clive Chan, Adric Norris, Ben Bodenmiller and Philip Oakley. Helped-by: Clive Chan <cc@clive.io> Helped-by: Adric Norris <landstander668@gmail.com> Helped-by: Ben Bodenmiller <bbodenmiller@hotmail.com> Helped-by: Philip Oakley <philipoakley@iee.org> Signed-off-by: Brendan Forster <brendan@github.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- .github/ISSUE_TEMPLATE.md | 64 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 64 insertions(+) create mode 100644 .github/ISSUE_TEMPLATE.md diff --git a/.github/ISSUE_TEMPLATE.md b/.github/ISSUE_TEMPLATE.md new file mode 100644 index 00000000000000..4017ed82ca4341 --- /dev/null +++ b/.github/ISSUE_TEMPLATE.md @@ -0,0 +1,64 @@ + - [ ] I was not able to find an [open](https://github.com/git-for-windows/git/issues?q=is%3Aopen) or [closed](https://github.com/git-for-windows/git/issues?q=is%3Aclosed) issue matching what I'm seeing + +### Setup + + - Which version of Git for Windows are you using? Is it 32-bit or 64-bit? + +``` +$ git --version --build-options + +** insert your machine's response here ** +``` + + - Which version of Windows are you running? Vista, 7, 8, 10? Is it 32-bit or 64-bit? + +``` +$ cmd.exe /c ver + +** insert your machine's response here ** +``` + + - What options did you set as part of the installation? Or did you choose the + defaults? + +``` +# One of the following: +> type "C:\Program Files\Git\etc\install-options.txt" +> type "C:\Program Files (x86)\Git\etc\install-options.txt" +> type "%USERPROFILE%\AppData\Local\Programs\Git\etc\install-options.txt" +> type "$env:USERPROFILE\AppData\Local\Programs\Git\etc\install-options.txt" +$ cat /etc/install-options.txt + +** insert your machine's response here ** +``` + + - Any other interesting things about your environment that might be related + to the issue you're seeing? + +** insert your response here ** + +### Details + + - Which terminal/shell are you running Git from? e.g Bash/CMD/PowerShell/other + +** insert your response here ** + + - What commands did you run to trigger this issue? If you can provide a + [Minimal, Complete, and Verifiable example](http://stackoverflow.com/help/mcve) + this will help us understand the issue. + +``` +** insert your commands here ** +``` + - What did you expect to occur after running these commands? + +** insert here ** + + - What actually happened instead? + +** insert here ** + + - If the problem was occurring with a specific repository, can you provide the + URL to that repository to help us with testing? + +** insert URL here ** From b0d782a22d504d10e0d79fd6f3ec49576c3ac52a Mon Sep 17 00:00:00 2001 From: Philip Oakley <philipoakley@iee.org> Date: Fri, 22 Dec 2017 17:15:50 +0000 Subject: [PATCH 291/297] Modify the GitHub Pull Request template (to reflect Git for Windows) Git for Windows accepts pull requests; Core Git does not. Therefore we need to adjust the template (because it only matches core Git's project management style, not ours). Also: direct Git for Windows enhancements to their contributions page, space out the text for easy reading, and clarify that the mailing list is plain text, not HTML. Signed-off-by: Philip Oakley <philipoakley@iee.org> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- .github/PULL_REQUEST_TEMPLATE.md | 20 ++++++++++++++++---- 1 file changed, 16 insertions(+), 4 deletions(-) diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md index 37654cdfd7abcf..7baf31f2c471ec 100644 --- a/.github/PULL_REQUEST_TEMPLATE.md +++ b/.github/PULL_REQUEST_TEMPLATE.md @@ -1,7 +1,19 @@ -Thanks for taking the time to contribute to Git! Please be advised that the -Git community does not use github.com for their contributions. Instead, we use -a mailing list (git@vger.kernel.org) for code submissions, code reviews, and -bug reports. Nevertheless, you can use GitGitGadget (https://gitgitgadget.github.io/) +Thanks for taking the time to contribute to Git! + +Those seeking to contribute to the Git for Windows fork should see +http://gitforwindows.org/#contribute on how to contribute Windows specific +enhancements. + +If your contribution is for the core Git functions and documentation +please be aware that the Git community does not use the github.com issues +or pull request mechanism for their contributions. + +Instead, we use the Git mailing list (git@vger.kernel.org) for code and +documentation submissions, code reviews, and bug reports. The +mailing list is plain text only (anything with HTML is sent directly +to the spam folder). + +Nevertheless, you can use GitGitGadget (https://gitgitgadget.github.io/) to conveniently send your Pull Requests commits to our mailing list. For a single-commit pull request, please *leave the pull request description From 79df07f8ab8eb5a57f5d6346b380574256583bdb Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Tue, 20 Feb 2018 15:44:57 +0100 Subject: [PATCH 292/297] .github: Add configuration for the Sentiment Bot The sentiment bot will help detect when things get too heated. Hopefully. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- .github/config.yml | 10 ++++++++++ 1 file changed, 10 insertions(+) create mode 100644 .github/config.yml diff --git a/.github/config.yml b/.github/config.yml new file mode 100644 index 00000000000000..45edb7ba37ce02 --- /dev/null +++ b/.github/config.yml @@ -0,0 +1,10 @@ +# Configuration for sentiment-bot - https://github.com/behaviorbot/sentiment-bot + +# *Required* toxicity threshold between 0 and .99 with the higher numbers being +# the most toxic. Anything higher than this threshold will be marked as toxic +# and commented on +sentimentBotToxicityThreshold: .7 + +# *Required* Comment to reply with +sentimentBotReplyComment: > + Please be sure to review the code of conduct and be respectful of other users. cc/ @git-for-windows/trusted-git-for-windows-developers From 4b119814fcc7baed475701c145186fc9cd64ff92 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Tue, 29 Sep 2020 13:50:59 +0200 Subject: [PATCH 293/297] Add a GitHub workflow to monitor component updates MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Rather than using private IFTTT Applets that send mails to this maintainer whenever a new version of a Git for Windows component was released, let's use the power of GitHub workflows to make this process publicly visible. This workflow monitors the Atom/RSS feeds, and opens a ticket whenever a new version was released. Note: Bash sometimes releases multiple patched versions within a few minutes of each other (i.e. 5.1p1 through 5.1p4, 5.0p15 and 5.0p16). The MSYS2 runtime also has a similar system. We can address those patches as a group, so we shouldn't get multiple issues about them. Note further: We're not acting on newlib releases, OpenSSL alphas, Perl release candidates or non-stable Perl releases. There's no need to open issues about them. Co-authored-by: Matthias Aßhauer <mha1993@live.de> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- .github/workflows/monitor-components.yml | 98 ++++++++++++++++++++++++ 1 file changed, 98 insertions(+) create mode 100644 .github/workflows/monitor-components.yml diff --git a/.github/workflows/monitor-components.yml b/.github/workflows/monitor-components.yml new file mode 100644 index 00000000000000..fedc6add69a9e0 --- /dev/null +++ b/.github/workflows/monitor-components.yml @@ -0,0 +1,98 @@ +name: Monitor component updates + +# Git for Windows is a slightly modified subset of MSYS2. Some of its +# components are maintained by Git for Windows, others by MSYS2. To help +# keeping the former up to date, this workflow monitors the Atom/RSS feeds +# and opens new tickets for each new component version. + +on: + schedule: + - cron: "23 8,11,14,17 * * *" + workflow_dispatch: + +env: + CHARACTER_LIMIT: 5000 + MAX_AGE: 7d + +jobs: + job: + # Only run this in Git for Windows' fork + if: github.event.repository.owner.login == 'git-for-windows' + runs-on: ubuntu-latest + permissions: + issues: write + strategy: + matrix: + component: + - label: git + feed: https://github.com/git/git/tags.atom + - label: git-lfs + feed: https://github.com/git-lfs/git-lfs/tags.atom + - label: git-credential-manager + feed: https://github.com/git-ecosystem/git-credential-manager/tags.atom + - label: tig + feed: https://github.com/jonas/tig/tags.atom + - label: cygwin + feed: https://github.com/cygwin/cygwin/releases.atom + title-pattern: ^(?!.*newlib) + - label: msys2-runtime-package + feed: https://github.com/msys2/MSYS2-packages/commits/master/msys2-runtime.atom + - label: msys2-runtime + feed: https://github.com/msys2/msys2-runtime/commits/HEAD.atom + aggregate: true + - label: openssh + feed: https://github.com/openssh/openssh-portable/tags.atom + - label: libfido2 + feed: https://github.com/Yubico/libfido2/tags.atom + - label: libcbor + feed: https://github.com/PJK/libcbor/tags.atom + - label: openssl + feed: https://github.com/openssl/openssl/tags.atom + title-pattern: ^(?!.*alpha) + - label: gnutls + feed: https://gnutls.org/news.atom + - label: heimdal + feed: https://github.com/heimdal/heimdal/tags.atom + - label: git-sizer + feed: https://github.com/github/git-sizer/tags.atom + - label: gitflow + feed: https://github.com/petervanderdoes/gitflow-avh/tags.atom + - label: curl + feed: https://github.com/curl/curl/tags.atom + - label: libgpg-error + feed: https://github.com/gpg/libgpg-error/releases.atom + title-pattern: ^libgpg-error-[0-9\.]*$ + - label: libgcrypt + feed: https://github.com/gpg/libgcrypt/releases.atom + title-pattern: ^libgcrypt-[0-9\.]*$ + - label: gpg + feed: https://github.com/gpg/gnupg/releases.atom + - label: mintty + feed: https://github.com/mintty/mintty/releases.atom + - label: 7-zip + feed: https://sourceforge.net/projects/sevenzip/rss?path=/7-Zip + aggregate: true + - label: bash + feed: https://git.savannah.gnu.org/cgit/bash.git/atom/?h=master + aggregate: true + - label: perl + feed: https://github.com/Perl/perl5/tags.atom + title-pattern: ^(?!.*(5\.[0-9]+[13579]|RC)) + - label: pcre2 + feed: https://github.com/PCRE2Project/pcre2/tags.atom + - label: mingw-w64-llvm + feed: https://github.com/msys2/MINGW-packages/commits/master/mingw-w64-llvm.atom + - label: innosetup + feed: https://github.com/jrsoftware/issrc/tags.atom + fail-fast: false + steps: + - uses: git-for-windows/rss-to-issues@v0 + with: + feed: ${{matrix.component.feed}} + prefix: "[New ${{matrix.component.label}} version]" + labels: component-update + github-token: ${{ secrets.GITHUB_TOKEN }} + character-limit: ${{ env.CHARACTER_LIMIT }} + max-age: ${{ env.MAX_AGE }} + aggregate: ${{matrix.component.aggregate}} + title-pattern: ${{matrix.component.title-pattern}} From 15f95ac290a09318d15605c3c1224b1bebd61217 Mon Sep 17 00:00:00 2001 From: Alejandro Barreto <alejandro.barreto@ni.com> Date: Fri, 9 Mar 2018 14:17:54 -0600 Subject: [PATCH 294/297] Document how $HOME is set on Windows Git documentation refers to $HOME and $XDG_CONFIG_HOME often, but does not specify how or where these values come from on Windows where neither is set by default. The new documentation reflects the behavior of setup_windows_environment() in compat/mingw.c. Signed-off-by: Alejandro Barreto <alejandro.barreto@ni.com> --- Documentation/git.txt | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/Documentation/git.txt b/Documentation/git.txt index 4489e2297a396b..74ee8334cfdfd0 100644 --- a/Documentation/git.txt +++ b/Documentation/git.txt @@ -477,6 +477,14 @@ their values the same way as Boolean valued configuration variables, e.g. Here are the variables: +System +~~~~~~ +`HOME`:: + Specifies the path to the user's home directory. On Windows, if + unset, Git will set a process environment variable equal to: + `$HOMEDRIVE$HOMEPATH` if both `$HOMEDRIVE` and `$HOMEPATH` exist; + otherwise `$USERPROFILE` if `$USERPROFILE` exists. + The Git Repository ~~~~~~~~~~~~~~~~~~ These environment variables apply to 'all' core Git commands. Nb: it From 7337fe16e24a98689ba500fbc26a260ad75a8a32 Mon Sep 17 00:00:00 2001 From: Victoria Dye <vdye@github.com> Date: Mon, 4 Apr 2022 15:38:58 -0700 Subject: [PATCH 295/297] fsmonitor: reintroduce core.useBuiltinFSMonitor Reintroduce the 'core.useBuiltinFSMonitor' config setting (originally added in 0a756b2a25 (fsmonitor: config settings are repository-specific, 2021-03-05)) after its removal from the upstream version of FSMonitor. Upstream, the 'core.useBuiltinFSMonitor' setting was rendered obsolete by "overloading" the 'core.fsmonitor' setting to take a boolean value. However, several applications (e.g., 'scalar') utilize the original config setting, so it should be preserved for a deprecation period before complete removal: * if 'core.fsmonitor' is a boolean, the user is correctly using the new config syntax; do not use 'core.useBuiltinFSMonitor'. * if 'core.fsmonitor' is unspecified, use 'core.useBuiltinFSMonitor'. * if 'core.fsmonitor' is a path, override and use the builtin FSMonitor if 'core.useBuiltinFSMonitor' is 'true'; otherwise, use the FSMonitor hook indicated by the path. Additionally, for this deprecation period, advise users to switch to using 'core.fsmonitor' to specify their use of the builtin FSMonitor. Signed-off-by: Victoria Dye <vdye@github.com> --- Documentation/config/advice.txt | 4 ++++ advice.c | 1 + advice.h | 1 + fsmonitor-settings.c | 34 +++++++++++++++++++++++++++++++-- 4 files changed, 38 insertions(+), 2 deletions(-) diff --git a/Documentation/config/advice.txt b/Documentation/config/advice.txt index 0ba89898207f0c..86bf99d55c6575 100644 --- a/Documentation/config/advice.txt +++ b/Documentation/config/advice.txt @@ -160,4 +160,8 @@ advice.*:: Shown when the user tries to create a worktree from an invalid reference, to tell the user how to create a new unborn branch instead. + + useCoreFSMonitorConfig:: + Advice shown if the deprecated 'core.useBuiltinFSMonitor' config + setting is in use. -- diff --git a/advice.c b/advice.c index 6b879d805c0304..66f196e0ae3cd9 100644 --- a/advice.c +++ b/advice.c @@ -87,6 +87,7 @@ static struct { [ADVICE_SUBMODULE_MERGE_CONFLICT] = { "submoduleMergeConflict" }, [ADVICE_SUGGEST_DETACHING_HEAD] = { "suggestDetachingHead" }, [ADVICE_UPDATE_SPARSE_PATH] = { "updateSparsePath" }, + [ADVICE_USE_CORE_FSMONITOR_CONFIG] = { "useCoreFSMonitorConfig" }, [ADVICE_WAITING_FOR_EDITOR] = { "waitingForEditor" }, [ADVICE_WORKTREE_ADD_ORPHAN] = { "worktreeAddOrphan" }, }; diff --git a/advice.h b/advice.h index d7466bc0ef2026..a78ea909ed8e7f 100644 --- a/advice.h +++ b/advice.h @@ -54,6 +54,7 @@ enum advice_type { ADVICE_SUBMODULE_MERGE_CONFLICT, ADVICE_SUGGEST_DETACHING_HEAD, ADVICE_UPDATE_SPARSE_PATH, + ADVICE_USE_CORE_FSMONITOR_CONFIG, ADVICE_WAITING_FOR_EDITOR, ADVICE_WORKTREE_ADD_ORPHAN, }; diff --git a/fsmonitor-settings.c b/fsmonitor-settings.c index e818583420aa1b..7ce048c4efed5d 100644 --- a/fsmonitor-settings.c +++ b/fsmonitor-settings.c @@ -5,6 +5,7 @@ #include "fsmonitor-ipc.h" #include "fsmonitor-settings.h" #include "fsmonitor-path-utils.h" +#include "advice.h" /* * We keep this structure defintion private and have getters @@ -100,6 +101,31 @@ static struct fsmonitor_settings *alloc_settings(void) return s; } +static int check_deprecated_builtin_config(struct repository *r) +{ + int core_use_builtin_fsmonitor = 0; + + /* + * If 'core.useBuiltinFSMonitor' is set, print a deprecation warning + * suggesting the use of 'core.fsmonitor' instead. If the config is + * set to true, set the appropriate mode and return 1 indicating that + * the check resulted the config being set by this (deprecated) setting. + */ + if(!repo_config_get_bool(r, "core.useBuiltinFSMonitor", &core_use_builtin_fsmonitor) && + core_use_builtin_fsmonitor) { + if (!git_env_bool("GIT_SUPPRESS_USEBUILTINFSMONITOR_ADVICE", 0)) { + advise_if_enabled(ADVICE_USE_CORE_FSMONITOR_CONFIG, + _("core.useBuiltinFSMonitor=true is deprecated;" + "please set core.fsmonitor=true instead")); + setenv("GIT_SUPPRESS_USEBUILTINFSMONITOR_ADVICE", "1", 1); + } + fsm_settings__set_ipc(r); + return 1; + } + + return 0; +} + static void lookup_fsmonitor_settings(struct repository *r) { const char *const_str; @@ -126,12 +152,16 @@ static void lookup_fsmonitor_settings(struct repository *r) return; case 1: /* config value was unset */ + if (check_deprecated_builtin_config(r)) + return; + const_str = getenv("GIT_TEST_FSMONITOR"); break; case -1: /* config value set to an arbitrary string */ - if (repo_config_get_pathname(r, "core.fsmonitor", &to_free)) - return; /* should not happen */ + if (check_deprecated_builtin_config(r) || + repo_config_get_pathname(r, "core.fsmonitor", &to_free)) + return; const_str = to_free; break; From b347f2c9ef7601c23aec38525115831e6c7c7fee Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Tue, 6 Feb 2024 18:45:35 +0100 Subject: [PATCH 296/297] dependabot: help keeping GitHub Actions versions up to date See https://docs.github.com/en/code-security/dependabot/working-with-dependabot/keeping-your-actions-up-to-date-with-dependabot#enabling-dependabot-version-updates-for-actions for details. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- .github/dependabot.yml | 13 +++++++++++++ 1 file changed, 13 insertions(+) create mode 100644 .github/dependabot.yml diff --git a/.github/dependabot.yml b/.github/dependabot.yml new file mode 100644 index 00000000000000..22d5376407abf1 --- /dev/null +++ b/.github/dependabot.yml @@ -0,0 +1,13 @@ +# To get started with Dependabot version updates, you'll need to specify which +# package ecosystems to update and where the package manifests are located. +# Please see the documentation for all configuration options: +# https://docs.github.com/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file +# especially +# https://docs.github.com/en/code-security/dependabot/working-with-dependabot/keeping-your-actions-up-to-date-with-dependabot#enabling-dependabot-version-updates-for-actions + +version: 2 +updates: + - package-ecosystem: "github-actions" # See documentation for possible values + directory: "/" # Location of package manifests + schedule: + interval: "weekly" From 95bf5144fa7ae27bb90f41f26443500d2af3e86f Mon Sep 17 00:00:00 2001 From: Johannes Schindelin <johannes.schindelin@gmx.de> Date: Fri, 23 Aug 2019 14:14:42 +0200 Subject: [PATCH 297/297] SECURITY.md: document Git for Windows' policies This is the recommended way on GitHub to describe policies revolving around security issues and about supported versions. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- SECURITY.md | 56 +++++++++++++++++++++++++++++++++-------------------- 1 file changed, 35 insertions(+), 21 deletions(-) diff --git a/SECURITY.md b/SECURITY.md index c720c2ae7f9580..328351684298a4 100644 --- a/SECURITY.md +++ b/SECURITY.md @@ -28,24 +28,38 @@ Examples for details to include: ## Supported Versions -There are no official "Long Term Support" versions in Git. -Instead, the maintenance track (i.e. the versions based on the -most recently published feature release, also known as ".0" -version) sees occasional updates with bug fixes. - -Fixes to vulnerabilities are made for the maintenance track for -the latest feature release and merged up to the in-development -branches. The Git project makes no formal guarantee for any -older maintenance tracks to receive updates. In practice, -though, critical vulnerability fixes are applied not only to the -most recent track, but to at least a couple more maintenance -tracks. - -This is typically done by making the fix on the oldest and still -relevant maintenance track, and merging it upwards to newer and -newer maintenance tracks. - -For example, v2.24.1 was released to address a couple of -[CVEs](https://cve.mitre.org/), and at the same time v2.14.6, -v2.15.4, v2.16.6, v2.17.3, v2.18.2, v2.19.3, v2.20.2, v2.21.1, -v2.22.2 and v2.23.1 were released. +Git for Windows is a "friendly fork" of [Git](https://git-scm.com/), i.e. changes in Git for Windows are frequently contributed back, and Git for Windows' release cycle closely following Git's. + +While Git maintains several release trains (when v2.19.1 was released, there were updates to v2.14.x-v2.18.x, too, for example), Git for Windows follows only the latest Git release. For example, there is no Git for Windows release corresponding to Git v2.16.5 (which was released after v2.19.0). + +One exception is [MinGit for Windows](https://github.com/git-for-windows/git/wiki/MinGit) (a minimal subset of Git for Windows, intended for bundling with third-party applications that do not need any interactive commands nor support for `git svn`): critical security fixes are backported to the v2.11.x, v2.14.x, v2.19.x, v2.21.x and v2.23.x release trains. + +## Version number scheme + +The Git for Windows versions reflect the Git version on which they are based. For example, Git for Windows v2.21.0 is based on Git v2.21.0. + +As Git for Windows bundles more than just Git (such as Bash, OpenSSL, OpenSSH, GNU Privacy Guard), sometimes there are interim releases without corresponding Git releases. In these cases, Git for Windows appends a number in parentheses, starting with the number 2, then 3, etc. For example, both Git for Windows v2.17.1 and v2.17.1(2) were based on Git v2.17.1, but the latter included updates for Git Credential Manager and Git LFS, fixing critical regressions. + +## Tag naming scheme + +Every Git for Windows version is tagged using a name that starts with the Git version on which it is based, with the suffix `.windows.<patchlevel>` appended. For example, Git for Windows v2.17.1' source code is tagged as [`v2.17.1.windows.1`](https://github.com/git-for-windows/git/releases/tag/v2.17.1.windows.1) (the patch level is always at least 1, given that Git for Windows always has patches on top of Git). Likewise, Git for Windows v2.17.1(2)' source code is tagged as [`v2.17.1.windows.2`](https://github.com/git-for-windows/git/releases/tag/v2.17.1.windows.2). + +## Release Candidate (rc) versions + +As a friendly fork of Git (the "upstream" project), Git for Windows is closely corelated to that project. + +Consequently, Git for Windows publishes versions based on Git's release candidates (for upcoming "`.0`" versions, see [Git's release schedule](https://tinyurl.com/gitCal)). These versions end in `-rc<n>`, starting with `-rc0` for a very early preview of what is to come, and as with regular versions, Git for Windows tries to follow Git's releases as quickly as possible. + +Note: there is currently a bug in the "Check daily for updates" code, where it mistakes the final version as a downgrade from release candidates. Example: if you installed Git for Windows v2.23.0-rc3 and enabled the auto-updater, it would ask you whether you want to "downgrade" to v2.23.0 when that version was available. + +[All releases](https://github.com/git-for-windows/git/releases/), including release candidates, are listed via a link at the footer of the [Git for Windows](https://gitforwindows.org/) home page. + +## Snapshot versions ('nightly builds') + +Git for Windows also provides snapshots (these are not releases) of the the current development as per git-for-Windows/git's `master` branch at the [Snapshots](https://wingit.blob.core.windows.net/files/index.html) page. This link is also listed in the footer of the [Git for Windows](https://gitforwindows.org/) home page. + +Note: even if those builds are not exactly "nightly", they are sometimes referred to as "nightly builds" to keep with other projects' nomenclature. + +## Following upstream's developments + +The [gitforwindows/git repository](https://github.com/git-for-windows/git) also provides the `shears/*` branches. The `shears/*` branches reflect Git for Windows' patches, rebased onto the upstream integration branches, [updated (mostly) via automated CI builds](https://dev.azure.com/git-for-windows/git/_build?definitionId=25).