-
Notifications
You must be signed in to change notification settings - Fork 106
refs/files: fix issues with git-fetch on case-insensitive FS #798
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refs/files: fix issues with git-fetch on case-insensitive FS #798
Conversation
During the 'prepare' phase of a reference transaction in the files backend, we create the lock files for references to be created. When using batched updates on case-insensitive filesystems, the entire batched updates would be aborted if there are conflicting names such as: refs/heads/Foo refs/heads/foo This affects all commands which were migrated to use batched updates in Git 2.51, including 'git-fetch(1)' and 'git-receive-pack(1)'. Before that, reference updates would be applied serially with one transaction used per update. When users fetched multiple references on case-insensitive systems, subsequent references would simply overwrite any earlier references. So when fetching: refs/heads/foo: 5f34ec0bfeac225b1c854340257a65b106f70ea6 refs/heads/Foo: ec3053b0977e83d9b67fc32c4527a117953994f3 refs/heads/sample: 2eefd1150e06d8fca1ddfa684dec016f36bf4e56 The user would simply end up with: refs/heads/foo: ec3053b0977e83d9b67fc32c4527a117953994f3 refs/heads/sample: 2eefd1150e06d8fca1ddfa684dec016f36bf4e56 This is buggy behavior since the user is never informed about the overrides performed and missing references. Nevertheless, the user is left with a working repository with a subset of the references. Since Git 2.51, in such situations fetches would simply fail without updating any references. Which is also buggy behavior and worse off since the user is left without any references. The error is triggered in `lock_raw_ref()` where the files backend attempts to create a lock file. When a lock file already exists the function returns a 'REF_TRANSACTION_ERROR_GENERIC'. When this happens, the entire batched updates, not individual operation, is aborted as if it were in a transaction. Change this to return 'REF_TRANSACTION_ERROR_CASE_CONFLICT' instead to aid the batched update mechanism to simply reject such errors. The change only affects batched updates since batched updates will reject individual updates with non-generic errors. So specifically this would only affect: 1. git fetch 2. git receive-pack 3. git update-ref --batch-updates This bubbles the error type up to `files_transaction_prepare()` which tries to lock each reference update. So if the locking fails, we check if the rejection type can be ignored, which is done by calling `ref_transaction_maybe_set_rejected()`. As the error type is now 'REF_TRANSACTION_ERROR_CASE_CONFLICT', the specific reference update would simply be rejected, while other updates in the transaction would continue to be applied. This allows partial application of references in case-insensitive filesystems when fetching colliding references. While the earlier implementation allowed the last reference to be applied overriding the initial references, this change would allow the first reference to be applied while rejecting consequent collisions. This should be an okay compromise since with the files backend, there is no scenario possible where we would retain all colliding references. Let's also be more proactive and notify users on case-insensitive filesystems about such problems by providing a brief about the issue while also recommending using the reftable backend, which doesn't have the same issue. Reported-by: Joe Drew <[email protected]> Helped-by: Patrick Steinhardt <[email protected]> Signed-off-by: Karthik Nayak <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
When fetching references into a repository, if a lock for a particular reference exists, then `lock_raw_ref()` throws: - REF_TRANSACTION_ERROR_CASE_CONFLICT: when there is a conflict because the transaction contains conflicting references while being on a case-insensitive filesystem. - REF_TRANSACTION_ERROR_GENERIC: for all other errors. The latter causes the entire set of batched updates to fail, even in case sensitive filessystems. Instead, return a 'REF_TRANSACTION_ERROR_CREATE_EXISTS' error. This allows batched updates to reject the individual update which conflicts with the existing file, while updating the rest of the references. Signed-off-by: Karthik Nayak <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
When using the files-backend on case-insensitive filesystems, there is possibility of hitting F/D conflicts when creating references within a single transaction, such as: - 'refs/heads/foo' - 'refs/heads/Foo/bar' Ideally such conflicts are caught in `refs_verify_refnames_available()` which is responsible for checking F/D conflicts within a given transaction. This utility function is shared across the reference backends. As such, it doesn't consider the issues of using a case-insensitive file system, which only affects the files-backend. While one solution would be to make the function aware of such issues, this feels like leaking implementation details of file-backend specific issues into the utility function. So opt for the more simpler option, of lowercasing all references sent to this function when on a case-insensitive filesystem and operating on the files-backend. To do this, simply use a `struct strbuf` to convert the refname to lowercase and append it to the list of refnames to be checked. Since we use a `struct strbuf` and the memory is cleared right after, make sure that the string list duplicates all provided string. Without this change, the user would simply be left with a repository with '.lock' files which were created in the 'prepare' phase of the transaction, as the 'commit' phase would simply abort and not do the necessary cleanup. Reported-by: Junio C Hamano <[email protected]> Signed-off-by: Karthik Nayak <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
The previous commit added the necessary validation and checks for F/D conflicts in the files backend when working on case insensitive systems. There is still a possibility for D/F conflicts. This is a different from the F/D since for F/D conflicts, there would not be a conflict during the lock creation phase: refs/heads/foo.lock refs/heads/foo/bar.lock However there would be a conflict when the locks are committed, since we cannot have 'refs/heads/foo/bar' and 'refs/heads/foo'. These kinds of conflicts are checked and resolved in `refs_verify_refnames_available()`, so the previous commit ensured that for case-insensitive filesystems, we would lowercase the inputs to that function. For D/F conflicts, there is a conflict during the lock creation phase itself: refs/heads/foo/bar.lock refs/heads/foo.lock As in `lock_raw_ref()` after creating the lock, we also check for D/F conflicts. This can occur in case-insensitive filesystems when trying to fetch case-conflicted references like: refs/heads/Foo/new refs/heads/foo D/F conflicts can also occur in case-sensitive filesystems, when the repository already contains a directory with a lock file 'refs/heads/foo/bar.lock' and trying to fetch 'refs/heads/foo'. This doesn't concern directories containing garbage files as those are handled on a higher level. To fix this, simply categorize the error as a name conflict. Also remove this reference from the list of valid refnames for availability checks. By categorizing the error and removing it from the list of valid references, batched updates now knows to reject such reference updates and apply the other reference updates. Fix a small typo in `ref_transaction_maybe_set_rejected()` while here. Helped-by: Junio C Hamano <[email protected]> Signed-off-by: Karthik Nayak <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
As mentioned here, I consider this a blocker to promote any v2.51.* release to become the |
git@948b2ab finally landed in the |
I realized that we'll actually need a new Git for Windows v2.51 version because of this bug (and a couple others), therefore I opened the corresponding PR there: git-for-windows#5839 |
Hrm. "t5799.13 curl-error: no server" keeps failing for me, even if I cannot reproduce locally... Strange. In any case, let's close this in favor of a new Git for Windows release that we'll then merge into Microsoft Git. |
This happens again in #799, but now I was clever enough to try to reproduce with Windows instead of Linux 😁. Will continue over there. |
With Git 2.51 we moved 'git-fetch(1)' and 'git-receive-pack(1)' to use batched updates while doing reference updates. This provided a nice perf boost since both commands will now use a single transaction for reference updates. This removes the overhead of using individual transaction per reference update and also avoids unnecessary auto-compaction between reference updates in the reftable backend.
However, in the files-backend it does introduce a few bugs around conflicts. The reported bug was around case-insensitive filesystems [1], but we also fix some adjacent issues:
When fetching references such as:
Earlier we would simply overwrite the first reference with the second and continue. Since Git 2.51 we simply abort stating a conflict.
This is resolved in the first commit by explicitly categorizing the error as non-GENERIC. This allows batched updates to reject the
particular update, while updating the rest.
When fetching references and a lock for a particular reference already exits. We treat this is a GENERIC error, which fails the entire update. By categorizing this error as non-GENERIC, we can reject this specific update and update the other references.
When fetching references such as with F/D conflict:
Earlier we would apply the first, while the second would fail due to conflict. Since Git 2.51, the lock files for both would be created, but the 'commit' phase would abruptly end leaving the lock files.
The second commit fixes this by ensuring that on case-insensitive filesystems we lowercase the refnames for availability check to ensure F/D are caught and reported to the user.
The creation of the second reference's lock in
lock_raw_ref()
catches the D/F conflict, but we mark this as a GENERIC error. By categorizing this as non-GENERIC, we can allow the updates to continue while rejecting this specific error.This also applies to D/F conflicts in case-sensitive systems where there exists a lock in the repo 'refs/heads/foo/bar.lock' causing a conflict while fetching 'refs/heads/foo'.
This PR corresponds to https://lore.kernel.org/git/20250917-587-git-fetch-1-fails-fetches-on-case-insensitive-repositories-v4-0-da3c74a08ed0@gmail.com/.