Skip to content

Conversation

calebdw
Copy link

@calebdw calebdw commented Oct 6, 2024

Hello,

This patch series modifies Git's handling of worktree linking to use
relative paths instead of absolute paths. The motivation behind this
change is to enhance the robustness of worktree links when the main
repository is moved, or when used in environments like containerized
development where absolute paths differ across systems.

Currently, Git stores absolute paths to both the main repository and
the linked worktrees. This causes issues when the repository is moved
because the hardcoded paths become invalid, leading to broken worktree
links. Developers are then required to manually repair the links by
running git worktree repair, which can be cumbersome.

By switching to relative paths for worktrees, this patch improves the
resilience of worktree links. For self-contained repositories (e.g.,
bare repositories with worktrees), the links will continue to function
properly when the repository is moved or mapped inside a containerized
environment. While relative paths do not completely eliminate the need
for repairs (as links external to the repository can still break), the
likelihood is reduced, and Git continues to provide mechanisms to repair
broken links when needed.

I have included tests to verify that:

  • worktree links are created with relative paths.
  • moving the repository does not break worktree links.

Note that absolute paths are still supported, and the code handles both
types of paths. There should be no breaking changes introduced with this
patch. I considered adding a configuration option
(e.g., worktree.useRelativePaths) to control path type, but decided to
keep it simple. However, if there is interest, I can add this feature.

This series is based on top of 111e864.

Thanks!

Caleb

Copy link

gitgitgadget bot commented Oct 6, 2024

Welcome to GitGitGadget

Hi @calebdw, and welcome to GitGitGadget, the GitHub App to send patch series to the Git mailing list from GitHub Pull Requests.

Please make sure that either:

  • Your Pull Request has a good description, if it consists of multiple commits, as it will be used as cover letter.
  • Your Pull Request description is empty, if it consists of a single commit, as the commit message should be descriptive enough by itself.

You can CC potential reviewers by adding a footer to the PR description with the following syntax:

CC: Revi Ewer <[email protected]>, Ill Takalook <[email protected]>

NOTE: DO NOT copy/paste your CC list from a previous GGG PR's description,
because it will result in a malformed CC list on the mailing list. See
example.

Also, it is a good idea to review the commit messages one last time, as the Git project expects them in a quite specific form:

  • the lines should not exceed 76 columns,
  • the first line should be like a header and typically start with a prefix like "tests:" or "revisions:" to state which subsystem the change is about, and
  • the commit messages' body should be describing the "why?" of the change.
  • Finally, the commit messages should end in a Signed-off-by: line matching the commits' author.

It is in general a good idea to await the automated test ("Checks") in this Pull Request before contributing the patches, e.g. to avoid trivial issues such as unportable code.

Contributing the patches

Before you can contribute the patches, your GitHub username needs to be added to the list of permitted users. Any already-permitted user can do that, by adding a comment to your PR of the form /allow. A good way to find other contributors is to locate recent pull requests where someone has been /allowed:

Both the person who commented /allow and the PR author are able to /allow you.

An alternative is the channel #git-devel on the Libera Chat IRC network:

<newcontributor> I've just created my first PR, could someone please /allow me? https://github.com/gitgitgadget/git/pull/12345
<veteran> newcontributor: it is done
<newcontributor> thanks!

Once on the list of permitted usernames, you can contribute the patches to the Git mailing list by adding a PR comment /submit.

If you want to see what email(s) would be sent for a /submit request, add a PR comment /preview to have the email(s) sent to you. You must have a public GitHub email address for this. Note that any reviewers CC'd via the list in the PR description will not actually be sent emails.

After you submit, GitGitGadget will respond with another comment that contains the link to the cover letter mail in the Git mailing list archive. Please make sure to monitor the discussion in that thread and to address comments and suggestions (while the comments and suggestions will be mirrored into the PR by GitGitGadget, you will still want to reply via mail).

If you do not want to subscribe to the Git mailing list just to be able to respond to a mail, you can download the mbox from the Git mailing list archive (click the (raw) link), then import it into your mail program. If you use GMail, you can do this via:

curl -g --user "<EMailAddress>:<Password>" \
    --url "imaps://imap.gmail.com/INBOX" -T /path/to/raw.txt

To iterate on your change, i.e. send a revised patch or patch series, you will first want to (force-)push to the same branch. You probably also want to modify your Pull Request description (or title). It is a good idea to summarize the revision by adding something like this to the cover letter (read: by editing the first comment on the PR, i.e. the PR description):

Changes since v1:
- Fixed a typo in the commit message (found by ...)
- Added a code comment to ... as suggested by ...
...

To send a new iteration, just add another PR comment with the contents: /submit.

Need help?

New contributors who want advice are encouraged to join [email protected], where volunteers who regularly contribute to Git are willing to answer newbie questions, give advice, or otherwise provide mentoring to interested contributors. You must join in order to post or view messages, but anyone can join.

You may also be able to find help in real time in the developer IRC channel, #git-devel on Libera Chat. Remember that IRC does not support offline messaging, so if you send someone a private message and log out, they cannot respond to you. The scrollback of #git-devel is archived, though.

@gitgitgadget gitgitgadget bot added the new user label Oct 6, 2024
@sunshineco
Copy link

/allow

Copy link

gitgitgadget bot commented Oct 6, 2024

User calebdw is now allowed to use GitGitGadget.

@calebdw calebdw force-pushed the wt_relative_paths branch from 64638e3 to a2be943 Compare October 7, 2024 23:00
For each linked worktree, Git maintains two pointers: (1)
<repo>/worktrees/<id>/gitdir which points at the linked worktree, and
(2) <worktree>/.git which points back at <repo>/worktrees/<id>. Both
pointers are absolute pathnames.

Aside from manually manipulating those raw files, it is possible to
easily "break" one or both pointers by ignoring the "git worktree move"
command and instead manually moving a linked worktree, moving the
repository, or moving both. The "git worktree repair" command was
invented to handle this case by restoring these pointers to sane values.

For the "repair" command, the "git worktree" manual page states:

  Repair worktree administrative files, if possible, if they have
  become corrupted or outdated due to external factors.

The "if possible" clause was chosen deliberately to convey that the
existing implementation may not be able to fix every possible breakage,
and to imply that improvements may be made to handle other types of
breakage.

A recent problem report[*] illustrates a case in which "git worktree
repair" not only fails to fix breakage, but actually causes breakage.
Specifically, if a repository / main-worktree and linked worktrees are
*copied* as a unit (rather than *moved*), then "git worktree repair" run
in the copy leaves the copy untouched but botches the pointers in the
original repository and the original worktrees.

For instance, given this directory structure:

  orig/
    main/ (main-worktree)
    linked/ (linked worktree)

if "orig" is copied (not moved) to "dup", then immediately after the
manual copy operation:

  * orig/main/.git/worktrees/linked/gitdir points at orig/linked/.git
  * orig/linked/.git points at orig/main/.git/worktrees/linked
  * dup/main/.git/worktrees/linked/gitdir points at orig/linked/.git
  * dup/linked/.git points at orig/main/.git/worktrees/linked

So, dup/main thinks its linked worktree is orig/linked, and worktree
dup/linked thinks its repository / main-worktree is orig/main.

"git worktree repair" is reasonably simple-minded; it wants to trust
valid-looking pointers, hence doesn't try to second-guess them. In this
case, when validating dup/linked/.git, it finds a legitimate repository
pointer, orig/main/.git/worktrees/linked, thus trusts that is correct,
but does notice that gitdir in that directory doesn't point at
dup/linked/.git, so it (incorrectly) _fixes_
orig/main/.git/worktrees/linked/gitdir to point at dup/linked/.git.
Similarly, when validating dup/main/.git/worktrees/linked/gitdir, it
finds a legitimate worktree pointer, orig/linked/.git, but notices that
its .git file doesn't point back at dup/main, thus (incorrectly) _fixes_
orig/linked/.git to point at dup/main/.git/worktrees/linked. Hence, it
has modified and broken the linkage between orig/main and orig/linked
rather than fixing dup/main and dup/linked as expected.

Fix this problem by also checking if a plausible .git/worktrees/<id>
exists in the *current* repository -- not just in the repository pointed
at by the worktree's .git file -- and comparing whether they are the
same. If not, then it is likely because the repository / main-worktree
and linked worktrees were copied, so prefer the discovered plausible
pointer rather than the one from the existing .git file.

[*]: https://lore.kernel.org/git/[email protected]/

Reported-by: Russell Stuart <[email protected]>
Signed-off-by: Eric Sunshine <[email protected]>
@calebdw calebdw force-pushed the wt_relative_paths branch from a2be943 to a3ea662 Compare October 8, 2024 02:18
Git currently stores absolute paths to both the main repository and
linked worktrees. However, this causes problems when moving repositories
or working in containerized environments where absolute paths differ
between systems. The worktree links break, and users are required to
manually execute `worktree repair` to repair them, leading to workflow
disruptions. Additionally, mapping repositories inside of containerized
environments renders the repository unusable inside the containers, and
this is not repairable as repairing the worktrees inside the containers
will result in them being broken outside the containers.

To address this, this patch series makes Git always write relative paths
when linking worktrees. Relative paths increase the resilience of the
worktree links across various systems and environments, particularly
when the worktrees are self-contained inside the main repository (such
as when using a bare repository with worktrees). This improves
portability, workflow efficiency, and reduces overall breakages.

Although Git now writes relative paths, existing repositories with
absolute paths are still supported. There are no breaking changes
to workflows based on absolute paths, ensuring backward compatibility.

Signed-off-by: Caleb White <[email protected]>

---
Changes in v3:
- Squashed patch [3/4] into patch [2/4] to streamline changes.
- Dropped patch [4/4] as it was no longer necessary.
- Rebased onto [email protected]
- Updated `infer_backlink()` to return 1 on success for consistency.
- Swapped the order of `infer_backlink()` arguments for clarity.
- Clear `inferred` if error occurs in `infer_backlink()`.
- Renamed `git_contents` to `dotgit_contents` for clearer semantics.
- Cleaned up `dotgit_contents` logic in `repair_worktree_at_path()`.
- Replaced multiple `xstrfmt/free` calls with a single `strbuf`.
- Added a test case covering a failure noted in a separate patch series.
- Improved commit messages.
- Link to v2: https://lore.kernel.org/r/[email protected]
Changes in v2:
- No changes, just a resubmission
- Link to v1: https://lore.kernel.org/git/[email protected]/

Range-diff against v2:
1:  735a903 ! 1:  b5060db worktree: refactor infer_backlink() to use *strbuf
    @@ Metadata
      ## Commit message ##
         worktree: refactor infer_backlink() to use *strbuf
     
    -    This refactors the `infer_backlink` function to return an integer
    -    result and use a pre-allocated `strbuf` for the inferred backlink
    -    path, replacing the previous `char*` return type.
    +    This lays the groundwork for the next patch, which needs the backlink
    +    returned from infer_backlink() as a `strbuf`. It seemed inefficient to
    +    convert from `strbuf` to `char*` and back to `strbuf` again.
     
    -    This lays the groundwork for the next patch, which needs the
    -    resultant backlink as a `strbuf`. There was no need to go from
    -    `strbuf -> char* -> strbuf` again. This change also aligns the
    -    function's signature with other `strbuf`-based functions.
    +    This refactors infer_backlink() to return an integer result and use a
    +    pre-allocated `strbuf` for the inferred backlink path, replacing the
    +    previous `char*` return type and improving efficiency.
     
         Signed-off-by: Caleb White <[email protected]>
     
    @@ worktree.c: static int is_main_worktree_path(const char *path)
       * extracting the <id>, and checking if <repo>/worktrees/<id> exists.
       */
     -static char *infer_backlink(const char *gitfile)
    -+static int infer_backlink(struct strbuf *inferred, const char *gitfile)
    ++static int infer_backlink(const char *gitfile, struct strbuf *inferred)
      {
        struct strbuf actual = STRBUF_INIT;
     -	struct strbuf inferred = STRBUF_INIT;
    @@ worktree.c: static char *infer_backlink(const char *gitfile)
                goto error;
     -	strbuf_git_common_path(&inferred, the_repository, "worktrees/%s", id);
     -	if (!is_directory(inferred.buf))
    ++	strbuf_reset(inferred);
     +	strbuf_git_common_path(inferred, the_repository, "worktrees/%s", id);
     +	if (!is_directory(inferred->buf))
                goto error;
      
        strbuf_release(&actual);
     -	return strbuf_detach(&inferred, NULL);
    -+	return 0;
    ++	return 1;
      
      error:
        strbuf_release(&actual);
     -	strbuf_release(&inferred);
     -	return NULL;
    -+	return 1;
    ++	strbuf_reset(inferred); /* clear invalid path */
    ++	return 0;
      }
      
      /*
    @@ worktree.c: void repair_worktree_at_path(const char *path,
        struct strbuf dotgit = STRBUF_INIT;
        struct strbuf realdotgit = STRBUF_INIT;
     +	struct strbuf backlink = STRBUF_INIT;
    ++	struct strbuf inferred_backlink = STRBUF_INIT;
        struct strbuf gitdir = STRBUF_INIT;
        struct strbuf olddotgit = STRBUF_INIT;
     -	char *backlink = NULL;
    -+	char *git_contents = NULL;
    +-	char *inferred_backlink = NULL;
    ++	char *dotgit_contents = NULL;
        const char *repair = NULL;
        int err;
      
    @@ worktree.c: void repair_worktree_at_path(const char *path,
                goto done;
        }
      
    +-	inferred_backlink = infer_backlink(realdotgit.buf);
     -	backlink = xstrdup_or_null(read_gitfile_gently(realdotgit.buf, &err));
    -+	git_contents = xstrdup_or_null(read_gitfile_gently(realdotgit.buf, &err));
    - 	if (err == READ_GITFILE_ERR_NOT_A_FILE) {
    +-	if (err == READ_GITFILE_ERR_NOT_A_FILE) {
    ++	infer_backlink(realdotgit.buf, &inferred_backlink);
    ++	dotgit_contents = xstrdup_or_null(read_gitfile_gently(realdotgit.buf, &err));
    ++	if (dotgit_contents) {
    ++		strbuf_addstr(&backlink, dotgit_contents);
    ++	} else if (err == READ_GITFILE_ERR_NOT_A_FILE) {
                fn(1, realdotgit.buf, _("unable to locate repository; .git is not a file"), cb_data);
                goto done;
        } else if (err == READ_GITFILE_ERR_NOT_A_REPO) {
    --		if (!(backlink = infer_backlink(realdotgit.buf))) {
    -+		if (infer_backlink(&backlink, realdotgit.buf)) {
    +-		if (inferred_backlink) {
    ++		if (inferred_backlink.len) {
    + 			/*
    + 			 * Worktree's .git file does not point at a repository
    + 			 * but we found a .git/worktrees/<id> in this
    + 			 * repository with the same <id> as recorded in the
    + 			 * worktree's .git file so make the worktree point at
    +-			 * the discovered .git/worktrees/<id>. (Note: backlink
    +-			 * is already NULL, so no need to free it first.)
    ++			 * the discovered .git/worktrees/<id>.
    + 			 */
    +-			backlink = inferred_backlink;
    +-			inferred_backlink = NULL;
    ++			strbuf_swap(&backlink, &inferred_backlink);
    + 		} else {
                        fn(1, realdotgit.buf, _("unable to locate repository; .git file does not reference a repository"), cb_data);
                        goto done;
    - 		}
    - 	} else if (err) {
    - 		fn(1, realdotgit.buf, _("unable to locate repository; .git file broken"), cb_data);
    - 		goto done;
    -+	} else if (git_contents) {
    -+		strbuf_addstr(&backlink, git_contents);
    +@@ worktree.c: void repair_worktree_at_path(const char *path,
    + 	 * in the "copy" repository. In this case, point the "copy" worktree's
    + 	 * .git file at the "copy" repository.
    + 	 */
    +-	if (inferred_backlink && fspathcmp(backlink, inferred_backlink)) {
    +-		free(backlink);
    +-		backlink = inferred_backlink;
    +-		inferred_backlink = NULL;
    ++	if (inferred_backlink.len && fspathcmp(backlink.buf, inferred_backlink.buf)) {
    ++		strbuf_swap(&backlink, &inferred_backlink);
        }
      
     -	strbuf_addf(&gitdir, "%s/gitdir", backlink);
    @@ worktree.c: void repair_worktree_at_path(const char *path,
        }
      done:
     -	free(backlink);
    -+	free(git_contents);
    +-	free(inferred_backlink);
    ++	free(dotgit_contents);
        strbuf_release(&olddotgit);
     +	strbuf_release(&backlink);
    ++	strbuf_release(&inferred_backlink);
        strbuf_release(&gitdir);
        strbuf_release(&realdotgit);
        strbuf_release(&dotgit);
2:  f128dfa ! 2:  017f199 worktree: link worktrees with relative paths
    @@ Metadata
      ## Commit message ##
         worktree: link worktrees with relative paths
     
    -    This modifies Git’s handling of worktree linking to use relative
    -    paths instead of absolute paths. Previously, when creating a worktree,
    -    Git would store the absolute paths to both the main repository and the
    -    linked worktrees. These hardcoded absolute paths cause breakages such
    -    as when the repository is moved to a different directory or during
    -    containerized development where the absolute differs between systems.
    +    Git currently stores absolute paths to both the main repository and
    +    linked worktrees. However, this causes problems when moving repositories
    +    or working in containerized environments where absolute paths differ
    +    between systems. The worktree links break, and users are required to
    +    manually execute `worktree repair` to repair them, leading to workflow
    +    disruptions. Additionally, mapping repositories inside of containerized
    +    environments renders the repository unusable inside the containers, and
    +    this is not repairable as repairing the worktrees inside the containers
    +    will result in them being broken outside the containers.
     
    -    By switching to relative paths, we help ensure that worktree links are
    -    more resilient when the repository is moved. While links external to the
    -    repository may still break, Git still automatically handles many common
    -    scenarios, reducing the need for manual repair. This is particularly
    -    useful in containerized or portable development environments, where the
    -    absolute path to the repository can differ between systems. Developers
    -    no longer need to reinitialize or repair worktrees after relocating the
    -    repository, improving workflow efficiency and reducing breakages.
    +    To address this, this patch makes Git always write relative paths when
    +    linking worktrees. Relative paths increase the resilience of the
    +    worktree links across various systems and environments, particularly
    +    when the worktrees are self-contained inside the main repository (such
    +    as when using a bare repository with worktrees). This improves
    +    portability, workflow efficiency, and reduces overall breakages.
     
    -    For self-contained repositories (such as using a bare repository with
    -    worktrees), where both the repository and its worktrees are located
    -    within the same directory structure, using relative paths guarantees all
    -    links remain functional regardless of where the directory is located.
    +    Although Git now writes relative paths, existing repositories with
    +    absolute paths are still supported. There are no breaking changes
    +    to workflows based on absolute paths, ensuring backward compatibility.
    +
    +    At a low level, the changes involve modifying functions in `worktree.c`
    +    and `builtin/worktree.c` to use `relative_path()` when writing the
    +    worktree’s `.git` file and the main repository’s `gitdir` reference.
    +    Instead of hardcoding absolute paths, Git now computes the relative path
    +    between the worktree and the repository, ensuring that these links are
    +    portable. Locations where these respective file are read have also been
    +    updated to properly handle both absolute and relative paths. Generally,
    +    relative paths are always resolved into absolute paths before any
    +    operations or comparisons are performed.
    +
    +    Additionally, `repair_worktrees_after_gitdir_move()` has been introduced
    +    to address the case where both the `<worktree>/.git` and
    +    `<repo>/worktrees/<id>/gitdir` links are broken after the gitdir is
    +    moved (such as during a re-initialization). This function repairs both
    +    sides of the worktree link using the old gitdir path to reestablish the
    +    correct paths after a move.
    +
    +    The `worktree.path` struct member has also been updated to always store
    +    the absolute path of a worktree. This ensures that worktree consumers
    +    never have to worry about trying to resolve the absolute path themselves.
     
         Signed-off-by: Caleb White <[email protected]>
     
    @@ builtin/worktree.c: static int add_worktree(const char *path, const char *refnam
     +	strbuf_realpath(&sb_path_realpath, path, 1);
     +	strbuf_realpath(&sb_repo_realpath, sb_repo.buf, 1);
     +	write_file(sb.buf, "%s/.git", relative_path(sb_path_realpath.buf, sb_repo_realpath.buf, &sb_tmp));
    -+	strbuf_reset(&sb_tmp);
     +	write_file(sb_git.buf, "gitdir: %s", relative_path(sb_repo_realpath.buf, sb_path_realpath.buf, &sb_tmp));
        strbuf_reset(&sb);
        strbuf_addf(&sb, "%s/commondir", sb_repo.buf);
    @@ builtin/worktree.c: static int add_worktree(const char *path, const char *refnam
        return ret;
      }
     
    + ## setup.c ##
    +@@ setup.c: static void separate_git_dir(const char *git_dir, const char *git_link)
    + 
    + 		if (rename(src, git_dir))
    + 			die_errno(_("unable to move %s to %s"), src, git_dir);
    +-		repair_worktrees(NULL, NULL);
    ++		repair_worktrees_after_gitdir_move(src);
    + 	}
    + 
    + 	write_file(git_link, "gitdir: %s", git_dir);
    +
      ## t/t2408-worktree-relative.sh (new) ##
     @@
     +#!/bin/sh
    @@ worktree.c: int validate_worktree(const struct worktree *wt, struct strbuf *errm
      {
        struct strbuf path = STRBUF_INIT;
     +	struct strbuf repo = STRBUF_INIT;
    ++	struct strbuf file = STRBUF_INIT;
     +	struct strbuf tmp = STRBUF_INIT;
    -+	char *file = NULL;
      
        if (is_main_worktree(wt))
                BUG("can't relocate main worktree");
    @@ worktree.c: int validate_worktree(const struct worktree *wt, struct strbuf *errm
        if (fspathcmp(wt->path, path.buf)) {
     -		write_file(git_common_path("worktrees/%s/gitdir", wt->id),
     -			   "%s/.git", path.buf);
    -+		file = xstrfmt("%s/gitdir", repo.buf);
    -+		write_file(file, "%s/.git", relative_path(path.buf, repo.buf, &tmp));
    -+		free(file);
    -+		strbuf_reset(&tmp);
    -+		file = xstrfmt("%s/.git", path.buf);
    -+		write_file(file, "gitdir: %s", relative_path(repo.buf, path.buf, &tmp));
    ++		strbuf_addf(&file, "%s/gitdir", repo.buf);
    ++		write_file(file.buf, "%s/.git", relative_path(path.buf, repo.buf, &tmp));
    ++		strbuf_reset(&file);
    ++		strbuf_addf(&file, "%s/.git", path.buf);
    ++		write_file(file.buf, "gitdir: %s", relative_path(repo.buf, path.buf, &tmp));
     +
                free(wt->path);
                wt->path = strbuf_detach(&path, NULL);
        }
    -+	free(file);
        strbuf_release(&path);
     +	strbuf_release(&repo);
    ++	strbuf_release(&file);
     +	strbuf_release(&tmp);
      }
      
    @@ worktree.c: static void repair_gitfile(struct worktree *wt,
     -	char *backlink;
     +	struct strbuf backlink = STRBUF_INIT;
     +	struct strbuf tmp = STRBUF_INIT;
    -+	char *git_contents = NULL;
    ++	char *dotgit_contents = NULL;
        const char *repair = NULL;
        int err;
      
    @@ worktree.c: static void repair_gitfile(struct worktree *wt,
        strbuf_realpath(&repo, git_common_path("worktrees/%s", wt->id), 1);
        strbuf_addf(&dotgit, "%s/.git", wt->path);
     -	backlink = xstrdup_or_null(read_gitfile_gently(dotgit.buf, &err));
    -+	git_contents = xstrdup_or_null(read_gitfile_gently(dotgit.buf, &err));
    ++	dotgit_contents = xstrdup_or_null(read_gitfile_gently(dotgit.buf, &err));
     +
    -+	if (git_contents) {
    -+		if (is_absolute_path(git_contents)) {
    -+			strbuf_addstr(&backlink, git_contents);
    ++	if (dotgit_contents) {
    ++		if (is_absolute_path(dotgit_contents)) {
    ++			strbuf_addstr(&backlink, dotgit_contents);
     +		} else {
    -+			strbuf_addf(&backlink, "%s/%s", wt->path, git_contents);
    ++			strbuf_addf(&backlink, "%s/%s", wt->path, dotgit_contents);
     +			strbuf_realpath_forgiving(&backlink, backlink.buf, 0);
     +		}
     +	}
    @@ worktree.c: static void repair_gitfile(struct worktree *wt,
      
     -	free(backlink);
     +done:
    -+	free(git_contents);
    ++	free(dotgit_contents);
        strbuf_release(&repo);
        strbuf_release(&dotgit);
     +	strbuf_release(&backlink);
    @@ worktree.c: static void repair_gitfile(struct worktree *wt,
      }
      
      static void repair_noop(int iserr UNUSED,
    +@@ worktree.c: void repair_worktrees(worktree_repair_fn fn, void *cb_data)
    + 	free_worktrees(worktrees);
    + }
    + 
    ++void repair_worktree_after_gitdir_move(struct worktree *wt, const char *old_path)
    ++{
    ++	struct strbuf path = STRBUF_INIT;
    ++	struct strbuf repo = STRBUF_INIT;
    ++	struct strbuf gitdir = STRBUF_INIT;
    ++	struct strbuf dotgit = STRBUF_INIT;
    ++	struct strbuf olddotgit = STRBUF_INIT;
    ++	struct strbuf tmp = STRBUF_INIT;
    ++
    ++	if (is_main_worktree(wt))
    ++		goto done;
    ++
    ++	strbuf_realpath(&repo, git_common_path("worktrees/%s", wt->id), 1);
    ++	strbuf_addf(&gitdir, "%s/gitdir", repo.buf);
    ++
    ++	if (strbuf_read_file(&olddotgit, gitdir.buf, 0) < 0)
    ++		goto done;
    ++
    ++	strbuf_rtrim(&olddotgit);
    ++	if (is_absolute_path(olddotgit.buf)) {
    ++		strbuf_addbuf(&dotgit, &olddotgit);
    ++	} else {
    ++		strbuf_addf(&dotgit, "%s/worktrees/%s/%s", old_path, wt->id, olddotgit.buf);
    ++		strbuf_realpath_forgiving(&dotgit, dotgit.buf, 0);
    ++	}
    ++
    ++	if (!file_exists(dotgit.buf))
    ++		goto done;
    ++
    ++	strbuf_addbuf(&path, &dotgit);
    ++	strbuf_strip_suffix(&path, "/.git");
    ++
    ++	write_file(dotgit.buf, "gitdir: %s", relative_path(repo.buf, path.buf, &tmp));
    ++	write_file(gitdir.buf, "%s", relative_path(dotgit.buf, repo.buf, &tmp));
    ++done:
    ++	strbuf_release(&path);
    ++	strbuf_release(&repo);
    ++	strbuf_release(&gitdir);
    ++	strbuf_release(&dotgit);
    ++	strbuf_release(&olddotgit);
    ++	strbuf_release(&tmp);
    ++}
    ++
    ++void repair_worktrees_after_gitdir_move(const char *old_path)
    ++{
    ++	struct worktree **worktrees = get_worktrees_internal(1);
    ++	struct worktree **wt = worktrees + 1; /* +1 skips main worktree */
    ++
    ++	for (; *wt; wt++)
    ++		repair_worktree_after_gitdir_move(*wt, old_path);
    ++	free_worktrees(worktrees);
    ++}
    ++
    + static int is_main_worktree_path(const char *path)
    + {
    + 	struct strbuf target = STRBUF_INIT;
     @@ worktree.c: void repair_worktree_at_path(const char *path,
    - 	struct strbuf backlink = STRBUF_INIT;
    + 	struct strbuf inferred_backlink = STRBUF_INIT;
        struct strbuf gitdir = STRBUF_INIT;
        struct strbuf olddotgit = STRBUF_INIT;
     +	struct strbuf realolddotgit = STRBUF_INIT;
     +	struct strbuf tmp = STRBUF_INIT;
    - 	char *git_contents = NULL;
    + 	char *dotgit_contents = NULL;
        const char *repair = NULL;
        int err;
     @@ worktree.c: void repair_worktree_at_path(const char *path,
    - 		fn(1, realdotgit.buf, _("unable to locate repository; .git file broken"), cb_data);
    - 		goto done;
    - 	} else if (git_contents) {
    --		strbuf_addstr(&backlink, git_contents);
    -+		if (is_absolute_path(git_contents)) {
    -+			strbuf_addstr(&backlink, git_contents);
    + 	}
    + 
    + 	infer_backlink(realdotgit.buf, &inferred_backlink);
    ++	strbuf_realpath_forgiving(&inferred_backlink, inferred_backlink.buf, 0);
    + 	dotgit_contents = xstrdup_or_null(read_gitfile_gently(realdotgit.buf, &err));
    + 	if (dotgit_contents) {
    +-		strbuf_addstr(&backlink, dotgit_contents);
    ++		if (is_absolute_path(dotgit_contents)) {
    ++			strbuf_addstr(&backlink, dotgit_contents);
     +		} else {
     +			strbuf_addbuf(&backlink, &realdotgit);
     +			strbuf_strip_suffix(&backlink, ".git");
    -+			strbuf_addstr(&backlink, git_contents);
    ++			strbuf_addstr(&backlink, dotgit_contents);
    ++			strbuf_realpath_forgiving(&backlink, backlink.buf, 0);
     +		}
    + 	} else if (err == READ_GITFILE_ERR_NOT_A_FILE) {
    + 		fn(1, realdotgit.buf, _("unable to locate repository; .git is not a file"), cb_data);
    + 		goto done;
    +@@ worktree.c: void repair_worktree_at_path(const char *path,
    + 			fn(1, realdotgit.buf, _("unable to locate repository; .git file does not reference a repository"), cb_data);
    + 			goto done;
    + 		}
    +-	} else if (err) {
    ++	} else {
    + 		fn(1, realdotgit.buf, _("unable to locate repository; .git file broken"), cb_data);
    + 		goto done;
        }
    - 
    -+	strbuf_realpath_forgiving(&backlink, backlink.buf, 0);
    - 	strbuf_addf(&gitdir, "%s/gitdir", backlink.buf);
    - 	if (strbuf_read_file(&olddotgit, gitdir.buf, 0) < 0)
    +@@ worktree.c: void repair_worktree_at_path(const char *path,
                repair = _("gitdir unreadable");
        else {
                strbuf_rtrim(&olddotgit);
    @@ worktree.c: void repair_worktree_at_path(const char *path,
     +		write_file(gitdir.buf, "%s", relative_path(realdotgit.buf, backlink.buf, &tmp));
        }
      done:
    - 	free(git_contents);
    + 	free(dotgit_contents);
        strbuf_release(&olddotgit);
     +	strbuf_release(&realolddotgit);
        strbuf_release(&backlink);
    + 	strbuf_release(&inferred_backlink);
        strbuf_release(&gitdir);
        strbuf_release(&realdotgit);
        strbuf_release(&dotgit);
    @@ worktree.c: void repair_worktree_at_path(const char *path,
     +	struct strbuf dotgit = STRBUF_INIT;
     +	struct strbuf gitdir = STRBUF_INIT;
     +	struct strbuf repo = STRBUF_INIT;
    ++	struct strbuf file = STRBUF_INIT;
     +	char *path = NULL;
    -+	char *file = NULL;
     +	int rc = 0;
        int fd;
        size_t len;
    @@ worktree.c: void repair_worktree_at_path(const char *path,
     -	if (file_exists(git_path("worktrees/%s/locked", id)))
     -		return 0;
     -	if (stat(git_path("worktrees/%s/gitdir", id), &st)) {
    -+	file = xstrfmt("%s/locked", repo.buf);
    -+	if (file_exists(file)) {
    ++	strbuf_addf(&file, "%s/locked", repo.buf);
    ++	if (file_exists(file.buf)) {
     +		goto done;
     +	}
     +	if (stat(gitdir.buf, &st)) {
    @@ worktree.c: void repair_worktree_at_path(const char *path,
     +		strbuf_realpath_forgiving(&dotgit, dotgit.buf, 0);
     +	}
     +	if (!file_exists(dotgit.buf)) {
    -+		free(file);
    -+		file = xstrfmt("%s/index", repo.buf);
    -+		if (stat(file, &st) || st.st_mtime <= expire) {
    ++		strbuf_reset(&file);
    ++		strbuf_addf(&file, "%s/index", repo.buf);
    ++		if (stat(file.buf, &st) || st.st_mtime <= expire) {
                        strbuf_addstr(reason, _("gitdir file points to non-existent location"));
     -			free(path);
     -			return 1;
    @@ worktree.c: void repair_worktree_at_path(const char *path,
     +	*wtpath = strbuf_detach(&dotgit, NULL);
     +done:
     +	free(path);
    -+	free(file);
     +	strbuf_release(&dotgit);
     +	strbuf_release(&gitdir);
     +	strbuf_release(&repo);
    ++	strbuf_release(&file);
     +	return rc;
      }
      
      static int move_config_setting(const char *key, const char *value,
    +
    + ## worktree.h ##
    +@@ worktree.h: typedef void (* worktree_repair_fn)(int iserr, const char *path,
    +  */
    + void repair_worktrees(worktree_repair_fn, void *cb_data);
    + 
    ++/*
    ++ * Repair the linked worktrees after the gitdir has been moved.
    ++ */
    ++void repair_worktrees_after_gitdir_move(const char *old_path);
    ++
    ++/*
    ++ * Repair the linked worktree after the gitdir has been moved.
    ++ */
    ++void repair_worktree_after_gitdir_move(struct worktree *wt, const char *old_path);
    ++
    + /*
    +  * Repair administrative files corresponding to the worktree at the given path.
    +  * The worktree's .git file pointing at the repository must be intact for the
3:  2347aff < -:  ---------- worktree: sync worktree paths after gitdir move
4:  64638e3 < -:  ---------- worktree: prevent null pointer dereference
-:  ---------- > 3:  a3ea662 worktree: add test for path handling in linked worktrees

--- b4-submit-tracking ---
# This section is used internally by b4 prep for tracking purposes.
{
  "series": {
    "revision": 3,
    "change-id": "20241007-wt_relative_paths-88f9cf5a070c",
    "prefixes": [],
    "history": {
      "v2": [
        "[email protected]"
      ]
    },
    "prerequisites": [
      "message-id: <[email protected]>",
      "base-commit: v2.47.0"
    ]
  }
}
This lays the groundwork for the next patch, which needs the backlink
returned from infer_backlink() as a `strbuf`. It seemed inefficient to
convert from `strbuf` to `char*` and back to `strbuf` again.

This refactors infer_backlink() to return an integer result and use a
pre-allocated `strbuf` for the inferred backlink path, replacing the
previous `char*` return type and improving efficiency.

Signed-off-by: Caleb White <[email protected]>
Git currently stores absolute paths to both the main repository and
linked worktrees. However, this causes problems when moving repositories
or working in containerized environments where absolute paths differ
between systems. The worktree links break, and users are required to
manually execute `worktree repair` to repair them, leading to workflow
disruptions. Additionally, mapping repositories inside of containerized
environments renders the repository unusable inside the containers, and
this is not repairable as repairing the worktrees inside the containers
will result in them being broken outside the containers.

To address this, this patch makes Git always write relative paths when
linking worktrees. Relative paths increase the resilience of the
worktree links across various systems and environments, particularly
when the worktrees are self-contained inside the main repository (such
as when using a bare repository with worktrees). This improves
portability, workflow efficiency, and reduces overall breakages.

Although Git now writes relative paths, existing repositories with
absolute paths are still supported. There are no breaking changes
to workflows based on absolute paths, ensuring backward compatibility.

At a low level, the changes involve modifying functions in `worktree.c`
and `builtin/worktree.c` to use `relative_path()` when writing the
worktree’s `.git` file and the main repository’s `gitdir` reference.
Instead of hardcoding absolute paths, Git now computes the relative path
between the worktree and the repository, ensuring that these links are
portable. Locations where these respective file are read have also been
updated to properly handle both absolute and relative paths. Generally,
relative paths are always resolved into absolute paths before any
operations or comparisons are performed.

Additionally, `repair_worktrees_after_gitdir_move()` has been introduced
to address the case where both the `<worktree>/.git` and
`<repo>/worktrees/<id>/gitdir` links are broken after the gitdir is
moved (such as during a re-initialization). This function repairs both
sides of the worktree link using the old gitdir path to reestablish the
correct paths after a move.

The `worktree.path` struct member has also been updated to always store
the absolute path of a worktree. This ensures that worktree consumers
never have to worry about trying to resolve the absolute path themselves.

Signed-off-by: Caleb White <[email protected]>
A failure scenario reported in an earlier patch series[1] that several
`git worktree` subcommands failed or misbehaved when invoked from within
linked worktrees that used relative paths.

This adds a test that executes a `worktree prune` command inside both an
internally and an externally linked worktree and asserts that the other
worktree was not pruned.

[1]: https://lore.kernel.org/git/CAPig+cQXFy=xPVpoSq6Wq0pxMRCjS=WbkgdO+3LySPX=q0nPCw@mail.gmail.com/

Reported-by: Eric Sunshine <[email protected]>
Signed-off-by: Caleb White <[email protected]>
@calebdw calebdw force-pushed the wt_relative_paths branch from a3ea662 to 3d5a2b0 Compare October 8, 2024 03:10
Copy link

gitgitgadget bot commented Oct 8, 2024

There are issues in commit d79b2aa:
Link worktrees with relative paths
Leading whitespace in sign off: Signed-off-by: Caleb White [email protected]
Leading whitespace in sign off: Signed-off-by: Caleb White [email protected]
Lines in the body of the commit messages should be wrapped between 60 and 76 characters.
Indented lines, and lines without whitespace, are exempt

@calebdw calebdw closed this Oct 8, 2024
@dscho
Copy link
Member

dscho commented Oct 8, 2024

@calebdw how about fixing the issues, amending the commit, force-pushing and then reopening this PR?

@calebdw
Copy link
Author

calebdw commented Oct 8, 2024

@dscho,

I originally submitted v1 and v2 via git format-patch/git send-email. However, there were several reports about patch corruption (likely due to the SMTP server I was using). I was going to try to submitting with GGG, but then I came across b4 and I decided to use that instead---so I submitted v3 via the public web endpoint.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants