Skip to content

Conversation

@edolstra
Copy link
Collaborator

@edolstra edolstra commented Nov 21, 2025

Motivation

Alternative to #275 using git archive.

Before Nix 2.20, we used the git CLI, which applies Git filters (in particular doing end-of-line conversion based on .gitattributes) , export-ignore and export-subst. In 2.20, we switched to libgit2 and stopped doing those things, which is probably better for reproducibility. However, this breaks existing lock files / fetchTree calls for Git inputs that rely on those features, since it invalidates the NAR hash.

So as a backward compatibility hack, we now check the NAR hash computed over the Git tree without filters etc applied. If there is a hash mismatch, we try again with them (using git archive). If that succeeds, we print a warning and return the latter tree.

Context

Summary by CodeRabbit

  • New Features

    • Configurable Git accessor options (export-ignore, LFS, submodules) and a legacy/archive-based Git accessor.
    • New setting to enable Nix 2.19 compatibility.
  • Improvements

    • More consistent fingerprint generation and caching across Git, tarball, GitHub and Mercurial sources (prefixed fingerprints, options-aware fingerprints, legacy suffixes).
    • Preserve directory-query behavior when Nix 2.19 compatibility is enabled.
  • Tests

    • Added functional tests covering export-ignore/LFS, Nix <2.20 compatibility, and NAR-hash mismatch scenarios.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Nov 21, 2025

Walkthrough

Replaces boolean git-accessor flags with a new public GitAccessorOptions and threads it through accessor APIs; updates fingerprint generation/propagation and adds a legacy git-archive fallback (nix-2.19 compatibility toggle); updates call sites, tarball/github fingerprints, tests, and related settings.

Changes

Cohort / File(s) Summary
New options type & header
src/libfetchers/include/nix/fetchers/git-utils.hh
Adds public GitAccessorOptions { exportIgnore, smudgeLfs, submodules } with makeFingerprint(const Hash & rev); replaces bool-based getAccessor signatures with options-based signatures.
Git accessor API & impl
src/libfetchers/git-utils.cc, src/libfetchers/include/nix/fetchers/git-utils.hh
Reworks accessor APIs to accept GitAccessorOptions; stores options in GitSourceAccessor::State; constructor and callers updated to pass options; tests updated to pass {} instead of false.
Fingerprint propagation and fetch logic
src/libfetchers/fetchers.cc
Post-fetch fingerprint handling changed: prefer accessor root fingerprint via getFingerprint(CanonPath::root) and set result.cachedFingerprint; fallback to computing result.getFingerprint(store) when necessary.
Git scheme, legacy/archive flow & fingerprinting
src/libfetchers/git.cc
Adds getGitAccessorOptions(input) and getLegacyGitAccessor(...); threads options through commit/workdir accessor creation; implements a legacy git-archive fallback used when nix219Compat is enabled or NAR-hash comparisons require it; appends ;legacy and ;s suffixes as needed.
Tarball / GitHub call-site & fingerprint changes
src/libfetchers/tarball.cc, src/libfetchers/github.cc
Call sites updated to pass {} (empty options initializer) instead of false; Tarball and GitHub fingerprint strings now prefixed with tarball: / github: respectively.
Submodules, tree hashing, callers
src/libfetchers/git.cc, src/libfetchers/git-utils.cc
Submodule enumeration and treeHash->narHash flows updated to use options-based accessors; submodule accessors created with appropriate exportIgnore/smudgeLfs options and fingerprint suffixing for mounted submodules.
Settings
src/libfetchers/include/nix/fetchers/fetch-settings.hh
Adds Setting<bool> nix219Compat to toggle legacy Nix 2.19 compatibility behavior.
Tests and test helpers
src/libfetchers-tests/git-utils.cc, tests/functional/fetchGit.sh
Unit test updated to pass {}; functional tests added/expanded to exercise export-ignore/export-subst, CRLF vs LF behavior, and NAR-hash compatibility/mismatch (duplicate block present).
FlakeRef URL handling
src/libflake/flakeref.cc
dir query parameter preserved when fetchSettings.nix219Compat is true; previously it was always removed.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Client as Caller / Test
    participant Fetcher as GitInputScheme
    participant Repo as GitRepoImpl
    participant Accessor as GitSourceAccessor
    participant Store as Nix Store

    Client->>Fetcher: fetch(input, expectedNarHash)
    Fetcher->>Repo: getAccessor(rev, GitAccessorOptions{})
    Repo->>Accessor: construct accessor (options stored)
    Accessor->>Accessor: compute/return fingerprint (root or via options)
    Accessor-->>Repo: return accessor (with fingerprint)
    Repo-->>Fetcher: accessor
    Fetcher->>Fetcher: set result.cachedFingerprint (root or computed)
    Fetcher->>Store: fetch via accessor
    alt narHash matches
        Fetcher-->>Client: success
    else narHash mismatch
        Fetcher->>Fetcher: attempt getLegacyGitAccessor (archive path, nix219Compat-aware)
        Fetcher->>Store: fetch via legacy/archive accessor
        alt archive narHash matches
            Fetcher-->>Client: success (legacy fingerprint)
        else
            Fetcher-->>Client: error (narHash mismatch)
        end
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

  • Areas needing extra attention:
    • Propagation and defaulting of GitAccessorOptions{} across many call sites.
    • Fingerprint caching and precedence between accessor root fingerprint and computed fingerprint in fetchers.cc.
    • Legacy archive fallback, NAR-hash decision logic, and correct appending of ;legacy/;s suffixes in git.cc.
    • New functional tests relying on gitattributes, export-ignore/export-subst behavior, and CRLF/LF handling.

Suggested reviewers

  • cole-h

Poem

🐰 I nibble options, tidy up the flags,
Fingerprints hop onto their tidy tags,
Archives whisper "legacy" when hashes fight,
Tests chew CRLFs, ignored files take flight,
A little rabbit clap — fetches sleep tight.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 7.69% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically describes the main change: introducing backward-compatibility support for Nix < 2.20 Git inputs that previously used git-archive and applied Git filters.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch detsys/eelcodolstra/fh-1150-legacy-eol-handling-v2

Comment @coderabbitai help to get the list of available commands and usage tips.

@grahamc
Copy link
Member

grahamc commented Nov 21, 2025

fwiw my opposition to this is on the premise that it it's very messy conceptually what it is doing. It's more like a "well I sure hope git archive does what it used to" vs. a well reasoned set of backward compatible behaviors we know will function.

I'm very strongly in favor of #275 instead, because we know exactly what it supports and does, and can remain confident that it will continue to do that over time.

If we simply cannot achieve something that solves the 99% of backwards compatibility issues in #275 I'm not going to be obstinate about this implementation, but I don't want to go with this approach lightly.

@github-actions
Copy link

github-actions bot commented Nov 21, 2025

@github-actions github-actions bot temporarily deployed to pull request November 21, 2025 18:52 Inactive
@edolstra edolstra force-pushed the detsys/eelcodolstra/fh-1150-legacy-eol-handling-v2 branch from 90bb354 to cbfb908 Compare November 21, 2025 19:02
@github-actions github-actions bot temporarily deployed to pull request November 21, 2025 19:07 Inactive
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 90bb354 and cbfb908.

📒 Files selected for processing (8)
  • src/libfetchers-tests/git-utils.cc (1 hunks)
  • src/libfetchers/fetchers.cc (2 hunks)
  • src/libfetchers/git-utils.cc (7 hunks)
  • src/libfetchers/git.cc (7 hunks)
  • src/libfetchers/github.cc (1 hunks)
  • src/libfetchers/include/nix/fetchers/git-utils.hh (2 hunks)
  • src/libfetchers/tarball.cc (1 hunks)
  • tests/functional/fetchGit.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/libfetchers/tarball.cc
  • src/libfetchers-tests/git-utils.cc
  • src/libfetchers/include/nix/fetchers/git-utils.hh
🧰 Additional context used
🧬 Code graph analysis (3)
src/libfetchers/fetchers.cc (2)
src/libfetchers/github.cc (2)
  • store (350-356)
  • store (350-350)
src/libfetchers/include/nix/fetchers/fetchers.hh (4)
  • store (158-158)
  • store (176-176)
  • store (243-246)
  • store (243-243)
src/libfetchers/git.cc (1)
src/libfetchers/git-utils.cc (13)
  • rev (350-390)
  • rev (350-350)
  • rev (392-397)
  • rev (392-392)
  • rev (524-524)
  • rev (555-555)
  • rev (558-558)
  • rev (629-700)
  • rev (629-629)
  • settings (702-716)
  • settings (702-702)
  • makeFingerprint (741-744)
  • makeFingerprint (741-741)
src/libfetchers/git-utils.cc (2)
src/libfetchers/include/nix/fetchers/git-utils.hh (8)
  • rev (30-30)
  • rev (39-39)
  • rev (41-41)
  • rev (93-93)
  • rev (100-100)
  • rev (115-115)
  • wd (102-103)
  • ref (46-46)
src/libfetchers/git.cc (4)
  • getAccessor (984-1005)
  • getAccessor (985-985)
  • makeFingerprint (641-648)
  • makeFingerprint (641-641)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build_x86_64-linux / build
  • GitHub Check: build_aarch64-darwin / build
🔇 Additional comments (14)
src/libfetchers/git-utils.cc (7)

741-744: LGTM! Clean fingerprint implementation.

The fingerprint format is simple and clear: revision + optional flags.


761-770: LGTM! Proper options propagation.

The constructor correctly stores the options and initializes the fingerprint. Since options is copied to state->options, there's no lifetime issue.


1318-1321: LGTM! API signature updated correctly.

The method now accepts GitAccessorOptions and passes it through to the accessor constructor.


1324-1334: LGTM! Correct conditional wrapping.

The accessor is appropriately wrapped with GitExportIgnoreSourceAccessor only when options.exportIgnore is enabled.


1336-1350: LGTM! Consistent workdir accessor handling.

The workdir accessor follows the same pattern as the commit accessor for export-ignore handling.


1357-1389: LGTM! Submodule accessor creation updated correctly.

The method correctly creates accessors with appropriate options for reading .gitmodules (with export-ignore) and for enumerating submodule revisions (without).


704-704: LGTM! Correct default options for tree hash conversion.

Using empty GitAccessorOptions{} is appropriate here since we want default behavior (no export-ignore, no LFS smudging) for NAR hash computation.

src/libfetchers/github.cc (1)

329-329: LGTM! API call updated to use new options struct.

The change from a boolean parameter to empty GitAccessorOptions{} correctly aligns with the updated accessor API.

src/libfetchers/fetchers.cc (2)

330-331: LGTM! Fingerprint propagation in substitution path.

When using a pre-existing store path, the fingerprint is correctly retrieved from the Input and assigned to both the accessor and the cached field.


361-364: LGTM! Bidirectional fingerprint synchronization.

The logic correctly handles two cases:

  • If the accessor has a fingerprint (e.g., from git-archive fallback), propagate it to the result
  • Otherwise, compute the fingerprint from the input and assign it to the accessor

This is essential for the backward-compatibility mechanism where the accessor type may change based on NAR hash validation.

src/libfetchers/git.cc (3)

641-648: LGTM! Clean fingerprint generation.

The method correctly constructs GitAccessorOptions from input attributes and generates a fingerprint including the submodules suffix when applicable.


816-820: LGTM! Standard accessor creation with options.

The code correctly constructs GitAccessorOptions and passes them to getAccessor.


654-672: Verified: git archive command correctly applies Git filters (export-ignore), temp directory cleanup is correct.

The test confirms the implementation works as intended. The git archive command properly respects export-ignore attributes, ensuring only tracked files (excluding those marked with export-ignore) are included in the archive. The AutoDelete cleanup pattern is safe.

tests/functional/fetchGit.sh (1)

314-345: Concern about duplication is unfounded—test appears only once in file.

The file is 345 lines total, with this test block appearing at the end. A search for the fetchTree pattern with narHash found only 3 occurrences in the entire file—all consecutive within this single test block (lines 337, 341, 344). There is no duplicate test block elsewhere in the file.

The test logic itself is sound:

  • Old narHash with filters applied → works with warning
  • New narHash without filters → works normally
  • Wrong narHash → fails with mismatch error

The .gitattributes setup correctly tests CRLF, export-ignore, and export-subst rules.

Comment on lines 822 to 848
/* Backward compatibility hack for locks produced by Nix < 2.20 that depend on Nix applying Git filters or
* `export-ignore`. Nix >= 2.20 doesn't do those, so we may get a NAR hash mismatch. If that happens, try again
* with filters and export-ignore enabled. */
if (auto expectedNarHash = input.getNarHash()) {
if (accessor->pathExists(CanonPath(".gitattributes"))) {
auto narHashNew =
fetchToStore2(settings, *store, {accessor}, FetchMode::DryRun, input.getName()).second;
if (expectedNarHash != narHashNew) {
GitAccessorOptions options2{.exportIgnore = true};
auto accessor2 = getGitArchiveAccessor(*store, repoInfo, repoDir, rev);
accessor2->fingerprint = options2.makeFingerprint(rev) + ";f";
auto narHashOld =
fetchToStore2(settings, *store, {accessor2}, FetchMode::DryRun, input.getName()).second;
if (expectedNarHash == narHashOld) {
warn(
"Git input '%s' specifies a NAR hash '%s' that was created by Nix < 2.20.\n"
"Nix >= 2.20 does not apply Git filters and `export-ignore` by default, which changes the NAR hash.\n"
"Please update the NAR hash to '%s'.",
input.to_string(),
expectedNarHash->to_string(HashFormat::SRI, true),
narHashNew.to_string(HashFormat::SRI, true));
accessor = accessor2;
options = options2;
}
}
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

Backward-compatibility mechanism works but has some concerns.

The implementation correctly detects NAR hash mismatches and falls back to git-archive-based access. However, there are several points to consider:

  1. Limited filter detection: The code only checks for .gitattributes existence, but Git filters can also be configured in .git/info/attributes, ~/.gitattributes, and system-wide config. This means the fallback won't trigger for repos using filters configured outside .gitattributes.

  2. Unclear fingerprint suffix: The ";f" suffix in line 832 is not documented. What does it represent? It should be consistent with other suffixes like ";e" (export-ignore) and ";l" (LFS).

  3. Performance: On NAR hash mismatch, this performs two full NAR hash computations (lines 828 and 834). While using DryRun mode, this could still be expensive for large repos. Consider if there's a way to optimize or cache the computation.

  4. Incomplete filter coverage: The fallback creates GitAccessorOptions{.exportIgnore = true} but doesn't enable smudgeLfs. If a lock file was created with Nix < 2.20 using LFS, won't this also mismatch?

  5. User experience: As noted in PR comments, this approach relies on "hoping that git archive behaves as expected" rather than explicitly matching Nix < 2.20 behavior. Consider if the warning message (lines 836-842) should be more prominent or if there should be a way to opt out.

Please clarify:

And verify filter detection:


🏁 Script executed:

#!/bin/bash
# Test if filters outside .gitattributes are detected
mkdir /tmp/test-git-filters
cd /tmp/test-git-filters
git init
git config user.email "[email protected]"
git config user.name "Test"

# Configure filter in .git/info/attributes instead of .gitattributes
mkdir -p .git/info
echo "*.txt export-ignore" > .git/info/attributes

echo "test" > file.txt
echo "ignored" > ignored.txt
git add .
git commit -m "test"

# Check if git archive respects .git/info/attributes
git archive HEAD | tar -t

cd /
rm -rf /tmp/test-git-filters

Length of output: 741


🌐 Web query:

What does the ";f" fingerprint suffix represent in Git accessor fingerprints?

💡 Result:

I don't have enough context to answer—could you paste an example line (where the ";f" appears) or tell me which tool/output shows the "accessor fingerprint"?


Now let me search the codebase for the fingerprint suffix meanings.

Based on my verification, I can now generate the rewritten review comment. The web search and shell test have confirmed several of the concerns:

  1. Git archive by default takes attributes from the .gitattributes files in the tree being archived, but can also check $GIT_DIR/info/attributes, which means the code's reliance on only checking .gitattributes may miss filters configured elsewhere.

  2. The export-ignore parameter "approximates part of the git archive behavior" but "it is unknown whether the Git developers commit to the reproducibility of export-ignore in newer Git versions", which validates the concern about relying on git archive behavior.

  3. The shell test confirmed that git archive does not respect .git/info/attributes when set after committing files—supporting the concern about limited filter detection.

Backward-compatibility mechanism has valid concerns but logic is sound.

The fallback approach correctly detects NAR hash mismatches and switches to git-archive-based access. However, verification confirms several important limitations:

  1. Limited filter detection: The code only checks .gitattributes existence (line 826), but git archive can also apply attributes from $GIT_DIR/info/attributes. Repositories using filters configured outside .gitattributes won't trigger the fallback.

  2. Unreliable export-ignore behavior: The export-ignore parameter approximates git archive behavior, but "it is unknown whether the Git developers commit to the reproducibility of export-ignore in newer Git versions". This undermines the fallback's effectiveness for future Git versions.

  3. Incomplete filter coverage: The fallback creates GitAccessorOptions{.exportIgnore = true} but does not enable smudgeLfs. If a lock file was created with Nix < 2.20 using LFS, this fallback won't match.

  4. Performance cost: On NAR hash mismatch, two full NAR hash computations are performed (lines 828 and 834), which could be expensive for large repositories.

  5. Unclear fingerprint suffix: The ";f" suffix added to the fingerprint (line 832) is not documented. Its purpose and relationship to other suffixes like ";e" (export-ignore) and ";l" (LFS) should be clarified.

Consider adding a comment explaining the fingerprint suffix strategy and document why only exportIgnore is enabled in the fallback path, not LFS.

🤖 Prompt for AI Agents
In src/libfetchers/git.cc around lines 822-848, the fallback logic only checks
for .gitattributes, omits LFS in the alternate GitAccessorOptions, performs
extra hash work, and adds an undocumented ";f" fingerprint suffix; update the
code to also detect $GIT_DIR/info/attributes (e.g. check
repoDir/.git/info/attributes or via the accessor/git config API) so filters
configured outside .gitattributes trigger the fallback, set GitAccessorOptions
to enable both exportIgnore = true and smudgeLfs = true when trying the
archive-based accessor so LFS-produced NARs are matched, restructure the logic
to avoid redundant full hash computations (compute the DryRun hash for the
archive-style accessor only when needed and reuse it), and add a short code
comment documenting the fingerprint suffix convention (what ";f" means and how
it relates to other suffixes like export-ignore or LFS) and why only these
options are toggled in the fallback.

@edolstra edolstra force-pushed the detsys/eelcodolstra/fh-1150-legacy-eol-handling-v2 branch from cbfb908 to 6e632cd Compare November 21, 2025 19:21
@github-actions github-actions bot temporarily deployed to pull request November 21, 2025 19:26 Inactive
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
src/libfetchers/git.cc (1)

813-818: Legacy narHash fallback works but keeps earlier limitations

The backward‑compatibility block correctly detects a narHash mismatch, computes:

  • narHashNew from the libgit2 accessor, and
  • narHashOld from a git‑archive accessor,

and if narHashOld matches the user‑supplied narHash, it switches to the archive‑based accessor and warns with the new hash. The use of GitAccessorOptions and the ";f" suffix on the accessor fingerprint also ensures the store sees this as a distinct source.

The earlier concerns still largely apply:

  1. Limited filter detection: the fallback is gated purely on .gitattributes existing in the tree (pathExists(".gitattributes")). Repos that relied on filters configured in $GIT_DIR/info/attributes or global/system attributes won’t trigger the compatibility path, even if their old NARs depended on those filters.

  2. Filters vs. LFS coverage: options2 only sets exportIgnore = true; smudgeLfs stays false. Any pre‑2.20 lockfiles whose NAR hashes depended on LFS smudging will remain mismatched. If supporting those is in scope, you probably want GitAccessorOptions{.exportIgnore = true, .smudgeLfs = true} here and corresponding tests.

  3. Extra work on mismatch: for mismatched narHashes you always:

    • compute narHashNew via fetchToStore2(DryRun, accessor); then
    • build a git‑archive tree and compute narHashOld via another fetchToStore2(DryRun, accessor2).
      This is fine for rare legacy cases, but if you expect many such inputs, consider caching narHashNew/narHashOld somewhere (e.g. keyed by rev + fingerprint) to avoid recomputing them across evaluations.
  4. Undocumented ";f" suffix: the ";f" suffix you append to the archive accessor’s fingerprint is not documented near either fingerprint helper. A short comment explaining that ;f means “legacy filters applied via git-archive” would make the convention clearer and help keep suffix semantics aligned with GitAccessorOptions::makeFingerprint / GitInputScheme::makeFingerprint.

  5. Attribute mismatch between input and options: in the successful fallback branch you assign options = options2, but you don’t update the in‑memory Input attrs (exportIgnore / lfs). As noted in the other comment, this means getFingerprint() for the Input will not reflect the fallback choice, whereas the accessor fingerprint does. If you rely on both for caching, it may be safer to also update the attrs when the fallback is accepted.

Submodule and workdir plumbing (propagating exportIgnore/lfs down to submodule inputs and using .exportIgnore for workdir accessors) looks consistent with the new options model; the main open questions are around how broad you want this BC shim to be and how visibly you want to document the fingerprint and filter behavior.

To confirm how much real‑world surface this covers, you might want to audit a sample of existing pre‑2.20 lockfiles for:

  • presence of .gitattributes vs. $GIT_DIR/info/attributes,
  • use of export-ignore / export-subst only vs. additional filters/LFS.
#!/usr/bin/env bash
# Example: scan a checkout of known lockfiles for .gitattributes usage.
fd -a '.gitattributes' . 2>/dev/null | head

Also applies to: 819-845, 853-879, 898-913

🧹 Nitpick comments (3)
src/libfetchers/include/nix/fetchers/git-utils.hh (1)

25-31: Options struct + accessor API change look coherent

Introducing GitAccessorOptions and threading it through the GitRepo::getAccessor overloads is a straightforward, type-safe replacement for ad‑hoc booleans. Centralizing makeFingerprint here also keeps exportIgnore/smudgeLfs suffix logic in one place. It might be worth adding a brief comment near GitAccessorOptions documenting what the ;e / ;l suffixes represent, but the design itself looks good.

Also applies to: 99-103

src/libfetchers/git-utils.cc (1)

741-770: Centralized fingerprint + LFS handling is consistent

GitAccessorOptions::makeFingerprint and the updated GitSourceAccessor constructor cleanly couple fingerprinting and LFS behavior to the options object: fingerprints now distinguish exportIgnore and smudgeLfs, and lfs::Fetch is only instantiated when requested. This matches how options are assembled in git.cc and avoids hidden global behavior.

You may eventually want to use state->options directly in readBlob instead of the separate lfsFetch field, but that’s purely cosmetic right now.

src/libfetchers/git.cc (1)

647-669: ---

Verify permissions and symlinks are preserved through the full round-trip

getGitArchiveAccessor implements BC correctly: run git archive for the given rev, unpack to temp, add to store, expose via FS accessor. The sanity check script confirms git archive preserves file metadata (tested with basic file permissions). However, this validates only the archive creation step—not the full round-trip through unpackTarfile and store.addToStore.

A couple of optional improvements if this path becomes hot:

  • You always materialize a store path even when you only need a DryRun narHash comparison and might discard the result. If possible, consider a variant that hashes directly from the temp directory (or even from the archive stream) when running in DryRun mode.
  • If git archive exits non‑zero, you currently rely on runProgram2 throwing; that's fine, but a short comment here that BC depends on git being available on PATH (unlike the libgit2 path) could help future maintainers.

To gain full confidence, verify that symlinks are preserved through the complete unpack and store-add cycle, and that the result matches Nix < 2.20 behavior on your target platforms.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cbfb908 and 6e632cd.

📒 Files selected for processing (8)
  • src/libfetchers-tests/git-utils.cc (1 hunks)
  • src/libfetchers/fetchers.cc (2 hunks)
  • src/libfetchers/git-utils.cc (7 hunks)
  • src/libfetchers/git.cc (7 hunks)
  • src/libfetchers/github.cc (1 hunks)
  • src/libfetchers/include/nix/fetchers/git-utils.hh (2 hunks)
  • src/libfetchers/tarball.cc (1 hunks)
  • tests/functional/fetchGit.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/libfetchers/github.cc
  • src/libfetchers-tests/git-utils.cc
  • src/libfetchers/fetchers.cc
🧰 Additional context used
🧬 Code graph analysis (4)
src/libfetchers/tarball.cc (1)
src/libfetchers/github.cc (16)
  • settings (35-111)
  • settings (36-36)
  • settings (127-135)
  • settings (127-127)
  • settings (182-204)
  • settings (182-183)
  • settings (206-213)
  • settings (207-207)
  • settings (215-228)
  • settings (215-216)
  • settings (236-236)
  • settings (238-238)
  • settings (246-316)
  • settings (246-246)
  • settings (318-339)
  • settings (319-319)
src/libfetchers/git.cc (2)
src/libfetchers/git-utils.cc (11)
  • rev (350-390)
  • rev (350-350)
  • rev (392-397)
  • rev (392-392)
  • rev (524-524)
  • rev (555-555)
  • rev (558-558)
  • rev (629-700)
  • rev (629-629)
  • makeFingerprint (741-744)
  • makeFingerprint (741-741)
src/libutil/include/nix/util/source-accessor.hh (1)
  • getFSSourceAccessor (228-228)
src/libfetchers/include/nix/fetchers/git-utils.hh (2)
src/libfetchers/git-utils.cc (10)
  • rev (350-390)
  • rev (350-350)
  • rev (392-397)
  • rev (392-392)
  • rev (524-524)
  • rev (555-555)
  • rev (558-558)
  • rev (629-700)
  • rev (629-629)
  • wd (561-561)
src/libutil/include/nix/util/source-accessor.hh (1)
  • displayPrefix (164-164)
src/libfetchers/git-utils.cc (2)
src/libfetchers/include/nix/fetchers/git-utils.hh (8)
  • rev (30-30)
  • rev (39-39)
  • rev (41-41)
  • rev (93-93)
  • rev (100-100)
  • rev (115-115)
  • wd (102-103)
  • ref (46-46)
src/libfetchers/git.cc (4)
  • getAccessor (981-1002)
  • getAccessor (982-982)
  • makeFingerprint (641-645)
  • makeFingerprint (641-641)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build_aarch64-darwin / build
  • GitHub Check: build_x86_64-linux / build
🔇 Additional comments (3)
tests/functional/fetchGit.sh (1)

314-345: Backward-compat test coverage looks solid

The new git-eol block exercises old vs. new narHash behavior, export-ignore, export-subst, and CRLF vs. LF semantics, and verifies both the warning text and the hard failure path. The setup and assertions are consistent with the rest of this script; no changes requested here.

src/libfetchers/tarball.cc (1)

129-141: Accessor options defaulting preserves tarball behavior

Switching to getAccessor(treeHash, {}, displayPrefix) correctly adapts to the new options-based API while keeping tarball-cache accessors free of exportIgnore/LFS semantics. This should be behaviorally identical to the prior boolean call site.

src/libfetchers/git-utils.cc (1)

552-562: Accessor option plumbing is correct—no issues found

The grep results confirm that all getAccessor call sites pass explicit GitAccessorOptions:

  • git-utils.cc:704 and 1362 pass {} or explicit {.exportIgnore = ...}
  • git.cc:817 and 912 pass options or explicit {.exportIgnore = ...}
  • tarball.cc:139 and github.cc:329 pass {}

No call sites omit the options parameter. The review comment's analysis is accurate.

Copy link
Member

@grahamc grahamc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(leaving a stronger indicator than my previous note to avoid accidental merge)

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
src/libfetchers/include/nix/fetchers/fetch-settings.hh (1)

126-134: Clarify nix-219-compat semantics and intended usage in the help text

The Setting<bool> wiring looks fine, but the description is a bit strong/ambiguous:

  • “Nix will use git archive rather than libgit2 to copy Git inputs” reads like an unconditional override of the implementation. If the actual behavior is narrower (e.g., only for lock generation, or only in specific fallback paths), consider tightening the wording so users don’t over-assume what’s guaranteed.
  • It may help to explicitly call out that this is a migration/compatibility knob (for generating or working with Nix 2.19-style locks) and is not recommended for long‑term use, especially since the following sentence already warns that resulting locks may not be compatible with Nix ≥ 2.20.

Clearer wording here will make it easier for users to understand exactly when to enable this and what compatibility guarantees they are (and are not) getting.

src/libfetchers/git.cc (1)

651-674: Document fingerprint suffix and consider LFS handling.

Two concerns with this git archive fallback accessor:

  1. Undocumented fingerprint suffix: Line 671 appends ";f" to the fingerprint, but its meaning is not documented in code. Add a comment explaining that this suffix indicates the archive-based accessor path.

  2. Incomplete filter coverage: Line 670 only enables exportIgnore = true but not smudgeLfs. If a Nix < 2.20 lock file was created with LFS-smudged content, this fallback won't match that narHash. Consider whether LFS should also be enabled here for complete backward compatibility.

Apply this diff to document the suffix:

+    // Create an accessor using git archive for backward compatibility.
+    // The ";f" fingerprint suffix indicates this is an archive-based accessor
+    // that applies export-ignore filters.
     GitAccessorOptions options{.exportIgnore = true};
     accessor->fingerprint = options.makeFingerprint(rev) + ";f";
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6e632cd and c994977.

📒 Files selected for processing (3)
  • src/libfetchers/git.cc (7 hunks)
  • src/libfetchers/include/nix/fetchers/fetch-settings.hh (1 hunks)
  • tests/functional/fetchGit.sh (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/libfetchers/git.cc (2)
src/libfetchers/git-utils.cc (11)
  • rev (350-390)
  • rev (350-350)
  • rev (392-397)
  • rev (392-392)
  • rev (524-524)
  • rev (555-555)
  • rev (558-558)
  • rev (629-700)
  • rev (629-629)
  • makeFingerprint (741-744)
  • makeFingerprint (741-741)
src/libutil/include/nix/util/source-accessor.hh (1)
  • getFSSourceAccessor (228-228)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build_aarch64-darwin / build
  • GitHub Check: build_x86_64-linux / build
🔇 Additional comments (5)
tests/functional/fetchGit.sh (1)

314-369: Comprehensive backward compatibility test coverage.

The test successfully exercises the Nix < 2.20 compatibility path by:

  • Creating a repository with CRLF line endings and gitattributes (export-ignore, export-subst)
  • Verifying old narHash matches legacy git archive behavior (CRLF, ignored files removed)
  • Verifying new narHash matches modern libgit2 behavior (LF, ignored files present)
  • Testing narHash mismatch error handling
  • Testing flake locking with both --nix-219-compat and default semantics

However, note line 366 directly removes the SQLite cache file with a FIXME comment. Consider using a more robust cache invalidation mechanism or documenting why direct file removal is necessary here.

src/libfetchers/git.cc (4)

641-645: Good refactoring - fingerprint construction is now consistent.

Extracting fingerprint construction into a helper method improves code clarity and ensures consistent suffix semantics across all call sites.


873-898: Submodule options propagation is correct.

The changes properly propagate exportIgnore and smudgeLfs options from the parent repository to submodules (lines 896, 898), ensuring consistent behavior across the repository tree.


932-932: Workdir accessor correctly uses options struct.

The conversion from boolean parameters to GitAccessorOptions struct is consistent with the overall refactoring in this PR.


1024-1048: Fingerprint generation now uses consistent helper.

Both the clean (line 1027) and dirty (line 1043) paths now use the makeFingerprint helper, ensuring consistent fingerprint construction across all scenarios.

Comment on lines 827 to 869
if (settings.nix219Compat && !options.smudgeLfs && accessor->pathExists(CanonPath(".gitattributes"))) {
/* Use Nix 2.19 semantics to generate locks, but if a NAR hash is specified, support Nix >= 2.20 semantics
* as well. */
warn("Using Nix 2.19 semantics to export Git repository '%s'.", input.to_string());
auto accessorModern = accessor;
accessor = getGitArchiveAccessor(*store, repoInfo, repoDir, rev);
if (expectedNarHash) {
auto narHashLegacy =
fetchToStore2(settings, *store, {accessor}, FetchMode::DryRun, input.getName()).second;
if (expectedNarHash != narHashLegacy) {
auto narHashModern =
fetchToStore2(settings, *store, {accessorModern}, FetchMode::DryRun, input.getName()).second;
if (expectedNarHash == narHashModern)
accessor = accessorModern;
}
}
} else {
/* Backward compatibility hack for locks produced by Nix < 2.20 that depend on Nix applying Git filters,
* `export-ignore` or `export-subst`. Nix >= 2.20 doesn't do those, so we may get a NAR hash mismatch. If
* that happens, try again using `git archive`. */
auto narHashNew = fetchToStore2(settings, *store, {accessor}, FetchMode::DryRun, input.getName()).second;
if (expectedNarHash && accessor->pathExists(CanonPath(".gitattributes"))) {
if (expectedNarHash != narHashNew) {
auto accessorLegacy = getGitArchiveAccessor(*store, repoInfo, repoDir, rev);
auto narHashLegacy =
fetchToStore2(settings, *store, {accessorLegacy}, FetchMode::DryRun, input.getName()).second;
if (expectedNarHash == narHashLegacy) {
warn(
"Git input '%s' specifies a NAR hash '%s' that was created by Nix < 2.20.\n"
"Nix >= 2.20 does not apply Git filters, `export-ignore` and `export-subst` by default, which changes the NAR hash.\n"
"Please update the NAR hash to '%s'.",
input.to_string(),
expectedNarHash->to_string(HashFormat::SRI, true),
narHashNew.to_string(HashFormat::SRI, true));
accessor = accessorLegacy;
}
}
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

❓ Verification inconclusive

Backward compatibility logic works but has detection limitations.

The dual-path approach correctly handles both Nix 2.19 compatibility mode and automatic fallback for older lock files. However, there are some limitations:

  1. Limited filter detection: Lines 827 and 848 only check for .gitattributes file existence. Git can also apply filters from $GIT_DIR/info/attributes, ~/.config/git/attributes, and system-wide config. Repositories using filters configured outside .gitattributes won't trigger the fallback, leading to potential narHash mismatches.

  2. Performance trade-off: When a narHash mismatch occurs, the code performs two full NAR hash computations (lines 847+852 or 835+838). While using DryRun mode, this could still be expensive for large repositories. This appears to be an intentional trade-off for correctness.

  3. Detection scope: The fallback only triggers when .gitattributes exists, even though Git filters can affect the tree without an explicit attributes file (e.g., via global config).

Consider documenting these limitations in the warning messages or code comments so users understand when the fallback may not work.

Based on learnings from past reviews, consider verifying whether common Git configurations use .git/info/attributes or global attribute files:


🏁 Script executed:

#!/bin/bash
# Check if Git repositories commonly use alternative attribute sources
# Search for .git/info/attributes files in cached repos

cache_dir="${HOME}/.cache/nix/gitv3"
if [ -d "$cache_dir" ]; then
  echo "Checking for .git/info/attributes files in cached repos..."
  fd -H -t f "^attributes$" "$cache_dir" --exec sh -c 'grep -l "export-ignore\|export-subst\|eol=" "{}" 2>/dev/null' | head -20
  echo "---"
  echo "Total cached repos with .git/info/attributes filters:"
  fd -H -t f "^attributes$" "$cache_dir" --exec sh -c 'grep -l "export-ignore\|export-subst\|eol=" "{}" 2>/dev/null' | wc -l
fi

Length of output: 0


I need to gather more information to verify the review comment's concerns. Let me search the codebase for tests and documentation related to this logic, and check Git's attribute detection behavior.

Based on the web search results, Git consults $GIT_DIR/info/attributes file (which has the highest precedence) when deciding what attributes are assigned to a path, followed by .gitattributes files. The search results confirm that export-ignore can be placed in $GIT_DIR/info/attributes file, and this is indeed a valid location for keeping attributes that apply while archiving.

The review comment's concern is valid: the code at lines 827 and 848 only checks for .gitattributes file existence, but would miss filters configured in $GIT_DIR/info/attributes, which actually takes precedence in Git's attribute resolution order. Repositories using .git/info/attributes for export-ignore or export-subst attributes would not trigger the backward compatibility fallback, potentially resulting in undetected NAR hash mismatches.

Backward compatibility logic correctly handles known cases but misses filters in .git/info/attributes.

The filter detection at lines 827 and 848 only checks for .gitattributes presence. However, Git consults $GIT_DIR/info/attributes file (which has the highest precedence) for filter attributes. Repositories using .git/info/attributes for export-ignore or export-subst configuration will not trigger the fallback, potentially missing NAR hash mismatches.

The performance trade-off (dual hash computations on mismatch) is acceptable for correctness. Consider checking both .gitattributes and .git/info/attributes to improve filter detection coverage.

…haviour

Before Nix 2.20, we used git, which applies Git filters (in particular
doing end-of-line conversion based on .gitattributes), `export-ignore`
and `export-subst`. In 2.20, we switched to libgit2 and stopped
applying those, which is probably better for reproducibility
(e.g. `export-subst` can do pretty crazy things). However, that breaks
existing lock files / fetchTree calls for Git inputs that rely on
them, since it invalidates the NAR hash.

So as a backward compatibility hack, we now check the NAR hash
computed over the Git tree without them. If there is a hash mismatch,
we try again *with* them (by calling `git archive`, just like Nix <
2.20). If that succeeds, we print a warning and return the filtered
tree. This is inefficient, but should be fine for legacy flakes.
This makes Nix generate Nix 2.19 compatible lock files by default
(i.e. using `git archive`), while still supporting newer lock files.
@edolstra edolstra force-pushed the detsys/eelcodolstra/fh-1150-legacy-eol-handling-v2 branch from c994977 to 90ca8c3 Compare November 25, 2025 19:38
@edolstra edolstra requested a review from grahamc November 25, 2025 19:38
@github-actions github-actions bot temporarily deployed to pull request November 25, 2025 19:41 Inactive
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/libfetchers/git.cc (1)

650-681: Git archive accessor implementation for backward compatibility.

The getGitArchiveAccessor method correctly:

  • Creates a temporary directory for unpacking
  • Runs git archive to apply filters and export rules
  • Unpacks and adds to store
  • Sets appropriate fingerprint with ;f suffix

However, the ;f suffix (line 678) is undocumented. Consider adding a brief comment explaining that ;f indicates "fallback" or "filtered via git-archive" to match the documentation pattern of ;e (exportIgnore) and ;l (LFS).

+        // ";f" suffix indicates this accessor was produced via git-archive fallback
+        // for Nix < 2.20 compatibility (applying Git filters, export-ignore, export-subst)
         accessor->fingerprint = options.makeFingerprint(rev) + ";f";
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c994977 and 90ca8c3.

📒 Files selected for processing (6)
  • src/libfetchers/fetchers.cc (2 hunks)
  • src/libfetchers/git-utils.cc (7 hunks)
  • src/libfetchers/git.cc (8 hunks)
  • src/libfetchers/include/nix/fetchers/fetch-settings.hh (1 hunks)
  • src/libfetchers/include/nix/fetchers/git-utils.hh (2 hunks)
  • tests/functional/fetchGit.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/libfetchers/fetchers.cc
🧰 Additional context used
🧬 Code graph analysis (3)
src/libfetchers/git.cc (1)
src/libfetchers/include/nix/fetchers/git-utils.hh (7)
  • rev (31-31)
  • rev (40-40)
  • rev (42-42)
  • rev (94-94)
  • rev (101-101)
  • rev (116-116)
  • settings (122-122)
src/libfetchers/include/nix/fetchers/git-utils.hh (2)
src/libfetchers/git-utils.cc (10)
  • rev (350-390)
  • rev (350-350)
  • rev (392-397)
  • rev (392-392)
  • rev (524-524)
  • rev (555-555)
  • rev (558-558)
  • rev (629-700)
  • rev (629-629)
  • wd (561-561)
src/libutil/include/nix/util/source-accessor.hh (1)
  • displayPrefix (164-164)
src/libfetchers/git-utils.cc (1)
src/libfetchers/include/nix/fetchers/git-utils.hh (8)
  • rev (31-31)
  • rev (40-40)
  • rev (42-42)
  • rev (94-94)
  • rev (101-101)
  • rev (116-116)
  • wd (103-104)
  • ref (47-47)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build_aarch64-darwin / build
  • GitHub Check: build_x86_64-linux / build
🔇 Additional comments (12)
src/libfetchers/include/nix/fetchers/fetch-settings.hh (1)

126-134: Well-documented compatibility setting.

The nix219Compat setting is clearly documented and appropriately defaults to false, making the legacy behavior opt-in. The documentation correctly warns users about the trade-off with Nix >= 2.20 compatibility.

src/libfetchers/include/nix/fetchers/git-utils.hh (2)

25-32: Clean options struct design.

The GitAccessorOptions struct consolidates the boolean flags into a well-organized structure. The comment on line 29 clarifying where submodules are implemented is helpful for maintainability.


100-104: Consistent API update.

Both getAccessor overloads now accept GitAccessorOptions, providing a uniform interface for configuring accessor behavior.

tests/functional/fetchGit.sh (2)

313-345: Comprehensive backward compatibility tests.

The test suite properly validates:

  • CRLF line ending conversion behavior
  • export-ignore attribute handling
  • export-subst substitution
  • NAR hash matching for both legacy and modern semantics
  • Warning messages when updating from old hashes

347-368: Good flake locking test coverage.

The flake tests verify that:

  • --nix-219-compat produces legacy-compatible locks with oldHash
  • Default behavior produces modern locks with newHash
  • Both lock types can be consumed by either mode
src/libfetchers/git.cc (3)

641-648: Helper method cleanly extracts options from input.

The getGitAccessorOptions method provides a clean abstraction for constructing the options struct from input attributes.


831-869: Backward compatibility logic correctly handles both modes.

The dual-path approach is sound:

  1. When nix219Compat is enabled, prefer git-archive output but accept modern hashes
  2. When disabled, prefer libgit2 output but fall back to git-archive for legacy hashes

The warning message (lines 858-864) is helpful and actionable, telling users exactly what hash to use.

Note: As flagged in previous reviews, lines 831 and 852 only check for .gitattributes file existence. Git can also apply filters from $GIT_DIR/info/attributes, which this logic won't detect. This is an edge case but worth documenting in comments.

Consider documenting this limitation in a code comment:

+        // Note: This only checks for .gitattributes in the tree. Git can also apply
+        // attributes from $GIT_DIR/info/attributes, which we don't detect here.
         if (expectedNarHash && accessor->pathExists(CanonPath(".gitattributes"))) {

1032-1037: FIXME appropriately documents fingerprint divergence.

The FIXME comment correctly acknowledges that getFingerprint may return an inconsistent value for the legacy git-archive case. Since the accessor fingerprint (with ;f suffix) is set directly on the accessor object, this divergence is contained. The comment helps future maintainers understand this edge case.

src/libfetchers/git-utils.cc (4)

741-744: Clear fingerprint suffix convention.

The makeFingerprint implementation establishes a clear suffix scheme:

  • ;e for exportIgnore
  • ;l for smudgeLfs

This is consistent with the ;f suffix used in git.cc for the git-archive fallback path.


761-769: Options correctly stored and fingerprint initialized.

The GitSourceAccessor constructor properly:

  • Stores options in state for later use (e.g., LFS smudging in readBlob)
  • Initializes the fingerprint using options.makeFingerprint(rev)

1318-1334: Accessor methods correctly propagate options.

The getRawAccessor and getAccessor implementations properly pass the options through to the accessor construction and export-ignore wrapper.


1357-1389: Submodule handling uses options-based approach.

The getSubmodules method correctly:

  • Passes exportIgnore option to the accessor (line 1362)
  • Uses empty options {} for the raw accessor when enumerating submodules (line 1379)

This ensures export-ignore rules are respected when reading .gitmodules but not when looking up submodule revisions.

That's what Nix < 2.20 did (which means that it behaved differently
from the `submodules = false`, by not applying `export-ignore`).
@github-actions github-actions bot temporarily deployed to pull request November 26, 2025 13:28 Inactive
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
src/libfetchers/git.cc (2)

650-702: Verify LFS handling in legacy Git accessor.

The legacy Git accessor correctly implements Nix < 2.20 semantics for submodules (via git checkout) and non-submodules (via git archive). However, at line 683, only exportIgnore is set to true for the archive path, while smudgeLfs is not explicitly enabled.

If Nix < 2.20 applied LFS smudging and a lock file was created with LFS enabled, this fallback may not match the expected NAR hash. Consider whether smudgeLfs should also be enabled here, or document why it's safe to omit.

Additionally, the ";legacy" fingerprint suffix (line 699) is not documented in code. Consider adding a comment explaining its purpose and relationship to other suffixes.

Based on learnings from past reviews, the incomplete LFS coverage in the fallback path is a known concern.


1054-1080: Fingerprint logic has documented limitation for legacy cases.

The FIXME at line 1057 correctly identifies that getFingerprint() cannot determine whether the ;legacy suffix should be appended, since that decision is made later in getAccessorFromCommit() based on NAR hash matching.

This means the Input-level fingerprint may diverge from the actual accessor fingerprint when the legacy path is taken. While this is a known limitation (already flagged in past reviews), consider adding a comment explaining that for legacy NAR hash cases, the accessor's fingerprint is authoritative and may differ from getFingerprint().

Based on learnings from past reviews, the fingerprint divergence between Input and accessor is a known issue.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 90ca8c3 and 4f9c962.

📒 Files selected for processing (1)
  • src/libfetchers/git.cc (8 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/libfetchers/git.cc (3)
src/libfetchers/include/nix/fetchers/git-utils.hh (6)
  • rev (31-31)
  • rev (40-40)
  • rev (42-42)
  • rev (94-94)
  • rev (101-101)
  • rev (116-116)
src/libfetchers/git-utils.cc (25)
  • rev (350-390)
  • rev (350-350)
  • rev (392-397)
  • rev (392-392)
  • rev (524-524)
  • rev (555-555)
  • rev (558-558)
  • rev (629-700)
  • rev (629-629)
  • path (466-469)
  • path (466-466)
  • path (772-796)
  • path (772-772)
  • path (798-801)
  • path (798-798)
  • path (803-807)
  • path (803-803)
  • path (809-840)
  • path (809-809)
  • path (842-863)
  • path (842-842)
  • path (865-868)
  • path (865-865)
  • path (874-884)
  • path (874-874)
src/libutil/include/nix/util/mounted-source-accessor.hh (1)
  • makeMountedSourceAccessor (18-18)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build_aarch64-darwin / build
  • GitHub Check: build_x86_64-linux / build
🔇 Additional comments (5)
src/libfetchers/git.cc (5)

19-19: LGTM: Include added for NAR hash computation.

The fetch-to-store.hh include is necessary for the fetchToStore2 calls used in the backward-compatibility logic.


641-648: LGTM: Clean abstraction for options construction.

Consolidating the options construction in a single helper method improves consistency and reduces duplication throughout the file.


896-938: LGTM: Submodule handling correctly propagates options.

The submodule logic properly propagates exportIgnore, lfs, and submodules flags through the recursive input construction (lines 922-924), and the fingerprint suffix ";s" correctly indicates submodule mounting (line 933).


960-960: LGTM: Workdir accessor uses structured options.

The migration from a boolean parameter to structured GitAccessorOptions improves API consistency.


853-891: Based on my investigation, I can now provide the rewritten review comment.

The filter detection concern about .git/info/attributes is partially invalid for this codebase.

The review comment raises concerns about filter detection being limited to .gitattributes only, citing missing checks for .git/info/attributes. However, this concern does not apply to the main code paths:

  1. Legacy path (git archive): Uses the git archive command, which by design only consults .gitattributes in the tree—not .git/info/attributes (since .git/ is not exported). The check on line 853 and 874 for .gitattributes is correct and sufficient.

  2. Modern path (libgit2): While libgit2 respects git's filter precedence, including .git/info/attributes, the code does not need to pre-check filter file locations since the NAR hash mismatch detection on lines 861-868 and 873-891 will catch any hash differences regardless of filter source.

The dual NAR hash computations on mismatch (lines 861, 864, 873, 878) are an acceptable trade-off for correctness, though potentially expensive for large repositories.

The code does not require changes to check .git/info/attributes separately, as the existing NAR hash comparison logic handles filter-induced changes from any source.

@edolstra edolstra force-pushed the detsys/eelcodolstra/fh-1150-legacy-eol-handling-v2 branch from 9d45626 to f3aeeca Compare November 26, 2025 13:44
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/libflake/flakeref.cc (1)

63-66: Clarify and test nix219Compat semantics around dir stripping

The gating here looks correct: when nix219Compat is false we preserve existing behavior (always erase "dir"), and when it is true we intentionally let "dir" flow into Input::fromURL, which is what you need for Nix≤2.19–style Git inputs.

The subtle part is that FlakeRef::canonicalize() still strips a matching dir from input.attrs["url"] unconditionally. If any flows both (a) rely on nix219Compat for legacy Git hashing and (b) call canonicalize() before constructing the accessor, they will no longer see the legacy ?dir=... URL even in compat mode.

Two suggestions:

  • Add a short comment here tying this to Nix 2.19 Git behavior and noting that canonicalize() is responsible for normalizing lock entries, e.g.:
-    auto dir = getOr(parsedURL.query, "dir", "");
-    if (!fetchSettings.nix219Compat)
-        parsedURL.query.erase("dir");
+    auto dir = getOr(parsedURL.query, "dir", "");
+    // In nix219Compat mode we intentionally keep `dir` in the underlying URL
+    // so Git inputs see the same query string as in Nix ≤ 2.19.
+    // FlakeRef::canonicalize() still strips it when normalising lock entries.
+    if (!fetchSettings.nix219Compat)
+        parsedURL.query.erase("dir");
  • Double‑check (and ideally cover with tests) that the hash‑compatibility path exercises a non‑canonicalised FlakeRef, and that enabling nix219Compat globally doesn’t introduce surprising behavior for non‑Git/path flake refs carrying a dir query.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4f9c962 and 9d45626.

📒 Files selected for processing (1)
  • src/libflake/flakeref.cc (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build_aarch64-darwin / build

@github-actions github-actions bot temporarily deployed to pull request November 26, 2025 13:47 Inactive
@grahamc grahamc dismissed their stale review December 5, 2025 16:21

Agreed this is the best way to go.

@edolstra edolstra added this pull request to the merge queue Dec 5, 2025
@edolstra edolstra removed this pull request from the merge queue due to a manual request Dec 5, 2025
@edolstra edolstra enabled auto-merge December 5, 2025 18:33
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f3aeeca and 99b4669.

📒 Files selected for processing (4)
  • src/libfetchers/git-utils.cc (7 hunks)
  • src/libfetchers/github.cc (2 hunks)
  • src/libfetchers/mercurial.cc (1 hunks)
  • src/libfetchers/tarball.cc (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
src/libfetchers/github.cc (1)
src/libfetchers/include/nix/fetchers/git-utils.hh (6)
  • rev (31-31)
  • rev (40-40)
  • rev (42-42)
  • rev (94-94)
  • rev (101-101)
  • rev (116-116)
src/libfetchers/git-utils.cc (1)
src/libfetchers/include/nix/fetchers/git-utils.hh (8)
  • rev (31-31)
  • rev (40-40)
  • rev (42-42)
  • rev (94-94)
  • rev (101-101)
  • rev (116-116)
  • wd (103-104)
  • ref (47-47)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build_x86_64-linux / build
  • GitHub Check: build_aarch64-darwin / build
🔇 Additional comments (10)
src/libfetchers/mercurial.cc (1)

345-351: LGTM! Consistent fingerprint prefixing for Mercurial inputs.

The "hg:" prefix aligns with the standardized fingerprint format introduced across other fetchers (e.g., "github:", "tarball:", "git:"), improving clarity and preventing potential collisions between different input types.

src/libfetchers/github.cc (1)

328-329: LGTM! Correct migration to options-based accessor API.

The empty initializer {} properly constructs a default GitAccessorOptions struct, matching the updated signature in git-utils.hh.

src/libfetchers/tarball.cc (2)

139-140: LGTM! Correct migration to options-based accessor API.

The empty initializer {} properly matches the updated getAccessor signature expecting GitAccessorOptions.


407-415: LGTM! Consistent fingerprint prefixing for tarball inputs.

Adding "tarball:" prefix to both narHash and rev fingerprints aligns with the standardized format across fetchers and distinguishes tarball inputs from other types.

src/libfetchers/git-utils.cc (6)

741-744: LGTM! Clean fingerprint generation with option flags.

The makeFingerprint method provides a consistent way to generate fingerprints that encode the accessor options. The format "git:<rev>[;e][;l]" clearly distinguishes different configurations.


751-770: LGTM! Well-structured migration to options-based state.

The GitSourceAccessor now stores GitAccessorOptions in its state, and the constructor properly initializes lfsFetch based on options.smudgeLfs. The fingerprint is correctly generated via options.makeFingerprint(rev).


1318-1350: LGTM! Consistent options propagation through accessor creation.

Both getRawAccessor and getAccessor overloads correctly propagate GitAccessorOptions through the accessor creation chain. The export-ignore wrapping logic remains intact, now driven by options.exportIgnore.


1357-1389: LGTM! Correct use of designated initializers for options.

The getAccessor call at line 1362 uses {.exportIgnore = exportIgnore} to explicitly set only the needed option, while getRawAccessor at line 1379 correctly uses {} since submodule revision lookup doesn't require special options.


702-716: LGTM!

Using default GitAccessorOptions{} for NAR hash computation is correct—the hash should be computed over the raw tree without export-ignore filtering or LFS smudging.


552-561: LGTM! Clear API documentation and consistent signatures.

The updated declarations properly reflect the migration to GitAccessorOptions. The comment clarifying that getRawAccessor creates an accessor "with no regard for export-ignore" is helpful documentation.

Comment on lines 350 to 356
std::optional<std::string> getFingerprint(ref<Store> store, const Input & input) const override
{
if (auto rev = input.getRev())
return rev->gitRev();
return "github:" + rev->gitRev();
else
return std::nullopt;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fingerprint prefix "github:" is used for all GitArchiveInputScheme subclasses.

GitArchiveInputScheme::getFingerprint is inherited by GitHubInputScheme, GitLabInputScheme, and SourceHutInputScheme. Using "github:" as the prefix for GitLab and SourceHut inputs is misleading and could cause fingerprint collisions if different forges return the same commit hash.

Consider using schemeName() to generate a scheme-specific prefix:

     std::optional<std::string> getFingerprint(ref<Store> store, const Input & input) const override
     {
         if (auto rev = input.getRev())
-            return "github:" + rev->gitRev();
+            return std::string{schemeName()} + ":" + rev->gitRev();
         else
             return std::nullopt;
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
std::optional<std::string> getFingerprint(ref<Store> store, const Input & input) const override
{
if (auto rev = input.getRev())
return rev->gitRev();
return "github:" + rev->gitRev();
else
return std::nullopt;
}
std::optional<std::string> getFingerprint(ref<Store> store, const Input & input) const override
{
if (auto rev = input.getRev())
return std::string{schemeName()} + ":" + rev->gitRev();
else
return std::nullopt;
}
🤖 Prompt for AI Agents
In src/libfetchers/github.cc around lines 350 to 356, the fingerprint prefix is
hardcoded as "github:" which is used by all GitArchiveInputScheme subclasses;
change the code to build the prefix from the scheme name (e.g. use schemeName()
or the instance's scheme getter) so the returned fingerprint is schemeName() +
":" + rev->gitRev() when rev is present, otherwise return std::nullopt, ensuring
unique, scheme-specific fingerprints and avoiding collisions across
GitHub/GitLab/SourceHut.

@github-actions github-actions bot temporarily deployed to pull request December 5, 2025 18:38 Inactive
@edolstra edolstra added this pull request to the merge queue Dec 5, 2025
Merged via the queue into main with commit 7bcc712 Dec 5, 2025
28 checks passed
@edolstra edolstra deleted the detsys/eelcodolstra/fh-1150-legacy-eol-handling-v2 branch December 5, 2025 19:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants