nix-required-mounts: fix paths #500971

Open
tfc wants to merge 9 commits into NixOS:master from tfc:unwhack-nrm

@tfc
Contributor

@tfc tfc commented Mar 18, 2026

I have been running CUDA apps on both NVIDIA and AMD (with ZLUDA) graphics cards and found that nix-required-mounts caused some problems:

  • it created paths that don't exist
  • it looped infinitely
  • it ran extremely slowly because it handled far too many paths (N*N complexity, buried somewhere, where N would suffice)

While fixing these issues, i decided that the scenario is complex enough for test cases, so i wrote those and restructured the app a bit to make it more testable.

Tested these changes again on two NVIDIA and AMD machines, works very nicely.
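
One common way to avoid both the infinite looping and the quadratic blow-up listed above is a worklist traversal with a `seen` set, so that every path is expanded at most once. The following is a minimal sketch under that assumption, not the code from this PR; `closure`, `successors`, and the `links` graph are hypothetical names chosen for illustration.

```python
from typing import Callable, Hashable, Iterable, List, Set, TypeVar

T = TypeVar("T", bound=Hashable)


def closure(roots: Iterable[T], successors: Callable[[T], Iterable[T]]) -> List[T]:
    """Return every node reachable from `roots`, expanding each node only once."""
    seen: Set[T] = set()
    order: List[T] = []
    stack: List[T] = list(roots)
    while stack:
        node = stack.pop()
        if node in seen:
            # Already expanded: this check breaks cycles and avoids repeated work.
            continue
        seen.add(node)
        order.append(node)
        stack.extend(successors(node))
    return order


# Hypothetical link graph that contains a cycle (a -> b -> c -> a).
links = {"a": ["b"], "b": ["c", "a"], "c": ["a"]}
reachable = closure(["a"], lambda node: links[node])
print(sorted(reachable))
```

Because membership tests on a set take constant time on average, the traversal does work proportional to the number of nodes plus the number of links, and the cycle in `links` cannot cause it to loop.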

Fixes #497824

CC @SomeoneSerge @ConnorBaker @kmein @jfly

Things done

  • Built on platform:
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • Tested, as applicable:
  • Ran nixpkgs-review on this PR. See nixpkgs-review usage.
  • Tested basic functionality of all binary files, usually in ./result/bin/.
  • Nixpkgs Release Notes
    • Package update: when the change is major or breaking.
  • NixOS Release Notes
    • Module addition: when adding a new NixOS module.
    • Module update: when the change is significant.
  • Fits CONTRIBUTING.md, pkgs/README.md, maintainers/README.md and other READMEs.

@nixpkgs-ci nixpkgs-ci bot added labels on Mar 18, 2026:

  • 10.rebuild-linux: 1-10 (This PR causes between 1 and 10 packages to rebuild on Linux.)
  • 10.rebuild-darwin: 1-10 (This PR causes between 1 and 10 packages to rebuild on Darwin.)
  • 10.rebuild-darwin: 1 (This PR causes 1 package to rebuild on Darwin.)
  • 10.rebuild-linux: 1 (This PR causes 1 package to rebuild on Linux.)
  • 2.status: merge-bot eligible (This PR can be merged by commenting "@NixOS/nixpkgs-merge-bot merge".)
@tfc tfc force-pushed the unwhack-nrm branch 3 times, most recently from 4acdde7 to 67747fd Compare March 18, 2026 11:24
@kmein
Member

kmein commented Mar 18, 2026

nixpkgs-review result

Generated using nixpkgs-review.

Command: nixpkgs-review pr 500971
Commit: 67747fd2b7659d6fe4dcd7dde694c53b392518a1


x86_64-linux

✅ 2 packages built:
  • nix-required-mounts
  • nix-required-mounts.dist

Member

@kmein kmein left a comment

Functionality-wise, everything looks really good. I've left a few inline comments throughout the patch—mostly just some naming questions and minor nitpicks around type hints and a small optimization.

Let me know what you think of those, but overall this is in great shape!

Comment on lines +69 to +74
# we need to resolve paths before concatenation because of things like
# $ ls -l /sys/dev/char/226:128/subsystem
# ... /sys/dev/char/226:128/subsystem
# -> ../../../../../../class/drm
# Path(normpath(...)) to normalize `foo/../bar` to `bar`
p = Path(os.path.normpath(p.parent.resolve() / parent))
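
The reasoning in the snippet's comment can be reproduced in a small temporary tree. This is only a sketch in the spirit of the quoted line, not the PR's actual function: `build_tree`, `naive_target`, and `resolved_target` are hypothetical helpers, and the directory names are made up. A relative symlink target is interpreted relative to the real directory that contains the link, so joining it onto an unresolved parent can yield a path that does not exist.

```python
import os
import tempfile
from pathlib import Path


def build_tree(root: Path) -> Path:
    """Create real/deep/up -> ../sibling and alias -> real/deep; return alias/up."""
    (root / "real" / "deep").mkdir(parents=True)
    (root / "real" / "sibling").mkdir()
    (root / "real" / "deep" / "up").symlink_to("../sibling")
    (root / "alias").symlink_to(root / "real" / "deep")
    return root / "alias" / "up"


def naive_target(p: Path) -> Path:
    """Join the link target onto the unresolved parent (lexical only)."""
    return Path(os.path.normpath(p.parent / os.readlink(p)))


def resolved_target(p: Path) -> Path:
    """Resolve the parent first, then join and normalize the link target."""
    return Path(os.path.normpath(p.parent.resolve() / os.readlink(p)))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        link = build_tree(Path(tmp).resolve())
        print(naive_target(link), naive_target(link).exists())
        print(resolved_target(link), resolved_target(link).exists())
```

In this tree the naive variant produces `<root>/sibling`, which was never created, while the variant that resolves the parent first produces `<root>/real/sibling`.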
Contributor

I'm wondering whether p.parent.resolve() might lead to missing paths the way p.resolve() does. Maybe not, because we mount p directly?

@tfc
Contributor Author

tfc commented Mar 18, 2026

TODO:

  • i will add a testcase that proves why we need to resolve symlink parents
  • split up "paths" into "paths" and "storepaths" so we don't do the recursion needlessly on nix store paths whose closure has long been pre-calculated already. maybe next PR

@tfc tfc force-pushed the unwhack-nrm branch 2 times, most recently from d6e144a to f2797ce Compare March 18, 2026 16:19
Contributor

@ConnorBaker ConnorBaker left a comment

Thank you for making tests! I only had the chance to read through part of it (apologies) and left a few comments.

# see also test_path_dsicovery_resolve_rel_links
#
# Path(normpath(...)) needed to normalize `foo/../bar` to `bar`
p = Path(os.path.normpath(p.parent.resolve() / parent))
Contributor

Why is it not sufficient to always just do p.absolute()? I understand there are some nuances around normalization: https://docs.python.org/3/library/pathlib.html#comparison-to-the-os-and-os-path-modules

Contributor Author

.absolute() does not resolve the .. part.
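
A short illustration of that difference, assuming a POSIX system (the path itself is made up and does not need to exist, since both calls here are purely lexical):

```python
import os.path
from pathlib import Path

p = Path("/tmp/foo/../bar")

# Path.absolute() only anchors relative paths to the current directory;
# it leaves ".." components untouched.
kept = str(p.absolute())        # "/tmp/foo/../bar"

# os.path.normpath() collapses "foo/.." lexically, without touching the filesystem.
collapsed = os.path.normpath(p)  # "/tmp/bar"

print(kept)
print(collapsed)
```

Note that `normpath` is lexical: if `foo` were a symlink, `/tmp/foo/..` would not actually be `/tmp`, which is why the snippet under review resolves the parent directory before normalizing.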

Comment on lines +62 to +63
def symlink_paths_closure(p: Path) -> List[Path]:
"""Traverses a chain of symlinks to collect every intermediate path up to the final destination."""
Contributor

To clarify, does this function collect both the location of each symlink and its target?

That is, if we have

- /
  - foo
    - b -> /bar
  - bar
    - c -> /baz
  - baz
  - a -> /foo

what would we expect /a/b/c to return?

Contributor Author

That is a very good additional test case that unearths something that we don't handle at all right now.

currently we would simply return ["/baz"].

I do realize that we should return ["/foo", "/bar", "/baz"].

As this PR fixes a lot of problems that are different to this one, i suggest building that as a followup. What do you think?
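
For reference, the behaviour described above (returning the target of every symlink met while walking `/a/b/c`) could be sketched as follows. This is an illustrative assumption about one possible approach, not the follow-up implementation; `symlink_targets_along` and `build_example_tree` are hypothetical names, and the example tree is created under a temporary root instead of `/`.

```python
import os
import tempfile
from pathlib import Path
from typing import List

MAX_HOPS = 40  # arbitrary guard against symlink loops (an assumption of this sketch)


def build_example_tree(root: Path) -> None:
    """Recreate the layout from the question: a -> foo, foo/b -> bar, bar/c -> baz."""
    for name in ("foo", "bar", "baz"):
        (root / name).mkdir()
    (root / "a").symlink_to(root / "foo")
    (root / "foo" / "b").symlink_to(root / "bar")
    (root / "bar" / "c").symlink_to(root / "baz")


def symlink_targets_along(path: Path) -> List[Path]:
    """Walk an absolute `path` component by component and collect each symlink target."""
    if not path.is_absolute():
        raise ValueError("expected an absolute path")
    targets: List[Path] = []
    current = Path(path.anchor)       # symlink-free prefix walked so far
    pending = list(path.parts[1:])    # components still to visit
    while pending:
        candidate = current / pending.pop(0)
        if not candidate.is_symlink():
            current = candidate
            continue
        if len(targets) >= MAX_HOPS:
            raise RuntimeError(f"too many symlinks while walking {path}")
        # `current` contains no symlinks, so joining and normalizing is safe for
        # simple targets; targets that mix symlinks with ".." are out of scope here.
        target = Path(os.path.normpath(current / os.readlink(candidate)))
        targets.append(target)
        # Restart the walk at the target, keeping the components not yet visited.
        pending = list(target.parts[1:]) + pending
        current = Path(target.anchor)
    return targets


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp).resolve()
        build_example_tree(root)
        for target in symlink_targets_along(root / "a" / "b" / "c"):
            print(target.relative_to(root))
```

With the tree from the question this yields `foo`, `bar`, and `baz` (relative to the temporary root), in the order in which the links are encountered.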

Contributor

Contributor Author

https://github.com/NixOS/nixpkgs/pull/500772/changes#diff-35d8b1262a71dac56c8aefaabee843c5f6b229f200ddc90385d68111e51ac714R63-R68

That's not the same scenario.

(@SomeoneSerge may i please ask you politely to write more detailed responses in your comments in this PR? i often struggle to understand what you actually mean in some of your comments)

Contributor

> That's not the same scenario.

Right, misreading on my part

Contributor Author

I addressed this now, including test cases.

Contributor

@jfly jfly left a comment

Very incomplete review, sorry. Posting what I did get through.

  pyproject = true;

- src = lib.cleanSource ./.;
+ src = lib.cleanSource ./nix-required-mounts;
Contributor

I don't think there's much point in moving source into a subdirectory

Contributor Author

your suggestion to provide a pytest hook transitively made it necessary to give this project a real python package structure. if you check the code out and can make it work without this structure, i am happy to accept your patch.

@SomeoneSerge
Contributor

SomeoneSerge commented Mar 20, 2026 via email

@tfc
Contributor Author

tfc commented Mar 20, 2026

> It works in the other PR:)

@SomeoneSerge when i run this in your PR branch:

Executing pytestCheckPhase
pytest flags: -m pytest
============================= test session starts ==============================
platform darwin -- Python 3.13.12, pytest-9.0.2, pluggy-1.6.0
rootdir: /nix/var/nix/builds/nix-13868-663597863/source
configfile: pyproject.toml
collected 1 item

nix_required_mounts.py .                                                 [100%]

============================== 1 passed in 0.01s ===============================
Finished executing pytestCheckPhase

It works but does not contain tests. As soon as I throw a test_bla.py Python file into the folder, pytest comes up with all kinds of errors, which is what led me to create a "real project" folder.

@tfc tfc force-pushed the unwhack-nrm branch 4 times, most recently from 500dbbc to 3064232 Compare March 20, 2026 16:43
@SomeoneSerge
Contributor

SomeoneSerge commented Mar 20, 2026

@tfc Could you allow maintainer edits please? EDIT: Ah, never mind. I need to rebase before pushing.

@SomeoneSerge
Contributor

SomeoneSerge commented Mar 20, 2026

@tfc I moved the tests into the docstrings so they're easier to read (to me), renamed discover_... and enumerate_..., and changed one of the tests so that guest != host. That test is now failing (as in, the output didn't match what I expected; I have yet to reflect on whether my expectations are wrong).

I expect that we squash prior to merging, and we can abstain from force-pushing until then so as to avoid more conflicts.

EDIT: I am open to simply prohibiting some non-trivial cases of guest != host. I'm not focused enough right now to judge whether this would be one such situation.

Comment on lines +337 to +342
>>> print(pformat(paths_chain_jump))
[('${TMP}/chain/a', '${TMP}/chain/a'),
('/guest/chain/b', '${TMP}/chain/b'),
('/guest/chain/c', '${TMP}/chain/c'),
('/guest/jump/a', '${TMP}/jump/a'),
('/guest/jump/c/d', '${TMP}/jump/c/d')]
Contributor

337 >>> print(pformat(paths_chain_jump))
Differences (ndiff with -expected +actual):
  [('${TMP}/chain/a', '${TMP}/chain/a'),
-  ('/guest/chain/b', '${TMP}/chain/b'),
?    ^^^^^^
+  ('${TMP}/chain/b', '${TMP}/chain/b'),
?    ^^^^^^
+  ('${TMP}/jump/c/d', '${TMP}/jump/c/d'),
   ('/guest/chain/c', '${TMP}/chain/c'),
-  ('/guest/jump/a', '${TMP}/jump/a'),
?                                    ^
+  ('/guest/jump/a', '${TMP}/jump/a')]
?                                    ^
-  ('/guest/jump/c/d', '${TMP}/jump/

Contributor Author

?

Comment on lines +276 to +278
... results_empty = subst_results(match_mounts(allowed_patterns, []))
... results_a = subst_results(match_mounts(allowed_patterns, ["feat_a"]))
... results_b = subst_results(match_mounts(allowed_patterns, ["feat_b2"]))
Contributor

  • The latter two tests are really about what's currently called validate_mounts rather than about match_mounts
  • match_mounts's tests, otoh, should cover something like "we request 2 features out of 3"
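
A test of the "2 features out of 3" kind might look roughly like the sketch below. `select_patterns` is a hypothetical stand-in, not the PR's `match_mounts`, whose signature is not shown in this thread; the `onFeatures`/`paths` layout of the patterns, the pattern names, and the device paths are likewise assumptions made for illustration.

```python
from typing import Dict, List, TypedDict


class Pattern(TypedDict):
    onFeatures: List[str]  # features that trigger this pattern (assumed layout)
    paths: List[str]       # paths exposed when the pattern matches (assumed layout)


def select_patterns(allowed: Dict[str, Pattern], required_features: List[str]) -> List[str]:
    """Return the names of all patterns triggered by at least one required feature."""
    wanted = set(required_features)
    return sorted(
        name for name, pattern in allowed.items()
        if wanted.intersection(pattern["onFeatures"])
    )


def test_two_features_out_of_three() -> None:
    allowed: Dict[str, Pattern] = {
        "a": {"onFeatures": ["feat_a"], "paths": ["/dev/a"]},
        "b": {"onFeatures": ["feat_b1", "feat_b2"], "paths": ["/dev/b"]},
        "c": {"onFeatures": ["feat_c"], "paths": ["/dev/c"]},
    }
    # Two of the three patterns are requested; the third must not be selected.
    assert select_patterns(allowed, ["feat_a", "feat_b2"]) == ["a", "b"]
    # Requesting nothing selects nothing.
    assert select_patterns(allowed, []) == []
```

Keeping the selection step separate from path validation in this way would let each function be tested against its own responsibility, which is the split the comment above asks for.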

@tfc
Contributor Author

tfc commented Mar 21, 2026

@SomeoneSerge thanks, i hate it... i don't really see why we have to cram eeeeeverything into one file, it looks like quite a mess to be honest.

Apart from the test changes, can you please explain what you are doing there on top of the doctest related changes? i see that there is some host-guest-diff work, but right now it feels like my work is taken over from me and i don't understand your thinking so it makes it difficult to participate in the discussion on how to bring this PR further.

@SomeoneSerge
Contributor

> can you please explain what you are doing there on top of the doctest related changes

I am validating your tests (at least one I suspect to be wrong, cf. the failure) and trying to make the new names match the actual functionality.

> i don't really see why we have to cram eeeeeverything into one file

We don't, it just so happened that once I moved what seemed like simple-enough examples into the docstrings there wasn't anything left in the other file. I do, however, need tests to be localized so that I can compare the environment definition against the "expected output", which is how I came to question the mounts_closure (discover_reachable_paths) implementation.

Feel free to move the bulkier tests back into test_whatever.py (with py_modules now in pyproject.toml you shouldn't have trouble adding other top-level modules or packages), but please keep the tests readable and verifiable.

P.S. Perhaps https://www.youtube.com/watch?v=XpDsk374LDE might help you relax

@tfc
Contributor Author

tfc commented Mar 21, 2026

@SomeoneSerge Thanks for the feedback. I’m happy to move the tests back, but I’d like to align on our workflow to save us both some time.

It feels like we’ve reached a point where it felt faster/easier for you to rewrite the implementation than to communicate the specific requirements upfront. This makes it difficult for me to verify the new logic against the original bugs, effectively doubling the work for both of us. Moving forward, if you have specific preferences for structure, please let me know—I’m more than willing to adapt the code to meet community standards once the technical goal is clear.

Proposed Changes:

  • Revert tests to test_....py:
    • Verifiability: In the separate test file, removing the single central tidyUp function allows quick and easy inspection of all symlink scenarios in a terminal after a pytest run. The fine-grained use of the guard pattern currently makes this more difficult (I'm all for using as many tight guards as possible in production code, but in tests they make everything less inspectable).
    • Code Clarity: Moving tests out of doctests reduces noise in the main application logic. Currently, the newly added boilerplate makes it difficult for newcomers to distinguish between code that exists only for testing and code that the app needs in production. We had less boilerplate in the external test case.
    • Tooling: The current doctests aren't subject to black formatting, which impacts long-term maintainability.
  • Bug Fix: I will incorporate and fix the specific test case you identified.
  • Naming: I suggest reverting the function rename. The logic generically finds path closures by following symlinks; naming it after "mounts" (the current use case) is misleading, as the function itself is agnostic of mount-specific details.

Let me know what you think. I'll proceed once we're on the same page so we can reach a mergeable state quickly without further rework.

@SomeoneSerge
Contributor

SomeoneSerge commented Mar 21, 2026 via email

Development

Successfully merging this pull request may close these issues.

nix_required_mounts.py::symlink_parents returns non-existent paths

5 participants