Skip to content

nix-make demo#9

Draft
edolstra wants to merge 5 commits intomainfrom
nix-make
Draft

nix-make demo#9
edolstra wants to merge 5 commits intomainfrom
nix-make

Conversation

@edolstra
Copy link
Collaborator

@edolstra edolstra commented Mar 6, 2026

This demonstrates using a Wasm module to scan a source tree for C++ source files and compute their transitive header file dependencies.

Summary by CodeRabbit

  • New Features

    • Added a new WebAssembly-based plugin for analyzing C/C++ source dependencies in Nix builds, enabling automatic transitive dependency resolution and compilation orchestration.
  • Chores

    • Expanded workspace to include the new plugin module and updated vendor dependencies.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 6, 2026

📝 Walkthrough

Walkthrough

This PR introduces a new workspace crate nix-wasm-plugin-nix-make that implements WebAssembly-based C/C++ dependency analysis and build orchestration. It adds a getDeps function performing multi-pass source scanning and include resolution, integrates Nix build configuration, extends workspace members, updates vendor hashes, and adds trait derives to the Value type.

Changes

Cohort / File(s) Summary
Workspace Configuration
Cargo.toml, flake.nix
Added new workspace member "nix-wasm-plugin-nix-make" and updated cargo vendor sha256 hash for consistency.
Plugin Crate Manifest
nix-wasm-plugin-nix-make/Cargo.toml
Defined new cdylib crate with path dependency on nix-wasm-rust, targeting WebAssembly FFI interface.
Nix Build Configuration
nix-wasm-plugin-nix-make/demo.nix
Introduced build orchestration with getDeps wasm invocation, compileCpp for object generation, link for shared library creation, and structured source/dependency aggregation (180+ lines).
Dependency Resolution Logic
nix-wasm-plugin-nix-make/src/lib.rs
Implemented pub extern "C" fn getDeps performing two-pass analysis: file scanning with include extraction (scan_files, extract_includes) and transitive dependency resolution with suffix-map based include resolution (collect_transitive_includes).
Core Type Enhancements
nix-wasm-rust/src/lib.rs
Added Eq, PartialEq, and Hash trait derives to Value struct alongside existing Clone, Debug, Copy derives.

Sequence Diagram

sequenceDiagram
    participant NixBuild as Nix Build System
    participant WasmPlugin as WebAssembly Plugin<br/>(nix-wasm-plugin-nix-make)
    participant FileSystem as File System
    participant DepResolver as getDeps Function

    NixBuild->>WasmPlugin: Call getDeps(dirs, files) via wasm
    WasmPlugin->>FileSystem: readDir() scan directories
    FileSystem-->>WasmPlugin: Directory entries
    WasmPlugin->>DepResolver: scan_files() collect .cc/.hh files
    DepResolver->>WasmPlugin: FileInfo with includes extracted
    WasmPlugin->>DepResolver: collect_transitive_includes() resolve dependencies
    DepResolver->>WasmPlugin: includes set + deps set (library mappings)
    WasmPlugin->>WasmPlugin: Build Value with entries (src, path, includes, deps)
    WasmPlugin-->>NixBuild: Return structured Value with dependency info
    NixBuild->>NixBuild: compileCpp & link orchestration via demo.nix
Loading

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

Possibly Related PRs

Suggested Reviewers

  • cole-h
  • lucperkins
  • grahamc

Poem

🐰 Hop along, dear wasm friend so new!
With getDeps magic, dependencies ring true,
From includes to libs, the scan runs deep,
Nix builds now dance where dependencies sleep! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'nix-make demo' is vague and doesn't clearly convey the main purpose of the PR, which is to demonstrate using a Wasm module to scan C++ source trees and compute transitive header dependencies. Consider a more descriptive title like 'Add nix-wasm-plugin-nix-make demo with C++ dependency scanning' to better communicate the primary change to reviewers.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch nix-make

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (1)
nix-wasm-plugin-nix-make/Cargo.toml (1)

1-4: Consider marking this crate publish = false.

This manifest looks workspace-only and already depends on a local path crate, so making that explicit helps prevent accidental release attempts.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nix-wasm-plugin-nix-make/Cargo.toml` around lines 1 - 4, Add publish = false
to the [package] section of the Cargo.toml for the crate
"nix-wasm-plugin-nix-make" to mark this workspace-only crate as not for release;
edit the [package] block (where name = "nix-wasm-plugin-nix-make", version =
"0.1.0", edition = "2021") and add publish = false so cargo will not attempt to
publish this local/path-dependent crate.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@nix-wasm-plugin-nix-make/demo.nix`:
- Around line 4-7: The current getDeps uses a hardcoded path to a mutable Cargo
artifact; replace that with a Nix-built derivation that produces the wasm binary
and pass its output to builtins.wasm. Create a derivation (e.g., via
pkgs.rustPlatform.buildRustPackage or a small stdenv.mkDerivation) that builds
the crate producing nix_wasm_plugin_nix_make.wasm, then update getDeps to use
the derivation’s output path (instead of "../target/...") while keeping function
= "getDeps" and using builtins.wasm as before so evaluation is reproducible and
hermetic.
- Around line 34-45: The link command unconditionally emits hard-coded -l flags
while buildInputs is computed from builtins.concatLists (map (obj: obj.deps)
objects); change the link step in runCommandCC's link function to derive the
linker flags from that same dependency list (the value used for buildInputs)
instead of hard-coding them: compute the deps list from builtins.concatLists
(map (obj: obj.deps) objects) (the same expression used for buildInputs) and map
that list to the appropriate -l flags, then use that generated flags string in
the g++ invocation so the linker only requests libraries actually present in
buildInputs.
- Around line 50-51: The demo.nix currently embeds machine-specific hard-coded
paths (e.g., the attr values root = /home/eelco/... and prefix = "" and similar
occurrences later) which causes non-portable generated headers; change the file
to accept these paths as parameters (e.g., add function args like root, prefix,
checkoutDir, outPath) or derive sensible defaults from environment (e.g.,
builtins.getEnv "HOME") and replace every hard-coded occurrence (root, prefix
and the other repeated path literals) with those parameters; ensure defaults are
documented in the function signature so callers can override them when
evaluating the demo.

In `@nix-wasm-plugin-nix-make/src/lib.rs`:
- Around line 57-99: The loop uses unordered HashMap/HashSet iteration
(all_files.iter(), all_includes.values(), all_deps.iter()) which makes emitted
entries non-deterministic; fix by iterating in a stable order: collect the
keys/values into Vecs, sort them (e.g. sort by file.path or string), and then
build results—specifically: gather cc file keys from all_files into a Vec, sort
by their corresponding path before the for-loop; when creating include_attrs,
collect (path, Value) pairs from all_includes into a Vec and sort by path string
before mapping to Value::make_attrset; and convert all_deps to a Vec, sort it,
then map to Value::make_string so "deps" order is stable; you can also replace
the HashMap/HashSet with BTreeMap/BTreeSet if preferred (affects
collect_transitive_includes usage).
- Around line 46-54: The current suffix_map logic loses includer context and
allows wrong resolutions; change the mapping to record entries keyed by the
includer-directory + included-suffix (or maintain a nested map dir -> basename
-> Value) instead of global suffixes: for each (value, file) from all_files use
file.path dirname as context and insert keys like "{dirname}/{suffix}" (or
populate a HashMap<String, HashMap<String, Value>> mapping dirname -> filename
-> value) so that when resolving an `#include` you first try joining the includer
dirname with the include token, then fall back to other search paths; also stop
using or_insert_with blindly for suffix collisions—ensure explicit resolution
order uses includer-context first and only uses global suffix matches as a
last-resort fallback.

---

Nitpick comments:
In `@nix-wasm-plugin-nix-make/Cargo.toml`:
- Around line 1-4: Add publish = false to the [package] section of the
Cargo.toml for the crate "nix-wasm-plugin-nix-make" to mark this workspace-only
crate as not for release; edit the [package] block (where name =
"nix-wasm-plugin-nix-make", version = "0.1.0", edition = "2021") and add publish
= false so cargo will not attempt to publish this local/path-dependent crate.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f810cfe2-c7bf-4ff3-b14c-eef1f070248f

📥 Commits

Reviewing files that changed from the base of the PR and between 454575f and 9e5572d.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
  • Cargo.toml
  • flake.nix
  • nix-wasm-plugin-nix-make/Cargo.toml
  • nix-wasm-plugin-nix-make/demo.nix
  • nix-wasm-plugin-nix-make/src/lib.rs
  • nix-wasm-rust/src/lib.rs

Comment on lines +4 to +7
getDeps = builtins.wasm {
path = ../target/wasm32-unknown-unknown/release/nix_wasm_plugin_nix_make.wasm;
function = "getDeps";
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Build the wasm plugin through Nix instead of ../target/....

This demo depends on a mutable Cargo artifact outside the store, so a clean checkout cannot evaluate it and the result is not reproducible.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nix-wasm-plugin-nix-make/demo.nix` around lines 4 - 7, The current getDeps
uses a hardcoded path to a mutable Cargo artifact; replace that with a Nix-built
derivation that produces the wasm binary and pass its output to builtins.wasm.
Create a derivation (e.g., via pkgs.rustPlatform.buildRustPackage or a small
stdenv.mkDerivation) that builds the crate producing
nix_wasm_plugin_nix_make.wasm, then update getDeps to use the derivation’s
output path (instead of "../target/...") while keeping function = "getDeps" and
using builtins.wasm as before so evaluation is reproducible and hermetic.

Comment on lines +34 to +45
link = name: objects: runCommandCC
name
{
inherit objects;
buildInputs = map (dep: pkgs'.${dep}) (builtins.concatLists (map (obj: obj.deps) objects));
}
''
mkdir -p $out/lib
g++ -o $out/lib/$name.so \
$(find $objects -name '*.o' -type f) \
-lboost_context -lboost_iostreams -lboost_url -larchive -lcrypto -lsodium -lblake3 -lbrotlicommon -lbrotlienc -lbrotlidec -lcpuid -shared
'';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Generate linker flags from the same dependency set as buildInputs.

buildInputs is filtered by obj.deps, but the link command always passes every -l... flag. As soon as one of those libraries is not in the computed deps, the link step loses that search path and fails.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nix-wasm-plugin-nix-make/demo.nix` around lines 34 - 45, The link command
unconditionally emits hard-coded -l flags while buildInputs is computed from
builtins.concatLists (map (obj: obj.deps) objects); change the link step in
runCommandCC's link function to derive the linker flags from that same
dependency list (the value used for buildInputs) instead of hard-coding them:
compute the deps list from builtins.concatLists (map (obj: obj.deps) objects)
(the same expression used for buildInputs) and map that list to the appropriate
-l flags, then use that generated flags string in the g++ invocation so the
linker only requests libraries actually present in buildInputs.

Comment on lines +50 to +51
{ root = /home/eelco/Dev/nix/src/libutil;
prefix = "";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Parameterize the checkout and output paths.

The hard-coded /home/eelco/... roots make the demo machine-specific, and the generated headers embed those same local paths into the build. Anyone else evaluating this will either fail immediately or compile with wrong constants.

Also applies to: 84-87, 111-112, 145-148

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nix-wasm-plugin-nix-make/demo.nix` around lines 50 - 51, The demo.nix
currently embeds machine-specific hard-coded paths (e.g., the attr values root =
/home/eelco/... and prefix = "" and similar occurrences later) which causes
non-portable generated headers; change the file to accept these paths as
parameters (e.g., add function args like root, prefix, checkoutDir, outPath) or
derive sensible defaults from environment (e.g., builtins.getEnv "HOME") and
replace every hard-coded occurrence (root, prefix and the other repeated path
literals) with those parameters; ensure defaults are documented in the function
signature so callers can override them when evaluating the demo.

Comment on lines +46 to +54
let mut suffix_map: HashMap<String, Value> = HashMap::new();
for (value, file) in &all_files {
let rel_path = &file.path;
let parts: Vec<&str> = rel_path.split('/').collect();
for i in 0..parts.len() {
let suffix = parts[i..].join("/");
suffix_map.entry(suffix).or_insert_with(|| *value);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

What is the search order for C/C++ quoted include directives (#include "foo.h"), and are relative include paths like ../foo.h valid?

💡 Result:

Search order for #include "foo.h" (quoted includes)

The C and C++ standards do not mandate a single universal directory order: they say the named file is searched for in an implementation-defined manner, and that if the "-form search fails, it is retried as if it were an <...> include (i.e., using the system-header search rules) [2].

What most compilers do (and what GCC documents) is this order for #include "file" [1]:

  1. Directory containing the file that has the #include directive
  2. Directories given with -iquote (left-to-right)
  3. Directories given with -I (left-to-right)
  4. Directories given with -isystem (left-to-right)
  5. Standard system include directories
  6. Directories given with -idirafter (left-to-right)

(For #include <file>, step 1 and the -iquote directories are typically skipped; it searches only the “system” set of include directories [1].)

Are relative include paths like ../foo.h valid?

Yes. The header name in #include "..." is essentially a sequence of source characters (with very few restrictions), and common preprocessors treat / as the portable path separator inside header names [1][2]. So #include "../foo.h" is a valid directive.

How it resolves is still implementation-defined, but typically:

  • With quotes, ../foo.h is resolved relative to the directory of the including file first (then through the rest of the include path, if needed) [1].
  • With angle brackets, it’s usually resolved relative to each include directory being searched (not relative to the including file) [1].

Sources: GCC “The C Preprocessor” manual (include syntax + search path) [1]; cppreference summary of standard #include semantics (implementation-defined search; "-form falls back to <...>) [2].


Quoted includes must resolve relative to the including file; suffix-based lookup breaks C/C++ semantics.

Per the C/C++ preprocessor standard, #include "file" searches first in the directory of the including file, then system paths. The current suffix_map.get(inc) lookup discards includer context entirely—it cannot resolve relative paths like "../foo.h" (which are valid and required by C/C++ semantics), and when suffixes collide, or_insert_with() silently binds to whichever file inserted first, yielding wrong transitive dependencies.

Store includes with their includer's full path as context, not as a global suffix map.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nix-wasm-plugin-nix-make/src/lib.rs` around lines 46 - 54, The current
suffix_map logic loses includer context and allows wrong resolutions; change the
mapping to record entries keyed by the includer-directory + included-suffix (or
maintain a nested map dir -> basename -> Value) instead of global suffixes: for
each (value, file) from all_files use file.path dirname as context and insert
keys like "{dirname}/{suffix}" (or populate a HashMap<String, HashMap<String,
Value>> mapping dirname -> filename -> value) so that when resolving an `#include`
you first try joining the includer dirname with the include token, then fall
back to other search paths; also stop using or_insert_with blindly for suffix
collisions—ensure explicit resolution order uses includer-context first and only
uses global suffix matches as a last-resort fallback.

Comment on lines +57 to +99
let mut results = vec![];
for (cc_file_val, cc_file) in all_files.iter() {
if !cc_file.path.ends_with(".cc") {
continue;
}
//warn!("processing {path}...", path = all_files[cc_file_val].path);
let mut all_includes: HashMap<String, Value> = HashMap::new();
let mut all_deps: HashSet<String> = HashSet::new();
let mut visited = HashSet::new();
collect_transitive_includes(
*cc_file_val,
&all_files,
&suffix_map,
&mut all_includes,
&mut all_deps,
&mut visited,
);

let include_attrs: Vec<(&str, Value)> = all_includes
.values()
.map(|path_val| {
let inc_file = &all_files[path_val];
(inc_file.path.as_str(), *path_val)
})
.collect();
let entry = Value::make_attrset(&[
("src", *cc_file_val),
("path", Value::make_string(&cc_file.path)),
("includes", Value::make_attrset(&include_attrs)),
(
"deps",
Value::make_list(
&all_deps
.iter()
.map(|s| Value::make_string(s))
.collect::<Vec<_>>(),
),
),
]);
results.push(entry);
}

Value::make_list(&results)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

What guarantees, if any, does Rust give about the iteration order of std::collections::HashMapandstd::collections::HashSet?

💡 Result:

Rust does not guarantee any particular iteration order for either std::collections::HashMap or std::collections::HashSet.

  • HashMap: its iterators (e.g., iter, keys, values, into_iter) visit entries in “arbitrary order”—you must not rely on any stable or sorted ordering. [1]
  • HashSet: likewise, its iterators visit elements in “arbitrary order.” [2]
  • More generally, the standard library describes unordered collections like HashMap as yielding items in whatever order is most convenient for the internal representation (i.e., an implementation detail that can change). [3]

If you need a defined order, use an ordered collection like BTreeMap / BTreeSet (sorted by key/element), or collect and sort before iterating. [3]

Sources: [1] (doc.rust-lang.org) [2] (doc.rust-lang.org) [3] (doc.rust-lang.org)

Citations:


🏁 Script executed:

# Find and examine the full context of the lib.rs file around lines 57-99
cd nix-wasm-plugin-nix-make || cd . 2>/dev/null
if [ -f "src/lib.rs" ]; then
  echo "=== Lines 40-110 of src/lib.rs ==="
  sed -n '40,110p' src/lib.rs | cat -n
else
  echo "File not found, searching for lib.rs:"
  find . -name "lib.rs" -type f
fi

Repository: DeterminateSystems/nix-wasm-rust

Length of output: 3060


🏁 Script executed:

# Search for where all_files is defined and its type
rg "all_files\s*[:=]" -A 2 -B 2 --context=3

Repository: DeterminateSystems/nix-wasm-rust

Length of output: 1614


🏁 Script executed:

# Search for all iterations over hash collections in the file to ensure nothing is missed
rg "\.iter\(\)" src/lib.rs -B 3 -A 1 --context=4

Repository: DeterminateSystems/nix-wasm-rust

Length of output: 126


🏁 Script executed:

# Check the context of all_includes to understand if its order matters for output
sed -n '36,46p' nix-wasm-plugin-nix-make/src/lib.rs | cat -n

Repository: DeterminateSystems/nix-wasm-rust

Length of output: 616


🏁 Script executed:

# Verify the structure of include_attrs and how it's used in the output
sed -n '36,56p' nix-wasm-plugin-nix-make/src/lib.rs | cat -n

Repository: DeterminateSystems/nix-wasm-rust

Length of output: 1066


Stabilize the emitted source, include, and dep order.

all_files.iter(), all_includes.values(), and all_deps.iter() all iterate over hash collections, producing non-deterministic ordering. Because these collections flow directly into Nix derivations, the output will be non-reproducible unless you sort or use ordered collections.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nix-wasm-plugin-nix-make/src/lib.rs` around lines 57 - 99, The loop uses
unordered HashMap/HashSet iteration (all_files.iter(), all_includes.values(),
all_deps.iter()) which makes emitted entries non-deterministic; fix by iterating
in a stable order: collect the keys/values into Vecs, sort them (e.g. sort by
file.path or string), and then build results—specifically: gather cc file keys
from all_files into a Vec, sort by their corresponding path before the for-loop;
when creating include_attrs, collect (path, Value) pairs from all_includes into
a Vec and sort by path string before mapping to Value::make_attrset; and convert
all_deps to a Vec, sort it, then map to Value::make_string so "deps" order is
stable; you can also replace the HashMap/HashSet with BTreeMap/BTreeSet if
preferred (affects collect_transitive_includes usage).

@edolstra edolstra marked this pull request as draft March 7, 2026 01:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant