This document presents a mid-level engineering design for Netsuke, a modern
build automation tool implemented in Rust. Netsuke is designed to provide the
power and dependency resolution capabilities of traditional make while
offering a significantly more intuitive, readable, and secure user experience.
This is achieved by leveraging a user-friendly YAML-based manifest, a powerful
Jinja templating engine for dynamic configuration, and the high-performance
Ninja build system as its execution backend.
At its core, Netsuke should not be conceptualized as a direct, imperative
replacement for make. Instead, it is architected as a high-level build
system compiler. This architectural paradigm is central to its design. Ninja,
the chosen execution backend, describes itself as a low-level "assembler" for
build systems.1 It is intentionally constrained, lacking features like
string manipulation or conditional logic, to ensure its primary goal: running
builds as fast as possible.2
This design choice by Ninja's authors necessitates the existence of a higher-
level generator tool. Netsuke fulfills this role. It provides a rich, user-friendly language (YAML with Jinja) for describing the what and why of a build--the project's structure, its logical rules, and its configurable parameters. Netsuke's primary responsibility is to compile this high-level description into a low-level, highly optimized execution plan that Ninja can understand and execute. This separation of concerns--Netsuke managing build logic and Ninja managing execution--is the foundational principle of the entire architecture.
The process of transforming a user's Netsukefile manifest into a completed
build artefact now follows a six-stage pipeline. This data flow validates the
manifest as YAML first, then resolves all dynamic logic into a static plan
before execution, a critical requirement for compatibility with Ninja.
- Stage 1: Manifest Ingestion. The process begins by locating and reading the user's project manifest file (e.g., Netsukefile) from the filesystem into memory as a raw string.
- Stage 2: Initial YAML Parsing. The raw string is parsed with serde_saphyr into an untyped serde_json::Value. This step ensures the manifest is valid YAML before any templating takes place.
- Stage 3: Template Expansion. Netsuke walks the parsed Value, evaluating Jinja macros, variables, and the foreach and when keys. Each mapping containing these keys is expanded with an iteration context providing item and an optional index. Variable lookups respect the precedence globals < target.vars < per-iteration locals, and this context is preserved for later rendering. At this stage Jinja must not modify the YAML structure directly; control constructs live only within these explicit keys. Structural Jinja blocks ({% ... %}) are not permitted to reshape mappings or sequences.
- Stage 4: Deserialisation & Final Rendering. The expanded Value is deserialised into strongly typed Rust structs. Jinja expressions are then rendered, but only within string fields. Structural templating using {% %} blocks is forbidden; all control flow must appear in YAML values.
- Stage 5: IR Generation & Validation. The AST is traversed to construct a canonical, fully resolved Intermediate Representation (IR) of the build. This IR represents the build as a static dependency graph with all file paths, commands, and dependencies explicitly defined. During this transformation, Netsuke performs critical validation checks. It verifies the existence of referenced rules, ensures each rule has exactly one of command or script, and ensures every target specifies exactly one of rule, command, or script. Circular dependencies and missing inputs are also detected at this stage.
- Stage 6: Ninja Synthesis & Execution. The final, validated IR is traversed by a code generator. This generator synthesizes the content of a build.ninja file, translating the IR's nodes and edges into corresponding Ninja rule and build statements. Once the file is written, Netsuke invokes the ninja executable as a subprocess, passing control to it for the final dependency checking and command-execution phase.

Netsuke's pipeline is deterministic. Given the same Netsukefile and environment variables, the generated build.ninja will be byte-for-byte identical. This property is essential for reproducible builds and makes the output suitable for caching or source control.
flowchart TD
A[Stage 1:\nManifest Ingestion] --> B[Stage 2:\nInitial YAML Parsing]
B --> C[Stage 3:\nTemplate Expansion]
C --> D[Stage 4:\nDeserialisation & Final Rendering]
D --> E[Stage 5:\nIR Generation & Validation]
E --> F[Stage 6:\nNinja Synthesis & Execution]
The architecture's multi-stage pipeline is a direct consequence of a fundamental design constraint imposed by the choice of Ninja as the backend. Ninja's remarkable speed in incremental builds stems from its simplicity; it operates on a pre-computed, static dependency graph and avoids costly runtime operations like filesystem queries (e.g., glob expansion) or string manipulation.2
At the same time, a "friendlier" build system must offer dynamic capabilities. Users will expect to define builds that can adapt to their environment, such as using different compiler flags on Linux versus Windows, or automatically discovering source files in a directory. These features are provided in Netsuke by the Jinja templating engine.
This creates a necessary architectural division. All the dynamic logic, templating, and configuration must be fully evaluated by Netsuke before Ninja is ever invoked. The point of this transition is the Intermediate Representation (IR) generated in Stage 5. The IR serves as a static snapshot of the build plan after all Jinja logic has been resolved. It is the "object code" that the Netsuke "compiler" produces, which can then be handed off to the Ninja "assembler" for execution. This mandate for a pre-computed static graph dictates the entire six-stage pipeline and establishes a clean boundary between the user-facing logic layer and the machine-facing execution layer.
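To make the Stage 6 hand-off concrete, the translation of a single IR edge into a Ninja `build` statement can be sketched as follows. The `BuildEdge` struct and `emit_build` function are illustrative names for this sketch, not Netsuke's actual IR types; only the output shape (Ninja's `build` syntax, with `||` introducing order-only dependencies) is taken from the design above.

```rust
// Hypothetical, simplified IR edge; Netsuke's real IR carries more detail.
struct BuildEdge {
    rule: String,
    outputs: Vec<String>,
    inputs: Vec<String>,
    order_only: Vec<String>,
}

// Render one IR edge as a Ninja `build` statement line.
fn emit_build(edge: &BuildEdge) -> String {
    let mut line = format!(
        "build {}: {} {}",
        edge.outputs.join(" "),
        edge.rule,
        edge.inputs.join(" ")
    );
    if !edge.order_only.is_empty() {
        // Order-only dependencies follow Ninja's `||` separator.
        line.push_str(" || ");
        line.push_str(&edge.order_only.join(" "));
    }
    line.push('\n');
    line
}

fn main() {
    let edge = BuildEdge {
        rule: "compile".into(),
        outputs: vec!["main.o".into()],
        inputs: vec!["main.c".into()],
        order_only: vec!["gen_headers".into()],
    };
    print!("{}", emit_build(&edge));
}
```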
The primary interface for the user is the Netsuke manifest file, Netsukefile.
The design of its YAML schema is paramount to achieving the goal of being
"friendlier" than make. The schema is guided by a set of core principles
aimed at maximizing readability, reducing cognitive overhead, and promoting
best practices.
- Readability: The schema prioritizes human-readability. It uses clear, descriptive keys and a structured format to make build configurations self-documenting. This stands in contrast to the often-cryptic special variables and implicit rules of Makefiles.
- Declarative Style: Users should declare the desired state of their project--the targets they want to build and the rules to build them--rather than writing imperative scripts. Netsuke is responsible for determining the necessary steps to achieve that state.
- Reusability: The schema is designed to encourage the creation of reusable components. Variables and rules are defined once and can be referenced throughout the manifest, reducing duplication and improving maintainability.
- Discoverability: The structure is intended to be intuitive. A developer familiar with YAML should be able to understand the intent of a simple Netsukefile with minimal reference to documentation.
A Netsukefile file is a YAML mapping containing a set of well-defined top-
level keys.
- netsuke_version: A mandatory string that specifies the version of the Netsuke schema the manifest conforms to (e.g., "1.0.0"). This allows for future evolution of the schema while maintaining backward compatibility. This version string should be parsed and validated using the semver crate.3
- vars: A mapping of global key-value pairs. Keys must be strings. Values may be strings, numbers, booleans, or sequences. These variables seed the Jinja templating context and drive control flow within the manifest. Non-string YAML keys (for example integers such as 1: value) trigger a parse-time diagnostic (E0001: "vars key must be a string") because Netsuke loads the values into a JSON object before Jinja evaluation.
- macros: An optional list of Jinja macro definitions. Each item provides a signature string using standard Jinja syntax and a body declared with the YAML | block style. Netsuke registers these macros in the template environment before rendering other sections.
- rules: A list of rule definitions. Each rule is a reusable template for a command, analogous to a Ninja rule block.2
- targets: The primary list of build targets. Each target defines an output, the sources it depends on, and the rule used to produce it. This corresponds to a Ninja build statement.4
- actions: A secondary list of build targets. Any target placed here is treated as { phony: true, always: false } by default.
- defaults: An optional list of target names to be built when Netsuke is invoked without any specific targets on the command line. This maps directly to Ninja's default target statement.4
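Taken together, a minimal manifest exercising each top-level key might look like the following sketch (the rule and target names here are illustrative, not prescribed by the schema):

```yaml
netsuke_version: "1.0.0"

vars:
  cc: gcc

macros:
  - signature: "greet(name)"
    body: |
      Hello {{ name }}

rules:
  - name: compile
    command: "{{ cc }} -c {{ ins }} -o {{ outs }}"
    description: "Compiling {{ outs }}"

targets:
  - name: main.o
    rule: compile
    sources: main.c

actions:
  - name: clean
    command: rm -f *.o

defaults:
  - main.o
```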
The E-R diagram below summarizes the structure of a Netsukefile and the
relationships between its components.
erDiagram
NETSUKE_MANIFEST {
string netsuke_version
map vars
list macros
list rules
list actions
list targets
list defaults
}
RULE {
string name
Recipe recipe
string description
StringOrList deps
}
TARGET {
StringOrList name
Recipe recipe
StringOrList sources
StringOrList deps
StringOrList order_only_deps
map vars
bool phony
bool always
}
RECIPE {
string command
string script
StringOrList rule
}
STRING_OR_LIST {
enum value
}
NETSUKE_MANIFEST ||--o{ RULE : contains
NETSUKE_MANIFEST ||--o{ TARGET : has_actions
NETSUKE_MANIFEST ||--o{ TARGET : has_targets
RULE }o--|| RECIPE : uses
TARGET }o--|| RECIPE : uses
TARGET }o--|| STRING_OR_LIST : uses
RECIPE }o--|| STRING_OR_LIST : uses
Each entry in the rules list is a mapping that defines a reusable action.
- name: A unique string identifier for the rule.
- command: A single command string to be executed. It may include the placeholders {{ ins }} and {{ outs }} to represent input and output files. Netsuke expands these placeholders to space-separated lists of file paths quoted for POSIX /bin/sh using the shell-quote crate (Sh mode) before hashing the action. The IR stores the fully expanded command; Ninja executes this text verbatim. After interpolation, the command must be parsable by shlex (POSIX mode). Any interpolation other than ins or outs is automatically shell-escaped.
- script: A multi-line script declared with the YAML | block style. The entire block is passed to an interpreter. If the first line begins with #!, Netsuke executes the script verbatim, respecting the shebang. Otherwise, the block is wrapped in the interpreter specified by the optional interpreter field (defaulting to /bin/sh -e). For /bin/sh scripts, each interpolation is automatically passed through the shell_escape filter unless a | raw filter is applied. Future versions will allow configurable script languages with their own escaping rules. On Windows, scripts default to powershell -Command unless the manifest's interpreter field overrides the setting. Exactly one of command, script, or rule must be provided. The manifest parser enforces this rule to prevent invalid states. Internally, these options deserialise into a shared Recipe enum. Presence of exactly one of command, script, or rule determines the variant.
- description: An optional, user-friendly string that is printed to the console when the rule is executed. This maps to Ninja's description field and improves the user's visibility into the build process.2
- deps: An optional field to configure support for C/C++-style header dependency generation. Its value specifies the format (e.g., gcc or msvc), which instructs Netsuke to generate the appropriate depfile or deps attribute in the corresponding Ninja rule.4
Each entry in targets defines a build edge; placing a target in the optional
actions list instead marks it as phony: true with always left false.
- name: The primary output file or files for this build step. This can be a single string or a list of strings.
- rule: The name of the rule (from the rules section) to use for building this target. A YAML list may be supplied to run multiple rules sequentially:
  rule:
    - build
    - clean-up
- command: A single command string to run directly for this target.
- script: A multi-line script passed to the interpreter. When present, it is defined using the YAML | block style.
Only one of rule, command, or script may be specified. The parser
validates this exclusivity during deserialisation. When multiple fields are
present, Netsuke emits a RecipeConflict error with the message "rule, command
and script are mutually exclusive".
This union deserialises into the same Recipe enum used for rules. The parser
enforces that only one variant is present and errors if multiple recipe fields
are specified.
- sources: The input files required by the command. This can be a single string or a list of strings. If any source entry matches the name of another target, that target is built first, before the current target's explicit deps.
- deps: An optional list of other target names. These targets are explicit dependencies and must be successfully built before this target can be. A change in any of these dependencies will trigger a rebuild of the current target.
- order_only_deps: An optional list of other target names that must be built before this target, but whose modification does not trigger a rebuild of this target. This maps directly to Ninja's order-only dependencies, specified with the || operator.5
- vars: An optional mapping of local variables. These variables override any global variables defined in the top-level vars section for the scope of this target only. This provides the same functionality as Ninja's build-local variables.4
- macros: An optional list of Jinja macro definitions. Each item provides a signature string using standard Jinja syntax and a body declared with the YAML | block style. Netsuke registers these macros in the template environment before rendering other sections.
- phony: When set to true, the target runs when explicitly requested even if a file with the same name exists. The default value is false.
- always: When set to true, the target runs on every invocation regardless of timestamps or dependencies. The default value is false.
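The exactly-one-of constraint on rule, command, and script can be made concrete with a simple count. In Netsuke itself this falls out of deserialising into the untagged Recipe enum; the `validate_recipe` function below is an illustrative stand-alone sketch of the same rule, not the actual parser code.

```rust
// Sketch of the "exactly one of rule, command, or script" validation.
// Netsuke enforces this during deserialisation into the untagged `Recipe`
// enum; an explicit count makes the rule visible.
fn validate_recipe(
    rule: Option<&str>,
    command: Option<&str>,
    script: Option<&str>,
) -> Result<(), String> {
    let present = [rule.is_some(), command.is_some(), script.is_some()]
        .iter()
        .filter(|&&p| p)
        .count();
    match present {
        1 => Ok(()),
        0 => Err("a target needs one of rule, command, or script".into()),
        // Mirrors the RecipeConflict diagnostic text from the design.
        _ => Err("rule, command and script are mutually exclusive".into()),
    }
}

fn main() {
    assert!(validate_recipe(Some("compile"), None, None).is_ok());
    let err = validate_recipe(Some("compile"), Some("cc"), None).unwrap_err();
    assert_eq!(err, "rule, command and script are mutually exclusive");
    println!("ok");
}
```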
Large sets of similar outputs can clutter a manifest when written individually.
Netsuke supports a foreach entry within targets to generate multiple
outputs succinctly. The foreach and optional when keys accept bare Jinja
expressions evaluated after the initial YAML pass. Each resulting value becomes
item in the target context, and the per-iteration environment is carried
forward to later rendering.
- foreach: glob('assets/svg/*.svg')
  when: item | basename != 'logo.svg'
  name: "{{ outdir }}/{{ item | basename | replace('.svg', '.png') }}"
  rule: rasterise
  sources: "{{ item }}"

The expansion flow is:
flowchart TD
A[Iterate over targets in YAML] --> B{Has foreach?}
B -- Yes --> C[Evaluate foreach expression]
C --> D[For each item:]
D --> E{Has when?}
E -- Yes --> F[Evaluate when expression]
F -- True --> G[Expand target with item/index]
F -- False --> H[Skip target]
E -- No --> G
B -- No --> I[Keep target as is]
Each element in the sequence produces a separate target. The iteration context:
- item: the current element
- index: the 0-based index (optional)
- Variables resolve with precedence globals < target.vars < iteration locals
Jinja control structures cannot shape the YAML; all templating must occur within the string values. The resulting build graph is still fully static and behaves the same as if every target were declared explicitly.
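A simplified sketch of the expansion and precedence rules follows, assuming the foreach items have already been evaluated to plain strings and the when predicate and name template are supplied as closures. Netsuke operates on a serde_json::Value tree with real Jinja expressions; the function names here are illustrative only.

```rust
use std::collections::HashMap;

// Layered variable lookup: iteration locals win over target vars,
// which win over globals (the precedence stated above).
fn resolve<'a>(
    key: &str,
    globals: &'a HashMap<String, String>,
    target_vars: &'a HashMap<String, String>,
    locals: &'a HashMap<String, String>,
) -> Option<&'a String> {
    locals
        .get(key)
        .or_else(|| target_vars.get(key))
        .or_else(|| globals.get(key))
}

// Expand one foreach target into concrete names, skipping items
// rejected by the `when` predicate. `index` is the 0-based position.
fn expand_foreach(
    items: &[String],
    when: impl Fn(&str) -> bool,
    name_template: impl Fn(&str, usize) -> String,
) -> Vec<String> {
    items
        .iter()
        .enumerate()
        .filter(|(_, item)| when(item))
        .map(|(index, item)| name_template(item, index))
        .collect()
}

fn main() {
    let items = vec!["a.svg".to_string(), "logo.svg".to_string()];
    let names = expand_foreach(
        &items,
        |item| item != "logo.svg",
        |item, _| item.replace(".svg", ".png"),
    );
    assert_eq!(names, vec!["a.png".to_string()]);

    let globals = HashMap::from([("x".to_string(), "g".to_string())]);
    let target_vars = HashMap::from([("x".to_string(), "t".to_string())]);
    let locals = HashMap::from([("x".to_string(), "l".to_string())]);
    assert_eq!(resolve("x", &globals, &target_vars, &locals).unwrap(), "l");
    println!("ok");
}
```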
To illustrate the ergonomic advantages of the Netsuke schema, the following
table compares a simple C compilation project defined in both a traditional
Makefile and a Netsukefile file. The comparison highlights Netsuke's
explicit, structured, and self-documenting nature.
| Feature | Makefile Example | Netsukefile Example |
|---|---|---|
| Variables | CC=gcc | { vars: { cc: gcc } } |
| Macros | define greet\n\t@echo Hello $$1\nendef | { macros: [ { signature: "greet(name)", body: "Hello {{ name }}" } ] } |
| Rule Definition | %.o: %.c\n\t$(CC) -c | { rules: [ { name: compile, command: "{{ cc }} -c {{ ins }} -o {{ outs }}", description: "Compiling {{ outs }}" } ] } |
| Target Build | my_program: main.o utils.o\n\t$(CC) | { targets: [ { name: my_program, rule: link, sources: [main.o, utils.o] } ] } |
| Readability | Relies on cryptic automatic variables | Uses explicit, descriptive keys (name, rule, sources) and standard YAML list/map syntax. |
Once the Jinja evaluation stage has produced a pure YAML string, the next critical step is to parse this string and deserialise it into a structured, in-memory representation. The choice of libraries and the definition of the target data structures are crucial for the robustness and maintainability of Netsuke.
Netsuke now relies on serde_saphyr for YAML parsing and serialisation. The
crate wraps the actively maintained saphyr parser while preserving the
familiar serde_yaml-style API: helpers such as from_str, from_reader, and
to_string integrate cleanly with serde derives, and the error type exposes
line and column information for diagnostics. This provides a maintained,
panic-free alternative to the archived serde_yml without forcing a redesign
of the parsing pipeline.
Because serde_saphyr intentionally omits a bespoke Value tree, Netsuke
deserialises manifests into serde_json::Value for its intermediate
transformations. The JSON value retains YAML mappings, scalars, and sequences in
data structures that are easy to traverse and mutate. Once templating and
foreach expansion complete, serde_json::from_value hydrates the strongly
typed manifest AST exactly as before.
Adopting serde_saphyr delivers the features Netsuke depends on:
- Full YAML 1.2 support with alias resolution handled during parsing.
- Drop-in compatibility with existing serde derives, keeping the AST code unchanged.
- Structured errors that carry location metadata for precise diagnostics.
An Architecture Decision Record documents the migration rationale and compatibility results; no further action is required beyond monitoring upstream releases.
The Rust structs that serde_saphyr deserialises into form the Abstract Syntax
Tree (AST) of the build manifest. These structs must precisely mirror the YAML
schema defined in Section 2. They will be defined in a dedicated module,
src/ast.rs, and annotated with #[derive(Deserialize)] (and Debug) to
enable automatic deserialisation and easy debugging.
Rust
// In src/ast.rs
use semver::Version;
use serde::Deserialize;
use std::collections::HashMap;

/// Represents the top-level structure of a Netsukefile file.
#[derive(Debug, Deserialize)]
#[serde(deny_unknown_fields)]
pub struct NetsukeManifest {
    pub netsuke_version: Version,
    #[serde(default)]
    pub vars: HashMap<String, serde_json::Value>,
    #[serde(default)]
    pub macros: Vec<MacroDef>,
    #[serde(default)]
    pub rules: Vec<Rule>,
    #[serde(default)]
    pub actions: Vec<Target>,
    pub targets: Vec<Target>,
    #[serde(default)]
    pub defaults: Vec<String>,
}

/// A Jinja macro definition, mirroring the `macros` schema entries.
#[derive(Debug, Deserialize)]
#[serde(deny_unknown_fields)]
pub struct MacroDef {
    pub signature: String,
    pub body: String,
}

/// Represents a reusable command template.
#[derive(Debug, Deserialize)]
#[serde(deny_unknown_fields)]
pub struct Rule {
    pub name: String,
    #[serde(flatten)]
    pub recipe: Recipe,
    pub description: Option<String>,
    #[serde(default)]
    pub deps: StringOrList,
    // Additional fields like 'pool' or 'restat' can be added here
    // to map to more advanced Ninja features.
}

/// A union of execution styles for both rules and targets.
#[derive(Debug, Deserialize)]
#[serde(untagged)]
pub enum Recipe {
    Command { command: String },
    Script { script: String },
    Rule { rule: StringOrList },
}

/// Represents a single build target or edge in the dependency graph.
#[derive(Debug, Deserialize)]
#[serde(deny_unknown_fields)]
pub struct Target {
    pub name: StringOrList,
    #[serde(flatten)]
    pub recipe: Recipe,
    #[serde(default)]
    pub sources: StringOrList,
    #[serde(default)]
    pub deps: StringOrList,
    #[serde(default)]
    pub order_only_deps: StringOrList,
    #[serde(default)]
    pub vars: HashMap<String, serde_json::Value>,
    /// Run this target when requested even if a file with the same name exists.
    #[serde(default)]
    pub phony: bool,
    /// Run this target on every invocation regardless of timestamps.
    #[serde(default)]
    pub always: bool,
}

/// An enum to handle fields that can be either a single string or a list of strings.
#[derive(Debug, Default, Deserialize)]
#[serde(untagged)]
pub enum StringOrList {
    #[default]
    Empty,
    String(String),
    List(Vec<String>),
}

Note: The StringOrList enum with #[serde(untagged)] provides the flexibility for users to specify single sources, dependencies, and rule names as a simple string and multiple as a list, enhancing user-friendliness.
The following minimal Netsukefile shows how the derived structures behave when unknown fields are denied.
YAML
netsuke_version: "1.0.0"
targets:
  - name: hello
    command: echo hi
Rust
use std::collections::HashMap;
use netsuke::ast::*;
use semver::Version;

let ast = NetsukeManifest {
    netsuke_version: Version::parse("1.0.0").unwrap(),
    vars: HashMap::new(),
    macros: vec![],
    rules: vec![],
    actions: vec![],
    targets: vec![Target {
        name: StringOrList::String("hello".into()),
        recipe: Recipe::Command {
            command: "echo hi".into(),
        },
        sources: StringOrList::Empty,
        deps: StringOrList::Empty,
        order_only_deps: StringOrList::Empty,
        vars: HashMap::new(),
        phony: false,
        always: false,
    }],
    defaults: vec![],
};

The integration of a templating engine like Jinja fundamentally shapes the parsing pipeline, mandating a two-pass approach. It is impossible to parse the user's Netsukefile file with serde_saphyr in a single step.
Consider a manifest containing Jinja syntax:
YAML
targets:
  - name: my_app
    sources: "{{ glob('src/*.c') }}"
    rule: compile

The value of sources, {{ glob('src/*.c') }}, is a plain YAML string. The manifest must be valid YAML before any templating occurs, so the parser can first load it into a serde_json::Value tree.
Once parsed, Netsuke performs a series of transformation stages:
- Template Expansion: The foreach and optional when keys in the raw YAML are evaluated to generate additional targets. Each iteration layers the item and index variables over the manifest's globals and any target locals.
- Deserialisation: The expanded document is deserialised into the typed NetsukeManifest AST.
- Final Rendering: Remaining string fields are rendered using Jinja, resolving expressions such as {{ glob('src/*.c') }}.
This data-first approach avoids a lossy text-rendering pre-pass and keeps YAML parsing and template evaluation cleanly separated.
The AST structures are implemented in src/ast.rs and derive Deserialize.
Unknown fields are rejected to surface user errors early. StringOrList
provides a default Empty variant, so optional lists are trivial to represent.
The manifest version is parsed using the semver crate to validate that it
follows semantic versioning rules. Global and target variable maps now share
the ManifestMap alias:
type ManifestMap = serde_json::Map<String, serde_json::Value>;

This alias preserves booleans and sequences needed for Jinja control flow while
presenting a stable public API surface. The serde_json library is built with
the preserve_order feature so the backing ManifestMap retains the insertion
order observed in the YAML manifest. This guarantees that downstream consumers
see keys in a stable sequence after foreach expansion, matching the authoring
intent and keeping diagnostics and serialised output predictable. Targets also
accept optional phony and always booleans. They default to false, making
it explicit when an action should run regardless of file timestamps. Targets
listed in the actions section are deserialised using a custom helper so they
are always treated as phony tasks. This ensures preparation actions never
generate build artefacts. Convenience functions in src/manifest.rs load a
manifest from a string or a file path, returning anyhow::Result for
straightforward error handling. Diagnostics now wrap source and manifest
identifiers in the ManifestSource and ManifestName newtypes, allowing
downstream tooling to reuse the strongly typed strings when producing errors or
logs.
The ingestion pipeline now parses the manifest as YAML before any Jinja
evaluation. A dedicated expansion pass handles foreach and when, and string
fields are rendered only after deserialisation, keeping data and templating
concerns clearly separated.
Unit tests in tests/ast_tests.rs and behavioural scenarios in
tests/features/manifest.feature exercise the deserialisation logic. They
assert that manifests fail to parse when unknown fields are present, and that a
minimal manifest round-trips correctly. A collection of sample manifests under
tests/data cover both valid and invalid permutations of the schema. These
fixtures are loaded by the tests to ensure real-world YAML files behave as
expected. This suite guards against regressions as the schema evolves.
To provide the dynamic capabilities and logical expressiveness that make a build system powerful and "friendly," Netsuke will integrate a Jinja templating engine. This engine acts as the user's primary tool for scripting and configuration within the YAML manifest.
The recommended templating engine is minijinja.
This crate is the ideal choice for several reasons. It is explicitly designed as a Rust implementation of the Jinja2 template engine, aiming for close compatibility with its syntax and behaviour.6 This is advantageous as Jinja2 is a mature, well-documented, and widely understood language, reducing the learning curve for new Netsuke users. Furthermore,
minijinja is designed with minimal dependencies, which is beneficial for
keeping Netsuke's compile times and binary size reasonable.7 Its API is
well-documented and provides first-class support for adding custom functions
and filters, which is essential for extending its capabilities to suit the
needs of a build system.8
Alternative template engines like Askama are less suitable for this use case. Askama is a type-safe engine that compiles templates into Rust code at build time.9 This model is incompatible with Netsuke's requirement to load and parse user-defined manifest files at runtime.
minijinja, with its dynamic environment and runtime rendering, is perfectly
aligned with Netsuke's architecture.
Netsuke will construct a single minijinja::Environment instance at startup.
This environment will be configured with a set of custom functions and filters
that provide build-specific functionality.
When rendering a user's Netsukefile file, the initial context provided to the
template will be constructed from the vars section of the manifest. This
allows users to define variables in their YAML and immediately reference them
within Jinja expressions. For example:
YAML
vars:
  compiler: gcc
  src_dir: src
targets:
  - name: "{{ compiler }}_output"
    sources: "{{ glob(src_dir ~ '/*.c') }}"
    #...

The vars mapping is read directly from the raw YAML before any Jinja is
evaluated. This avoids a lenient rendering pass for undefined placeholders and
keeps evaluation deterministic. The values are injected into the environment
prior to rendering.
The parser copies vars values into the environment using
Value::from_serializable. This preserves native YAML types so Jinja's
{% if %} and {% for %} constructs can branch on booleans or iterate over
sequences. Keys must be strings; any non-string key causes manifest parsing to
fail. Attempting to iterate over a non-sequence results in a render error
surfaced during manifest loading.
Netsuke allows users to declare reusable Jinja macros directly in the manifest.
These are provided in a top-level macros list where each entry defines a
signature and a body string. The body must use YAML's | block syntax so
multi-line macro definitions remain readable. All macros are registered with
the template environment before any other section is rendered.
YAML
macros:
  - signature: "greet(name)"
    body: |
      Hello {{ name }}

Macros can be invoked in any templated field using normal Jinja call syntax. For example:
rules:
  - name: hello
    command: "echo {{ greet('world') }}"

If a macro name matches a built-in function or filter, the macro overrides the
built-in definition. This mirrors Jinja's behaviour and follows minijinja
semantics where later definitions shadow earlier ones.
The manifest loader compiles each macro definition into an internal template
and registers a wrapper function that evaluates the macro on demand. The
wrapper constructs a fresh MiniJinja state for every invocation so macro calls
do not depend on the lifetime of the manifest parsing state. This preserves
MiniJinja's argument handling, including keyword parameters and caller
support, while allowing later macros to override earlier ones.
To transform minijinja from a general-purpose templating engine into a
powerful build tool, Netsuke must expose a curated set of custom functions to
the template environment. These functions will be implemented in safe Rust,
providing a secure bridge to the underlying system.
- env(var_name: &str) -> Result<String, Error>: A function that reads an environment variable from the system. This allows build configurations to be influenced by the external environment (e.g., PATH, CC). It returns an error if the variable is undefined or contains invalid UTF-8 to ensure manifests fail fast on missing inputs.
- glob(pattern: &str) -> Result<Vec<String>, Error>: Expand filesystem patterns (e.g., src/**/*.c) into a list of matched paths. Results are yielded in lexicographic order by the iterator and returned unchanged. Symlinks are followed by the glob crate by default. Matching is case-sensitive on all platforms. glob_with enforces require_literal_separator = true internally, so wildcards do not cross path separators unless ** is used. Callers may use / or \\ in patterns; these are normalized to the host platform before matching. Results contain only files (directories are ignored) and path separators are normalized to /. Leading-dot entries are matched by wildcards. Empty results are represented as []. Invalid patterns surface as SyntaxError; filesystem iteration errors surface as InvalidOperation, matching minijinja error semantics. On Unix, backslash escapes for glob metacharacters ([, ], {, }, *, ?) are preserved during separator normalization. A backslash before * or ? is kept only when the wildcard is trailing or followed by an alphanumeric, _, or -; otherwise it becomes a path separator, so config\*.yml maps to config/*.yml. On Windows, backslash escapes are not supported. This provides globbing support that Ninja itself does not offer.4
- python_version(requirement: &str) -> Result<bool, Error>: An example of a domain-specific helper function that demonstrates the extensibility of this architecture. This function would execute python --version or python3 --version using std::process::Command,10 parse the output using the semver crate,3 and compare it against a user-provided SemVer requirement string (e.g., ">=3.8"). This allows for conditional logic in the build based on toolchain versions.
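The Unix backslash heuristic described for glob can be sketched as follows. This is a stand-alone simplification of the stated rules, not Netsuke's actual implementation: escapes of [, ], {, and } are always preserved, while a backslash before * or ? survives only when the wildcard is trailing or followed by an alphanumeric, _, or -; every other backslash becomes a path separator.

```rust
// Sketch of the Unix pattern-normalisation heuristic described above.
fn normalize_pattern(pattern: &str) -> String {
    let chars: Vec<char> = pattern.chars().collect();
    let mut out = String::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c == '\\' && i + 1 < chars.len() {
            let next = chars[i + 1];
            let keep = match next {
                // Bracket/brace escapes are always preserved.
                '[' | ']' | '{' | '}' => true,
                // Wildcard escapes survive only when trailing or followed
                // by an alphanumeric, '_', or '-'.
                '*' | '?' => match chars.get(i + 2) {
                    None => true,
                    Some(&after) => after.is_alphanumeric() || after == '_' || after == '-',
                },
                _ => false,
            };
            // A dropped escape turns the backslash into a path separator.
            out.push(if keep { '\\' } else { '/' });
            i += 1; // the escaped character is emitted on the next pass
            continue;
        }
        out.push(c);
        i += 1;
    }
    out
}

fn main() {
    assert_eq!(normalize_pattern(r"config\*.yml"), "config/*.yml");
    assert_eq!(normalize_pattern(r"src\main.c"), "src/main.c");
    assert_eq!(normalize_pattern(r"a\[b\].c"), r"a\[b\].c");
    println!("ok");
}
```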
In addition to functions, custom filters provide a concise, pipe-based syntax for transforming data within templates.
- | shell_escape: A filter that takes a string or list and escapes it for safe inclusion as a single argument in a shell command. This is a non-negotiable security feature to prevent command injection vulnerabilities. The implementation will use the shell-quote crate for robust, shell-aware quoting.11
- | to_path: A filter that converts a string into a platform-native path representation, handling / and \ separators correctly.
- | parent: A filter that takes a path string and returns its parent directory.
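A minimal sketch of POSIX quoting in the spirit of the shell_escape filter; the real filter delegates to the shell-quote crate rather than hand-rolling this logic, so treat this as an illustration of the idea, not the actual escaping rules.

```rust
// Hand-rolled POSIX single-quote escaping, for illustration only.
fn shell_escape(s: &str) -> String {
    // Purely "safe" strings can pass through unchanged.
    if !s.is_empty() && s.chars().all(|c| c.is_alphanumeric() || "_-./".contains(c)) {
        return s.to_string();
    }
    // Wrap in single quotes; an embedded quote becomes '\'' (close the
    // quote, emit an escaped quote, reopen).
    format!("'{}'", s.replace('\'', r"'\''"))
}

fn main() {
    assert_eq!(shell_escape("file.txt"), "file.txt");
    assert_eq!(shell_escape("a b"), "'a b'");
    assert_eq!(shell_escape("it's"), r"'it'\''s'");
    println!("ok");
}
```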
The integration of Jinja is more than a simple convenience for string
substitution. It effectively serves as the logic layer for the entire build
system. Traditional make provides powerful but often opaque functions like
$(shell...) and $(wildcard...). Netsuke achieves and surpasses this
functionality in a much friendlier and safer way.
By implementing complex or potentially unsafe operations (like filesystem
access or command execution) as custom functions in Rust and exposing them as
simple, declarative primitives in the Jinja environment, Netsuke provides a
powerful yet controlled scripting environment. The user can write a clean,
readable template like sources: {{ glob("src/*.c") }}, and the complex,
error-prone logic of traversing the filesystem is handled by secure,
well-tested Rust code. This design pattern is the key to providing both power
and safety, fulfilling the core requirement of a system that is friendlier and
more robust than its predecessors.
Netsuke bundles a small "standard library" of Jinja helpers. These tests, filters, and functions are available to every template and give concise access to common filesystem queries, path manipulations, collection utilities, and network operations.
| Test | True when the operand… |
|---|---|
| `dir` / `file` / `symlink` | …is that object type |
| `pipe` / `block_device` / `char_device` (Unix-only) | …is that object type |
| `device` (legacy, Unix-only) | …is a block or character device |
| `present` | …exists (any type) |
| `owned` | …is owned by the current UID |
| `readable` / `writable` / `executable` | …has the corresponding permission bit for the current user |
| `empty` | …has size 0 bytes |
| `older_than(value)` | …has mtime < the given value (seconds, timedelta, or file) |
| `newer_than(value)` | …has mtime > the given value |
| `contains(substr)` | …file's text contains substr |
| `matches(regex)` | …file's text matches regex |
| `type(kind)` | …is of the file-type string supplied ("file", "dir", etc.) |
The `dir`, `file`, and `symlink` tests use cap-std's UTF-8-capable
`Dir::symlink_metadata` with camino paths to inspect the operand's
`FileType`. Because this lookup does not follow links, the `symlink` test
never reports a file or directory for the same path. On Unix the `pipe`,
`block_device`, `char_device`, and legacy `device` tests also probe the
metadata. On non-Unix targets these predicates are stubbed to always return
false so templates remain portable. Missing paths evaluate to false, while
I/O errors raise a template error.
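The shape of these predicates can be sketched with std alone. The real implementation uses cap-std's capability-scoped `Dir::symlink_metadata`; `is_type` below is a hypothetical helper that collapses all lookup errors to false, whereas the stdlib distinguishes missing paths from genuine I/O errors.

```rust
use std::fs;
use std::path::Path;

/// Sketch of the `file`/`dir`/`symlink` predicates using std's
/// `symlink_metadata`, which (like the cap-std variant) does not follow
/// links, so a symlink is never also reported as a file or directory.
fn is_type(path: &Path, kind: &str) -> bool {
    match fs::symlink_metadata(path) {
        Ok(meta) => match kind {
            "file" => meta.file_type().is_file(),
            "dir" => meta.file_type().is_dir(),
            "symlink" => meta.file_type().is_symlink(),
            _ => false,
        },
        // Missing paths map to false; a full implementation would surface
        // other I/O errors as template errors instead of swallowing them.
        Err(_) => false,
    }
}

fn main() {
    assert!(is_type(Path::new("."), "dir"));
    assert!(!is_type(Path::new("."), "file"));
    assert!(!is_type(Path::new("definitely-missing-path"), "file"));
}
```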
| Filter | Purpose |
|---|---|
| `basename` | Return the last path component |
| `dirname` | Return the parent directory |
| `with_suffix(suffix, n=1, sep='.')` | Replace the last n dotted suffix components (foo.tar.gz → foo.zip) |
| `relative_to(root)` | Make the path relative to root |
| `realpath` | Resolve symlinks to a canonical path |
| `commonpath(other)` | Longest common prefix with other |
| `expanduser` | Expand a leading ~ |
| `size` | File size in bytes |
| `contents(encoding='utf-8')` | File content as text |
| `linecount` | Number of text lines |
| `head(n=10)` / `tail(n=10)` | First / last n lines |
| `mtime` / `ctime` | Return timestamp (datetime) |
| `age(unit='s')` | Seconds (or m, h, d) since mtime |
| `date(fmt='%Y-%m-%d')` | Format mtime/ctime |
| `owner` / `group` | User / group name |
| `stat` | Full os.stat() result as a dict |
| `hash(alg='sha256')` | Hex digest of the file (md5, sha1, …) |
| `digest(n=8, alg='sha256')` | Truncated digest (e.g. build ID) |
| `base64` / `hex` | Encode bytes or a string |
| `slugify` | Make a filename-safe slug |
| `snake_case` / `camel_case` / `kebab-case` | Rename helpers |
All built-in filters use snake_case. The camel_case helper is provided in
place of camelCase so naming remains consistent with snake_case and
kebab-case.
Implementation notes:
- Filters rely on `cap-std` directories opened with ambient authority for file-system work. Callers must ensure that templates granted access to the stdlib are trusted to read from the process's working tree.
- `realpath` canonicalises the parent directory before joining the resolved entry, so results are absolute and symlink-free.
- `contents` and `linecount` currently support UTF-8 input; other encodings are rejected with an explicit error. `contents` streams data from the ambient file-system, so consumers should guard access carefully when evaluating untrusted templates.
- `hash` and `digest` accept `sha256` (default) and `sha512`. Legacy algorithms `sha1` and `md5` are cryptographically broken and are disabled by default; enabling them requires the `legacy-digests` Cargo feature and should only be done for compatibility with existing ecosystems.
- `expanduser` mirrors shell semantics by inspecting `HOME`, `USERPROFILE`, and on Windows the `HOMEDRIVE`/`HOMEPATH` or `HOMESHARE` fallbacks. Platform-specific forms such as `~user` remain unsupported.
- `with_suffix` removes dotted suffix segments (default `n = 1`) before appending the provided suffix.
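The `with_suffix` rule can be sketched with plain string manipulation. The function name and the bare-file-name assumption below are this sketch's, not the stdlib's; the real filter also handles directory components and alternate separators.

```rust
/// Sketch of the `with_suffix` filter: strip the last `n` `sep`-delimited
/// suffix segments from a file name, then append the new suffix.
fn with_suffix(name: &str, suffix: &str, n: usize, sep: char) -> String {
    let mut stem = name;
    for _ in 0..n {
        match stem.rfind(sep) {
            // The `idx > 0` guard keeps leading-dot names like ".bashrc" intact.
            Some(idx) if idx > 0 => stem = &stem[..idx],
            _ => break, // fewer suffix segments than requested
        }
    }
    format!("{stem}{suffix}")
}

fn main() {
    assert_eq!(with_suffix("app.c", ".o", 1, '.'), "app.o");
    // Two segments stripped reproduces the table's foo.tar.gz -> foo.zip example.
    assert_eq!(with_suffix("foo.tar.gz", ".zip", 2, '.'), "foo.zip");
    assert_eq!(with_suffix(".bashrc", ".bak", 1, '.'), ".bashrc.bak");
}
```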
Netsuke ships a cross-platform which filter that deterministically resolves
executables inside the MiniJinja environment without breaking the static graph
or purity guarantees of Stages 3 and 4. The filter follows the data-first
template discipline: it is used from string values and keeps all structural
logic in Rust.
- Usage: `{{ "gcc" | which(all=false, canonical=false, fresh=false, cwd_mode="auto") }}`
- Alias: A helper function `which(name, **kwargs)` mirrors the filter so manifests can call it directly where piping would be unwieldy.
- Returns: A string path when `all` is `false`, or a list of candidate paths ordered by `PATH` precedence when `all` is `true`.
The filter accepts four keyword arguments:
- `all` (bool, default `false`) — emit every match, similar to `which -a`.
- `canonical` (bool, default `false`) — resolve symlinks with `std::fs::canonicalize` after discovery, deduplicating on canonical paths while preserving discovery order.
- `fresh` (bool, default `false`) — bypass the per-process cache for this lookup without flushing previous entries.
- `cwd_mode` (`"auto" | "never" | "always"`, default `"auto"`) — control how the current working directory is injected into the search path.
Semantics honour platform conventions while enforcing predictable behaviour:
- On POSIX, names containing `/` skip `PATH` traversal and are validated directly. Executability requires a regular file with at least one execute bit. Empty `PATH` segments (leading, trailing, or `::`) map to the working directory when `cwd_mode` is `"auto"` or `"always"`.
- On Windows, the lookup respects `PATHEXT` when the command lacks an extension. Comparisons are case-insensitive, results normalise both slash styles, and `cwd_mode` defaults to skipping the working directory to avoid the platform's surprising "search CWD first" rule. Opting in via `"always"` restores that behaviour.
- Canonicalisation happens after discovery and only when requested, so manifests can balance reproducibility against host-specific absolute paths.
The resolver keeps a small LRU cache keyed by the command, a fingerprint of
PATH/PATHEXT, the working directory, and the cache-relevant options (all,
canonical, cwd_mode). Entries are validated once at insertion; cache reads
no longer re-probe executability, keeping the hot path lean. Because fresh
only controls bypass behaviour, it is stripped from the cache key so fresh
lookups still repopulate the cache for subsequent calls. The fingerprint means
environment changes invalidate keys without cloning large strings, and the
helper remains pure because all inputs still derive from the manifest or
process environment. Callers can request a bypass with fresh=true when they
need to observe recent toolchain changes during a long session.
Cache capacity defaults to 64 entries, covering typical PATH sizes without
overcommitting memory, and can be tuned via
StdlibConfig::with_which_cache_capacity for hosts with unusually large or
tiny search paths. Zero is rejected to keep the cache usable.
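The cache-key scheme described above can be sketched as follows. The struct and function names are illustrative, not the module's real identifiers, and std's `DefaultHasher` stands in for whatever hasher the implementation actually uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Sketch of a `which` cache key: the command, a compact fingerprint of
/// PATH/PATHEXT, the working directory, and the cache-relevant options.
/// `fresh` is deliberately absent because it controls bypass, not identity.
#[derive(Hash, PartialEq, Eq, Clone, Debug)]
struct WhichKey {
    command: String,
    env_fingerprint: u64,
    cwd: String,
    all: bool,
    canonical: bool,
    cwd_mode: u8,
}

/// Hash PATH/PATHEXT into a u64 so environment changes invalidate entries
/// without cloning the full strings into every key.
fn fingerprint(path: &str, pathext: Option<&str>) -> u64 {
    let mut h = DefaultHasher::new();
    path.hash(&mut h);
    pathext.hash(&mut h);
    h.finish()
}

fn main() {
    let a = fingerprint("/usr/bin:/bin", None);
    let b = fingerprint("/usr/bin:/bin:/opt/bin", None);
    // Different environments yield different fingerprints, so stale cache
    // entries miss instead of returning outdated paths.
    assert_ne!(a, b);
    assert_eq!(a, fingerprint("/usr/bin:/bin", None));
    let key = WhichKey {
        command: "gcc".into(),
        env_fingerprint: a,
        cwd: "/work".into(),
        all: false,
        canonical: false,
        cwd_mode: 0,
    };
    assert_eq!(key, key.clone());
}
```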
Errors follow the design’s actionable diagnostic model. Missing executables
raise netsuke::jinja::which::not_found with context on how many PATH
entries were inspected, a shortened preview of the path list, and platform
appropriate hints (for example suggesting cwd_mode="always" on Windows).
Invalid arguments surface as netsuke::jinja::which::args.
Unit tests cover POSIX and Windows specifics, canonical deduplication, cache reuse, and list-all semantics. Behavioural MiniJinja fixtures exercise the filter in Stage 3/4 renders to prove determinism across repeated invocations with identical environments.
Workspace fallback traversals are bounded to a depth of six, skip heavy
directories such as .git, target, node_modules, dist, and build, and
honour the NETSUKE_WHICH_WORKSPACE environment variable (set to
0/false/off to disable) to avoid surprising latency on large trees.
Sequence of the resolver when falling back to the workspace:
sequenceDiagram
participant "Caller" as "Caller"
participant "WhichResolver" as "WhichResolver"
participant "EnvSnapshot" as "EnvSnapshot"
participant "Lookup" as "lookup() in lookup.rs"
participant "HandleMiss" as "handle_miss()"
participant "SearchWorkspace" as "search_workspace()"
"Caller"->>"WhichResolver": "resolve(command, options)"
"WhichResolver"->>"EnvSnapshot": "capture(cwd_override)"
"EnvSnapshot"-->>"WhichResolver": "EnvSnapshot { cwd, raw_path }"
"WhichResolver"->>"Lookup": "lookup(env, command, options)"
"Lookup"->>"Lookup": "search PATH directories for matches"
alt "matches found"
"Lookup"-->>"WhichResolver": "Vec<Utf8PathBuf> (maybe canonicalised)"
"WhichResolver"-->>"Caller": "Ok(matches)"
else "no matches in PATH"
"Lookup"->>"HandleMiss": "handle_miss(env, command, options, dirs)"
"HandleMiss"->>"HandleMiss": "check if 'raw_path' is empty"
alt "PATH empty and 'cwd_mode' != 'Never'"
"HandleMiss"->>"SearchWorkspace": "search_workspace(env.cwd, command, options.all, skip_dirs)"
"SearchWorkspace"->>"SearchWorkspace": "walk workspace with 'WalkDir' and filter executables"
"SearchWorkspace"-->>"HandleMiss": "discovered paths (possibly empty)"
alt "discovered not empty"
alt "options.canonical is true"
"HandleMiss"->>"HandleMiss": "canonicalise(discovered)"
"HandleMiss"-->>"Lookup": "canonical paths"
else "options.canonical is false"
"HandleMiss"-->>"Lookup": "discovered paths"
end
"Lookup"-->>"WhichResolver": "Vec<Utf8PathBuf> from workspace"
"WhichResolver"-->>"Caller": "Ok(matches)"
else "discovered empty"
"HandleMiss"-->>"Lookup": "Error(not_found_error)"
"Lookup"-->>"WhichResolver": "Error"
"WhichResolver"-->>"Caller": "Err(not_found)"
end
else "PATH not empty or 'cwd_mode' is 'Never'"
"HandleMiss"-->>"Lookup": "Error(not_found_error)"
"Lookup"-->>"WhichResolver": "Error"
"WhichResolver"-->>"Caller": "Err(not_found)"
end
end
Workspace traversal honours a configurable skip list to avoid expensive scans
of tool caches and IDE metadata. The default skips .git, target,
node_modules, .idea, and .vscode, and callers can replace the list via
StdlibConfig::with_workspace_skip_dirs. Entries are normalised
case-insensitively on Windows so users can pass either casing without surprises.
Structural view of the which module and configuration wiring:
classDiagram
class StdlibConfig {
+workspace_root_path() -> OptionalPath
+workspace_skip_dirs() -> StringList
+which_cache_capacity() -> NonZeroUsize
}
class Environment {
+register_with_config(config: StdlibConfig)
}
class WhichModule {
+register(env: Environment, config: WhichConfig)
}
class WhichResolver {
-cache: LruCache
-cwd_override: OptionalPath
-workspace_skips: WorkspaceSkipList
+new(cwd_override: OptionalPath, skips: WorkspaceSkipList, cache_capacity: NonZeroUsize) -> Result
+resolve(command: String, options: WhichOptions) -> Result
}
class EnvSnapshot {
+cwd: Utf8PathBuf
+raw_path: OptionalString
+capture(cwd_override: OptionalPath) -> Result
}
class WhichOptions {
+cwd_mode: CwdMode
+canonical: bool
+all: bool
+fresh: bool
}
class WhichConfig {
+new(cwd_override: OptionalPath, skips: WorkspaceSkipList, cache_capacity: NonZeroUsize) -> WhichConfig
}
class WorkspaceSkipList {
+default() -> WorkspaceSkipList
+from_names(names: StringList) -> WorkspaceSkipList
}
class CwdMode {
<<enumeration>>
+Never
+OtherModes
}
Environment --> StdlibConfig : uses
Environment --> WhichModule : calls register
StdlibConfig --> WhichModule : provides workspace_root_path, skip dirs, cache capacity
WhichModule --> WhichResolver : constructs via new(cwd_override, skips, cache_capacity)
WhichResolver --> EnvSnapshot : calls capture(cwd_override)
WhichResolver --> WhichOptions : reads lookup options
WhichResolver --> WorkspaceSkipList : reads traversal filters
WhichOptions --> CwdMode : uses cwd_mode
sequenceDiagram
actor "Developer" as "Developer"
participant "TestRunner" as "Rust test binary"
participant "CliWorld" as "CliWorld"
participant "Cucumber" as "Cucumber runner"
participant "FS" as "Feature files under 'tests/features'"
"Developer"->>"TestRunner": "run 'cargo test' (including cucumber tests)"
"TestRunner"->>"CliWorld": "create world instance"
"CliWorld"->>"CliWorld": "configure via 'cucumber()'"
"CliWorld"->>"Cucumber": "builder with 'max_concurrent_scenarios(1)'"
"Cucumber"->>"FS": "discover '.feature' files in 'tests/features'"
"Cucumber"->>"CliWorld": "execute scenarios sequentially (max 1)"
"CliWorld"-->>"Cucumber": "scenario results (stdout, stderr, exit codes)"
"Cucumber"-->>"TestRunner": "aggregate results and 'run_and_exit'"
"TestRunner"-->>"Developer": "process exit code and output with improved diagnostics"
Figure: Which resolver control flow with cache lookups and workspace fallback.
sequenceDiagram
participant Caller
participant WhichResolver
participant Cache
participant EnvSnapshot
participant Lookup
participant Workspace
Caller->>WhichResolver: resolve(command, options)
activate WhichResolver
WhichResolver->>EnvSnapshot: capture(cwd_override)
activate EnvSnapshot
EnvSnapshot-->>WhichResolver: env snapshot
deactivate EnvSnapshot
WhichResolver->>Cache: compute key (command, fingerprint, cwd, options)
alt cache hit (unless fresh=true)
Cache-->>WhichResolver: cached matches
else cache miss or fresh
WhichResolver->>Lookup: lookup(command, env, options)
activate Lookup
alt direct path
Lookup->>Lookup: resolve_direct(command, env, options)
else PATH search
Lookup->>Lookup: iterate resolved_dirs, collect candidates
end
alt found
Lookup-->>WhichResolver: matches
else not found in PATH
Lookup->>Workspace: fallback search (if enabled)
activate Workspace
Workspace-->>Lookup: candidates from workspace
deactivate Workspace
Lookup-->>WhichResolver: matches or not_found error
end
deactivate Lookup
WhichResolver->>Cache: store(key, matches)
end
WhichResolver-->>Caller: Result<Vec<Utf8PathBuf>, Error>
deactivate WhichResolver
| Filter | Purpose |
|---|---|
| `uniq` | De-duplicate a list (preserving order) |
| `flatten` | Deep flatten of arbitrarily nested lists |
| `group_by(attr)` | Dict keyed on attr of list items |
| `zip(other)` | Pairwise tuples of two lists |
| `version_compare(other, op='>=')` | SemVer comparison ('<', '<=', '==', …) |
Implementation notes for collection filters:
- `uniq` stores values in an `IndexSet`, so duplicates are removed with O(n) complexity while preserving the original order according to MiniJinja's equality semantics.
- `flatten` recurses through nested sequences and iterables; scalars raise an `InvalidOperation` error to avoid silently iterating over strings or other unintended inputs.
- `group_by` returns an insertion-order-preserving mapping keyed by the original value so lookups via attribute names and bracket syntax remain in sync. Empty attribute names and items without the attribute surface an `InvalidOperation` error so templates fail loudly rather than mis-grouping data.
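The order-preserving contract of `uniq` can be sketched with std alone. The real filter uses an `IndexSet` over MiniJinja values; this sketch pairs a `HashSet` of seen items with an output `Vec` to the same effect.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Order-preserving de-duplication, as the `uniq` filter guarantees.
/// The first sighting of each value is kept; later duplicates are dropped.
fn uniq<T: Eq + Hash + Clone>(items: &[T]) -> Vec<T> {
    let mut seen = HashSet::new();
    let mut out = Vec::new();
    for item in items {
        // `insert` returns false for values already seen.
        if seen.insert(item.clone()) {
            out.push(item.clone());
        }
    }
    out
}

fn main() {
    assert_eq!(uniq(&[3, 1, 3, 2, 1]), vec![3, 1, 2]);
    assert_eq!(uniq(&["a", "b", "a"]), vec!["a", "b"]);
}
```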
| Name | Kind | Purpose |
|---|---|---|
| `fetch(url, cache=False)` | function | Retrieve a URL and return its content (str/bytes) |
| `http_head(url)` | function | Return a headers dict |
| `download(url, dest)` | function | Idempotent file download (returns dest) |
| `shell(cmd)` | filter | Pipe a value to an arbitrary shell command; marks the template impure |
| `grep`, `sed`, `awk`, `cut`, `wc`, `tr` | filters | Canonical wrappers implemented via shell() for convenience |
Using shell() marks the template as impure and disables caching of the
rendered YAML between Stage 2 and Stage 3. This avoids accidental reuse of
results that depend on external commands.
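The impurity flag that gates this caching behaviour can be sketched with a shared atomic. Names are illustrative; the design calls the shared state `StdlibState` and registers it alongside the helpers.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Sketch of the shared impurity flag: helpers like `shell()` flip it, and
/// the renderer checks it before reusing a cached Stage 2 result.
#[derive(Clone, Default)]
struct StdlibState {
    impure: Arc<AtomicBool>,
}

impl StdlibState {
    /// Called by helpers that interact with the outside world.
    fn mark_impure(&self) {
        self.impure.store(true, Ordering::Relaxed);
    }
    /// Checked before reusing a memoised render.
    fn is_impure(&self) -> bool {
        self.impure.load(Ordering::Relaxed)
    }
}

fn main() {
    let state = StdlibState::default();
    assert!(!state.is_impure()); // caching allowed before any shell() call
    state.mark_impure(); // a shell() invocation flips the flag
    assert!(state.is_impure()); // the cached render must now be discarded
}
```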
Implementation details:
- `fetch` issues HTTP requests through the `ureq` client. When caching is enabled, a SHA-256 digest of the URL becomes the cache key and responses are written beneath `.netsuke/fetch` inside the workspace. Directories are opened via capability-restricted handles from `StdlibConfig`. Templates can no longer override the cache path, ensuring caches remain bounded by the workspace. Remote fetches and cache writes mark the stdlib state as impure so callers can discard memoised renders, while cache hits remain pure and preserve memoised renders.
- `fetch` enforces a configurable response limit (default 8 MiB) and streams cached downloads directly to disk. Exceeding the budget aborts the request with an error that quotes the configured byte cap so template authors can adjust their expectations. Cache reads reuse the same guard, preventing stale oversized entries from leaking unbounded data back into the renderer.
- `fetch` validates URLs against a policy that allows only `https://` by default. Operators can expand the allowlist with `--fetch-allow-scheme <SCHEME>`, declare explicit host allowlists via `--fetch-allow-host <HOST>` and `--fetch-default-deny`, and block individual hosts through `--fetch-block-host <HOST>`. Policy failures abort before a network call and leave the template marked pure.
- `manifest::from_path` derives the workspace root from the manifest file's directory before registering the stdlib. This keeps caches scoped to the manifest tree even when the CLI evaluates a manifest from another working directory.
- `shell` and `grep` spawn the platform shell (`sh` or `cmd.exe`) with POSIX single-quoted arguments emitted via `shell-quote`. The stdlib registers a shared `StdlibState` that flips an `impure` flag whenever these helpers execute so callers can detect templates that interacted with the outside world.
- `shell` and `grep` enforce a configurable stdout capture limit (default 1 MiB) via `StdlibConfig::with_command_max_output_bytes`. Exceeding the limit raises an error that quotes the configured budget so manifests can adjust.
- Templates can request streaming by passing `{'mode': 'tempfile'}` as the second filter argument. Streaming writes stdout to a temporary file guarded by `StdlibConfig::with_command_max_stream_bytes`, which defaults to 64 MiB to prevent runaway disk usage while still tolerating deliberately large outputs.
- The command helpers manage pipe budgets using a `PipeSpec`/`PipeLimit` tracker. Each pipe spawns a dedicated reader thread that records how many bytes were drained and aborts once the configured limit is exceeded, surfacing an `OutputLimit` diagnostic that names the stream and mode. When streaming is requested, the reader persists data to a temporary file, keeping the limit in place so exceptionally large outputs are rejected before the filesystem fills up.
- The `StdlibConfig::into_components` helper consumes the builder and hands owned network/command configurations to the registration routines, avoiding needless cloning of the capability handles.
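The byte-budget contract of the pipe tracker can be sketched as follows. The `record` method mirrors the described behaviour; the error type and field names here are illustrative stand-ins for the real `OutputLimit` diagnostic.

```rust
/// Sketch of the pipe-budget tracker: record drained bytes and abort once
/// the configured cap is exceeded.
struct PipeLimit {
    max_bytes: usize,
    seen: usize,
}

#[derive(Debug, PartialEq)]
struct OutputLimitExceeded {
    max_bytes: usize,
}

impl PipeLimit {
    fn new(max_bytes: usize) -> Self {
        Self { max_bytes, seen: 0 }
    }

    /// Called from the reader thread after each chunk is drained.
    fn record(&mut self, read: usize) -> Result<(), OutputLimitExceeded> {
        self.seen = self.seen.saturating_add(read);
        if self.seen > self.max_bytes {
            // Name the budget in the error so manifests can adjust it.
            Err(OutputLimitExceeded { max_bytes: self.max_bytes })
        } else {
            Ok(())
        }
    }
}

fn main() {
    let mut limit = PipeLimit::new(8);
    assert!(limit.record(4).is_ok());
    assert!(limit.record(4).is_ok()); // exactly at the cap is still fine
    assert_eq!(limit.record(1), Err(OutputLimitExceeded { max_bytes: 8 }));
}
```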
classDiagram
class read_pipe {
+read_pipe<R>(reader: R, spec: PipeSpec): Result<PipeOutcome, CommandFailure>
}
class read_pipe_capture {
+read_pipe_capture<R>(reader: R, limit: PipeLimit): Result<PipeOutcome, CommandFailure>
}
class read_pipe_tempfile {
+read_pipe_tempfile<R>(reader: R, limit: PipeLimit): Result<PipeOutcome, CommandFailure>
}
read_pipe --> read_pipe_capture : calls
read_pipe --> read_pipe_tempfile : calls
class PipeSpec {
+into_limit(): PipeLimit
+mode(): OutputMode
}
class PipeOutcome {
<<enum>>
Bytes(Vec<u8>)
Tempfile(Utf8PathBuf)
}
class CommandFailure {
<<enum>>
Io
StreamPathNotUtf8
}
class PipeLimit {
+record(read: usize): Result<(), CommandFailure>
}
class OutputMode {
<<enum>>
Capture
Tempfile
}
read_pipe ..> PipeSpec : uses
read_pipe_capture ..> PipeLimit : uses
read_pipe_tempfile ..> PipeLimit : uses
read_pipe_capture ..> PipeOutcome : returns
read_pipe_tempfile ..> PipeOutcome : returns
read_pipe_capture ..> CommandFailure : error
read_pipe_tempfile ..> CommandFailure : error
Custom external commands can be registered as additional filters. These
should be marked pure when safe for caching, or impure otherwise.
| Name | Kind | Purpose |
|---|---|---|
| `now()` | function | Current datetime (UTC by default) |
| `timedelta(**kwargs)` | function | Convenience creator for age comparisons |
The now() helper produces an object that renders as an ISO 8601
timestamp and exposes iso8601, unix_timestamp, and offset accessors so
templates can serialize or compare values without string parsing. It defaults
to UTC but accepts an offset="+HH:MM" keyword argument that re-bases the
captured time on another fixed offset. Time is captured lazily when the helper
executes so behaviour remains deterministic during a render.
timedelta(**kwargs) constructs a duration object that renders using the
ISO 8601 duration grammar (for example, P1DT2H30M5.75025S). The helper
accepts integer keyword arguments weeks, days, hours, minutes,
seconds, milliseconds, microseconds, and nanoseconds, allowing callers
to describe durations at nanosecond precision. Arguments may be negative, but
overflow or non-integer inputs raise InvalidOperation errors so templates
cannot silently wrap. The resulting object exposes .iso8601, .seconds, and
.nanoseconds attributes for downstream predicates.
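The ISO 8601 rendering can be sketched from whole seconds plus nanoseconds. `iso8601_duration` is a hypothetical helper for illustration; it handles only non-negative durations, whereas the real `timedelta` also accepts negatives.

```rust
/// Sketch of `timedelta`'s ISO 8601 rendering, matching the grammar of
/// examples like P1DT2H30M5.75025S.
fn iso8601_duration(total_secs: u64, nanos: u32) -> String {
    let days = total_secs / 86_400;
    let hours = (total_secs % 86_400) / 3_600;
    let minutes = (total_secs % 3_600) / 60;
    let seconds = total_secs % 60;
    let mut out = String::from("P");
    if days > 0 {
        out.push_str(&format!("{days}D"));
    }
    out.push('T');
    if hours > 0 {
        out.push_str(&format!("{hours}H"));
    }
    if minutes > 0 {
        out.push_str(&format!("{minutes}M"));
    }
    if nanos > 0 {
        // Zero-pad to nine digits, then trim trailing zeros: 5.75025S.
        let frac = format!("{nanos:09}");
        out.push_str(&format!("{seconds}.{}S", frac.trim_end_matches('0')));
    } else {
        out.push_str(&format!("{seconds}S"));
    }
    out
}

fn main() {
    // 1 day, 2 hours, 30 minutes, 5.75025 seconds.
    let secs = 86_400 + 2 * 3_600 + 30 * 60 + 5;
    assert_eq!(iso8601_duration(secs, 750_250_000), "P1DT2H30M5.75025S");
    assert_eq!(iso8601_duration(90, 0), "PT1M30S");
}
```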
{% if "config.yaml" is file and "config.yaml" is readable %}
{{ "config.yaml" | contents | grep("version") }}
{% endif %}
{{ "src/app.c" | basename | with_suffix(".o") }}
{{ fetch('https://example.com/data.csv', cache=True) | head(5) }}
After the user's manifest has been deserialized into the AST and remaining
string fields have been rendered by Jinja, the next phase is to transform this
high-level representation into a format suitable for the Ninja backend. This is
accomplished via a two-step process: converting the AST into a canonical
Intermediate Representation (IR), and then synthesizing the final build.ninja
file from that IR.
The Intermediate Representation is a critical architectural component that serves as the static, fully resolved, and validated representation of the entire build graph. It is the bridge between the user-facing front-end (the YAML schema and its corresponding AST) and the machine-facing back-end (the Ninja file format).
The primary purpose of the IR is to create a decoupling layer. This abstraction
barrier allows the front-end and back-end to evolve independently. For example,
the YAML schema could be significantly redesigned in a future version of
Netsuke, but as long as the transformation logic is updated to produce the same
stable IR, the Ninja generation back-end would require no changes. Conversely,
if the decision were made to support an alternative execution back-end (e.g., a
distributed build system), only a new generator module (IR -> NewBackend)
would need to be written, leaving the entire front-end parsing and validation
logic untouched.
Importantly, the IR contains no Ninja-isms. Placeholders such as $in and
$out are resolved to plain lists of file paths, and command strings are
expanded before hashing. This deliberate absence of Ninja-specific syntax makes
the IR a stable contract that future back-ends--distributed builders, remote
executors, or otherwise--can consume without modification.
Furthermore, the IR is the ideal stage at which to perform graph-level analysis and optimizations, such as detecting circular dependencies, pruning unused build targets, or identifying duplicate build actions.
The IR data structures are designed to closely mirror the conceptual model of the Ninja build system, which consists of "Action" nodes (commands) and "Target" nodes (files).5 This close mapping simplifies the final code generation step.
Rust
// In src/ir.rs
use std::collections::HashMap;
use camino::Utf8PathBuf;
/// The complete, static build graph.
pub struct BuildGraph {
/// A map of all unique actions (rules) in the build.
/// The key is a hash of a canonical JSON serialisation of the action's
/// properties to enable deduplication.
pub actions: HashMap<String, Action>,
/// A map of all target files to be built. The key is the output path.
pub targets: HashMap<Utf8PathBuf, BuildEdge>,
/// A list of targets to build by default.
pub default_targets: Vec<Utf8PathBuf>,
}
/// Represents a reusable command, analogous to a Ninja 'rule'.
pub struct Action {
pub recipe: Recipe,
pub description: Option<String>,
pub depfile: Option<String>, // Template for the .d file path, e.g., "$out.d"
pub deps_format: Option<String>, // "gcc" or "msvc"
pub pool: Option<String>,
pub restat: bool,
}
/// Represents a single build statement, analogous to a Ninja 'build' edge.
/// It connects a set of inputs to a set of outputs via an Action. The `phony`
/// and `always` flags control execution when outputs already exist or when
/// timestamps would normally skip the step.
pub struct BuildEdge {
/// The unique identifier of the Action used for this edge.
pub action_id: String,
/// Explicit inputs that, when changed, trigger a rebuild.
pub inputs: Vec<Utf8PathBuf>,
/// Outputs explicitly generated by the command.
pub explicit_outputs: Vec<Utf8PathBuf>,
/// Outputs implicitly generated by the command. Maps to Ninja's '|' syntax.
pub implicit_outputs: Vec<Utf8PathBuf>,
/// Dependencies that must be built first but do not trigger a rebuild on change.
/// Maps to Ninja's '||' syntax.
pub order_only_deps: Vec<Utf8PathBuf>,
/// Run this edge when requested even if the output file already exists.
pub phony: bool,
/// Run this edge on every invocation regardless of timestamps.
pub always: bool,
}

classDiagram
class BuildGraph {
+HashMap<String, Action> actions
+HashMap<Utf8PathBuf, BuildEdge> targets
+Vec<Utf8PathBuf> default_targets
}
class Action {
+Recipe recipe
+Option<String> description
+Option<String> depfile
+Option<String> deps_format
+Option<String> pool
+bool restat
}
class BuildEdge {
+String action_id
+Vec<Utf8PathBuf> inputs
+Vec<Utf8PathBuf> explicit_outputs
+Vec<Utf8PathBuf> implicit_outputs
+Vec<Utf8PathBuf> order_only_deps
+bool phony
+bool always
}
class Recipe {
<<enum>>
Command
Script
Rule
}
class ninja_gen {
+generate(graph: &BuildGraph) String
}
BuildGraph "1" o-- "many" Action : actions
BuildGraph "1" o-- "many" BuildEdge : targets
Action "1" o-- "1" Recipe
BuildEdge "1" --> "1" Action : action_id
ninja_gen ..> BuildGraph : uses
ninja_gen ..> Action : uses
ninja_gen ..> BuildEdge : uses
ninja_gen ..> Recipe : uses
The core logic of the validation stage is a function, ir::from_manifest, that
consumes a NetsukeManifest (the AST) and produces a BuildGraph (the IR).
This transformation involves several steps:
1. Rule Collection: Insert each entry in `manifest.rules` into a `HashMap` keyed by its name. Rules are stored as templates and are not deduplicated at this stage.
2. Target Expansion: Iterate through `manifest.targets` and the optional `manifest.actions`. Entries in `actions` are treated identically to targets but with `phony` defaulting to `true`. For each item, resolve all strings into `Utf8PathBuf`s and resolve all dependency names against other targets.
3. Action Registration and Edge Creation: For each expanded target, resolve the referenced rule template, interpolate its command with the target's input and output paths, and register the resulting `ir::Action` in the `actions` map. Actions are hashed on the fully resolved command and file set, so identical rule templates yield distinct actions when their paths differ. Create a corresponding `ir::BuildEdge` linking the target to the action identifier and transfer the `phony` and `always` flags.
4. Graph Validation: As the graph is constructed, perform validation checks. This includes ensuring that every rule referenced by a target exists in the `actions` map and running a cycle detection algorithm (e.g., a depth-first search maintaining a visitation state) on the dependency graph to fail early on circular dependencies.
The implemented algorithm performs a depth-first traversal of the target graph and maintains a recursion stack. Order-only dependencies are ignored during this search. Self-edges are rejected immediately, and encountering a node that is already on the stack indicates a cycle. The stack slice from the first occurrence of that node forms the cycle and is returned in `IrGenError::CircularDependency` for improved debugging. The cycle list is rotated so the lexicographically smallest node appears first, ensuring deterministic error messages.
Traversal state is implemented in the dedicated `ir::cycle` module. Its `CycleDetector` helper owns the recursion stack and visitation map. Keys are cloned from the `targets` map so traversal leaves the input graph untouched. Missing dependencies encountered during traversal are logged, collected, and returned alongside any cycle to aid diagnostics.
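The stack-based detection described above can be sketched in a few lines. This is an illustrative reduction of `ir::cycle::CycleDetector`: the rotation to the lexicographically smallest node and the missing-dependency collection are omitted, and `find_cycle` is a hypothetical name.

```rust
use std::collections::HashMap;

/// Depth-first cycle detection with an explicit recursion stack. Hitting a
/// node already on the stack yields the stack slice from its first
/// occurrence, which is exactly the cycle.
fn find_cycle<'a>(graph: &HashMap<&'a str, Vec<&'a str>>) -> Option<Vec<String>> {
    fn visit<'a>(
        node: &'a str,
        graph: &HashMap<&'a str, Vec<&'a str>>,
        stack: &mut Vec<&'a str>,
        done: &mut Vec<&'a str>,
    ) -> Option<Vec<String>> {
        if done.contains(&node) {
            return None; // fully explored earlier, cannot start a new cycle
        }
        if let Some(pos) = stack.iter().position(|n| *n == node) {
            // The stack slice from the first occurrence forms the cycle.
            return Some(stack[pos..].iter().map(|s| s.to_string()).collect());
        }
        stack.push(node);
        for &dep in graph.get(node).into_iter().flatten() {
            if let Some(cycle) = visit(dep, graph, stack, done) {
                return Some(cycle);
            }
        }
        stack.pop();
        done.push(node);
        None
    }
    let mut stack = Vec::new();
    let mut done = Vec::new();
    for &node in graph.keys() {
        if let Some(cycle) = visit(node, graph, &mut stack, &mut done) {
            return Some(cycle);
        }
    }
    None
}

fn main() {
    let mut graph = HashMap::new();
    graph.insert("a", vec!["b"]);
    graph.insert("b", vec!["c"]);
    graph.insert("c", vec!["a"]); // a -> b -> c -> a
    assert_eq!(find_cycle(&graph).map(|c| c.len()), Some(3));
    graph.insert("c", vec![]); // break the cycle
    assert!(find_cycle(&graph).is_none());
}
```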
The final step is to synthesize the build.ninja file from the BuildGraph
IR. This process is a straightforward, mechanical translation from the IR data
structures to the Ninja file syntax.
1. Write Variables: Any global variables that need to be passed to Ninja can be written at the top of the file (e.g., `msvc_deps_prefix` for Windows).
2. Write Rules: Iterate through the `graph.actions` map. For each `ir::Action`, write a corresponding Ninja `rule` statement. The input and output lists stored in the action replace the `ins` and `outs` placeholders; these lists are then rewritten as Ninja's `$in` and `$out`. When an action's `recipe` is a script, the generated rule wraps the script in an invocation of `/bin/sh -e -c` so that multi-line scripts execute consistently across platforms.
Code snippet
# Generated from an ir::Action
rule cc
  command = gcc -c -o $out $in
  description = CC $out
  depfile = $out.d
  deps = gcc
3. Write Build Edges: Iterate through the `graph.targets` map. For each `ir::BuildEdge`, write a corresponding Ninja `build` statement. This involves formatting the lists of explicit outputs, implicit outputs, inputs, and order-only dependencies using the correct Ninja syntax (`:`, `|`, and `||`).5 Use Ninja's built-in `phony` rule when `phony` is `true`. For an `always` edge, either generate a `phony` build with no outputs or emit a dummy output marked `restat = 1` that depends on a permanently dirty target so the command runs on each invocation.
Code snippet
# Generated from an ir::BuildEdge
build foo.o: cc foo.c
build bar.o: cc bar.c
build my_app: link foo.o bar.o | lib_dependency.a
4. Write Defaults: Finally, write the `default` statement, listing all paths from `graph.default_targets`.
Code snippet
default my_app
The IR structures defined in `src/ir.rs` are minimal containers that mirror
Ninja's conceptual model while remaining backend-agnostic. BuildGraph
collects all Actions and BuildEdges in hash maps keyed by stable strings
and Utf8PathBufs so the graph can be deterministically traversed for snapshot
tests. Actions hold the parsed Recipe and optional execution metadata.
BuildEdge connects inputs to outputs using an action identifier and carries
the phony and always flags verbatim from the manifest. No Ninja-specific
placeholders are stored in the IR, keeping the representation portable.
- Actions are deduplicated using a SHA-256 hash of a canonical JSON serialisation of their recipe, inputs, and outputs. Because commands embed shell-quoted file paths, two targets share an identifier only when both the command text and file sets match exactly.
- Multiple rule references in a single target are not yet supported. The IR generator reports `IrGenError::MultipleRules` when encountered.
- Duplicate output files are rejected. Attempting to define the same output path twice results in `IrGenError::DuplicateOutput`.
- The Ninja generator sorts actions and edges before output and deduplicates edges based on their full set of explicit outputs. Sorting uses the joined path strings to keep ordering stable across platforms, ensuring deterministic `build.ninja` files. Small macros reduce formatting boilerplate when writing optional key-value pairs or flags, keeping the generator easy to scan.
- Integration tests snapshot the generated Ninja file with `insta` and execute the Ninja binary to validate structure and no-op behaviour.
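The mechanical nature of the IR-to-Ninja translation can be seen in a reduced sketch. The helper names below are illustrative; the real generator iterates the sorted maps and emits the full option set (depfile, pools, restat, and so on).

```rust
/// Sketch of emitting one Ninja rule from an ir::Action's resolved fields.
fn write_rule(name: &str, command: &str, description: &str) -> String {
    format!("rule {name}\n  command = {command}\n  description = {description}\n")
}

/// Sketch of emitting one Ninja build statement from an ir::BuildEdge.
/// Implicit outputs and order-only deps (`|` and `||`) are omitted here.
fn write_build(outputs: &[&str], rule: &str, inputs: &[&str]) -> String {
    format!("build {}: {} {}\n", outputs.join(" "), rule, inputs.join(" "))
}

fn main() {
    let mut ninja = String::new();
    ninja.push_str(&write_rule("cc", "gcc -c -o $out $in", "CC $out"));
    ninja.push_str(&write_build(&["foo.o"], "cc", &["foo.c"]));
    ninja.push_str("default foo.o\n");
    assert!(ninja.contains("rule cc"));
    assert!(ninja.contains("build foo.o: cc foo.c"));
    println!("{ninja}");
}
```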
The final stage of a Netsuke build involves executing commands. While Netsuke delegates the core task scheduling and execution to the Ninja binary, it remains responsible for invoking Ninja correctly and, most importantly, for ensuring that the commands it generates for Ninja to run are constructed securely.
Netsuke will use Rust's standard library std::process::Command API to
configure and spawn the ninja process.10 This provides fine-grained
control over the child process's execution environment.
The command construction will follow this pattern:
- A new Command is created via Command::new("ninja"). Netsuke will assume
  ninja is available in the system's PATH.
- Arguments passed to Netsuke's own CLI will be translated and forwarded to
  Ninja. For example, a Netsuke build my_target command would result in
  Command::new("ninja").arg("my_target"). Flags like -j for parallelism will
  also be passed through.12
- The working directory for the Ninja process will be set using
  .current_dir(). When the user supplies a -C flag, Netsuke canonicalises
  the path and applies it via current_dir rather than forwarding the flag to
  Ninja.
- Standard I/O streams (stdin, stdout, stderr) will be configured using
  .stdout(Stdio::piped()) and .stderr(Stdio::piped()).10 This allows Netsuke
  to capture the real-time output from Ninja, which can then be streamed to
  the user's console, potentially with additional formatting or status
  updates from Netsuke itself.
In the initial implementation a small helper wraps Command::new to forward
the -j and -C flags and any explicit build targets. Standard output and
error are piped and written back to Netsuke's own streams so users see Ninja's
messages in order. A non-zero exit status or failure to spawn the process is
reported as an io::Error for the CLI to surface.
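One plausible shape for this helper is sketched below. The type and method names are illustrative, not taken from the codebase; it forwards `-j` and any explicit targets as arguments, applies the resolved `-C` directory via `current_dir`, and pipes both output streams.

```rust
use std::path::PathBuf;
use std::process::{Command, Stdio};

/// Options gathered from Netsuke's own CLI (illustrative names).
struct NinjaInvocation {
    jobs: Option<usize>,
    directory: Option<PathBuf>,
    targets: Vec<String>,
}

impl NinjaInvocation {
    /// Translate CLI options into the argument vector handed to `ninja`.
    fn args(&self) -> Vec<String> {
        let mut args = Vec::new();
        if let Some(n) = self.jobs {
            args.push("-j".to_string());
            args.push(n.to_string());
        }
        args.extend(self.targets.iter().cloned());
        args
    }

    /// Configure (but do not spawn) the child process: pipe stdout and
    /// stderr, and apply the canonicalised working directory if one was
    /// supplied via -C.
    fn command(&self) -> Command {
        let mut cmd = Command::new("ninja");
        cmd.args(self.args())
            .stdout(Stdio::piped())
            .stderr(Stdio::piped());
        if let Some(dir) = &self.directory {
            cmd.current_dir(dir);
        }
        cmd
    }
}
```

Spawning the configured `Command` and checking the exit status then yields the `io::Error` surfaced by the CLI on failure.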
A primary security responsibility for Netsuke is the prevention of command
injection attacks. The command strings defined in a user's Netsukefile are
templates. When Netsuke substitutes variables like file paths into these
templates, it is imperative that these substituted values are treated as
single, literal arguments by the shell that Ninja ultimately uses to execute
the command.
Without proper escaping, a malicious or even accidental filename like
"my file; rm -rf /;.c" could be interpreted as multiple commands, leading to
catastrophic consequences.
For this critical task, the recommended crate is shell-quote.
While other crates like shlex exist, shell-quote offers a more robust and
flexible API specifically designed for this purpose.11 It supports quoting
for multiple shell flavours (e.g., Bash, sh, Fish), which is vital for a
cross-platform build tool. It also correctly handles a wide variety of input
types, including byte strings and OS-native strings, which is essential for
dealing with non-UTF8 file paths. The
QuoteExt trait provided by the crate offers an ergonomic and safe method for
building command strings by pushing quoted components into a buffer:
script.push_quoted(Bash, "foo bar").
The command generation logic within the ninja_gen.rs module must not use
simple string formatting (like format!) to construct the final command
strings. Instead, parse the Netsuke command template (e.g.,
{{ cc }} -c {{ ins }} -o {{ outs }}) and build the final command string
step by step. The placeholders {{ ins }} and {{ outs }} are expanded to
space-separated lists of file paths within Netsuke itself, each path being
shell-escaped using the shell-quote API. Netsuke uses the Sh quoting mode
to emit POSIX-compliant single-quoted strings and scans the template for
standalone $in and $out tokens to avoid rewriting unrelated variables.
Substitution happens during IR generation and the fully expanded command is
emitted to build.ninja unchanged. After substitution, the command is
validated with [shlex](https://docs.rs/shlex/latest/shlex/) to ensure it
parses correctly. This approach guarantees that every dynamic part of the
command is securely quoted, albeit at the cost of deduplicating only actions
with identical file sets.
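The escaping rule and token scan can be sketched dependency-free. The real implementation delegates quoting to the shell-quote crate's Sh mode; the functions below are illustrative stand-ins that show the POSIX single-quoting rule (wrap in single quotes, escape embedded quotes as `'\''`) and the standalone-token substitution. Note this sketch normalises whitespace between tokens, which the real template pass would preserve.

```rust
/// Quote one path as a single, literal sh word.
fn sh_quote(s: &str) -> String {
    let mut out = String::from("'");
    for ch in s.chars() {
        if ch == '\'' {
            // Close the quote, emit an escaped quote, reopen: '\''
            out.push_str("'\\''");
        } else {
            out.push(ch);
        }
    }
    out.push('\'');
    out
}

/// Expand standalone $in / $out tokens into space-separated quoted paths,
/// leaving unrelated tokens untouched.
fn expand(template: &str, ins: &[&str], outs: &[&str]) -> String {
    let join = |paths: &[&str]| {
        paths.iter().map(|p| sh_quote(p)).collect::<Vec<_>>().join(" ")
    };
    template
        .split_whitespace()
        .map(|tok| match tok {
            "$in" => join(ins),
            "$out" => join(outs),
            other => other.to_string(),
        })
        .collect::<Vec<_>>()
        .join(" ")
}
```

With this rule, the hostile filename from the earlier example becomes one inert argument: `'my file; rm -rf /;.c'`.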
The concept of being "friendlier" than make extends beyond syntactic sugar to
encompass safety and reliability. A tool that is easy to use but exposes the
user to trivial security vulnerabilities is fundamentally unfriendly. In many
build systems, the burden of correct shell quoting falls on the user, an
error-prone task that requires specialised knowledge.
Netsuke's design elevates security to a core feature by making it automatic and
transparent. The user writes a simple, unquoted command template, and Netsuke
performs the complex and critical task of making it secure behind the scenes.
By integrating shell-quote directly into the Ninja file synthesis stage,
Netsuke protects users from a common and dangerous class of errors by default.
This approach embodies a deeper form of user-friendliness: one that anticipates
and mitigates risks on the user's behalf.
A key differentiator for a "friendly" tool is how it communicates failure. Cryptic, unhelpful error messages are a major source of frustration for developers. Netsuke's error handling framework is designed to provide messages that are clear, contextual, and actionable.
Errors are not exceptional events; they are an expected part of the development workflow. Every error reported by Netsuke to the user must strive to answer three fundamental questions:
- What went wrong? A concise summary of the failure (e.g., "YAML parsing
  failed," "Build configuration is invalid").
- Where did it go wrong? Precise location information, including the file,
  line number, and column where applicable (e.g., "in Netsukefile at line
  15, column 3").
- Why did it go wrong, and what can be done about it? The underlying cause
  of the error and a concrete suggestion for how to fix it (e.g., "Cause:
  Found a tab character, which is not allowed. Hint: Use spaces for
  indentation instead.").
Netsuke uses a two-tier error architecture:
- anyhow captures internal context as errors propagate through the
  application.
- miette renders user-facing diagnostics and is not optional. All surface
  errors must implement miette::Diagnostic so the CLI can present spans,
  annotated source, and helpful suggestions.
This hybrid strategy is common in the Rust ecosystem and provides both rich context and polished user output.13
- thiserror: This crate is used within Netsuke's internal library modules
  (e.g., parser, ir, ninja_gen) to define specific, structured error types.
  The #[derive(Error)] macro reduces boilerplate and allows for the creation
  of rich, semantic errors.14

```rust
// In src/ir.rs
use thiserror::Error;
use camino::Utf8PathBuf;

#[derive(Debug, Error)]
pub enum IrGenError {
    #[error("rule '{rule_name}' referenced by target '{target_name}' was not found")]
    RuleNotFound { target_name: String, rule_name: String },

    #[error("multiple rules for target '{target_name}': {rules:?}")]
    MultipleRules { target_name: String, rules: Vec<String> },

    #[error("No rules specified for target {target_name}")]
    EmptyRule { target_name: String },

    #[error("duplicate target outputs: {outputs:?}")]
    DuplicateOutput { outputs: Vec<String> },

    #[error("circular dependency detected: {cycle:?}")]
    CircularDependency {
        cycle: Vec<Utf8PathBuf>,
        missing_dependencies: Vec<(Utf8PathBuf, Utf8PathBuf)>,
    },

    #[error("failed to serialise action: {0}")]
    ActionSerialisation(#[from] serde_json::Error),
}
```

- anyhow: Used in the main application logic (main.rs) and at the boundaries
  between modules. anyhow::Result wraps any error implementing
  std::error::Error.15 The ? operator provides clean propagation, while
  .context() and .with_context() attach high-level explanations as errors
  bubble up.16
- miette: Presents human-friendly diagnostics, highlighting exact error
  locations with computed spans. Every diagnostic must retain miette's
  Diagnostic implementation as it travels through anyhow.
YamlDiagnostic is the reference implementation of a Netsuke diagnostic. It
wraps yaml-rust errors with annotated source, spans, and optional help text:
```rust
#[derive(Debug, Error, Diagnostic)]
#[error("{message}")]
#[diagnostic(code(netsuke::yaml::parse))]
pub struct YamlDiagnostic {
    #[source_code]
    src: NamedSource<String>,
    #[label("parse error here")]
    span: Option<SourceSpan>,
    #[help]
    help: Option<String>,
    #[source]
    source: YamlError,
    message: String,
}

#[derive(Debug, Error, Diagnostic)]
pub enum ManifestError {
    #[error("manifest parse error")]
    #[diagnostic(code(netsuke::manifest::parse))]
    Parse {
        #[source]
        #[diagnostic_source]
        source: Box<dyn Diagnostic + Send + Sync + 'static>,
    },
}
```

ManifestError::Parse boxes the diagnostic to preserve the rich error so
miette can show the offending YAML snippet. All new user-facing errors with
source context must follow this model.
Common use cases requiring miette diagnostics include:
- YAML parsing errors.
- Jinja template rendering failures with line numbers and context.
- Any scenario where highlighting spans or providing structured help benefits the user.
Although src/diagnostics.rs is currently unused, it contains prototypes for
miette patterns and remains a valuable reference. Future diagnostics should
mirror the YamlDiagnostic approach by implementing Diagnostic, providing a
NamedSource, a SourceSpan, and actionable help text.
The flow of an error from its origin to the user follows a clear path of enrichment:
- A specific, low-level error occurs within a module. For instance, the IR
  generator detects a missing rule and creates an IrGenError::RuleNotFound.
  Likewise, the Ninja generator returns NinjaGenError::MissingAction when a
  build edge references an undefined action, preventing panics during file
  generation.
- The function where the error occurred returns
  Err(IrGenError::RuleNotFound { ... }.into()). The .into() call converts
  the specific thiserror enum variant into a generic anyhow::Error object,
  preserving the original error as its source.
- A higher-level function in the call stack, which called the failing
  function, receives this Err value. It uses the .with_context() method to
  wrap the error with more application-level context. For example:
  ir::from_manifest(ast).with_context(|| "Failed to build the internal build
  graph from the manifest")?.
- This process of propagation and contextualisation repeats as the error
  bubbles up towards main. Use anyhow::Context to add detail, but never
  convert a miette::Diagnostic into a plain anyhow::Error--doing so would
  discard spans and help text.
- Finally, the main function receives the Err result. It prints the entire
  error chain provided by anyhow, which displays the highest-level context
  first, followed by a list of underlying "Caused by:" messages. This
  provides the user with a rich, layered explanation of the failure, from
  the general to the specific.
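The layered "Caused by:" rendering can be illustrated with plain std errors standing in for the real types; the `Wrapped` struct and `render_chain` function below are hypothetical names for a sketch of how a source chain prints from the general to the specific.

```rust
use std::error::Error;
use std::fmt;

/// A context message wrapping an optional underlying error.
#[derive(Debug)]
struct Wrapped {
    context: String,
    source: Option<Box<dyn Error + 'static>>,
}

impl fmt::Display for Wrapped {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.context)
    }
}

impl Error for Wrapped {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        self.source.as_deref()
    }
}

/// Print the top-level context first, then walk the source chain,
/// emitting one "Caused by:" line per underlying error.
fn render_chain(err: &dyn Error) -> String {
    let mut out = format!("Error: {err}\n");
    let mut cause = err.source();
    while let Some(c) = cause {
        out.push_str(&format!("Caused by: {c}\n"));
        cause = c.source();
    }
    out
}
```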
For automation use cases, Netsuke supports a --diag-json mode layered through
OrthoConfig (the --diag-json flag, the NETSUKE_DIAG_JSON environment
variable, or diag_json = true in a configuration file). When enabled, Netsuke
emits a Netsuke-owned JSON document on stderr instead of relying on upstream
formatter output directly. The current schema is versioned with
schema_version = 1 and an envelope of:
- generator: name and version
- diagnostics: an array of entries containing message, code, severity,
  help, url, causes, source, primary_span, labels, and related
Design decisions for this mode:
- Netsuke owns the schema rather than exposing miette's raw JSON formatter,
  so compatibility can be documented and guarded by snapshot tests.
- JSON mode reserves stderr for one machine-readable document only. Progress
  updates, verbose timing summaries, emoji prefixes, and tracing logs are
  suppressed while the mode is active.
- stdout semantics do not change. Commands such as manifest - and graph keep
  streaming their normal artefacts to stdout.
- Early startup failures honour only the CLI flag and environment variable.
  Configuration files cannot request JSON for errors raised while those same
  files are still being located or parsed.
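Under this envelope, a minimal --diag-json document might look like the following sketch. All field values are illustrative, not taken from a real run:

```json
{
  "schema_version": 1,
  "generator": { "name": "netsuke", "version": "0.1.0" },
  "diagnostics": [
    {
      "message": "manifest parse error",
      "code": "netsuke::manifest::parse",
      "severity": "error",
      "help": "Use spaces for indentation instead of tabs.",
      "url": null,
      "causes": ["found a tab character at line 15, column 3"],
      "source": "Netsukefile",
      "primary_span": { "offset": 212, "length": 1 },
      "labels": [],
      "related": []
    }
  ]
}
```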
This table provides a specification for the desired output of Netsuke's error reporting system, contrasting raw, unhelpful messages with the friendly, actionable output that the implementation should produce.
| Error Type | Poor Message (Default) | Netsuke's Friendly Message (Goal) |
|---|---|---|
| YAML Parse | (line 15, column 3): Found a tab character where indentation is expected | Error: Failed to parse 'Netsukefile'. Caused by: Found a tab character. Hint: Use spaces for indentation instead of tabs. |
| Validation | thread 'main' panicked at 'Rule not found' | Error: Build configuration is invalid. Caused by: Target 'my_program' uses a rule named 'link-program' which is not defined in the 'rules' section. |
| Execution | ninja: error: 'main.o', needed by 'my_program', missing and no known rule to make it | Error: Build failed during execution. Caused by: Ninja could not build target 'my_program' because its dependency 'main.o' is missing. Hint: Ensure a target produces 'main.o'. |
The command-line interface is the user's entry point to Netsuke. A well-designed CLI is essential for a good user experience. It should be intuitive, self-documenting, and consistent with the conventions of modern command-line tools.
The CLI for Netsuke will be built using the clap (Command Line Argument
Parser) crate, specifically leveraging its derive feature. clap is the de-
facto standard for building rich, professional CLIs in Rust. It automatically
generates parsing logic, help messages, version information, and shell
completions from simple struct definitions. Its integration with error handling
frameworks like anyhow is seamless, making it the ideal choice.17
The CLI's structure will be defined using a set of structs annotated with
clap's derive macros. This provides a single, clear source of truth for the
entire CLI specification.
```rust
use clap::{Args, Parser, Subcommand};
use std::path::PathBuf;

#[derive(Parser)]
#[command(author, version, about, long_about = None)]
struct Cli {
    /// Path to the Netsuke manifest file to use.
    #[arg(short, long, value_name = "FILE", default_value = "Netsukefile")]
    file: PathBuf,

    /// Change to this directory before doing anything.
    #[arg(short = 'C', long, value_name = "DIR")]
    directory: Option<PathBuf>,

    /// Set the number of parallel build jobs.
    #[arg(short, long, value_name = "N")]
    jobs: Option<usize>,

    /// Enable verbose diagnostic logging and completion timing summaries.
    #[arg(short, long)]
    verbose: bool,

    #[command(subcommand)]
    command: Option<Commands>,
}

#[derive(Subcommand)]
enum Commands {
    /// Build specified targets (or default targets if none are given).
    /// This is the default subcommand.
    Build(BuildArgs),

    /// Remove build artefacts and intermediate files.
    Clean,

    /// Display the build dependency graph in DOT format for visualisation.
    Graph,

    /// Write the Ninja manifest to `FILE` without invoking Ninja.
    ///
    /// Use `-` to write the generated Ninja file to stdout instead of
    /// persisting it to disk.
    Manifest {
        /// Output path for the generated Ninja file.
        #[arg(value_name = "FILE")]
        file: PathBuf,
    },
}

#[derive(Args)]
struct BuildArgs {
    /// Write the generated Ninja manifest to this path and retain it.
    #[arg(long, value_name = "FILE")]
    emit: Option<PathBuf>,

    /// A list of specific targets to build.
    targets: Vec<String>,
}
```
Note: The Build command is wrapped in an Option<Commands> and will be
treated as the default subcommand if none is provided, allowing for the common
Netsuke [targets...] invocation.
The behaviour of each subcommand is clearly defined:
- Netsuke build [--emit FILE] [targets...]: This is the primary and default
  command. It executes the full six-stage pipeline: Manifest Ingestion,
  Initial YAML Parsing, Template Expansion, Deserialisation & Final
  Rendering, IR Generation & Validation, and Ninja Synthesis & Execution. By
  default the generated Ninja file is written to a securely created
  temporary location and removed after the build completes. Supplying
  --emit FILE writes the Ninja file to FILE and retains it. If no targets
  are provided on the command line, the targets listed in the defaults
  section of the manifest are built.
- Netsuke clean: This command provides a convenient way to clean the build
  directory. It will invoke the Ninja backend with the appropriate flags,
  such as ninja -t clean, to remove the outputs of the build rules.
- Netsuke graph: This command is an introspection and debugging tool. It
  runs the Netsuke pipeline through Ninja synthesis (Stage 6) to produce a
  temporary build.ninja, then invokes Ninja with the graph tool,
  ninja -t graph, which outputs the complete build dependency graph in the
  DOT language. The result can be piped through Graphviz tools such as
  dot -Tsvg. An optional --html renderer is planned for a later milestone.
- Netsuke manifest FILE: This command performs the pipeline up to Ninja
  synthesis and writes the resulting Ninja file to FILE without invoking
  Ninja. Supplying - for FILE streams the generated Ninja file to stdout.
The CLI is implemented using clap's derive API in src/cli/mod.rs. Netsuke
applies Cli::with_default_command after parsing so invoking netsuke with no
explicit command still triggers a build. Configuration is layered with
OrthoConfig (defaults, configuration files, environment variables, then CLI
overrides) while treating clap defaults as absent so file or environment values
are not masked. Configuration discovery honours NETSUKE_CONFIG_PATH and the
standard OrthoConfig search order; environment variables use the NETSUKE_
prefix with __ as a nesting separator. CLI help and clap errors are localised
via Fluent resources; locale resolution is handled in
src/locale_resolution.rs with the precedence --locale -> NETSUKE_LOCALE
-> configuration locale -> system default. System locale strings are
normalised by stripping encoding suffixes (such as .UTF-8), removing variant
suffixes (such as @latin), and replacing underscores with hyphens before
validation. English plus Spanish catalogues ship in locales/; unsupported
locales fall back to en-US. Runtime diagnostics (for example manifest
parsing, stdlib template errors, and runner failures) use the same Fluent
localiser so the locale selection is consistent across user-facing output. A
build-time audit in build.rs validates that all referenced Fluent message
keys exist in the bundled catalogues, ensuring missing strings fail CI before
release. CLI execution and dispatch live in src/runner.rs, keeping main.rs
focused on parsing. Process management, Ninja invocation, argument redaction,
and the temporary file helpers reside in src/runner/process.rs, allowing the
runner entry point to delegate low-level concerns. The working directory flag
mirrors Ninja's -C option but is resolved internally: Netsuke runs Ninja with
a configured working directory and resolves relative output paths (for example
build --emit and manifest) under the same directory so behaviour matches a
real directory change. Error scenarios are validated using clap's ErrorKind
enumeration in unit tests and via Cucumber steps for behavioural coverage.
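The locale normalisation described above reduces to a short string transformation. The function name below is illustrative; it drops the `@variant`, strips the encoding suffix, and replaces underscores with hyphens, in line with the rules stated in this section.

```rust
/// Normalise a system locale string (e.g. "es_ES.UTF-8" or "sr_RS@latin")
/// into the hyphenated form used for validation (e.g. "es-ES", "sr-RS").
fn normalise_locale(raw: &str) -> String {
    // Drop any "@variant" suffix, then any ".encoding" suffix.
    let no_variant = raw.split('@').next().unwrap_or(raw);
    let no_encoding = no_variant.split('.').next().unwrap_or(no_variant);
    // POSIX uses ll_CC; the validator expects ll-CC.
    no_encoding.replace('_', "-")
}
```

A normalised string that still fails validation would then fall back to en-US, per the catalogue rules above.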
Real-time stage reporting now uses a six-stage model in src/status.rs backed
by indicatif::MultiProgress for standard terminals. The reporter keeps one
persistent summary line per stage and updates each line through localised
state labels (pending, in progress, done, failed) plus localised stage text.
During Stage 6, Netsuke parses Ninja status lines of the form
[current/total] ... and emits localised task progress updates. Parsed updates
are monotonic: malformed lines, regressive counts, and total-mismatch lines
are ignored to avoid noisy or inconsistent progress state. Task updates fall
back to textual output when stdout is not a teletype terminal (TTY), ensuring
deterministic continuous integration (CI) logs; accessible mode always uses
textual output. Accessible output remains text-first and static; it does not
animate. The standard reporter is configurable through OrthoConfig layering
via progress: Option<bool> (--progress, NETSUKE_PROGRESS, or a config file),
with accessible mode taking precedence when enabled. Verbose mode (--verbose
through OrthoConfig layers) wraps the resolved reporter with a timing
recorder that emits a localised completion summary on successful runs: a
`Stage timing summary:` header, one line per completed stage
(`- Stage N/6: ...: 12ms`), and a closing `Total pipeline time: ...` line.
Timing summaries are completion diagnostics. They are suppressed when verbose mode is off and also suppressed on failed runs so failures do not imply a successful pipeline completion.
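The monotonic status-line parsing can be sketched as follows; the type and function names are illustrative. Malformed lines, regressive counts, and total mismatches are all rejected, matching the filtering rules above.

```rust
/// Tracks the last accepted Ninja progress update.
struct Progress {
    current: usize,
    total: usize,
}

/// Parse a Ninja status line of the form "[current/total] ...".
/// Returns None for anything that does not match.
fn parse_status(line: &str) -> Option<(usize, usize)> {
    let rest = line.strip_prefix('[')?;
    let (counts, _) = rest.split_once(']')?;
    let (cur, tot) = counts.split_once('/')?;
    Some((cur.trim().parse().ok()?, tot.trim().parse().ok()?))
}

impl Progress {
    /// Accept an update only if it keeps progress monotonic and the
    /// reported total matches; otherwise ignore the line.
    fn update(&mut self, line: &str) -> bool {
        match parse_status(line) {
            Some((cur, tot)) if tot == self.total && cur >= self.current => {
                self.current = cur;
                true
            }
            _ => false,
        }
    }
}
```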
Theme resolution for CLI output is centralised in src/theme.rs. Netsuke
resolves one theme through OrthoConfig layers (--theme, NETSUKE_THEME,
config file, then mode defaults) and hands the resulting symbol and spacing
tokens to reporters through the OutputPrefs compatibility facade. This keeps
reporter code focused on status semantics rather than glyph choice, preserves
no_emoji as a legacy ASCII-forcing alias when no explicit theme is supplied,
and gives later roadmap items a stable snapshot surface for validating ASCII
and Unicode renderings without duplicating formatting rules.
For screen readers: The following flowchart shows how the build script audits localization keys against English and Spanish Fluent bundles.
```mermaid
flowchart TD
    A_Start[Start build.rs] --> B_ReadKeys
    B_ReadKeys[extract_key_constants from src/localization/keys.rs] --> C_ReadEn
    C_ReadEn[extract_ftl_keys from locales/en-US/messages.ftl] --> D_ReadEs
    D_ReadEs[extract_ftl_keys from locales/es-ES/messages.ftl] --> E_Compare
    E_Compare[Compute differences between declared and en-US/es-ES keys] --> F_CheckMissing
    F_CheckMissing{Any missing keys?} -->|No| G_Success[Audit passes and build continues]
    F_CheckMissing -->|Yes| H_Error[Emit error message with missing keys per locale and fail build]
    H_Error --> I_End[Build script returns Err]
    G_Success --> I_End
```
Figure: Build script localization audit flow for Fluent key validation.
The Ninja executable may be overridden via the NINJA_ENV environment
variable. For example, NINJA_ENV=/opt/ninja/bin/ninja netsuke build forces
Netsuke to execute the specified binary while preserving the default when the
variable is unset or invalid.
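Resolving the binary name is a small piece of logic; the sketch below uses an illustrative function name and assumes, as a simplification, that a blank or empty value counts as invalid, preserving the default in that case.

```rust
/// Choose the Ninja program to execute: prefer a non-blank NINJA_ENV value,
/// otherwise fall back to plain "ninja" resolved via PATH.
fn ninja_program(env_value: Option<&str>) -> String {
    match env_value {
        Some(path) if !path.trim().is_empty() => path.to_string(),
        _ => "ninja".to_string(),
    }
}
```

At the call site this would be fed from the environment, e.g. `ninja_program(std::env::var("NINJA_ENV").ok().as_deref())`.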
The CLI definition doubles as the source for user documentation. A build script
uses clap_mangen to emit a netsuke.1 manual page in
target/generated-man/<target>/<profile> and mirrors the page into Cargo's
OUT_DIR so release automation can discover it without additional tooling. The
staging helper always prefers the deterministic generated-man copy and falls
back to the most recent OUT_DIR candidate only when necessary, avoiding false
positives when several historical build directories remain on disk. Release
artefacts include this platform-agnostic man page; the published crate remains
code-only. The build script honours SOURCE_DATE_EPOCH to produce reproducible
dates, emitting a warning and falling back to 1970-01-01 when the environment
value is invalid.
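The SOURCE_DATE_EPOCH fallback reduces to parsing an optional string as seconds since the Unix epoch; the function name below is illustrative, and the warning emission is elided.

```rust
/// Resolve the reproducible-build timestamp: parse SOURCE_DATE_EPOCH as
/// seconds since 1970-01-01, falling back to 0 (i.e. 1970-01-01) when the
/// variable is unset or does not parse.
fn resolve_epoch(value: Option<&str>) -> u64 {
    value
        .and_then(|v| v.trim().parse::<u64>().ok())
        .unwrap_or(0)
}
```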
Release engineering is delegated to GitHub Actions workflows built on the
leynos/shared-actions toolchain. All shared composites are pinned to explicit
SHAs so release automation remains reproducible. The tagging workflow first
verifies that the Git ref matches Cargo.toml and records the crate's binary
name once so all subsequent jobs operate on consistent metadata. Linux builds
invoke the rust-build-release composite action to cross-compile for x86_64
and aarch64, generate the staged binary + man page directory, and then call
the shared linux-packages composite a second time with explicit metadata so
the resulting .deb and .rpm archives both declare a runtime dependency on
ninja-build. Windows builds reuse the same action for compilation and now
invoke the generic staging stage-release-artefacts composite from
leynos/shared-actions. The composite shells out to a Cyclopts-driven script
that reads the .github/release-staging.toml configuration (Tom's Obvious,
Minimal Language (TOML)), merges the [common] configuration with the
target-specific overrides, and copies the configured artefacts into a fresh
dist/{bin}_{platform}_{arch} directory. It installs Astral's Python package
manager (uv) with astral-sh/setup-uv, double-checks the tool is present, and
only then launches the Python entry point so workflows stay declarative. The
helper writes SHA-256 sums for every staged file and exports a JSON map of the
artefact outputs, allowing the workflow to hydrate downstream steps without
hard-coded path logic. Figure 8.1 summarises the configuration entities,
including optional keys reserved for templated directories and explicit
artefact destinations that the helper can adopt without breaking compatibility.
Figure 8.1: Entity relationship for the staging configuration schema.
```mermaid
%% Figure 8.1: Entity relationship for the staging configuration schema.
erDiagram
    COMMON {
        string bin_name
        string dist_dir
        string checksum_algorithm
        string staging_dir_template
        ArtefactConfig[] artefacts
    }
    TARGETS {
        string platform
        string arch
        string target
        string bin_ext
        string staging_dir_template
        ArtefactConfig[] artefacts
    }
    ArtefactConfig {
        string source
        boolean required
        string output
        string destination
        string[] alternatives
    }
    COMMON ||--o{ TARGETS : "has targets"
    COMMON ||--o{ ArtefactConfig : "has artefacts"
    TARGETS ||--o{ ArtefactConfig : "has artefacts"
```
The staged artefacts feed a Windows Installer XML (WiX) v4 authoring template
stored in installer/Package.wxs; the workflow invokes the shared
windows-package composite to convert the repository licence into Rich Text
Format (RTF), embed the binary, and output a signed Microsoft Installer (MSI)
installer alongside the staged directory. The packaging step gates the action's
internal artefact uploader behind the should_publish flag exported by the
metadata job so that dry runs do not leak MSI artefacts. The composite pins the
WixToolset.UI.wixext extension to v6 to match the WiX v6 CLI and avoid the
WIX6101 incompatibility seen with the legacy v4 bundle. The installer uses
WiX v4 syntax, installs per-machine, and presents the minimal UI appropriate
for a CLI tool. Windows does not modify the PATH, so users must add the
installation directory manually if they want global command resolution. The
Unix manual page remains in the staged artefacts for parity with the other
platforms but is not bundled into the installer to avoid shipping an
inaccessible help format.
macOS releases execute the shared action twice: once on an Intel runner and
again on Apple Silicon. The same composite action interprets the TOML
configuration, emits checksums, and exposes artefact metadata via JSON outputs
before feeding the resulting paths into the macos-package action. Embedding
the PEP 723 metadata keeps Cyclopts discoverable without a repository-level
pyproject.toml, maintaining the existing approach where uv resolves
dependencies on demand. Python linting still lives in the top-level
ruff.toml, so the dedicated staging scripts remain self-contained whilst the
broader helper suite stays consistently linted.
Each job uploads its products as workflow artefacts, and the final release job downloads every file, filters out unrelated downloads, and prefixes asset names with their staging directories to avoid collisions before attaching them to the GitHub release draft. This automated pipeline guarantees parity across Windows, Linux, and macOS without custom GoReleaser logic.
Netsuke's manifest loader now re-exports the ManifestValue and ManifestMap
aliases alongside the ManifestSource, ManifestName, map_yaml_error, and
map_data_error helpers. Library consumers should upgrade to these symbols
when interacting with manifest data or diagnostics; the change is user-visible
and must be highlighted in the next crate release summary.
This final section outlines a strategic plan for implementing Netsuke, along with a summary of key technological choices and potential avenues for future development. This roadmap is designed to manage complexity, mitigate risk, and ensure that a functional and robust tool is delivered efficiently.
A phased implementation approach is recommended to tackle the project in manageable stages. Each phase builds upon the last and has a clear, verifiable goal.
- Phase 1: The Static Core
  - Objective: To create a minimal, working build compiler for static
    manifests.
  - Tasks:
    - Implement the initial clap CLI structure for the build command.
    - Implement the YAML parser using serde_saphyr and the AST data
      structures (ast.rs).
    - Implement the AST-to-IR transformation logic, including basic
      validation like checking for rule existence.
    - Implement the IR-to-Ninja file generator (ninja_gen.rs).
    - Implement the std::process::Command logic to invoke ninja.
  - Success Criterion: Netsuke can successfully take a Netsukefile without
    any Jinja syntax and compile it to a build.ninja file, then execute it
    to produce the correct artefacts. This phase validates the entire static
    compilation pipeline.
- Phase 2: The Dynamic Engine
  - Objective: To integrate the templating engine and support dynamic
    manifests.
  - Tasks:
    - Integrate the minijinja crate into the build pipeline.
    - Implement the two-pass parsing mechanism: first render the manifest
      with minijinja, then parse the result with serde_saphyr.
    - Populate the initial Jinja context with the global vars from the
      manifest.
    - Implement basic Jinja control flow ({% if ... %}, {% for ... %}) and
      variable substitution.
  - Success Criterion: Netsuke can successfully build a manifest that uses
    variables and conditional logic (e.g., different compiler flags based on
    a variable).
- Phase 3: The "Friendly" Polish
  - Objective: To implement the advanced features that deliver a superior
    user experience.
  - Tasks:
    - Implement the full suite of custom Jinja functions (glob, env, etc.)
      and filters (shell_escape).
    - Mandate the use of shell-quote for all command variable
      substitutions.
    - Refactor the error handling to fully adopt the anyhow/thiserror
      strategy, ensuring all user-facing errors are contextual and
      actionable as specified in Section 7.
    - Implement the clean and graph subcommands.
    - Refine the CLI output for clarity and readability.
  - Success Criterion: Netsuke is a feature-complete, secure, and
    user-friendly build tool that meets all the initial design goals.
This table serves as a quick-reference guide to the core third-party crates selected for this project and the rationale for their inclusion.
| Component | Recommended Crate | Rationale |
|---|---|---|
| CLI Parsing | clap | The Rust standard for powerful, derive-based CLI development. |
| YAML Parsing | serde_saphyr | Maintained, panic-free YAML 1.2 parser with a serde-compatible API. |
| Templating | minijinja | High compatibility with Jinja2, minimal dependencies, and supports runtime template loading. |
| Shell Quoting | shell-quote | A critical security component; provides robust, shell-specific escaping for command arguments. |
| Error Handling | anyhow + thiserror + miette | An idiomatic and powerful combination for creating rich, contextual, and user-friendly error reports with precise source spans. |
| Logging | tracing | Structured, levelled diagnostic output for debugging and insight. |
| Versioning | semver | The standard library for parsing and evaluating Semantic Versioning strings, essential for the netsuke_version field. |
The architecture described in this document provides a solid foundation for a powerful build tool. The use of a decoupled IR, in particular, opens up many possibilities for future enhancements beyond the initial scope.
- Advanced Caching: While Ninja provides excellent file-based incremental
  build caching, Netsuke could implement a higher-level artefact caching
  layer. This could involve caching build outputs in a shared network
  location (e.g., S3) or a local content-addressed store, allowing for cache
  hits across different machines or clean checkouts.
- Plugin Architecture: A system could be designed to allow users to load
  custom Jinja functions and filters from external Rust crates at runtime.
  This would enable the community to extend Netsuke's capabilities without
  requiring changes to the core application.
- Language-Specific Toolchains: Netsuke could offer pre-packaged "toolchain"
  modules. For example, a Netsuke-rust-toolchain could provide a standard
  set of rules and variables for compiling Rust projects, abstracting away
  the details of invoking cargo.
- Distributed Builds: The IR is backend-agnostic. A future version of
  Netsuke could include an alternative generator that targets a distributed
  build system, allowing for massively parallel builds across a cluster of
  machines. The user's Netsukefile manifest would remain unchanged.
The repository includes several complete Netsuke manifests in the examples/
directory. They demonstrate how the YAML schema can be applied to real-world
projects.
- basic_c.yml: a minimal C project compiling two object files and linking them into a small application.
- photo_edit.yml: converts RAW photographs and generates a simple HTML gallery for previewing the results.
- visual_design.yml: rasterises a set of SVG design assets into PNG images using Inkscape.
- website.yml: builds a static web site from Markdown pages with Pandoc and assembles an index page.
- writing.yml: produces a multi-chapter PDF book by combining chapters rendered from Markdown via LaTeX.
Footnotes
1. Ninja, a small build system with a focus on speed. Accessed on 12 July 2025. https://ninja-build.org/
2. "Ninja (build system)." Wikipedia. Accessed on 12 July 2025. https://en.wikipedia.org/wiki/Ninja_(build_system)
3. "semver - Rust." Accessed on 12 July 2025. https://creative-coding-the-hard-way.github.io/Agents/semver/index.html
4. "A Complete Guide To The Ninja Build System." Spectra - Mathpix. Accessed on 12 July 2025. https://spectra.mathpix.com/article/2024.01.00364/a-complete-guide-to-the-ninja-build-system
5. "How Ninja works." Fuchsia. Accessed on 12 July 2025. https://fuchsia.dev/fuchsia-src/development/build/ninja_how
6. "minijinja." crates.io. Accessed on 12 July 2025. https://crates.io/crates/minijinja
7. "minijinja." wasmer-pack API docs. Accessed on 12 July 2025. https://wasmerio.github.io/wasmer-pack/api-docs/minijinja/index.html
8. "minijinja." Docs.rs. Accessed on 12 July 2025. https://docs.rs/minijinja/
9. "Template engine - list of Rust libraries/crates." Lib.rs. Accessed on 12 July 2025. https://lib.rs/template-engine
10. "std::process." Rust. Accessed on 12 July 2025. https://doc.rust-lang.org/std/process/index.html
11. "shell_quote." Docs.rs. Accessed on 12 July 2025. https://docs.rs/shell-quote/latest/shell_quote/
12. "The Ninja build system." Ninja. Accessed on 12 July 2025. https://ninja-build.org/manual.html
13. "Rust Error Handling Compared: anyhow vs thiserror vs snafu." dev.to. Accessed on 12 July 2025. https://dev.to/leapcell/rust-error-handling-compared-anyhow-vs-thiserror-vs-snafu-2003
14. "Practical guide to Error Handling in Rust." Dev State. Accessed on 12 July 2025. https://dev-state.com/posts/error_handling/
15. "thiserror and anyhow." Comprehensive Rust. Accessed on 12 July 2025. https://comprehensive-rust.mo8it.com/error-handling/thiserror-and-anyhow.html
16. "Simple error handling for precondition/argument checking in Rust." Stack Overflow. Accessed on 12 July 2025. https://stackoverflow.com/questions/78217448/simple-error-handling-for-precondition-argument-checking-in-rust
17. "Nicer error reporting." Command Line Applications in Rust. Accessed on 12 July 2025. https://rust-cli.github.io/book/tutorial/errors.html