bgrid

Observation-driven behavioral specification for CLI binaries. Given a binary, bgrid runs invocations across varied input states and records what happens — stdout, stderr, exit code, and filesystem changes. It tests every flag individually and all pairwise combinations in a single phase, then reports which flags have unique observable behavior.

Usage

bgrid <binary>                        explore: discover flags + run grid + report
bgrid <binary> <file.probe>           run observation grid from a probe file
bgrid --dry-run <binary> <file.probe> show resolved grid without executing

Exploring a binary

bgrid sort discovers flags from --help, probes working argument values (via help text mining, error mining, and compound probing), generates orthogonal contexts (varying file content, structure, permissions, timestamps, stdin), runs every flag individually and all pairwise flag combinations across every context in parallel, then analyzes behavioral groups. Output is a report with exemplar observations showing what each flag does.

For subcommands: bgrid git diff explores git diff.

Manual probe authoring

Write a .probe file to add custom contexts and runs that exercise flags the mechanical discovery can't reach:

bgrid sort sort.probe

See LANGUAGE.md for the probe language specification.

How it works

Factor identification — parse --help for flags, metavars, aliases, and value enumerations. Probe invocation patterns (positional args, stdin, Usage-line structural patterns like COMMAND or [expression]).
Level determination — pilot study probes each flag with candidate values: help-mined values → metavar candidates → error mining → companion probing → mutual compound probing. Each phase runs all probes in a single batched bwrap invocation.
Design construction — cross all flags × invocation patterns × contexts into a fixed grid. No adaptation after this point.
Execution — batched bwrap sandboxing, one invocation per context, up to 32 threads. Each cell has a 2-second timeout.
Analysis — hash-anchored structural diff (O(n) for shared lines, NW only on gap segments), hash-based behavioral grouping, pairwise interaction evidence, leave-one-out robustness scoring.
Report — flags classified as solo-distinguished, combo-distinguished, error-differentiated, or behavioral aliases. Each flag gets an exemplar observation. Robustness tiers reported.

See DESIGN.md for the full experiment design.

Requirements

Linux (uses bubblewrap for sandbox isolation)
bubblewrap (bwrap) — install via your package manager
Rust toolchain

Testing

cargo test                          # unit tests
./tests/coreutils.sh                # integration test (22 binaries, ~2 min)
REPRO=1 ./tests/coreutils.sh        # + cross-run reproducibility check

The test harness checks four properties per tool:

observed ≥ threshold — behavioral evidence doesn't regress
total flags = expected — flag discovery surface is stable
fragile = 0 — no noise-dependent distinctions
reproducible (opt-in) — observed count matches across runs

Reference reports are in tests/results/.

Development

nix develop              # enter dev shell (or use direnv)
cargo build              # build bgrid

Name		Name	Last commit message	Last commit date
Latest commit History 590 Commits
bin		bin
fixtures		fixtures
src		src
tests		tests
tools		tools
.envrc		.envrc
.gitignore		.gitignore
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
DESIGN.md		DESIGN.md
DISCOVERIES.md		DISCOVERIES.md
LANGUAGE.md		LANGUAGE.md
LICENSE		LICENSE
README.md		README.md
clippy.toml		clippy.toml
flake.lock		flake.lock
flake.nix		flake.nix

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

bgrid

Usage

Exploring a binary

Manual probe authoring

How it works

Requirements

Testing

Development

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

bgrid

Usage

Exploring a binary

Manual probe authoring

How it works

Requirements

Testing

Development

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages