This module provides the principled surface for comparing sandbox policies across the compile-reverse-recompile pipeline. Structural differences between source SBPL and reversed SBPL are handled here, not papered over in the reverser or worked around in comparison logic.
For the conceptual architecture — what the compiler erases, what can be
recovered, and how the confidence model works — see NORMALIZE.md.
The sandbox compiler transforms source SBPL in ways that cannot be perfectly inverted. These transformations fall into four categories: structural erasure (boolean canonicalization erases ordering, nesting, and algebraic equivalences), node sharing (the compiler merges decision-graph nodes across operations, losing predicate-operation ownership), baseline implicit promotions (the compiler silently allows or denies operations not mentioned in the source), and entitlement macro expansion (compile-time entitlement macros expand into concrete rules, losing the original macro intent). Rather than maintaining a growing list of "known differences to ignore," normalization commits to handling these transformations programmatically.
The goal: any profile that compiles and reverses should compare equal at the canonical level, without asterisks or caveats that leak into user-facing tools.
compare_source_and_reversed() runs an ordered pass pipeline. Ordering is intentional because later passes depend on earlier canonicalization decisions.
| Order | Pass | Flag | Module | Sidecar / Metadata |
|---|---|---|---|---|
| 0 | Entitlement augmentation | `augment_entitlements` | `passes/entitlement.py` | `entitlement_augmented`, `entitlement_keys` |
| 1 | Source normalization | always | `source.py` | - |
| 2 | Reversed normalization + reconstruction | always | `reversed.py`, `passes/reconstruction.py` | `has_reconstruction` |
| 3 | Wildcard collapse | `collapse_wildcards` | `passes/wildcard_collapse.py` | `wildcard_collapse_info` |
| 4 | Guided deny denormalization | `denormalize_require_not` | `passes/deny_denormalize.py` | `deny_denormalized`, `ops_denormalized` |
| 5 | Structural diff | always | `policy.py` | - |
| 6 | Import filtering | `filter_imports` | `baseline/imports.py` | `ImportExpansion`, `import_filtered_count` |
| 7 | Baseline filtering | `filter_baseline` | `baseline/predicates.py` | `baseline_filtered_count` |
| 8 | Predicate dropout filtering | `filter_predicate_dropout` | `passes/predicate_dropout.py` | `PredicateDropoutInfo`, `predicate_dropout_filtered` |
| 9 | Predicate merge filtering | `filter_predicate_merge` | `passes/predicate_filter.py` | `PredicateFilterInfo`, `predicate_merge_info` |
When semantic mode is enabled, passes 5-9 are re-run on semantic diff output.
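The flag-gated, ordered execution described above can be pictured with a small sketch. This is not the implementation in `reversed.py`; `run_passes`, `PassSpec`, and the toy filter functions are hypothetical names, and diff entries are modeled as plain strings purely for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

DiffEntries = List[str]
# (pass name, flag name or None for "always", pass function) -- hypothetical shape
PassSpec = Tuple[str, Optional[str], Callable[[DiffEntries], DiffEntries]]


def run_passes(diff: DiffEntries, passes: List[PassSpec],
               flags: Dict[str, bool]) -> Tuple[DiffEntries, List[str]]:
    """Apply passes in list order, skipping flagged passes that are disabled."""
    applied: List[str] = []
    for name, flag, fn in passes:
        if flag is not None and not flags.get(flag, False):
            continue  # flag-gated pass that the caller did not enable
        diff = fn(diff)  # each pass sees the output of the previous one
        applied.append(name)
    return diff, applied


# Toy passes standing in for import and baseline filtering.
PASSES: List[PassSpec] = [
    ("import filtering", "filter_imports",
     lambda d: [e for e in d if not e.startswith("import:")]),
    ("baseline filtering", "filter_baseline",
     lambda d: [e for e in d if not e.startswith("baseline:")]),
]

remaining, applied = run_passes(
    ["import:a", "baseline:b", "source:c"], PASSES, {"filter_imports": True}
)
print(remaining, applied)
```

Keeping the order in a single list makes the dependency between passes explicit: reordering the list is the only way to change which pass sees which input.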
Normalize passes that remove or transform differences return sidecar metadata alongside transformed output. Sidecars capture provenance for decisions so callers can distinguish:
- differences that truly disappeared
- differences that were filtered because a specific normalize pass applied
Examples:
- `passes/wildcard_collapse.py` returns `WildcardCollapseInfo`
- `passes/predicate_dropout.py` returns `PredicateDropoutInfo`
- `passes/predicate_filter.py` returns `PredicateFilterInfo`
- `baseline/imports.py` returns `ImportExpansion`
Sidecars are threaded into compare_source_and_reversed() result metadata when their pass is enabled.
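As a rough illustration of the sidecar pattern, a filtering pass can return its transformed output together with a record of what it removed. The `FilterSidecar` class and `filter_with_sidecar` function below are hypothetical; the real sidecar types (`WildcardCollapseInfo` and the others) may carry different fields.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class FilterSidecar:
    """Hypothetical sidecar: which pass ran and which entries it removed."""
    pass_name: str
    removed: List[str] = field(default_factory=list)


def filter_with_sidecar(diff: List[str], is_known: Callable[[str], bool],
                        pass_name: str) -> Tuple[List[str], FilterSidecar]:
    """Drop entries matched by is_known, recording each removal in the sidecar."""
    sidecar = FilterSidecar(pass_name)
    kept: List[str] = []
    for entry in diff:
        if is_known(entry):
            sidecar.removed.append(entry)  # provenance: filtered, not vanished
        else:
            kept.append(entry)
    return kept, sidecar


kept, info = filter_with_sidecar(
    ["baseline:mach-lookup", "source:file-read*"],
    lambda e: e.startswith("baseline:"),
    "baseline filtering",
)
print(kept, info.removed)
```

A caller that sees an empty diff can then inspect `info.removed` to tell whether the policies matched outright or matched only after filtering.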
| Transformation | Example | Module |
|---|---|---|
| Whitespace, ordering | `(subpath "/a") (subpath "/b")` vs `(subpath "/b") (subpath "/a")` | `source.py` |
| Predicate grouping | `(require-any (pred-a) (pred-b))` vs separate rules | `source.py` |
| Deny-as-negation | `(deny op X)` -> `(allow op (require-not X))` | `passes/deny_denormalize.py` |
| Wildcard family expansion/compression | `file-read-data` + `file-read-metadata` + `file-read-xattr` <-> `file-read*` | `passes/wildcard_collapse.py`, `operations.py` |
| Import flattening | Imported rules appear inline in reversed | `baseline/imports.py` |
| Baseline predicates | Compiler-added mach-lookup/file predicates | `baseline/predicates.py` |
| Entitlement blocks | `(let ((x (entitlement ...))) ...)` compiles to nothing | `passes/entitlement.py` |
| Param substitution | `(param "HOME")` -> literal path | `source.py` |
| Disconnected filters | Filters compiled but not connected to ops | `passes/reconstruction.py` |
| Regex equivalence | Different bytecode, same match behavior | `baseline/compiler_model.py` |
| Predicate dropout | Many bare allows cause predicates to be dropped | `passes/predicate_dropout.py` |
| Predicate simplification | `(path-regex #"^/path$")` -> `(literal "/path")` | `passes/predicate_dropout.py` |
| Predicate merge contamination | Node-sharing causes cross-operation predicate misattribution | `passes/predicate_filter.py` + `integration/ir/mappings/predicate_collapse.json` |
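To make one row of the table concrete, the predicate simplification case (an anchored regex with no metacharacters behaving like a literal) might be sketched as follows. The function `regex_to_literal` is a hypothetical helper, not the code in `passes/predicate_dropout.py`, and it is deliberately conservative: any pattern containing a regex metacharacter, including an escaped one, is left alone.

```python
from typing import Optional

# Characters that give a regex non-literal meaning; any of them blocks rewriting.
_REGEX_META = set(".^$*+?{}[]\\|()")


def regex_to_literal(pattern: str) -> Optional[str]:
    """Return the literal path for a fully anchored, metacharacter-free regex.

    Returns None when the pattern cannot be safely treated as a literal.
    """
    if len(pattern) < 2 or not (pattern.startswith("^") and pattern.endswith("$")):
        return None  # not anchored at both ends
    body = pattern[1:-1]
    if any(ch in _REGEX_META for ch in body):
        return None  # conservative: refuse anything that might not be literal
    return body


print(regex_to_literal("^/path$"))
print(regex_to_literal("^/path/.*$"))
```

Under this sketch a source `(path-regex #"^/path$")` and a reversed `(literal "/path")` would canonicalize to the same predicate, while a pattern such as `^/path/.*$` stays a regex.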
Wildcards and their children are semantically equivalent in compiled blobs. The compiler consolidates child operations into wildcards when they share the same op-table entry.
Comparison supports both directions:
- `collapse_wildcards=True`: collapse reversed child ops into wildcard form
- `wildcard_equivalence=True`: tolerate reversed wildcard matching source children
See operations.py:WILDCARD_CHILDREN for wildcard family membership.
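A minimal sketch of the collapse direction, assuming a family table keyed by wildcard name. The `FAMILIES` dict here is a stand-in built from the `file-read*` example in this document; the actual shape and contents of `WILDCARD_CHILDREN` may differ.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical family table; only the family named in this document is listed.
FAMILIES: Dict[str, FrozenSet[str]] = {
    "file-read*": frozenset(
        {"file-read-data", "file-read-metadata", "file-read-xattr"}
    ),
}


def collapse_ops(ops: Set[str]) -> Set[str]:
    """Replace a complete set of child ops (or wildcard plus children) with the wildcard."""
    result = set(ops)
    for wildcard, children in FAMILIES.items():
        if wildcard in result or children <= result:
            result -= children
            result.add(wildcard)
    return result


def ops_equivalent(source_ops: Set[str], reversed_ops: Set[str]) -> bool:
    """Compare two op sets after collapsing both to wildcard form."""
    return collapse_ops(source_ops) == collapse_ops(reversed_ops)


print(ops_equivalent(
    {"file-read*"},
    {"file-read-data", "file-read-metadata", "file-read-xattr"},
))
```

Collapsing both sides before comparing means the check does not care which side used the wildcard. A partial family (for example only `file-read-data`) is left untouched.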
```python
from pawl.normalize import (
    CanonicalPolicy,
    CanonicalRule,
    CanonicalPredicate,
    normalize_source,
    normalize_reversed,
)
```

```python
from pawl.normalize.reversed import compare_source_and_reversed

result = compare_source_and_reversed(
    source_sbpl,
    reversed_sbpl,
    disconnected_filters=metadata.get("disconnected_filters"),
    param_bindings={"HOME": "/Users/alice"},
    filter_imports=True,
    collapse_wildcards=True,
    search_paths=[source_dir, system_profiles_dir],
)
if result["equivalent"]:
    print("Policies match at canonical level")
else:
    print("Diff:", result["diff"])
```

- Structural equivalence: exact canonical form match after normalization.
- Semantic equivalence: ignores require-any/require-all grouping differences.
- Relaxed equivalence (default): source rules must appear in reversed; reversed may include additional compiler-added rules.
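The difference between structural and relaxed equivalence can be sketched with sets of rules. This assumes a simplified rule model, a tuple of action, operation, and predicate set, which is a hypothetical stand-in for `CanonicalRule`. Semantic equivalence is omitted because it depends on how grouping is represented.

```python
from typing import FrozenSet, Tuple

# Hypothetical simplified rule: (action, operation, predicates)
Rule = Tuple[str, str, FrozenSet[str]]


def structurally_equivalent(source: FrozenSet[Rule],
                            reversed_: FrozenSet[Rule]) -> bool:
    """Exact match: both sides contain the same canonical rules."""
    return source == reversed_


def relaxed_equivalent(source: FrozenSet[Rule],
                       reversed_: FrozenSet[Rule]) -> bool:
    """Every source rule appears in reversed; extra reversed rules are tolerated."""
    return source <= reversed_


src: FrozenSet[Rule] = frozenset(
    {("allow", "file-read*", frozenset({'(subpath "/a")'}))}
)
# Reversed output carrying one additional compiler-added rule.
rev: FrozenSet[Rule] = src | {("allow", "mach-lookup", frozenset())}

print(structurally_equivalent(src, rev), relaxed_equivalent(src, rev))
```

The subset check is one-directional by design: a source rule missing from the reversed output still counts as a difference.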
When a structural difference is discovered between source and reversed:
- Characterize it: compiler behavior, reverser limitation, or genuine semantic difference.
- Add it here if it is a known compiler transformation.
- Add tests in `integration/tests/pawl/normalize/` and contract coverage where needed.
- Keep ownership boundaries clear: oracle facts in `pawl/normalize/predicate_merge`, comparison policy in `pawl/normalize`, render-time behavior in `pawl/reverse`.
| Module | Purpose |
|---|---|
| `__init__.py` | Public normalize exports |
| `policy.py` | `CanonicalPolicy`, `CanonicalRule`, `CanonicalPredicate` and diff logic |
| `source.py` | `normalize_source()` and source canonicalization helpers |
| `reversed.py` | `normalize_reversed()` and `compare_source_and_reversed()` |
| `passes/reconstruction.py` | Decode disconnected filters into canonical predicates |
| `passes/wildcard_collapse.py` | Collapse wildcard-family child operations with sidecar provenance |
| `operations.py` | Operation normalization rules and wildcard family mapping |
| `passes/deny_denormalize.py` | Convert require-not structures back to explicit deny rules |
| `baseline/imports.py` | Import expansion simulation and imported-op filtering |
| `baseline/predicates.py` | Baseline predicate and op filtering |
| `passes/predicate_dropout.py` | Bare-allow predicate-dropout analysis and filtering |
| `passes/predicate_filter.py` | Mapping-backed predicate contamination filtering sidecar |
| `boolean_canonicalizer.py` | Shared S-expression simplification utilities |
| `baseline/compiler_model.py` | Compiler behavior model helpers |
| `passes/entitlement.py` | Entitlement let-block extraction/injection for comparison |
| `ir.py` | Canonical IR normalization surface |
| `NORMALIZE.md` | Conceptual architecture: compiler information loss, recovery model, ownership boundaries |
| `predicate_merge/` | Predicate merge oracle: admissibility facts, collapse rules, validation gates |
- `pawl/reverse/`: produces reversed SBPL. It does not own comparison normalization.
- `pawl/structure/`: provides decoded IR and compile metadata used by normalization.
- `integration/ir/profile/five_point_harness.py`: consumes `compare_source_and_reversed()` for roundtrip validation.