Skip to content

Latest commit

 

History

History
258 lines (214 loc) · 11.8 KB

File metadata and controls

258 lines (214 loc) · 11.8 KB

ADR-051: Physics-Informed Graph Transformer Layers

Status

Accepted

Date

2026-02-25

Context

Many real-world graphs -- molecular dynamics simulations, particle physics detectors, protein interaction networks, climate meshes -- obey physical conservation laws, symmetries, and variational principles. Standard graph transformers learn representations from data alone, ignoring these inductive biases. This wastes training data (100x more samples required to implicitly learn energy conservation) and produces physically inconsistent predictions that diverge after a few integration steps.

RuVector already provides the building blocks for physics-informed graph transformers across several crates:

  • ruvector-mincut-gated-transformer/src/energy_gate.rs: EnergyGate, EnergyGateConfig for energy-based gating decisions
  • ruvector-attention/src/sheaf/restriction.rs: RestrictionMap for parallel transport (gauge connections on graph fiber bundles)
  • ruvector-attention/src/sheaf/attention.rs: SheafAttention, SheafAttentionConfig for sheaf cohomology attention
  • ruvector-attention/src/transport/sliced_wasserstein.rs: SlicedWassersteinAttention for optimal transport on graphs
  • ruvector-attention/src/pde_attention/diffusion.rs: DiffusionAttention for heat/diffusion equation on graphs
  • ruvector-attention/src/pde_attention/laplacian.rs: graph Laplacian operators for PDE discretization
  • ruvector-attention/src/curvature/fused_attention.rs: MixedCurvatureFusedAttention for Ricci flow
  • ruvector-verified/src/gated.rs: ProofTier, route_proof, verify_tiered for proof-gated verification

However, there is no unified module that composes these into physics-informed graph transformer layers with formally verified conservation laws. The research document docs/research/gnn-v2/22-physics-informed-graph-transformers.md outlines the theoretical framework but defines no implementation path through the existing crates.

Decision

We will implement a physics module in ruvector-graph-transformer behind the physics feature flag. The module provides three layer types -- HamiltonianGraphNet, LagrangianAttention, and GaugeEquivariantMP -- each integrated with the proof-gated mutation protocol (ADR-047) to certify conservation laws per forward step.

HamiltonianGraphNet

Symplectic leapfrog integration that PROVES energy is conserved, not just checks post-hoc:

/// Hamiltonian graph network with symplectic integration.
///
/// Each forward step produces a ProofGate<HamiltonianOutput> whose
/// proof requirement is energy conservation within tolerance.
pub struct HamiltonianGraphNet {
    /// Learned kinetic energy: T(p) via MLP.
    kinetic_net: MLP,
    /// Learned potential energy: V(q) + sum_{(i,j)} U(q_i, q_j).
    potential_net: GraphAttentionPotential,
    /// Integration timestep (fixed or learned).
    dt: f32,
    /// Leapfrog steps per layer.
    num_steps: usize,
    /// Energy tolerance for proof gate (relative |dE/E|).
    energy_tolerance: f64,
    /// Bridges to ruvector-mincut-gated-transformer::energy_gate.
    energy_gate: EnergyGateConfig,
}

impl HamiltonianGraphNet {
    /// Symplectic forward pass with energy conservation proof.
    ///
    /// Executes Stormer-Verlet leapfrog integration on the graph.
    /// After integration, computes |H_final - H_initial| / |H_initial|
    /// and routes through ProofTier::Reflex (< 10 ns) since this is
    /// a scalar comparison. If drift exceeds tolerance, escalates to
    /// ProofTier::Standard for diagnosis.
    pub fn forward(
        &self,
        positions: &mut [f32],    // [n x d] node positions (q)
        momenta: &mut [f32],      // [n x d] node momenta (p)
        graph: &impl GraphRepr,
        env: &mut ProofEnvironment,
    ) -> Result<ProofGate<HamiltonianOutput>>;

    /// Compute Hamiltonian H(q, p) = T(p) + V(q) + sum U(q_i, q_j).
    pub fn hamiltonian(
        &self,
        positions: &[f32],
        momenta: &[f32],
        graph: &impl GraphRepr,
    ) -> f32;
}

The proof requirement for each step is:

ProofRequirement::InvariantPreserved {
    invariant_id: ENERGY_CONSERVATION_INVARIANT,
}

This maps to ProofKind::DimensionEquality (scalar comparison of energy values) and routes to ProofTier::Reflex in steady state, keeping overhead below 10 ns per step.

GaugeEquivariantMP

Uses sheaf restriction maps as gauge connections:

/// Gauge-equivariant message passing using sheaf attention.
///
/// Restriction maps from ruvector-attention::sheaf serve as connection
/// forms (parallel transport operators) on the graph fiber bundle.
/// Attention weights are invariant under gauge transformations g_i at
/// each node because keys are parallel-transported to the query frame
/// before the dot product: alpha_{ij} = softmax(q_i^T A_{ij} k_j).
pub struct GaugeEquivariantMP {
    /// Sheaf attention (restriction maps = gauge connections).
    sheaf_attention: SheafAttention,
    /// Gauge group dimension.
    gauge_dim: usize,
    /// Yang-Mills regularization strength.
    ym_lambda: f32,
    /// Proof requirement: gauge invariance check.
    gauge_proof: ProofRequirement,
}

impl GaugeEquivariantMP {
    /// Gauge-invariant attention forward pass.
    ///
    /// Parallel-transports keys via RestrictionMap before dot product.
    /// Computes Yang-Mills action as regularization loss.
    pub fn forward(
        &self,
        queries: &[f32],
        keys: &[f32],
        values: &[f32],
        graph: &impl GraphRepr,
        env: &mut ProofEnvironment,
    ) -> Result<ProofGate<AttentionOutput>>;

    /// Yang-Mills action: S_YM = sum_{plaquettes} ||F_{ijk}||^2.
    /// Measures curvature (field strength) of the gauge connection.
    pub fn yang_mills_action(&self, graph: &impl GraphRepr) -> f32;
}

LagrangianAttention

Action-minimizing message passing via optimal transport:

/// Lagrangian attention using action-weighted optimal transport.
///
/// The attention weight between nodes i and j is proportional to
/// exp(-beta * W_2(mu_i, mu_j)), where W_2 is Wasserstein-2 distance.
/// This is the information-geometric dual of kinetic energy in
/// Wasserstein space: L = (1/2) ||d mu/dt||^2_{W_2}.
///
/// Delegates to ruvector-attention::transport::SlicedWassersteinAttention
/// for the transport computation and wraps in proof gate for
/// action bound verification.
pub struct LagrangianAttention {
    /// Sliced Wasserstein transport from ruvector-attention.
    transport: SlicedWassersteinAttention,
    /// Inverse temperature for action weighting.
    beta: f32,
    /// Variational integrator timestep.
    dt: f32,
    /// Action bound proof requirement.
    action_proof: ProofRequirement,
}

impl LagrangianAttention {
    /// Variational forward pass.
    ///
    /// Computes discrete Euler-Lagrange equations on the graph.
    /// Action bound is verified via ProofTier::Standard (bounded
    /// fuel for action functional evaluation).
    pub fn forward(
        &self,
        q_prev: &[f32],
        q_curr: &[f32],
        graph: &impl GraphRepr,
        env: &mut ProofEnvironment,
    ) -> Result<ProofGate<LagrangianOutput>>;
}

PDE Attention Integration

The existing ruvector-attention/src/pde_attention/diffusion.rs provides diffusion on graphs. The physics module wraps this with conservation proofs:

/// PDE attention with mass conservation proof.
///
/// Bridges to ruvector_attention::pde_attention::DiffusionAttention.
/// After each diffusion step, proves total mass is conserved:
/// sum_i h_i(t+dt) == sum_i h_i(t) within tolerance.
pub struct ConservativePdeAttention {
    diffusion: DiffusionAttention,
    mass_tolerance: f64,
}

Feature Flag

# In crates/ruvector-graph-transformer/Cargo.toml
[features]
physics = [
    "ruvector-mincut-gated-transformer/energy_gate",
    "ruvector-attention/pde_attention",
    "ruvector-attention/sheaf",
    "ruvector-attention/transport",
]

Consequences

Positive

  • Energy conservation is guaranteed by construction via symplectic integration and formally verified per step
  • Gauge invariance from sheaf attention ensures predictions are coordinate-independent
  • PDE attention with mass conservation proof prevents unphysical feature drift
  • Physics priors reduce required training data by encoding known laws, with estimated 100x improvement for molecular dynamics tasks
  • All layers compose with the proof-gated mutation protocol (ADR-047), producing auditable attestation chains

Negative

  • Leapfrog integration adds O(num_steps) overhead per layer compared to a standard residual connection
  • Yang-Mills regularization requires computing holonomies around plaquettes (small graph cycles), which is O(triangles) per forward pass
  • LagrangianAttention requires Newton iteration to solve the implicit discrete Euler-Lagrange equation (5 iterations by default)
  • Users must supply phase-space representations (q, p) rather than generic node features

Risks

  • If energy tolerance is set too tight, Reflex-tier proofs will fail and escalate to Standard/Deep, exceeding the 2% overhead budget (ADR-047). Mitigation: default tolerance of 1e-4 relative drift, which is achievable with double-precision leapfrog
  • Sheaf restriction maps as gauge connections assume orthogonal gauge group. Extending to non-abelian groups (SU(2), SU(3)) requires operator ordering care and is deferred to a follow-up ADR
  • Noether symmetry mining (automatic conservation law discovery) is not included in this ADR due to training cost; it is an extension for ADR-049's verified training pipeline

Implementation

  1. Create crates/ruvector-graph-transformer/src/physics/mod.rs re-exporting all layer types
  2. Implement HamiltonianGraphNet in physics/hamiltonian.rs, bridging to ruvector-mincut-gated-transformer::energy_gate
  3. Implement GaugeEquivariantMP in physics/gauge.rs, bridging to ruvector-attention::sheaf::{SheafAttention, RestrictionMap}
  4. Implement LagrangianAttention in physics/lagrangian.rs, bridging to ruvector-attention::transport::SlicedWassersteinAttention
  5. Implement ConservativePdeAttention in physics/pde.rs, bridging to ruvector-attention::pde_attention::DiffusionAttention
  6. Add benchmark: benches/physics_bench.rs measuring energy drift over 10,000 leapfrog steps on a 1,000-node molecular graph
  7. Integration test: compose HamiltonianGraphNet + GaugeEquivariantMP in a full forward pass, verify attestation chain integrity
  8. Verify build: cargo test --features physics -p ruvector-graph-transformer

References

  • ADR-046: Graph Transformer Unified Architecture (module structure, AttentionRegistry)
  • ADR-047: Proof-Gated Mutation Protocol (ProofGate<T>, ProofRequirement, three-tier routing)
  • ADR-049: Verified Training Pipeline (conservation law invariants during training)
  • Research: docs/research/gnn-v2/22-physics-informed-graph-transformers.md
  • crates/ruvector-mincut-gated-transformer/src/energy_gate.rs: EnergyGate, EnergyGateConfig
  • crates/ruvector-attention/src/sheaf/restriction.rs: RestrictionMap
  • crates/ruvector-attention/src/sheaf/attention.rs: SheafAttention, SheafAttentionConfig
  • crates/ruvector-attention/src/transport/sliced_wasserstein.rs: SlicedWassersteinAttention
  • crates/ruvector-attention/src/pde_attention/diffusion.rs: DiffusionAttention
  • crates/ruvector-attention/src/pde_attention/laplacian.rs: graph Laplacian
  • crates/ruvector-attention/src/curvature/fused_attention.rs: MixedCurvatureFusedAttention
  • crates/ruvector-verified/src/gated.rs: ProofTier, route_proof, verify_tiered
  • crates/ruvector-verified/src/proof_store.rs: ProofAttestation, 82-byte witnesses
  • Greydanus et al., "Hamiltonian Neural Networks" (arXiv:1906.01563, 2019)
  • Cranmer et al., "Lagrangian Neural Networks" (arXiv:2003.04630, 2020)
  • Cohen et al., "Gauge Equivariant Convolutional Networks" (arXiv:1902.04615, 2019)
  • Hansen & Gebhart, "Sheaf Neural Networks" (arXiv:2012.06333, 2020)