Efficient per-row model matrix evaluation for Julia statistical models using position-mapped compilation and compile-time specialization.
- Memory efficiency: Zero-allocation per-row evaluation after a one-time compilation step
- Performance: Faster single-row evaluation than building full model matrices
- Compatibility: Supports StatsModels.jl formulas, including interactions and transformations
- Scenario analysis: Memory-efficient variable override system for counterfactual analysis
- Unified architecture: Single compilation pipeline accommodates diverse formula structures
- Ecosystem integration: Compatible with GLM.jl, MixedModels.jl, and StandardizedPredictors.jl
Fit a model, convert data to a column table, compile once, then evaluate rows quickly without allocations.
using Pkg
Pkg.add(url="https://github.com/emfeltham/FormulaCompiler.jl")
using FormulaCompiler, GLM, DataFrames, Tables
# Fit your model normally
df = DataFrame(
y = randn(1000),
x = randn(1000),
z = abs.(randn(1000)) .+ 0.1,
group = categorical(rand(["A", "B", "C"], 1000))
)
model = lm(@formula(y ~ x * group + log(z)), df)
# Compile once for efficient repeated evaluation
data = Tables.columntable(df)
compiled = compile_formula(model, data)
row_vec = Vector{Float64}(undef, length(compiled))
# Memory-efficient evaluation suitable for repeated calls
compiled(row_vec, data, 1) # Zero allocations after warmup
# Pre-compile for optimal performance
data = Tables.columntable(df)
compiled = compile_formula(model, data)
row_vec = Vector{Float64}(undef, length(compiled))
# Evaluate individual rows
compiled(row_vec, data, 1) # Row 1
compiled(row_vec, data, 100) # Row 100
# Evaluate multiple rows
matrix = Matrix{Float64}(undef, 10, length(compiled))
modelrow!(matrix, compiled, data, 1:10)
# Single row (allocating)
row_values = modelrow(model, data, 1)
# Multiple rows (allocating)
matrix = modelrow(model, data, [1, 5, 10, 50])
# In-place with automatic caching
row_vec = Vector{Float64}(undef, size(modelmatrix(model), 2))
modelrow!(row_vec, model, data, 1) # Uses cache
# Create evaluator object
evaluator = ModelRowEvaluator(model, df)
# Efficient evaluation
result = evaluator(1) # Row 1
evaluator(row_vec, 1) # In-place evaluation
Create data scenarios with variable overrides for policy analysis and counterfactuals:
data = Tables.columntable(df)
# Create policy scenarios
baseline = create_scenario("baseline", data)
treatment = create_scenario("treatment", data;
treatment = true,
dose = 100.0
)
policy_change = create_scenario("policy", data;
x = mean(df.x), # Set to population mean
group = "A", # Override categorical
regulatory = true # Add policy variable
)
# Evaluate scenarios
compiled = compile_formula(model, data)
row_vec = Vector{Float64}(undef, length(compiled))
compiled(row_vec, baseline.data, 1) # Original data
compiled(row_vec, treatment.data, 1) # With treatment
compiled(row_vec, policy_change.data, 1) # Policy scenario
Generate comprehensive scenario combinations:
# Create all combinations
policy_grid = create_scenario_grid("policy_analysis", data, Dict(
:treatment => [false, true],
:dose => [50.0, 100.0, 150.0],
:region => ["North", "South"]
))
# Evaluates 2×3×2 = 12 scenarios
results = Matrix{Float64}(undef, length(policy_grid), length(compiled))
for (i, scenario) in enumerate(policy_grid)
compiled(view(results, i, :), scenario.data, 1)
end
scenario = create_scenario("dynamic", data; x = 1.0)
# Modify scenarios iteratively
set_override!(scenario, :y, 100.0) # Add override
update_scenario!(scenario; x = 2.0, z = 0.5) # Bulk update
remove_override!(scenario, :y) # Remove override
using GLM
# Linear models
linear_model = lm(@formula(mpg ~ hp * cyl + log(wt)), mtcars)
data_mtcars = Tables.columntable(mtcars)
compiled = compile_formula(linear_model, data_mtcars)
# Generalized linear models
logit_model = glm(@formula(vs ~ hp + wt), mtcars, Binomial(), LogitLink())
compiled_logit = compile_formula(logit_model, data_mtcars)
# Both work identically
row_vec = Vector{Float64}(undef, length(compiled))
compiled(row_vec, data_mtcars, 1)
Automatically extracts fixed effects from mixed models:
using MixedModels
# Mixed model with random effects
mixed_model = fit(MixedModel, @formula(y ~ x + z + (1|group) + (1+x|cluster)), df)
# Compiles only the fixed effects: y ~ x + z
compiled = compile_formula(mixed_model, Tables.columntable(df))
# Random effects are automatically stripped
fixed_form = fixed_effects_form(mixed_model) # Returns: y ~ x + z
Standardized predictors are integrated (currently only ZScore()):
using StandardizedPredictors
contrasts = Dict(:x => ZScore(), :z => ZScore())
model = lm(@formula(y ~ x + z + group), df, contrasts=contrasts)
compiled = compile_formula(model, Tables.columntable(df)) # Standardization built-in
FormulaCompiler provides memory-efficient computation of derivatives and marginal effects with standard errors:
# Build derivative evaluator
vars = [:x, :z]
de = build_derivative_evaluator(compiled, data; vars=vars)
β = coef(model)
# Single-row marginal effect gradient (η case)
gβ = Vector{Float64}(undef, length(compiled))
me_eta_grad_beta!(gβ, de, β, 1, :x) # Minimal memory allocation
# Standard error via delta method
Σ = vcov(model)
se = delta_method_se(gβ, Σ) # Efficient computation
# Average marginal effects with backend selection
rows = 1:100
gβ_ame = Vector{Float64}(undef, length(compiled))
accumulate_ame_gradient!(gβ_ame, de, β, rows, :x; backend=:fd) # Memory-efficient computation
se_ame = delta_method_se(gβ_ame, Σ)
println("AME standard error for x: ", se_ame)
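The delta-method standard error used above is the usual quadratic form in the gradient and the coefficient covariance matrix. A minimal stdlib sketch of the computation (illustrative only; `delta_se` here is a stand-in, not the package's `delta_method_se`):

```julia
using LinearAlgebra

# Delta method: for a scalar function of the coefficients with gradient g,
# Var ≈ g' Σ g, where Σ is the coefficient covariance matrix.
delta_se(g::AbstractVector, Σ::AbstractMatrix) = sqrt(dot(g, Σ * g))

# Toy check with a diagonal covariance: g = [1, 2], Σ = diag(4, 1)
g = [1.0, 2.0]
Σ = Diagonal([4.0, 1.0])
delta_se(g, Σ)  # sqrt(1^2 * 4 + 2^2 * 1) = sqrt(8)
```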
Key capabilities:
- Dual backends: :fd (memory-efficient) and :ad (higher numerical accuracy)
- η and μ cases: Linear predictors and link-function transformations
- Delta method: Standard error computation for marginal effects
- Validated implementation: Cross-validated against reference implementations
The scenario system employs OverrideVector for efficient data representation:
# Traditional approach: memory allocation for large datasets
traditional = fill(42.0, 1_000_000) # ~8MB allocation
# FormulaCompiler approach: reduced memory overhead
efficient = OverrideVector(42.0, 1_000_000) # ~32 bytes allocation
# Identical computational interface
traditional[500_000] == efficient[500_000] # true
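The memory figure follows because a constant override only needs to store the value and a length. A minimal sketch of the idea (illustrative only; `ConstVector` is a toy stand-in, not the package's actual OverrideVector definition):

```julia
# A lazy constant vector: stores one value plus a length,
# but indexes like a dense Vector of that length.
struct ConstVector{T} <: AbstractVector{T}
    value::T
    len::Int
end

Base.size(v::ConstVector) = (v.len,)
Base.getindex(v::ConstVector, i::Int) = (checkbounds(v, i); v.value)

v = ConstVector(42.0, 1_000_000)
v[500_000]  # 42.0, with O(1) storage
sizeof(v)   # 16 bytes on 64-bit for this toy struct
```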
- Basic terms: x, log(z), x^2, (x > 0)
- Boolean variables: Vector{Bool} treated as continuous (false → 0.0, true → 1.0)
- Categorical variables: All contrast types, ordered/unordered
- Interactions: x * group, x * y * z, log(z) * group
- Functions: log, exp, sqrt, sin, cos, abs, ^, boolean operators
- Complex formulas: x * log(z) * group + sqrt(abs(y)) + (x > mean(x))
- Standardized predictors: ZScore, custom transformations
- Mixed models: Automatic fixed-effects extraction
- Pre-compile formulas for repeated evaluation:
data = Tables.columntable(df)
compiled = compile_formula(model, data) # Do once
# Then call compiled() millions of times
- Use column-table format for best performance:
data = Tables.columntable(df) # Convert once, reuse many times
- Pre-allocate output vectors:
row_vec = Vector{Float64}(undef, length(compiled)) # Reuse across calls
- Batch operations when possible:
# Better: batch evaluation
matrix = Matrix{Float64}(undef, 1000, length(compiled))
modelrow!(matrix, compiled, data, 1:1000)
# Avoid: many single evaluations with allocation
results = [modelrow(model, data, i) for i in 1:1000]
Comparative performance evaluation across formula types:
using BenchmarkTools
# Traditional approach (full model matrix construction)
@benchmark modelmatrix(model)[1, :]
# Note: Constructs entire model matrix, computationally intensive for large datasets
# FormulaCompiler optimized approach
data = Tables.columntable(df)
compiled = compile_formula(model, data)
row_vec = Vector{Float64}(undef, length(compiled))
@benchmark compiled(row_vec, data, 1)
# Performance improvement with reduced memory allocation
Derivative computation:
- ForwardDiff-based operations involve some per-call allocations due to automatic differentiation requirements
- The finite-difference backend provides a lower-allocation alternative, validated against automatic differentiation results
- Marginal effects computations utilize preallocated buffers
The automatic differentiation backend for batch gradient operations (accumulate_ame_gradient!) involves allocations. Users with strict memory constraints should use the :fd backend, which provides mathematically equivalent results with reduced memory overhead.
FormulaCompiler achieves efficiency through a unified compilation pipeline that transforms statistical formulas into specialized, type-stable execution paths:
- Position mapping: Operations utilize compile-time position specialization
- Adaptive dispatch: Threshold-based approach (≤25 operations: recursive dispatch, >25 operations: generated functions) chosen based on Julia's compilation limits
- Unified design: Single compilation system accommodates diverse formula structures without special-case handling
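As an illustration of the position-mapping idea (a simplified sketch, not FormulaCompiler's internals): encoding the source column and output position of each operation in type parameters lets Julia specialize and unroll the per-row loop at compile time, so no runtime lookup is needed.

```julia
# Toy "compiled op": copies input column Col into output slot Pos,
# with both encoded as type parameters so dispatch specializes per formula.
struct CopyOp{Col,Pos} end

execute!(out, data, row, ::CopyOp{Col,Pos}) where {Col,Pos} =
    (out[Pos] = data[Col][row]; nothing)

# A "compiled formula" is a tuple of ops; recursive dispatch unrolls it.
run_ops!(out, data, row, ops::Tuple) =
    (execute!(out, data, row, first(ops)); run_ops!(out, data, row, Base.tail(ops)))
run_ops!(out, data, row, ::Tuple{}) = nothing

data = (x = [1.0, 2.0], z = [10.0, 20.0])  # a column table (NamedTuple)
ops = (CopyOp{:x, 1}(), CopyOp{:z, 2}())
out = Vector{Float64}(undef, 2)
run_ops!(out, data, 2, ops)  # out is now [2.0, 20.0]
```

The recursive-tuple pattern mirrors the "≤25 operations" path described above; past a threshold, generated functions play the same role without deep recursion.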
- Monte Carlo simulations: Millions of model evaluations
- Bootstrap resampling: Repeated matrix construction
- Marginal effects: Numerical derivatives require many evaluations
- Policy analysis: Evaluate many counterfactual scenarios
- Real-time applications: Low-latency prediction serving
- Large-scale inference: Memory-efficient batch processing
Contributions are welcome!
- Margins.jl: Marginal effects (built on this package)
- GLM.jl: Generalized linear models
- MixedModels.jl: Mixed-effects models
- StandardizedPredictors.jl: Standardized predictors
- StatsModels.jl: Formula interface
- DIAGRAMS.md: Complete visual guide with system architecture, technical implementation, and usage workflows
- categorical_handling.md: Detailed explanation of categorical variable and interaction handling
- docs/diagrams/: Individual diagram files for embedding in documentation
MIT License. See LICENSE for details.