We are building a library to convert Jimple three-address code (as generated by SootUp) into FAIR (Fixpoint-aware Abstract Intermediate Representation), an IR for static analyses (not for the concrete program being analyzed). A FAIR program is the specialized, executable form of an analysis A on a particular program P.
The FAIR paper is dated March 14, 2026, and describes:
- Core data-flow instructions:
copy,set_top,set_bottom,set_const,apply,assume,load,store,goto,call,ret - Context statements:
ctx_init,ctx_extend,ctx_klimit,ctx_merge - Fixpoint control:
widen - A
foreach/foldconstruct for iteration over sets (e.g. access paths)
A Kotlin-builder DSL that allows defining the semantics of a static analysis using:
- A lattice (induced by an abstract domain and merge operator)
- Data-flow functions on the Jimple expression level — how evaluation of a Jimple expression by the static analysis transforms abstract-domain values
The DSL defines flow functions keyed by Jimple expression types, not statement types. The converter handles structural statement→FAIR mapping (e.g., JAssignStmt with an InvokeExpr on the RHS wires the FAIR apply result into the LHS variable). The DSL only says what the expression evaluation means in the abstract domain.
Phase 1 — Conversion time (Jimple→FAIR):
- Converter walks each Jimple statement, finds its expressions
- Calls
analysis.processExpression(expr)— a single, uniform entry point - Inside, matcher predicates fire as callbacks (e.g.,
matching { it.nameIs("getParam") }) - First match determines which
DomainOperations to embed in FAIRapplystatements - Returns
List<FairApply<T>>or empty list if expression is irrelevant
Phase 2 — Interpretation time (FAIR execution):
- FAIR interpreter executes
FairApply.execute(env) - This calls back into
DomainOperation.execute(dest, args, env) - The operation reads/writes the
AbstractEnvironment<T>via lattice operations
Mirrors SootUp's Jimple expression tree. Each handler holds a default DomainOperation (generic, lattice-only):
ExprHandler<T> — abstract base (holds defaultOp)
├── UnaryExprHandler<T> — default: IdentityOp (pass through)
│ ├── NegExprHandler<T> — -x propagates abstract value of x
│ ├── LengthExprHandler<T> — arr.length propagates from arr
│ ├── CastExprHandler<T> — (T)x propagates from x
│ └── InstanceOfExprHandler<T> — x instanceof T propagates from x
├── BinaryExprHandler<T> — default: JoinOp (join operands)
├── InvokeExprHandler<T> — matcher-based dispatch
└── AllocationExprHandler<T> — default: SetTopOp (new = ⊤)
├── NewExprHandler<T>
├── NewArrayExprHandler<T>
└── NewMultiArrayExprHandler<T>
All defined purely via Lattice<T> — no analysis-specific knowledge:
| Operation | What it does | Used by default for |
|---|---|---|
IdentityOp |
env[dest] = env[args[0]] |
Unary exprs (neg, cast, length, instanceof) |
JoinOp |
env[dest] = join(env[a1], env[a2]) |
Binary exprs (arithmetic, bitwise, comparison) |
SetTopOp |
env[dest] = lattice.top |
Allocation exprs (new), Sources |
SetBottomOp |
env[dest] = lattice.bottom |
Sanitizers |
CheckTopOp |
if leq(top, env[arg]) → report |
Sinks |
For InvokeExpr, the handler has a list of rules, each pairing a matcher predicate with domain operations. The DSL provides three categories:
source: matched invoke →SetTopOpon return value or argument access pathsink: matched invoke →CheckTopOpon argument, with reporter callbacksanitizer: matched invoke →SetBottomOpon return value or argument access pathdefaultPropagation: fallback for unmatched invocations
fun defineTaintAnalysis() = fairAnalysis("TaintAnalysis", TaintLattice) {
onInvokeExpr {
source(matching { it.nameIs("getParam") }, sourceId = "HTTP_PARAM") {
taintsReturnValue()
}
source(matching { it.nameIs("getSecret") }, sourceId = "SECRET_FIELD") {
taintsArgument(0, AccessPath("arg0") / "secretField")
}
sink(matching { it.nameIs("sqlQuery") }, sinkId = "SQL") {
checksArgument(0) { finding ->
println("!! FINDING: ${finding.kind} — ${finding.detail}")
}
}
sanitizer(matching { it.nameIs("sanitize") }, sanitizerId = "HTML_SANITIZE") {
killsTaintOnReturnValue()
}
}
// No onBinopExpr needed — default JoinOp is correct for taint
// No onNegExpr/onCastExpr needed — default IdentityOp propagates taint
}Lattice<T> interface: bottom, top, join, meet, leq
FlatLattice<E>: sealed class Flat<E> = Bottom | Top | Element<E>
- join of two distinct elements → ⊤
- meet of two distinct elements → ⊥
TaintLattice: specializes so bottom = Untainted, top = Tainted
- Two-point lattice: Untainted ⊑ Tainted
Based on SootUp's sootup.core.jimple.common.expr package:
- Binary: JAddExpr, JSubExpr, JMulExpr, JDivExpr, JRemExpr, JAndExpr, JOrExpr, JXorExpr, JShlExpr, JShrExpr, JUshrExpr, JEqExpr, JNeExpr, JLtExpr, JLeExpr, JGtExpr, JGeExpr, JCmpExpr, JCmpgExpr, JCmplExpr
- Unary: JNegExpr, JLengthExpr
- Cast/TypeCheck: JCastExpr, JInstanceOfExpr
- Invoke: JStaticInvokeExpr, JVirtualInvokeExpr, JInterfaceInvokeExpr, JSpecialInvokeExpr, JDynamicInvokeExpr
- Allocation: JNewExpr, JNewArrayExpr, JNewMultiArrayExpr
- Expression-level, not statement-level: The converter handles statements; the DSL handles expressions.
- Callbacks all the way: The converter is oblivious to source/sink/sanitizer categories — it hands expressions to the DSL; matcher predicates fire as callbacks to classify them.
- FAIR
applyholds a reference to aDomainOperation: At FAIR interpretation time, the interpreter calls back into the operation, which performs the actual lattice mutation. - Generic defaults for all expression types: Unary=identity, binary=join, allocation=set_top. Analyses only override what they need.
- No special binop casing for taint: The default JoinOp already does
join(left, right), which for the taint lattice means "if either is tainted, result is tainted." - Source/sink/sanitizer builders compose the five generic DomainOperations:
taintsReturnValue()=SetTopOp,killsTaintOnReturnValue()=SetBottomOp,checksArgument()=CheckTopOp, etc.
From the FAIR paper's own open questions:
- Modeling heap representations and pointer/callgraph analysis in FAIR
- Handling exceptional flows
- Backward and demand-driven analyses
- Aliasing and strong updates
- User feedback integration (e.g., "ignore this path")
From our DSL design:
- Context sensitivity DSL (separate from expression handling):
contextSensitivity { callSiteSensitive(k = 2) } - Product lattices for multi-domain analyses
- Interval lattice for constant analysis
- FlowDroid-compatible source/sink specification import:
fromFlowDroidConfig("SourcesAndSinks.txt") - Matching on receiver type, not just method name
- Taint-through for wrapper methods (beyond source/sink/sanitizer)
- Access path model integration into lattice definition
FairDSL.kt— The complete implementation with all of the above, including amain()end-to-end demo.
The FAIR paper (attached as PDF) describes the IR syntax, semantics, and examples for taint analysis, constant analysis, call-string/IFDS/IDE approaches, and polymorphic call handling. Sections 4-7 contain the running examples we referenced.