Conversation
Implements a Debug Adapter Protocol server that allows stepping through abstract interpretation in IntelliJ via the LSP4IJ plugin. The debugger pauses at each FAIR statement, showing the evolving abstract environment with threads, variables (parameters, locals, analysis info), and virtual FAIR IR source generated on-the-fly. Key components: - DebugListener interface injected into interpreter (zero overhead when null) - BlockingDebugListener pauses interpreter via ReentrantLock+Condition - FairDebugServer (LSP4J) handles DAP requests from IntelliJ - FairDebugSession manages thread-safe shared state between threads - FairSourceProvider generates virtual FAIR IR listings per method - InterpreterBridge wires SootUp/FAIR/interpreter with the debugger - Pre-configured LSP4IJ DAP run configurations in .run/ Also fixes: - RPO fixpoint strategy skipping all blocks on first iteration - IFDSCallSemantics crashing on abstract/native methods from CHA Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When LSP4IJ launches the DAP server with noDebug=true (Run button instead of Debug button), the server now prints a clear error message and sends terminated/exited events so the client disconnects cleanly, instead of silently hanging. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…Up view The visualizer SootUp view only received the temp compile directory, missing the stub classes (stubs.taintAnalysis.*) that TaintExample depends on. Added vararg additionalClassPaths to SootUpManager.createView and passed the test classes directory from ConvertEndpoint. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- stopOnEntry (default true) pauses at first FAIR statement on launch - Send terminated/exited events when interpretation completes - Run configs updated with stopOnEntry in launch configuration - Revert unsuccessful breakpoint infrastructure (LSP4IJ limitation) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix off-by-one in computeLineNumber (skip block header line) - Pass callee FAIRBody through onMethodEntry so pushFrame registers the correct source reference instead of the caller s body - Add session-unique path to DAP source objects to prevent LSP4IJ from serving stale cached content across debug runs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Resolve merge conflict in CallHandler.kt by adopting the new methodRegistry.getMethodBySignature() API. Also migrate InterpreterBridge.kt from the removed program.classes API to methodRegistry.allMethods(). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
AnalysisType.valueOf("CONSTPROP") silently failed because the enum
name is CONSTANT_PROPAGATION, causing all analyses to fall back to
TAINT_ANALYSIS. Use direct string matching like the develop branch.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Include parameter locals and synthetic locals (e.g. constant_value_*) alongside regular locals in the debugger Threads & Variables panel. Sort variables alphabetically by name. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Save the caller's environment snapshot, body, and block onto a stack when entering a callee method, and restore them on exit. Previously the Variables view would show the callee's environment after stepping over or returning from a call. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Here the method can be registered for the kind of optimization it should undergo after the generator pass has been done
The file selector now recursively discovers test files in subdirectories (e.g., TaintAnalysis/) and groups them using <optgroup> in the dropdown. Fixes compilation and class lookup for packaged test classes by respecting package directory structure and splitting qualified class names correctly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fixed Taint analysis: when constants are assigned, we wish to kill the taint. This fixes previously detected bugs in the test case TaintAssignmentChainTest!
… failing assertions When assertion tests fail, `./gradlew debug --tests "..."` generates IntelliJ DAP run configs (.run/Debug *.run.xml) that launch the FAIR debugger with a breakpoint at the failing method. The --break-method flag skips to the named method before pausing. Also adds methodName to Finding for richer diagnostics. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… no-op
FAIRJoinStmt was treated as a no-op, causing return values from methods
like passthrough(s) { return s } to lose their abstract state. Now folds
all locals through lattice.join and stores the result in the destination.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ged state Assertion findings (assertTop, assertBottom, etc.) are non-monotone — their result can change between fixpoint iterations as abstract values stabilize. Previously they were collected eagerly alongside monotone taint findings, causing spurious failures when evaluated at intermediate states. Now assertion findings use a last-write-wins map keyed by FAIRStmt identity in StmtExecutor, merged into the final findings list by FixpointEngine only after convergence. This ensures assertions reflect the eventual fixpoint. Updated documentation in Assertions.java, DomainOperation.kt, and CLAUDE.md to clarify converged-state evaluation semantics. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
16 test files (76 methods) covering all DSL constructs: integer literals, arithmetic/bitwise/shift operators, comparisons, branch pruning via FlatIntAssumeHandler, abs() invoke rule, control flow, assignment chains, overflow, and mixed operations. Three tests document expected IFDS limitations where inter-procedural calls cannot correlate multiple arguments. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduces assertToStringEquals(x, "value") for checking abstract state via toString comparison. The string literal is extracted at conversion time from the ConstantNormalizer-generated preceding assignment and embedded in AssertToStringEqualsOp. Also adds formatLatticeValue() as a shared ≪…≫ wrapper used by Flat.Element, Lifted.Value, and the new assertion, and adds toString() overrides to Lifted sealed class members. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
10 test files (23 methods) covering file lifecycle typestate tracking: init, open/close transitions, error states, assignment propagation, multi-object independence, predicate narrowing, inter-procedural calls, and field access. Uses assertToStringEquals to check abstract state values directly. 4 known failures document inter-procedural receiver mutation and field-based typestate limitations. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR expands the Jimple→FAIR conversion and analysis pipeline with (1) push-based optimization hint registration for fold-heavy methods, (2) a generalized constant-normalization pass for Jimple bodies, and (3) configurable + more efficient call-graph leveling (SCC + leveling strategies), with additional benchmarking/reporting to compare phases.
Changes:
- Add an
OptimizationRegistrythat records per-method fold statistics at fold-creation time (viaFoldHelper) to surface alias-analysis candidates. - Replace the old field-store-only normalization with a broader
ConstantNormalizerthat extracts constants from multiple expression positions into fresh locals. - Extend call-graph topological sorting with selectable SCC/leveling strategies and add method-level benchmark diff reporting.
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| TODO.MD | Updates generator TODO items and marks array handling as done. |
| src/test/kotlin/OptimizationRegistryTest.kt | Adds unit tests for the new optimization registry behavior. |
| src/test/kotlin/converter/ConstantNormalizerTest.kt | Adds tests validating constant extraction and name-collision handling. |
| src/test/kotlin/benchmark/FAIRGeneratorBenchmarkTest.kt | Adds per-method metrics + phase-to-phase diff report, plus CG-opt summarized/pruned annotations. |
| src/main/kotlin/semantics/FAIRSemanticsContext.kt | Makes context policy non-null (default NoContextFactory) and adds resolveContextId. |
| src/main/kotlin/semantics/call/MonotoneCallSemantics.kt | Removes !! on context policy and routes call context via resolveContextId. |
| src/main/kotlin/semantics/call/IFDSCallSemantics.kt | Adjusts call emission for void callees and updates argument-mapping calls to pass context. |
| src/main/kotlin/semantics/call/CallHelper.kt | Uses resolveContextId when building call statements; updates API to accept context. |
| src/main/kotlin/optimizer/OptimizationRegistry.kt | Implements concurrent fold-stat tracking and reporting. |
| src/main/kotlin/optimizer/OptimizationHint.kt | Defines the alias-analysis candidate hint with fold-count breakdown. |
| src/main/kotlin/optimizer/MethodOptimizationProfile.kt | Adds a method-level profile wrapper + readable toString. |
| src/main/kotlin/helper/FoldHelper.kt | Push-registers folds into OptimizationRegistry during conversion. |
| src/main/kotlin/helper/CallGraphTopologicalSorter.kt | Adds Tarjan SCC, alternative leveling strategy, and new strategy enums/params. |
| src/main/kotlin/converter/interceptor/FieldStoreNormalizer.kt | Removes the old field-store-only normalizer. |
| src/main/kotlin/converter/interceptor/ConstantNormalizer.kt | Adds the new multi-position constant extraction transformer. |
| src/main/kotlin/converter/FAIRTranslationSuite.kt | Switches normalization to ConstantNormalizer and standardizes context policy selection. |
| src/main/kotlin/converter/FAIRMethodConverter.kt | Uses blocksSorted and makes context init unconditional-call to the policy. |
| src/main/kotlin/converter/ConverterConfig.kt | Adds config knobs for SCC algorithm and leveling strategy. |
| fair.core/src/main/kotlin/model/type/ClassType.kt | Fixes equals/hashCode to include packageName (aligning with toString). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| assertThat(result.locals.map { it.name }).contains("\$cTmp1") | ||
| assertThat(result.locals.map { it.name }).doesNotContain("\$cTmp0".repeat(2)) // still only one $cTmp0 |
There was a problem hiding this comment.
The final assertion is ineffective: "$cTmp0".repeat(2) produces the single string "$cTmp0$cTmp0", so this does not detect duplicate locals named "$cTmp0". If the intent is to ensure the original $cTmp0 local is not duplicated, assert on the count of that name (or on distinct()/containsExactlyOnce).
| assertThat(result.locals.map { it.name }).contains("\$cTmp1") | |
| assertThat(result.locals.map { it.name }).doesNotContain("\$cTmp0".repeat(2)) // still only one $cTmp0 | |
| val localNames = result.locals.map { it.name } | |
| assertThat(localNames).contains("\$cTmp1") | |
| assertThat(localNames.count { it == "\$cTmp0" }).isEqualTo(1) // still only one $cTmp0 |
| context.contextPolicy.emitInit( | ||
| ContextId.Root(methodId), | ||
| if (CallGraphIndex.hasCaller(methodSignature)) { | ||
| ContextInitSource.FromCall | ||
| } else { | ||
| ContextInitSource.Method(methodId) | ||
| }, | ||
| )?.takeIf { it.isNotEmpty() }?.let { | ||
| ).takeIf { it.isNotEmpty() }?.let { | ||
| entryBlock.statements.addAll(it) | ||
| context.currentContext = it.last().ctxId | ||
| } |
There was a problem hiding this comment.
currentContext is only set when contextPolicy.emitInit(...) returns a non-empty list. With NoContextFactory (used for ContextStrategy.NONE), emitInit returns an empty list, leaving currentContext null; MonotoneCallSemantics later dereferences context.currentContext!! and will throw. Consider initializing currentContext to the root ContextId regardless of whether context statements are emitted (or making MonotoneCallSemantics tolerate the no-context policy).
| * Computes conversion levels for [methods] given a [callees] adjacency map. | ||
| * | ||
| * @param methods the filtered set of methods to convert | ||
| * @param callees maps each method to its direct callees within [methods] | ||
| * @param sccAlgorithm which SCC detection algorithm to use (default: [SccAlgorithm.KOSARAJU]) | ||
| * @param levelingStrategy which level-assignment strategy to use (default: [LevelingStrategy.KAHNS_BFS]) | ||
| * @return list of levels ordered from index 0 (leaves, no callees) to N (entry-point | ||
| * callers). Each level is a list of SCCs; each SCC is the [Set] of | ||
| * mutually-recursive methods that must be converted sequentially. | ||
| */ | ||
| fun computeLevels( | ||
| methods: Set<MethodSignature>, | ||
| callees: Map<MethodSignature, Set<MethodSignature>>, | ||
| sccAlgorithm: SccAlgorithm = SccAlgorithm.TARJAN, | ||
| levelingStrategy: LevelingStrategy = LevelingStrategy.KAHNS_BFS, | ||
| ): List<List<Set<MethodSignature>>> { |
There was a problem hiding this comment.
KDoc says sccAlgorithm defaults to SccAlgorithm.KOSARAJU, but computeLevels actually defaults to SccAlgorithm.TARJAN. Update the documentation (or the default) so callers understand which algorithm is used by default.
| data class ConverterConfig( | ||
| val parallelConversion: Boolean = true, | ||
| val callGraphOptimization: Boolean = true, | ||
| val excludedPackagePrefixes: Set<String> = JvmLibraryFilter.DEFAULT_EXCLUDED_PREFIXES, | ||
| val sccAlgorithm: SccAlgorithm = SccAlgorithm.TARJAN, | ||
| val levelingStrategy: LevelingStrategy = LevelingStrategy.KAHNS_BFS, |
There was a problem hiding this comment.
sccAlgorithm and levelingStrategy are added to ConverterConfig, but there is no usage in the production code path (the conversion still calls CallGraphTopologicalSorter.computeLevels(toConvert, calleesMap) without passing these). Either wire these options into FAIRWholeProgramConverter when computing levels or remove them to avoid a misleading/unused configuration surface.
| val isOpaque = m.fairBody.fairStmts.isEmpty() && m.fairBody.cfg == null | ||
| key to MethodMetrics( | ||
| stmtCount = m.fairBody.fairStmts.size, | ||
| blockCount = m.fairBody.cfg?.blocks?.size ?: 0, | ||
| cfgEdgeCount = m.fairBody.cfg?.blocks?.values?.sumOf { b -> b.successorIds.size } ?: 0, | ||
| loopHeaders = m.fairBody.cfg?.blocks?.values?.count { b -> b.isLoopHeader } ?: 0, | ||
| joinPoints = m.fairBody.cfg?.blocks?.values?.count { b -> b.isJoinPoint } ?: 0, | ||
| isOpaque = isOpaque, | ||
| bodyText = if (isOpaque) "<opaque — no body>" else m.fairBody.toString(), | ||
| ) |
There was a problem hiding this comment.
perMethodMetrics stores bodyText = m.fairBody.toString() for every non-opaque method and every phase. For large programs this can significantly increase memory usage and distort the benchmark (or even OOM). Consider making bodyText lazy/optional (only computed for methods that differ) or guarding it behind a flag so the default benchmark run only collects numeric metrics.
No description provided.