Skip to content

Commit b11e388

Browse files
committed
proper termination, take 2
This PR is an alternative to #99. This is built on top of #116. With this PR, the following test cases now pass correctly: ```julia #Final block is not a `return`: Need to use #`config::SelectiveEvalRecurse` explicitly ex = quote x = 1 yy = 7 @Label loop x += 1 x < 5 || return yy @goto loop end frame = Frame(ModSelective, ex) src = frame.framecode.src edges = CodeEdges(ModSelective, src) config = SelectiveEvalRecurse() isrequired = lines_required(GlobalRef(ModSelective, :x), src, edges, config) selective_eval_fromstart!(config, frame, isrequired, true) @test ModSelective.x == 5 @test !isdefined(ModSelective, :yy) ``` The basic approach is overloading `JuliaInterpreter.step_expr!` and `LoweredCodeUtils.next_or_nothing!` for the new `SelectiveEvalController` type, as described below, to perform correct selective execution. When `SelectiveEvalController` is passed as the `recurse` argument of `selective_eval!`, the selective execution is adjusted as follows: - **Implicit return**: In Julia's IR representation (`CodeInfo`), the final block does not necessarily return and may `goto` another block. And if the `return` statement is not included in the slice in such cases, it is necessary to terminate `selective_eval!` when execution reaches such implicit return statements. `controller.implicit_returns` records the PCs of such return statements, and `selective_eval!` will return when reaching those statements. This is the core part of the fix for the test cases in #99. - **CFG short-cut**: When the successors of a conditional branch are inactive, and it is safe to move the program counter from the conditional branch to the nearest common post-dominator of those successors, this short-cut is taken. This short-cut is not merely an optimization but is actually essential for the correctness of the selective execution. This is because, in `CodeInfo`, even if we simply fall-through dead blocks (i.e., increment the program counter without executing the statements of those blocks), it does not necessarily lead to the nearest common post-dominator block. And now [`lines_required`](@ref) or [`lines_required!`](@ref) will update the `SelectiveEvalController` passed as their argument to be appropriate for the program slice generated. One thing to note is that currently, the `controller` is not be recursed. That said, in Revise, which is the main consumer of LCU, there is no need for recursive selective execution, and so `selective_eval!` does not provide a system for inter-procedural selective evaluation. Accordingly `SelectiveEvalController` does not recurse too, but this can be left as a future extension.
1 parent 0f44fa8 commit b11e388

File tree

6 files changed

+150
-40
lines changed

6 files changed

+150
-40
lines changed

docs/src/api.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -17,6 +17,7 @@ lines_required
1717
lines_required!
1818
selective_eval!
1919
selective_eval_fromstart!
20+
SelectiveEvalController
2021
```
2122

2223
## Interpreter

src/codeedges.jl

Lines changed: 125 additions & 38 deletions
Original file line numberDiff line numberDiff line change
@@ -182,6 +182,59 @@ function postprint_linelinks(io::IO, idx::Int, src::CodeInfo, cl::CodeLinks, bbc
182182
return nothing
183183
end
184184

185+
struct CFGShortCut
186+
from::Int # pc of GotoIfNot with inactive 𝑰𝑵𝑭𝑳 blocks
187+
to::Int # pc of the entry of the nearest common post-dominator of the GotoIfNot's successors
188+
end
189+
190+
"""
191+
info::SelectiveEvalInfo
192+
193+
When this object is passed as the `recurse` argument of `selective_eval!`,
194+
the selective execution is adjusted as follows:
195+
196+
- **Implicit return**: In Julia's IR representation (`CodeInfo`), the final block does not
197+
necessarily return and may `goto` another block. And if the `return` statement is not
198+
included in the slice in such cases, it is necessary to terminate `selective_eval!` when
199+
execution reaches such implicit return statements. `info.implicit_returns` records
200+
the PCs of such return statements, and `selective_eval!` will return when reaching those statements.
201+
202+
- **CFG short-cut**: When the successors of a conditional branch are inactive, and it is
203+
safe to move the program counter from the conditional branch to the nearest common
204+
post-dominator of those successors, this short-cut is taken.
205+
This short-cut is not merely an optimization but is actually essential for the correctness
206+
of the selective execution. This is because, in `CodeInfo`, even if we simply fall-through
207+
dead blocks (i.e., increment the program counter without executing the statements of those
208+
blocks), it does not necessarily lead to the nearest common post-dominator block.
209+
210+
These adjustments are necessary for performing selective execution correctly.
211+
[`lines_required`](@ref) or [`lines_required!`](@ref) will update the `SelectiveInterpreter`
212+
passed as an argument to be appropriate for the program slice generated.
213+
"""
214+
struct SelectiveEvalInfo
215+
implicit_returns::BitSet # pc where selective execution should terminate even if they're inactive
216+
shortcuts::Vector{CFGShortCut}
217+
end
218+
SelectiveEvalInfo() = SelectiveEvalInfo(BitSet(), CFGShortCut[])
219+
220+
"""
221+
struct SelectiveInterpreter{S<:Interpreter,T<:AbstractVector{Bool}} <: Interpreter
222+
inner::S
223+
isrequired::T
224+
end
225+
226+
An `JuliaInterpreter.Interpreter` that executes only the statements marked `true` in `isrequired`.
227+
Note that this inforeter does not recurse into callee frames.
228+
That is, when `JuliaInterpreter.finish!(info::SelectiveEvalInfo, frame, ...)` is
229+
performed, the `frame` will be executed selectively according to `info.isrequired`, but
230+
any callee frames within it will be executed by `info.inner::Interpreter`, not by `info`.
231+
"""
232+
struct SelectiveInterpreter{S<:Interpreter,T<:AbstractVector{Bool}} <: Interpreter
233+
inner::S
234+
isrequired::T
235+
info::SelectiveEvalInfo
236+
end
237+
185238
function namedkeys(cl::CodeLinks)
186239
ukeys = Set{GlobalRef}()
187240
for c in (cl.namepreds, cl.namesuccs, cl.nameassigns)
@@ -591,8 +644,8 @@ function terminal_preds!(s, j, edges, covered) # can't be an inner function bec
591644
end
592645

593646
"""
594-
isrequired = lines_required(obj::GlobalRef, src::CodeInfo, edges::CodeEdges)
595-
isrequired = lines_required(idx::Int, src::CodeInfo, edges::CodeEdges)
647+
isrequired = lines_required(obj::GlobalRef, src::CodeInfo, edges::CodeEdges, [info::SelectiveEvalInfo])
648+
isrequired = lines_required(idx::Int, src::CodeInfo, edges::CodeEdges, [info::SelectiveEvalInfo])
596649
597650
Determine which lines might need to be executed to evaluate `obj` or the statement indexed by `idx`.
598651
If `isrequired[i]` is `false`, the `i`th statement is *not* required.
@@ -601,21 +654,26 @@ will end up skipping a subset of such statements, perhaps while repeating others
601654
602655
See also [`lines_required!`](@ref) and [`selective_eval!`](@ref).
603656
"""
604-
function lines_required(obj::GlobalRef, src::CodeInfo, edges::CodeEdges; kwargs...)
657+
function lines_required(obj::GlobalRef, src::CodeInfo, edges::CodeEdges,
658+
info::SelectiveEvalInfo=SelectiveEvalInfo();
659+
kwargs...)
605660
isrequired = falses(length(edges.preds))
606661
objs = Set{GlobalRef}([obj])
607-
return lines_required!(isrequired, objs, src, edges; kwargs...)
662+
return lines_required!(isrequired, objs, src, edges, info; kwargs...)
608663
end
609664

610-
function lines_required(idx::Int, src::CodeInfo, edges::CodeEdges; kwargs...)
665+
function lines_required(idx::Int, src::CodeInfo, edges::CodeEdges,
666+
info::SelectiveEvalInfo=SelectiveEvalInfo();
667+
kwargs...)
611668
isrequired = falses(length(edges.preds))
612669
isrequired[idx] = true
613670
objs = Set{GlobalRef}()
614-
return lines_required!(isrequired, objs, src, edges; kwargs...)
671+
return lines_required!(isrequired, objs, src, edges, info; kwargs...)
615672
end
616673

617674
"""
618-
lines_required!(isrequired::AbstractVector{Bool}, src::CodeInfo, edges::CodeEdges;
675+
lines_required!(isrequired::AbstractVector{Bool}, src::CodeInfo, edges::CodeEdges,
676+
[info::SelectiveEvalInfo];
619677
norequire = ())
620678
621679
Like `lines_required`, but where `isrequired[idx]` has already been set to `true` for all statements
@@ -627,9 +685,11 @@ should _not_ be marked as a requirement.
627685
For example, use `norequire = LoweredCodeUtils.exclude_named_typedefs(src, edges)` if you're
628686
extracting method signatures and not evaluating new definitions.
629687
"""
630-
function lines_required!(isrequired::AbstractVector{Bool}, src::CodeInfo, edges::CodeEdges; kwargs...)
688+
function lines_required!(isrequired::AbstractVector{Bool}, src::CodeInfo, edges::CodeEdges,
689+
info::SelectiveEvalInfo=SelectiveEvalInfo();
690+
kwargs...)
631691
objs = Set{GlobalRef}()
632-
return lines_required!(isrequired, objs, src, edges; kwargs...)
692+
return lines_required!(isrequired, objs, src, edges, info; kwargs...)
633693
end
634694

635695
function exclude_named_typedefs(src::CodeInfo, edges::CodeEdges)
@@ -649,7 +709,9 @@ function exclude_named_typedefs(src::CodeInfo, edges::CodeEdges)
649709
return norequire
650710
end
651711

652-
function lines_required!(isrequired::AbstractVector{Bool}, objs, src::CodeInfo, edges::CodeEdges; norequire = ())
712+
function lines_required!(isrequired::AbstractVector{Bool}, objs, src::CodeInfo, edges::CodeEdges,
713+
info::SelectiveEvalInfo=SelectiveEvalInfo();
714+
norequire = ())
653715
# Mark any requested objects (their lines of assignment)
654716
objs = add_requests!(isrequired, objs, edges, norequire)
655717

@@ -684,7 +746,10 @@ function lines_required!(isrequired::AbstractVector{Bool}, objs, src::CodeInfo,
684746
end
685747

686748
# now mark the active goto nodes
687-
add_active_gotos!(isrequired, src, cfg, postdomtree)
749+
add_active_gotos!(isrequired, src, cfg, postdomtree, info)
750+
751+
# check if there are any implicit return blocks
752+
record_implcit_return!(info, isrequired, cfg)
688753

689754
return isrequired
690755
end
@@ -762,19 +827,19 @@ end
762827

763828
## Add control-flow
764829

765-
766830
# The goal of this function is to request concretization of the minimal necessary control
767831
# flow to evaluate statements whose concretization have already been requested.
768832
# The basic algorithm is based on what was proposed in [^Wei84]. If there is even one active
769833
# block in the blocks reachable from a conditional branch up to its successors' nearest
770834
# common post-dominator (referred to as 𝑰𝑵𝑭𝑳 in the paper), it is necessary to follow
771-
# that conditional branch and execute the code. Otherwise, execution can be short-circuited
835+
# that conditional branch and execute the code. Otherwise, execution can be short-cut
772836
# from the conditional branch to the nearest common post-dominator.
773837
#
774-
# COMBAK: It is important to note that in Julia's intermediate code representation (`CodeInfo`),
775-
# "short-circuiting" a specific code region is not a simple task. Simply ignoring the path
776-
# to the post-dominator does not guarantee fall-through to the post-dominator. Therefore,
777-
# a more careful implementation is required for this aspect.
838+
# It is important to note that in Julia's intermediate code representation (`CodeInfo`),
839+
# "short-cutting" a specific code region is not a simple task. Simply incrementing the
840+
# program counter without executing the statements of 𝑰𝑵𝑭𝑳 blocks does not guarantee that
841+
# the program counter fall-throughs to the post-dominator.
842+
# To handle such cases, `selective_eval!` needs to use `SelectiveInterpreter`.
778843
#
779844
# [Wei84]: M. Weiser, "Program Slicing," IEEE Transactions on Software Engineering, 10, pages 352-357, July 1984.
780845
function add_control_flow!(isrequired, src::CodeInfo, cfg::CFG, postdomtree)
@@ -849,8 +914,8 @@ function reachable_blocks(cfg, from_bb::Int, to_bb::Int)
849914
return visited
850915
end
851916

852-
function add_active_gotos!(isrequired, src::CodeInfo, cfg::CFG, postdomtree)
853-
dead_blocks = compute_dead_blocks(isrequired, src, cfg, postdomtree)
917+
function add_active_gotos!(isrequired, src::CodeInfo, cfg::CFG, postdomtree, info::SelectiveEvalInfo)
918+
dead_blocks = compute_dead_blocks!(isrequired, src, cfg, postdomtree, info)
854919
changed = false
855920
for bbidx = 1:length(cfg.blocks)
856921
if bbidx dead_blocks
@@ -868,7 +933,7 @@ function add_active_gotos!(isrequired, src::CodeInfo, cfg::CFG, postdomtree)
868933
end
869934

870935
# find dead blocks using the same approach as `add_control_flow!`, for the converged `isrequired`
871-
function compute_dead_blocks(isrequired, src::CodeInfo, cfg::CFG, postdomtree)
936+
function compute_dead_blocks!(isrequired, src::CodeInfo, cfg::CFG, postdomtree, info::SelectiveEvalInfo)
872937
dead_blocks = BitSet()
873938
for bbidx = 1:length(cfg.blocks)
874939
bb = cfg.blocks[bbidx]
@@ -889,13 +954,31 @@ function compute_dead_blocks(isrequired, src::CodeInfo, cfg::CFG, postdomtree)
889954
end
890955
if !is_𝑰𝑵𝑭𝑳_active
891956
union!(dead_blocks, delete!(𝑰𝑵𝑭𝑳, postdominator))
957+
if postdominator 0
958+
postdominator_bb = cfg.blocks[postdominator]
959+
postdominator_entryidx = postdominator_bb.stmts[begin]
960+
push!(info.shortcuts, CFGShortCut(termidx, postdominator_entryidx))
961+
end
892962
end
893963
end
894964
end
895965
end
896966
return dead_blocks
897967
end
898968

969+
function record_implcit_return!(info::SelectiveEvalInfo, isrequired, cfg::CFG)
970+
for bbidx = 1:length(cfg.blocks)
971+
bb = cfg.blocks[bbidx]
972+
if isempty(bb.succs)
973+
i = findfirst(idx::Int->!isrequired[idx], bb.stmts)
974+
if !isnothing(i)
975+
push!(info.implicit_returns, bb.stmts[i])
976+
end
977+
end
978+
end
979+
nothing
980+
end
981+
899982
# Do a traveral of "numbered" predecessors and find statement ranges and names of type definitions
900983
function find_typedefs(src::CodeInfo)
901984
typedef_blocks, typedef_names = UnitRange{Int}[], Symbol[]
@@ -1018,26 +1101,19 @@ function add_inplace!(isrequired, src, edges, norequire)
10181101
return changed
10191102
end
10201103

1021-
"""
1022-
struct SelectiveInterpreter{S<:Interpreter,T<:AbstractVector{Bool}} <: Interpreter
1023-
inner::S
1024-
isrequired::T
1025-
end
1026-
1027-
An `JuliaInterpreter.Interpreter` that executes only the statements marked `true` in `isrequired`.
1028-
Note that this interpreter does not recurse into callee frames.
1029-
That is, when `JuliaInterpreter.finish!(interp::SelectiveInterpreter, frame, ...)` is
1030-
performed, the `frame` will be executed selectively according to `interp.isrequired`, but
1031-
any callee frames within it will be executed by `interp.inner::Interpreter`, not by `interp`.
1032-
"""
1033-
struct SelectiveInterpreter{S<:Interpreter,T<:AbstractVector{Bool}} <: Interpreter
1034-
inner::S
1035-
isrequired::T
1036-
end
10371104
function JuliaInterpreter.step_expr!(interp::SelectiveInterpreter, frame::Frame, istoplevel::Bool)
10381105
pc = frame.pc
1106+
if pc in info.implicit_returns
1107+
return nothing
1108+
elseif pc_expr(frame) isa GotoIfNot
1109+
for shortcut in info.shortcuts
1110+
if shortcut.from == pc
1111+
return frame.pc = shortcut.to
1112+
end
1113+
end
1114+
end
10391115
if interp.isrequired[pc]
1040-
step_expr!(interp.inner, frame::Frame, istoplevel::Bool)
1116+
step_expr!(interp.inner, frame, istoplevel)
10411117
else
10421118
next_or_nothing!(interp, frame)
10431119
end
@@ -1068,12 +1144,23 @@ See [`selective_eval_fromstart!`](@ref) to have that performed automatically.
10681144
10691145
`isrequired` pertains only to `frame` itself, not any of its callees.
10701146
1147+
When `interp.info::SelectiveEvalInfo` is configured, the selective evaluation execution
1148+
becomes fully correct. Conversely, with the default `finish_and_return!`, selective
1149+
evaluation may not be necessarily correct for all possible Julia code (see
1150+
https://github.com/JuliaDebug/LoweredCodeUtils.jl/pull/99 for more details).
1151+
1152+
Ensure that the specified `interp` is properly synchronized with `isrequired`.
1153+
Additionally note that, at present, it is not possible to recurse the `interp`.
1154+
In other words, there is no system in place for inforocedural selective evaluation.
1155+
10711156
This will return either a `BreakpointRef`, the value obtained from the last executed statement
10721157
(if stored to `frame.framedata.ssavlues`), or `nothing`.
10731158
Typically, assignment to a variable binding does not result in an ssa store by JuliaInterpreter.
10741159
"""
1160+
selective_eval!(interp::SelectiveInterpreter, frame::Frame, istoplevel::Bool=false) =
1161+
JuliaInterpreter.finish_and_return!(interp, frame, istoplevel)
10751162
selective_eval!(interp::Interpreter, frame::Frame, isrequired::AbstractVector{Bool}, istoplevel::Bool=false) =
1076-
JuliaInterpreter.finish_and_return!(SelectiveInterpreter(interp, isrequired), frame, istoplevel)
1163+
selective_eval!(SelectiveInterpreter(interp, isrequired, SelectiveEvalInfo()), frame, istoplevel)
10771164
selective_eval!(args...) = selective_eval!(RecursiveInterpreter(), args...)
10781165

10791166
"""

src/packagedef.jl

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -24,7 +24,8 @@ const trackedheads = (:method,) # Revise uses this (for now), don't delete; a
2424
const structdecls = (:_structtype, :_abstracttype, :_primitivetype)
2525

2626
export signature, rename_framemethods!, methoddef!, methoddefs!, bodymethod
27-
export CodeEdges, lines_required, lines_required!, selective_eval!, selective_eval_fromstart!
27+
export CodeEdges, SelectiveEvalInfo, SelectiveInterpreter,
28+
lines_required, lines_required!, selective_eval!, selective_eval_fromstart!
2829

2930
include("utils.jl")
3031
include("signatures.jl")

src/signatures.jl

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -334,6 +334,7 @@ function _rename_framemethods!(interp::Interpreter, frame::Frame,
334334
set_to_running_name!(interp, replacements, frame, methodinfos, selfcalls[idx], calledby, callee, caller)
335335
catch err
336336
@warn "skipping callee $callee (called by $caller) due to $err"
337+
# showerror(stderr, err, stacktrace(catch_backtrace()))
337338
end
338339
end
339340
for sc in selfcalls

test/codeedges.jl

Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -217,6 +217,25 @@ module ModSelective end
217217
@test ModSelective.k11 == 0
218218
@test 3 <= ModSelective.s11 <= 15
219219

220+
# Final block is not a `return`: Need to use `controller::SelectiveEvalController` explicitly
221+
ex = quote
222+
x = 1
223+
yy = 7
224+
@label loop
225+
x += 1
226+
x < 5 || return yy
227+
@goto loop
228+
end
229+
frame = Frame(ModSelective, ex)
230+
src = frame.framecode.src
231+
edges = CodeEdges(ModSelective, src)
232+
info = SelectiveEvalInfo()
233+
isrequired = lines_required(GlobalRef(ModSelective, :x), src, edges, info)
234+
interp = LoweredCodeUtils.SelectiveInterpreter(LoweredCodeUtils.RecursiveInterpreter(), isrequired, info)
235+
selective_eval_fromstart!(interp, frame, true)
236+
@test ModSelective.x == 5
237+
@test !isdefined(ModSelective, :yy)
238+
220239
# Control-flow in an abstract type definition
221240
ex = :(abstract type StructParent{T, N} <: AbstractArray{T, N} end)
222241
frame = Frame(ModSelective, ex)

test/signatures.jl

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -414,9 +414,10 @@ bodymethtest5(x, y=Dict(1=>2)) = 5
414414
oldenv = Pkg.project().path
415415
try
416416
# we test with the old version of CBinding, let's do it in an isolated environment
417+
# so we don't cause package conflicts with everything else
417418
Pkg.activate(; temp=true, io=devnull)
418419

419-
@info "Adding CBinding to the environment for test purposes"
420+
@info "Adding CBinding v0.9.4 to the environment for test purposes"
420421
Pkg.add(; name="CBinding", version="0.9.4", io=devnull) # `@cstruct` isn't defined for v1.0 and above
421422

422423
m = Module()

0 commit comments

Comments
 (0)