Commit 2b0b2cd

New invalidations (#418)
This is a rewrite of the invalidations infrastructure based on Julia 1.12+. Julia 1.12 separates the data into two logging streams, one for immediate method insertion/deletion and the other for edge-validation during package loading. This allows a cleaner implementation of the parsing. I've also invested in a more extensive and systematic test suite, attempting to cover all code paths, and added tests for binding invalidation. Finally, this expands the documentation on developer topics.
1 parent d700e38 commit 2b0b2cd

File tree

32 files changed

+965
-337
lines changed

.github/workflows/ci.yml

Lines changed: 2 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -20,8 +20,7 @@ jobs:
2020
fail-fast: false
2121
matrix:
2222
version:
23-
- '1.10'
24-
- '1'
23+
# - 'min'
2524
- 'pre'
2625
- 'nightly'
2726
os:
@@ -44,7 +43,7 @@ jobs:
4443
${{ runner.os }}-test-${{ env.cache-name }}-
4544
${{ runner.os }}-test-
4645
${{ runner.os }}-
47-
- run: julia --project -e 'using Pkg; Pkg.develop([PackageSpec(path="SnoopCompileCore")])'
46+
# - run: julia --project -e 'using Pkg; Pkg.develop([PackageSpec(path="SnoopCompileCore")])'
4847
- uses: julia-actions/julia-buildpkg@latest
4948
- uses: julia-actions/julia-runtest@latest
5049
- run: julia --check-bounds=yes --project -e 'using Pkg; Pkg.test(; test_args=["cthulhu"], coverage=true)'

Project.toml

Lines changed: 7 additions & 4 deletions
@@ -1,7 +1,7 @@
11
name = "SnoopCompile"
22
uuid = "aa65fe97-06da-5843-b5b1-d5d13cad87d2"
3+
version = "3.2.0"
34
author = ["Tim Holy <[email protected]>", "Shuhei Kadowaki <[email protected]>"]
4-
version = "3.1.3"
55

66
[deps]
77
AbstractTrees = "1520ce14-60c1-5f80-bbc7-55ef81b5835c"
@@ -20,6 +20,9 @@ Cthulhu = "f68482b8-f384-11e8-15f7-abe071a5a75f"
2020
JET = "c3a54625-cd67-489e-a8e7-0a5a0ff4e31b"
2121
PyPlot = "d330b81b-6aea-500a-939a-2ce795aea3ee"
2222

23+
[sources]
24+
SnoopCompileCore = {path = "SnoopCompileCore"}
25+
2326
[extensions]
2427
CthulhuExt = "Cthulhu"
2528
JETExt = ["JET", "Cthulhu"]
@@ -32,7 +35,7 @@ Cthulhu = "2"
3235
FlameGraphs = "1"
3336
InteractiveUtils = "1"
3437
JET = "0.9"
35-
MethodAnalysis = "0.4"
38+
MethodAnalysis = "1"
3639
OrderedCollections = "1"
3740
Pkg = "1"
3841
PrettyTables = "2, 3"
@@ -42,10 +45,10 @@ PyPlot = "2"
4245
REPL = "1"
4346
Random = "1"
4447
Serialization = "1"
45-
SnoopCompileCore = "3"
48+
SnoopCompileCore = "3.1"
4649
Test = "1"
4750
YAML = "0.4"
48-
julia = "1.10"
51+
julia = "1.12"
4952

5053
[extras]
5154
Cthulhu = "f68482b8-f384-11e8-15f7-abe071a5a75f"

SnoopCompileCore/Project.toml

Lines changed: 2 additions & 2 deletions
@@ -1,10 +1,10 @@
11
name = "SnoopCompileCore"
22
uuid = "e2b509da-e806-4183-be48-004708413034"
33
author = ["Tim Holy <[email protected]>"]
4-
version = "3.0.0"
4+
version = "3.1.0"
55

66
[deps]
77
Serialization = "9e88b42a-f829-5b0c-bbe9-9e923198166b"
88

99
[compat]
10-
julia = "1"
10+
julia = "1.12"

SnoopCompileCore/src/snoop_invalidations.jl

Lines changed: 18 additions & 6 deletions
@@ -1,5 +1,10 @@
11
export @snoop_invalidations
22

3+
struct InvalidationLists
4+
logedges::Vector{Any}
5+
logmeths::Vector{Any}
6+
end
7+
38
"""
49
invs = @snoop_invalidations expr
510
@@ -26,12 +31,19 @@ Method insertion results in the sequence
2631
The authoritative reference is Julia's own `src/gf.c` file.
2732
"""
2833
macro snoop_invalidations(expr)
29-
quote
30-
local invs = ccall(:jl_debug_method_invalidation, Any, (Cint,), 1)
31-
Expr(:tryfinally,
32-
$(esc(expr)),
34+
# It's a little unclear why this is better than a quoted try/finally, but it seems to be
35+
# Guessing it's a lack of a block around `expr`
36+
exoff = Expr(:tryfinally,
37+
esc(expr),
38+
quote
39+
Base.StaticData.debug_method_invalidation(false)
3340
ccall(:jl_debug_method_invalidation, Any, (Cint,), 0)
34-
)
35-
invs
41+
end
42+
)
43+
return quote
44+
local logedges = Base.StaticData.debug_method_invalidation(true)
45+
local logmeths = ccall(:jl_debug_method_invalidation, Any, (Cint,), 1)
46+
$exoff
47+
$InvalidationLists(logedges, logmeths)
3648
end
3749
end

docs/make.jl

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ makedocs(
1515
pages = ["index.md",
1616
"Basic tutorials" => ["tutorials/invalidations.md", "tutorials/snoop_inference.md", "tutorials/snoop_llvm.md", "tutorials/pgdsgui.md", "tutorials/jet.md"],
1717
"Advanced tutorials" => ["tutorials/snoop_inference_analysis.md", "tutorials/snoop_inference_parcel.md"],
18-
"Explanations" => ["explanations/tools.md", "explanations/gotchas.md", "explanations/fixing_inference.md"],
18+
"Explanations" => ["explanations/tools.md", "explanations/gotchas.md", "explanations/fixing_inference.md", "explanations/invalidation_classes.md", "explanations/devs.md"],
1919
"reference.md",
2020
]
2121
)

docs/src/explanations/devs.md

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
1+
# Information for SnoopCompile developers
2+
3+
## Invalidations
4+
5+
### Capturing invalidation logs
6+
7+
Julia itself handles (in)validation when you define (or delete) methods and load packages. Julia's internal machinery provides the option of recording these invalidation decisions to a log, which is just a `Vector{Any}`. Currently (as of Julia 1.12) there are two independent logs:
8+
9+
- for method insertion and deletion (i.e., new methods invalidating old code), logging is handled in Julia's `src/gf.c`. You enable it with `logmeths = ccall(:jl_debug_method_invalidation, Any, (Cint,), true)` and pass a final argument of `false` to turn it off.
10+
- for validating precompiled code during package loading (i.e., "new" code being invalidated by old methods), logging is handled in Julia's `base/staticdata.jl`. You enable it with `logedges = Base.StaticData.debug_method_invalidation(true)` and pass `false` to turn it off.
11+
12+
In both cases, the log will initially be empty, but subsequent activity (defining or deleting methods, or loading packages) may add entries.
13+
14+
SnoopCompileCore's `@snoop_invalidations` just turns on these logging streams, executes the user's block of code, turns off logging, and returns the captured log streams.
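
The capture sequence described above can be sketched by hand, without the macro. This is only an illustration under stated assumptions: it needs Julia 1.12 or later, it uses just the two logging calls quoted above, and the functions `f` and `g` are a hypothetical workload that is not part of SnoopCompile.

```julia
# Requires Julia 1.12+, where both logging entry points used below are available.

# Hypothetical workload: compile `g(::Int)` against the only method of `f`.
f(::Real) = 1
g(x) = f(x)
g(1)

# Turn on both logging streams; each call returns the (initially empty) log.
logedges = Base.StaticData.debug_method_invalidation(true)
logmeths = ccall(:jl_debug_method_invalidation, Any, (Cint,), 1)
try
    # Adding a more specific method changes how the call in `g(::Int)` dispatches,
    # so the previously compiled code may be invalidated and logged.
    @eval f(::Int) = 2
finally
    # Always turn logging back off, even if the workload throws.
    Base.StaticData.debug_method_invalidation(false)
    ccall(:jl_debug_method_invalidation, Any, (Cint,), 0)
end

println("edge-log entries:   ", length(logedges))
println("method-log entries: ", length(logmeths))
```

The `try`/`finally` mirrors the macro's intent: logging is disabled no matter how the user's code exits.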
15+
16+
### Interpreting invalidation logs
17+
18+
The definitive source for interpreting these two logging streams is Julia's own code; the documentation below may be outdated by future changes in Julia. (Such changes have happened repeatedly over the course of Julia's development.) If you have even a shred of doubt about whether any of this is (still) correct, check Julia's code.
19+
20+
For both logging streams, a single decision typically results in appending multiple entries to the log. These decisions come with a string (the *tag*) documenting the origin of each entry. In general, each distinct mechanism by which invalidations can occur should have its own unique tag. Often these correspond to specific lines in the source code.
21+
22+
#### method logs
23+
24+
Let `trigger::Method` indicate an added or deleted method for function `f`. If defining/deleting this method would change how one or more `caller::MethodInstance`s of the corresponding function would dispatch, those `caller`s must be invalidated. Such events can result in a cascade of invalidations of code that directly or indirectly called `trigger` or less-specific methods of the same function. The order in which these invalidations appear in the log stream is as follows:
25+
26+
1. Backedges of `callee` (described in the next item), encoded as a tree where links are specified as `(caller::MethodInstance, depth::Int32)` pairs.
27+
`depth=1` typically corresponds to an inferrable caller. `depth=0` corresponds to a potentially-missing callee (at the time of compilation), and will be followed by `calleesig::DataType`. (If the called function had potentially-applicable methods, `calleesig` will not be a subtype of any of their signatures.) The root of the tree is `callee` itself, for which no `(caller, depth)` entry is written; sequential increases in `depth` indicate a traversal through branches. If `depth` decreases, this indicates the start of a new branch from the parent with depth `depth-1`.
28+
2. `(callee::MethodInstance, tag)` pairs that were directly affected by the change in dispatch.
29+
3. Possibly,
30+
31+
After all such `callee` branches are complete, the `(trigger::Method, tag)` pair for the event that initiated the entire set of invalidations is logged.
32+
33+
The interpretation of the tags is as follows:
34+
35+
- `"jl_method_table_disable"`: the `trigger` with the same tag was deleted (`Base.delete_method`)
36+
- `"jl_method_table_insert"`: the `trigger` with the same tag was added (`function f(...) end`)
37+
- `(callee::MethodInstance, "invalidate_mt_cache")`: a method-table cache for runtime dispatch was invalidated by a method insertion. At sites of runtime dispatch, Julia will maintain local method tables of the most common call targets to make dispatch more efficient. Since runtime dispatch involves real-time method lookup anyway, this form of invalidation is not serious, and a detailed listing is suppressed by SnoopCompile's printing behavior. These are always followed (eventually) by a `(trigger::Method, "jl_method_table_insert")` pair.
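
As a rough illustration of how the tags can be used, the sketch below tallies the string entries in a log. `tag_counts` is a hypothetical helper, not part of SnoopCompile, and `fake_log` is a synthetic stand-in in which symbols take the place of real `Method` and `MethodInstance` objects.

```julia
# Count how often each string tag occurs in an invalidation log (a `Vector{Any}`).
# Hypothetical helper for illustration; SnoopCompile's own parsing is more involved.
function tag_counts(log::AbstractVector)
    counts = Dict{String,Int}()
    for item in log
        item isa AbstractString || continue   # skip non-tag items (methods, depths, ...)
        counts[item] = get(counts, item, 0) + 1
    end
    return counts
end

# Synthetic log: an `invalidate_mt_cache` pair followed by the insertion that caused it.
fake_log = Any[:callee_mi, "invalidate_mt_cache",
               :trigger_method, "jl_method_table_insert"]
counts = tag_counts(fake_log)
```

A tally like this is only a first look; it ignores the pairing and ordering rules described above.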
38+
39+
#### edge logs
40+
41+
Since edge logs are populated during package loading, we'll use `PkgDep` to indicate a package that is a dependency for `PkgUser`. (`PkgUser`'s `Project.toml` might list `PkgDep` in its `[deps]` section, or it might be an indirect dependency.)
42+
Invalidation events result in the insertion of 3 or 4 items in `logedges`. The tag is always the second item. They take one of the following forms:
43+
44+
- `(def::Method, "method_globalref", codeinst::CodeInstance, nothing)`: method `def` in `PkgUser` references `PkgDep.SomeObject` (which might be `const` data, a type, etc.), but the binding for `SomeObject` has been modified since `PkgUser` was compiled. `codeinst`, which holds a compiled specialization of `def`, needs to be recompiled.
45+
- `(edge::Union{MethodInstance,DataType,Core.Binding}, "insert_backedges_callee", codeinst::CodeInstance, matches::Union{Vector{Any},Nothing})`: `edge` was selected as a dispatch target (a "callee") of `codeinst`, but new method(s) listed in `matches` now supersede it in dispatch specificity. There are three sub-cases, distinguished by the type of `edge` (four if the two `DataType` variants are counted separately):
46+
* `edge::MethodInstance` indicates a known target at the time of compilation
47+
* `edge::DataType` represents either
48+
+ `Tuple{typeof(f), argtypes...}` for a poorly-inferred or `invoke`d call for which the target selected at compilation time is no longer valid (`matches` will be `nothing`)
49+
+ a signature of a known function for which no appropriate method had yet been defined at the time of compilation. `matches` lists methods that now apply.
50+
* `edge::Core.Binding` indicates a target that was unknown at the time of compilation, and `matches` will be `nothing`.
51+
- `(caller::CodeInstance, "verify_methods", callee::CodeInstance)`: `callee` is an invalidated dependency of `caller`. These encode invalidations that cascade from the proximal source.
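
Because the tag is always the second item and fixes the entry length, an edge log can be split into entries with a simple walk. The following is a minimal sketch assuming only the three forms listed above; `EDGE_ENTRY_LENGTH` and `split_edge_log` are hypothetical names, and the sample log uses placeholder symbols rather than real `Method` or `CodeInstance` objects.

```julia
# Number of log items occupied by each kind of entry, per the forms listed above.
const EDGE_ENTRY_LENGTH = Dict(
    "method_globalref"        => 4,
    "insert_backedges_callee" => 4,
    "verify_methods"          => 3,
)

# Split a flat edge log into one vector per entry (hypothetical helper).
function split_edge_log(log::AbstractVector)
    entries = Vector{Vector{Any}}()
    i = firstindex(log)
    while i <= lastindex(log)
        i + 1 <= lastindex(log) || error("truncated entry at index $i")
        tag = log[i + 1]                      # the tag is always the second item
        n = get(EDGE_ENTRY_LENGTH, tag, nothing)
        n === nothing && error("unrecognized tag $(repr(tag)) at index $(i + 1)")
        i + n - 1 <= lastindex(log) || error("truncated entry at index $i")
        push!(entries, log[i:(i + n - 1)])
        i += n
    end
    return entries
end

# Synthetic log with one entry of each form.
fake_edges = Any[:def, "method_globalref", :codeinst1, nothing,
                 :edge, "insert_backedges_callee", :codeinst2, Any[:newmethod],
                 :caller, "verify_methods", :callee]
entries = split_edge_log(fake_edges)
```

Erroring on an unknown tag is a deliberate choice here: if a future Julia version adds or changes tags, a silent misparse would be harder to notice than a failure.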

docs/src/explanations/invalidation_classes.md

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
1+
# Invalidation classes
2+
3+
[`invalidation_trees`](@ref) returns two broad classes of invalidated targets in `backedges` and `mt_backedges`.
4+
To understand the difference, let's introduce a new term: we say that a callee method *covers* a call if it can accept all possible types used in that call. Consider this example:
5+
6+
```julia
7+
f(x::Integer) = false
8+
g1(x::Signed) = f(x) # `f(::Integer)` always covers this call
9+
g2(x::Number) = f(x) # `f(::Integer)` may not cover this call
10+
```
11+
12+
`g1` will only ever be called for `Signed` inputs, and because `Signed <: Integer`, the method of `f` fully covers the call in `g1`. In contrast, `g2` can be called for any `Number` type, and since `Number` is not a subtype of `Integer`, `f` may not cover the entire call.
13+
14+
With this understanding, the difference is straightforward: `backedges`-class invalidations arise when there is exactly one applicable method and it fully covers the call. `mt_backedges`-class invalidations are for anything else. In such cases, Julia may need to scan the method table (the `mt` in `mt_backedges`) of the function in order to determine which method, if any, might be applicable.
15+
16+
This helps explain why `mt_backedges` invalidations are more likely to arise from poor inference: poor inference "widens" the argument types and thus makes it less likely that a call is covered by exactly one method. It's still possible to get a `backedges`-class invalidation from poor inference:
17+
18+
```julia
19+
g3(x::Ref{Any}) = f(x[]::Signed)
20+
```
21+
22+
Here the type assertion `x[]::Signed` guarantees that our method of `f` covers the call, even though we can't predict precisely what type `x[]` will have. Thus if you invalidate the compiled code of `g3` by defining a new method `f(x::Signed)`, you'll get a `backedges`-class invalidation.
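
The notion of *covering* can be checked directly with a subtype test between a call's signature and a method's signature. This is a small sketch of that idea, not how Julia or SnoopCompile classify invalidations internally; `covers` is a hypothetical helper name.

```julia
f(x::Integer) = false
m = only(methods(f))   # the single method `f(::Integer)`

# A method covers a call if every possible argument-type combination of the call
# is accepted by the method, i.e. the call signature is a subtype of the method signature.
covers(callsig::Type, meth::Method) = callsig <: meth.sig

covers(Tuple{typeof(f), Signed}, m)   # the call in `g1`: covered
covers(Tuple{typeof(f), Number}, m)   # the call in `g2`: not fully covered
```

The first call returns `true` because `Signed <: Integer`; the second returns `false` because `Number` includes types such as `Float64` that `f(::Integer)` cannot accept.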

docs/src/tutorials/invalidations.md

Lines changed: 4 additions & 2 deletions
@@ -163,7 +163,7 @@ sig, victim = tree.mt_backedges[end];
163163
```
164164

165165
!!! note
166-
`mt_backedges` stands for "MethodTable backedges." In other cases you may see a second type of invalidation, just called `backedges`. With these, there is no `sig`, and so you'll use just `victim = tree.backedges[i]`.
166+
`mt_backedges` stands for "MethodTable backedges." In other cases you may see a second kind of invalidation, just called `backedges`. With these, there is no `sig`, and so you'll use just `victim = tree.backedges[i]`. For those curious about the reasons for these two kinds of invalidation, see [Invalidation classes](@ref).
167167

168168
First let's look at the problematic method `sig`nature:
169169

@@ -223,7 +223,9 @@ The first and simplest technique is to ensure that the full range of possibiltie
223223

224224
#### Method 2: improve inferability
225225

226-
The second way to prevent invalidations is to improve the inferability of the victim(s). If `Int` and `Char` really are the only possible kinds of cards, then in `playgame` it would be better to declare
226+
The second way to prevent invalidations is to improve the inferability of the victim(s). This approach is often applicable to `mt_backedges` invalidations, but it can sometimes fix `backedges` invalidations too. [Invalidation classes](@ref) explains the differences in detail and why inference failures tend to be associated with `mt_backedges` invalidations.
227+
228+
In our blackjack example, if `Int` and `Char` really are the only possible kinds of cards, then in `playgame` it would be better to declare
227229

228230
```julia
229231
myhand = Union{Int,Char}[]

src/SnoopCompile.jl

Lines changed: 4 additions & 3 deletions
@@ -33,9 +33,10 @@ you should prefer them above the more limited tools available on earlier version
3333
module SnoopCompile
3434

3535
using SnoopCompileCore
36+
using SnoopCompileCore: InvalidationLists
3637
# More exports are defined below in the conditional loading sections
3738

38-
using Core: MethodInstance, CodeInfo
39+
using Core: MethodInstance, CodeInstance, Binding, CodeInfo
3940
using InteractiveUtils
4041
using Serialization
4142
using Printf
@@ -82,8 +83,8 @@ export read_snoop_llvm
8283
include("invalidations.jl")
8384
export uinvalidated, invalidation_trees, filtermod, findcaller
8485

85-
include("invalidation_and_inference.jl")
86-
export precompile_blockers
86+
# include("invalidation_and_inference.jl")
87+
# export precompile_blockers
8788

8889
# Write
8990
include("write.jl")
