-
Notifications
You must be signed in to change notification settings - Fork 9
SyntaxGraph: Usability and performance tweaks #48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
(passing the NamedTuple variant is an error here, and can't be mutated anyway)
Missing utilities in line with existing ones (`freeze_attrs`, `attrnames`)
For `ensure_attributes` and `delete_attributes`, the output graph's
`.attributes` now have the same type (`Dict` or `NamedTuple`) as the input.
Add `delete_attributes!` defined only on dict-attrs to be consistent with
`ensure_attributes!`
Funny to realize it wasn't doing anything. If we want freezing, it should go
after lowering anyway.
Also clarify the `SyntaxTree(graph, syntaxnode)` signature.
It's good that we seemingly have better lowering runtime performance now. I don't think this will resolve the compile-time bottleneck entirely, but I think forcing the type of unfrozen attributes to julia> Base.infer_return_type(x->Dict(pairs(x)...), (Dict{Symbol,Any},))
Union{Dict{Any, Any}, Dict{Symbol, Any}}
julia> Base.infer_return_type(x->Dict{Symbol,Any}(pairs(x)...), (Dict{Symbol,Any},))
Dict{Symbol, Any}In practice, I think the actual type of attributes is |
Co-authored-by: Shuhei Kadowaki <[email protected]>
|
Excellent suggestion. One day I hope to have an abstract interpreter running in my head as you do. |
|
Wow, I didn't expect it to be that effective. That's great. In my opinion, I think we could be even more aggressive: mutable struct SyntaxGraph{Attrs <: Union{Dict{Symbol,Any},NamedTuple}}
edge_ranges::Vector{UnitRange{Int}}
edges::Vector{NodeId}
attributes::Attrs
endBut this is a somewhat restrictive change, so we might want to hear @c42f's opinion. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean we're now using Dict-based attributes throughout lowering?
If using Dict everywhere rather than NamedTuple is a runtime improvement, that's a huge performance debugging TODO, and honestly quite alarming. If the frozen attributes were working as expected, we should have things like ex.name_val type inferred as a String; overall the design goal was for SyntaxGraph to support type stable arbitrary attributes.
If you're sure Dict is an improvement we can merge this for now but it's clear that attribute storage performance needs revisiting.
Related - I'm currently working on some changes to make setattr! fully inferrable.
src/ast.jl
Outdated
| any other attribute. | ||
| """ | ||
| function copy_ast(ctx, ex) | ||
| function copy_ast(ctx, ex; copy_source=true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When would we ever want copy_source=false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I needed it for copying a tree into its own graph (I'll add a check_same_graph here). There should be a way of doing this without recursing on .source (which probably reaches everything in the graph), though I'm open to having it be a separate function if there's a reasonable name we can give it.
Tangential, but I'm now realizing we should fix the way we copy .source too. We're recursing on every child or source reference, where most nodes are referenced twice in this way, causing an explosion.
julia> st = jlower("function foo(;a=1); end")
SyntaxTree with attributes scope_type,lambda_bindings,name_val,syntax_flags,meta,scope_layer,mod,kind,value,var_id,id,is_toplevel_thunk,source,slots
<ast snip>
julia> length(st._graph.edge_ranges)
549
julia> @time JL.copy_ast(st._graph, st)
11.903939 seconds (65.51 M allocations: 3.234 GiB, 27.79% gc time)
<ast snip>
julia> length(st._graph.edge_ranges)
4761563
I think we get lucky by never calling copy_ast this late in the lowering pipeline where nontrivial .source chains show up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, copy_ast has only really been used for macro expansion.
I forgot the source handling didn't memoize anything (can't remember exactly but I think that code was part of some fairly early prototyping ... oops!)
It would be interesting if this helps, but surely it would be a sign that things aren't type inferred as expected and that should instead be fixed by a one or two well-placed type asserts? |
5312d93 to
cf0cd4f
Compare
Co-authored-by: Claire Foster <[email protected]>
cf0cd4f to
7e28e09
Compare
Yes, although the type is no longer converted to anything after each pass, so either attribute type will come out of lowering unchanged, and it would be easy to switch the default back.
I believe the runtime of the compiled code is unchanged (~1.95 seconds to run all tests). The numbers I gave were for the first time running tests, which show a large percentage of compile time.
Sounds good; I'll merge this given the speedup and the easiness of switching the default back. I agree we do need to go back and look into performance further. |
Sounds reasonable. Hopefully my attempts to make the |
Several small
SyntaxGraphtweaks. Some of these have been split out from #35 since they are tiny improvements that don't need to wait for data size experimentation.Changes worth discussing:
Do not coerce attributes to NamedTuple unnecessarily
It doesn't make sense to freeze Dict-attributes to NamedTuple after an
ensure_attributesordelete_attributescall. I assume this freezing was unintentional and not detected because the attribute storage type abstraction is implemented quite well.Note that this change increases precompile time and decreases test time. Below are outputs from
@time using JuliaLoweringand@time include("test/runtests.jl")Before:
After:
(Our precompile statements could probably use some tweaking still)
Delete ineffective freeze_attrs call
This post-parsing freeze call wasn't doing anything. The freezing we got was actually from
ensure_attributes. The output of lowering now has Dict-attributes, which I think are fine, but if we want to freeze post-lowering we can do that. Making the call effective has a bad effect on the test time above.Print more information when node does not have attribute (and no default is provided)
This is an unrecoverable error anyway, so print a bit more information about the node we tried to access. Before:
After: