Conversation

@mlechu
Collaborator

@mlechu mlechu commented Apr 28, 2025

There's a significant delay when calling lower for the first time. This PR is
an attempt to improve performance without interfering too much with @c42f's
design choices.

Info

The delay is mostly JIT time, with some functions (expand_function_def,
compile_try) taking ~1.5 seconds each to compile! It has become somewhat worse
since the Julia version used before #10, but was present even then.

Julia currently doesn't handle large immutable structs very well, and
graph.attributes (usually) being a named tuple triggers this pathology. Since
we already allow it to be a Dict, this change experiments with making it
always a Dict. Note that this removes our ability to freeze attributes.
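For illustration, a hypothetical sketch contrasting the two attribute
representations discussed above (`Graph` and its fields are simplified stand-ins,
not JuliaLowering's actual definitions). With a NamedTuple, every attribute is
part of the graph's type, producing a large immutable struct; with a
Dict{Symbol,Any}, the type stays small but attribute access is no longer
inferable.

```julia
# Simplified stand-in for the graph type under discussion (hypothetical).
struct Graph{Attrs}
    attributes::Attrs
end

# Strongly typed mode: the attribute set is baked into the type.
typed = Graph((kind = Dict{Int,Symbol}(1 => :call),))

# Dynamic mode: one concrete type, attributes can be added freely.
dynamic = Graph(Dict{Symbol,Any}(:kind => Dict{Int,Symbol}(1 => :call)))

typed.attributes.kind[1]      # field access, fully type-inferred
dynamic.attributes[:kind][1]  # Dict lookup, inferred as Any
```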

Some measurements

Using julia commit 8780aa93ac84198bc7ebcbdd784f26041c99dea5 and changes from
PR #10

lower(1)

Before:

julia> @time st = JuliaLowering.lower(Main, JuliaSyntax.parse(SyntaxTree, "1"))
 34.541930 seconds (105.04 M allocations: 5.177 GiB, 2.72% gc time, 99.99% compilation time)
SyntaxTree with attributes scope_type,lambda_bindings,name_val,syntax_flags,meta,scope_layer,mod,kind,value,var_id,id,is_toplevel_thunk,source,slots
[code_info]                                                  │ is_toplevel_thunk=true,slots
  [block]                                                    │
    [return]                                                 │
      1                                  :: Integer

After:

julia> @time st = JuliaLowering.lower(Main, JuliaSyntax.parse(SyntaxTree, "1"))
 14.536475 seconds (71.05 M allocations: 3.527 GiB, 5.25% gc time, 99.99% compilation time)
SyntaxTree with attributes scope_type,lambda_bindings,name_val,syntax_flags,meta,scope_layer,mod,kind,value,var_id,id,is_toplevel_thunk,source,slots
[code_info]                                                  │ is_toplevel_thunk=true,slots
  [block]                                                    │
    [return]                                                 │
      1                                  :: Integer

Pkg.test()

(IR tests commented out)
Before:

$ julia --project=. --startup-file=no -e 'using Pkg; @time Pkg.test()'
Test Summary:    | Pass  Total     Time
JuliaLowering.jl |  402    402  2m13.2s
     Testing JuliaLowering tests passed
135.007196 seconds (4.82 M allocations: 356.619 MiB, 0.11% gc time, 0.50% compilation time: <1% of which was recompilation)

After:

$ julia --project=. --startup-file=no -e 'using Pkg; @time Pkg.test()'
Test Summary:    | Pass  Total   Time
JuliaLowering.jl |  402    402  28.0s
     Testing JuliaLowering tests passed
 29.890489 seconds (4.82 M allocations: 356.358 MiB, 0.52% gc time, 2.30% compilation time: <1% of which was recompilation)

@c42f
Owner

c42f commented Apr 29, 2025

The initial latency due to type inference is pretty painful, I agree.

Did you benchmark the effects of this after the initial compilation latency? I would assume that this only hurts us once, and if we add a precompile workload the initial latency will go away. In that case the tighter results in type inference should give us better runtime when accessing attributes. Unfortunately, we will always pay the precompilation latency during development but other than that it's a win to have strongly typed attributes.

The "right way" to do this is probably just make the Attrs type a Dict{Symbol,Any}, as is already done by the default SyntaxGraph constructor. Nothing much else should need to change in the whole codebase other than some of the entry points where we freeze/unfreeze the attributes. If we can find a generic way to express "maybe freeze attributes if in strongly typed mode" we can have both options available in the same code.
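One possible generic spelling of "maybe freeze attributes if in strongly typed
mode" (the helper names below are hypothetical, not JuliaLowering API):
freezing turns a Dict into a NamedTuple so the attribute set enters the type,
and each operation is a no-op on the representation it already matches.

```julia
# Hypothetical freeze/unfreeze helpers bridging the two attribute modes.
freeze(attrs::Dict{Symbol,Any}) = NamedTuple(attrs)  # Dict -> strongly typed
freeze(attrs::NamedTuple) = attrs                    # already frozen: no-op
unfreeze(attrs::NamedTuple) = Dict{Symbol,Any}(pairs(attrs))
unfreeze(attrs::Dict{Symbol,Any}) = attrs            # already dynamic: no-op

frozen = freeze(Dict{Symbol,Any}(:kind => :call, :value => 1))
frozen.kind  # field access on the frozen (strongly typed) form
```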

@mlechu
Collaborator Author

mlechu commented Apr 29, 2025

The "right way" to do this is probably just make the Attrs type a Dict{Symbol,Any}

This should be what I've changed so far (it's a pretty tiny change), but feel free to make suggestions or commits to this branch as you see fit. I'm also open to making a change that's reconsidered later (once development is less active, or once julia doesn't have this pathological case)

Did you benchmark the effects of this after the initial compilation latency?

Good idea. I didn't find any significant difference in runtime (though maybe I should find a bigger workload).

Below are times from running all tests, including failing IR tests, via @time try; include("test/runtests.jl"); catch x; end, repeated ~20 times and taking the shortest run (the averages are similar). Unrelated, but this testing procedure eventually hits a non-deterministic segfault in jl_make_opaque_closure_method after enough iterations for me, which I should probably chase down in rr...

Before this change:

  1.500735 seconds (4.55 M allocations: 348.962 MiB, 1.77% gc time, 46.28% compilation time)

After this change:

  1.486018 seconds (6.31 M allocations: 310.264 MiB, 0.90% gc time, 47.05% compilation time)

TODO: Global attributes!
"""
struct SyntaxGraph{Attrs}
struct SyntaxGraph
Owner


What I had in mind here is to not remove the Attrs type parameter, but simply set it to Dict during construction.

This is what the existing design intends: we can have either a fully typed version with NamedTuple, or a flexible dynamic version with Dict.

That way we get both options within the same code base.
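A minimal sketch of this suggestion (greatly simplified relative to the real
src/syntax_graph.jl): keep the Attrs type parameter, but choose
Dict{Symbol,Any} in the default constructor, so both modes share one code base.

```julia
# Simplified sketch; not the actual JuliaLowering definition.
struct SyntaxGraph{Attrs}
    attributes::Attrs
end

SyntaxGraph() = SyntaxGraph(Dict{Symbol,Any}())  # flexible dynamic default

g_dyn = SyntaxGraph()
g_typed = SyntaxGraph((kind = :call,))           # fully typed NamedTuple variant

g_dyn.attributes[:kind] = :call                  # only legal in Dict mode
```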

Collaborator Author

@mlechu mlechu Apr 30, 2025


I see; pushed a change doing this. Unfortunately I couldn't achieve the same performance as in the original PR message: lower(1) is the same as before, and Pkg.test() takes about 60 seconds.

Collaborator Author


Alternative: marking structs mutable in syntax_graph.jl, which should change some stack allocations to heap allocations. This gives me 18 seconds and 68 seconds. Updated in case you'd like to try it.
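A hypothetical before/after of this alternative (stand-in names, not the real
structs in syntax_graph.jl): marking a struct mutable means instances live on
the heap and are passed around as a single pointer, instead of large immutable
values being copied on the stack.

```julia
# Stand-ins illustrating the change; field layout is invented for the example.
struct ImmutableNode          # before: value type, copied on the stack
    attrs::Dict{Symbol,Any}
    id::Int
end

mutable struct MutableNode    # after: heap-allocated, pointer-sized handle
    attrs::Dict{Symbol,Any}
    id::Int
end
```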

@c42f
Owner

c42f commented Apr 30, 2025

If the benchmarks suggest the runtime is the same, that seems like proof that the NamedTuple is not having the desired effect here.

There's one place, cloning the attributes of a SyntaxTree in mapchildren, which is not type-inferable in the current design, and that should certainly hurt us right now. I'm a little surprised that we still lose regardless. Clearly there's still some work to do here to realise the theoretical type-stability benefits of the current design.

@mlechu
Collaborator Author

mlechu commented Apr 30, 2025

For clarity, the long initial hang is not in type inference; it's LLVM taking forever to deal with what julia gives it (large structs on the stack are the likely culprit, thanks @gbaraldi for investigating).

@c42f
Owner

c42f commented Jul 21, 2025

It's so interesting that just making these mutable fixes this - thanks for investigating further!

@c42f c42f merged commit e24d0e6 into c42f:main Jul 21, 2025