Once the type-stability issues are taken care of, one more optimization worth working on is how best to access fields in the `VarInfo` struct. The main set of optimizations I can think of assume that:
- The model is static, i.e. has a fixed number of parameters, and
- The model has a deterministic access pattern for the random variables, so any 2 variables are always evaluated in the same order.
This unlocks a whole set of possible optimizations for most if not all samplers, definitely HMC:
- We can use `StaticArrays.MArray` for the fields of `TypedVarInfo` since it won't need to grow in size.
- We can avoid the need for the `idcs` dictionary, which looks up the variable by its `VarName`. Instead, we can sort the `ranges` and `vals` vectors by the access order of the parameters and only keep a counter for each symbol that gives us the index in `ranges`, which in turn gives us the index range in `vals`. This counter can be incremented inline in the model when expanding the `@model` macro. This has 2 effects. Firstly, we avoid the overhead of looking up the index in `idcs`, and secondly we make sure the random variables are accessed in a contiguous manner in memory. This will be faster since `TypedVarInfo` is typically heap-allocated.
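To make the second point concrete, here is a rough sketch of the two access schemes side by side. The types `DictLookupVI` and `OrderedVI` and the functions `getval` and `nextval!` are hypothetical, simplified stand-ins for the metadata of a single symbol. They are not the actual `TypedVarInfo` layout, and in the real proposal the counter increment would be emitted by the `@model` macro rather than called by hand.

```julia
# Hypothetical stand-in for the current layout: names are resolved through a Dict.
struct DictLookupVI
    idcs::Dict{Symbol,Int}          # variable name -> index into `ranges`
    ranges::Vector{UnitRange{Int}}  # index range of each variable in `vals`
    vals::Vector{Float64}
end

# Current scheme: every access pays for a hash lookup in `idcs`.
getval(vi::DictLookupVI, name::Symbol) = view(vi.vals, vi.ranges[vi.idcs[name]])

# Hypothetical stand-in for the proposed layout: no Dict, only a counter.
mutable struct OrderedVI
    ranges::Vector{UnitRange{Int}}  # sorted by access order
    vals::Vector{Float64}           # laid out in access order
    counter::Int                    # number of variables read so far
end

# Proposed scheme: bump the counter instead of hashing the name.
function nextval!(vi::OrderedVI)
    vi.counter += 1
    return view(vi.vals, vi.ranges[vi.counter])
end

# Reset the counter before each new evaluation of the model.
reset!(vi::OrderedVI) = (vi.counter = 0; vi)

# A toy model with a scalar `s` followed by a 2-vector `m`, always read in that order.
ranges = [1:1, 2:3]
vals = [0.5, 1.0, 2.0]
dict_vi = DictLookupVI(Dict(:s => 1, :m => 2), ranges, vals)
ordered_vi = OrderedVI(ranges, vals, 0)

s = nextval!(ordered_vi)  # first access: the range for `s`
m = nextval!(ordered_vi)  # second access: the range for `m`
```

Both schemes return the same views into `vals`; the ordered version just relies on the static, deterministic access assumptions above instead of on `idcs`.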
With the 2 optimizations in place and a whole bunch of inlining, it may be possible for the Julia compiler to avoid allocating any memory on the heap for `TypedVarInfo`! The Julia compiler is good at eliding allocations when mutable objects only live within a single function body. This means if we make sure all the functions which use `vi::TypedVarInfo` are properly inlined, the compiler may be able to elide these allocations entirely. Unfortunately, this is not true for `Base.Array` since these are implemented in C so they have special semantics.
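As a small illustration of the kind of elision meant here, the sketch below builds a mutable object that never escapes the function creating it. `Acc`, `add!`, and `total` are made-up names for this example and have nothing to do with the real `TypedVarInfo` API. Whether the allocation is actually removed depends on the Julia version and on inlining decisions, so the measured number is something to check rather than a guarantee.

```julia
# Hypothetical small mutable container, standing in for a mutable `vi`.
mutable struct Acc
    logp::Float64
    count::Int
end

# Marked @inline so the compiler can see that `a` does not escape.
@inline function add!(a::Acc, x::Float64)
    a.logp += x
    a.count += 1
    return a
end

function total(xs::NTuple{N,Float64}) where {N}
    a = Acc(0.0, 0)  # only lives inside `total`
    for x in xs
        add!(a, x)
    end
    return a.logp    # only a plain Float64 leaves the function
end

total((1.0, 2.0, 3.0))                       # warm up so compilation is not measured
allocs = @allocated total((1.0, 2.0, 3.0))   # bytes allocated by the call
println("bytes allocated: ", allocs)
```

If `allocs` comes out as zero, the compiler has managed to keep `Acc` off the heap; the same experiment with a `Vector` field inside `Acc` is a quick way to see the `Base.Array` limitation mentioned above.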
This optimization promises significant speedups for small and medium-sized, static, deterministic models.