@yebai and I have been discussing the idea of replacing the current VarInfo type with something more general, built as a wrapper around VarNamedVector/Metadata. The motivation is that VarInfo currently stores
- logp. This is ostensibly innocuous and straightforward, but it is actually sometimes the log prior, sometimes the log likelihood, and sometimes the log joint. Some samplers also hijack it, so that even when you're sampling from the log joint a sampler may use logp to store the log likelihood. This can cause mix-ups: I've had bugs in the new Gibbs implementation where I thought I had the likelihood but actually had the prior.
- num_produce and order, which are only used by particle methods (see "Decide the fate of VarInfo.num_produce" #661 for previous discussion).
- a VarNamedVector/Metadata with the variable values.
We would rather not have fields in VarInfo that are only used by specific samplers, and others that are used for different purposes at different times.
The solution we've been thinking of would be some sort of wrapper type, probably nested, that wraps a VarNamedVector and allows one to store any extra information needed. If a particle sampler needs num_produce and order, it'll implement its own wrapper type, and other samplers that only need e.g. the logjoint would use a different wrapper type.
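To make the nesting idea concrete, here is a rough sketch of what such wrapper types could look like. Every name in it (AbstractWrapper, LogJointWrapper, ParticleWrapper, values_of) is hypothetical and does not exist in DynamicPPL, and a plain Dict stands in for VarNamedVector/Metadata.

```julia
# Hypothetical sketch only: none of these types exist in DynamicPPL.
# A plain Dict stands in for VarNamedVector/Metadata.

abstract type AbstractWrapper end

# Wrapper for samplers that only need the log joint.
mutable struct LogJointWrapper{V} <: AbstractWrapper
    inner::V
    logjoint::Float64
end
LogJointWrapper(inner) = LogJointWrapper(inner, 0.0)

# Wrapper for particle methods, carrying the fields only they use.
mutable struct ParticleWrapper{V} <: AbstractWrapper
    inner::V
    num_produce::Int
    order::Dict{Symbol,Int}
end
ParticleWrapper(inner) = ParticleWrapper(inner, 0, Dict{Symbol,Int}())

# Peel off wrappers until the underlying value storage is reached.
values_of(v::Dict{Symbol,Float64}) = v
values_of(w::AbstractWrapper) = values_of(w.inner)

vals = Dict{Symbol,Float64}(:x => 1.0, :y => -0.5)

# Wrappers nest: a particle sampler that also tracks the log joint.
nested = ParticleWrapper(LogJointWrapper(vals))
```

In this sketch the value storage never learns about sampler-specific fields; each layer only adds what its sampler needs, and values_of recovers the shared storage regardless of how many layers are stacked.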
All of this could of course be stored in a context (because anything can be done with a context), but contexts are already too difficult to reason about, and should in my opinion only be used when a simpler tool won't cut it. We would therefore prefer an interface where, somewhere in the tilde pipeline (maybe everywhere we currently call acclogp), we would call some more generic store_custom_varinfo_data function, which each wrapper varinfo would then overload to store logprior/num_produce/whatever_they_want_to_store.
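As a rough illustration of how that overloading might work, here is a minimal sketch. store_custom_varinfo_data is only the placeholder name used above, and its signature here is an assumption. The Assume/Observe tags, SeparateLogps, and ProduceCounter are likewise hypothetical, and the tilde pipeline is simulated by a list of events rather than a real model. The sketch also assumes num_produce counts observe statements.

```julia
# Hypothetical sketch: nothing here is an existing DynamicPPL API.

# Tags for the kind of tilde statement being processed.
abstract type TildeKind end
struct Assume <: TildeKind end   # contributes to the log prior
struct Observe <: TildeKind end  # contributes to the log likelihood

# A wrapper that keeps log prior and log likelihood separate.
mutable struct SeparateLogps
    logprior::Float64
    loglikelihood::Float64
end
SeparateLogps() = SeparateLogps(0.0, 0.0)

# A wrapper for particle methods (assumed to count observe statements).
mutable struct ProduceCounter
    num_produce::Int
end
ProduceCounter() = ProduceCounter(0)

# Fallback: a wrapper that does not care about an event ignores it.
store_custom_varinfo_data(vi, ::TildeKind, logp::Real) = vi

function store_custom_varinfo_data(vi::SeparateLogps, ::Assume, logp::Real)
    vi.logprior += logp
    return vi
end

function store_custom_varinfo_data(vi::SeparateLogps, ::Observe, logp::Real)
    vi.loglikelihood += logp
    return vi
end

function store_custom_varinfo_data(vi::ProduceCounter, ::Observe, logp::Real)
    vi.num_produce += 1
    return vi
end

# The log joint is derived on demand, never stored in a shared field.
logjoint(vi::SeparateLogps) = vi.logprior + vi.loglikelihood

# Simulated pipeline: one assume statement and two observe statements.
events = [(Assume(), -1.5), (Observe(), -0.25), (Observe(), -0.75)]
logps = SeparateLogps()
counter = ProduceCounter()
for (kind, lp) in events
    store_custom_varinfo_data(logps, kind, lp)
    store_custom_varinfo_data(counter, kind, lp)
end
```

The same call site serves both wrappers, and dispatch decides what each one records. Whether a single hook like this is expressive enough for every use case is exactly the open question.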
It's not obvious though whether such an interface would be powerful enough to implement all the things we want to use it for. So we should probably start by making a list of all the things we want to use it for. This would include at least
- log prior/log likelihood/log joint. I would like to store these separately too, to avoid mixing them up.
- num_produce/order
- what else?
@yebai probably has more thoughts to share on this.