I've been seeing that people want to keep pretrained weights somewhere else (due to system, admin, memory, or other limitations), and our current design limits loading generalist model checkpoints from two checkpoint paths together!
I'll work on making it modular and open a PR. Let's see if I can keep the design minimalistic! (because there are quite a few conditions we need to consider there)