Consider the example using https://github.com/pyro-ppl/pyro/pull/2946:

```sh
FUNSOR_PROFILE=50 python -m tests.infer.autoguide.test_gaussian --no-jit -n 2 -s 800
```

What makes this slow? Can we implement better profiling tools to diagnose this?
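
As a possible starting point for the profiling question, here is a minimal sketch that uses only the standard library's `cProfile` and `pstats` to rank functions by cumulative time. It is not an existing funsor or pyro tool: `profile_call` and `slow_step` are hypothetical names introduced for illustration, and `slow_step` merely stands in for whatever step of the test turns out to be worth measuring.

```python
import cProfile
import io
import pstats


def profile_call(fn, *args, top=20, **kwargs):
    """Run fn under cProfile; return (result, text report of the top entries).

    Hypothetical helper, not part of funsor or pyro.
    """
    profiler = cProfile.Profile()
    # runcall profiles a single call and returns that call's result.
    result = profiler.runcall(fn, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so expensive callers appear first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


def slow_step(n):
    # Hypothetical stand-in for one expensive step of the test above.
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    result, report = profile_call(slow_step, 100_000)
    print(report)
```

Sorting by cumulative time tends to surface the high-level call that dominates the run; switching the sort key to `"tottime"` would instead highlight the individual functions where time is spent directly.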