At the moment, prototyping a new task with different terminal conditions, rewards, etc. requires either tearing down the current training loop or adding a lot of messy branching to it.
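To make the problem concrete, here is a minimal sketch of one way the task-specific pieces could be pulled out of the loop. Everything in it is an assumption for illustration, not code from this repo: the `Task` interface, the two toy tasks (`ReachTarget`, `StayAlive`), and `run_episode` are hypothetical names, and the "environment" is just a float that increases each step.

```python
from dataclasses import dataclass
from typing import Protocol


class Task(Protocol):
    """Hypothetical interface: everything task-specific lives behind it."""

    def reward(self, state: float) -> float: ...

    def is_terminal(self, state: float, step: int) -> bool: ...


@dataclass
class ReachTarget:
    """Toy task: negative distance reward, ends once within tolerance."""

    target: float = 5.0
    tolerance: float = 0.5

    def reward(self, state: float) -> float:
        return -abs(self.target - state)

    def is_terminal(self, state: float, step: int) -> bool:
        return abs(self.target - state) <= self.tolerance


@dataclass
class StayAlive:
    """Toy task: +1 per step, ends after a fixed number of steps."""

    max_steps: int = 3

    def reward(self, state: float) -> float:
        return 1.0

    def is_terminal(self, state: float, step: int) -> bool:
        return step >= self.max_steps


def run_episode(
    task: Task, step_size: float = 1.0, step_limit: int = 100
) -> tuple[float, int]:
    """Loop with no per-task branching; returns (total reward, steps taken)."""
    state, total, step = 0.0, 0.0, 0
    while step < step_limit:
        state += step_size  # stand-in for a policy action + environment step
        step += 1
        total += task.reward(state)
        if task.is_terminal(state, step):
            break
    return total, step


if __name__ == "__main__":
    # Swapping tasks means passing a different object, not editing the loop.
    print(run_episode(ReachTarget()))
    print(run_episode(StayAlive()))
```

With something along these lines, a new task is a new small class, and the loop itself stays untouched.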