So that we need not rerun RL training in order to test the model.
So that we need not rerun RL training in order to test the model.