**Describe the bug**
In the q_learning agent, the same model is stored multiple times, once for each test episode.

**Expected behaviour**
The model should be stored only once.
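
A minimal sketch of the expected behaviour, not the agent's actual code: `run_test_episodes`, `run_episode`, and `save_model` are hypothetical names introduced for illustration only. It assumes the model is not updated during testing, so the save can be moved out of the per-episode loop.

```python
from typing import Callable, List


def run_test_episodes(
    n_episodes: int,
    run_episode: Callable[[int], float],  # hypothetical: runs one test episode
    save_model: Callable[[], None],       # hypothetical: stores the model
) -> List[float]:
    """Run all test episodes, then store the model exactly once."""
    rewards = []
    for episode in range(n_episodes):
        # Evaluate only; nothing is saved inside the loop.
        rewards.append(run_episode(episode))
    # Assumption: the model is unchanged by testing, so one save is enough.
    save_model()
    return rewards


# Stand-in callables that only record how often they are called.
saves = []
rewards = run_test_episodes(
    n_episodes=5,
    run_episode=lambda ep: float(ep),
    save_model=lambda: saves.append("model"),
)
print(len(saves))
```

With the save placed after the loop, the number of stored copies no longer depends on the number of test episodes.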