The current implementation could improve its Q-values faster if epsilon starts at 0.95 and decays gradually to 0.1. Something like the following should do the trick:

```py
# max() keeps 0.1 as a floor; min() would cap epsilon at 0.1 from the first step
epsilon = max(0.1, epsilon * decay_factor)
```
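For context, here is a minimal sketch of how the decay could sit inside an epsilon-greedy loop, with 0.1 acting as a floor on exploration. The 0.95 start and 0.1 floor come from the suggestion above; the `DECAY_FACTOR` value of 0.995, the helper names (`decay_epsilon`, `choose_action`, `epsilon_schedule`), and the per-episode decay are assumptions for illustration, not the project's actual code.

```python
import random

EPSILON_START = 0.95  # initial exploration rate (from the suggestion above)
EPSILON_MIN = 0.1     # exploration floor (from the suggestion above)
DECAY_FACTOR = 0.995  # assumed value; tune for the number of episodes


def decay_epsilon(epsilon, decay_factor=DECAY_FACTOR, floor=EPSILON_MIN):
    """Shrink epsilon multiplicatively, never dropping below the floor."""
    return max(floor, epsilon * decay_factor)


def choose_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, else pick the best action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def epsilon_schedule(episodes, epsilon=EPSILON_START):
    """Return the epsilon used in each episode (hypothetical helper for inspection)."""
    schedule = []
    for _ in range(episodes):
        schedule.append(epsilon)
        epsilon = decay_epsilon(epsilon)  # decay once per episode (an assumption)
    return schedule
```

Printing `epsilon_schedule(n)` for the planned number of episodes is a quick way to check that the chosen decay factor reaches the floor neither too early nor too late.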