constructDuration.py is the slowest one. Strangely, it is also slow at the experiment level, although I would expect it to be really fast there, given that it simply takes already computed metrics from the trial metrics.
For the trial level, evaluate the following improvements:
- Reuse filteredRDD instead of computing it again, and also reuse the data variable, which already holds all the information you compute again for the for