To get the most consistent results, we might do well to run each unit test multiple times, discard any outliers (maybe the system was in a weird state for one run), and average the remaining runs.
A simple average is easy enough to implement, and we should see a large improvement from that alone, so it might be worth just averaging 3-5 runs for now and worrying about outliers once they crop up.
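As a rough sketch of what that could look like in Python: `time_runs` and `mean_without_outliers` are made-up names, not part of any existing test harness. The outlier rule is also only an assumption, one of several reasonable choices: it drops runs that lie more than a few median absolute deviations from the median. With only 3-5 runs any outlier rule is fairly crude, so the plain mean is probably the right starting point.

```python
import statistics
import time
from typing import Callable, List


def time_runs(test: Callable[[], None], runs: int = 5) -> List[float]:
    """Run `test` `runs` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        test()
        durations.append(time.perf_counter() - start)
    return durations


def mean_without_outliers(durations: List[float], max_deviations: float = 3.0) -> float:
    """Average the durations, skipping runs that sit far from the median.

    "Far" is an assumed rule: more than `max_deviations` times the median
    absolute deviation (MAD) away from the median.
    """
    median = statistics.median(durations)
    mad = statistics.median(abs(d - median) for d in durations)
    if mad == 0:
        # At least half the runs equal the median; fall back to a plain mean.
        return statistics.mean(durations)
    kept = [d for d in durations if abs(d - median) <= max_deviations * mad]
    return statistics.mean(kept)
```

The simple version is then just `statistics.mean(time_runs(some_test, runs=5))`, and switching to `mean_without_outliers(time_runs(some_test, runs=5))` later would not change anything else.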