We ran cuSignal on a VM five times and observed large run-to-run variation in running time for several tests, even though we changed nothing on our platform or in the code. For instance, the runtimes of ISTFT with the "1024-1000000.0-65536-float64" parameter set were 362.2986 us and 665.0717 us in two of the runs, roughly a 1.8x difference. We observed similar differences for a few other tests as well (e.g. ChannelizePoly, CWT, ...). What could be causing such large variations?
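To put a number on the spread, here is a minimal sketch of how the two ISTFT runtimes quoted above can be compared. The helper name `spread` is hypothetical (it is not part of cuSignal), and only the two values given in this post are used, since the other runs are not listed here.

```python
import statistics


def spread(runtimes_us):
    """Return (max/min ratio, coefficient of variation) for a list of runtimes.

    Hypothetical helper for illustration only; not a cuSignal function.
    """
    ratio = max(runtimes_us) / min(runtimes_us)
    # Sample standard deviation relative to the mean.
    cv = statistics.stdev(runtimes_us) / statistics.mean(runtimes_us)
    return ratio, cv


# The two ISTFT runtimes (in microseconds) reported above.
istft_runs_us = [362.2986, 665.0717]

ratio, cv = spread(istft_runs_us)
print(f"max/min ratio: {ratio:.2f}x, coefficient of variation: {cv:.1%}")
```

With all five runs in the list instead of two, the same helper would give a better picture of whether the slow run is an outlier or whether the timings are scattered across the whole range.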