Describe the bug
Running multiple unit tests in parallel with pytest-xdist (-n X) on Windows leads to failures. The most likely cause is a race on the shared cache: one worker compiles a kernel and starts executing it through a launcher DLL (which has a .pyd extension on Windows) loaded from the ~/.triton/cache folder, while another worker compiles the same kernel and tries to write its launcher DLL to the same path. On Windows, a DLL that is loaded into a process is locked and cannot be modified, so the second worker's attempt to write the .pyd file fails with an I/O error. This doesn't happen on Linux, because Linux doesn't lock files that are open in running processes.
Any ideas on how to reliably solve this are welcome.
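One possible direction, as a minimal sketch rather than a description of how Triton's cache actually works: each worker writes the launcher to a uniquely named temporary file in the cache folder and then tries to move it into place. If the move fails because the destination is already loaded by another process, the worker discards its copy and uses the existing file. The helper name publish_artifact is hypothetical, and the sketch assumes that two workers compiling the same kernel produce interchangeable launchers.

```python
import os
import tempfile


def publish_artifact(data: bytes, dest_path: str) -> str:
    """Hypothetical helper: store `data` at `dest_path`, tolerating a
    destination that another process has already written and locked."""
    dest_dir = os.path.dirname(dest_path) or "."
    os.makedirs(dest_dir, exist_ok=True)

    # Write to a unique temp file in the same directory, so the final
    # rename never crosses a volume boundary.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        try:
            # Move the finished file into place in a single step.
            os.replace(tmp_path, dest_path)
        except PermissionError:
            # On Windows this is raised when dest_path is in use, for
            # example loaded as a DLL by another worker. Assumption: the
            # existing file is equivalent, so it is reused as-is.
            if not os.path.exists(dest_path):
                raise
    finally:
        # Remove the temp file if it was not moved into place.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
    return dest_path
```

A simpler but more wasteful alternative would be to give every xdist worker its own cache folder, at the cost of compiling each kernel once per worker.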
Environment details
Triton on Windows, with any GPU.