I ran `npx --no node-llama-cpp download --cuda`, and the compile step takes a seriously long time, seemingly because the build only runs on one thread. Is there anything I can do to speed it up?