It seems that it selects a CUDA device when one is available, but doesn't actually use it during the calculation : ) (the program still runs fine, though~)
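For what it's worth, here is a minimal sketch of how one could check this, assuming the project uses PyTorch (the comment above doesn't say which framework it is, so that part is a guess). The `Linear` model, the `pick_device` and `devices_in_use` helpers, and the tensor shapes are all hypothetical stand-ins, not the project's actual code. The point is just that picking a device does nothing on its own; the model and inputs have to be moved to it explicitly.

```python
import torch


def pick_device():
    """Return the CUDA device if one is available, otherwise the CPU."""
    return torch.device("cuda" if torch.cuda.is_available() else "cpu")


def devices_in_use(model, *tensors):
    """Collect the device types the parameters and tensors actually live on."""
    found = {p.device.type for p in model.parameters()}
    found |= {t.device.type for t in tensors}
    return found


device = pick_device()
model = torch.nn.Linear(4, 2)  # hypothetical stand-in for the real model
x = torch.randn(8, 4)          # hypothetical input batch

# Selecting a device is not enough: everything is still on the CPU here.
before = devices_in_use(model, x)

# The computation only runs on the selected device after an explicit move.
model = model.to(device)
x = x.to(device)
after = devices_in_use(model, x)

out = model(x)
print("before:", before, "after:", after, "output on:", out.device)
```

If `before` and `after` both report only `cpu` on a machine that has a GPU, the device is being selected but never used, which would match the behaviour described above.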