Description
I am getting the error below even though CUDA is already installed on my system.
`
(vitis-ai-pytorch) Vitis-AI /workspace/KV260_Vitis_AI_examples/mnist_pyt/files > nvidia-smi
Wed Mar 9 07:31:47 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.103.01   Driver Version: 470.103.01   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla K20c          Off  | 00000000:83:00.0 Off |                    0 |
| 42%   49C    P0    49W / 225W |      0MiB /  4743MiB |     68%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K20c          Off  | 00000000:84:00.0 Off |                    0 |
| 41%   54C    P0    56W / 225W |   1210MiB /  4743MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+`
I built the Vitis-AI repo with GPU support.
`(vitis-ai-pytorch) Vitis-AI /workspace/KV260_Vitis_AI_examples/mnist_pyt/files > python -u train.py -d ${BUILD} 2>&1 | tee ${LOG}/train.log
PyTorch version : 1.7.1
3.6.15 | packaged by conda-forge | (default, Dec 3 2021, 18:49:41)
[GCC 9.4.0]
Command line options:
--build_dir : ./build
--batchsize : 100
--learnrate : 0.001
--epochs : 3
You have 2 CUDA devices available
Device 0 : Tesla K20c
Device 1 : Tesla K20c
Selecting device 0..
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz to ./build/dataset/MNIST/raw/train-images-idx3-ubyte.gz
9920512it [00:04, 2146323.53it/s]
Extracting ./build/dataset/MNIST/raw/train-images-idx3-ubyte.gz to ./build/dataset/MNIST/raw
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz to ./build/dataset/MNIST/raw/train-labels-idx1-ubyte.gz
32768it [00:01, 19985.86it/s]
Extracting ./build/dataset/MNIST/raw/train-labels-idx1-ubyte.gz to ./build/dataset/MNIST/raw
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz to ./build/dataset/MNIST/raw/t10k-images-idx3-ubyte.gz
1654784it [00:08, 185356.98it/s]
Extracting ./build/dataset/MNIST/raw/t10k-images-idx3-ubyte.gz to ./build/dataset/MNIST/raw
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz to ./build/dataset/MNIST/raw/t10k-labels-idx1-ubyte.gz
8192it [00:01, 6238.02it/s]
Extracting ./build/dataset/MNIST/raw/t10k-labels-idx1-ubyte.gz to ./build/dataset/MNIST/raw
Processing...
/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.6/site-packages/torchvision/datasets/mnist.py:480: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /opt/conda/conda-bld/pytorch_1607370120218/work/torch/csrc/utils/tensor_numpy.cpp:141.)
return torch.from_numpy(parsed.astype(m[2], copy=False)).view(*s)
Done!
Epoch 1
Traceback (most recent call last):
  File "train.py", line 131, in <module>
    run_main()
  File "train.py", line 124, in run_main
    train_test(args.build_dir, args.batchsize, args.learnrate, args.epochs)
  File "train.py", line 89, in train_test
    train(model, device, train_loader, optimizer, epoch)
  File "/workspace/KV260_Vitis_AI_examples/mnist_pyt/files/common.py", line 63, in train
    x = model(data)
  File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/workspace/KV260_Vitis_AI_examples/mnist_pyt/files/common.py", line 49, in forward
    x = self.network(x)
  File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.6/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 423, in forward
    return self._conv_forward(input, self.weight)
  File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 420, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: CUDA error: no kernel image is available for execution on the device
`
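One thing that may help narrow this down: this error usually means the installed PyTorch build contains no compiled kernels for the GPU's compute capability, even though the driver and CUDA toolkit are fine. The sketch below is only a diagnostic idea, not a confirmed fix. `parse_sm` and `has_binary_kernels` are hypothetical helper names, and the script assumes `torch.cuda.get_arch_list()` is present in the installed build (it falls back to an empty list if it is not). It also only checks `sm_XY` binary entries and ignores any `compute_XY` PTX entries, so treat a negative result as a hint rather than proof.

```python
def parse_sm(arch):
    """Turn 'sm_35' into (3, 5); return None for anything else (e.g. 'compute_80')."""
    if not arch.startswith("sm_"):
        return None
    digits = arch[3:]
    if len(digits) < 2 or not digits.isdigit():
        return None
    return int(digits[:-1]), int(digits[-1])


def has_binary_kernels(capability, arch_list):
    """Check whether arch_list holds a binary usable on a device of this capability.

    A binary built for sm_XY runs on devices with the same major version X
    and a minor version greater than or equal to Y.
    """
    major, minor = capability
    for arch in arch_list:
        parsed = parse_sm(arch)
        if parsed is not None and parsed[0] == major and parsed[1] <= minor:
            return True
    return False


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        # Assumption: get_arch_list exists in this build; use an empty list otherwise.
        get_arch_list = getattr(torch.cuda, "get_arch_list", lambda: [])
        arch_list = get_arch_list()
        print("Architectures compiled into this PyTorch build:", arch_list)
        for index in range(torch.cuda.device_count()):
            capability = torch.cuda.get_device_capability(index)
            name = torch.cuda.get_device_name(index)
            ok = has_binary_kernels(capability, arch_list)
            print("Device", index, name, "capability", capability,
                  "binary kernels found:", ok)
    else:
        print("PyTorch with a usable CUDA device was not found; nothing to check.")
```

The Tesla K20c in the log above is a compute capability 3.5 device, so if the printed architecture list has no `sm_3x` entry at or below `sm_35`, that would be consistent with the `no kernel image is available` error.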