Unable to Install and use with GPUs #3

@ajtarraga

Hi @mdolz, I am trying to install and configure PyDTNN in a project with several heterogeneous supercomputing nodes. These nodes have several GPUs interconnected via GPUDirect Storage and RDMA.

When I run PyDTNN with the following command (the example given in the code):
python3 -Ou pydtnn_benchmark.py --model=vgg16_cifar10 --dataset=cifar10 --dataset_train_path=datasets/cifar-10/cifar-10-batches-bin --dataset_test_path=datasets/cifar-10/cifar-10-batches-bin --evaluate_only=True --batch_size=64 --validation_split=0.2 --weights_and_bias_filename=vgg16-weights-nhwc.npz --tracing=False --profile=False --enable_gpu=True --dtype=float32
I get the following output:
/home/ajtarraga/.local/lib/python3.8/site-packages/skcuda/cublas.py:284: UserWarning: creating CUBLAS context to get version number
warnings.warn('creating CUBLAS context to get version number')
Please, install pycuda, skcuda, and cudnn to be able to use the GPUs!

However, pycuda, skcuda and cudnn are already installed:
$ pip3 install -r requirements_cuda_2.txt
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Requirement already satisfied: pycuda>=2021.1 in /home/ajtarraga/.local/lib/python3.8/site-packages (from -r requirements_cuda_2.txt (line 1)) (2022.2.2)
Requirement already satisfied: scikit-cuda>=0.5.3 in /home/ajtarraga/.local/lib/python3.8/site-packages (from -r requirements_cuda_2.txt (line 2)) (0.5.3)
Requirement already satisfied: nvidia-cudnn>=8.1.1.33 in /home/ajtarraga/.local/lib/python3.8/site-packages (from -r requirements_cuda_2.txt (line 3)) (8.2.0.51)
Requirement already satisfied: appdirs>=1.4.0 in /home/ajtarraga/.local/lib/python3.8/site-packages (from pycuda>=2021.1->-r requirements_cuda_2.txt (line 1)) (1.4.4)
Requirement already satisfied: pytools>=2011.2 in /home/ajtarraga/.local/lib/python3.8/site-packages (from pycuda>=2021.1->-r requirements_cuda_2.txt (line 1)) (2022.1.14)
Requirement already satisfied: mako in /home/ajtarraga/.local/lib/python3.8/site-packages (from pycuda>=2021.1->-r requirements_cuda_2.txt (line 1)) (1.2.4)
Requirement already satisfied: numpy>=1.2.0 in /home/ajtarraga/.local/lib/python3.8/site-packages (from scikit-cuda>=0.5.3->-r requirements_cuda_2.txt (line 2)) (1.24.1)
Requirement already satisfied: wheel in /usr/lib/python3/dist-packages (from nvidia-cudnn>=8.1.1.33->-r requirements_cuda_2.txt (line 3)) (0.34.2)
Requirement already satisfied: setuptools in /home/ajtarraga/.local/lib/python3.8/site-packages (from nvidia-cudnn>=8.1.1.33->-r requirements_cuda_2.txt (line 3)) (65.6.3)
Requirement already satisfied: MarkupSafe>=0.9.2 in /home/ajtarraga/.local/lib/python3.8/site-packages (from mako->pycuda>=2021.1->-r requirements_cuda_2.txt (line 1)) (2.1.1)
Requirement already satisfied: typing-extensions>=4.0 in /home/ajtarraga/.local/lib/python3.8/site-packages (from pytools>=2011.2->pycuda>=2021.1->-r requirements_cuda_2.txt (line 1)) (4.4.0)
Requirement already satisfied: platformdirs>=2.2.0 in /home/ajtarraga/.local/lib/python3.8/site-packages (from pytools>=2011.2->pycuda>=2021.1->-r requirements_cuda_2.txt (line 1)) (3.1.0)

What do you think could be the problem and how can I solve it?
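
Since the message above names three dependencies at once, it may help to check each one separately. The following is only a minimal diagnostic sketch, not part of PyDTNN: it assumes the relevant Python modules are `pycuda.driver` and `skcuda.cublas` (the latter appears in the warning path above) and that cuDNN is loaded as a shared library named `cudnn`. The helper names `check_import`, `check_shared_library` and `diagnose` are hypothetical.

```python
import ctypes.util
import importlib


def check_import(module_name):
    """Try to import a module and return (ok, detail)."""
    try:
        importlib.import_module(module_name)
        return True, "imported"
    except Exception as exc:  # ImportError, or OSError from a missing shared library
        return False, f"{type(exc).__name__}: {exc}"


def check_shared_library(name):
    """Ask the system loader helpers where a shared library lives."""
    path = ctypes.util.find_library(name)
    if path is None:
        return False, "not found on the default library search path"
    return True, path


def diagnose():
    """Check each GPU dependency on its own instead of all together."""
    results = {}
    # Assumed module names; PyDTNN may import different submodules.
    for module_name in ("pycuda.driver", "skcuda.cublas"):
        results[module_name] = check_import(module_name)
    # Assumed library name; a pip-installed cuDNN may live outside the search path.
    results["libcudnn"] = check_shared_library("cudnn")
    return results


if __name__ == "__main__":
    for name, (ok, detail) in diagnose().items():
        status = "OK  " if ok else "FAIL"
        print(f"{status} {name}: {detail}")
```

Running it with the same `python3` interpreter used for `pydtnn_benchmark.py` should show which of the three checks fails and with which exception, which is more specific than the combined message.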
