Issue
I'm currently working on a server and I would like to use its GPUs for PyTorch network training. PyTorch cannot detect any GPU, but with TensorFlow I can detect both of the GPUs the machine is supposed to have. I suspect a mismatch between the PyTorch/TensorFlow builds and the CUDA version installed on the server.
However, even after trying different versions of PyTorch, I am still not able to use the GPUs...
I am attaching the details of the GPUs and the current versions of TensorFlow and PyTorch I am using. Does anyone have any hint? It would be very helpful.
| NVIDIA-SMI 4--.--.-- Driver Version: 465.19.01 CUDA Version: 11.3 |
|-------------------------------+----------------------+----------------------|
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... On | 00000000:02:00.0 Off | N/A |
| 27% 39C P8 17W / 250W | 1MiB / 11176MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 NVIDIA GeForce ... On | 00000000:81:00.0 Off | N/A |
| 28% 45C P8 11W / 250W | 1MiB / 11178MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Wed_Jul_22_19:09:09_PDT_2020
Cuda compilation tools, release 11.0, V11.0.221
Build cuda_11.0_bu.TC445_37.28845127_0
Torch version: 1.10.2
TensorFlow version: 2.6.2
CUDA toolkit: 11.3.1
>>> print('Number of GPUs: %d' % len(tf.config.list_physical_devices('GPU')))
Number of GPUs: 2
>>> torch.cuda.is_available()
False
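For anyone hitting the same symptom, a slightly fuller diagnostic than the snippet above can tell you whether the installed wheel is a CPU-only build. This is a hedged sketch, not part of the original post: `torch.version.cuda` is `None` on CPU-only builds, which would explain why `torch.cuda.is_available()` returns False even though nvidia-smi sees both GPUs.

```python
import torch

# A CPU-only wheel reports a version like "1.10.2+cpu" and has no CUDA runtime.
print(torch.__version__)

# None on CPU-only builds; a string like "11.3" on a CUDA-enabled build.
print(torch.version.cuda)

print(torch.cuda.is_available())
print(torch.cuda.device_count())
```

If `torch.version.cuda` prints `None`, the problem is the installed package, not the driver.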
I am so lost... Thank you in advance!
Solution
I finally resolved this problem by explicitly specifying the CUDA variant of PyTorch when installing it... The combination of versions I had was pulling in the CPU-only build.
After installing the correct one, I have been able to use the GPU server without any problem.
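For reference, a sketch of the kind of install command that pins the CUDA build, assuming CUDA 11.3 as shown by nvidia-smi above (the original post does not give the exact command; check the install selector on pytorch.org for your platform):

```shell
# Install the CUDA 11.3 build of torch 1.10.2 instead of the default CPU wheel.
# The +cu113 local version tag and the extra index select the GPU-enabled wheel.
pip install torch==1.10.2+cu113 --extra-index-url https://download.pytorch.org/whl/cu113
```

Without the extra index URL, pip may resolve to a CPU-only wheel from PyPI, which is exactly the failure mode described here.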
Answered By - mdelas