Issue
I'm using Spark together with face-alignment to find faces that are almost the same:
import gc
import face_alignment
import torch

fa = face_alignment.FaceAlignment(face_alignment.LandmarksType._2D, flip_input=False)  # tries to use the GPU via the PyTorch dependencies
imageVector.append(convertImagefa(image, fa))
del fa
gc.collect()
torch.cuda.empty_cache()  # trying to clean up CUDA
return imageVector
I'm on one machine with 4 threads that all try to access the GPU. To cope with that I worked out a strategy where only every 4th request uses the GPU, which seems to fit in memory.
My issue is that when I clean up after CUDA, it never actually releases everything. I'll see the load move around the threads and some space free up, but CUDA never lets go of the last 624 MiB. Is there a way to clean it all the way up?
nvidia-smi
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     17132      C   .../face-the-same/bin/python      624MiB |
|    0   N/A  N/A     17260      C   .../face-the-same/bin/python     1028MiB |
|    0   N/A  N/A     17263      C   .../face-the-same/bin/python      624MiB |
|    0   N/A  N/A     17264      C   .../face-the-same/bin/python      624MiB |
+-----------------------------------------------------------------------------+
FYI: I ended up using a distributed lock to pin the GPU computation to one executor/process ID. This was the outcome derived from the comment made by @Jan.
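For illustration only (this is not the asker's actual code): one way to pin GPU work to a single process with a distributed lock is to let the first executor that grabs a lock on a shared filesystem keep it and use CUDA, while the others fall back to CPU. The sketch below assumes the third-party filelock package and a hypothetical lock path reachable by all executors.

import face_alignment
from filelock import FileLock, Timeout

GPU_LOCK_PATH = "/shared/tmp/gpu.lock"  # hypothetical path on a filesystem all executors can reach
_gpu_lock = FileLock(GPU_LOCK_PATH)

def pick_device():
    # The first process to acquire the lock keeps it and is the only one
    # that ever creates a CUDA context; everyone else runs on CPU.
    try:
        _gpu_lock.acquire(timeout=0)
        return "cuda"
    except Timeout:
        return "cpu"

fa = face_alignment.FaceAlignment(face_alignment.LandmarksType._2D, device=pick_device(), flip_input=False)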
Solution
According to https://discuss.pytorch.org/t/pytorch-do-not-clear-gpu-memory-when-return-to-another-function/125944/3, this is because the CUDA context stays in place until the script ends. They recommend calling torch.cuda.empty_cache() to clear the cache; however, there will always be a remainder. To get rid of that you could switch to processes instead of threads, so that a process can actually be killed without killing your program (but that will probably be quite some effort).
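As a minimal sketch of that suggestion (not code from the answer itself): run the GPU work in a short-lived child process, so its CUDA context, including the leftover 624 MiB, is released as soon as the child exits. The names _gpu_worker and convert_image_in_subprocess are made up, and get_landmarks_from_image stands in for whatever convertImagefa does in the question's snippet.

import multiprocessing as mp

def _gpu_worker(image, queue):
    # Import inside the child so the CUDA context only ever exists in this process.
    import face_alignment
    fa = face_alignment.FaceAlignment(face_alignment.LandmarksType._2D, flip_input=False)
    queue.put(fa.get_landmarks_from_image(image))

def convert_image_in_subprocess(image):
    ctx = mp.get_context("spawn")  # "spawn" keeps the parent free of CUDA state
    queue = ctx.Queue()
    proc = ctx.Process(target=_gpu_worker, args=(image, queue))
    proc.start()
    result = queue.get()
    proc.join()  # once the child exits, the driver frees its CUDA context
    return result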
Answered By - Jan