Issue
I'm using Spark together with face-alignment to find faces that are almost the same:
import gc
import face_alignment
import torch

fa = face_alignment.FaceAlignment(face_alignment.LandmarksType._2D, flip_input=False)  # tries to use the GPU via the PyTorch dependencies
imageVector.append(convertImagefa(image, fa))
del fa
gc.collect()
torch.cuda.empty_cache()  # trying to clean up CUDA
return imageVector
I'm on one machine with 4 threads that all try to access the GPU. To cope with that I worked out a strategy where only every 4th request uses the GPU, which seems to fit in memory.
My issue is that when I clean up after CUDA, it never actually releases everything. I'll see the load move around the threads and some space free up, but CUDA never lets go of the last 624 MiB. Is there a way to clean it all the way up?
nvidia-smi
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     17132      C   .../face-the-same/bin/python      624MiB |
|    0   N/A  N/A     17260      C   .../face-the-same/bin/python     1028MiB |
|    0   N/A  N/A     17263      C   .../face-the-same/bin/python      624MiB |
|    0   N/A  N/A     17264      C   .../face-the-same/bin/python      624MiB |
+-----------------------------------------------------------------------------+
FYI: I ended up using a distributed lock to pin the GPU computation to one executor/process ID. This was the outcome derived from the comment made by @Jan.
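For illustration only (this is not the asker's actual code): one way to pin GPU work to a single process with a distributed lock is to let the first executor that grabs a lock on a shared filesystem keep it and use CUDA, while the others fall back to CPU. The sketch below assumes the third-party filelock package and a hypothetical lock path reachable by all executors.

import face_alignment
from filelock import FileLock, Timeout

GPU_LOCK_PATH = "/shared/tmp/gpu.lock"  # hypothetical path on a filesystem all executors can reach
_gpu_lock = FileLock(GPU_LOCK_PATH)

def pick_device():
    # The first process to acquire the lock keeps it and is the only one
    # that ever creates a CUDA context; everyone else runs on CPU.
    try:
        _gpu_lock.acquire(timeout=0)
        return "cuda"
    except Timeout:
        return "cpu"

fa = face_alignment.FaceAlignment(face_alignment.LandmarksType._2D, device=pick_device(), flip_input=False)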
Solution
According to https://discuss.pytorch.org/t/pytorch-do-not-clear-gpu-memory-when-return-to-another-function/125944/3, this is because the CUDA context stays in place until the script ends. They recommend calling torch.cuda.empty_cache() to clear the cache; however, there will always be a remainder. To get rid of that you could switch to processes instead of threads, so that a process can actually be killed without killing your program (but that will probably be quite some effort).
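As a minimal sketch of that suggestion (not code from the answer itself): run the GPU work in a short-lived child process, so its CUDA context, including the leftover 624 MiB, is released as soon as the child exits. The names _gpu_worker and convert_image_in_subprocess are made up, and get_landmarks_from_image stands in for whatever convertImagefa does in the question's snippet.

import multiprocessing as mp

def _gpu_worker(image, queue):
    # Import inside the child so the CUDA context only ever exists in this process.
    import face_alignment
    fa = face_alignment.FaceAlignment(face_alignment.LandmarksType._2D, flip_input=False)
    queue.put(fa.get_landmarks_from_image(image))

def convert_image_in_subprocess(image):
    ctx = mp.get_context("spawn")  # "spawn" keeps the parent free of CUDA state
    queue = ctx.Queue()
    proc = ctx.Process(target=_gpu_worker, args=(image, queue))
    proc.start()
    result = queue.get()
    proc.join()  # once the child exits, the driver frees its CUDA context
    return result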
Answered By - Jan