Saturday, January 6, 2024

[FIXED] Pytorch in docker with read-only mode

Labels: docker, python, pytorch

Issue

I'm running PyTorch in Docker. The security team requires the container to run in read-only mode.

I need to fork the main process after the models are loaded, so I call module.share_memory() to move all models to shared memory, and I set torch.multiprocessing.set_sharing_strategy('file_system'). With the default file_descriptor strategy, 1024 open file descriptors are not enough for me, and I can't raise that limit: I run gunicorn in sync mode, which uses Linux select under the hood, and the 1024 limit is hardcoded there.

When I run Docker in read-only mode, I get this error:

  File "/app/.venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1515, in share_memory
    return self._apply(lambda t: t.share_memory_())
  File "/app/.venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 387, in _apply
    module._apply(fn)
  File "/app/.venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 387, in _apply
    module._apply(fn)
  File "/app/.venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 387, in _apply
    module._apply(fn)
  [Previous line repeated 2 more times]
  File "/app/.venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 409, in _apply
    param_applied = fn(param)
  File "/app/.venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1515, in <lambda>
    return self._apply(lambda t: t.share_memory_())
  File "/app/.venv/lib/python3.9/site-packages/torch/tensor.py", line 385, in share_memory_
    self.storage().share_memory_()
  File "/app/.venv/lib/python3.9/site-packages/torch/storage.py", line 143, in share_memory_
    self._share_filename_()
RuntimeError: std::exception at /pytorch/torch/lib/libshm/core.cpp:99

I understand that I need to give additional read-write access to some directories, but I can't figure out which ones. How can I find them? There is already read-write access to /dev/shm, and I can even see PyTorch creating files there, but it then crashes with the error above.

I'm using PyTorch 1.8.1.

Solution

I was able to fix it in two different ways:

1. Set ENV TEMP=/var/tmp in the Dockerfile, which changes the temp path PyTorch uses from the default /tmp to /var/tmp, and give read-write access to /var/tmp by adding -v /var/tmp:/var/tmp to the docker run args.
2. Add -v /tmp:/tmp to the docker run args (PyTorch uses /tmp by default).
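
To confirm from inside the container that either fix took effect before the models are loaded, a small preflight check along these lines may help. It is only a sketch: the function names are hypothetical, and it assumes (based on the answer above, not on PyTorch's source) that the relevant temp directory is the one named by TEMP, falling back to /tmp. PyTorch may consult other variables that this does not check.

```python
import os
import tempfile


def resolve_tmp_dir(environ=os.environ):
    """Return the directory named by TEMP, or /tmp when TEMP is unset.

    Assumption taken from the answer: PyTorch honours TEMP and falls
    back to /tmp. Any other variables it may read are not checked here.
    """
    return environ.get("TEMP") or "/tmp"


def is_writable(directory):
    """Try to create (and auto-delete) a real file in the directory."""
    try:
        with tempfile.NamedTemporaryFile(dir=directory):
            pass
    except OSError:
        # Covers a read-only filesystem, missing directory, or no permission.
        return False
    return True


def check_shared_memory_dirs(environ=os.environ):
    """Map each directory the fix relies on to whether it is writable."""
    candidates = [resolve_tmp_dir(environ), "/dev/shm"]
    return {path: is_writable(path) for path in candidates}


if __name__ == "__main__":
    for path, ok in check_shared_memory_dirs().items():
        print(f"{path}: {'writable' if ok else 'read-only or missing'}")
```

Running this as the first step of the container entrypoint should show which of the two directories is still read-only, instead of the opaque std::exception raised from share_memory(). Creating a real file is used rather than os.access because a permission check alone can report success on a read-only mount.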

Answered By - Tipok

This answer, collected from Stack Overflow and tested by PythonFixing community admins, is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
