Issue
I understand there are a variety of techniques for sharing memory and data structures between processes in Python. This question is specifically about the inherently shared memory in Python scripts that existed in Python 3.6 but seems to no longer exist in 3.10. Does anyone know why this changed, and whether it's possible to bring it back in 3.10? Or what the change I'm observing actually is? I've upgraded my Mac to Monterey, which no longer supports Python 3.6, so I'm forced to upgrade to either 3.9 or 3.10+.
Note: I tend to develop on Mac and run production on Ubuntu. Not sure if that factors in here. Historically with 3.6, everything behaved the same regardless of OS.
Make a simple project with the following Python files:
myLibrary.py
MyDict = {}
test.py
import threading
import time
import multiprocessing
import myLibrary

def InitMyDict():
    myLibrary.MyDict = {'woot': 1, 'sauce': 2}
    print('initialized myLibrary.MyDict to ', myLibrary.MyDict)

def MainLoop():
    numOfSubProcessesToStart = 3
    for i in range(numOfSubProcessesToStart):
        # Pass the function itself; CoolFeature() would call it right here
        # and hand its return value (None) to the thread as the target.
        t = threading.Thread(
            target=CoolFeature,
            args=())
        t.start()
    while True:
        time.sleep(1)

def CoolFeature():
    MyProcess = multiprocessing.Process(
        target=SubProcessFunction,
        args=())
    MyProcess.start()

def SubProcessFunction():
    print('SubProcessFunction: ', myLibrary.MyDict)

if __name__ == '__main__':
    InitMyDict()
    MainLoop()
When I run this on 3.6 it behaves significantly differently than on 3.10. I do understand that a subprocess cannot modify the memory of the main process, but it is still super convenient to read the main process' data structures that were previously set up, as opposed to moving every little thing into shared memory just to read a simple dictionary/int/string/etc.
Python 3.10 output:
python3.10 test.py
initialized myLibrary.MyDict to {'woot': 1, 'sauce': 2}
SubProcessFunction: {}
SubProcessFunction: {}
SubProcessFunction: {}
Python 3.6 output:
python3.6 test.py
initialized myLibrary.MyDict to {'woot': 1, 'sauce': 2}
SubProcessFunction: {'woot': 1, 'sauce': 2}
SubProcessFunction: {'woot': 1, 'sauce': 2}
SubProcessFunction: {'woot': 1, 'sauce': 2}
Observation:
Notice that in 3.6, the subprocess can view the value that was set from the main process. But in 3.10, the subprocess sees an empty dictionary.
Solution
In short: since Python 3.8, multiprocessing uses the spawn start method by default on macOS. Before that, it used the fork method.
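With the fork start method (which in 3.10 is still the default on other UNIX platforms such as Linux), every new multiprocessing process is an exact copy of the parent at the time of the fork. That is why the children in your 3.6 run could see the populated dictionary.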
The spawn method means that a new Python interpreter is started for each new multiprocessing process. According to the documentation:
"The child process will only inherit those resources necessary to run the process object's run() method."
It will import your program into this new interpreter, so starting processes et cetera should only be done from within the if __name__ == '__main__': block!
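This means you cannot count on variables from the parent process being available in the children, unless they are module-level constants, which are recreated when the module is imported. In your example each child imports myLibrary afresh and gets the initial MyDict = {}; InitMyDict() only runs in the parent, inside the __main__ block.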
So the change is significant.
What can be done?
If the required information could be a module-level constant, that would solve the problem in the simplest way.
If that is not possible (e.g. because the data needs to be generated at runtime), you could have the parent write the information to be shared to a file, e.g. in JSON format, before it starts other processes. The children could then simply read it. That is probably the next simplest solution.
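A minimal sketch of the file-based approach, assuming the data is JSON-serializable. The function names and the state.json file name are made up for illustration; the child is given the file path explicitly through args, so nothing depends on the start method.

```python
import json
import multiprocessing
import os
import tempfile


def save_state(path, state):
    """Parent side: write the dictionary to disk before any child starts."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(state, handle)


def load_state(path):
    """Child side: read the dictionary back from the file."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def sub_process_function(path):
    # Runs in the child; the path arrives via args, not via a global.
    print("SubProcessFunction: ", load_state(path))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        state_path = os.path.join(tmp, "state.json")  # hypothetical file name
        save_state(state_path, {"woot": 1, "sauce": 2})
        children = [
            multiprocessing.Process(target=sub_process_function, args=(state_path,))
            for _ in range(3)
        ]
        for child in children:
            child.start()
        for child in children:
            child.join()  # keep the temp directory alive until all children are done
```

Keep in mind that JSON only round-trips basic types, and that dictionary keys come back as strings.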
Using a multiprocessing.Manager would allow you to share a dict between processes. There is, however, a certain amount of overhead associated with this.
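A rough sketch of the Manager variant, with made-up function names. The managed dict lives in a separate server process and every child receives a proxy to it through args, which is where the overhead comes from: each access is a round trip to that server.

```python
import multiprocessing


def snapshot(shared):
    """Copy the current contents of a (possibly proxied) mapping into a plain dict."""
    return dict(shared)


def sub_process_function(shared):
    # In the child, `shared` is a proxy; reading it talks to the manager process.
    print("SubProcessFunction: ", snapshot(shared))


if __name__ == "__main__":
    with multiprocessing.Manager() as manager:
        # manager.dict() returns a proxy that can be passed to other processes.
        shared = manager.dict({"woot": 1, "sauce": 2})
        children = [
            multiprocessing.Process(target=sub_process_function, args=(shared,))
            for _ in range(3)
        ]
        for child in children:
            child.start()
        for child in children:
            child.join()  # join before the manager shuts down
```

Unlike the file approach, changes a child makes through the proxy are visible to the parent and to the other children.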
Or you could try calling multiprocessing.set_start_method("fork") before creating processes or pools and see whether it works without crashing in your case. That would revert to the pre-3.8 behavior on macOS. But as documented in this bug, there are real problems with using the fork method on macOS. Reading the issue indicates that fork might be OK as long as you don't use threads.
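For illustration, here is a small sketch of forcing fork. It uses multiprocessing.get_context("fork") instead of set_start_method("fork"), so the choice stays local to one place in the code; that substitution, the STATE dictionary (a stand-in for myLibrary.MyDict) and the function names are assumptions of this example. It needs a platform that offers fork (so not Windows), and the macOS caveats above still apply.

```python
import multiprocessing

# Hypothetical stand-in for myLibrary.MyDict: empty at import time.
STATE = {}


def report_state(conn):
    """Child side: send back whatever STATE looks like in this process."""
    conn.send(dict(STATE))
    conn.close()


def read_state_in_forked_child():
    """Start one child with the fork method and return the STATE it saw."""
    # A context object leaves the process-wide default start method untouched.
    ctx = multiprocessing.get_context("fork")
    receiver, sender = ctx.Pipe(duplex=False)
    child = ctx.Process(target=report_state, args=(sender,))
    child.start()
    sender.close()  # the parent only reads from the pipe
    seen = receiver.recv()
    child.join()
    return seen


if __name__ == "__main__":
    STATE.update({"woot": 1, "sauce": 2})  # runtime initialisation in the parent
    print("child saw:", read_state_in_forked_child())
```

Because the child is a copy of the parent at the moment of the fork, it sees the values set at runtime rather than the import-time empty dictionary.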
Answered By - Roland Smith