Issue
In a single Jupyter Notebook cell I run only the following code, which loads a 1 GB file inside a function and returns the result right after the function definition:
import pickle

def fun():
    with open('./data/input/train_test_clevr_pkls/train_clevr_1000.pkl', 'rb') as handle:
        return pickle.load(handle)

fun()
The Jupyter Notebook then prints the output of the function. Looking at the memory consumed after I run the cell, it sits at 1 GB as expected. However, if I run the same cell multiple times, the memory footprint grows by 1 GB each time until my entire RAM is consumed, at which point Windows falls back to page-file swapping, which destroys my application's performance. I already tried gc.collect() to free the memory, but to no avail.
I saw similar questions being asked, but found no answer to my problem. The memory is NOT being reused internally; it only grows!
Solution
The reason you are seeing this is that Jupyter keeps a reference to every cell's output in a dict named Out.
MRE (all this in one cell)
from itertools import product

def foo():
    return [*product(range(5), repeat=5)]

[foo() for _ in range(5)]
When you run this in a cell, Jupyter saves the result to the Out dict.
Partial Contents of Out
{1: [[(0, 0, 0, 0, 0),
(0, 0, 0, 0, 1),
(0, 0, 0, 0, 2),
(0, 0, 0, 0, 3),
(0, 0, 0, 0, 4),
(0, 0, 0, 1, 0),
(0, 0, 0, 1, 1),
(0, 0, 0, 1, 2),
(0, 0, 0, 1, 3),
(0, 0, 0, 1, 4),
(0, 0, 0, 2, 0),
(0, 0, 0, 2, 1),...}
When you run the code block again, the new result is stored under a new key, 3, in Out. A new key-value pair is added to the Out dict every time you run the cell, and each one keeps its value alive.
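If you need to reclaim that memory without restarting the kernel, a minimal sketch (assuming an interactive IPython/Jupyter session) is to flush the output cache before collecting garbage. Clearing Out by hand is usually not enough, because the underscore variables _, __, ___ and _1, _2, ... hold the same references; the %reset magic with the out target flushes all of them:
import gc

print(list(Out.keys()))   # e.g. [1, 2] -- one cached result per run of the cell

%reset -f out             # flush the output cache (Out, _, __, ___, _1, _2, ...) without prompting
gc.collect()              # the collector can now actually reclaim the large objects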
Now, the reason this does not happen when you use print(...):
from itertools import product

def foo():
    return [*product(range(5), repeat=5)]

print([foo() for _ in range(5)])
Jupyter does not save the result to the Out dict in that case. No matter how many times you run the cell, the Out dict will always be {}.
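Another way to keep large results out of the cache altogether (a sketch of the same MRE, under the same assumptions): bind the result to a name instead of letting the cell return it. The cell then has no output value, so nothing is added to Out, and re-running the cell simply rebinds the name so the previous list can be garbage collected. Ending the expression with a semicolon should behave similarly, since it suppresses the cell's output.
from itertools import product

def foo():
    return [*product(range(5), repeat=5)]

# No output value is produced, so Out stays empty; re-running the cell
# rebinds `result` and the old list becomes eligible for collection.
result = [foo() for _ in range(5)]

# Alternatively, suppress the output with a trailing semicolon:
# [foo() for _ in range(5)];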
Answered By - python_user