Issue
Problem: I am working on clustering data and found some strange behavior in my Jupyter notebooks. All seeds are fixed. Executing some or all of the code multiple times produces stable results. Restarting the kernel, however, changes the results. Those new results are then stable again for as long as the kernel stays active.
import numpy as np
import random, os

# Fix seeds
def fix_seeds(seed=1234):
    random.seed(seed)
    np.random.seed(seed)

# In case the modules do something on import
fix_seeds()

# Other imports (only depending on random and/or numpy)
So my question: where is this randomness introduced, and how can I fix it?
Solution
Apparently, Python >= 3.3 uses a random hash seed by default to protect against hash-collision attacks, so every new interpreter process hashes strings differently. Fixing that seed (e.g. by running PYTHONHASHSEED=0 python3 <file>.py) solves my problem. The same goes for the kernels: each one generates a hash seed on startup.
Source: https://docs.python.org/3/using/cmdline.html#envvar-PYTHONHASHSEED
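To see the effect in isolation, here is a minimal sketch that starts fresh interpreters with and without a fixed PYTHONHASHSEED and compares the hash of the same string. The helper name hash_in_fresh_interpreter and the test string 'cluster' are made up for illustration; the only assumption is that the hash seed is read once at interpreter startup, which is why the variable is passed through the child's environment rather than set in the running process.

```python
import os
import subprocess
import sys

# Child program: print the hash of a fixed string.
# str hashes are salted with the per-process hash seed.
CHILD = "print(hash('cluster'))"


def hash_in_fresh_interpreter(hash_seed=None):
    """Run CHILD in a new Python process and return what it printed.

    hash_seed=None leaves PYTHONHASHSEED unset, so the child picks
    a random hash seed (the default behavior).
    """
    env = dict(os.environ)
    env.pop("PYTHONHASHSEED", None)  # do not inherit a seed from the parent
    if hash_seed is not None:
        env["PYTHONHASHSEED"] = str(hash_seed)
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Same fixed seed in two separate processes: identical hashes.
fixed_a = hash_in_fresh_interpreter(0)
fixed_b = hash_in_fresh_interpreter(0)
print(fixed_a == fixed_b)  # True

# No fixed seed: each process is very likely to report a different hash.
random_runs = {hash_in_fresh_interpreter() for _ in range(3)}
print(len(random_runs))
```

Anything whose behavior depends on string hashes, such as the iteration order of a set of strings, can therefore differ between kernel starts even when random and numpy are seeded. Setting os.environ["PYTHONHASHSEED"] inside an already running kernel should not help, because by then the seed has already been chosen.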
Answered By - MyNameIsFu