Python numpy set random seed, thread-safe


I am trying to seed each call of a function that does some NumPy work, and I will be running that function in parallel. The NumPy documentation (https://numpy.org/doc/stable/reference/random/parallel.html) says to use SeedSequence or default_rng, and not random.seed or random.RandomState as some older answers suggest, because those are not thread-safe. However, the exact code from the documentation does not work for me, even when I run it sequentially.

from numpy.random import default_rng, normal

def worker(root_seed, worker_id):
    rng = default_rng([worker_id, root_seed])
    print(normal())

root_seed = 0x8c3c010cb4754c905776bdac5ee7501
results = [worker(root_seed, worker_id) for worker_id in range(5)]

Running it twice I get different results. Why?

Answer by simeonovich (best answer)

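The normal() call inside worker is the module-level numpy.random.normal, which draws from NumPy's global generator initialized at startup, not from the rng you created on the line above it. Your code never seeds that global generator, so the output changes on every run.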
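A quick way to check this is to snapshot the global state before and after building a Generator. This is only a minimal sketch: the seed 12345 and the variable names are made up for illustration, and it assumes np.random.get_state() returns its default tuple form.

```python
import numpy as np
from numpy.random import default_rng

# Snapshot the global legacy state that module-level functions such as
# np.random.normal() draw from.
before = np.random.get_state()

# Build and use a local Generator (12345 is an arbitrary, hypothetical seed).
rng = default_rng(12345)
rng.normal()

after = np.random.get_state()

# The global state is identical: the local Generator never touched it.
global_untouched = (
    before[0] == after[0]
    and np.array_equal(before[1], after[1])
    and before[2:] == after[2:]
)

# Two Generators built from the same seed produce the same first draw.
same_seed_same_draw = default_rng(12345).normal() == default_rng(12345).normal()

print(global_untouched, same_seed_same_draw)
```

Both flags should come out true, which is consistent with the global functions and a local Generator being entirely separate streams.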

For reproducibility, use the Generator object returned by default_rng and call its methods (here, rng.normal()). Simply constructing a generator does not set the random state globally.

from numpy.random import default_rng

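def worker(root_seed, worker_id):
    # Seed a Generator specific to this worker from (worker_id, root_seed).
    rng = default_rng([worker_id, root_seed])
    # Draw from this Generator, not from the module-level numpy.random.normal.
    value = rng.normal()
    print(value)
    return value  # returned so that `results` holds the draws instead of None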

root_seed = 0x8c3c010cb4754c905776bdac5ee7501
results = [worker(root_seed, worker_id) for worker_id in range(5)]
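Since the question is ultimately about running in parallel, here is a rough sketch of the same idea with threads, using the SeedSequence.spawn approach that the linked documentation page also describes. The run helper, the thread pool, and n_workers are assumptions made for illustration, not part of the original code.

```python
from concurrent.futures import ThreadPoolExecutor

from numpy.random import SeedSequence, default_rng


def worker(child_seed):
    # Each task builds its own Generator, so no random state is shared
    # between threads.
    rng = default_rng(child_seed)
    return rng.normal()


def run(root_seed, n_workers=5):
    # Hypothetical helper: spawn() derives one independent child seed per
    # worker from the single root seed.
    children = SeedSequence(root_seed).spawn(n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() yields results in input order, regardless of which thread
        # finishes first.
        return list(pool.map(worker, children))


root_seed = 0x8c3c010cb4754c905776bdac5ee7501
first = run(root_seed)
second = run(root_seed)
print(first == second)
```

Because every worker owns its Generator and the child seeds depend only on root_seed, two runs should give the same list even though thread scheduling may differ between them.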