In case it matters at all, I'm using Python 3.11.5 64-bit on a Windows 11 Pro desktop computer with NumPy 1.26.4.
To better understand what NumPy does behind the scenes when I request a np.random.Generator object from a given SeedSequence, I tried to reconstruct in pure Python what happens when a SeedSequence is initialized from a given entropy value.
Based on the source code for SeedSequence found here, my understanding of how uint32 overflow works, and the fact that (on my machine at least) np.dtype(np.uint32).itemsize is 4, so that XSHIFT, defined as np.dtype(np.uint32).itemsize * 8 // 2, is 16, I wrote the following code:
seed = int(input('Please enter a seed: '))
Entropy = seed
Spawn_key = ()
Pool_size = 8
N_children_spawned = 0
Pool = [0 for _ in range(Pool_size)]
Assembled_entropy = []
Ent = Entropy + 0
while Ent > 0:
    Assembled_entropy.append(Ent & 0xffffffff)
    Ent >>= 32
if not Assembled_entropy:
    Assembled_entropy = [0]
hash_const = 0x43b0d7e5
for i in range(Pool_size):
    if i < len(Assembled_entropy):
        Assembled_entropy[i] ^= hash_const
        hash_const *= 0x931e8875
        hash_const &= 0xffffffff
        Assembled_entropy[i] *= hash_const
        Assembled_entropy[i] &= 0xffffffff
        Assembled_entropy[i] ^= Assembled_entropy[i] >> 16
        Pool[i] = Assembled_entropy[i]
    else:
        value = hash_const
        hash_const *= 0x931e8875
        hash_const &= 0xffffffff
        value *= hash_const
        value &= 0xffffffff
        value ^= value >> 16
        Pool[i] = value
for i_src in range(Pool_size):
    for i_dst in range(Pool_size):
        if i_src != i_dst:
            Pool[i_src] ^= hash_const
            hash_const *= 0x931e8875
            hash_const &= 0xffffffff
            Pool[i_src] *= hash_const
            Pool[i_src] &= 0xffffffff
            Pool[i_src] ^= Pool[i_src] >> 16
            x = (0xca01f9dd * Pool[i_dst]) & 0xffffffff
            y = (0x4973f715 * Pool[i_src]) & 0xffffffff
            Pool[i_dst] = x - y
            Pool[i_dst] &= 0xffffffff
            Pool[i_dst] ^= Pool[i_dst] >> 16
for i_src in range(Pool_size, len(Assembled_entropy)):
    for i_dst in range(Pool_size):
        Assembled_entropy[i_src] ^= hash_const
        hash_const *= 0x931e8875
        hash_const &= 0xffffffff
        Assembled_entropy[i_src] *= hash_const
        Assembled_entropy[i_src] &= 0xffffffff
        Assembled_entropy[i_src] ^= Assembled_entropy[i_src] >> 16
        x = (0xca01f9dd * Pool[i_dst]) & 0xffffffff
        y = (0x4973f715 * Assembled_entropy[i_src]) & 0xffffffff
        Pool[i_dst] = x - y
        Pool[i_dst] &= 0xffffffff
        Pool[i_dst] ^= Pool[i_dst] >> 16
print(Pool)
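The uint32 assumptions behind the code above can be checked in isolation. The following is a minimal sketch using only the standard library: ctypes.c_uint32 stands in for np.uint32 when computing the item size (an assumption made so that NumPy is not required), and the helper names mul32 and sub32 are hypothetical, not part of NumPy.

```python
import ctypes

MASK32 = 0xffffffff

# Size in bytes of a 32-bit unsigned integer. This stands in for
# np.dtype(np.uint32).itemsize (assumption: both report 4 bytes).
ITEMSIZE = ctypes.sizeof(ctypes.c_uint32)
XSHIFT = ITEMSIZE * 8 // 2


def mul32(a, b):
    """Multiply and wrap like a uint32 (hypothetical helper)."""
    return (a * b) & MASK32


def sub32(a, b):
    """Subtract and wrap like a uint32 (hypothetical helper)."""
    return (a - b) & MASK32


print(XSHIFT)                     # 16
print(hex(mul32(0xffffffff, 2)))  # 0xfffffffe
print(hex(sub32(0, 1)))           # 0xffffffff
```

Masking with 0xffffffff after every multiplication and subtraction is what emulates the wraparound, because Python integers never overflow on their own.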
I have copied the shell outputs of some test runs below.
Please enter a seed: 0
[595626433, 3558985979, 200295889, 3864401631, 3155212474, 198111058, 4047350828, 373757291]
Please enter a seed: 1
[2396653877, 491222160, 2441066534, 3196981647, 1764919720, 3210735412, 1132315803, 1197535761]
Please enter a seed: 123456789
[2161290507, 266876805, 2694113549, 3306969538, 3218948428, 3543586554, 886289367, 3129292100]
Please enter a seed: 123456789123456789
[2628723507, 610487362, 209721652, 1960674985, 3519121735, 1259052354, 2097159984, 3934338599]
Please enter a seed: 123456789123456789123456789123456789
[2988668238, 798946769, 2484899198, 1005350017, 2633831484, 343737596, 1402961265, 3184558744]
Please enter a seed: 123456789123456789123456789123456789123456789123456789123456789123456789
[431881030, 3789410928, 218849910, 879851040, 1423068736, 85390627, 3721593143, 198649564]
Please enter a seed: 123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789
[702225118, 2293461530, 514808704, 2115883586, 3179647446, 3197133803, 3807436730, 1822195906]
from numpy.random import SeedSequence
seed = int(input('Please enter a seed: '))
seedseq = SeedSequence(entropy=seed, spawn_key=[], pool_size=8, n_children_spawned=0)
print([int(value) for value in seedseq.pool])
However, providing those same values to the above version of the program, which calls NumPy's SeedSequence directly, gives very different results:
Please enter a seed: 0
[2043904064, 467759482, 3940449851, 2747621207, 4006820188, 4161973813, 800317807, 2622167125]
Please enter a seed: 1
[476219752, 3923368624, 2653737542, 2876255837, 1861759290, 3300511046, 3253139541, 2224879358]
Please enter a seed: 123456789
[480462800, 1421661229, 2686834002, 3365909768, 3295673516, 1830753151, 1249963727, 3680881655]
Please enter a seed: 123456789123456789
[3112345096, 1618497203, 2864025213, 3262672577, 379697145, 163816190, 1265228116, 2568065655]
Please enter a seed: 123456789123456789123456789123456789
[2197723902, 2868273012, 1547285866, 2772382071, 2016971656, 1130152919, 897020445, 135618137]
Please enter a seed: 123456789123456789123456789123456789123456789123456789123456789123456789
[3230290517, 251217303, 1180998335, 454107561, 4150025399, 1840013050, 1216833737, 89665521]
Please enter a seed: 123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789
[902839167, 3446715647, 2106916613, 1578536987, 595141342, 3126308643, 400300642, 3659109886]
What is going on here?
UPDATE: based on @OskarHoffman's answer, I have fixed my code. It is included here in case anybody is interested.
seed = int(input('Please enter a seed: '))
Entropy = seed
Spawn_key = ()
Pool_size = 8
N_children_spawned = 0
Pool = [0 for _ in range(Pool_size)]
# Split the seed into 32-bit words, least significant word first.
Assembled_entropy = []
Ent = Entropy + 0
while Ent > 0:
    Assembled_entropy.append(Ent & 0xffffffff)
    Ent >>= 32
if not Assembled_entropy:
    Assembled_entropy = [0]
hash_const = 0x43b0d7e5
# Fill the pool by hashing each entropy word, or 0 once the entropy runs out.
for i in range(Pool_size):
    if i < len(Assembled_entropy):
        temp = Assembled_entropy[i] ^ hash_const
        hash_const *= 0x931e8875
        hash_const &= 0xffffffff
        temp *= hash_const
        temp &= 0xffffffff
        temp ^= temp >> 16
        Pool[i] = temp
    else:
        value = hash_const
        hash_const *= 0x931e8875
        hash_const &= 0xffffffff
        value *= hash_const
        value &= 0xffffffff
        value ^= value >> 16
        Pool[i] = value
# Mix every pool word into every other pool word.
for i_src in range(Pool_size):
    for i_dst in range(Pool_size):
        if i_src != i_dst:
            # Hash a copy of Pool[i_src]; Pool[i_src] itself stays unchanged.
            temp = Pool[i_src] ^ hash_const
            hash_const *= 0x931e8875
            hash_const &= 0xffffffff
            temp *= hash_const
            temp &= 0xffffffff
            temp ^= temp >> 16
            x = (0xca01f9dd * Pool[i_dst]) & 0xffffffff
            y = (0x4973f715 * temp) & 0xffffffff
            Pool[i_dst] = x - y
            Pool[i_dst] &= 0xffffffff
            Pool[i_dst] ^= Pool[i_dst] >> 16
# Mix any entropy words beyond the pool size into every pool word.
for i_src in range(Pool_size, len(Assembled_entropy)):
    for i_dst in range(Pool_size):
        temp = Assembled_entropy[i_src] ^ hash_const
        hash_const *= 0x931e8875
        hash_const &= 0xffffffff
        temp *= hash_const
        temp &= 0xffffffff
        temp ^= temp >> 16
        x = (0xca01f9dd * Pool[i_dst]) & 0xffffffff
        y = (0x4973f715 * temp) & 0xffffffff
        Pool[i_dst] = x - y
        Pool[i_dst] &= 0xffffffff
        Pool[i_dst] ^= Pool[i_dst] >> 16
print(Pool)
The difference is in your second for-loop implementing the hashmix() function. You modify your Pool list at position i_src to calculate the value for y. The NumPy implementation does not. It just copies the value Pool[i_src] (by using it as an argument for calling the hashmix function) and modifies that copy (discarding it afterwards).
So after modifying that for-loop to compute y from a temporary copy of Pool[i_src], rather than changing Pool[i_src] in place, I get the same results as the NumPy implementation.
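One way to make the copy semantics explicit is to write hashmix as a function, so that the pool entry is passed by value and only the running hash constant is updated. The sketch below restructures the corrected code from the question in that way. It assumes an empty spawn_key and a non-negative integer seed, keeps the running constant in a one-element list called state, and uses helper and constant names (mix, int_to_words, seed_pool, INIT_A and so on) chosen for illustration; it has not been checked against NumPy here.

```python
MASK32 = 0xffffffff
XSHIFT = 16
INIT_A = 0x43b0d7e5      # starting value of the running hash constant
MULT_A = 0x931e8875      # multiplier applied to the hash constant
MIX_MULT_L = 0xca01f9dd  # multiplier for the destination word
MIX_MULT_R = 0x4973f715  # multiplier for the hashed source word


def hashmix(value, state):
    """Return a hashed copy of value; only state[0] is updated."""
    value ^= state[0]
    state[0] = (state[0] * MULT_A) & MASK32
    value = (value * state[0]) & MASK32
    value ^= value >> XSHIFT
    return value


def mix(x, y):
    """Combine two 32-bit words with uint32 wraparound."""
    result = (MIX_MULT_L * x - MIX_MULT_R * y) & MASK32
    result ^= result >> XSHIFT
    return result


def int_to_words(n):
    """Split a non-negative int into 32-bit words, least significant first."""
    words = []
    while n > 0:
        words.append(n & MASK32)
        n >>= 32
    return words or [0]


def seed_pool(entropy, pool_size=8):
    """Build the pool for an integer seed (assumes an empty spawn key)."""
    words = int_to_words(entropy)
    state = [INIT_A]
    pool = []
    # Fill the pool from the entropy words, hashing 0 once they run out.
    for i in range(pool_size):
        word = words[i] if i < len(words) else 0
        pool.append(hashmix(word, state))
    # Mix every pool word into every other one; pool[i_src] is only read.
    for i_src in range(pool_size):
        for i_dst in range(pool_size):
            if i_src != i_dst:
                pool[i_dst] = mix(pool[i_dst], hashmix(pool[i_src], state))
    # Mix in any entropy words beyond the pool size.
    for i_src in range(pool_size, len(words)):
        for i_dst in range(pool_size):
            pool[i_dst] = mix(pool[i_dst], hashmix(words[i_src], state))
    return pool


print(seed_pool(0))
```

Because Python passes the integer by value, hashmix cannot change pool[i_src] no matter what it does internally, which rules out the in-place modification described above. Comparing seed_pool(n) with [int(v) for v in SeedSequence(n).pool] for a few seeds would be the natural check.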