I'm exploring the random walk algorithm on an infinite line, and I'm looking for a way to make it as fast as possible. Here is the code:
```python
import random
from collections import Counter

def run(initial_pos, iterations, trials):
    final_pos = []
    for i in range(0, trials):
        pos = initial_pos
        for j in range(0, iterations):
            if random.choice(["left", "right"]) == "left":
                pos -= 1
            else:
                pos += 1
        final_pos.append(pos)
    return Counter(final_pos)
```
The variable `iterations` indicates the number of steps in a single walk, while `trials` indicates the number of walks. The runtime is satisfactory for `trials` and `iterations` equal to 10^4. However, increasing them to 10^5 requires a long waiting time.
The Low-hanging Fruit
First of all, let's get rid of those magic numbers and strings, and define some named constants:
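A minimal sketch of such constants (the names are illustrative; the original post's exact identifiers are not shown here):

```python
# Named constants replacing the magic numbers: the step taken in each direction.
LEFT = -1
RIGHT = 1
```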
Now, we can rewrite the inner loop body to use those constants:
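A sketch of the rewritten inner loop body, assuming the `LEFT`/`RIGHT` constants above (the surrounding setup is only there to make the snippet runnable):

```python
import random

LEFT = -1
RIGHT = 1

pos = 0

# Inner loop body using the named constants, but still choosing
# between the original magic strings:
if random.choice(["left", "right"]) == "left":
    pos += LEFT
else:
    pos += RIGHT
```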
Looking at that, two issues become apparent: the choice is still made between the magic strings, and the condition then has to translate the chosen string into the corresponding numeric step.

Let's eliminate both of those issues by just using the integer constants `RIGHT` and `LEFT`. Let's also store the result of `random.choice` into a temporary to better see the next issue.

Now, we can see that `current_direction` can be either `RIGHT` or `LEFT`. If it's `LEFT`, then we add `LEFT` to `pos`; otherwise (the only other choice is `RIGHT`), we add `RIGHT` to `pos`. In other words, the same thing happens in both branches, so let's get rid of the condition:
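A sketch of that simplification, assuming the integer constants above (`LEFT = -1`, `RIGHT = 1`):

```python
import random

LEFT = -1
RIGHT = 1

pos = 0

# Before: store the chosen direction in a temporary, then branch on it.
current_direction = random.choice([LEFT, RIGHT])
if current_direction == LEFT:
    pos += LEFT    # here current_direction == LEFT
else:
    pos += RIGHT   # here current_direction == RIGHT

# After: both branches just add current_direction itself, so the
# condition can be dropped entirely.
pos += random.choice([LEFT, RIGHT])
```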
Here is our new starting point:
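Reassembled, the cleaned-up function might look like this sketch (it deliberately keeps `range(0, trials)` and the inline list, which are addressed next):

```python
import random
from collections import Counter

LEFT = -1
RIGHT = 1

def run(initial_pos, iterations, trials):
    final_pos = []
    for i in range(0, trials):
        pos = initial_pos
        for j in range(0, iterations):
            # Each step is simply a random choice of -1 or +1.
            pos += random.choice([LEFT, RIGHT])
        final_pos.append(pos)
    return Counter(final_pos)
```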
Now, if we look at the code critically (and having the low-level details of the byte-code and interpreter in mind -- the `dis` module helps here) we can see some other deficiencies:

- `range(0, trials)` is functionally identical to `range(trials)`. However, that redundant `0` means an extra opcode to push the constant onto the stack. We don't want to waste time on useless things, right?
- The inner loop rebuilds the list `[LEFT, RIGHT]` on every pass. Python doesn't optimize such things, so that means we're creating the same list `iterations * trials` times. Even though it's just 3 opcodes, let's do it once, and then reuse the same list. We might as well make it a tuple, it doesn't need to change.
- `collections.Counter` supports updates. So let's avoid the intermediate `final_pos` list, and update the `Counter` directly.
- Each use of `random.choice` involves two opcodes (first get `random`, then find `choice`). We can cache the actual functions to call in local variables to avoid the extra step.

Those initial changes only mean the code runs in about 90%-95% of the time needed by the original, but they give us a solid basis for further optimizations.
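Putting those micro-optimizations together, a sketch (the name `run_v1` is assumed to match the timings quoted in the text):

```python
import random
from collections import Counter

LEFT = -1
RIGHT = 1
DIRECTIONS = (LEFT, RIGHT)  # built once, reused; a tuple since it never changes

def run_v1(initial_pos, iterations, trials):
    counter = Counter()
    choice = random.choice  # cache the bound method in a local variable
    for i in range(trials):
        pos = initial_pos
        for j in range(iterations):
            pos += choice(DIRECTIONS)
        counter[pos] += 1  # update the Counter directly, no intermediate list
    return counter
```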
Eliminating Inner Loop — Attempt 1
Right now our inner loop basically makes a sum of a list of random choices, offset by `initial_pos`. Let's use `random.choices` instead to generate `iterations` choices in one call, and the builtin `sum` to add them up. This requires roughly 25% of the time required by `run_v1`.

Eliminating Inner Loop — Attempt 2
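For reference, the `random.choices` version from the previous attempt might be sketched as (the name `run_v2` is assumed from the timings in the text):

```python
import random
from collections import Counter

DIRECTIONS = (-1, 1)

def run_v2(initial_pos, iterations, trials):
    counter = Counter()
    choices = random.choices  # cache the bound method in a local variable
    for i in range(trials):
        # One call generates all the steps of a walk; sum adds them up.
        counter[initial_pos + sum(choices(DIRECTIONS, k=iterations))] += 1
    return counter
```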
The main problem with the previous version is that it ends up allocating a lot of intermediate Python objects (one for each choice). A more efficient way of using the memory would be to use numpy arrays and the various functions the library provides. For example, we could use `numpy.random.Generator.choice` and the `sum` method of numpy arrays. This version requires around 20% of the time required by `run_v2`.

More Efficient Choices
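For reference, the numpy-based version just described might be sketched as follows (using `numpy.random.default_rng` for the generator is an assumption; the name `run_v3` matches the timings in the text):

```python
import numpy as np
from collections import Counter

def run_v3(initial_pos, iterations, trials):
    rng = np.random.default_rng()
    counter = Counter()
    for i in range(trials):
        # Draw all steps of one walk at once as a numpy array, then sum them.
        steps = rng.choice((-1, 1), size=iterations)
        counter[initial_pos + steps.sum()] += 1
    return counter
```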
Currently, the random choices require an indirect lookup to map into our desired set of `(1, -1)`. Let's observe that `iterations = right_count + left_count`. We already know the value of `iterations`, so as long as we know one of `right_count` or `left_count`, we can calculate the other. Since `pos_offset = right_count - left_count`, it follows that `pos_offset = 2 * right_count - iterations`, and we can implement it as such. This time it requires about 60%-90% of the time required by `run_v3`.

Further Variations
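For reference, the counting-based version just described might be sketched as follows (drawing the right steps as random 0/1 bits and the name `run_v4` are assumptions):

```python
import numpy as np
from collections import Counter

def run_v4(initial_pos, iterations, trials):
    rng = np.random.default_rng()
    counter = Counter()
    for i in range(trials):
        # Each of the `iterations` bits is 1 for a right step, 0 for a left step,
        # so no lookup into (1, -1) is needed.
        right_count = rng.integers(0, 2, size=iterations).sum()
        counter[initial_pos + 2 * right_count - iterations] += 1
    return counter
```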
- Using `np.random.default_rng().integers` instead.
- Using more memory to eliminate the outer loop.
- As suggested by Jérôme Richard, sampling the binomial distribution.
- Applying math and using `np.random.binomial`. This runs in 1/4 second for 10^6 iterations and trials.
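A sketch of the binomial idea (function name and generator setup are assumptions): each walk's number of right steps follows a Binomial(`iterations`, 0.5) distribution, so all trials can be drawn in one vectorized call, with no per-trial Python loop at all.

```python
import numpy as np
from collections import Counter

def run_binomial(initial_pos, iterations, trials):
    rng = np.random.default_rng()
    # One draw per trial: how many of the `iterations` steps went right.
    right_counts = rng.binomial(iterations, 0.5, size=trials)
    # Map right-step counts to final positions, vectorized over all trials.
    final_pos = initial_pos + 2 * right_counts - iterations
    values, counts = np.unique(final_pos, return_counts=True)
    return Counter(dict(zip(values.tolist(), counts.tolist())))
```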