To the best of my knowledge, for CPU-intensive work, multithreading in Python should perform similarly to linear execution of the same code. So I tested it using this simple script.
```python
import datetime
import threading
import time

def test(args):
    i, wait = args
    for _ in range(i):
        # a = 0
        # while a <= 1000000:
        #     a += 1
        t = datetime.datetime.now()
        while datetime.datetime.now() <= t + datetime.timedelta(seconds=wait):
            pass

if __name__ == "__main__":
    iteration = 50000
    wait = 0.001
    print(f'Running {iteration} iteration, wait {wait}')
    t1 = threading.Thread(target=test, args=((iteration, wait),))
    t2 = threading.Thread(target=test, args=((iteration, wait),))
    start = time.time()
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    multi = time.time() - start
    start = time.time()
    test((iteration * 2, wait))
    print('multi and linear time:', multi, time.time() - start)
```
The result changes depending on the parameters `iteration` and `wait`. I thought the two timings should be similar regardless of these parameters.

Now, if I comment and uncomment the code like this:

```python
a = 0
while a <= 1000000:
    a += 1
# t = datetime.datetime.now()
# while datetime.datetime.now() <= t + datetime.timedelta(seconds=wait):
#     pass
```

the results are much more similar.

Can someone please explain these results?


So, the problem you have is that for a small wait time, the threaded code takes almost 50% longer than the linear code, while for a larger wait time, the threaded code performs the same as the linear code.
This is almost certainly due to implementation details in the way the Python runtime locks and unlocks resources around the OS calls that fetch the current time: that is what dominates when you are working with the 0.001 s delay.
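To get a feel for how much of that 0.001 s budget the time call itself consumes, one can measure the average cost of a single `datetime.datetime.now()` call. A rough sketch (the exact numbers depend heavily on your machine and Python version):

```python
import datetime
import timeit

# Average the cost of one datetime.datetime.now() call over many calls,
# since a single call is far too fast to time on its own.
calls = 100_000
per_call = timeit.timeit(datetime.datetime.now, number=calls) / calls
print(f"~{per_call * 1e6:.2f} microseconds per datetime.datetime.now() call")
```

On typical hardware this is a small fraction of a microsecond, so the busy-wait loop performs thousands of such calls per 0.001 s iteration.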
I'd say that the time it takes for Python to set up and tear down the `while` loop is more significant than repeating many calls to `datetime.datetime.now()` once the setup is complete: setting up the while loop takes a lot of time compared with the 0.001 s wait in the loop, but that setup time is the same (close to 0.0005 seconds, if we go by the 50% extra time) for a 1 s delay.

Anyway, it comes down to minor details, and it serves to illustrate that multi-threading in Python indeed has a lot of peculiarities; one is best served by keeping in mind that for CPU-intensive tasks, Python multi-threading should simply be avoided.
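The pure-Python increment variant makes the GIL effect easy to reproduce in isolation. A minimal sketch (the loop size is an arbitrary value I picked so each run takes a fraction of a second) comparing two threads against one linear run:

```python
import threading
import time

def count(n):
    # Pure-Python busy work: bytecode increments never release the GIL
    # for long, so two threads doing this effectively run serialized.
    a = 0
    while a < n:
        a += 1

N = 2_000_000  # arbitrary workload size for the demo

start = time.perf_counter()
t1 = threading.Thread(target=count, args=(N,))
t2 = threading.Thread(target=count, args=(N,))
t1.start()
t2.start()
t1.join()
t2.join()
threaded = time.perf_counter() - start

start = time.perf_counter()
count(2 * N)  # same total work, done linearly
linear = time.perf_counter() - start

print(f"threaded: {threaded:.3f}s  linear: {linear:.3f}s")
```

With a CPU-bound body like this, the threaded run typically takes about as long as (or longer than) the linear run, despite using two threads.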
But in Python 3.12 (now in alpha), PEP 684, which allows an independent GIL in separate sub-interpreters, may make it possible to overcome some of these limitations (by running a thread in each independent interpreter). Wait for it.
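As a very rough sketch of what driving a sub-interpreter looks like today: this uses the private `_xxsubinterpreters` module, an internal CPython API whose name and signatures may change between versions, so the import is guarded.

```python
# Sketch only: _xxsubinterpreters is a private CPython module (3.12-era);
# it is not a stable public API and may be absent or renamed in your build.
try:
    import _xxsubinterpreters as interpreters
except ImportError:
    interpreters = None

if interpreters is not None:
    # create() returns an interpreter id; run_string() executes source in it.
    interp = interpreters.create()
    interpreters.run_string(interp, "x = sum(range(1000))")
    interpreters.destroy(interp)
    msg = "sub-interpreter ran"
else:
    msg = "sub-interpreter API not available in this build"
print(msg)
```

This is exactly the kind of rough edge a wrapper package can smooth over.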
(Shameless self-promotion): the API for using sub-interpreters will be a bit rough at first, but I am working on a third-party package to make it as easy as working with threads: extrainterpreters.
Running the code
Back here - I could not resist trying this code and including the extrainterpreters measurements. It turns out it only gets wilder.
So, first: I was not in the mood to wait 600 seconds for a test run, so I cut back on the number of iterations in your script. Then I refactored it a bit so that no edits were needed to run the tests with different parameters.
Second, I just added the modality using sub-interpreters to the mix.
As for the results: (I) I got the opposite of your results, with the multi-threaded run actually being faster than linear in the same scenario where it was 50% slower in yours. That should be due either to (1) changes in the run conditions, given the smaller number of iterations and the ratio of the fixed 0.001 s delay to raw CPU speed on my machine vs. yours (mine is a 2018-era dual-core i7, a bit dated), or (2) improvements in the Python runtime in v. 3.12 alpha vs. your version (I don't think you said which Python version you have, but these timings should have changed a lot in 3.11, and some more in 3.12).
(II) The improvements from using sub-interpreters varied a lot, with the modality using 10 iterations and just doing increments in Python code taking exactly twice as long in sub-interpreter mode (possibly due to some resource shared across interpreters in the setup of the function call itself - it may even be a problem in my code). The same modality with 1000 iterations (down from your 50000), on the other hand, showed a significant gain with the sub-interpreters modality.
Here are the results, followed by the script:
The script: to use it while sub-interpreters are not generally available, just comment out the necessary lines.
It will require a special branch from Eric Snow where the code for PEPs 684 and 554 is in development. It should work in the Python 3.12 beta when it's out (in 3 more weeks) - but there is a chance PEP 554, which provides the Python-side code to run sub-interpreters, will not be approved. Nonetheless, once you get a supported Python runtime, you can install `extrainterpreters` from https://github.com/jsbueno/extrainterpreters (with `pip install git+https://github.com/jsbueno/extrainterpreters.git`).