I want to make an app, but it has to calculate a lot of things, so I want to use multiprocessing to speed it up, and I can't make it work. What I did get to somewhat work was copying all the data into every process, but that took too much memory...
What I did:
from concurrent.futures import ProcessPoolExecutor
import pandas as pd
import time

variables = {}  # tried to save the values here for accessing later

def test(n):
    distances = variables['distances']
    position = variables['position']
    time.sleep(350)  # simulates the heavy computation
    return position

def main():
    variables['distances'] = pd.read_csv('very_large_file.csv')  # imagine a file with a lot of data, like 2 GB
    variables['position'] = pd.read_csv('another_very_large_file.csv')  # another ~2 GB file
    with ProcessPoolExecutor(5) as executor:
        for result in executor.map(test, range(5)):
            print(result)

if __name__ == '__main__':
    main()
This code fails with a KeyError saying the key is not in variables (the dict is empty inside the workers).
If I instead load the files inside each worker it takes way too much RAM and my PC can't handle it, so I want to know how the processes can access the already-loaded object instead of each creating a new copy.