How to mpi send and receive large arrays with dtype=object in python?

31 views Asked by At

The following is a toy code which demonstrates my problem. I run it as

mpirun -np 2 python3 main.py

I want to send large arrays (dtype=object) across CPUs. The code works for smaller arrays but fails with larger ones. The error message is displayed below the code.

import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
N = comm.Get_size()

if rank == 0:
    print(N)
    
    arr = np.zeros(2,dtype=object)
    
    #This works
    #a=1.5*np.ones(50000000,dtype='float64')
    #b=2.5*np.ones(100000000,dtype='float64')

    #This doesn't work
    a=1.5*np.ones(500000000,dtype='float64')
    b=2.5*np.ones(1000000000,dtype='float64')

    arr[0] = a
    arr[1] = b

    comm.send(arr,dest=1)
        
else:
    receive = comm.recv(source=0)

Error message:

2
Traceback (most recent call last):
  File "/home/shikhar/Documents/Examples/mpi/main.py", line 19, in <module>
    comm.send(arr,dest=1)
  File "mpi4py/MPI/Comm.pyx", line 1406, in mpi4py.MPI.Comm.send
  File "mpi4py/MPI/msgpickle.pxi", line 211, in mpi4py.MPI.PyMPI_send
  File "mpi4py/MPI/msgpickle.pxi", line 147, in mpi4py.MPI.pickle_dump
  File "mpi4py/MPI/msgbuffer.pxi", line 50, in mpi4py.MPI.downcast
OverflowError: integer 12000000297 does not fit in 'int'

How to rectify the above problem or if people can suggest alternative strategies for such situations it would be greatly appreciated?

1

There are 1 answers

0
DrMittal On

Following this page and this (suggested by @9769953), the code works. We need to use mpi4py.util.pkl5.

import numpy as np
from mpi4py import MPI
from mpi4py.util import pkl5

comm = pkl5.Intracomm(MPI.COMM_WORLD)
rank = comm.Get_rank()
N = comm.Get_size()

if rank == 0:
    print(N)
    
    arr = np.zeros(2,dtype=object)

    a=1.5*np.ones(500000000,dtype='float64')
    b=2.5*np.ones(1000000000,dtype='float64')

    arr[0] = a
    arr[1] = b

    comm.send(arr,dest=1)
        
else:
    receive = comm.recv(source=0)