The problem
When defining different objects in Cython, the memoryviews return the same address. However, the underlying array itself is modified as expected when indexed into.
Background
I have a base class and a derived class written in Cython. I noticed that when I applied multiprocessing to the classes, the underlying buffers were altered in different processes, which was not intended. For pickling I wrote simple __reduce__ and __deepcopy__ methods that rebuild the original object. For the sake of clarity I reduced the complexity to the code below. Now my question is: why do the memoryviews return the same address? Additionally, why is the numpy array itself altered correctly even though the memoryview is the same?
#distutils: language=c++
import numpy as np
cimport numpy as np
cdef class Temp:
    cdef double[::1] inp
    def __init__(self, inp):
        print(f'id of inp = {id(inp)}')
        self.inp = inp
cdef np.ndarray x = np.ones(10)
cdef Temp a = Temp(x)
cdef Temp b = Temp(x)
cdef Temp c = Temp(x.copy())
b.inp[0] = -1
c.inp[2] = 10
print(f'id of a.inp = {id(a.inp)}\nid of b.inp = {id(b.inp)}\nid of c.inp = {id(c.inp)}')
print(f'id of a.inp.base = {id(a.inp.base)}\nid of b.inp.base = {id(b.inp.base)}\nid of c.inp.base = {id(c.inp.base)}')
print('a.inp.base',a.inp.base)
print('b.inp.base',b.inp.base) # expected to be the same as a
print('c.inp.base',c.inp.base) # expected to be different to a/b
Output:
id of inp = 139662709551872
id of inp = 139662709551872
id of inp = 139662709551952
id of a.inp = 139662450248672
id of b.inp = 139662450248672
id of c.inp = 139662450248672
id of a.inp.base = 139662709551872
id of b.inp.base = 139662709551872
id of c.inp.base = 139662709551952
a.inp.base [-1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
b.inp.base [-1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
c.inp.base [ 1. 1. 10. 1. 1. 1. 1. 1. 1. 1.]
What we call a typed memoryview isn't a single class: depending on the context (Cython code or pure Python code), it changes its identity under the hood.
So let's start with the attribute declaration in Temp. Here, double[::1] inp is of type __Pyx_memviewslice, which isn't a Python object.
What happens when we call id(self.inp)? Obviously, id is a pure-Python function, so a new temporary Python object (a memoryview) must be created from self.inp (only to be able to call id) and destroyed directly afterwards. The creation of the temporary Python object is done via __pyx_memoryview_fromslice.
Knowing that, it is easy to explain why the ids are equal: despite being different objects, the temporary memoryviews coincidentally have the same address (and thus the same id; that id is the address is an implementation detail of CPython), because CPython reuses the memory over and over again.
There are similar scenarios all over Python, for example with method objects, and even simpler ones.
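The id reuse described above can be reproduced in plain Python without Cython. The following is only a minimal sketch: the class A and the variable names are made up for illustration, and whether the two temporaries actually get the same id depends on CPython's allocator, so that result is likely but not guaranteed.

```python
class A:
    pass

# Temporaries: the first A() is released as soon as id() returns, before the
# second A() is created, so CPython may reuse the same memory (and thus id).
temporaries_equal = id(A()) == id(A())

# Overlapping lifetimes: both objects are alive at the same time,
# so their ids are guaranteed to differ.
a, b = A(), A()
overlapping_equal = id(a) == id(b)

print("temporaries share an id:", temporaries_equal)
print("live objects share an id:", overlapping_equal)
```

Only the second comparison is a reliable identity check, because both objects exist while their ids are compared.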
So in a nutshell: your expectation that the same id means the same object is wrong. This assumption only holds when the lifetimes of the objects overlap.
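If the goal is to find out whether two views refer to the same buffer, comparing the buffers rather than the ids of temporary wrappers is more robust. The sketch below is one possible way to do that from Python with NumPy, not something taken from the question's code: the names shared and copied are illustrative and mirror the roles of a/b (same array) and c (a copy).

```python
import numpy as np

x = np.ones(10)
shared = x[:]      # a second view onto x's buffer (like a.inp and b.inp)
copied = x.copy()  # an independent buffer (like c.inp)

# Compare the underlying memory instead of the ids of wrapper objects.
shares_with_view = np.shares_memory(x, shared)
shares_with_copy = np.shares_memory(x, copied)

# The address of the first element is another lifetime-independent check.
same_start_address = x.ctypes.data == shared.ctypes.data

# Writing through the view changes x but leaves the copy untouched.
shared[0] = -1

print(shares_with_view, shares_with_copy, same_start_address)
print(x[0], copied[0])
```

This matches the output in the question: a.inp.base and b.inp.base show the same modification, while c.inp.base is changed independently.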