Does the read method of io.BytesIO return a copy of the underlying bytes data?


I am aware that io.BytesIO() returns a binary stream object that uses an in-memory buffer. It also provides getbuffer(), which returns a readable and writable view (a memoryview object) over the contents of the buffer without copying them.

import io
obj = io.BytesIO(b'abcdefgh')
buf = obj.getbuffer()

Now, buf points to the underlying data, and slicing it (buf[:3]) returns another memoryview object, again without making a copy. So I want to know: if we call obj.read(3), does it also use the in-memory buffer directly, or does it make a copy? If it does use the in-memory buffer, what is the difference between obj.read and buf, and which one should be preferred for efficiently reading the data in chunks from very long bytes objects?

There is 1 answer

GIZ On

Simply put, BytesIO.read reads data from the in-memory buffer and returns it as a bytes object, so obj.read(3) gives you a copy of the bytes that were read. buf, however, is a memoryview object that views the underlying buffer directly and does not copy the data.
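As a minimal sketch of that distinction, reusing the obj and buf names from the question (the other variable names are only illustrative), the types returned by each approach can be compared like this:

```python
import io

obj = io.BytesIO(b'abcdefgh')

# read() returns a bytes object and advances the stream position.
chunk = obj.read(3)

# getbuffer() returns a memoryview; slicing it gives another memoryview
# over the same memory, without copying.
buf = obj.getbuffer()
view = buf[:3]

chunk_type = type(chunk).__name__   # 'bytes'
view_type = type(view).__name__     # 'memoryview'
same_content = (chunk == bytes(view))

# Release the views so the BytesIO object can be resized or closed again.
view.release()
buf.release()
```

Both hold the same three bytes, but only the memoryview stays tied to the buffer inside the BytesIO object.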

The difference between BytesIO.read and buf is that the bytes object returned by BytesIO.read is an independent copy, so nothing that later happens to the buffer affects it. If you change data through buf, on the other hand, you change the data in the buffer itself, and subsequent reads will see the modified data.
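A short sketch of this behaviour, again using the question's obj and buf (the names copy_of_head, after_write, and whole are only illustrative):

```python
import io

obj = io.BytesIO(b'abcdefgh')

# An independent copy of the first three bytes.
copy_of_head = obj.read(3)

# Write through the memoryview: this changes the BytesIO buffer in place.
buf = obj.getbuffer()
buf[0:3] = b'XYZ'
buf.release()  # while a view exists, the stream cannot be resized or closed

# The earlier copy is unchanged, but a fresh read sees the new contents.
obj.seek(0)
after_write = obj.read(3)
whole = obj.getvalue()
```

Here copy_of_head still holds the original bytes, while after_write and whole reflect the write made through buf.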

For reading the data in chunks, obj.read is usually the better choice: each chunk is a separate bytes object, which keeps the data clearly apart from the buffer, and the stream tracks the read position for you, which makes the buffer easier to manage. The price is one copy per chunk. On the other hand, if you want to modify the data in the buffer, or need to avoid copying altogether, buf is the better choice because it provides direct access to the underlying data.
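The two approaches to chunked reading might look roughly like the sketch below. The helper names chunks_by_read and chunks_by_view are hypothetical, not part of the io module, and the view-based helper assumes the caller finishes with each slice promptly.

```python
import io

def chunks_by_read(stream, size):
    """Yield successive bytes objects; each one is a copy of up to `size` bytes."""
    while True:
        chunk = stream.read(size)
        if not chunk:          # read() returns b'' at end of stream
            break
        yield chunk

def chunks_by_view(stream, size):
    """Yield memoryview slices over the stream's buffer, without copying.

    Starts at offset 0 regardless of the current stream position. While any
    of these views is alive, the BytesIO object cannot be resized or closed.
    """
    buf = stream.getbuffer()
    for start in range(0, len(buf), size):
        yield buf[start:start + size]

data = b'abcdefgh'
copied = list(chunks_by_read(io.BytesIO(data), 3))
# bytes(v) copies here only so the results can be compared easily.
viewed = [bytes(v) for v in chunks_by_view(io.BytesIO(data), 3)]
```

The read-based version is simpler and safe to hold on to; the view-based version avoids per-chunk copies but ties each chunk's lifetime to the buffer.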