I am using Pybind11/Nanobind to write Python bindings for my C++ libraries.
One of my C++ functions takes an argument of type std::istream&, e.g.:
std::string cPXGStreamReader::testReadStream(std::istream &stream)
{
    std::ostringstream contentStream;
    std::string line;
    while (std::getline(stream, line)) {
        contentStream << line << '\n'; // Append line to the content string
    }
    return contentStream.str(); // Convert contentStream to string and return
}
What kind of argument do I need to pass in Python which corresponds to this?
I have tried passing s where s is created:
s = open(r"test_file.pxgf", "rb")
# and
s = io.BytesIO(b"some initial binary data: \x00\x01")
to no avail. I get the error
TypeError: test_read_file(): incompatible function arguments. The following argument types are supported:
1. (self: pxgf.PXGStreamReader, arg0: std::basic_istream<char,std::char_traits<char> >) -> str
Invoked with: <pxgf.PXGStreamReader object at 0x000002986CF9C6B0>, <_io.BytesIO object at 0x000002986CF92250>
Did you forget to `#include <pybind11/stl.h>`? Or <pybind11/complex.h>,
<pybind11/functional.h>, <pybind11/chrono.h>, etc. Some automatic
Pybind11 doesn't provide support for stream arguments out of the box, so a custom implementation needs to be developed. We will need some kind of an adapter that will allow us to create an istream which reads from a Python object. There are two options that come to mind:
- Implement a custom std::streambuf and use that with std::istream (related Q&A).
- Implement a custom source for boost::iostreams::stream (which you can pass to your function without any modifications).
I'll focus on the latter option, and restrict the solution only to file-like objects (i.e. derived from io.IOBase) used for input.
Using Boost IOStreams
Source Implementation
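We need to create a class satisfying the "Source" model of Boost IOStreams (there is a handy tutorial in the documentation about this subject). Basically, that is a class with a char_type member typedef, a category member typedef convertible to boost::iostreams::source_tag, and a member function std::streamsize read(char_type* s, std::streamsize n) that returns the number of characters read, or -1 at end of input.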
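To make the read contract concrete, here is a minimal sketch that compiles without Boost or pybind11. It is a sketch under assumptions rather than a finished implementation: the names callback_source, read_fn and make_string_reader are hypothetical, and a std::function stands in for the Python object's read attribute. For Boost to accept the class as a Source, it would additionally need the category typedef mentioned in the comment.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <ios>
#include <string>
#include <utility>

// Hypothetical stand-in for the Python side: called with a byte count,
// returns up to that many bytes, and an empty string at end of input.
using read_fn = std::function<std::string(std::streamsize)>;

class callback_source {
public:
    typedef char char_type;
    // With Boost, the class would also declare:
    //   typedef boost::iostreams::source_tag category;

    // Store the callable once, so no lookup is needed on every read.
    explicit callback_source(read_fn read) : read_(std::move(read)) {}

    // Source contract: copy at most n bytes into s and return the number
    // of bytes copied, or -1 once the input is exhausted.
    std::streamsize read(char_type* s, std::streamsize n) {
        if (n <= 0) return 0;
        const std::string chunk = read_(n);
        if (chunk.empty()) return -1;
        const std::streamsize count =
            std::min(n, static_cast<std::streamsize>(chunk.size()));
        std::copy(chunk.begin(), chunk.begin() + count, s);
        return count;
    }

private:
    read_fn read_;
};

// Test double (assumption): serves `data` piece by piece, the way a
// Python file-like object would answer successive read(n) calls.
read_fn make_string_reader(std::string data) {
    std::size_t pos = 0;
    return [data, pos](std::streamsize n) mutable {
        std::string out = data.substr(pos, static_cast<std::size_t>(n));
        pos += out.size();
        return out;
    };
}
```

In the pybind11 version, the constructor would take the pybind11::object instead, and the callback body would call the cached read attribute and cast its result to std::string, as described in the next two subsections.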
Constructor
In the constructor, we should store a reference to the Python object (pybind11::object) used as a data source. We should also verify that the data source object is actually file-like.
Finally, we can cache the read attribute of the file-like object in another pybind11::object member variable, in order to avoid a lookup every time we want to call it.
read(...)
This function needs to read the requested number of bytes into the provided buffer, and signal End-Of-File when it's reached. The simplest approach is to cast the result of the Python file-like's read() method to std::string, and then copy its contents to the read buffer. It works for both binary and text I/O.
However, this involves an unnecessary copy. If you're using a recent enough C++ standard (C++17 or later), you could cast to std::string_view instead. Otherwise, a somewhat lower-level approach may be used, based on the implementation of pybind11::string: encode a str result to UTF-8 bytes if necessary, then get a pointer to the data and its length with PyBytes_AsStringAndSize and copy straight from there.
Sample Function Binding
Let's use a free-standing function modeled after your example to test this:
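A minimal sketch of such a function, assuming it simply mirrors testReadStream from the question (the name test_read_stream is hypothetical):

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads the stream line by line and returns the content, terminating
// every line (including the last one) with '\n'.
std::string test_read_stream(std::istream& stream) {
    std::ostringstream content;
    std::string line;
    while (std::getline(stream, line)) {
        content << line << '\n';
    }
    return content.str();
}
```

Since it only depends on std::istream, it can be checked with a std::istringstream before any Python adapter is involved.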
Creating a Pybind11 typecaster for streams looks like opening yet another can of worms, so let's skip that. Instead, let's use a rather simple lambda when defining the function bindings. It simply needs to construct a boost::iostreams::stream using our source, and call the wrapped function.
Complete Source Code
Here it's all together:
Running it produces the following output:
Note: Due to buffering inherent in istream, it will usually read from the Python object in chunks (e.g. here it asks for 4096 bytes every time). If you make partial reads of the stream (e.g. just get a single line), you will need to re-adjust the file-like's read position explicitly to reflect the number of bytes actually consumed.
Not Using Boost
Pybind11 source code contains an output streambuf implementation that writes to Python streams. This can be a decent inspiration to get started. I managed to find an existing streambuf implementation for input in BlueBrain/nmodl repository on GitHub. You would use it in the following manner (probably as part of a wrapper lambda used to dispatch your C++ function):
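As an illustration of the general pattern, here is a self-contained sketch of an input streambuf. To be clear, this is not the nmodl class: callback_streambuf, read_fn and read_first_line are hypothetical names, and a std::function stands in for the Python object's read method so that the example builds with the standard library alone.

```cpp
#include <functional>
#include <ios>
#include <istream>
#include <streambuf>
#include <string>
#include <utility>

// Hypothetical input streambuf: refills its buffer by calling read(n),
// which returns up to n bytes and an empty string at end of input.
class callback_streambuf : public std::streambuf {
public:
    using read_fn = std::function<std::string(std::streamsize)>;

    explicit callback_streambuf(read_fn read, std::streamsize chunk_size = 4096)
        : read_(std::move(read)), chunk_size_(chunk_size) {}

protected:
    // Called by std::istream whenever the get area is exhausted.
    int_type underflow() override {
        if (gptr() < egptr()) {
            return traits_type::to_int_type(*gptr());
        }
        buffer_ = read_(chunk_size_);
        if (buffer_.empty()) {
            return traits_type::eof();
        }
        char* begin = &buffer_[0];
        setg(begin, begin, begin + buffer_.size());
        return traits_type::to_int_type(*gptr());
    }

private:
    read_fn read_;
    std::streamsize chunk_size_;
    std::string buffer_;  // owns the bytes the get area points into
};

// Usage sketch: adapt the callback, wrap it in a std::istream, and hand
// the stream to code that expects std::istream&.
std::string read_first_line(callback_streambuf::read_fn read) {
    callback_streambuf buf(std::move(read));
    std::istream stream(&buf);
    std::string line;
    std::getline(stream, line);
    return line;
}
```

In a pybind11 wrapper lambda, the callback would presumably call the file-like object's read attribute and cast the result to std::string; I have not tested that part here. The earlier note about buffering applies to this approach as well, since each refill asks for a whole chunk.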