Description
Required prerequisites
- Make sure you've read the documentation. Your issue may be addressed there.
- Search the issue tracker and Discussions to verify that this hasn't already been reported. +1 or comment there if it has.
- Consider asking first in the Gitter chat room or in a Discussion.
What version (or hash if on master) of pybind11 are you using?
2.13.1
Problem description
Currently, the only way to have an iterator finish is by throwing py::stop_iteration{}. While this is "pythonic", it also incurs huge overhead, especially on short-lived iterators.
I was benchmarking a utility I wrote in C++ that iterates over a lot of files and parses text fields from each file. The files are organized in a specific directory structure that is reflected as three layers of directory iterators, leading to a total of ~25k iterators being created.
The C++ program took 0.8s to execute, 0.47s of which were spent waiting on IO.
The equivalent Python code exposed via pybind took 4s to execute.
When profiling, I saw that 15% of the time was spent in exception handling (or up to 40% when using libunwind or llvm-libunwind, bumping execution time to 6s). This seems like low-hanging fruit compared to all the other pybind-induced overhead.
Sadly, I couldn't come up with a good way to solve this just yet. Perhaps pybind could add a std::optional-esque container that wraps the iterator return type plus a tag indicating whether the iterator is at its end?
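To make the idea a bit more concrete, here is a rough sketch in plain C++ with no pybind11 involved. It contrasts an iterator that signals its end by throwing with one that returns an empty std::optional. Every name in it (StopIteration, ThrowingIter, OptionalIter, sum_throwing, sum_optional) is made up for illustration and is not pybind11 API. It only shows the shape of the two approaches and does not measure anything.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for py::stop_iteration; NOT a pybind11 type.
struct StopIteration {};

// Exception-terminated iterator: the end of the sequence is signalled
// by throwing, similar to how __next__ bindings end iteration today.
class ThrowingIter {
public:
    explicit ThrowingIter(const std::vector<int>& items) : items_(items) {}

    int next() {
        if (pos_ == items_.size()) throw StopIteration{};
        return items_[pos_++];
    }

private:
    const std::vector<int>& items_;
    std::size_t pos_ = 0;
};

// Optional-terminated iterator: an empty optional marks the end,
// so no stack unwinding is needed to finish the loop.
class OptionalIter {
public:
    explicit OptionalIter(const std::vector<int>& items) : items_(items) {}

    std::optional<int> next() {
        if (pos_ == items_.size()) return std::nullopt;
        return items_[pos_++];
    }

private:
    const std::vector<int>& items_;
    std::size_t pos_ = 0;
};

// Drains a ThrowingIter; the catch block is the normal exit path.
long sum_throwing(const std::vector<int>& items) {
    ThrowingIter it(items);
    long total = 0;
    try {
        for (;;) total += it.next();
    } catch (const StopIteration&) {
        // Reached the end of the sequence.
    }
    return total;
}

// Drains an OptionalIter; the loop ends on the first empty optional.
long sum_optional(const std::vector<int>& items) {
    OptionalIter it(items);
    long total = 0;
    while (auto value = it.next()) total += *value;
    return total;
}
```

Both helpers visit the same elements, so for a `std::vector<int> v` of small values `assert(sum_throwing(v) == sum_optional(v));` holds. The difference is that the first one pays for one throw/catch per iterator, which is the cost that adds up with ~25k short-lived iterators.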
I also found no way to signal the iterator end without going through py::stop_iteration. If I missed something obvious, please yell at me.
Reproducible example code
No response
Is this a regression? Put the last known working version here if it is.
Not a regression