fix: make wrapped C++ functions pickleable #5580
Conversation
…ch")` macos-13 • brew install llvm ``` /Users/runner/work/pybind11/pybind11/include/pybind11/detail/function_record_pyobject.h:40:26: error: cast from 'PyObject *(*)(PyObject *, PyObject *, PyObject *)' (aka '_object *(*)(_object *, _object *, _object *)') to 'PyCFunction' (aka '_object *(*)(_object *, _object *)') converts to incompatible function type [-Werror,-Wcast-function-type-mismatch] 40 | = {{"__reduce_ex__", (PyCFunction) reduce_ex_impl, METH_VARARGS | METH_KEYWORDS, nullptr}, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ ```
@henryiii This is now ready for review. @rainwoodman FYI
Not sure I want to argue with ChatGPT, but I feel its description is overblown. It's very rare that you want to pickle a function. You want to (and very much can) pickle objects/data, but it is quite rare to need to pickle a function (which has no state). Boost-histogram, for example, has extensive Parallel and Concurrent Processing, Distributed Computing and Cloud Environments (including dask-histogram), and Testing and Mocking usage, and we've never once needed to pickle a function. There are cases where it can be handy (if you are using functions as inputs to your API, for example), which is probably why a large system like PyCLIF ended up hitting it. But ChatGPT makes it sound like it's a massive deal, when I think it's a minor nicety. ;)

Are we sure this won't run into any issues in supporting the Stable ABI? I'd assume not, since we are relying a bit less on provided tools in CPython, but worth asking. That's something we really should look into doing in the future.

Also, moving away from the tooling provided for this purpose worries me slightly. The key feature of PyCapsule is allowing a C API from one extension module to be used by another one transparently, without the Python middle layer; this is really important and is used by SciPy, boost-histogram, and many other performance applications. I assume that's not affected by this? Here's an example in boost-histogram: https://github.com/scikit-hep/boost-histogram/blob/460ef90905d6a8a9e6dd3beddfe7b4b49b364579/include/bh_python/transform.hpp#L70-L74 - Can you still convert pybind11 functions to

By the way, a remotely related issue is that a tiny bit of work (setting
```cpp
/* destructor tp_del */ nullptr,
/* unsigned int tp_version_tag */ 0,
/* destructor tp_finalize */ nullptr,
#if PY_VERSION_HEX >= 0x03080000
```
I wasn't aware we supported Python < 3.8.
Thanks for catching that, removed with commit e8e921d.
Yes, this immediately breaks:

```diff
diff --git a/pyproject.toml b/pyproject.toml
index 890c6e6..38f4e82 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,5 +1,5 @@
 [build-system]
-requires = ["scikit-build-core>=0.11", "pybind11>=2.13.3"]
+requires = ["scikit-build-core>=0.11", "pybind11 @ git+https://github.com/rwgk/pybind11@pickle_callable"]
 build-backend = "scikit_build_core.build"

 [project]
```

```
$ uv run pytest
Built boost-histogram @ file:///Users/henryschreiner/git/scikit-hep/boost-histogram
Uninstalled 1 package in 2ms
Installed 1 package in 3ms
ImportError while loading conftest '/Users/henryschreiner/git/scikit-hep/boost-histogram/tests/conftest.py'.
tests/conftest.py:16: in <module>
    import boost_histogram as bh
src/boost_histogram/__init__.py:3: in <module>
    from . import accumulators, axis, numpy, storage
src/boost_histogram/axis/__init__.py:24: in <module>
    from . import transform
src/boost_histogram/axis/transform.py:156: in <module>
    sqrt = Function("_sqrt_fn", "_sq_fn", convert=_internal_conversion, name="sqrt")
src/boost_histogram/axis/transform.py:145: in __init__
    self._this = cpp_class(forward, inverse, convert, name)
E   ValueError: PyCapsule_GetName called with invalid PyCapsule object
```

This is a core performance and design requirement for many applications, including a core design of several parts of SciPy (
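Related background (plain Python, no pybind11 involved): capsule objects themselves refuse to pickle, which is part of why the capsule-based function implementation was a dead end for this feature. A minimal illustration using a capsule that the standard library already exposes:

```python
import datetime
import pickle

# datetime exposes its C API as a PyCapsule; any capsule behaves the same way.
capsule = datetime.datetime_CAPI
assert type(capsule).__name__ == "PyCapsule"

# Capsules wrap raw C pointers that are meaningless in another process,
# so pickle rejects them outright.
try:
    pickle.dumps(capsule)
    raise AssertionError("capsules should not pickle")
except TypeError:
    pass  # "cannot pickle 'PyCapsule' object"
```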
It's true that pickling functions is rare, but I stumbled over this multiple times while working at Google. Note that SWIG-wrapped and PyCLIF-wrapped functions are pickleable (those two systems are still in wide use there). If functions are pickleable, the feature gets used; if not, people are forced to fish for workarounds, which can be a significant time sink depending on the situation.

Global testing at Google passed with the original PR (google/pybind11clif#30099).

The boost-histogram code is using

Copying here for easy reference:

```cpp
if(auto cfunc = func.cpp_function()) {
    auto c = py::reinterpret_borrow<py::capsule>(
        PyCFunction_GET_SELF(cfunc.ptr()));
    auto rec = c.get_pointer<py::detail::function_record>();
    if(rec && rec->is_stateless
       && py::detail::same_type(
           typeid(raw_t*),
           *reinterpret_cast<const std::type_info*>(rec->data[1]))) {
        struct capture {
            raw_t* f;
        };
        return std::make_tuple((reinterpret_cast<capture*>(&rec->data))->f,
                               src);
    }
    // Note that each error is slightly different just to help with debugging
    throw py::type_error("Only ctypes double(double) and C++ functions allowed "
                         "(must be stateless)");
}
```

We'd need to provide a public API for that performance optimization. We could provide something like
Unlike for boost-histogram, there are no matches in scipy:
Global testing passed, although that's now about 7 months ago. But I'm inclined to believe that boost-histogram is probably the only project that will need tweaking to use a public API.
@henryiii It looks like boost-histogram copy-paste-tweaked this pybind11 code:
Did you consider using that caster instead? (Call
Today you can use pybind11 to create a function you can call in SciPy. After this PR, that will no longer work. I had to dig into the internals a bit to make something that takes a
I'm really surprised, could you please provide an example? — The Google global testing ~7 months ago had a lot of scipy dependencies for sure, although I don't know with what version of scipy.
I'm confused by the way PyCapsule is used here. If boost-histogram is the only major issue, I can fix that. Here's the sample code, including the raw pointer method that ChatGPT suggested too (faster than calling through Python, slower than PyCapsule):

somecode.cpp:

```cpp
// cppimport
#include <pybind11/pybind11.h>

namespace py = pybind11;

double square(double x) {
    return x * x;
}

void* get_my_function() {
    return reinterpret_cast<void*>(square);
}

py::capsule get_my_capsule() {
    return py::capsule(reinterpret_cast<void*>(square), "double (double)");
}

PYBIND11_MODULE(somecode, m) {
    m.def("square", &square);
    m.def("square_callable", &get_my_function);
    m.def("square_capsule", &get_my_capsule);
}

/*
<%
setup_pybind11(cfg)
%>
*/
```

example.py:

```python
import contextlib
import time

import cppimport.import_hook
import somecode
import scipy
import scipy.integrate


@contextlib.contextmanager
def timing(description: str) -> None:
    start = time.monotonic()
    yield
    elapsed_time = time.monotonic() - start
    print(f"{description}: {elapsed_time*1_000_000:.1f}µs")


with timing("Through Python"):
    integral = scipy.integrate.quad(somecode.square, 0, 1)
print(integral)

llc = scipy.LowLevelCallable(somecode.square_callable(), signature="double (double)")
with timing("With LowLevelCallable"):
    integral = scipy.integrate.quad(llc, 0, 1)
print(integral)

llc2 = scipy.LowLevelCallable(somecode.square_capsule())
with timing("With PyCapsule"):
    integral = scipy.integrate.quad(llc2, 0, 1)
print(integral)
```

```
$ uv init
$ uv add pybind11 scipy cppimport
$ uv run python example.py
Through Python: 43.3µs
(0.33333333333333337, 3.700743415417189e-15)
With LowLevelCallable: 12.1µs
(0.33333333333333337, 3.700743415417189e-15)
With PyCapsule: 7.0µs
(0.33333333333333337, 3.700743415417189e-15)
```

Is there a way to pass this through without the explicit
Thanks! That looks promising at first glance. I'll look at this carefully tomorrow.
I get "
I want to add a pybind11 unit test that mirrors your situation. Is your work accessible publicly, for me to use as a starting point?
https://github.com/scikit-hep/boost-histogram/blob/460ef90905d6a8a9e6dd3beddfe7b4b49b364579/include/bh_python/transform.hpp is available. I don't have the various experiments I've been trying locally anywhere, though, since none of them worked. https://boost-histogram.readthedocs.io/en/latest/user-guide/transforms.html is where I describe how to use this (though I don't mention making a function in pybind11; I was assuming numba would be the most common way).
Interesting, this escaped me completely before.

@henryiii Please take a look here (demo PR #5585): 8b500da — Add

This is to show that the new dedicated function record Python type introduced with this PR gives us new leverage. I don't have a good idea for making the magic
```python
def test_assumptions():
    assert pickle.HIGHEST_PROTOCOL >= 0
```
Why this? This isn't a test for pybind11; it's a test of the Python standard library: https://docs.python.org/3/library/pickle.html#pickle.HIGHEST_PROTOCOL and it's been at least 4 since Python 3.4.
```python
assert pickle.HIGHEST_PROTOCOL >= 0
```

```python
@pytest.mark.parametrize("protocol", range(pickle.HIGHEST_PROTOCOL + 1))
```
Interesting, we don't normally support protocol 0 with py::pickle, but I guess this does?
I changed it like this:

```diff
-def test_assumptions():
+def all_pickle_protocols():
     assert pickle.HIGHEST_PROTOCOL >= 0
+    return range(pickle.HIGHEST_PROTOCOL + 1)


-@pytest.mark.parametrize("protocol", range(pickle.HIGHEST_PROTOCOL + 1))
+@pytest.mark.parametrize("protocol", all_pickle_protocols())
 def test_pickle_simple_callable(protocol):
```
Basically, I'm paranoid about accidentally not running test_pickle_simple_callable() at all. This is easily overlooked:

```
test_pickling.py::test_pickle_simple_callable[protocol0] SKIPPED (got empty parameter set ['protocol'], function test_pickle_simple_...)
```
Oof, sorry, the above was meant to be a response to the other question.
> Interesting, we don't normally support protocol 0 with py::pickle, but I guess this does?
Yes, this is completely independent. There is no reason to limit what protocol versions are accepted.
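For illustration (stock Python only, nothing pybind11-specific): every protocol level, including the ASCII-only protocol 0, pickles functions by reference, so there is indeed nothing protocol-dependent to restrict here:

```python
import pickle

# Functions are serialized by name at every protocol level; the restored
# object is the very same function object.
for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
    data = pickle.dumps(len, protocol=protocol)
    assert pickle.loads(data) is len
```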
If pickle reports a maximum pickle version of 0, pickle would be broken. That would be a Python bug, and I don't think we should be worried about Python bugs of that magnitude, especially since it would only produce a minor warning in the test suite.
I highly, highly, highly recommend rebasing and never making merge commits. Merge commits are evil and can cause strange results. There's even a "rebase" button in the UI. Though it's possible this is a GitHub bug, it probably wouldn't happen on linear history.
You can cherry-pick the commit, though I'd rather just use the value without testing it. We should rely on Python being correctly implemented; testing it adds distractions to our test suite.
Coincidentally, I had this ChatGPT conversation about PRs and force pushing just before this little incident:

I'm not happy either way, and gravitate to merging because I'm often referencing commits. It's bad if those references go stale.

This is a tiny, pretty inconsequential commit, though. I'll simply carry it into #5585.
I'm getting a warning on every single file when compiling now:
Sorry I didn't see this before. Is this still an issue? What are the exact conditions to reproduce?

We have these warning suppressions already:

```cpp
PYBIND11_WARNING_PUSH
#if defined(__GNUC__) && __GNUC__ >= 8
PYBIND11_WARNING_DISABLE_GCC("-Wcast-function-type")
#endif
#if defined(__clang__) && !defined(__apple_build_version__) && __clang_major__ >= 19
PYBIND11_WARNING_DISABLE_CLANG("-Wcast-function-type-mismatch")
#endif
static PyMethodDef tp_methods_impl[]
    = {{"__reduce_ex__", (PyCFunction) reduce_ex_impl, METH_VARARGS | METH_KEYWORDS, nullptr},
       {nullptr, nullptr, 0, nullptr}};
PYBIND11_WARNING_POP
```

Do we need to work on the preprocessor conditions? I didn't want to make them too general; maybe that's backfiring now. I remember I looked around quite a bit for the best way to implement that line when I was working on the code originally, but I couldn't find a cleaner solution.

Coincidentally, a couple days ago I spent some time looking at issue #5600, specifically: #5600 (comment)

For that situation ChatGPT suggested among other things:
Maybe that could help here too, in the long run. Warning suppressions seem best as an immediate solution.
I’m on macOS. So probably the "clang but not Apple" condition is the problem.
Does this silence the warnings for you?

```diff
-#if defined(__clang__) && !defined(__apple_build_version__) && __clang_major__ >= 19
+#if defined(__clang__)
```

It's a bit sweeping, but I think it should be fine? It's just for that one cast. Please let me know, it's a quick PR.
I’ll try to check tomorrow; I only have my phone ATM.
I think I have something better for you: PR #5608 (CI was still running when I posted this, but it's looking promising.)
Description
This PR requires a PYBIND11_INTERNALS_VERSION bump (from 8 to 9).

Closes #1261
Full ChatGPT assessment
ChatGPT summary (with a couple edits)
Motivation and Importance
Before this PR, pybind11-bound functions are not pickleable, which creates significant friction for Python users who rely on
pickle
and related serialization mechanisms for common tasks such as:

Parallel and Concurrent Processing
Python's
multiprocessing
module depends on pickling to distribute tasks across worker processes. Since pybind11 functions cannot be pickled, users are forced to implement fragile and inefficient workarounds (e.g., passing function names as strings or relying on global state). This limits the usability of pybind11 bindings in parallel and distributed computing frameworks such asmultiprocessing
, concurrent.futures
, and joblib
.

Caching and Memoization
Libraries like
joblib
, functools.lru_cache
, and diskcache
rely on pickle to store function results and enable fast recomputation. The inability to pickle pybind11-bound functions makes it difficult to integrate C++ extensions with these widely used caching mechanisms.

Distributed Computing and Cloud Environments
Modern data processing frameworks such as Dask and Ray depend on pickling to serialize tasks and distribute them across clusters. The current pybind11 limitation prevents users from leveraging C++-based functions in these environments without cumbersome workarounds.
Testing and Mocking
Serialization of test artifacts and fixtures is a common practice in Python testing frameworks like
pytest
. Non-pickleable pybind11 functions make it difficult to persist test state and mock functions consistently.

General Python Compatibility
Python developers expect functions and objects to be pickleable as a baseline feature. The fact that pybind11-bound functions break this expectation creates unnecessary obstacles.
This PR introduces a solution to enable pickling of pybind11-bound functions in a way that is efficient, reliable, and consistent with Python's native serialization semantics. The solution has been tested extensively in large-scale production environments and is designed to maintain backward compatibility and minimal performance overhead. By addressing this long-standing limitation, this PR will significantly improve the usability of pybind11 for parallel processing, distributed systems, caching, and testing — making it easier for developers to integrate C++ extensions into modern Python workflows.
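To ground the claim above (standard library only; the function here is an illustrative stand-in, not pybind11 code): task-distribution machinery such as multiprocessing can only ship objects that survive a pickle round-trip, which plain Python functions do and pybind11-bound functions previously did not.

```python
import pickle

def work(x):
    # Stand-in for a task a user would submit to a worker process.
    return x * x

# multiprocessing and friends transmit tasks by pickling them; a function
# that cannot be pickled cannot be submitted at all.
restored = pickle.loads(pickle.dumps(work))
assert restored(7) == 49
```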
Notes:
repr()
for pybind11 functions is made much more informative and truthful. Before this PR:
(
PyCapsule
objects do not have methods.) With this PR:
-DCMAKE_BUILD_TYPE=MinSizeRel
): Before this PR:
With this PR:
Factor: 5257248 / 5252640 = 1.0008772731426483
Comparison with stock Python and other systems for Python-C++ bindings:
Python functions (built-in & native) are pickleable.
SWIG functions are pickleable.
PyCLIF-C-API functions are pickleable. (Because they are implemented exactly like stock Python built-in functions.)
Boost.Python functions are not pickleable.
nanobind functions are not pickleable.
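For context on the mechanics discussed in the technical details that follow (standard library only): stock Python pickles functions by reference, and built-in bound methods reduce to a `(getattr, (self, name))` pair, which is the two-step scheme referenced below.

```python
import pickle

# Top-level and built-in functions pickle by module-qualified name, not by
# value; unpickling re-imports the module and looks the name up again.
assert pickle.loads(pickle.dumps(len)) is len

# Built-in *bound* methods instead reduce to (getattr, (self, name));
# pickle then recurses into each element of that tuple.
lst = [1, 2, 3]
reduced = lst.append.__reduce__()
assert reduced[0] is getattr
assert reduced[1] == (lst, "append")
```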
Everything below is very technical detail:
Why is
not pickleable before this PR?
The reason is that the
simple_callable
is implemented as shown by repr(simple_callable)
: To Python it appears to be a bound function of the type
PyCapsule
. (The reasons for this choice of implementation are deep and omitted here.) pickle.dumps(simple_callable)
is capable of pickling built-in methods. It does that in two steps: produces this tuple (see
__reduce_ex__
documentation): pickle then recurses, attempting to serialize
<built-in function getattr>
(no problem),<capsule object NULL at 0x...>
(problem), and'simple_callable'
(no problem).The attempt to serialize
<capsule object NULL at 0x...>
fails with this exception (copy-pasted from pytest output):While this is not the desired behavior when pickling
simple_callable
, there are very good reasons in general thatPyCapsule
objects are not pickleable. See, for example, here:The solution implemented in this PR is to
replace the
PyCapsule
with a custom built-in type, wrapping thepybind11::detail::function_record
C++ type the good-old way, with manually written bindings (implemented in include/pybind11/detail/function_record_pyobject.h),which then makes it possible to provide a
__reduce_ex__
implementation to achieve the desired behavior.The new
repr(simple_callable)
is:Side note: The Python type name is versioned to ensure ABI compatibility, but to maximize compatibility it is independent of
PYBIND11_INTERNALS_VERSION
.simple_callable.__reduce_ex__(0)
now produces:When pickle recurses to call
__reduce_ex__
of the wrappedfunction_record
object, the result is:Very simple! — Your author has to admit though that it took quite a while to figure this out. See the development history of google/pybind11clif#30099 for the many dead ends explored before this solution was discovered.
To explain how this works in the
pickle.load()
step:eval("__import__('importlib').import_module('pybind11_tests.pickling')")
produces a reference to the imported pybind11_tests.pickling
module. getattr(imported_module, 'simple_callable')
simply accesses what was just imported.Note that
__import__('pybind11_tests.pickling')
only produces the top-level module (pybind11_tests
in this case), as explained in the importlib.import_module()
documentation.This PR is a backport of google/pybind11clif#30099
Suggested changelog entry: