-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
REF: Make Util functions accessible from C #50425
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Initial timeits don't show anything conclusive between main: In [5]: %timeit lib.is_integer_array(np.array([1, 2]))
525 ns ± 0.77 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
In [6]: %timeit lib.is_integer_array(np.array([1, 2]))
466 ns ± 1.64 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
In [7]: %timeit lib.is_integer_array(np.array([1, 2]))
469 ns ± 1.51 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
In [8]: %timeit lib.is_integer_array(np.array([1, 2]))
468 ns ± 0.787 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each) and this PR In [7]: %timeit lib.is_integer_array(np.array([1, 2]))
531 ns ± 0.679 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
In [8]: %timeit lib.is_integer_array(np.array([1, 2]))
479 ns ± 1.41 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
In [9]: %timeit lib.is_integer_array(np.array([1, 2]))
474 ns ± 3.77 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each) |
is_timedelta64 : bool | ||
*/ | ||
int is_timedelta64_object(PyObject *obj) { | ||
return PyObject_TypeCheck(obj, &PyTimedeltaArrType_Type); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure the implementations ulltimately should be in this header file. Done for now just to emulate the existing util.pxd
file, but would make the build system a bit easier with a dedicated .c implementation
The full license is in the LICENSE file, distributed with this software. | ||
*/ | ||
|
||
#ifndef PANDAS__LIBS_PANDAS_TYPE_H_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
type_inference.h might also be a better name than just type.h
IIRC cython has some mechanism to make cython functions available in C. is that an option? |
Looks like this is the section in the docs: Worth trying in the future for more advanced use cases, but I think here the functions we are putting into Cython functions are really C functions anyway, so would be less steps the way it is That said, I think this PR will be dependent on moving away from setuptools into meson. setuptools it seems implicitly requires that all extensions you build are actually Python extensions, so we aren't able to really set up the full dependency chain needed without meson |
IMO, ideally, we should be doing C -> Cython since its much easier from a build standpoint unless absolutely necessary (Cython generates the C header files at runtime when going the other way which can be tricky). |
This should be possible with build_clib. (See https://github.com/pandas-dev/pandas/pull/47081/files#diff-60f61ab7a8d1910d86d9fda2261620314edcae5894d5aaa236b821c7256badd7). The only question left should be whether you want to wait for setuptools support to be dropped. Assuming the meson build system PR lands sometime in January, we'll probably have to wait until no more 2.0.x releases to fully drop setuptools to prevent main/2.x from diverging too much. (I guess we can push the transition more aggressively/delay 2.0 slightly after the PR lands too. At any rate, we'd probably need 1 month with the new meson build system before I'm confident enough to drop setuptools) |
Yea I think this should wait until we drop setuptools. There's no rush on this PR and I'm not confident we can do it well without meson |
Not a huge deal, but we specifically moved the util.pxd functions from a now-removed numpy_helper.h file years ago to be more friendly to non-C contributors |
Gonna close for now - dependent on the new build system. Can come back after that work stabilizes |
Came up when reviewing #50400 with @lithomas1 . Doesn't directly solve this, but a lot of our util.pyx functions are in essence C functions to begin with. Because they are implemented in Cython though, its a built difficult to use them in our non-Cython extensions like JSON.
This takes the code out of the Cython file and ports to a C header file. This is mostly a verbatim copy with some special considerations done for NA handling. Marking as draft because I'm not sure a header-only solution is better than a shared library, but need more time to mess around with the setup script. May also just wait until we implement meson fully