Skip to content

Cythonize is_null_slice #55037

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from
Closed

Conversation

WillAyd
Copy link
Member

@WillAyd WillAyd commented Sep 6, 2023

This makes the diff in #54506 smaller

@lithomas1
Copy link
Member

Is there a reason you want to Cythonize this? (impact on benchmarks?)

This seems to be used only 6 times.
https://github.com/search?q=repo%3Apandas-dev%2Fpandas%20is_null_slice&type=code

@WillAyd
Copy link
Member Author

WillAyd commented Sep 6, 2023

Yea it also would get used in #54506 in the getitem / setitem methods of the BitmaskArray

@WillAyd
Copy link
Member Author

WillAyd commented Sep 6, 2023

I won't claim that this has a significant impact on benchmarks. The main thing is that it shows less Python interaction in Cython annotate, so it causes less visual noise when searching through the annotate to find bottlenecks. If there's unease about this then no problem closing

@mroeschke mroeschke added Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Internals Related to non-user accessible pandas implementation labels Sep 7, 2023
except TypeError:
return False

if start == 0 and stop == PY_SSIZE_T_MAX and step == 1:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will this behave differently from the status quo on e.g. slice(0, None, 1)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No - PY_SSIZE_T_MAX is the largest unit used for addressing on the platform. CPython converts None to that internally, essentially letting you index as far out as your platform allows

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not the None here that im concerned with, its the 0 and 1. On main:

In [1]: import pandas.core.common as com
In [2]: obj = slice(0, None, 1)
In [3]: com.is_null_slice(obj)
Out[3]: False

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm nice catch. That does return True here - are you thinking that is better or worse?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think the function is intended pretty specifically to catch obj[:], so would keep the status quo behavior

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK cool. Thanks for catching that - I think can just close this for now then. Not a super high priority

@WillAyd WillAyd closed this Sep 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Internals Related to non-user accessible pandas implementation Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants