-
-
Notifications
You must be signed in to change notification settings - Fork 18.6k
release gil more #29322
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
release gil more #29322
Conversation
Thanks @jbrockmendel. I am doing some heavy refactoring in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure makes sense. Minor things
pandas/_libs/window.pyx
Outdated
return _roll_weighted_sum_mean(values, weights, minp, avg=1) | ||
|
||
|
||
def _roll_weighted_sum_mean(float64_t[:] values, float64_t[:] weights, | ||
int minp, bint avg): | ||
cdef _roll_weighted_sum_mean(float64_t[:] values, float64_t[:] weights, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cdef void
(unless this returns something)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure. does this affect perf or is it just more explicit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm coming from the explicit angle but I can't imagine it would hurt either
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually looks like this should be cdef ndarray[float64_t]
, will update
pandas/_libs/window.pyx
Outdated
for i in range(1, N): | ||
cur = vals[i] | ||
is_observation = (cur == cur) | ||
nobs += <int64_t>(1 if is_observation else 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the if...else
really required here? Wouldn't cast to int64 handle that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh i thought this was necessary, but now looks like it isnt. will change
if axis == 0: | ||
if periods >= 0: | ||
start, stop = periods, sx | ||
with nogil: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the nogil
here not just precede the if...else
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That'd be nice, but no. The arr.flags.f_contiguous
check is in python-space
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you assign to a bool before the block and check that at the top instead of staying in the Python space? On first impression I was expecting the difference branches to have different GIL considerations, hence the ask
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good idea, will do
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good to me. Minor things to consider otherwise happy with merge
pandas/_libs/window.pyx
Outdated
for i in range(1, N): | ||
cur = vals[i] | ||
is_observation = (cur == cur) | ||
nobs += <int64_t>is_observation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this cast is unnecessary; at least C should widen the bool type to int64 as part of the addition
pandas/_libs/window.pyx
Outdated
cur_x = input_x[i] | ||
cur_y = input_y[i] | ||
is_observation = ((cur_x == cur_x) and (cur_y == cur_y)) | ||
nobs += <int64_t>is_observation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment on cast
pandas/_libs/window.pyx
Outdated
|
||
return output | ||
# `.base` to access underlying ndarray | ||
return output.base |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
at this point output
is a memoryview and we need an ndarray. this is how you do that without creating a new object
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC don't we ucommonly use np.asarray(output)
? I think this is more explict (and the same perf wise).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we're not entirely consistent about it, but i agree on the explicit thing. On the margin it must be slightly less performant because it calls into python-space, but that'll be in the "too small to measure" category
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah i like it because its more explicit.
does this show any perf benefits on roll ing sum? |
pls rebase |
rebased, changed output.base to np.asarray(output), will run some asvs |
Wow, way better than I had expected:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. Whatsnew note for perf boost might be nice, especially for the expanding benchmarks. Otherwise over to @jreback
lgtm. nice perf boosts! what's up with the windows build? |
Looks unrelated:
|
green |
thanks, very nice |
cc @mroeschke since a lot of this touches libwindow