WARN: Remove false positive warning for iloc inplaceness #48397
We are trying to cast [nan, 1.0] to integer, which raises the RuntimeWarning, but I don't think that this should leak to our users.
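To make the cast in question concrete, here is a minimal NumPy-only sketch. It does not reproduce the pandas code path; the variable names are illustrative, and whether a RuntimeWarning is actually emitted depends on the installed NumPy version.

```python
import warnings

import numpy as np

new_values = np.array([np.nan, 1.0])

# Integer arrays have no representation for NaN, so check before casting.
has_nan = bool(np.isnan(new_values).any())

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # The slot that held NaN ends up with an unspecified integer value.
    cast = new_values.astype("int64")

# Recent NumPy releases report the invalid cast; older ones may stay silent.
runtime_warnings = [w for w in caught if issubclass(w.category, RuntimeWarning)]
print(has_nan, cast.dtype, len(runtime_warnings))
```

Checking for missing values first lets the caller skip the cast entirely, so no warning can reach the user.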
mroeschke
left a comment
LGTM off to @jbrockmendel
@jbrockmendel ok to merge?
pandas/core/indexing.py
Outdated
    or new_values.shape != orig_values.shape
    or (
        not can_hold_element(orig_values, np.nan)
        and isna(new_values).any()
how do we get here with warn=True?
iloc._setitem_with_indexer(indexer, value, self.name)
OK, so i think what's happening here is that we are checking can_hold_element too soon, bc reindexing occurs within self.obj._iset_item(loc, value). I think it would be better to do the reindex before the can_hold_element check
Pushed a suggestion. Not sure which is better performance-wise: checking initially and then rechecking? Is _get_column_array expensive or cheap? If it is cheap, it might be better to check can_hold_element only for the new values.
This breaks one test, not sure if you can set an all NaT Series into an underlying Series with timezone UTC?
_get_column_array is pretty cheap. It's the isna(...).any() that I think may be expensive.
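The ordering concern can be sketched with plain NumPy: run the cheap dtype check first so the O(n) missing-value scan only happens when it can change the outcome. Both `can_hold_nan` and `needs_new_array` are hypothetical stand-ins written for this illustration; they are not the pandas internals (`can_hold_element`, `isna`) and cover only part of the condition shown in the diff.

```python
import numpy as np


def can_hold_nan(values: np.ndarray) -> bool:
    """Hypothetical stand-in for can_hold_element(values, np.nan)."""
    # Float, complex and object arrays can store NaN; integer arrays cannot.
    return values.dtype.kind in "fcO"


def needs_new_array(orig_values: np.ndarray, new_values: np.ndarray) -> bool:
    """Return True when new_values cannot be written into orig_values in place."""
    if new_values.shape != orig_values.shape:
        return True
    # Cheap checks first: the full NaN scan runs only if the original array
    # cannot hold NaN and the new values are of a kind that may contain it.
    if not can_hold_nan(orig_values) and new_values.dtype.kind in "fc":
        return bool(np.isnan(new_values).any())
    return False


ints = np.array([1, 2])
print(needs_new_array(ints, np.array([np.nan, 1.0])))  # NaN does not fit into ints
print(needs_new_array(ints, np.array([3.0, 4.0])))     # no NaN present
```

Because `and` short-circuits, a float original never triggers the scan at all.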
> This breaks one test, not sure if you can set an all NaT Series into an underlying Series with timezone UTC?
depends on the dtype of the all-NaT Series
original values are datetime64[ns, UTC] and all NaT Series is datetime64[ns]
that should definitely not be inplace
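A small sketch of why these two cannot share storage: the tz-aware column and the naive all-NaT Series have different dtypes, even though every value in the latter is missing. The dates and lengths below are made up for illustration.

```python
import pandas as pd

# Original values: timezone-aware datetimes.
orig = pd.Series(pd.date_range("2022-01-01", periods=3, tz="UTC"))

# Replacement values: all NaT, but with a timezone-naive dtype.
all_nat = pd.Series([pd.NaT] * 3, dtype="datetime64[ns]")

same_dtype = orig.dtype == all_nat.dtype
print(orig.dtype, all_nat.dtype, same_dtype)
```

Since the dtypes differ, writing `all_nat` into the existing tz-aware array would have to change the column's dtype, which rules out an in-place set.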
Great, we fixed the test instead of breaking it :)
Anything against checking only new_values then? See latest commit now.
i think that makes sense. caffeine is still kicking in though
This reverts commit 7762cda.
We should try to get this into 1.5 too to avoid confusion
mroeschke
left a comment
Looks fairly good to me. @jbrockmendel for any additional comments.
    if tz is None:
        msg = "will attempt to set the values inplace instead"
        with tm.assert_produces_warning(FutureWarning, match=msg):
            df.iloc[:, 0] = pd.NaT
@phofl i think i misunderstood earlier when i said this should warn. i was under the impression that the RHS was pd.Series([pd.NaT]*N, dtype="M8[ns]"), not the scalar pd.NaT. With the scalar, this should not warn.
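The distinction between the two right-hand sides can be sketched as follows. The length `n` and the example dates are assumptions for illustration; the last part uses a plain tz-aware Series rather than the DataFrame from the test, to show that a scalar NaT is a valid value for a tz-aware dtype.

```python
import pandas as pd

n = 3

# Scalar: carries no dtype of its own, so it can adapt to the target column.
scalar_rhs = pd.NaT

# Series: explicitly timezone-naive, so its dtype can conflict with the target.
series_rhs = pd.Series([pd.NaT] * n, dtype="M8[ns]")

# A tz-aware Series accepts the scalar without changing dtype.
s = pd.Series(pd.date_range("2022-01-01", periods=n, tz="UTC"))
s.iloc[0] = pd.NaT

print(pd.api.types.is_scalar(scalar_rhs), series_rhs.dtype, s.dtype)
```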
that is an existing problem that i think is orthogonal to this PR, so no need to hold up on this.
Yep agreed, the NaT gets cast to an all NaT Series without a timezone, hence this seems equal
ideal then would be to assert_warns(None) unconditionally and xfail the tzaware case
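One way this suggestion might look with plain pytest and the standard `warnings` module is sketched below. The test name, the helper, the example frame and the xfail reason are all hypothetical, not the test that was committed, and whether the tz-aware case actually warns depends on the installed pandas version.

```python
import warnings

import pandas as pd
import pytest


def future_warnings_from_nat_assignment(tz):
    """Assign scalar NaT via iloc and return any FutureWarnings that were raised."""
    df = pd.DataFrame({"a": pd.date_range("2022-01-01", periods=3, tz=tz)})
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        df.iloc[:, 0] = pd.NaT
    return [w for w in caught if issubclass(w.category, FutureWarning)]


@pytest.mark.parametrize(
    "tz",
    [
        None,
        # Assumed marker: the tz-aware case is tracked as a separate issue.
        pytest.param(
            "UTC",
            marks=pytest.mark.xfail(reason="scalar NaT is cast to a tz-naive Series"),
        ),
    ],
)
def test_iloc_setitem_nat_no_warning(tz):
    # No warning is expected for any tz; the known-bad case is xfailed above.
    assert future_warnings_from_nat_assignment(tz) == []
```

Keeping the assertion unconditional means the xfail marker is the only place that records the known problem, so it is easy to remove once the tz-aware case is fixed.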
Done
some nitpicks, otherwise LGTM
Thanks @phofl
doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.
This won't be able to set inplace, so should not warn