-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
Fixing == __eq__ operator for MultiIndex ... closes #9785 #9872
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -2439,6 +2439,8 @@ def _get_dtype_type(arr_or_dtype): | |||
return np.dtype(arr_or_dtype).type | |||
elif isinstance(arr_or_dtype, CategoricalDtype): | |||
return CategoricalDtypeType | |||
elif arr_or_dtype is NotImplemented: | |||
return type(NotImplemented) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems strange to me. Why would we ever want to use NotImplementedType
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function just returns the type of an object ... NotImplemented is an object. So not sure what the question is?
if the question is why is this feature important? Pandas has code that checks the type of return values and acts accordingly. Occasionally Numpy will return this type and this code breaks with an attribute error because NotImplementedType has no attribute dtype.
PS. Without this the tests below fail. This may not be clear from the small diff but the bug in this case is visible in _indexOp pandas/core/index.py:58
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, thanks for clarifying. My sense is that this method probably should raise if it gets NotImplemented
, because this is not really a valid numpy dtype.
Instead, I would make the wrapper functions in pandas.core.index
handle NotImplemented
explicitly, e.g., add something like
if result is NotImplemented:
return result
Operations that wrap arithmetic should know how to handle NotImplemented
-- that's a saner place to put it than here (where it could eat some strange errors)..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get what you are saying about numpy types, but it shouldn't eat up any errors, since this function just says that the type of the input is so and so ... right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based purely on the name, function expects a numpy/pandas array or dtype object, and NotImplemented
is certainly not one of those :). Returning arbitrary output instead of raising an error feels like an anti-pattern to me.
In contrast, for something like is_bool_dtype
it does make sense to return False
when given NotImplemented
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, I'll make this change in is_bool_dtype instead ... ccing @jreback for reference
@@ -2510,6 +2510,8 @@ def is_floating_dtype(arr_or_dtype): | |||
|
|||
|
|||
def is_bool_dtype(arr_or_dtype): | |||
if arr_or_dtype is NotImplemented: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
K, this seems more reasonable, though from my perspective I would probably still be happier explicitly handling NotImplemented
inside the wrapper functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So conceptually though, the code in the wrapper got a result, its trying to check if this result is a bool type ... the check is crashing on this NULL type of exception. So this is where its gets tricky conceptually, is NULL part of this function's domain or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed... might be better to use try/except
logic, e.g.,
try:
tipo = _get_dtype_type(arr_or_dtype)
except Exception:
return False
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good, one last thing. I tested the following and it also crapped out.
_get_dtype_type(False)
so is this also supposed to crap out? or is it supposed to return False?
is_bool_dtype(False)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, this is an internal function, so it's not extensively tested (and not worth worrying about testing too much). It expects the argument to already be a numpy dtype or numpy array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I caught ValueError and made sure that's raised ... does this look better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes :)
merged via 07257a0 thanks! |
closes #9785