Closed
Labels: Bug, Dtype Conversions (unexpected or buggy dtype conversions), MultiIndex
Description
When at least one element of a MultiIndex contains a NaN, has_duplicates reports duplicates even though no two entries are equal, and it disagrees with get_duplicates():
>>> idx = pd.MultiIndex.from_arrays([[101, 102], [3.5, np.nan]])
>>> idx
MultiIndex
[(101, 3.5), (102, nan)]
>>> idx.has_duplicates
True
>>> idx.get_duplicates()
[]
I would expect has_duplicates to return False here, because 102 is not the same as 101.
I would also expect it to return False for the MultiIndex
MultiIndex
[(101, 3.5), (101, nan)]
since 3.5 != NaN, but this case is more debatable.
This is important because you can't call .unstack() on a Series with a MultiIndex for which has_duplicates is True, even if the MultiIndex has many levels and the levels containing the NaN(s) are not involved in the operation.
This is with pandas 0.12.0.
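To make the expected behaviour concrete, here is a minimal pure-Python sketch of the duplicate check I have in mind. It is not how pandas implements has_duplicates; the names has_duplicate_keys and _normalize are hypothetical, and it assumes the stricter convention that a NaN matches another NaN in the same position, so a key counts as a duplicate only if every level matches an earlier key.

```python
import math

# Sentinel standing in for any NaN, so that NaN compares equal to NaN
# (an assumption of this sketch, not documented pandas behaviour).
_NAN = object()


def _normalize(value):
    """Map every float NaN to one shared sentinel; leave other values alone."""
    if isinstance(value, float) and math.isnan(value):
        return _NAN
    return value


def has_duplicate_keys(keys):
    """Return True if any tuple in `keys` repeats an earlier tuple."""
    seen = set()
    for key in keys:
        normalized = tuple(_normalize(v) for v in key)
        if normalized in seen:
            return True
        seen.add(normalized)
    return False


nan = float("nan")
first = has_duplicate_keys([(101, 3.5), (102, nan)])   # the index from this report
second = has_duplicate_keys([(101, 3.5), (101, nan)])  # the more debatable case
third = has_duplicate_keys([(101, nan), (101, nan)])   # a genuine repeat
```

Under these assumptions both indexes from this report come out as duplicate-free, and only a key that repeats in every level, NaN included, is flagged.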