DOC: fix DataFrame.isin docstring and doctests #22767
Changes from 3 commits
0d3ebaa
3684653
82530fa
b5af788
a781439
e59e69f
938a21f
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7451,52 +7451,61 @@ def to_period(self, freq=None, axis=0, copy=True): | |
|
||
def isin(self, values): | ||
""" | ||
Return boolean DataFrame showing whether each element in the | ||
DataFrame is contained in values. | ||
Whether each element in the DataFrame is contained in values. | ||
|
||
Parameters | ||
---------- | ||
values : iterable, Series, DataFrame or dictionary | ||
values : iterable, Series, DataFrame or dict | ||
The result will only be true at a location if all the | ||
labels match. If `values` is a Series, that's the index. If | ||
`values` is a dictionary, the keys must be the column names, | ||
`values` is a dict, the keys must be the column names, | ||
which must match. If `values` is a DataFrame, | ||
then both the index and column labels must match. | ||
|
||
Returns | ||
------- | ||
DataFrame | ||
DataFrame of boolean showing whether each element in the DataFrame | ||
Review comment: boolean -> booleans |
||
is contained in values. | ||
|
||
Review comment: Can we add a |
||
DataFrame of booleans | ||
See Also | ||
-------- | ||
DataFrame.eq: Equality test for DataFrame. | ||
Series.isin: Equivalent method on Series. | ||
Moisan marked this conversation as resolved.
|
||
|
||
Examples | ||
-------- | ||
|
||
>>> df = pd.DataFrame({'num_legs': [2, 4], 'num_wings': [2, 0]}, | ||
... index=['falcon', 'dog']) | ||
>>> df | ||
num_legs num_wings | ||
falcon 2 2 | ||
dog 4 0 | ||
|
||
When ``values`` is a list: | ||
|
||
>>> df = pd.DataFrame({'A': [1, 2, 3], 'B': ['a', 'b', 'f']}) | ||
>>> df.isin([1, 3, 12, 'a']) | ||
A B | ||
0 True True | ||
1 False False | ||
2 True False | ||
>>> df.isin([2]) | ||
num_legs num_wings | ||
falcon True True | ||
dog False False | ||
Review comment: What do you think about using Also, I think adding a bit more description would help (e.g. |
||
|
||
When ``values`` is a dict: | ||
When ``values`` is a dict. | ||
Review comment: I would add something to this sentence to explain what is happening when values is a dict. Something like "When values is a dict, we can pass values to check for each column separately:" (similar for the example below as well) |
||
|
||
>>> df = pd.DataFrame({'A': [1, 2, 3], 'B': [1, 4, 7]}) | ||
>>> df.isin({'A': [1, 3], 'B': [4, 7, 12]}) | ||
A B | ||
0 True False # Note that B didn't match the 1 here. | ||
1 False True | ||
2 True True | ||
>>> df.isin({'num_wings': [0, 3], 'num_legs': [0]}) | ||
num_legs num_wings | ||
falcon False False | ||
dog False True | ||
|
||
When ``values`` is a Series or DataFrame: | ||
When ``values`` is a Series or DataFrame. Note that 'falcon' does not | ||
match based on the number of legs in df2. | ||
|
||
>>> df = pd.DataFrame({'A': [1, 2, 3], 'B': ['a', 'b', 'f']}) | ||
>>> df2 = pd.DataFrame({'A': [1, 3, 3, 2], 'B': ['e', 'f', 'f', 'e']}) | ||
>>> df2 = pd.DataFrame({'num_legs': [8, 0, 2], 'num_wings': [0, 2, 2]}, | ||
... index=['spider', 'falcon', 'parrot']) | ||
>>> df.isin(df2) | ||
A B | ||
0 True False | ||
1 False False # Column A in `df2` has a 3, but not at index 1. | ||
2 True True | ||
num_legs num_wings | ||
falcon False True | ||
dog False False | ||
Review comment: I like the example, but I don't like that we have to use wrong information (a falcon without legs) ;) Not sure if better or worse, but what do you think about using a DataFrame with just the Also, small detail, but I think we used
Reply: I see :). No problem, I agree with the changes! |
||
""" | ||
if isinstance(values, dict): | ||
from pandas.core.reshape.concat import concat | ||
|
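For readers following the review, the matching rules described in the Parameters section can be sketched as below. This is an illustrative sketch, not the docstring text that was merged: the `other` frame and the `legs` Series are hypothetical data chosen to avoid the "falcon without legs" issue raised above, and the variable names are assumptions.

```python
import pandas as pd

df = pd.DataFrame({'num_legs': [2, 4], 'num_wings': [2, 0]},
                  index=['falcon', 'dog'])

# List-like: every element is tested for membership; labels are ignored.
in_list = df.isin([0, 2])

# dict: keys are column names, so each column is checked against its own
# values. Columns missing from the dict come back all False.
in_dict = df.isin({'num_wings': [0, 3]})

# Series (hypothetical data): values must match at the same index label.
legs = pd.Series([2, 4], index=['falcon', 'cat'])
in_series = df.isin(legs)

# DataFrame (hypothetical data): values must match at the same index
# label and the same column label.
other = pd.DataFrame({'num_legs': [8, 2], 'num_wings': [0, 2]},
                     index=['spider', 'falcon'])
in_frame = df.isin(other)

for name, result in [('list', in_list), ('dict', in_dict),
                     ('Series', in_series), ('DataFrame', in_frame)]:
    print(name)
    print(result)
```

In the Series and DataFrame cases the 'dog' row is all False because 'dog' has no matching label in `legs` or `other`, even though some of its values appear elsewhere in them.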