DOC: update the Series.str.join docstring #20463

Merged
merged 5 commits into from
Mar 27, 2018
Changes from 4 commits
45 changes: 40 additions & 5 deletions pandas/core/strings.py
@@ -941,17 +941,52 @@ def str_get_dummies(arr, sep='|'):

def str_join(arr, sep):
"""
Join lists contained as elements in the Series/Index with
passed delimiter. Equivalent to :meth:`str.join`.
Join lists contained as elements in the Series/Index with passed delimiter.

If the elements of a Series are lists themselves, join the content of these
lists using the delimiter passed to the function.
This function is an equivalent to :meth:`str.join`.

Parameters
----------
sep : string
Delimiter
sep : str
Delimiter to use between list entries.

Returns
-------
joined : Series/Index of objects
Series/Index of objects
Review comment (Member):

I'd use object instead of objects, as this is more a type definition than an explanation (maybe one day we can use these types as annotations?)


Notes
-----
If any of the lists does not contain string objects the result of the join
will be `NaN`.

See Also
--------
str.join : Standard library version of this method.
Series.str.split : Split strings around given separator/delimiter.

Examples
--------

Example with a list that contains non-string elements.

>>> s = pd.Series({1: ['lion', 'elephant', 'zebra'],
... 2: [1.1, 2.2, 3.3],
... 3: [np.nan, np.nan, np.nan]})
Review comment (Member):

Minor comment, but I think it'd be more useful for users to see that joining ['cat', np.nan, 'dog'] becomes NaN than that joining some NaNs becomes NaN. That could also apply to the number example: ['cat', 23, 'dog']. Feel free to disagree if you think it's clearer with unique types.

Also, it's something very subtle, but I find it slightly distracting to use a specific index in the example when it is not used. It surely doesn't make a big difference, but I'd construct the Series with a list of lists and use the default index, so nobody thinks the index has an impact on joining the elements.

>>> s
1 [lion, elephant, zebra]
2 [1.1, 2.2, 3.3]
3 [nan, nan, nan]
dtype: object

Join all lists using an '-', the list of floats will become a NaN.
Review comment (Member):

I think this comment needs to be updated after adding the NaN?


>>> s.str.join('-')
1 lion-elephant-zebra
2 NaN
3 NaN
dtype: object
Review comment (Member):

I think that with this second example, the first one is actually a bit redundant, as this one exemplifies both cases: strings and non-strings.

And I think we could even show the floats in one of the rows, the strings in another (both as you did), and use the third one to illustrate a list containing a NaN, which I assume returns a NaN, but that may not be obvious to all users.
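
The suggestion in the review comments above could look roughly like the sketch below. It assumes a list-of-lists Series with the default index and mixed rows such as ['cat', np.nan, 'dog']; the variable names are illustrative and are not taken from the PR.

```python
import numpy as np
import pandas as pd

# Illustrative data: a list of lists with the default index, so the
# index plays no visible role in the example.
s = pd.Series([
    ["lion", "elephant", "zebra"],  # all strings: should be joined
    [1.1, 2.2, 3.3],                # all floats: not joinable
    ["cat", np.nan, "dog"],         # strings mixed with a NaN
    ["cat", 23, "dog"],             # strings mixed with a number
])

joined = s.str.join("-")

# Only the first row contains nothing but strings, so it is the only
# row expected to be joined; the other rows are expected to be NaN.
print(joined)
```

A row is joined only when every entry in its list is a string, which is the case the Notes section describes.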

"""
return _na_map(sep.join, arr)
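
As a rough, pure-Python illustration of why lists with non-string entries end up as NaN, the sketch below applies `sep.join` element by element and falls back to NaN when the join raises. The helper name `na_map_sketch` is hypothetical, and the fallback logic is an assumption about the observable behaviour, not a copy of pandas' `_na_map`.

```python
import math

def na_map_sketch(func, values, na_value=float("nan")):
    """Hypothetical stand-in: apply func to each value, NaN on failure."""
    out = []
    for value in values:
        try:
            out.append(func(value))
        except (TypeError, AttributeError):
            # str.join raises TypeError for non-string items, and for
            # values that are not iterable at all (e.g. a bare float).
            out.append(na_value)
    return out

rows = [
    ["lion", "elephant", "zebra"],
    [1.1, 2.2, 3.3],
    ["cat", float("nan"), "dog"],
]

joined_rows = na_map_sketch("-".join, rows)
print(joined_rows)
```

Here `"-".join` is the same bound method that `str_join` passes along as `sep.join`; only the surrounding mapping logic is simplified.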
