Open
Labels
- API Design
- Accessors: accessor registration mechanism (not .str, .dt, .cat)
- Enhancement
- Needs Discussion: Requires discussion from core team before further action
Description
Currently, to extend pandas Series, DataFrame and Index with user-defined methods, we use accessors in the following way:
import pandas

@pandas.api.extensions.register_series_accessor('emoji')
class Emoji:
    def __init__(self, data):
        self.data = data

    def is_monkey(self):
        """
        This would create `Series().emoji.is_monkey`
        """
        return self.data.isin(['🙈', '🙉', '🙊'])
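For reference, here is a minimal sketch of how the accessor registered above is used once it exists. The sample data and variable names are made up for illustration; only `register_series_accessor` is pandas API.

```python
import pandas as pd


@pd.api.extensions.register_series_accessor("emoji")
class Emoji:
    def __init__(self, data):
        # pandas passes the Series that the accessor was reached from.
        self.data = data

    def is_monkey(self):
        return self.data.isin(["🙈", "🙉", "🙊"])


# Illustrative data: two monkeys and one non-monkey.
s = pd.Series(["🙈", "🐶", "🙊"])

# The method is only reachable through the `emoji` namespace.
result = s.emoji.is_monkey()
print(result.tolist())
```

Note that `s.is_monkey()` is not available here: the accessor mechanism always goes through the registered name.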
While this works well, I think there are two problems with this approach:
- The API looks somewhat intimidating, and it's not well known. I think this is because pandas.api.extensions.register_series_accessor is too long and lives in pandas.api, separate from the functionality most users know.
- It's not possible to register methods directly (Series().is_monkey instead of Series().emoji.is_monkey).
I think all the projects extending pandas that I've seen simply "inject" the methods (except the ones implemented by pandas maintainers). For example:
- https://github.com/PatrikHlobil/Pandas-Bokeh/blob/master/pandas_bokeh/__init__.py#L20
- https://github.com/nalepae/pandarallel/blob/master/pandarallel/pandarallel.py#L52
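As a rough illustration of what "injecting" means here, the sketch below assigns a plain function as an attribute of the Series class. It is not taken from the linked projects; the function name and data are hypothetical.

```python
import pandas as pd


def is_monkey(self):
    """Return a boolean Series marking monkey emoji."""
    return self.isin(["🙈", "🙉", "🙊"])


# Assign the function on the class itself, so every Series gains the method.
# Nothing is registered with pandas; this is ordinary attribute assignment.
pd.Series.is_monkey = is_monkey

s = pd.Series(["🙉", "🐱"])
flags = s.is_monkey()
print(flags.tolist())
```

This gives the direct `Series().is_monkey` spelling, but it bypasses the accessor machinery entirely, so pandas does not know about the new name.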
What I propose is to have an easier/simpler API for the user. To be specific, this is the syntax I'd like when extending Series...
import pandas

@pandas.Series.extend('emoji')
class Emoji:
    def __init__(self, data):
        self.data = data

    def is_monkey(self):
        """
        This would create `Series().emoji.is_monkey`
        """
        return self.data.isin(['🙈', '🙉', '🙊'])

@pandas.Series.extend(namespace='emoji')
def is_monkey(data):
    """
    This would also create `Series().emoji.is_monkey`
    """
    return data.isin(['🙈', '🙉', '🙊'])

@pandas.Series.extend
class Emoji:
    def __init__(self, data):
        self.data = data

    def is_monkey(self):
        """
        This would directly create `Series().is_monkey`
        """
        return self.data.isin(['🙈', '🙉', '🙊'])
@pandas.Series.extend
def is_monkey(data):
    """
    This would directly create `Series().is_monkey`
    """
    return data.isin(['🙈', '🙉', '🙊'])
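To make the four forms above concrete, here is one way such a decorator could be prototyped today on top of the existing accessor mechanism. This is only a sketch under assumptions: `pandas.Series.extend` does not exist, so a standalone function with the hypothetical name `extend_series` stands in for it, and the helpers `_function_namespace`, `_as_accessor_method` and `_as_series_method` are invented for illustration. It ignores name collisions, mixing classes and functions in one namespace, and the DataFrame and Index cases.

```python
import functools

import pandas as pd

# Function-holding accessor classes created by this sketch, keyed by namespace.
_FUNCTION_NAMESPACES = {}


def _function_namespace(namespace):
    """Return the accessor class that collects plain functions for `namespace`."""
    if namespace not in _FUNCTION_NAMESPACES:

        class FunctionAccessor:
            def __init__(self, data):
                self._data = data

        register = pd.api.extensions.register_series_accessor(namespace)
        _FUNCTION_NAMESPACES[namespace] = register(FunctionAccessor)
    return _FUNCTION_NAMESPACES[namespace]


def _as_accessor_method(func):
    """Wrap `func(data, ...)` so it can live as a method on an accessor class."""

    @functools.wraps(func)
    def method(self, *args, **kwargs):
        return func(self._data, *args, **kwargs)

    return method


def _as_series_method(cls, name):
    """Expose `cls(series).name(...)` as a plain method on Series."""

    def method(self, *args, **kwargs):
        return getattr(cls(self), name)(*args, **kwargs)

    method.__name__ = name
    return method


def extend_series(target=None, *, namespace=None):
    """Hypothetical stand-in for the proposed `pandas.Series.extend`."""
    if isinstance(target, str):  # called as @extend_series("emoji")
        target, namespace = None, target

    def register(obj):
        if isinstance(obj, type) and namespace is not None:
            # Class + namespace: the existing accessor mechanism already does this.
            pd.api.extensions.register_series_accessor(namespace)(obj)
        elif isinstance(obj, type):
            # Class, no namespace: each public method becomes a Series method.
            for name, attr in vars(obj).items():
                if callable(attr) and not name.startswith("_"):
                    setattr(pd.Series, name, _as_series_method(obj, name))
        elif namespace is not None:
            # Function + namespace: attach it to a generated accessor class.
            accessor = _function_namespace(namespace)
            setattr(accessor, obj.__name__, _as_accessor_method(obj))
        else:
            # Function, no namespace: inject it as a direct Series method.
            setattr(pd.Series, obj.__name__, obj)
        return obj

    # Bare @extend_series receives the object directly; otherwise return the decorator.
    return register if target is None else register(target)


MONKEYS = ["🙈", "🙉", "🙊"]


@extend_series(namespace="emoji")
def is_monkey(data):
    return data.isin(MONKEYS)


@extend_series
def count_monkeys(data):
    return int(data.isin(MONKEYS).sum())


s = pd.Series(["🙈", "🐶", "🙊"])
namespaced = s.emoji.is_monkey()  # function registered under a namespace
direct = s.count_monkeys()        # function registered as a direct method
```

In this sketch the namespaced forms reuse `register_series_accessor`, while the direct forms fall back to plain attribute assignment on `pandas.Series`, which is the same "injection" that third-party projects do by hand.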
This would make things much easier for the user, because:
- The name pandas.Series.extend is much easier to remember
- A single function can be used (without creating a class)
- A direct method of Series... can be created
CC: @pandas-dev/pandas-core