Add default repr for EAs #23601
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -44,10 +44,12 @@ class ExtensionArray(object): | |
* copy | ||
* _concat_same_type | ||
|
||
An additional method is available to satisfy pandas' internal, | ||
private block API. | ||
A default repr displaying the type, (truncated) data, length, | ||
and dtype is provided. It can be customized or replaced by | ||
overriding: | ||
|
||
* _formatting_values | ||
* _formatter | ||
* __repr__ | ||
|
||
Some methods require casting the ExtensionArray to an ndarray of Python | ||
objects with ``self.astype(object)``, which may be expensive. When | ||
|
@@ -653,15 +655,73 @@ def copy(self, deep=False): | |
raise AbstractMethodError(self) | ||
|
||
# ------------------------------------------------------------------------ | ||
# Block-related methods | ||
# Printing | ||
# ------------------------------------------------------------------------ | ||
def __repr__(self): | ||
from pandas.io.formats.printing import format_object_summary | ||
|
||
template = ( | ||
u'{class_name}' | ||
u'{data}\n' | ||
u'Length: {length}, dtype: {dtype}' | ||
lowercase Length
Why lower? It's uppercase in the Series repr.
It's lowercase everywhere (else)
Everywhere else being index? Categorical uses a capital. I'd like to eventually use pieces of this for the categorical repr, and would rather not break that repr, so I think a capital makes more sense here.
In index it is part of the "keywords" in the constructor-resembling repr. There it indeed makes sense to have it lowercase. But here, I think capitalized is much more logical (it's the first item on that line), and consistent with Series and Categorical.
ok, I guess if we match the EAs to Series, except for the quoting (e.g. quote in EA, but no change in Series, meaning no quoting), then Index is separate. I am still concerned with these slight differences however. E.g. even here, dtype is lowercase and Length is uppercase
Keep in mind that whether or not to quote is up to the individual array. Are you specifically talking about quoting within PeriodArray here? Do we plan to quote within DatetimeArray and TimedeltaArray?
) | ||
# the short repr has no trailing newline, while the truncated | ||
# repr does. So we include a newline in our template, and strip | ||
# any trailing newlines from format_object_summary | ||
data = format_object_summary(self, self._formatter(), name=False, | ||
trailing_comma=False).rstrip() | ||
class_name = u'<{}>\n'.format(self.__class__.__name__) | ||
return template.format(class_name=class_name, data=data, | ||
length=len(self), | ||
dtype=self.dtype) | ||
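To make the template above concrete, here is a minimal sketch that mimics it without importing pandas. `ToyArray` and `summarize` are hypothetical stand-ins introduced only for illustration; in particular, `summarize` is a simplified placeholder for `format_object_summary` and does no truncation or line wrapping, so the real output may differ.

```python
class ToyArray:
    """Hypothetical stand-in for an ExtensionArray subclass (not pandas code)."""

    dtype = "toy"

    def __init__(self, values):
        self._values = list(values)

    def __len__(self):
        return len(self._values)

    def _formatter(self, formatter=None):
        # Same fallback as in the diff: use the supplied callable, else repr.
        return getattr(formatter, "formatter", None) or repr

    def __repr__(self):
        # Class name line, then the data, then the Length/dtype footer.
        template = "{class_name}{data}\nLength: {length}, dtype: {dtype}"
        data = summarize(self._values, self._formatter())
        class_name = "<{}>\n".format(type(self).__name__)
        return template.format(class_name=class_name, data=data,
                               length=len(self), dtype=self.dtype)


def summarize(values, fmt):
    # Assumption: a simplified placeholder for format_object_summary.
    return "[" + ", ".join(fmt(v) for v in values) + "]"


text = repr(ToyArray(["a", "b"]))
print(text)
```

Under these assumptions the repr has three lines: the class name in angle brackets, the formatted values, and the `Length: ..., dtype: ...` footer discussed in the review thread.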
|
||
def _formatter(self, formatter=None): | ||
# type: (Optional[ExtensionArrayFormatter]) -> Callable[[Any], str] | ||
"""Formatting function for scalar values. | ||
|
||
This is used in the default '__repr__'. The formatting function | ||
receives instances of your scalar type. | ||
|
||
Parameters | ||
---------- | ||
formatter : GenericArrayFormatter, optional | ||
The formatter this array is being rendered with. When the array | ||
is being rendered inside an Index, Series, or DataFrame, a | ||
formatter will be provided. To render your objects differently | ||
inside a Series than on their own, check whether | ||
``formatter is None``. | ||
|
||
The default behavior depends on whether `formatter` is passed. | ||
|
||
* When `formatter` is None, :func:`repr` is returned. | ||
* When `formatter` is passed, ``formatter.formatter`` is used, | ||
which falls back to :func:`repr` if that isn't specified. | ||
|
||
In general, just returning :func:`repr` should be fine. | ||
|
||
Returns | ||
------- | ||
Callable[[Any], str] | ||
A callable that gets instances of the scalar type and | ||
returns a string. | ||
""" | ||
return getattr(formatter, 'formatter', None) or repr | ||
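The fallback in that return statement can be checked in isolation. The sketch below copies the expression into a hypothetical free function, `pick_formatter`, and uses `SimpleNamespace` objects as assumed stand-ins for a formatter object carrying a `formatter` attribute; neither is part of pandas.

```python
from types import SimpleNamespace


def pick_formatter(formatter=None):
    # Mirrors the expression in the diff: prefer formatter.formatter,
    # and fall back to repr when it is missing or None.
    return getattr(formatter, "formatter", None) or repr


# Hypothetical stand-ins for a formatter object.
with_callable = SimpleNamespace(formatter=lambda x: "<{}>".format(x))
without_callable = SimpleNamespace(formatter=None)

standalone = pick_formatter()                # no formatter object: repr
inside = pick_formatter(with_callable)       # uses the supplied callable
fallback = pick_formatter(without_callable)  # attribute is None: repr

print(standalone("a"), inside("a"), fallback("a"))
```

Because `getattr` is given a default of `None`, passing no formatter and passing one whose `formatter` attribute is unset both end up at `repr`.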
|
||
def _formatting_values(self): | ||
# type: () -> np.ndarray | ||
# At the moment, this has to be an array since we use result.dtype | ||
"""An array of values to be printed in, e.g. the Series repr""" | ||
"""An array of values to be printed in, e.g. the Series repr | ||
|
||
.. deprecated:: 0.24.0 | ||
|
||
Use :meth:`ExtensionArray._formatter` instead. | ||
""" | ||
return np.array(self) | ||
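As a hedged illustration of the migration the deprecation note points to, an array author might override `_formatter` and return a per-scalar callable instead of building a whole array in `_formatting_values`. `MoneyArray` below is a hypothetical class storing integer cents, not a pandas type, and the `formatter is None` check follows the suggestion in the `_formatter` docstring.

```python
class MoneyArray:
    """Hypothetical array of integer cents (illustration only)."""

    def __init__(self, cents):
        self._cents = list(cents)

    def _formatter(self, formatter=None):
        if formatter is None:
            # Rendered on its own: use a verbose form.
            return lambda c: "Money({:.2f})".format(c / 100)
        # Rendered inside a container such as a Series: use a compact form.
        return lambda c: "{:.2f}".format(c / 100)


arr = MoneyArray([150, 275])
standalone = [arr._formatter()(c) for c in arr._cents]
# Any non-None object stands in for the container's formatter here.
boxed = [arr._formatter(formatter=object())(c) for c in arr._cents]
print(standalone, boxed)
```

The callable receives one scalar at a time, so no intermediate object array needs to be materialized just for printing.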
|
||
# ------------------------------------------------------------------------ | ||
# Reshaping | ||
# ------------------------------------------------------------------------ | ||
|
||
@classmethod | ||
def _concat_same_type(cls, to_concat): | ||
# type: (Sequence[ExtensionArray]) -> ExtensionArray | ||
|