Closed
Labels
ExtensionArray (Extending pandas with custom dtypes or arrays.), Usage Question
Description
Greetings, pandas devs! cuDF is building out additional dtypes such as cudf.CategoricalDtype and cudf.ListDtype based on pd.ExtensionDtype, and this is one question that came up.
The documentation states:
It’s expected ExtensionArray[item] returns an instance of ExtensionDtype.type for scalar item, assuming that value is valid (not NA). NA values do not need to be instances of type.
However, I note that pd.CategoricalDtype, for instance, does not adhere to this:
In [47]: import pandas as pd
In [48]: a = pd.Series(['a', 'b'], dtype='category')
In [49]: type(a[0])
Out[49]: str
In [50]: type(a.array[0])
Out[50]: str
In [51]: isinstance(a.array, pd.api.extensions.ExtensionArray)
Out[51]: True
In [52]: isinstance(a.dtype, pd.api.extensions.ExtensionDtype)
Out[52]: True
On the other hand, NumPy defines dtype.type somewhat differently:
The type object used to instantiate a scalar of this data-type.
Would love any insights as to the appropriate return value of .type.
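To make the comparison concrete, here is a minimal sketch that checks whether a scalar pulled out of an array is an instance of that array's dtype.type. The helper name scalar_matches_dtype_type is hypothetical (it is not part of pandas or NumPy), and the Int64 and NumPy cases are included only as contrast; results may differ between library versions.

```python
import numpy as np
import pandas as pd


def scalar_matches_dtype_type(arr):
    """Hypothetical helper: is arr[0] an instance of arr.dtype.type?"""
    return isinstance(arr[0], arr.dtype.type)


# Categorical extension array, as in the session above.
cat = pd.Series(["a", "b"], dtype="category").array
# A nullable integer extension array, for contrast.
ints = pd.array([1, 2], dtype="Int64")
# A plain NumPy array, where dtype.type is the scalar type itself.
np_ints = np.array([1, 2], dtype="int64")

cat_ok = scalar_matches_dtype_type(cat)
ints_ok = scalar_matches_dtype_type(ints)
np_ok = scalar_matches_dtype_type(np_ints)

print("categorical:", cat.dtype.type, type(cat[0]), cat_ok)
print("Int64:      ", ints.dtype.type, type(ints[0]), ints_ok)
print("numpy int64:", np_ints.dtype.type, type(np_ints[0]), np_ok)
```

The same check could be run against a cuDF dtype to see which of the two behaviours it follows.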