Open
Labels
Bug, ExtensionArray, Strings, Subclassing
Description
Code Sample, a copy-pastable example
import pandas as pd
from pandas import StringDtype
from pandas.core.arrays import StringArray
from pandas.core.dtypes.dtypes import register_extension_dtype


@register_extension_dtype
class MyExtensionDtype(StringDtype):
    name = 'my_extension'

    def __repr__(self) -> str:
        return "MyExtensionDtype"

    @classmethod
    def construct_array_type(cls) -> "Type[MyExtensionStringArray]":
        return MyExtensionStringArray


class MyExtensionStringArray(StringArray):
    def __init__(self, values, copy=False):
        super().__init__(values, copy)
        self._dtype = MyExtensionDtype()


series = pd.Series(["test", "test2"], dtype="my_extension")
assert series.dtype == 'my_extension'
Results in
    assert dtype == "string"
AssertionError
Problem description
It should be possible to extend StringDtype/StringArray so that users can design efficient subtypes. I believe the AssertionError is a bug rather than intended behavior: pandas provides ExtensionDtype precisely so that dtypes can be extended.
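To illustrate why I think the assertion trips for a subclass, here is a small pandas-free sketch. The classes and helper functions below (BaseStringDtype, MyDtype, strict_check, subclass_friendly_check) are hypothetical stand-ins, not pandas code, and the equality rule is an assumed simplification of how a dtype compares against a string. It only shows the general shape of the problem: a check against the literal name "string" rejects a subclass that registers under a different name, while an isinstance-style check would accept it.

```python
# Simplified stand-ins for illustration only; these are NOT pandas classes.
class BaseStringDtype:
    name = "string"

    def __eq__(self, other):
        # Assumed simplification: a dtype equals a string only when that
        # string is the dtype's own registered name.
        if isinstance(other, str):
            return other == self.name
        return type(self) is type(other)

    def __hash__(self):
        return hash(self.name)


class MyDtype(BaseStringDtype):
    # The subclass registers under a different name, as in the report above.
    name = "my_extension"


def strict_check(dtype):
    # Same shape as the assertion shown in the traceback.
    return dtype == "string"


def subclass_friendly_check(dtype):
    # Hypothetical relaxed check: accept the base dtype and any subclass.
    return isinstance(dtype, BaseStringDtype)


base, custom = BaseStringDtype(), MyDtype()
print(strict_check(base), strict_check(custom))
print(subclass_friendly_check(base), subclass_friendly_check(custom))
```

Whether a relaxed check of this kind is the right fix inside pandas is for the maintainers to decide; the sketch is only meant to show why comparing against a fixed name rules out subclasses.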
Expected Output
The code above should pass without errors.
A PR with a fix is on its way.
Output of pd.show_versions()
pandas v1.0.3