Closed
Labels
Bug; Compat (pandas objects compatibility with Numpy or Python functions); Dtype Conversions (unexpected or buggy dtype conversions)
Description
xref #14937 (comment)
A number of indexing / conversion issues arise because we treat uint as a direct int rather than as a sub-class. If we make UIntBlock a sub-class of IntBlock, I think we can easily handle this with some small overrides, for instance to check for negative values when indexing.
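As a rough illustration of the sub-class idea, the sketch below uses two toy classes. ToyIntBlock, ToyUIntBlock, can_hold and setitem are hypothetical names and are not pandas internals; they only show how a small override of the bounds could reject negative values before they are written.

```python
class ToyIntBlock:
    """Hypothetical stand-in for a signed 64-bit block (not pandas code)."""

    min_value = -(2**63)
    max_value = 2**63 - 1

    def __init__(self, values):
        self.values = list(values)

    def can_hold(self, value):
        # Accept only true integers (bool excluded) that fit in the range.
        return (
            isinstance(value, int)
            and not isinstance(value, bool)
            and self.min_value <= value <= self.max_value
        )

    def setitem(self, index, value):
        if not self.can_hold(value):
            raise ValueError(f"cannot store {value!r} in {type(self).__name__}")
        self.values[index] = value


class ToyUIntBlock(ToyIntBlock):
    """Unsigned variant: only the bounds change, so negatives are rejected."""

    min_value = 0
    max_value = 2**64 - 1


block = ToyUIntBlock([1, 2, 3])
try:
    block.setitem(1, -1)
    rejected = False
except ValueError:
    rejected = True
```

Whether a rejected value should raise, as here, or trigger an upcast to another dtype is a design choice this sketch leaves open.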
In [1]: df = pd.DataFrame({'A' : np.array([1,2,3],dtype='uint64'), 'B': range(3)})
In [2]: df
Out[2]:
   A  B
0  1  0
1  2  1
2  3  2
In [4]: df.dtypes
Out[4]:
A    uint64
B     int64
dtype: object
Buggy
In [5]: df.iloc[1] = -1
In [6]: df
Out[6]:
                      A  B
0                     1  0
1  18446744073709551615 -1
2                     3  2
In [7]: df.iloc[1] = np.nan
In [8]: df
Out[8]:
     A    B
0  1.0  0.0
1  NaN  NaN
2  3.0  2.0
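The value 18446744073709551615 in Out[6] is what -1 becomes when its 64-bit pattern is read as unsigned, i.e. -1 modulo 2**64. The following plain-Python sketch reproduces that arithmetic without pandas or NumPy; wrap_to_uint64 is a hypothetical helper name used only for illustration.

```python
import struct

UINT64_MODULUS = 2**64


def wrap_to_uint64(value):
    """Return the uint64 value sharing the 64-bit pattern of a signed int64."""
    return value % UINT64_MODULUS


# The same result, obtained by reinterpreting the raw 8 bytes.
raw = struct.pack("<q", -1)                   # -1 as little-endian signed 64-bit
reinterpreted = struct.unpack("<Q", raw)[0]   # read the same bytes as unsigned

wrapped = wrap_to_uint64(-1)
```

Both wrapped and reinterpreted equal 2**64 - 1, which is why the assignment silently produces a huge positive number instead of failing.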
This is correct: with NaN in the column, converting back to uint64 raises.
In [9]: df.A.astype('uint64')
---------------------------------------------------------------------------
ValueError: Cannot convert non-finite values (NA or inf) to integer
However, this is not: once -1 has been assigned, the same conversion silently wraps to 18446744073709551615.
In [10]: df.iloc[1] = -1
In [11]: df
Out[11]:
     A    B
0  1.0  0.0
1 -1.0 -1.0
2  3.0  2.0
In [12]: df.dtypes
Out[12]:
A    float64
B    float64
dtype: object
In [13]: df.A.astype('uint64')
Out[13]:
0                       1
1    18446744073709551615
2                       3
Name: A, dtype: uint64
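One way to make the In [13] conversion fail loudly, as In [9] does, is to validate the values before casting. The helper below is a plain-Python sketch and not pandas code: checked_uint64 is a hypothetical name, and raising OverflowError for negative or too-large input is an assumption about the desired behaviour.

```python
import math

UINT64_MAX = 2**64 - 1


def checked_uint64(values):
    """Convert numbers to ints, refusing anything uint64 cannot represent."""
    result = []
    for v in values:
        if isinstance(v, float):
            # Mirror the existing error for NaN / inf.
            if not math.isfinite(v):
                raise ValueError(
                    "Cannot convert non-finite values (NA or inf) to integer"
                )
            # Reject fractional values instead of truncating them.
            if v != math.floor(v):
                raise ValueError(f"{v!r} is not a whole number")
        # Reject values outside [0, 2**64 - 1] instead of wrapping.
        if v < 0 or v > UINT64_MAX:
            raise OverflowError(f"{v!r} is out of range for uint64")
        result.append(int(v))
    return result
```

With this check, the column [1.0, -1.0, 3.0] would be rejected rather than converted to [1, 18446744073709551615, 3].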
Construction with invalid values
In [1]: pd.Series([-1], dtype='uint64')
Out[1]:
0    18446744073709551615
dtype: uint64