+If a DataFrame contains columns of multiple dtypes, the dtype of the column
+will be chosen to accommodate all of the data types (dtype=object is the most
+general).
 
-.. _basics.cast.infer:
+The related method ``get_dtype_counts`` will return the number of columns of
+each type:
 
-Inferring better types for object columns
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+.. ipython:: python
 
-The ``convert_objects`` DataFrame method will attempt to convert
-``dtype=object`` columns to a better NumPy dtype. Occasionally (after
-transposing multiple times, for example), a mixed-type DataFrame will end up
-with everything as ``dtype=object``. This method attempts to fix that:
+   dft.get_dtype_counts()
+
+Numeric dtypes will propagate and can coexist in DataFrames (starting in v0.11.0).
+If a dtype is passed (either directly via the ``dtype`` keyword, a passed ``ndarray``,
+or a passed ``Series``), then it will be preserved in DataFrame operations. Furthermore, different numeric dtypes will **NOT** be combined. The following example will give you a taste.
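
A minimal sketch of dtypes coexisting in one frame. The column values are hypothetical, and because ``get_dtype_counts`` is not available in recent pandas releases, the sketch counts dtypes with ``dtypes.value_counts()`` instead; treat that as a stand-in, not the method shown in this diff.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: every column is created with an explicit dtype.
dft = pd.DataFrame({
    "A": pd.Series([1.0, 2.0, 3.0], dtype="float32"),
    "B": np.array([1, 2, 3], dtype="int8"),
    "C": pd.Series([1.5, 2.5, 3.5], dtype="float64"),
})

# Each column keeps the dtype it was built with.
print(dft.dtypes)

# Stand-in for get_dtype_counts: number of columns per dtype name.
counts = dft.dtypes.astype(str).value_counts().to_dict()
print(counts)

# An operation on a single column preserves that column's dtype.
doubled = dft["A"] * 2
print(doubled.dtype)
```

The three columns are not merged into a single common dtype; each one is stored and reported separately.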

-If a DataFrame contains columns of multiple dtypes, the dtype of the column
-will be chosen to accommodate all of the data types (dtype=object is the most
-general).
-
-The related method ``get_dtype_counts`` will return the number of columns of
-each type:
-
-.. ipython:: python
-
-   df.get_dtype_counts()
-
-Numeric dtypes will propagate and can coexist in DataFrames (starting in v0.11.0).
-If a dtype is passed (either directly via the ``dtype`` keyword, a passed ``ndarray``,
-or a passed ``Series``), then it will be preserved in DataFrame operations. Furthermore, different numeric dtypes will **NOT** be combined. The following example will give you a taste.

-   # this is lowest-common-denominator upcasting (meaning you get the dtype which can accommodate all of the types)
-   df3.values.dtype
-
-Upcasting is always according to the **numpy** rules. If two different dtypes are involved in an operation, then the more *general* one will be used as the result of the operation.
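
The upcasting rule can be checked directly against NumPy. In this sketch the name ``df3`` is borrowed from the diff, but its contents are hypothetical, and ``to_numpy()`` is used where the diff uses ``.values``.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df3: three numeric columns of different widths.
df3 = pd.DataFrame({
    "A": pd.Series([1, 2], dtype="int8"),
    "B": pd.Series([1.0, 2.0], dtype="float32"),
    "C": pd.Series([1.0, 2.0], dtype="float64"),
})

# NumPy's promotion result for the three column dtypes.
promoted = np.result_type(np.int8, np.float32, np.float64)

# Materialising the whole frame as one array needs a single dtype
# that can hold every column.
values_dtype = df3.to_numpy().dtype

# An operation mixing float32 and float64 yields the more general dtype.
summed = df3["B"] + df3["C"]

print(promoted, values_dtype, summed.dtype)
```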
-
-DataType Conversion
-~~~~~~~~~~~~~~~~~~~
-
-You can use the ``astype`` method to convert dtypes from one to another. This *always* returns a copy.
-In addition, ``convert_objects`` will attempt a *soft* conversion of any *object* dtypes, meaning that if all the objects in a Series are of the same type, the Series
-will have that dtype.
-
-.. ipython:: python
-
-   df3
-   df3.dtypes
-
-   # conversion of dtypes
-   df3.astype('float32').dtypes
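
A short sketch of the two conversions described above, with hypothetical data. ``convert_objects`` is not available in recent pandas releases, so the soft conversion is illustrated with ``infer_objects``, which plays a similar role; that substitution is an assumption of this example, not part of the diff.

```python
import pandas as pd

# Hypothetical frame with an integer and a float column.
df = pd.DataFrame({"A": [1, 2, 3], "B": [1.5, 2.5, 3.5]})

# astype returns a new object; the original frame keeps its dtypes.
converted = df.astype("float32")
converted.loc[0, "A"] = 100.0  # does not touch df

# Soft conversion: an object Series holding only integers
# is inferred as an integer Series.
obj_series = pd.Series([1, 2, 3], dtype=object)
inferred = obj_series.infer_objects()

print(df.dtypes)
print(converted.dtypes)
print(inferred.dtype)
```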
-
-To force conversion of number-like values, pass ``convert_numeric=True``.
-This will force strings and numbers alike to be numbers if possible; otherwise they will be set to ``np.nan``.
-To force conversion to ``datetime64[ns]``, pass ``convert_dates='coerce'``.
-This will convert any datetime-like object to dates, forcing other values to ``NaT``.
-
-.. ipython:: python
-
-   # mixed type conversions
-   df3['D'] = '1.'
-   df3['E'] = '1'
-   df3.convert_objects(convert_numeric=True).dtypes
-
-   # same, but specific dtype conversion
-   df3['D'] = df3['D'].astype('float16')
-   df3['E'] = df3['E'].astype('int32')
-   df3.dtypes
-
-   # forcing date coercion
-   s = Series([datetime(2001, 1, 1, 0, 0), 'foo', 1.0, 1, Timestamp('20010104'), '20010105'], dtype='O')
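
The forced coercion described above can be sketched with ``pd.to_numeric`` and ``pd.to_datetime``, both of which accept ``errors="coerce"``. These are swapped in for ``convert_objects`` here as an assumption, and the input values are hypothetical, string-only simplifications of the data in the diff.

```python
import pandas as pd

# Number-like strings plus one value that cannot be parsed.
numbers = pd.Series(["1.", "1", "foo"], dtype=object)
as_numeric = pd.to_numeric(numbers, errors="coerce")  # "foo" becomes NaN

# Date-like strings plus one value that cannot be parsed.
dates = pd.Series(["2001-01-04", "foo", "2001-01-05"], dtype=object)
as_dates = pd.to_datetime(dates, errors="coerce")  # "foo" becomes NaT

print(as_numeric)
print(as_dates)
```

With ``errors="coerce"``, unparseable entries are replaced by the missing-value marker rather than raising an error.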