@@ -965,21 +965,26 @@ If you select a label *contained* within an interval, this will also select the
   df.loc[2.5]
   df.loc[[2.5, 3.5]]

- ``Interval`` and ``IntervalIndex`` are used by ``cut`` and ``qcut``:
+ :func:`cut` and :func:`qcut` both return a ``Categorical`` object, and the bins they
+ create are stored as an ``IntervalIndex`` in its ``.categories`` attribute.

.. ipython:: python

   c = pd.cut(range(4), bins=2)
   c
   c.categories

- Furthermore, ``IntervalIndex`` allows one to bin *other* data with these same
- bins, with ``NaN`` representing a missing value similar to other dtypes.
+ :func:`cut` also accepts an ``IntervalIndex`` for its ``bins`` argument, which enables
+ a useful pandas idiom. First, we call :func:`cut` with some data and ``bins`` set to a
+ fixed number, to generate the bins. Then, we pass the values of ``.categories`` as the
+ ``bins`` argument in subsequent calls to :func:`cut`, supplying new data which will be
+ binned into the same bins.

.. ipython:: python

   pd.cut([0, 3, 5, 1], bins=c.categories)

+ Any value which falls outside all bins will be assigned a ``NaN`` value.
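
As a minimal sketch of the idiom described above (the variable names ``first``, ``new_values``, ``rebinned`` and ``outside`` are illustrative and not part of the change), the out-of-range values can be located by testing the re-binned result for missing entries:

```python
import pandas as pd

# Derive two equal-width bins from a first batch of data.
first = pd.cut(list(range(4)), bins=2)

# Reuse those bins (an IntervalIndex) for a second, hypothetical batch.
new_values = [0, 3, 5, 1]
rebinned = pd.cut(new_values, bins=first.categories)

# 5 lies beyond the right edge of the last bin, so it is left unbinned (NaN).
outside = pd.isna(rebinned)
print(rebinned)
print(outside)
```

Here only the third value is flagged, since 0, 3 and 1 all fall inside the bins generated from the first batch.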

Generating ranges of intervals
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -1108,6 +1113,8 @@ the :meth:`~Index.is_unique` attribute.
   weakly_monotonic.is_monotonic_increasing
   weakly_monotonic.is_monotonic_increasing & weakly_monotonic.is_unique

+ .. _advanced.endpoints_are_inclusive:
+
Endpoints are inclusive
~~~~~~~~~~~~~~~~~~~~~~~

@@ -1137,7 +1144,7 @@ index can be somewhat complicated. For example, the following does not work:
   s.loc['c':'e' + 1]

A very common use case is to limit a time series to start and end at two
- specific dates. To enable this, we made the design to make label-based
+ specific dates. To enable this, we made the design choice to make label-based
slicing include both endpoints:

.. ipython:: python