Commit 32e486c

ARF authored and jreback committed
Introduction of RangeIndex
`RangeIndex(1, 10, 2)` is a memory-saving alternative to `Index(np.arange(1, 10, 2))`; cf. pandas-dev#939.

This re-implementation is compatible with the current `Index()` API and is a drop-in replacement for `Int64Index()`. It automatically converts to `Int64Index()` when an operation requires it. At present the type is preserved only for a small number of operations (e.g. slicing and inner, left and right joins). Most other operations trigger creation of an equivalent Int64Index (or at least an equivalent numpy array) and fall back to its implementation.

This PR also extends the functionality of the `Index()` constructor to allow creation of `RangeIndexes()` with

```
Index(20)
Index(2, 20)
Index(0, 20, 2)
```

in analogy to

```
range(20)
range(2, 20)
range(0, 20, 2)
```

restore Index() fastpath precedence

Various fixes suggested by @jreback and @shoyer

Cache a private Int64Index object the first time it or its values are required.
Restore Index(5) as error. Restore its test.
Allow Index(0, 5) and Index(0, 5, 1).
Make RangeIndex immutable. See start, stop, step properties.
In test_constructor(): check class, attributes (possibly including dtype).
In test_copy(): check that the copy is not identical (but equal) to the existing index.
In test_duplicates(): assert is_unique and has_duplicates return correct values.

fix slicing

fix view

Set RangeIndex as default index

* enh: set RangeIndex as default index
* fix: pandas.io.packers: encode() and decode() for RangeIndex
* enh: array argument pass-through
* fix: reindex
* fix: use _default_index() in pandas.core.frame.extract_index()
* fix: pandas.core.index.Index._is()
* fix: add RangeIndex to ABCIndexClass
* fix: use _default_index() in _get_names_from_index()
* fix: pytables tests
* fix: MultiIndex.get_level_values()
* fix: RangeIndex._shallow_copy()
* fix: null-size RangeIndex equals() comparison
* enh: make RangeIndex.is_unique immutable

enh: various performance optimizations

* optimize argsort()
* optimize tolist()
* comment clean-up
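The design described in the commit message (store only start, stop and step; keep them read-only; materialize the full values once and cache them; keep the lazy type under slicing) can be illustrated with a small pure-Python sketch. `LazyRange` and all of its members are hypothetical names chosen for illustration; this is not the pandas implementation, and a plain list stands in for the cached Int64Index.

```python
class LazyRange:
    """Hypothetical stand-in for the RangeIndex idea; not pandas code."""

    def __init__(self, start, stop, step=1):
        # Only three integers are stored, regardless of length.
        self._range = range(start, stop, step)
        self._cache = None  # materialized values, built on first use

    # Read-only properties: there are no setters, so assignment fails.
    @property
    def start(self):
        return self._range.start

    @property
    def stop(self):
        return self._range.stop

    @property
    def step(self):
        return self._range.step

    @property
    def values(self):
        # Materialize once, then reuse (a list plays the role of the
        # cached Int64Index mentioned in the commit message).
        if self._cache is None:
            self._cache = list(self._range)
        return self._cache

    def __len__(self):
        return len(self._range)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # Slicing a range yields a range, so the lazy type survives.
            sub = self._range[key]
            return LazyRange(sub.start, sub.stop, sub.step)
        return self._range[key]

    def equals(self, other):
        # Python ranges compare as sequences, so two empty ranges are
        # equal even when their start and stop values differ.
        return self._range == other._range


idx = LazyRange(1, 10, 2)
sub = idx[1:3]  # still a LazyRange; nothing has been materialized yet
```

In this sketch, operations that cannot be answered from start, stop and step alone would go through `values`, which corresponds to the fallback to Int64Index that the commit message describes.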
1 parent 6b8a721 commit 32e486c

7 files changed, +1070 -26 lines changed

pandas/core/api.py

Lines changed: 1 addition & 1 deletion

@@ -8,7 +8,7 @@
 from pandas.core.categorical import Categorical
 from pandas.core.groupby import Grouper
 from pandas.core.format import set_eng_float_format
-from pandas.core.index import Index, CategoricalIndex, Int64Index, Float64Index, MultiIndex
+from pandas.core.index import Index, CategoricalIndex, Int64Index, RangeIndex, Float64Index, MultiIndex

 from pandas.core.series import Series, TimeSeries
 from pandas.core.frame import DataFrame

pandas/core/common.py

Lines changed: 8 additions & 4 deletions

@@ -85,6 +85,7 @@ def _check(cls, inst):
 ABCCategoricalIndex = create_pandas_abc_type("ABCCategoricalIndex", "_typ", ("categoricalindex",))
 ABCIndexClass = create_pandas_abc_type("ABCIndexClass", "_typ", ("index",
                                                                  "int64index",
+                                                                 "rangeindex",
                                                                  "float64index",
                                                                  "multiindex",
                                                                  "datetimeindex",
@@ -1755,10 +1756,8 @@ def is_bool_indexer(key):


 def _default_index(n):
-    from pandas.core.index import Int64Index
-    values = np.arange(n, dtype=np.int64)
-    result = Int64Index(values,name=None)
-    result.is_unique = True
+    from pandas.core.index import RangeIndex
+    result = RangeIndex(0, int(n), name=None)
     return result


@@ -2156,6 +2155,11 @@ def is_int64_dtype(arr_or_dtype):
     tipo = _get_dtype_type(arr_or_dtype)
     return issubclass(tipo, np.int64)

+def is_int64_dtype(arr_or_dtype):
+    tipo = _get_dtype_type(arr_or_dtype)
+    return issubclass(tipo, np.int64)
+
+
 def is_int_or_datetime_dtype(arr_or_dtype):
     tipo = _get_dtype_type(arr_or_dtype)
     return (issubclass(tipo, np.integer) or
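
The `_default_index` hunk in this file swaps an eagerly built `np.arange` array for a lazy range description. The effect can be approximated without pandas using Python's built-in `range`, which likewise stores only start, stop and step. The sketch below is an analogy, not the pandas code: `default_index_sketch` is a hypothetical name, and `sys.getsizeof` reports only the container size, so the numbers are indicative at best.

```python
import sys


def default_index_sketch(n):
    """Hypothetical analogue of _default_index: a lazy 0..n-1 range.

    int(n) mirrors the coercion in the diff, so an integral float or a
    NumPy integer length would also be accepted.
    """
    return range(0, int(n))


n = 100_000
lazy = default_index_sketch(n)   # constant-size description of 0..n-1
eager = list(lazy)               # fully materialized, like np.arange(n)

lazy_bytes = sys.getsizeof(lazy)
eager_bytes = sys.getsizeof(eager)  # container only, excludes the int objects

print("lazy:", lazy_bytes, "bytes; eager:", eager_bytes, "bytes")
```

The lazy object's size does not depend on `n`, which is the memory saving the commit message refers to; the actual savings in pandas depend on when the cached Int64Index gets materialized.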

pandas/core/frame.py

Lines changed: 3 additions & 3 deletions

@@ -5250,7 +5250,7 @@ def extract_index(data):
                            % (lengths[0], len(index)))
                     raise ValueError(msg)
             else:
-                index = Index(np.arange(lengths[0]))
+                index = _default_index(lengths[0])

     return _ensure_index(index)

@@ -5467,11 +5467,11 @@ def convert(arr):


 def _get_names_from_index(data):
-    index = lrange(len(data))
     has_some_name = any([getattr(s, 'name', None) is not None for s in data])
     if not has_some_name:
-        return index
+        return _default_index(len(data))

+    index = lrange(len(data))
     count = 0
     for i, s in enumerate(data):
         n = getattr(s, 'name', None)
