Commit cac2245

Merge branch 'shiny-new-feature' of https://github.com/EzraBrauner/pandas into shiny-new-feature
2 parents 1bc2820 + c8ddb0a commit cac2245

52 files changed

+324
-322
lines changed

doc/source/conf.py

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -225,11 +225,24 @@
225225
# Theme options are theme-specific and customize the look and feel of a theme
226226
# further. For a list of options available for each theme, see the
227227
# documentation.
228+
229+
switcher_version = version
230+
if ".dev" in version:
231+
switcher_version = "dev"
232+
elif "rc" in version:
233+
switcher_version = version.split("rc")[0] + " (rc)"
234+
228235
html_theme_options = {
229236
"external_links": [],
230237
"github_url": "https://github.com/pandas-dev/pandas",
231238
"twitter_url": "https://twitter.com/pandas_dev",
232239
"google_analytics_id": "UA-27880019-2",
240+
"navbar_end": ["version-switcher", "navbar-icon-links"],
241+
"switcher": {
242+
"json_url": "https://pandas.pydata.org/versions.json",
243+
"url_template": "https://pandas.pydata.org/{version}/",
244+
"version_match": switcher_version,
245+
},
233246
}
234247

235248
# Add any paths that contain custom themes here, relative to this directory.
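The version-matching branch added above can be tried in isolation. The following is only a sketch: the helper name `switcher_version_for` and the sample version strings are assumptions for illustration, not part of `conf.py`.

```python
def switcher_version_for(version: str) -> str:
    """Hypothetical helper mirroring the branch logic added to conf.py."""
    if ".dev" in version:
        # Any development build maps to the single "dev" switcher entry.
        return "dev"
    elif "rc" in version:
        # Release candidates keep the base version and gain an " (rc)" suffix.
        return version.split("rc")[0] + " (rc)"
    # Final releases are passed through unchanged.
    return version


# Made-up version strings, chosen only to exercise each branch.
examples = ["1.5.0.dev0+134.g1bc2820", "1.4.0rc0", "1.4.0"]
results = {v: switcher_version_for(v) for v in examples}
```

The resulting string is what the diff passes as `version_match`, so it presumably has to agree with an entry in the `versions.json` file referenced by `json_url`.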

doc/source/development/contributing_codebase.rst

Lines changed: 1 addition & 2 deletions
@@ -490,8 +490,7 @@ Writing tests
490490
All tests should go into the ``tests`` subdirectory of the specific package.
491491
This folder contains many current examples of tests, and we suggest looking to these for
492492
inspiration. If your test requires working with files or
493-
network connectivity, there is more information on the `testing page
494-
<https://github.com/pandas-dev/pandas/wiki/Testing>`_ of the wiki.
493+
network connectivity, there is more information on the :wiki:`Testing` of the wiki.
495494

496495
The ``pandas._testing`` module has many special ``assert`` functions that
497496
make it easier to make statements about whether Series or DataFrame objects are

doc/source/development/roadmap.rst

Lines changed: 2 additions & 4 deletions
@@ -74,8 +74,7 @@ types. This includes consistent behavior in all operations (indexing, arithmetic
7474
operations, comparisons, etc.). There has been discussion of eventually making
7575
the new semantics the default.
7676

77-
This has been discussed at
78-
`github #28095 <https://github.com/pandas-dev/pandas/issues/28095>`__ (and
77+
This has been discussed at :issue:`28095` (and
7978
linked issues), and described in more detail in this
8079
`design doc <https://hackmd.io/@jorisvandenbossche/Sk0wMeAmB>`__.
8180

@@ -129,8 +128,7 @@ We propose that it should only work with positional indexing, and the translatio
129128
to positions should be entirely done at a higher level.
130129

131130
Indexing is a complicated API with many subtleties. This refactor will require care
132-
and attention. More details are discussed at
133-
https://github.com/pandas-dev/pandas/wiki/(Tentative)-rules-for-restructuring-indexing-code
131+
and attention. More details are discussed at :wiki:`(Tentative)-rules-for-restructuring-indexing-code`
134132

135133
Numba-accelerated operations
136134
----------------------------

doc/source/ecosystem.rst

Lines changed: 1 addition & 2 deletions
@@ -586,7 +586,6 @@ Development tools
586586
While pandas repository is partially typed, the package itself doesn't expose this information for external use.
587587
Install pandas-stubs to enable basic type coverage of pandas API.
588588

589-
Learn more by reading through these issues `14468 <https://github.com/pandas-dev/pandas/issues/14468>`_,
590-
`26766 <https://github.com/pandas-dev/pandas/issues/26766>`_, `28142 <https://github.com/pandas-dev/pandas/issues/28142>`_.
589+
Learn more by reading through :issue:`14468`, :issue:`26766`, :issue:`28142`.
591590

592591
See installation and usage instructions on the `github page <https://github.com/VirtusLab/pandas-stubs>`__.

doc/source/user_guide/10min.rst

Lines changed: 2 additions & 4 deletions
@@ -554,10 +554,8 @@ Stack
554554
555555
tuples = list(
556556
zip(
557-
*[
558-
["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
559-
["one", "two", "one", "two", "one", "two", "one", "two"],
560-
]
557+
["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
558+
["one", "two", "one", "two", "one", "two", "one", "two"],
561559
)
562560
)
563561
index = pd.MultiIndex.from_tuples(tuples, names=["first", "second"])
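The simplification in this hunk relies on `zip(*[a, b])` and `zip(a, b)` producing the same pairs. A minimal check, using shortened lists (the variable names here are illustrative, not from the docs):

```python
first = ["bar", "bar", "baz", "baz"]
second = ["one", "two", "one", "two"]

# Old form: build a list of the two sequences, then unpack it into zip.
unpacked = list(zip(*[first, second]))

# New form: pass the two sequences to zip directly.
direct = list(zip(first, second))
```

Both forms yield identical tuples, so the input to `MultiIndex.from_tuples` is unchanged.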

doc/source/user_guide/advanced.rst

Lines changed: 1 addition & 1 deletion
@@ -1246,5 +1246,5 @@ This is because the (re)indexing operations above silently inserts ``NaNs`` and
12461246
changes accordingly. This can cause some issues when using ``numpy`` ``ufuncs``
12471247
such as ``numpy.logical_and``.
12481248

1249-
See the `this old issue <https://github.com/pandas-dev/pandas/issues/2388>`__ for a more
1249+
See the :issue:`2388` for a more
12501250
detailed discussion.

doc/source/user_guide/cookbook.rst

Lines changed: 7 additions & 14 deletions
@@ -193,8 +193,7 @@ The :ref:`indexing <indexing>` docs.
193193
194194
df[(df.AAA <= 6) & (df.index.isin([0, 2, 4]))]
195195
196-
`Use loc for label-oriented slicing and iloc positional slicing
197-
<https://github.com/pandas-dev/pandas/issues/2904>`__
196+
Use loc for label-oriented slicing and iloc positional slicing :issue:`2904`
198197

199198
.. ipython:: python
200199
@@ -395,8 +394,7 @@ Sorting
395394
396395
df.sort_values(by=("Labs", "II"), ascending=False)
397396
398-
`Partial selection, the need for sortedness;
399-
<https://github.com/pandas-dev/pandas/issues/2995>`__
397+
Partial selection, the need for sortedness :issue:`2995`
400398

401399
Levels
402400
******
@@ -910,8 +908,7 @@ Valid frequency arguments to Grouper :ref:`Timeseries <timeseries.offset_aliases
910908
`Grouping using a MultiIndex
911909
<https://stackoverflow.com/questions/41483763/pandas-timegrouper-on-multiindex>`__
912910

913-
`Using TimeGrouper and another grouping to create subgroups, then apply a custom function
914-
<https://github.com/pandas-dev/pandas/issues/3791>`__
911+
Using TimeGrouper and another grouping to create subgroups, then apply a custom function :issue:`3791`
915912

916913
`Resampling with custom periods
917914
<https://stackoverflow.com/questions/15408156/resampling-with-custom-periods>`__
@@ -947,8 +944,7 @@ Depending on df construction, ``ignore_index`` may be needed
947944
df = pd.concat([df1, df2], ignore_index=True)
948945
df
949946
950-
`Self Join of a DataFrame
951-
<https://github.com/pandas-dev/pandas/issues/2996>`__
947+
Self Join of a DataFrame :issue:`2996`
952948

953949
.. ipython:: python
954950
@@ -1070,8 +1066,7 @@ using that handle to read.
10701066
`Inferring dtypes from a file
10711067
<https://stackoverflow.com/questions/15555005/get-inferred-dataframe-types-iteratively-using-chunksize>`__
10721068

1073-
`Dealing with bad lines
1074-
<https://github.com/pandas-dev/pandas/issues/2886>`__
1069+
Dealing with bad lines :issue:`2886`
10751070

10761071
`Write a multi-row index CSV without writing duplicates
10771072
<https://stackoverflow.com/questions/17349574/pandas-write-multiindex-rows-with-to-csv>`__
@@ -1205,8 +1200,7 @@ The :ref:`Excel <io.excel>` docs
12051200
`Modifying formatting in XlsxWriter output
12061201
<https://pbpython.com/improve-pandas-excel-output.html>`__
12071202

1208-
`Loading only visible sheets
1209-
<https://github.com/pandas-dev/pandas/issues/19842#issuecomment-892150745>`__
1203+
Loading only visible sheets :issue:`19842#issuecomment-892150745`
12101204

12111205
.. _cookbook.html:
12121206

@@ -1226,8 +1220,7 @@ The :ref:`HDFStores <io.hdf5>` docs
12261220
`Simple queries with a Timestamp Index
12271221
<https://stackoverflow.com/questions/13926089/selecting-columns-from-pandas-hdfstore-table>`__
12281222

1229-
`Managing heterogeneous data using a linked multiple table hierarchy
1230-
<https://github.com/pandas-dev/pandas/issues/3032>`__
1223+
Managing heterogeneous data using a linked multiple table hierarchy :issue:`3032`
12311224

12321225
`Merging on-disk tables with millions of rows
12331226
<https://stackoverflow.com/questions/14614512/merging-two-tables-with-millions-of-rows-in-python/14617925#14617925>`__

doc/source/whatsnew/v0.17.1.rst

Lines changed: 1 addition & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -37,9 +37,7 @@ Conditional HTML formatting
3737
.. warning::
3838
This is a new feature and is under active development.
3939
We'll be adding features and possibly making breaking changes in future
40-
releases. Feedback is welcome_.
41-
42-
.. _welcome: https://github.com/pandas-dev/pandas/issues/11610
40+
releases. Feedback is welcome in :issue:`11610`
4341

4442
We've added *experimental* support for conditional HTML formatting:
4543
the visual styling of a DataFrame based on the data.

doc/source/whatsnew/v1.3.0.rst

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -811,7 +811,7 @@ Other Deprecations
811811
- Deprecated allowing scalars to be passed to the :class:`Categorical` constructor (:issue:`38433`)
812812
- Deprecated constructing :class:`CategoricalIndex` without passing list-like data (:issue:`38944`)
813813
- Deprecated allowing subclass-specific keyword arguments in the :class:`Index` constructor, use the specific subclass directly instead (:issue:`14093`, :issue:`21311`, :issue:`22315`, :issue:`26974`)
814-
- Deprecated the :meth:`astype` method of datetimelike (``timedelta64[ns]``, ``datetime64[ns]``, ``Datetime64TZDtype``, ``PeriodDtype``) to convert to integer dtypes, use ``values.view(...)`` instead (:issue:`38544`)
814+
- Deprecated the :meth:`astype` method of datetimelike (``timedelta64[ns]``, ``datetime64[ns]``, ``Datetime64TZDtype``, ``PeriodDtype``) to convert to integer dtypes, use ``values.view(...)`` instead (:issue:`38544`). This deprecation was later reverted in pandas 1.4.0.
815815
- Deprecated :meth:`MultiIndex.is_lexsorted` and :meth:`MultiIndex.lexsort_depth`, use :meth:`MultiIndex.is_monotonic_increasing` instead (:issue:`32259`)
816816
- Deprecated keyword ``try_cast`` in :meth:`Series.where`, :meth:`Series.mask`, :meth:`DataFrame.where`, :meth:`DataFrame.mask`; cast results manually if desired (:issue:`38836`)
817817
- Deprecated comparison of :class:`Timestamp` objects with ``datetime.date`` objects. Instead of e.g. ``ts <= mydate`` use ``ts <= pd.Timestamp(mydate)`` or ``ts.date() <= mydate`` (:issue:`36131`)

doc/source/whatsnew/v1.4.0.rst

Lines changed: 1 addition & 0 deletions
@@ -1050,6 +1050,7 @@ Other
10501050
- Bug in :meth:`Series.replace` raising ``ValueError`` when using ``regex=True`` with a :class:`Series` containing ``np.nan`` values (:issue:`43344`)
10511051
- Bug in :meth:`DataFrame.to_records` where an incorrect ``n`` was used when missing names were replaced by ``level_n`` (:issue:`44818`)
10521052
- Bug in :meth:`DataFrame.eval` where ``resolvers`` argument was overriding the default resolvers (:issue:`34966`)
1053+
- :meth:`Series.__repr__` and :meth:`DataFrame.__repr__` no longer replace all null-values in indexes with "NaN" but use their real string-representations. "NaN" is used only for ``float("nan")`` (:issue:`45263`)
10531054

10541055
.. ---------------------------------------------------------------------------
10551056

doc/source/whatsnew/v1.5.0.rst

Lines changed: 1 addition & 1 deletion
@@ -185,7 +185,7 @@ Timezones
185185

186186
Numeric
187187
^^^^^^^
188-
-
188+
- Bug in operations with array-likes with ``dtype="boolean"`` and :attr:`NA` incorrectly altering the array in-place (:issue:`45421`)
189189
-
190190

191191
Conversion

pandas/_libs/missing.pyi

Lines changed: 1 addition & 0 deletions
@@ -14,3 +14,4 @@ def checknull(val: object, inf_as_na: bool = ...) -> bool: ...
1414
def isnaobj(arr: np.ndarray, inf_as_na: bool = ...) -> npt.NDArray[np.bool_]: ...
1515
def isnaobj2d(arr: np.ndarray, inf_as_na: bool = ...) -> npt.NDArray[np.bool_]: ...
1616
def is_numeric_na(values: np.ndarray) -> npt.NDArray[np.bool_]: ...
17+
def is_float_nan(values: np.ndarray) -> npt.NDArray[np.bool_]: ...

pandas/_libs/missing.pyx

Lines changed: 25 additions & 0 deletions
@@ -248,6 +248,31 @@ cdef bint checknull_with_nat_and_na(object obj):
248248
return checknull_with_nat(obj) or obj is C_NA
249249

250250

251+
@cython.wraparound(False)
252+
@cython.boundscheck(False)
253+
def is_float_nan(values: ndarray) -> ndarray:
254+
"""
255+
True for elements which correspond to a float nan
256+
257+
Returns
258+
-------
259+
ndarray[bool]
260+
"""
261+
cdef:
262+
ndarray[uint8_t] result
263+
Py_ssize_t i, N
264+
object val
265+
266+
N = len(values)
267+
result = np.zeros(N, dtype=np.uint8)
268+
269+
for i in range(N):
270+
val = values[i]
271+
if util.is_nan(val):
272+
result[i] = True
273+
return result.view(bool)
274+
275+
251276
@cython.wraparound(False)
252277
@cython.boundscheck(False)
253278
def is_numeric_na(values: ndarray) -> ndarray:
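For readers without a Cython toolchain, the loop in `is_float_nan` can be approximated in plain Python. This is a rough sketch, not the pandas implementation: it assumes that "float nan" means a `float` instance that is not equal to itself, and it does not reproduce anything else `util.is_nan` may cover. The name `is_float_nan_sketch` is hypothetical.

```python
def is_float_nan_sketch(values):
    """Return a list of bools, True where the element is a float NaN.

    Assumption: only float instances (including subclasses) count;
    None, strings and other null-like objects are reported as False.
    """
    result = []
    for val in values:
        # NaN is the only float value that compares unequal to itself.
        result.append(isinstance(val, float) and val != val)
    return result


flags = is_float_nan_sketch([1.0, float("nan"), None, "NaN"])
```

This matches the intent of the 1.4.0 whatsnew entry in this commit: only `float("nan")` is shown as "NaN", while other null values keep their own string representations.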

pandas/core/arrays/boolean.py

Lines changed: 2 additions & 1 deletion
@@ -494,7 +494,8 @@ def _arith_method(self, other, op):
494494
if mask is None:
495495
mask = self._mask
496496
if other is libmissing.NA:
497-
mask |= True
497+
# GH#45421 don't alter inplace
498+
mask = mask | True
498499
else:
499500
mask = self._mask | mask
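The one-line change from `mask |= True` to `mask = mask | True` matters because NumPy arrays implement in-place `|=`, which mutates the array the caller passed in. A small sketch of the difference, using hypothetical helper names rather than the pandas method itself:

```python
import numpy as np


def combine_inplace(mask):
    # Augmented assignment writes into the existing array's buffer.
    mask |= True
    return mask


def combine_copy(mask):
    # The binary operator allocates a new array; the name is just rebound.
    mask = mask | True
    return mask


original = np.array([False, True, False])

a = original.copy()
out_a = combine_inplace(a)  # `a` itself is modified

b = original.copy()
out_b = combine_copy(b)     # `b` is left untouched
```

In the pandas code the array in question is the `mask` argument or `self._mask`, so the in-place form could leak the all-True mask back into the operand, which is the behavior described under GH#45421.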
500501

pandas/core/arrays/datetimelike.py

Lines changed: 0 additions & 8 deletions
@@ -430,14 +430,6 @@ def astype(self, dtype, copy: bool = True):
430430
elif is_integer_dtype(dtype):
431431
# we deliberately ignore int32 vs. int64 here.
432432
# See https://github.com/pandas-dev/pandas/issues/24381 for more.
433-
warnings.warn(
434-
f"casting {self.dtype} values to int64 with .astype(...) is "
435-
"deprecated and will raise in a future version. "
436-
"Use .view(...) instead.",
437-
FutureWarning,
438-
stacklevel=find_stack_level(),
439-
)
440-
441433
values = self.asi8
442434

443435
if is_unsigned_integer_dtype(dtype):

pandas/core/dtypes/astype.py

Lines changed: 0 additions & 14 deletions
@@ -112,13 +112,6 @@ def astype_nansafe(
112112

113113
elif is_datetime64_dtype(arr.dtype):
114114
if dtype == np.int64:
115-
warnings.warn(
116-
f"casting {arr.dtype} values to int64 with .astype(...) "
117-
"is deprecated and will raise in a future version. "
118-
"Use .view(...) instead.",
119-
FutureWarning,
120-
stacklevel=find_stack_level(),
121-
)
122115
if isna(arr).any():
123116
raise ValueError("Cannot convert NaT values to integer")
124117
return arr.view(dtype)
@@ -131,13 +124,6 @@ def astype_nansafe(
131124

132125
elif is_timedelta64_dtype(arr.dtype):
133126
if dtype == np.int64:
134-
warnings.warn(
135-
f"casting {arr.dtype} values to int64 with .astype(...) "
136-
"is deprecated and will raise in a future version. "
137-
"Use .view(...) instead.",
138-
FutureWarning,
139-
stacklevel=find_stack_level(),
140-
)
141127
if isna(arr).any():
142128
raise ValueError("Cannot convert NaT values to integer")
143129
return arr.view(dtype)

pandas/core/dtypes/cast.py

Lines changed: 3 additions & 3 deletions
@@ -20,6 +20,7 @@
2020
)
2121
import warnings
2222

23+
from dateutil.parser import ParserError
2324
import numpy as np
2425

2526
from pandas._libs import lib
@@ -1336,9 +1337,8 @@ def maybe_cast_to_datetime(
13361337
value = dta.tz_localize("UTC").tz_convert(dtype.tz)
13371338
except OutOfBoundsDatetime:
13381339
raise
1339-
except ValueError:
1340-
# TODO(GH#40048): only catch dateutil's ParserError
1341-
# once we can reliably import it in all supported versions
1340+
except ParserError:
1341+
# Note: this is dateutil's ParserError, not ours.
13421342
pass
13431343

13441344
elif getattr(vdtype, "kind", None) in ["m", "M"]:

pandas/core/frame.py

Lines changed: 5 additions & 13 deletions
@@ -93,7 +93,6 @@
9393
)
9494

9595
from pandas.core.dtypes.cast import (
96-
can_hold_element,
9796
construct_1d_arraylike_from_scalar,
9897
construct_2d_arraylike_from_scalar,
9998
find_common_type,
@@ -3882,18 +3881,11 @@ def _set_value(
38823881

38833882
series = self._get_item_cache(col)
38843883
loc = self.index.get_loc(index)
3885-
dtype = series.dtype
3886-
if isinstance(dtype, np.dtype) and dtype.kind not in ["m", "M"]:
3887-
# otherwise we have EA values, and this check will be done
3888-
# via setitem_inplace
3889-
if not can_hold_element(series._values, value):
3890-
# We'll go through loc and end up casting.
3891-
raise TypeError
3892-
3893-
series._mgr.setitem_inplace(loc, value)
3894-
# Note: trying to use series._set_value breaks tests in
3895-
# tests.frame.indexing.test_indexing and tests.indexing.test_partial
3896-
except (KeyError, TypeError):
3884+
3885+
# series._set_value will do validation that may raise TypeError
3886+
# or ValueError
3887+
series._set_value(loc, value, takeable=True)
3888+
except (KeyError, TypeError, ValueError):
38973889
# set using a non-recursive method & reset the cache
38983890
if takeable:
38993891
self.iloc[index, col] = value

pandas/core/generic.py

Lines changed: 35 additions & 16 deletions
@@ -285,22 +285,6 @@ def _init_mgr(
285285
mgr = mgr.astype(dtype=dtype)
286286
return mgr
287287

288-
@classmethod
289-
def _from_mgr(cls, mgr: Manager):
290-
"""
291-
Fastpath to create a new DataFrame/Series from just a BlockManager/ArrayManager.
292-
293-
Notes
294-
-----
295-
Skips setting `_flags` attribute; caller is responsible for doing so.
296-
"""
297-
obj = cls.__new__(cls)
298-
object.__setattr__(obj, "_is_copy", None)
299-
object.__setattr__(obj, "_mgr", mgr)
300-
object.__setattr__(obj, "_item_cache", {})
301-
object.__setattr__(obj, "_attrs", {})
302-
return obj
303-
304288
def _as_manager(self: NDFrameT, typ: str, copy: bool_t = True) -> NDFrameT:
305289
"""
306290
Private helper function to create a DataFrame with specific manager.
@@ -2068,6 +2052,41 @@ def empty(self) -> bool_t:
20682052
def __array__(self, dtype: npt.DTypeLike | None = None) -> np.ndarray:
20692053
return np.asarray(self._values, dtype=dtype)
20702054

2055+
def __array_wrap__(
2056+
self,
2057+
result: np.ndarray,
2058+
context: tuple[Callable, tuple[Any, ...], int] | None = None,
2059+
):
2060+
"""
2061+
Gets called after a ufunc and other functions.
2062+
2063+
Parameters
2064+
----------
2065+
result: np.ndarray
2066+
The result of the ufunc or other function called on the NumPy array
2067+
returned by __array__
2068+
context: tuple of (func, tuple, int)
2069+
This parameter is returned by ufuncs as a 3-element tuple: (name of the
2070+
ufunc, arguments of the ufunc, domain of the ufunc), but is not set by
2071+
other numpy functions.
2072+
2073+
Notes
2074+
-----
2075+
Series implements __array_ufunc__ so this is not called for ufuncs on Series.
2076+
"""
2077+
# Note: at time of dask 2022.01.0, this is still used by dask
2078+
res = lib.item_from_zerodim(result)
2079+
if is_scalar(res):
2080+
# e.g. we get here with np.ptp(series)
2081+
# ptp also requires the item_from_zerodim
2082+
return res
2083+
d = self._construct_axes_dict(self._AXIS_ORDERS, copy=False)
2084+
# error: Argument 1 to "NDFrame" has incompatible type "ndarray";
2085+
# expected "BlockManager"
2086+
return self._constructor(res, **d).__finalize__( # type: ignore[arg-type]
2087+
self, method="__array_wrap__"
2088+
)
2089+
20712090
@final
20722091
def __array_ufunc__(
20732092
self, ufunc: np.ufunc, method: str, *inputs: Any, **kwargs: Any
