Commit c2d61c3

Merging in updated master in order to make CI checks pass
2 parents a36d450 + 6ca8757 commit c2d61c3

27 files changed: +491 -124 lines changed
.github/workflows/sdist.yml

Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
+name: sdist
+
+on:
+  push:
+    branches:
+      - master
+  pull_request:
+    branches:
+      - master
+      - 1.2.x
+      - 1.3.x
+    paths-ignore:
+      - "doc/**"
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    timeout-minutes: 60
+    defaults:
+      run:
+        shell: bash -l {0}
+
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: ["3.7", "3.8", "3.9"]
+
+    steps:
+    - uses: actions/checkout@v2
+      with:
+        fetch-depth: 0
+
+    - name: Set up Python
+      uses: actions/setup-python@v2
+      with:
+        python-version: ${{ matrix.python-version }}
+
+    - name: Install dependencies
+      run: |
+        python -m pip install --upgrade pip setuptools wheel
+
+        # GH 39416
+        pip install numpy
+
+    - name: Build pandas sdist
+      run: |
+        pip list
+        python setup.py sdist --formats=gztar
+
+    - uses: conda-incubator/setup-miniconda@v2
+      with:
+        activate-environment: pandas-sdist
+        python-version: ${{ matrix.python-version }}
+
+    - name: Install pandas from sdist
+      run: |
+        conda list
+        python -m pip install dist/*.gz
+
+    - name: Import pandas
+      run: |
+        cd ..
+        conda list
+        python -c "import pandas; pandas.show_versions();"
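
Note on the last workflow step: it changes directory before importing, presumably so that Python resolves the installed package and not the `pandas/` directory in the checkout. A minimal sketch of how that could be verified, assuming a hypothetical helper `is_inside` (not part of the workflow) and made-up paths:

```python
from pathlib import Path


def is_inside(module_file, source_root):
    """Return True if module_file lives somewhere under source_root.

    Hypothetical helper for illustration only; the workflow itself
    just runs ``cd ..`` before the import.
    """
    module_path = Path(module_file).resolve()
    root = Path(source_root).resolve()
    return root in module_path.parents


# Made-up locations: one inside the checkout, one in an environment.
from_checkout = is_inside("/repo/pandas/__init__.py", "/repo")
from_env = is_inside("/env/lib/site-packages/pandas/__init__.py", "/repo")
print(from_checkout, from_env)
```

In a real check, `module_file` would be `pandas.__file__` and `source_root` the repository root.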

asv_bench/benchmarks/algos/isin.py

Lines changed: 10 additions & 0 deletions
@@ -325,3 +325,13 @@ def setup(self, dtype, series_type):
 
     def time_isin(self, dtypes, series_type):
         self.series.isin(self.values)
+
+
+class IsInWithLongTupples:
+    def setup(self):
+        t = tuple(range(1000))
+        self.series = Series([t] * 1000)
+        self.values = [t]
+
+    def time_isin(self):
+        self.series.isin(self.values)
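
Note: the new benchmark can be approximated outside asv with `timeit`. This is a rough sketch, assuming pandas is installed; the row count is reduced, and `run_once` is a hypothetical name that does not appear in the commit.

```python
import timeit

import pandas as pd

t = tuple(range(1000))
series = pd.Series([t] * 100)  # every row references the same tuple object
values = [t]


def run_once():
    # Same call the benchmark times, on a smaller Series.
    return series.isin(values)


result = run_once()
elapsed = timeit.timeit(run_once, number=10)
print(f"10 calls took {elapsed:.4f}s; all matched: {bool(result.all())}")
```

Absolute timings depend on the machine and the pandas version, so only relative comparisons between builds are meaningful.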

doc/source/user_guide/indexing.rst

Lines changed: 4 additions & 5 deletions
@@ -1523,18 +1523,17 @@ Looking up values by index/column labels
 ----------------------------------------
 
 Sometimes you want to extract a set of values given a sequence of row labels
-and column labels, this can be achieved by ``DataFrame.melt`` combined by filtering the corresponding
-rows with ``DataFrame.loc``. For instance:
+and column labels, this can be achieved by ``pandas.factorize`` and NumPy indexing.
+For instance:
 
 .. ipython:: python
 
     df = pd.DataFrame({'col': ["A", "A", "B", "B"],
                        'A': [80, 23, np.nan, 22],
                        'B': [80, 55, 76, 67]})
     df
-    melt = df.melt('col')
-    melt = melt.loc[melt['col'] == melt['variable'], 'value']
-    melt.reset_index(drop=True)
+    idx, cols = pd.factorize(df['col'])
+    df.reindex(cols, axis=1).to_numpy()[np.arange(len(df)), idx]
 
 Formerly this could be achieved with the dedicated ``DataFrame.lookup`` method
 which was deprecated in version 1.2.0.
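
Note: the replacement recipe is easier to follow when split into steps. The sketch below uses the same frame as the documentation change; the intermediate names `values` and `picked` are added here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col": ["A", "A", "B", "B"],
                   "A": [80, 23, np.nan, 22],
                   "B": [80, 55, 76, 67]})

# idx[i] is the position within cols of the label stored in row i.
idx, cols = pd.factorize(df["col"])

# Order the value columns like cols, then pick one cell per row:
# row i contributes the value found in column idx[i].
values = df.reindex(cols, axis=1).to_numpy()
picked = values[np.arange(len(df)), idx]
print(picked)
```

Rows 0 and 1 read from column `A`, rows 2 and 3 from column `B`, so the `NaN` in column `A` at row 2 is never selected.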

doc/source/whatsnew/v1.2.5.rst

Lines changed: 7 additions & 27 deletions
@@ -1,7 +1,7 @@
 .. _whatsnew_125:
 
-What's new in 1.2.5 (May ??, 2021)
-----------------------------------
+What's new in 1.2.5 (June 22, 2021)
+-----------------------------------
 
 These are the changes in pandas 1.2.5. See :ref:`release` for a full changelog
 including other versions of pandas.
@@ -14,32 +14,12 @@ including other versions of pandas.
 
 Fixed regressions
 ~~~~~~~~~~~~~~~~~
-- Regression in :func:`concat` between two :class:`DataFrames` where one has an :class:`Index` that is all-None and the other is :class:`DatetimeIndex` incorrectly raising (:issue:`40841`)
+- Fixed regression in :func:`concat` between two :class:`DataFrame` where one has an :class:`Index` that is all-None and the other is :class:`DatetimeIndex` incorrectly raising (:issue:`40841`)
 - Fixed regression in :meth:`DataFrame.sum` and :meth:`DataFrame.prod` when ``min_count`` and ``numeric_only`` are both given (:issue:`41074`)
-- Regression in :func:`read_csv` when using ``memory_map=True`` with an non-UTF8 encoding (:issue:`40986`)
-- Regression in :meth:`DataFrame.replace` and :meth:`Series.replace` when the values to replace is a NumPy float array (:issue:`40371`)
-- Regression in :func:`ExcelFile` when a corrupt file is opened but not closed (:issue:`41778`)
-
-.. ---------------------------------------------------------------------------
-
-
-.. _whatsnew_125.bug_fixes:
-
-Bug fixes
-~~~~~~~~~
-
--
--
-
-.. ---------------------------------------------------------------------------
-
-.. _whatsnew_125.other:
-
-Other
-~~~~~
-
--
--
+- Fixed regression in :func:`read_csv` when using ``memory_map=True`` with an non-UTF8 encoding (:issue:`40986`)
+- Fixed regression in :meth:`DataFrame.replace` and :meth:`Series.replace` when the values to replace is a NumPy float array (:issue:`40371`)
+- Fixed regression in :func:`ExcelFile` when a corrupt file is opened but not closed (:issue:`41778`)
+- Fixed regression in :meth:`DataFrame.astype` with ``dtype=str`` failing to convert ``NaN`` in categorical columns (:issue:`41797`)
 
 .. ---------------------------------------------------------------------------
 

doc/source/whatsnew/v1.3.0.rst

Lines changed: 3 additions & 0 deletions
@@ -269,12 +269,14 @@ Other enhancements
 - :meth:`read_csv` and :meth:`read_json` expose the argument ``encoding_errors`` to control how encoding errors are handled (:issue:`39450`)
 - :meth:`.GroupBy.any` and :meth:`.GroupBy.all` use Kleene logic with nullable data types (:issue:`37506`)
 - :meth:`.GroupBy.any` and :meth:`.GroupBy.all` return a ``BooleanDtype`` for columns with nullable data types (:issue:`33449`)
+- :meth:`.GroupBy.any` and :meth:`.GroupBy.all` raising with ``object`` data containing ``pd.NA`` even when ``skipna=True`` (:issue:`37501`)
 - :meth:`.GroupBy.rank` now supports object-dtype data (:issue:`38278`)
 - Constructing a :class:`DataFrame` or :class:`Series` with the ``data`` argument being a Python iterable that is *not* a NumPy ``ndarray`` consisting of NumPy scalars will now result in a dtype with a precision the maximum of the NumPy scalars; this was already the case when ``data`` is a NumPy ``ndarray`` (:issue:`40908`)
 - Add keyword ``sort`` to :func:`pivot_table` to allow non-sorting of the result (:issue:`39143`)
 - Add keyword ``dropna`` to :meth:`DataFrame.value_counts` to allow counting rows that include ``NA`` values (:issue:`41325`)
 - :meth:`Series.replace` will now cast results to ``PeriodDtype`` where possible instead of ``object`` dtype (:issue:`41526`)
 - Improved error message in ``corr`` and ``cov`` methods on :class:`.Rolling`, :class:`.Expanding`, and :class:`.ExponentialMovingWindow` when ``other`` is not a :class:`DataFrame` or :class:`Series` (:issue:`41741`)
+- :meth:`DataFrame.explode` now supports exploding multiple columns. Its ``column`` argument now also accepts a list of str or tuples for exploding on multiple columns at the same time (:issue:`39240`)
 
 .. ---------------------------------------------------------------------------
 
@@ -914,6 +916,7 @@ Datetimelike
 - Bug in constructing a :class:`DataFrame` or :class:`Series` with mismatched ``datetime64`` data and ``timedelta64`` dtype, or vice-versa, failing to raise a ``TypeError`` (:issue:`38575`, :issue:`38764`, :issue:`38792`)
 - Bug in constructing a :class:`Series` or :class:`DataFrame` with a ``datetime`` object out of bounds for ``datetime64[ns]`` dtype or a ``timedelta`` object out of bounds for ``timedelta64[ns]`` dtype (:issue:`38792`, :issue:`38965`)
 - Bug in :meth:`DatetimeIndex.intersection`, :meth:`DatetimeIndex.symmetric_difference`, :meth:`PeriodIndex.intersection`, :meth:`PeriodIndex.symmetric_difference` always returning object-dtype when operating with :class:`CategoricalIndex` (:issue:`38741`)
+- Bug in :meth:`DatetimeIndex.intersection` giving incorrect results with non-Tick frequencies with ``n != 1`` (:issue:`42104`)
 - Bug in :meth:`Series.where` incorrectly casting ``datetime64`` values to ``int64`` (:issue:`37682`)
 - Bug in :class:`Categorical` incorrectly typecasting ``datetime`` object to ``Timestamp`` (:issue:`38878`)
 - Bug in comparisons between :class:`Timestamp` object and ``datetime64`` objects just outside the implementation bounds for nanosecond ``datetime64`` (:issue:`39221`)
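
Note: the `DataFrame.explode` entry added under "Other enhancements" can be illustrated with a short example. This sketch assumes pandas 1.3.0 or later; the frame and column names are made up.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [[1, 2], [3]],
    "B": [["x", "y"], ["z"]],
    "C": [10, 20],
})

# Explode both list columns together; the lists in each row must have
# matching lengths. Scalar column C is repeated for every new row.
out = df.explode(["A", "B"])
print(out)
```

The original row labels are kept, so label 0 appears twice in the result.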

doc/source/whatsnew/v1.4.0.rst

Lines changed: 1 addition & 1 deletion
@@ -96,7 +96,7 @@ Other API changes
 
 Deprecations
 ~~~~~~~~~~~~
--
+- Deprecated :meth:`Index.is_type_compatible` (:issue:`42113`)
 -
 
 .. ---------------------------------------------------------------------------

pandas/_libs/src/klib/khash_python.h

Lines changed: 3 additions & 0 deletions
@@ -226,6 +226,9 @@ int PANDAS_INLINE tupleobject_cmp(PyTupleObject* a, PyTupleObject* b){
 
 
 int PANDAS_INLINE pyobject_cmp(PyObject* a, PyObject* b) {
+    if (a == b) {
+        return 1;
+    }
     if (Py_TYPE(a) == Py_TYPE(b)) {
         // special handling for some built-in types which could have NaNs
         // as we would like to have them equivalent, but the usual
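
Note: the added pointer comparison short-circuits the case where both arguments are the same object, which is the situation the long-tuple benchmark exercises. A pure-Python analogue, as a sketch only: `objects_match` is a hypothetical name, and the real C function applies further NaN-aware handling that is not reproduced here.

```python
def objects_match(a, b):
    """Return True if a and b are the same object or compare equal."""
    if a is b:  # cheap identity check, analogous to `a == b` on pointers in C
        return True
    return a == b  # full comparison; element by element for tuples


nan = float("nan")
t = tuple(range(1000))

print(nan == nan)                # False: NaN never compares equal to itself
print(objects_match(nan, nan))   # True: the identity check fires first
print(objects_match(t, t))       # identity check, no element comparison needed
print(objects_match(t, tuple(range(1000))))  # falls through to ==
```

When the two operands are one object, the element-wise tuple comparison is skipped entirely.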

pandas/_libs/tslibs/timestamps.pyx

Lines changed: 8 additions & 1 deletion
@@ -129,6 +129,13 @@ cdef inline object create_timestamp_from_ts(int64_t value,
     return ts_base
 
 
+def _unpickle_timestamp(value, freq, tz):
+    # GH#41949 dont warn on unpickle if we have a freq
+    ts = Timestamp(value, tz=tz)
+    ts._set_freq(freq)
+    return ts
+
+
 # ----------------------------------------------------------------------
 
 def integer_op_not_supported(obj):
@@ -725,7 +732,7 @@ cdef class _Timestamp(ABCTimestamp):
 
     def __reduce__(self):
         object_state = self.value, self._freq, self.tzinfo
-        return (Timestamp, object_state)
+        return (_unpickle_timestamp, object_state)
 
     # -----------------------------------------------------------------
     # Rendering Methods

pandas/core/algorithms.py

Lines changed: 5 additions & 1 deletion
@@ -140,7 +140,11 @@ def _ensure_data(values: ArrayLike) -> tuple[np.ndarray, DtypeObj]:
             return np.asarray(values).view("uint8"), values.dtype
         else:
             # i.e. all-bool Categorical, BooleanArray
-            return np.asarray(values).astype("uint8", copy=False), values.dtype
+            try:
+                return np.asarray(values).astype("uint8", copy=False), values.dtype
+            except TypeError:
+                # GH#42107 we have pd.NAs present
+                return np.asarray(values), values.dtype
 
     elif is_integer_dtype(values.dtype):
         return np.asarray(values), values.dtype
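
Note: the fallback can be reproduced with plain NumPy. In this sketch `None` stands in for `pd.NA` (an assumption made to keep the example free of pandas), and `to_uint8_or_object` is a hypothetical name.

```python
import numpy as np


def to_uint8_or_object(values):
    """Cast to uint8 when possible, otherwise return the array unchanged."""
    arr = np.asarray(values)
    try:
        return arr.astype("uint8", copy=False)
    except TypeError:
        # A missing-value marker cannot be converted to an integer.
        return arr


clean = np.array([True, False], dtype=object)
with_missing = np.array([True, None], dtype=object)  # None stands in for pd.NA

print(to_uint8_or_object(clean).dtype)
print(to_uint8_or_object(with_missing).dtype)
```

The first array is cast to `uint8`; the second keeps its object dtype because the cast raises `TypeError`.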

pandas/core/arrays/categorical.py

Lines changed: 5 additions & 1 deletion
@@ -26,6 +26,7 @@
     NaT,
     algos as libalgos,
     hashtable as htable,
+    lib,
 )
 from pandas._libs.arrays import NDArrayBacked
 from pandas._libs.lib import no_default
@@ -523,14 +524,17 @@ def astype(self, dtype: Dtype, copy: bool = True) -> ArrayLike:
             try:
                 new_cats = np.asarray(self.categories)
                 new_cats = new_cats.astype(dtype=dtype, copy=copy)
+                fill_value = lib.item_from_zerodim(np.array(np.nan).astype(dtype))
             except (
                 TypeError, # downstream error msg for CategoricalIndex is misleading
                 ValueError,
             ):
                 msg = f"Cannot cast {self.categories.dtype} dtype to {dtype}"
                 raise ValueError(msg)
 
-            result = take_nd(new_cats, ensure_platform_int(self._codes))
+            result = take_nd(
+                new_cats, ensure_platform_int(self._codes), fill_value=fill_value
+            )
 
         return result
 
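
Note: the new `fill_value` is simply `NaN` cast to the target dtype, and it is used wherever a code is `-1`. A NumPy-only sketch of that idea for `dtype=str` follows; it replaces the internal `take_nd` and `lib.item_from_zerodim` helpers with plain indexing and `.item()`, so it is an approximation and not the pandas implementation.

```python
import numpy as np

categories = np.array(["a", "b"], dtype=object)
codes = np.array([0, -1, 1])  # -1 marks a missing value

new_cats = categories.astype(str)

# NaN converted to the target dtype; for str this is the text "nan".
fill_value = np.array(np.nan).astype(str).item()

# Take by code, then overwrite the positions that were missing.
result = new_cats.astype(object)[codes]
result[codes == -1] = fill_value
print(result.tolist())  # ['a', 'nan', 'b']
```

Without the explicit fill value, the missing position would not be converted to the target dtype, which matches the regression described in the 1.2.5 release note for `DataFrame.astype` with `dtype=str`.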
