Commit dd0bfe8

Merge branch 'master' into reduction_dtypes_II
2 parents 467073a + c96827b commit dd0bfe8

59 files changed, +1586 -918 lines changed

.github/actions/build_pandas/action.yml

Lines changed: 7 additions & 0 deletions
@@ -14,6 +14,13 @@ runs:
         micromamba list
       shell: bash -el {0}
 
+    - name: Uninstall existing Pandas installation
+      run: |
+        if pip list | grep -q ^pandas; then
+          pip uninstall -y pandas || true
+        fi
+      shell: bash -el {0}
+
     - name: Build Pandas
       run: |
         if [[ ${{ inputs.editable }} == "true" ]]; then
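
Reviewer note: the new step removes any pre-existing pandas before the build, and `|| true` keeps a failed uninstall from failing the job. The same check-then-uninstall logic can be sketched in Python as below. This is only an illustration under stated assumptions, not part of the commit: `is_installed` and `uninstall_if_present` are hypothetical helper names, and the sketch matches the distribution name exactly, whereas `grep -q ^pandas` matches any package whose name merely starts with `pandas`.

```python
import subprocess
import sys
from importlib import metadata


def is_installed(dist_name: str) -> bool:
    """Return True if a distribution with exactly this name is installed."""
    try:
        metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return False
    return True


def uninstall_if_present(dist_name: str) -> bool:
    """Uninstall dist_name when present; return True if an uninstall was attempted."""
    if not is_installed(dist_name):
        return False
    # check=False mirrors the `|| true` in the shell step:
    # a failed uninstall is tolerated rather than treated as fatal.
    subprocess.run(
        [sys.executable, "-m", "pip", "uninstall", "-y", dist_name],
        check=False,
    )
    return True
```

Calling `uninstall_if_present("pandas")` would correspond to the CI step; it is deliberately not invoked here because it modifies the environment.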

ci/code_checks.sh

Lines changed: 1 addition & 10 deletions
@@ -105,12 +105,7 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
         pandas.errors.UnsupportedFunctionCall \
         pandas.test \
         pandas.NaT \
-        pandas.read_clipboard \
-        pandas.ExcelFile \
-        pandas.ExcelFile.parse \
         pandas.io.formats.style.Styler.to_html \
-        pandas.HDFStore.groups \
-        pandas.HDFStore.walk \
         pandas.read_feather \
         pandas.DataFrame.to_feather \
         pandas.read_parquet \
@@ -123,11 +118,6 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
         pandas.io.stata.StataReader.value_labels \
         pandas.io.stata.StataReader.variable_labels \
         pandas.io.stata.StataWriter.write_file \
-        pandas.core.resample.Resampler.__iter__ \
-        pandas.core.resample.Resampler.groups \
-        pandas.core.resample.Resampler.indices \
-        pandas.core.resample.Resampler.get_group \
-        pandas.core.resample.Resampler.ffill \
         pandas.core.resample.Resampler.asfreq \
         pandas.core.resample.Resampler.count \
         pandas.core.resample.Resampler.nunique \
@@ -241,6 +231,7 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
         pandas.api.extensions.ExtensionArray.factorize \
         pandas.api.extensions.ExtensionArray.fillna \
         pandas.api.extensions.ExtensionArray.insert \
+        pandas.api.extensions.ExtensionArray.interpolate \
         pandas.api.extensions.ExtensionArray.isin \
         pandas.api.extensions.ExtensionArray.isna \
         pandas.api.extensions.ExtensionArray.ravel \

doc/source/development/contributing_codebase.rst

Lines changed: 1 addition & 1 deletion
@@ -861,7 +861,7 @@ performance regressions. pandas is in the process of migrating to
 `asv benchmarks <https://github.com/airspeed-velocity/asv>`__
 to enable easy monitoring of the performance of critical pandas operations.
 These benchmarks are all found in the ``pandas/asv_bench`` directory, and the
-test results can be found `here <https://pandas.pydata.org/speed/pandas/>`__.
+test results can be found `here <https://asv-runner.github.io/asv-collection/pandas>`__.
 
 To use all features of asv, you will need either ``conda`` or
 ``virtualenv``. For more details please check the `asv installation

doc/source/reference/extensions.rst

Lines changed: 1 addition & 0 deletions
@@ -52,6 +52,7 @@ objects.
    api.extensions.ExtensionArray.factorize
    api.extensions.ExtensionArray.fillna
    api.extensions.ExtensionArray.insert
+   api.extensions.ExtensionArray.interpolate
    api.extensions.ExtensionArray.isin
    api.extensions.ExtensionArray.isna
    api.extensions.ExtensionArray.ravel

doc/source/user_guide/io.rst

Lines changed: 1 addition & 1 deletion
@@ -2664,7 +2664,7 @@ Links can be extracted from cells along with the text using ``extract_links="all
     """
 
     df = pd.read_html(
-        html_table,
+        StringIO(html_table),
         extract_links="all"
     )[0]
     df
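
Reviewer note: this hunk wraps the literal HTML in `StringIO`, in line with the `read_html` deprecation recorded in the v2.1.0 whatsnew further down. The pattern itself, handing a reader a file-like object instead of a raw string so that a string can never be mistaken for a path or URL, can be sketched without pandas. The following uses only the standard library; `CellCollector` and `read_table` are hypothetical names for illustration and do not reflect how pandas parses HTML.

```python
from html.parser import HTMLParser
from io import StringIO


class CellCollector(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            if not self.rows:  # tolerate a cell that appears outside any <tr>
                self.rows.append([])
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data.strip()


def read_table(source):
    """Hypothetical reader that accepts only file-like objects."""
    if isinstance(source, (str, bytes)):
        # A bare string is ambiguous (path, URL, or markup?), so reject it.
        raise TypeError("wrap literal HTML in io.StringIO or io.BytesIO")
    parser = CellCollector()
    parser.feed(source.read())
    parser.close()
    return parser.rows


html_table = "<table><tr><th>a</th><th>b</th></tr><tr><td>1</td><td>2</td></tr></table>"
rows = read_table(StringIO(html_table))
```

Rejecting bare strings outright is stricter than a deprecation warning; the sketch does so only to make the ambiguity explicit.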

doc/source/whatsnew/v0.10.0.rst

Lines changed: 1 addition & 1 deletion
@@ -181,7 +181,7 @@ labeled the aggregated group with the end of the interval: the next day).
 ``X0``, ``X1``, ...) can be reproduced by specifying ``prefix='X'``:
 
 .. ipython:: python
-   :okwarning:
+   :okexcept:
 
    import io
 

doc/source/whatsnew/v0.24.0.rst

Lines changed: 2 additions & 2 deletions
@@ -286,7 +286,7 @@ value. (:issue:`17054`)
 
 .. ipython:: python
 
-   result = pd.read_html("""
+   result = pd.read_html(StringIO("""
      <table>
        <thead>
          <tr>
@@ -298,7 +298,7 @@ value. (:issue:`17054`)
            <td colspan="2">1</td><td>2</td>
          </tr>
        </tbody>
-     </table>""")
+     </table>"""))
 
 *Previous behavior*:
 

doc/source/whatsnew/v2.0.3.rst

Lines changed: 1 addition & 3 deletions
@@ -1,6 +1,6 @@
 .. _whatsnew_203:
 
-What's new in 2.0.3 (July XX, 2023)
+What's new in 2.0.3 (June 28, 2023)
 -----------------------------------
 
 These are the changes in pandas 2.0.3. See :ref:`release` for a full changelog
@@ -17,7 +17,6 @@ Fixed regressions
 - Fixed performance regression in merging on datetime-like columns (:issue:`53231`)
 - Fixed regression when :meth:`DataFrame.to_string` creates extra space for string dtypes (:issue:`52690`)
 - For external ExtensionArray implementations, restored the default use of ``_values_for_factorize`` for hashing arrays (:issue:`53475`)
--
 
 .. ---------------------------------------------------------------------------
 .. _whatsnew_203.bug_fixes:
@@ -38,7 +37,6 @@ Bug fixes
 
 Other
 ~~~~~
--
 
 .. ---------------------------------------------------------------------------
 .. _whatsnew_203.contributors:

doc/source/whatsnew/v2.1.0.rst

Lines changed: 7 additions & 3 deletions
@@ -142,20 +142,21 @@ Other enhancements
 - Let :meth:`DataFrame.to_feather` accept a non-default :class:`Index` and non-string column names (:issue:`51787`)
 - Performance improvement in :func:`read_csv` (:issue:`52632`) with ``engine="c"``
 - :meth:`Categorical.from_codes` has gotten a ``validate`` parameter (:issue:`50975`)
-- :meth:`DataFrame.stack` gained the ``sort`` keyword to dictate whether the resulting :class:`MultiIndex` levels are sorted (:issue:`15105`)
 - :meth:`DataFrame.unstack` gained the ``sort`` keyword to dictate whether the resulting :class:`MultiIndex` levels are sorted (:issue:`15105`)
 - :meth:`DataFrameGroupby.agg` and :meth:`DataFrameGroupby.transform` now support grouping by multiple keys when the index is not a :class:`MultiIndex` for ``engine="numba"`` (:issue:`53486`)
 - :meth:`Series.explode` now supports pyarrow-backed list types (:issue:`53602`)
 - :meth:`Series.str.join` now supports ``ArrowDtype(pa.string())`` (:issue:`53646`)
 - :meth:`SeriesGroupby.agg` and :meth:`DataFrameGroupby.agg` now support passing in multiple functions for ``engine="numba"`` (:issue:`53486`)
 - :meth:`SeriesGroupby.transform` and :meth:`DataFrameGroupby.transform` now support passing in a string as the function for ``engine="numba"`` (:issue:`53579`)
+- Added :meth:`ExtensionArray.interpolate` used by :meth:`Series.interpolate` and :meth:`DataFrame.interpolate` (:issue:`53659`)
 - Added ``engine_kwargs`` parameter to :meth:`DataFrame.to_excel` (:issue:`53220`)
 - Added a new parameter ``by_row`` to :meth:`Series.apply`. When set to ``False`` the supplied callables will always operate on the whole Series (:issue:`53400`).
 - Groupby aggregations (such as :meth:`DataFrameGroupby.sum`) now can preserve the dtype of the input instead of casting to ``float64`` (:issue:`44952`)
 - Improved error message when :meth:`DataFrameGroupBy.agg` failed (:issue:`52930`)
 - Many read/to_* functions, such as :meth:`DataFrame.to_pickle` and :func:`read_csv`, support forwarding compression arguments to lzma.LZMAFile (:issue:`52979`)
 - Performance improvement in :func:`concat` with homogeneous ``np.float64`` or ``np.float32`` dtypes (:issue:`52685`)
 - Performance improvement in :meth:`DataFrame.filter` when ``items`` is given (:issue:`52941`)
+-
 
 .. ---------------------------------------------------------------------------
 .. _whatsnew_210.notable_bug_fixes:
@@ -293,6 +294,7 @@ Deprecations
 - Deprecated making the functions in a list of functions given to :meth:`DataFrame.agg` attempt to operate on each element in the :class:`DataFrame` and only operate on the columns of the :class:`DataFrame` if the elementwise operations failed. To keep the current behavior, use :meth:`DataFrame.transform` instead. (:issue:`53325`)
 - Deprecated passing a :class:`DataFrame` to :meth:`DataFrame.from_records`, use :meth:`DataFrame.set_index` or :meth:`DataFrame.drop` instead (:issue:`51353`)
 - Deprecated silently dropping unrecognized timezones when parsing strings to datetimes (:issue:`18702`)
+- Deprecated the "downcast" keyword in :meth:`Series.interpolate`, :meth:`DataFrame.interpolate`, :meth:`Series.fillna`, :meth:`DataFrame.fillna`, :meth:`Series.ffill`, :meth:`DataFrame.ffill`, :meth:`Series.bfill`, :meth:`DataFrame.bfill` (:issue:`40988`)
 - Deprecated the ``axis`` keyword in :meth:`DataFrame.ewm`, :meth:`Series.ewm`, :meth:`DataFrame.rolling`, :meth:`Series.rolling`, :meth:`DataFrame.expanding`, :meth:`Series.expanding` (:issue:`51778`)
 - Deprecated the ``axis`` keyword in :meth:`DataFrame.resample`, :meth:`Series.resample` (:issue:`51778`)
 - Deprecated the behavior of :func:`concat` with both ``len(keys) != len(objs)``, in a future version this will raise instead of truncating to the shorter of the two sequences (:issue:`43485`)
@@ -330,13 +332,13 @@ Deprecations
 - Deprecated :meth:`Series.first` and :meth:`DataFrame.first` (please create a mask and filter using ``.loc`` instead) (:issue:`45908`)
 - Deprecated :meth:`Series.interpolate` and :meth:`DataFrame.interpolate` for object-dtype (:issue:`53631`)
 - Deprecated :meth:`Series.last` and :meth:`DataFrame.last` (please create a mask and filter using ``.loc`` instead) (:issue:`53692`)
-- Deprecated allowing ``downcast`` keyword other than ``None``, ``False``, "infer", or a dict with these as values in :meth:`Series.fillna`, :meth:`DataFrame.fillna` (:issue:`40988`)
 - Deprecated allowing arbitrary ``fill_value`` in :class:`SparseDtype`, in a future version the ``fill_value`` will need to be compatible with the ``dtype.subtype``, either a scalar that can be held by that subtype or ``NaN`` for integer or bool subtypes (:issue:`23124`)
 - Deprecated behavior of :func:`assert_series_equal` and :func:`assert_frame_equal` considering NA-like values (e.g. ``NaN`` vs ``None`` as equivalent) (:issue:`52081`)
 - Deprecated bytes input to :func:`read_excel`. To read a file path, use a string or path-like object. (:issue:`53767`)
 - Deprecated constructing :class:`SparseArray` from scalar data, pass a sequence instead (:issue:`53039`)
 - Deprecated falling back to filling when ``value`` is not specified in :meth:`DataFrame.replace` and :meth:`Series.replace` with non-dict-like ``to_replace`` (:issue:`33302`)
 - Deprecated literal json input to :func:`read_json`. Wrap literal json string input in ``io.StringIO`` instead. (:issue:`53409`)
+- Deprecated literal string/bytes input to :func:`read_html`. Wrap literal string/bytes input in ``io.StringIO`` / ``io.BytesIO`` instead. (:issue:`53767`)
 - Deprecated option "mode.use_inf_as_na", convert inf entries to ``NaN`` before instead (:issue:`51684`)
 - Deprecated parameter ``obj`` in :meth:`GroupBy.get_group` (:issue:`53545`)
 - Deprecated positional indexing on :class:`Series` with :meth:`Series.__getitem__` and :meth:`Series.__setitem__`, in a future version ``ser[item]`` will *always* interpret ``item`` as a label, not a position (:issue:`50617`)
@@ -541,7 +543,8 @@ Reshaping
 - Bug in :meth:`DataFrame.idxmin` and :meth:`DataFrame.idxmax`, where the axis dtype would be lost for empty frames (:issue:`53265`)
 - Bug in :meth:`DataFrame.merge` not merging correctly when having ``MultiIndex`` with single level (:issue:`52331`)
 - Bug in :meth:`DataFrame.stack` losing extension dtypes when columns is a :class:`MultiIndex` and frame contains mixed dtypes (:issue:`45740`)
-- Bug in :meth:`DataFrame.stack` sorting columns lexicographically (:issue:`53786`)
+- Bug in :meth:`DataFrame.stack` sorting columns lexicographically in rare cases (:issue:`53786`)
+- Bug in :meth:`DataFrame.stack` sorting index lexicographically in rare cases (:issue:`53824`)
 - Bug in :meth:`DataFrame.transpose` inferring dtype for object column (:issue:`51546`)
 - Bug in :meth:`Series.combine_first` converting ``int64`` dtype to ``float64`` and losing precision on very large integers (:issue:`51764`)
 -
@@ -554,6 +557,7 @@ Sparse
 
 ExtensionArray
 ^^^^^^^^^^^^^^
+- Bug in :class:`DataFrame` constructor not copying :class:`Series` with extension dtype when given in dict (:issue:`53744`)
 - Bug in :class:`~arrays.ArrowExtensionArray` converting pandas non-nanosecond temporal objects from non-zero values to zero values (:issue:`53171`)
 - Bug in :meth:`Series.quantile` for pyarrow temporal types raising ArrowInvalid (:issue:`52678`)
 - Bug in :meth:`Series.rank` returning wrong order for small values with ``Float64`` dtype (:issue:`52471`)

pandas/_libs/parsers.pyi

Lines changed: 1 addition & 0 deletions
@@ -12,6 +12,7 @@ from pandas._typing import (
 )
 
 STR_NA_VALUES: set[str]
+DEFAULT_BUFFER_HEURISTIC: int
 
 def sanitize_objects(
     values: npt.NDArray[np.object_],
