Commit 4a3e82b

Merge branch 'master' into mask_pos_args_deprecation
2 parents 4840f66 + 427a493 commit 4a3e82b

74 files changed, with 1337 additions and 662 deletions

.pre-commit-config.yaml

Lines changed: 4 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -19,14 +19,14 @@ repos:
1919
types_or: [python, rst, markdown]
2020
files: ^(pandas|doc)/
2121
- repo: https://github.com/pre-commit/pre-commit-hooks
22-
rev: v3.4.0
22+
rev: v4.0.1
2323
hooks:
2424
- id: debug-statements
2525
- id: end-of-file-fixer
2626
exclude: \.txt$
2727
- id: trailing-whitespace
2828
- repo: https://github.com/cpplint/cpplint
29-
rev: f7061b1 # the latest tag does not have the hook
29+
rev: 1.5.5
3030
hooks:
3131
- id: cpplint
3232
# We don't lint all C files because we don't want to lint any that are built
@@ -57,7 +57,7 @@ repos:
5757
hooks:
5858
- id: isort
5959
- repo: https://github.com/asottile/pyupgrade
60-
rev: v2.12.0
60+
rev: v2.18.3
6161
hooks:
6262
- id: pyupgrade
6363
args: [--py37-plus]
@@ -72,7 +72,7 @@ repos:
7272
types: [text] # overwrite types: [rst]
7373
types_or: [python, rst]
7474
- repo: https://github.com/asottile/yesqa
75-
rev: v1.2.2
75+
rev: v1.2.3
7676
hooks:
7777
- id: yesqa
7878
additional_dependencies:

doc/source/user_guide/groupby.rst

Lines changed: 2 additions & 0 deletions
@@ -1000,6 +1000,7 @@ instance method on each data group. This is pretty easy to do by passing lambda
10001000
functions:
10011001

10021002
.. ipython:: python
1003+
:okwarning:
10031004
10041005
grouped = df.groupby("A")
10051006
grouped.agg(lambda x: x.std())
@@ -1009,6 +1010,7 @@ arguments. Using a bit of metaprogramming cleverness, GroupBy now has the
10091010
ability to "dispatch" method calls to the groups:
10101011

10111012
.. ipython:: python
1013+
:okwarning:
10121014
10131015
grouped.std()
10141016

doc/source/user_guide/io.rst

Lines changed: 37 additions & 1 deletion
@@ -22,6 +22,7 @@ The pandas I/O API is a set of top level ``reader`` functions accessed like
2222
text;Fixed-Width Text File;:ref:`read_fwf<io.fwf_reader>`
2323
text;`JSON <https://www.json.org/>`__;:ref:`read_json<io.json_reader>`;:ref:`to_json<io.json_writer>`
2424
text;`HTML <https://en.wikipedia.org/wiki/HTML>`__;:ref:`read_html<io.read_html>`;:ref:`to_html<io.html>`
25+
text;`LaTeX <https://en.wikipedia.org/wiki/LaTeX>`__;;:ref:`Styler.to_latex<io.latex>`
2526
text;`XML <https://www.w3.org/standards/xml/core>`__;:ref:`read_xml<io.read_xml>`;:ref:`to_xml<io.xml>`
2627
text; Local clipboard;:ref:`read_clipboard<io.clipboard>`;:ref:`to_clipboard<io.clipboard>`
2728
binary;`MS Excel <https://en.wikipedia.org/wiki/Microsoft_Excel>`__;:ref:`read_excel<io.excel_reader>`;:ref:`to_excel<io.excel_writer>`
@@ -1896,7 +1897,7 @@ Writing in ISO date format:
18961897
18971898
dfd = pd.DataFrame(np.random.randn(5, 2), columns=list("AB"))
18981899
dfd["date"] = pd.Timestamp("20130101")
1899-
dfd = dfd.sort_index(1, ascending=False)
1900+
dfd = dfd.sort_index(axis=1, ascending=False)
19001901
json = dfd.to_json(date_format="iso")
19011902
json
19021903
@@ -2830,7 +2831,42 @@ parse HTML tables in the top-level pandas io function ``read_html``.
28302831
.. |lxml| replace:: **lxml**
28312832
.. _lxml: https://lxml.de
28322833

2834+
.. _io.latex:
28332835

2836+
LaTeX
2837+
-----
2838+
2839+
.. versionadded:: 1.3.0
2840+
2841+
Currently there are no methods to read from LaTeX, only output methods.
2842+
2843+
Writing to LaTeX files
2844+
''''''''''''''''''''''
2845+
2846+
.. note::
2847+
2848+
DataFrame *and* Styler objects currently have a ``to_latex`` method. We recommend
2849+
using the `Styler.to_latex() <../reference/api/pandas.io.formats.style.Styler.to_latex.rst>`__ method
2850+
over `DataFrame.to_latex() <../reference/api/pandas.DataFrame.to_latex.rst>`__ due to the former's greater flexibility with
2851+
conditional styling, and the latter's possible future deprecation.
2852+
2853+
Review the documentation for `Styler.to_latex <../reference/api/pandas.io.formats.style.Styler.to_latex.rst>`__,
2854+
which gives examples of conditional styling and explains the operation of its keyword
2855+
arguments.
2856+
2857+
For simple application the following pattern is sufficient.
2858+
2859+
.. ipython:: python
2860+
2861+
df = pd.DataFrame([[1, 2], [3, 4]], index=["a", "b"], columns=["c", "d"])
2862+
print(df.style.to_latex())
2863+
2864+
To format values before output, chain the `Styler.format <../reference/api/pandas.io.formats.style.Styler.format.rst>`__
2865+
method.
2866+
2867+
.. ipython:: python
2868+
2869+
print(df.style.format("{}").to_latex())
28342870
28352871
XML
28362872
---

doc/source/whatsnew/v1.3.0.rst

Lines changed: 49 additions & 3 deletions
@@ -142,7 +142,7 @@ properly format HTML and eliminate some inconsistencies (:issue:`39942` :issue:`
142142
One also has greater control of the display through separate sparsification of the index or columns, using the new 'styler' options context (:issue:`41142`).
143143

144144
We have added an extension to allow LaTeX styling as an alternative to CSS styling and a method :meth:`.Styler.to_latex`
145-
which renders the necessary LaTeX format including built-up styles.
145+
which renders the necessary LaTeX format including built-up styles. An additional file io function :meth:`Styler.to_html` has been added for convenience (:issue:`40312`).
146146

147147
Documentation has also seen major revisions in light of new features (:issue:`39720` :issue:`39317` :issue:`40493`)
148148

@@ -679,10 +679,15 @@ Deprecations
679679
- Deprecated behavior of :meth:`DatetimeIndex.union` with mixed timezones; in a future version both will be cast to UTC instead of object dtype (:issue:`39328`)
680680
- Deprecated using ``usecols`` with out of bounds indices for ``read_csv`` with ``engine="c"`` (:issue:`25623`)
681681
- Deprecated passing arguments (apart from ``cond`` and ``other``) as positional in :meth:`DataFrame.mask` (:issue:`41485`)
682+
- Deprecated passing arguments as positional in :meth:`DataFrame.clip` and :meth:`Series.clip` (other than ``"upper"`` and ``"lower"``) (:issue:`41485`)
682683
- Deprecated special treatment of lists with first element a Categorical in the :class:`DataFrame` constructor; pass as ``pd.DataFrame({col: categorical, ...})`` instead (:issue:`38845`)
683684
- Deprecated passing arguments as positional (except for ``"method"``) in :meth:`DataFrame.interpolate` and :meth:`Series.interpolate` (:issue:`41485`)
685+
- Deprecated passing arguments as positional in :meth:`DataFrame.sort_index` and :meth:`Series.sort_index` (:issue:`41485`)
686+
- Deprecated passing arguments as positional in :meth:`DataFrame.drop_duplicates` (except for ``subset``), :meth:`Series.drop_duplicates`, :meth:`Index.drop_duplicates` and :meth:`MultiIndex.drop_duplicates`(:issue:`41485`)
684687
- Deprecated passing arguments (apart from ``value``) as positional in :meth:`DataFrame.fillna` and :meth:`Series.fillna` (:issue:`41485`)
688+
- Deprecated passing arguments as positional in :meth:`DataFrame.reset_index` (other than ``"level"``) and :meth:`Series.reset_index` (:issue:`41485`)
685689
- Deprecated construction of :class:`Series` or :class:`DataFrame` with ``DatetimeTZDtype`` data and ``datetime64[ns]`` dtype. Use ``Series(data).dt.tz_localize(None)`` instead (:issue:`41555`,:issue:`33401`)
690+
- Deprecated passing arguments as positional in :meth:`DataFrame.where` and :meth:`Series.where` (other than ``"cond"`` and ``"other"``) (:issue:`41485`)
686691

687692
.. _whatsnew_130.deprecations.nuisance_columns:
688693

@@ -725,6 +730,44 @@ For example:
725730
A 24
726731
dtype: int64
727732
733+
734+
Similarly, when applying a function to :class:`DataFrameGroupBy`, columns on which
735+
the function raises ``TypeError`` are currently silently ignored and dropped
736+
from the result.
737+
738+
This behavior is deprecated. In a future version, the ``TypeError``
739+
will be raised, and users will need to select only valid columns before calling
740+
the function.
741+
742+
For example:
743+
744+
.. ipython:: python
745+
746+
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": pd.date_range("2016-01-01", periods=4)})
747+
gb = df.groupby([1, 1, 2, 2])
748+
749+
*Old behavior*:
750+
751+
.. code-block:: ipython
752+
753+
In [4]: gb.prod(numeric_only=False)
754+
Out[4]:
755+
A
756+
1 2
757+
2 12
758+
759+
.. code-block:: ipython
760+
761+
In [5]: gb.prod(numeric_only=False)
762+
...
763+
TypeError: datetime64 type does not support prod operations
764+
765+
In [6]: gb[["A"]].prod(numeric_only=False)
766+
Out[6]:
767+
A
768+
1 2
769+
2 12
770+
728771
.. ---------------------------------------------------------------------------
729772
730773
@@ -833,6 +876,7 @@ Strings
833876
- Bug in the conversion from ``pyarrow.ChunkedArray`` to :class:`~arrays.StringArray` when the original had zero chunks (:issue:`41040`)
834877
- Bug in :meth:`Series.replace` and :meth:`DataFrame.replace` ignoring replacements with ``regex=True`` for ``StringDType`` data (:issue:`41333`, :issue:`35977`)
835878
- Bug in :meth:`Series.str.extract` with :class:`~arrays.StringArray` returning object dtype for empty :class:`DataFrame` (:issue:`41441`)
879+
- Bug in :meth:`Series.str.replace` where the ``case`` argument was ignored when ``regex=False`` (:issue:`41602`)
836880

837881
Interval
838882
^^^^^^^^
@@ -844,7 +888,7 @@ Interval
844888
Indexing
845889
^^^^^^^^
846890

847-
- Bug in :meth:`Index.union` dropping duplicate ``Index`` values when ``Index`` was not monotonic or ``sort`` was set to ``False`` (:issue:`36289`, :issue:`31326`, :issue:`40862`)
891+
- Bug in :meth:`Index.union` and :meth:`MultiIndex.union` dropping duplicate ``Index`` values when ``Index`` was not monotonic or ``sort`` was set to ``False`` (:issue:`36289`, :issue:`31326`, :issue:`40862`)
848892
- Bug in :meth:`CategoricalIndex.get_indexer` failing to raise ``InvalidIndexError`` when non-unique (:issue:`38372`)
849893
- Bug in :meth:`Series.loc` raising ``ValueError`` when input was filtered with a boolean list and values to set were a list with lower dimension (:issue:`20438`)
850894
- Bug in inserting many new columns into a :class:`DataFrame` causing incorrect subsequent indexing behavior (:issue:`38380`)
@@ -878,6 +922,7 @@ Indexing
878922
- Bug in :meth:`DataFrame.loc.__getitem__` with :class:`MultiIndex` casting to float when at least one column is from has float dtype and we retrieve a scalar (:issue:`41369`)
879923
- Bug in :meth:`DataFrame.loc` incorrectly matching non-boolean index elements (:issue:`20432`)
880924
- Bug in :meth:`Series.__delitem__` with ``ExtensionDtype`` incorrectly casting to ``ndarray`` (:issue:`40386`)
925+
- Bug in :meth:`DataFrame.loc` returning :class:`MultiIndex` in wrong order if indexer has duplicates (:issue:`40978`)
881926
- Bug in :meth:`DataFrame.__setitem__` raising ``TypeError`` when using a str subclass as the column name with a :class:`DatetimeIndex` (:issue:`37366`)
882927

883928
Missing
@@ -932,7 +977,7 @@ I/O
932977
- Bug in :func:`read_csv` and :func:`read_table` misinterpreting arguments when ``sys.setprofile`` had been previously called (:issue:`41069`)
933978
- Bug in the conversion from pyarrow to pandas (e.g. for reading Parquet) with nullable dtypes and a pyarrow array whose data buffer size is not a multiple of dtype size (:issue:`40896`)
934979
- Bug in :func:`read_excel` would raise an error when pandas could not determine the file type, even when user specified the ``engine`` argument (:issue:`41225`)
935-
-
980+
- Bug in :func:`read_clipboard` copying from an excel file shifts values into the wrong column if there are null values in first column (:issue:`41108`)
936981

937982
Period
938983
^^^^^^
@@ -992,6 +1037,7 @@ Groupby/resample/rolling
9921037
- Bug in :meth:`DataFrameGroupBy.__getitem__` with non-unique columns incorrectly returning a malformed :class:`SeriesGroupBy` instead of :class:`DataFrameGroupBy` (:issue:`41427`)
9931038
- Bug in :meth:`DataFrameGroupBy.transform` with non-unique columns incorrectly raising ``AttributeError`` (:issue:`41427`)
9941039
- Bug in :meth:`Resampler.apply` with non-unique columns incorrectly dropping duplicated columns (:issue:`41445`)
1040+
- Bug in :meth:`DataFrameGroupBy.transform` and :meth:`DataFrameGroupBy.agg` with ``engine="numba"`` where ``*args`` were being cached with the user passed function (:issue:`41647`)
9951041

9961042
Reshaping
9971043
^^^^^^^^^
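
The positional-argument deprecations listed in the whatsnew hunks above all follow one pattern: a call that passes too many arguments positionally still works, but emits a warning. The sketch below illustrates that pattern in plain Python. It is not the pandas implementation; `warn_on_positional` and `sort_values` are hypothetical names invented for this example, and the choice of `FutureWarning` is an assumption.

```python
import functools
import warnings


def warn_on_positional(allowed):
    """Hypothetical decorator: warn when more than `allowed` positional args are passed."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if len(args) > allowed:
                # The call still succeeds; only a warning is emitted.
                warnings.warn(
                    f"Passing more than {allowed} positional argument(s) to "
                    f"{func.__name__} is deprecated; use keywords instead.",
                    FutureWarning,
                    stacklevel=2,
                )
            return func(*args, **kwargs)

        return wrapper

    return decorator


@warn_on_positional(allowed=1)
def sort_values(data, ascending=True):
    # Toy stand-in for illustration, not a pandas method.
    return sorted(data, reverse=not ascending)
```

Calling `sort_values([2, 1, 3], ascending=False)` is silent, while `sort_values([2, 1, 3], False)` returns the same result and raises the warning, which mirrors the `sort_index(1, ...)` to `sort_index(axis=1, ...)` change made in `io.rst`.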

doc/sphinxext/announce.py

Lines changed: 2 additions & 2 deletions
@@ -54,7 +54,7 @@
5454

5555
def get_authors(revision_range):
5656
pat = "^.*\\t(.*)$"
57-
lst_release, cur_release = [r.strip() for r in revision_range.split("..")]
57+
lst_release, cur_release = (r.strip() for r in revision_range.split(".."))
5858

5959
if "|" in cur_release:
6060
# e.g. v1.0.1|HEAD
@@ -119,7 +119,7 @@ def get_pull_requests(repo, revision_range):
119119

120120

121121
def build_components(revision_range, heading="Contributors"):
122-
lst_release, cur_release = [r.strip() for r in revision_range.split("..")]
122+
lst_release, cur_release = (r.strip() for r in revision_range.split(".."))
123123
authors = get_authors(revision_range)
124124

125125
return {
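
The two `announce.py` hunks above swap a list comprehension for a generator expression on the right-hand side of a tuple unpacking. A minimal sketch of why that is behavior-preserving follows; `split_revision_range` is a hypothetical helper name and the version strings are made-up inputs.

```python
def split_revision_range(revision_range):
    # Unpacking iterates the generator directly, so no intermediate list is needed.
    lst_release, cur_release = (r.strip() for r in revision_range.split(".."))
    return lst_release, cur_release


print(split_revision_range("v1.2.4 .. v1.3.0"))
```

Unpacking still checks the item count, so an input with the wrong number of `..` separators raises `ValueError` exactly as the list version did.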

pandas/_config/config.py

Lines changed: 1 addition & 1 deletion
@@ -157,7 +157,7 @@ def _describe_option(pat: str = "", _print_desc: bool = True):
157157
if len(keys) == 0:
158158
raise OptionError("No such keys(s)")
159159

160-
s = "\n".join([_build_option_description(k) for k in keys])
160+
s = "\n".join(_build_option_description(k) for k in keys)
161161

162162
if _print_desc:
163163
print(s)
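
The `config.py` hunk above passes a generator expression to `str.join` instead of a list. The sketch below checks that both spellings give the same string; `describe_all` and the `describe` callback are hypothetical stand-ins for the loop around `_build_option_description`.

```python
def describe_all(keys, describe):
    # `describe` is a hypothetical stand-in for _build_option_description.
    return "\n".join(describe(k) for k in keys)


keys = ["display.width", "display.max_rows"]
with_generator = describe_all(keys, lambda k: f"{k}: description")
with_list = "\n".join([f"{k}: description" for k in keys])
print(with_generator == with_list)
```

As far as I know, CPython's `str.join` still materializes its argument internally, so the change is best read as stylistic rather than as a memory optimization.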

pandas/_libs/lib.pyx

Lines changed: 1 addition & 1 deletion
@@ -291,7 +291,7 @@ def item_from_zerodim(val: object) -> object:
291291

292292
@cython.wraparound(False)
293293
@cython.boundscheck(False)
294-
def fast_unique_multiple(list arrays, sort: bool = True) -> list:
294+
def fast_unique_multiple(list arrays, sort: bool = True):
295295
"""
296296
Generate a list of unique values from a list of arrays.
297297

pandas/core/arrays/string_arrow.py

Lines changed: 23 additions & 0 deletions
@@ -1,5 +1,6 @@
11
from __future__ import annotations
22

3+
from collections.abc import Callable # noqa: PDF001
34
import re
45
from typing import (
56
TYPE_CHECKING,
@@ -834,6 +835,28 @@ def _str_endswith(self, pat: str, na=None):
834835
pat = re.escape(pat) + "$"
835836
return self._str_contains(pat, na=na, regex=True)
836837

838+
def _str_replace(
839+
self,
840+
pat: str | re.Pattern,
841+
repl: str | Callable,
842+
n: int = -1,
843+
case: bool = True,
844+
flags: int = 0,
845+
regex: bool = True,
846+
):
847+
if (
848+
pa_version_under4p0
849+
or isinstance(pat, re.Pattern)
850+
or callable(repl)
851+
or not case
852+
or flags
853+
):
854+
return super()._str_replace(pat, repl, n, case, flags, regex)
855+
856+
func = pc.replace_substring_regex if regex else pc.replace_substring
857+
result = func(self._data, pattern=pat, replacement=repl, max_replacements=n)
858+
return type(self)(result)
859+
837860
def _str_match(
838861
self, pat: str, case: bool = True, flags: int = 0, na: Scalar = None
839862
):
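
The new `_str_replace` in the hunk above takes a fast path only when the request is simple, and otherwise defers to the parent implementation (compiled pattern, callable replacement, case-insensitive match, or regex flags). The sketch below shows the same dispatch shape using only the standard library. It is an illustration under stated assumptions, not the pandas or pyarrow code: `replace_all` and `_general_replace` are hypothetical names, plain Python string operations stand in for the `pc.replace_substring*` kernels, and the pyarrow version check is omitted.

```python
import re


def _general_replace(value, pat, repl, n, case, flags, regex):
    """Slower but flexible path built on the standard `re` module."""
    if not isinstance(pat, re.Pattern):
        if not case:
            flags |= re.IGNORECASE
        pat = re.compile(pat if regex else re.escape(pat), flags)
    if isinstance(repl, str) and not regex:
        literal = repl
        # Use a function so backslashes in a literal replacement stay literal.
        repl = lambda match: literal
    return pat.sub(repl, value, count=max(n, 0))


def replace_all(values, pat, repl, n=-1, case=True, flags=0, regex=True):
    """Return (results, path_used) so the chosen branch is visible."""
    if isinstance(pat, re.Pattern) or callable(repl) or not case or flags:
        results = [
            _general_replace(v, pat, repl, n, case, flags, regex) for v in values
        ]
        return results, "general"
    if regex:
        # count=0 means "replace all" for re.sub, so map n=-1 to 0.
        return [re.sub(pat, repl, v, count=max(n, 0)) for v in values], "fast"
    # str.replace treats count=-1 as "replace all".
    return [v.replace(pat, repl, n) for v in values], "fast"
```

Routing `case=False` to the general path is what keeps a literal, case-insensitive replacement correct, which is the situation the `Series.str.replace` whatsnew entry in this commit describes.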
