Skip to content

Conversation

TomAugspurger
Copy link
Contributor

Closes #34741, while
retaining the spirit of the spirit of
#34208.

@@ -746,6 +745,7 @@ Deprecations
- :meth:`DataFrame.to_dict` has deprecated accepting short names for ``orient`` in future versions (:issue:`32515`)
- :meth:`Categorical.to_dense` is deprecated and will be removed in a future version, use ``np.asarray(cat)`` instead (:issue:`32639`)
- The ``fastpath`` keyword in the ``SingleBlockManager`` constructor is deprecated and will be removed in a future version (:issue:`33092`)
- Providing ``suffixes`` as a ``set`` in :func:`pandas.merge` is deprecated. Provide a tuple instead (:issue:`33740`, :issue:`34741`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we are going to do this (I guess @jorisvandenbossche really wants this), then let's update the doc-string of pd.merge to be correct (needs list-like) and the docs in reshape.rst as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pd.merge docs already state that we want a sequence.

    suffixes : Sequence, default is ("_x", "_y")
        A length-2 sequence where

Updated the DataFrame.merge docs to be consistent.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm we don't use sequence elsewhere do we, don't we use list-like everywhere?

raise TypeError(
f"suffixes should be tuple of (str, str). But got {type(suffixes).__name__}"
if isinstance(suffixes, set):
warnings.warn(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just raise? also what about a dict or non-list-like for example?

do we check is_list_like anywhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'll need a warning according to our backwards compatibility poilcy.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's easy to do it with a warning, so I would also just do that.

@TomAugspurger if we don't want to have it set-specific, could also do if not is_list_like(suffixes, allow_sets=False), although in practice that might give the same.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that should use what @jorisvandenbossche suggests here , e.g. dict would pass this test but is not acceptable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dicts are ordered though, so they shouldn't be a problem. I thought the original issue was non-deterministic outputs because sets aren't ordered?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually sets are not ordered, but I think we need an actual sequence here (sure .keys() satisfies), but i'd rather be explicit. we do not treat dicts as list-like anywhere else.

@jreback jreback added the Reshaping Concat, Merge/Join, Stack/Unstack, Explode label Jun 15, 2020
Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test that passes the suffixes as a list? (to ensure we don't accidentally remove this again)

@@ -746,6 +745,7 @@ Deprecations
- :meth:`DataFrame.to_dict` has deprecated accepting short names for ``orient`` in future versions (:issue:`32515`)
- :meth:`Categorical.to_dense` is deprecated and will be removed in a future version, use ``np.asarray(cat)`` instead (:issue:`32639`)
- The ``fastpath`` keyword in the ``SingleBlockManager`` constructor is deprecated and will be removed in a future version (:issue:`33092`)
- Providing ``suffixes`` as a ``set`` in :func:`pandas.merge` is deprecated. Provide a tuple instead (:issue:`33740`, :issue:`34741`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm we don't use sequence elsewhere do we, don't we use list-like everywhere?

raise TypeError(
f"suffixes should be tuple of (str, str). But got {type(suffixes).__name__}"
if isinstance(suffixes, set):
warnings.warn(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that should use what @jorisvandenbossche suggests here , e.g. dict would pass this test but is not acceptable

def test_merge_suffix_type_error(col1, col2, suffixes):
a = pd.DataFrame({col1: [1, 2, 3]})
b = pd.DataFrame({col2: [3, 4, 5]})
def test_merge_suffix_set():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

call this non_list_like parmaterize on dict as well

@jreback jreback added this to the 1.1 milestone Jun 30, 2020
@jreback jreback merged commit eb48e10 into pandas-dev:master Jun 30, 2020
@jreback
Copy link
Contributor

jreback commented Jun 30, 2020

thanks @TomAugspurger

with pytest.raises(TypeError, match=msg):
pd.merge(a, b, left_index=True, right_index=True, suffixes=suffixes)
with tm.assert_produces_warning(FutureWarning):
pd.merge(a, b, left_index=True, right_index=True, suffixes={"left", "right"})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are not actually passing the parameterized suffixes here, so I think that the dict case is not tested, and will also not raise a warning (as dicts are considered list-like)

@TomAugspurger TomAugspurger deleted the 34741-suffixes branch July 6, 2020 13:14
@simonjayhawkins simonjayhawkins mentioned this pull request May 4, 2022
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Projects
None yet
Development

Successfully merging this pull request may close these issues.

REGR: merge no longer accepts a list for suffixes keyword
3 participants