BUG: Raise TypeError when joining with non-DataFrame using 'on=' (GH#61434) #61454

Open
wants to merge 1 commit into
base: main
from

iabhi4
Contributor

@iabhi4 iabhi4 commented May 18, 2025

Closes GH#61434

What does this PR change?

When using DataFrame.join() with the on parameter, passing an invalid object like a dict, int, or third-party DataFrame previously resulted in unclear internal errors.

This PR adds a minimal type check that raises a clear TypeError when other is not a DataFrame, Series, or a list of such objects. Valid list-based joins without on remain unaffected.
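As a rough illustration of the kind of check described above (not the PR's actual diff), the validation could be sketched as follows. The helper name `validate_join_other` and the error wording are hypothetical, and the sketch assumes that only a plain `list` counts as "a list of such objects".

```python
import pandas as pd


def validate_join_other(other):
    """Hypothetical helper: raise TypeError unless `other` is joinable.

    Accepts a DataFrame, a Series, or a list containing only those types.
    """
    if isinstance(other, (pd.DataFrame, pd.Series)):
        return
    # Assumption: only a plain list of DataFrame/Series objects is accepted.
    if isinstance(other, list) and all(
        isinstance(obj, (pd.DataFrame, pd.Series)) for obj in other
    ):
        return
    raise TypeError(
        "other must be a DataFrame, Series, or a list of DataFrame/Series, "
        f"got {type(other).__name__}"
    )


# Valid inputs pass silently.
validate_join_other(pd.DataFrame({"a": [1]}))
validate_join_other([pd.Series([1], name="x")])

# Invalid inputs (dict, int) raise a TypeError naming the offending type.
for bad in ({"a": 1}, 5):
    try:
        validate_join_other(bad)
    except TypeError as exc:
        print(exc)
```

A third-party DataFrame (for example from Polars) would fall through both `isinstance` checks and hit the same `TypeError`.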

Checklist

@iabhi4 iabhi4 force-pushed the fix-61434-nonpandas-join-typeerror branch from 78fbc2c to b604714 Compare May 19, 2025 00:26
@iabhi4
Contributor Author

iabhi4 commented May 19, 2025

All checks passed except the Pyodide build, which failed due to a rate limit (HTTP 429). The failure seems unrelated to this PR; a rerun should resolve it.

@rhshadrach rhshadrach added the Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) and Error Reporting (Incorrect or improved errors from pandas) labels May 19, 2025
@iabhi4 iabhi4 force-pushed the fix-61434-nonpandas-join-typeerror branch from b604714 to 71eb2e7 Compare May 19, 2025 05:19
@iabhi4 iabhi4 requested a review from rhshadrach May 19, 2025 08:20
Comment on lines +10889 to +10891
if isinstance(other, Iterable) and not isinstance(
    other, (DataFrame, Series, str, bytes, bytearray)
):
Member

Suggested change
- if isinstance(other, Iterable) and not isinstance(
-     other, (DataFrame, Series, str, bytes, bytearray)
- ):
+ if is_list_like(other):

Member

Also, this probably should be done in merge

Member

@rhshadrach rhshadrach May 19, 2025

@mroeschke - is_list_like will return True on a DataFrame. We only want to enter this block on a potential sequence of DataFrame/Series.
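To make the point concrete, here is a small sketch using the public `pandas.api.types.is_list_like`. The helper `is_candidate_sequence` is a hypothetical name for illustration only, not something in the PR.

```python
import pandas as pd
from pandas.api.types import is_list_like

df = pd.DataFrame({"a": [1, 2]})
ser = pd.Series([1, 2], name="b")

print(is_list_like(df))         # True: a DataFrame is itself list-like
print(is_list_like(ser))        # True
print(is_list_like([df, ser]))  # True
print(is_list_like("abc"))      # False: strings are excluded


def is_candidate_sequence(other):
    """Hypothetical helper: list-like, but not itself a DataFrame/Series."""
    return is_list_like(other) and not isinstance(other, (pd.DataFrame, pd.Series))


print(is_candidate_sequence(df))         # False
print(is_candidate_sequence([df, ser]))  # True
```

Note that a dict is also list-like, so a check of this shape would still need to validate the elements of the sequence afterwards.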

Member

Ah OK ignore my suggestion then. But I believe this check should still be done in merge probably

Contributor Author

> Ah OK ignore my suggestion then. But I believe this check should still be done in merge probably

ValueError is raised in join() before the call reaches merge when on is specified. Would you prefer that I let these inputs flow into merge and move the check there for consistency?
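For reference, the existing behaviour mentioned here can be reproduced with a short sketch. It assumes a pandas release without this PR's change, where passing a list as other together with on is rejected inside join() itself; the frame and column names are made up for illustration.

```python
import pandas as pd

left = pd.DataFrame({"key": ["a", "b"], "x": [1, 2]})
right = pd.DataFrame({"y": [10, 20]}, index=["a", "b"])

caught = None
try:
    # A list of frames combined with `on=` is rejected by join()
    # before any merge logic runs.
    left.join([right], on="key")
except ValueError as exc:
    caught = exc

print(type(caught).__name__, caught)
```

Passing `right` directly instead of `[right]` is a valid call and joins the `key` column against the index of `right`.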

@iabhi4 iabhi4 force-pushed the fix-61434-nonpandas-join-typeerror branch from 71eb2e7 to 6789fb6 Compare May 22, 2025 21:18
@iabhi4 iabhi4 force-pushed the fix-61434-nonpandas-join-typeerror branch from 6789fb6 to c9ac591 Compare May 22, 2025 21:23
@iabhi4 iabhi4 requested a review from rhshadrach May 22, 2025 21:28
Development

Successfully merging this pull request may close these issues.

BUG: Joining Pandas with Polars dataframe produces fuzzy errormessage
3 participants