ENH: Add coalesce_keys option to join #61033

Open
@tylerriccio33

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

It would be useful to retain both key columns used in a join instead of automatically coalescing them into one. This is most useful in full outer joins, where coalescing hides which side each key came from. I am happy to implement this myself :)

Feature Description

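A test for this would pass with the data below.

import pandas as pd

df1 = pd.DataFrame({"id": [1, 2, 3], "value1": ["A", "B", "C"]})
df2 = pd.DataFrame({"id": [2, 3, 4], "value2": ["X", "Y", "Z"]})

# Proposed API: coalesce_keys does not exist in pandas today.
# how="outer" and rsuffix="_right" match the full outer join and the
# id_right column in the expected result below.
res = df1.join(df2, on="id", how="outer", rsuffix="_right", coalesce_keys=False)

Note the preservation of both id columns:

expected_no_coalesce = {
    "id": [None, 1, 2, 3],
    "value1": [None, "A", "B", "C"],
    "id_right": [4, None, 2, 3],
    "value2": ["Z", None, "X", "Y"],
}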
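
For comparison, here is a rough sketch of how a similar result can be approximated with pandas as it exists today. It is a workaround, not the proposed API: it uses `merge` rather than `join`, and it relies on renaming the right-hand key (to `id_right`, chosen only to mirror the expected output above) so that the two key columns have different names and are both kept. Row order and dtypes may differ from the expected dict (missing keys show up as NaN, so the key columns become floats).

```python
import pandas as pd

df1 = pd.DataFrame({"id": [1, 2, 3], "value1": ["A", "B", "C"]})
df2 = pd.DataFrame({"id": [2, 3, 4], "value2": ["X", "Y", "Z"]})

# Rename the right-hand key so the two key columns have distinct names.
# When left_on and right_on name different columns, merge keeps both
# instead of coalescing them into a single key column.
right = df2.rename(columns={"id": "id_right"})

res = df1.merge(right, how="outer", left_on="id", right_on="id_right")

# Each key column is missing exactly where its side had no match.
print(res)
```

The extra rename step is what a `coalesce_keys=False` option would make unnecessary.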

Alternative Solutions

Arrow and Polars have this option. I bring this up because I'm implementing a common full join where keys are preserved in the Narwhals package and noticed that pandas does not allow this out of the box. See https://github.com/narwhals-dev/narwhals/pull/2126/files#diff-ff8314856956318d0da461d7cc2710a6b18d3c052581be7990ae0023a9e689ee

Additional Context

No response

Labels

Enhancement, Needs Triage (issue that has not been reviewed by a pandas team member)
