Skip to content

BUG: Allow merging on Index vectors #19073

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 6, 2018

Conversation

gfyoung
Copy link
Member

@gfyoung gfyoung commented Jan 4, 2018

This behavior used to work in v0.19.0 and
is consistent with the documentation.

Closes #19038

@gfyoung gfyoung added Regression Functionality that used to work in a prior pandas version Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Jan 4, 2018
@gfyoung gfyoung added this to the 0.23.0 milestone Jan 4, 2018
@codecov
Copy link

codecov bot commented Jan 4, 2018

Codecov Report

Merging #19073 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #19073      +/-   ##
==========================================
+ Coverage   91.53%   91.53%   +<.01%     
==========================================
  Files         148      148              
  Lines       48688    48690       +2     
==========================================
+ Hits        44566    44568       +2     
  Misses       4122     4122
Flag Coverage Δ
#multiple 89.9% <100%> (ø) ⬆️
#single 41.63% <40%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/dtypes/api.py 100% <ø> (ø) ⬆️
pandas/core/reshape/merge.py 94.21% <100%> (ø) ⬆️
pandas/core/dtypes/inference.py 98.41% <100%> (+0.05%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c0e3767...1a553bc. Read the comment docs.

x, (np.ndarray, Series)) and len(x) == len(left)
is_rkey = lambda x: isinstance(
x, (np.ndarray, Series)) and len(x) == len(right)
is_lkey = lambda x: isinstance(x, list_types) and len(x) == len(left)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you use is_list_like here?

Copy link
Member Author

@gfyoung gfyoung Jan 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about that, but passing in list or tuple is not accepted (later on, we check for the dtype attribute by comparing dtypes). Those data types will pass the is_list_like check, however.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe lambda x: is_list_like(x) and hasattr(x, 'dtype')?

just trying to avoid hardcoding (actually you could add that as is_array_like in dtypes/common, seems useful

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, is_array_like makes sense as an abstraction. I'll add that.

@@ -1370,6 +1370,22 @@ def f():
household.join(log_return, how='outer')
pytest.raises(NotImplementedError, f)

def test_merge_datetime_index(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you parameterize on ndarray, Series, Index (for the df.index.year arg)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@gfyoung gfyoung force-pushed the merge-index-regression branch 2 times, most recently from 224ee15 to 6ce573a Compare January 5, 2018 10:32
-------
is_array_like : bool
Whether `obj` has array-like properties.
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you show Index, Series, ndarray (and a scalar) as examples

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as a followup we could replace code in various places to use this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

This behavior used to work in v0.19.0 and
is consistent with the documentation.

Closes pandas-devgh-19038
Used for abstracting checks in DataFrame.merge,
but the function itself can be quite useful.
@gfyoung gfyoung force-pushed the merge-index-regression branch from 6ce573a to 1a553bc Compare January 6, 2018 05:53
@gfyoung
Copy link
Member Author

gfyoung commented Jan 6, 2018

@jreback : Comments have been addressed, and all is green. PTAL.

@jreback jreback merged commit 07dc117 into pandas-dev:master Jan 6, 2018
@jreback
Copy link
Contributor

jreback commented Jan 6, 2018

very nice @gfyoung thanks!

@gfyoung gfyoung deleted the merge-index-regression branch January 6, 2018 19:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Regression Functionality that used to work in a prior pandas version Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants