-
-
Notifications
You must be signed in to change notification settings - Fork 18.6k
BUG: duplicate indexing with embedded non-orderables (#17610) #18609
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #18609 +/- ##
=========================================
Coverage ? 91.56%
=========================================
Files ? 153
Lines ? 51276
Branches ? 0
=========================================
Hits ? 46950
Misses ? 4326
Partials ? 0
Continue to review full report at Codecov.
|
@gloryfromca no need to create a NEW PR every time this needs updating. simply merge master and push to THIS one. |
doc/source/whatsnew/v0.22.0.txt
Outdated
@@ -183,6 +183,7 @@ Indexing | |||
- Bug in :class:`IntervalIndex` where empty and purely NA data was constructed inconsistently depending on the construction method (:issue:`18421`) | |||
- Bug in ``IntervalIndex.symmetric_difference()`` where the symmetric difference with a non-``IntervalIndex`` did not raise (:issue:`18475`) | |||
- Bug in indexing a datetimelike ``Index`` that raised ``ValueError`` instead of ``IndexError`` (:issue:`18386`). | |||
- Bug in ``Series`` containing duplicate indexing when gets embedded non-orderables or orderables, raises error or returns unexpected result. (:issue:`17610`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not very clear what you are fixing here, be a little more consise.
pandas/core/series.py
Outdated
@@ -657,12 +657,12 @@ def __getitem__(self, key): | |||
try: | |||
result = self.index.get_value(self, key) | |||
|
|||
if not is_scalar(result): | |||
if not is_scalar(result) and key in self.index: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this check necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will cause that some tests fail, if not is_scalar(self.index.get_loc(key))
will raise ValueError when type of key is Int not String. So by using [0] is another recommended way to get the first item with a String Index in Series? I will try to use try...except...
to resolve it , Is this way good or a recommended way?
Or I should adjust three failed tests?
test_applymap
inpandas.tests.frame.test_apply.TestDataFrameApply.py
,test_match_findall_flags
inpandas.tests.test_strings.TestStringMethods.py
,test_constructor_from_items
inpandas.tests.frame.test_constructors.TestDataFrameConstructors.py
I'm poor in programing skills and time, sorry about that {:-()}.
pandas/tests/series/test_indexing.py
Outdated
@@ -546,6 +546,22 @@ def test_getitem_setitem_periodindex(self): | |||
result[4:8] = ts[4:8] | |||
assert_series_equal(result, ts) | |||
|
|||
def test_getitem_with_duplicates_indices(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parameterize this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, parameterize this , what should I do ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pandas/tests/series/test_indexing.py
Outdated
s = s.append(pd.Series({1: 313})) | ||
s_1 = pd.Series({1: 12, },) | ||
s_1 = s_1.append(pd.Series({1: 313})) | ||
assert_series_equal(s[1], s_1, check_dtype=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't use check_dtype=False
, your expected should account for this
pandas/tests/series/test_indexing.py
Outdated
# GH 17610 | ||
s = pd.Series({1: 12, 2: [1, 2, 2, 3]}) | ||
s = s.append(pd.Series({1: 313})) | ||
s_1 = pd.Series({1: 12, },) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use result and expected to indicate what you are comparing
pandas/core/series.py
Outdated
@@ -666,11 +666,13 @@ def __getitem__(self, key): | |||
|
|||
# we need to box if we have a non-unique index here | |||
# otherwise have inline ndarray/lists |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you update this comment a bit, its not longer non-unique, but getting back a scalar loc for the key, IOW its a unique key
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it unique key?
pandas/core/series.py
Outdated
result, index=[key] * len(result), | ||
dtype=self.dtype).__finalize__(self) | ||
except KeyError: | ||
return result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can pass on the KeyError (it will fall thru and return result anyhow)
@jreback what else can I do for this? |
doc/source/whatsnew/v0.22.0.txt
Outdated
@@ -269,6 +269,8 @@ Indexing | |||
- Bug in :func:`IntervalIndex.symmetric_difference` where the symmetric difference with a non-``IntervalIndex`` did not raise (:issue:`18475`) | |||
- Bug in indexing a datetimelike ``Index`` that raised ``ValueError`` instead of ``IndexError`` (:issue:`18386`). | |||
- Bug in tz-aware :class:`DatetimeIndex` where addition/subtraction with a :class:`TimedeltaIndex` or array with ``dtype='timedelta64[ns]'`` was incorrect (:issue:`17558`) | |||
- Bug in indexing non_scalar item with unique index in ``Series`` containing duplicate index, returns ``Series`` wrapping value flatted. (:issue:`17610`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is so confusing. pls make it simpler.
pandas/tests/series/test_indexing.py
Outdated
], | ||
]) | ||
def test_getitem_with_duplicates_indices( | ||
self, result_1, duplicate_item, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the duplicate_key and unique_key are the same in each parametrization yes? so don't include them, just directly index in the test body
thanks @gloryfromca |
:)
2017-12-21 23:29 GMT+08:00 Jeff Reback <[email protected]>:
… thanks @gloryfromca <https://github.com/gloryfromca>
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#18609 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AWW15woErYcpptjyjyEG4sLLNPDQFUUbks5tCnlRgaJpZM4QztgW>
.
|
git diff upstream/master -u -- "*.py" | flake8 --diff
closes BUG: duplicate indexing with embedded non-orderables #17610