BUG: duplicate indexing with embedded non-orderables #17610

Closed
gloryfromca opened this issue Sep 21, 2017 · 8 comments

Labels: Bug, Indexing (related to indexing on series/frames, not to indexes themselves)

@gloryfromca
Contributor

gloryfromca commented Sep 21, 2017

The code below is where the issue occurs.

    for user_id, row in pending_data_df.iterrows():
        df = DataFrame(
            columns=[
                Names.call.same_call_counts,
                Names.call.phonebook_detail
            ],
            index=[user_id]
        )

        phoneBookPath = row[Names.app.phonebookpath]
        # This is the call that raises the TypeError in the traceback below.
        call_number_list = row.get(Names.call.call_number_list)

Problem description

This happened when I tried to get a set out of a Series.

The code had worked well for a long time, but it suddenly started failing with a traceback like this:

Traceback (most recent call last):
File "/usr/local/lib/python3.5/threading.py", line 914, in _bootstrap_inner
self.run()
File "/home/datascience/zh/suijiesuihuan/datascience/my_threads.py", line 87, in run
ret_data, result = make_decision(int(self.layerId), df_input)
File "/home/datascience/zh/suijiesuihuan/datascience/layered_decision.py", line 52, in make_decision
data_df, result_df = load_features_group_loader(features_group_loader_id, mocking)(data_df)
File "/home/datascience/zh/suijiesuihuan/datascience/dynamic_content_loader.py", line 166, in inner
return load_and_make_decision(data_df, config_new)
File "/home/datascience/zh/suijiesuihuan/datascience/dynamic_content_loader.py", line 111, in load_and_make_decision
new_feature_group_df, error_df = _load_data(pending_df, features_loader_id, features_loader, mocking)
File "/home/datascience/zh/suijiesuihuan/datascience/dynamic_content_loader.py", line 76, in _load_data
data_df, result_df = features_loader(data_df)
File "/home/datascience/zh/suijiesuihuan/dynamic_contents/feature_group_loader/feature_group_loader_0003/feature_group_loader.py", line 55, in load
call_number_list = row.get(Names.call.call_number_list)
File "/home/datascience/zh/venv/lib/python3.5/site-packages/pandas/core/generic.py", line 1633, in get
return self[key]
File "/home/datascience/zh/venv/lib/python3.5/site-packages/pandas/core/series.py", line 611, in __getitem__
dtype=self.dtype).__finalize__(self)
File "/home/datascience/zh/venv/lib/python3.5/site-packages/pandas/core/series.py", line 227, in __init__
"".format(data.__class__.__name__))
TypeError: 'set' type is unordered

Does anyone know what happened?

@nmusolino
Contributor

First, it is not clear at all that this is a bug. A bug report typically contains a short, runnable example of the problem, and a description of why the observed behavior is wrong.

I tried to reproduce this with pandas 0.19 but could not.

In [1]: import pandas

In [2]: s = pandas.Series({'a': 0, 'b': 1})

In [3]: s.get(['b', 'a'])
Out[3]:
b    1
a    0
dtype: int64

In [4]: s.get({'b', 'a'})  # note set
Out[4]:
a    0
b    1
dtype: int64

Could you try calling get with a list argument, like this?

        call_number_list = row.get(list(Names.call.call_number_list))

@gfyoung
Member

gfyoung commented Sep 21, 2017

@gloryfromca: Also, could you provide version information (the output of pandas.show_versions())?

@gloryfromca
Contributor Author

gloryfromca commented Oct 9, 2017 via email

@gloryfromca
Contributor Author

gloryfromca commented Oct 9, 2017 via email

@jreback
Contributor

jreback commented Oct 9, 2017

I guess this is a bug, though I am not sure why you would ever do this. Embedding non-scalars (e.g. a set) in a Series is non-idiomatic, and using duplicate indices requires care as well.

I'll mark it, but it would require a community pull request to fix.

In [2]: s = Series({'1':333,'s':set([1,2,3])})

In [3]: s
Out[3]: 
1          333
s    {1, 2, 3}
dtype: object

In [13]: s2 = s.append(Series({'1':2}))

In [14]: s2
Out[14]: 
1          333
s    {1, 2, 3}
1            2
dtype: object

In [15]: s2[1]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-15-c2e7db717c6c> in <module>()
----> 1 s2[1]

~/pandas/pandas/core/series.py in __getitem__(self, key)
    629                         result = self._constructor(
    630                             result, index=[key] * len(result),
--> 631                             dtype=self.dtype).__finalize__(self)
    632 
    633             return result

~/pandas/pandas/core/series.py in __init__(self, data, index, dtype, name, copy, fastpath)
    239             elif isinstance(data, (set, frozenset)):
    240                 raise TypeError("{0!r} type is unordered"
--> 241                                 "".format(data.__class__.__name__))
    242             else:
    243 

TypeError: 'set' type is unordered
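
A possible workaround, sketched under assumptions rather than taken from the thread: the traceback above shows the failure happening when `__getitem__` tries to box a list-like result into a new Series because the index is not unique. Reading the cell positionally with `.iloc` or `.iat` should return the stored object directly instead. The helper `values_at` is hypothetical (it is not a pandas function), and `pd.concat` is used here in place of `Series.append`, which newer pandas releases no longer provide.

```python
import numpy as np
import pandas as pd

# Same shape as the reproduction above: an object Series holding a set,
# with the label '1' appearing twice.
s = pd.Series([333, {1, 2, 3}], index=["1", "s"], dtype=object)
s2 = pd.concat([s, pd.Series([2], index=["1"])])


def values_at(series, label):
    """Hypothetical helper: return the raw values stored under `label` as a list."""
    # Compare the index to the label elementwise, then read each match by position.
    positions = np.flatnonzero(series.index == label)
    return [series.iat[int(i)] for i in positions]


embedded = s2.iloc[1]          # positional access to the cell holding the set
by_label = values_at(s2, "s")  # label-based lookup, resolved through positions
dupes = values_at(s2, "1")     # every entry stored under the duplicated label
```

Returning a plain list from the helper keeps the duplicated-label case and the single-label case uniform, so the caller never has to guess whether a Series or a bare object comes back.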

jreback added the Bug, Difficulty Intermediate, and Indexing (related to indexing on series/frames, not to indexes themselves) labels and removed the Can't Repro label on Oct 9, 2017
jreback added this to the Next Major Release milestone on Oct 9, 2017
jreback changed the title from "raise TypeError: 'set' type is unordered when I try to get set from a series" to "BUG: duplicate indexing with embedded non-orderables" on Oct 9, 2017
@gloryfromca
Contributor Author

gloryfromca commented Oct 10, 2017 via email

@jreback
Contributor

jreback commented Oct 10, 2017

> Should I create a pull request for it?

Sure, the contributing docs are at: http://pandas.pydata.org/pandas-docs/stable/contributing.html

@gloryfromca
Contributor Author

gloryfromca commented Oct 11, 2017 via email
