json_normalize throws TypeError with list record_path #21605

Closed
vuminhle opened this issue Jun 23, 2018 · 6 comments
Labels
Enhancement IO JSON read_json, to_json, json_normalize Needs Discussion Requires discussion from core team before further action

Comments

@vuminhle
Contributor

vuminhle commented Jun 23, 2018

Code Sample, a copy-pastable example if possible

from pandas.io.json import json_normalize

json_normalize({'A': {'B': [{'X': 1, 'Y': 2}, {'X': 3, 'Y': 4}]}}, ['A', 'B'])

Note: the example was changed from json_normalize({'A': {'B': [1, 2]}}, ['A', 'B']) to distinguish this issue from #21608.

Problem description

The above code throws a TypeError:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python36\lib\site-packages\pandas\io\json\normalize.py", line 257, in json_normalize
    _recursive_extract(data, record_path, {}, level=0)
  File "C:\Python36\lib\site-packages\pandas\io\json\normalize.py", line 230, in _recursive_extract
    seen_meta, level=level + 1)
  File "C:\Python36\lib\site-packages\pandas\io\json\normalize.py", line 233, in _recursive_extract
    recs = _pull_field(obj, path[0])
  File "C:\Python36\lib\site-packages\pandas\io\json\normalize.py", line 180, in _pull_field
    result = result[spec]
TypeError: string indices must be integers
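
One reading of the traceback, not confirmed anywhere in this thread: after following 'A', the extraction code appears to loop over the value it finds as if it were a list of records. Here that value is a dict, so the loop yields the key 'B' as a string, and indexing that string with 'B' fails. The sketch below reproduces only that mechanism in plain Python; it does not call pandas, and the variable names are illustrative.

```python
# Minimal sketch of the suspected mechanism (plain Python, no pandas).
# Assumption: the value found under 'A' is iterated as if it were a list.
data = {'A': {'B': [{'X': 1, 'Y': 2}, {'X': 3, 'Y': 4}]}}

inner = data['A']        # a dict, not a list of records
items = list(inner)      # iterating a dict yields its keys

error = None
try:
    for obj in inner:    # obj is the string 'B'
        obj['B']         # indexing a str with a str raises TypeError
except TypeError as exc:
    error = exc

print(items)
print(type(error).__name__)
```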

Expected Output

   X  Y
0  1  2
1  3  4
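
For illustration, the records behind this expected output can be collected by walking the path by hand. The helper below, `pull_records`, is hypothetical and is not part of pandas; it is a sketch that assumes every step of the path leads either to a dict or to a list of dicts.

```python
def pull_records(data, record_path):
    """Hypothetical helper: follow record_path through nested dicts and lists."""
    # Treat a single top-level dict as a one-element list of objects.
    current = [data] if isinstance(data, dict) else list(data)
    for key in record_path:
        next_level = []
        for obj in current:
            value = obj[key]
            if isinstance(value, dict):
                # A nested dict is one more object to descend into.
                next_level.append(value)
            else:
                # Assumed to be a list of objects or records.
                next_level.extend(value)
        current = next_level
    return current


records = pull_records(
    {'A': {'B': [{'X': 1, 'Y': 2}, {'X': 3, 'Y': 4}]}}, ['A', 'B']
)
print(records)
```

Passing `records` to `pandas.DataFrame` would then give a frame with columns X and Y and one row per record.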

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 62 Stepping 4, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.23.1
pytest: 3.6.1
pip: 10.0.1
setuptools: 28.8.0
Cython: None
numpy: 1.14.2
scipy: None
pyarrow: None
xarray: None
IPython: 6.3.1
sphinx: None
patsy: None
dateutil: 2.7.2
pytz: 2018.4
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

vuminhle added a commit to vuminhle/pandas that referenced this issue Jun 23, 2018
@vuminhle
Contributor Author

vuminhle commented Jun 23, 2018

Pull request here: #21607

@gfyoung gfyoung added Enhancement IO JSON read_json, to_json, json_normalize Needs Discussion Requires discussion from core team before further action labels Jun 25, 2018
@gfyoung
Member

gfyoung commented Jun 25, 2018

cc @jreback

Uncertain that we want to support this. Thoughts?

@vuminhle
Contributor Author

vuminhle commented Jun 25, 2018

@gfyoung: Note that I created a minimal test case to trigger the bug.
The same bug triggers when the array contains objects.

>>> json_normalize({'A': {'B': [{'X': 1, 'Y': 2}, {'X': 3, 'Y': 4}]}}, ['A', 'B'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python36\lib\site-packages\pandas\io\json\normalize.py", line 257, in json_normalize
    _recursive_extract(data, record_path, {}, level=0)
  File "C:\Python36\lib\site-packages\pandas\io\json\normalize.py", line 230, in _recursive_extract
    seen_meta, level=level + 1)
  File "C:\Python36\lib\site-packages\pandas\io\json\normalize.py", line 233, in _recursive_extract
    recs = _pull_field(obj, path[0])
  File "C:\Python36\lib\site-packages\pandas\io\json\normalize.py", line 180, in _pull_field
    result = result[spec]
TypeError: string indices must be integers

@jreback
Contributor

jreback commented Jun 26, 2018

this is not legitimate input to json_normalize. similar to #21608, I would raise an intelligent message here.
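
The thread does not say what such a message would look like. As a rough sketch only, a field lookup could check the type of the object before indexing it and report what it found. The function name `pull_field_checked` and the wording of the message are assumptions made for illustration, not pandas behavior.

```python
def pull_field_checked(obj, key):
    """Hypothetical lookup that fails with a more descriptive message."""
    if not isinstance(obj, dict):
        # Report the key being looked up and the offending value.
        raise TypeError(
            "expected a dict while looking up {!r}, got {}: {!r}".format(
                key, type(obj).__name__, obj
            )
        )
    return obj[key]


message = None
try:
    # Mirrors the failing case: a string where an object was expected.
    pull_field_checked('B', 'B')
except TypeError as exc:
    message = str(exc)

print(message)
```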

@vuminhle
Contributor Author

vuminhle commented Jun 26, 2018

Why isn't it legitimate?

@jreback
Contributor

jreback commented Dec 13, 2018

duplicate of #22804

@jreback jreback closed this as completed Dec 13, 2018