Commit 44fc0fd

BUG: read_json raising with table orient and NA values (#50332)
* BUG: read_json raising with table orient and NA values
* Add test
1 parent 0e0b987 commit 44fc0fd

3 files changed: +48 -3 lines changed

doc/source/whatsnew/v2.0.0.rst

Lines changed: 1 addition & 0 deletions
@@ -879,6 +879,7 @@ I/O
 - Bug when a pickling a subset PyArrow-backed data that would serialize the entire data instead of the subset (:issue:`42600`)
 - Bug in :func:`read_sql_query` ignoring ``dtype`` argument when ``chunksize`` is specified and result is empty (:issue:`50245`)
 - Bug in :func:`read_csv` for a single-line csv with fewer columns than ``names`` raised :class:`.errors.ParserError` with ``engine="c"`` (:issue:`47566`)
+- Bug in :func:`read_json` raising with ``orient="table"`` and ``NA`` value (:issue:`40255`)
 - Bug in displaying ``string`` dtypes not showing storage option (:issue:`50099`)
 - Bug in :meth:`DataFrame.to_string` with ``header=False`` that printed the index name on the same line as the first row of the data (:issue:`49230`)
 - Bug in :meth:`DataFrame.to_string` ignoring float formatter for extension arrays (:issue:`39336`)

pandas/io/json/_table_schema.py

Lines changed: 3 additions & 3 deletions
@@ -196,11 +196,11 @@ def convert_json_field_to_pandas_type(field) -> str | CategoricalDtype:
     if typ == "string":
         return "object"
     elif typ == "integer":
-        return "int64"
+        return field.get("extDtype", "int64")
     elif typ == "number":
-        return "float64"
+        return field.get("extDtype", "float64")
     elif typ == "boolean":
-        return "bool"
+        return field.get("extDtype", "bool")
     elif typ == "duration":
         return "timedelta64"
     elif typ == "datetime":
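
The change above can be illustrated in isolation. The sketch below is not the pandas function; `sketch_field_to_dtype` and `_DEFAULTS` are hypothetical names that reproduce only the three branches touched by this commit. It assumes, as the diff suggests, that a schema field may carry an optional `extDtype` key naming a nullable extension dtype, and that the plain NumPy dtype is the fallback when the key is absent.

```python
from typing import Any, Mapping

# Hypothetical stand-in for the three branches changed in this commit.
# Fallback NumPy dtype names used when the field has no "extDtype" key.
_DEFAULTS = {"integer": "int64", "number": "float64", "boolean": "bool"}


def sketch_field_to_dtype(field: Mapping[str, Any]) -> str:
    """Return the dtype name for a Table Schema field (illustrative only)."""
    typ = field["type"]
    if typ in _DEFAULTS:
        # Prefer the extension dtype recorded in the schema, if present.
        return field.get("extDtype", _DEFAULTS[typ])
    raise ValueError(f"type not covered by this sketch: {typ!r}")


# A field written from a nullable Int64 column keeps its extension dtype.
print(sketch_field_to_dtype({"name": "a", "type": "integer", "extDtype": "Int64"}))
# A field without "extDtype" falls back to the NumPy dtype, as before.
print(sketch_field_to_dtype({"name": "a", "type": "integer"}))
```

With the old hard-coded return values, an `integer` field was presumably always read back as `int64`, which has no representation for a missing value. That appears to be why a `null` entry in the data raised. Reading the `extDtype` hint lets the column be built as a nullable type instead.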

pandas/tests/io/json/test_json_table_schema_ext_dtype.py

Lines changed: 44 additions & 0 deletions
@@ -8,9 +8,13 @@
 import pytest

 from pandas import (
+    NA,
     DataFrame,
+    Index,
     array,
+    read_json,
 )
+import pandas._testing as tm
 from pandas.core.arrays.integer import Int64Dtype
 from pandas.core.arrays.string_ import StringDtype
 from pandas.core.series import Series
@@ -273,3 +277,43 @@ def test_to_json(self, df):
         expected = OrderedDict([("schema", schema), ("data", data)])

         assert result == expected
+
+    def test_json_ext_dtype_reading_roundtrip(self):
+        # GH#40255
+        df = DataFrame(
+            {
+                "a": Series([2, NA], dtype="Int64"),
+                "b": Series([1.5, NA], dtype="Float64"),
+                "c": Series([True, NA], dtype="boolean"),
+            },
+            index=Index([1, NA], dtype="Int64"),
+        )
+        expected = df.copy()
+        data_json = df.to_json(orient="table", indent=4)
+        result = read_json(data_json, orient="table")
+        tm.assert_frame_equal(result, expected)
+
+    def test_json_ext_dtype_reading(self):
+        # GH#40255
+        data_json = """{
+            "schema":{
+                "fields":[
+                    {
+                        "name":"a",
+                        "type":"integer",
+                        "extDtype":"Int64"
+                    }
+                ],
+            },
+            "data":[
+                {
+                    "a":2
+                },
+                {
+                    "a":null
+                }
+            ]
+        }"""
+        result = read_json(data_json, orient="table")
+        expected = DataFrame({"a": Series([2, NA], dtype="Int64")})
+        tm.assert_frame_equal(result, expected)
