Closed
Description
I think JSONTables needs a better story for heterogeneous data, i.e. arrays whose objects do not all have the same keys. The two results below are generated from the same data, with only the order of the entries changed. Both are surprising and can cause problems: one throws an error, the other silently drops a column.
julia> using JSONTables, DataFrames
julia> json_a = """[
{"timea": 1585154193000, "troublemaker": 97},
{"timea": 1310044361000}
]""";
julia> json_b = """[
{"timea": 1310044361000},
{"timea": 1585154193000,"troublemaker": 97}
]""";
julia> DataFrame(jsontable(json_a)) # throws error
ERROR: KeyError: key :troublemaker not found
Stacktrace:
[1] get(::JSON3.Object{Base.CodeUnits{UInt8,String},SubArray{UInt64,1,Array{UInt64,1},Tuple{UnitRange{Int64}},true}}, ::Symbol) at /home/gerhard/.julia/packages/JSON3/YGLA7/src/JSON3.jl:53
...
julia> DataFrame(jsontable(json_b)) # loses troublemaker silently
2×1 DataFrame
│ Row │ timea │
│ │ Int64 │
├─────┼───────────────┤
│ 1 │ 1310044361000 │
│ 2 │ 1585154193000 │
What I would have expected jsontable to produce:
julia> using JSON3
julia> reduce((x, y) -> append!(x, y;cols=:union), JSON3.read(json_a);init=DataFrame())
2×2 DataFrame
│ Row │ timea │ troublemaker │
│ │ Int64 │ Int64? │
├─────┼───────────────┼──────────────┤
│ 1 │ 1585154193000 │ 97 │
│ 2 │ 1310044361000 │ missing │
julia> reduce((x, y) -> append!(x, y;cols=:union), JSON3.read(json_b);init=DataFrame())
2×2 DataFrame
│ Row │ timea │ troublemaker │
│ │ Int64 │ Int64? │
├─────┼───────────────┼──────────────┤
│ 1 │ 1310044361000 │ missing │
│ 2 │ 1585154193000 │ 97 │
If this is not possible or not desired, the documentation should at least include a clear warning about what to expect.
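In the meantime, the expected behaviour can be approximated outside of JSONTables. The following is only a minimal sketch: `union_table` is a hypothetical helper name, not part of JSONTables or JSON3, and it assumes the input is a JSON array of flat objects. It collects the union of all keys first and then fills every absent key with `missing`, so the result does not depend on the order of the entries.

```julia
using JSON3, DataFrames

# Hypothetical helper (not part of JSONTables): build a DataFrame from a JSON
# array of flat objects, using the union of all keys as the column set.
function union_table(json::AbstractString)
    rows = JSON3.read(json)

    # Collect column names in order of first appearance across all rows.
    colnames = Symbol[]
    for row in rows, key in keys(row)
        key in colnames || push!(colnames, key)
    end

    # Fill one column per name; keys absent from a row become `missing`.
    df = DataFrame()
    for name in colnames
        df[!, name] = [get(row, name, missing) for row in rows]
    end
    return df
end

json_a = """[
{"timea": 1585154193000, "troublemaker": 97},
{"timea": 1310044361000}
]"""
json_b = """[
{"timea": 1310044361000},
{"timea": 1585154193000, "troublemaker": 97}
]"""

df_a = union_table(json_a)
df_b = union_table(json_b)
```

Both calls should give a table with the columns `timea` and `troublemaker`, differing only in row order. The cost is a second pass over the rows, which may be why a streaming reader would not want this as its default.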
Thx!