Error with converting from DataFrames.DataFrame --> Pandas.DataFrame if a column has all Missing values. #71

@00krishna

Hello. I ran into an issue converting a DataFrames.DataFrame to a Pandas.DataFrame when the original DataFrame had a column consisting entirely of missing values. You can see the details of the conversation in the Discourse link here.

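using Missings
using DataFrames
import Pandas

t = DataFrame(rand(4,5))
# Note: this assigns `Missing` (the type), which matches the iterate(::Type{Missing}) error shown below.
t[:, :x6] .= Missing
Pandas.DataFrame(t)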

The error message I was getting was as follows:

Error showing value of type Pandas.DataFrame:
ERROR: PyError ($(Expr(:escape, :(ccall(#= /home/krishnab/.julia/packages/PyCall/zqDXB/src/pyfncall.jl:43 =# @pysym(:PyObject_Call), PyPtr, (PyPtr, PyPtr, PyPtr), o, pyargsptr, kw))))) <class 'RuntimeError'>
RuntimeError('Julia exception: MethodError: no method matching iterate(::Type{Missing})\nClosest candidates are:\n  iterate(!Matched::Core.SimpleVector) at essentials.jl:603\n  iterate(!Matched::Core.SimpleVector, !Matched::Any) at essentials.jl:603\n  iterate(!Matched::ExponentialBackOff) at error.jl:253\n  ...\nStacktrace:\n [1] jlwrap_iterator(::Type{T} where T) at /home/krishnab/.julia/packages/PyCall/zqDXB/src/pyiterator.jl:144\n [2] pyjlwrap_getiter(::Ptr{PyCall.PyObject_struct}) at /home/krishnab/.julia/packages/PyCall/zqDXB/src/pyiterator.jl:125\n [3] macro expansion at /home/krishnab/.julia/packages/PyCall/zqDXB/src/exception.jl:93 [inlined]\n [4] #110 at /home/krishnab/.julia/packages/PyCall/zqDXB/src/pyfncall.jl:43 [inlined]\n [5] disable_sigint at ./c.jl:446 [inlined]\n [6] __pycall! at /home/krishnab/.julia/packages/PyCall/zqDXB/src/pyfncall.jl:42 [inlined]\n [7] _pycall!(::PyCall.PyObject, ::PyCall.PyObject, ::Tuple{}, ::Int64, ::Ptr{Nothing}) at /home/krishnab/.julia/packages/PyCall/zqDXB/src/pyfncall.jl:29\n [8] _pycall! 
at /home/krishnab/.julia/packages/PyCall/zqDXB/src/pyfncall.jl:11 [inlined]\n [9] #_#117 at /home/krishnab/.julia/packages/PyCall/zqDXB/src/pyfncall.jl:86 [inlined]\n [10] (::PyCall.PyObject)() at /home/krishnab/.julia/packages/PyCall/zqDXB/src/pyfncall.jl:86\n [11] show(::IOContext{REPL.Terminals.TTYTerminal}, ::Pandas.DataFrame) at /home/krishnab/.julia/packages/Pandas/rAPmB/src/Pandas.jl:319\n [12] show(::IOContext{REPL.Terminals.TTYTerminal}, ::MIME{Symbol("text/plain")}, ::Pandas.DataFrame) at ./multimedia.jl:47\n [13] display(::REPL.REPLDisplay, ::MIME{Symbol("text/plain")}, ::Any) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/REPL/src/REPL.jl:137\n [14] display(::REPL.REPLDisplay, ::Any) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/REPL/src/REPL.jl:141\n [15] display(::Any) at ./multimedia.jl:323\n [16] #invokelatest#1 at ./essentials.jl:712 [inlined]\n [17] invokelatest at ./essentials.jl:711 [inlined]\n [18] print_response(::IO, ::Any, ::Bool, ::Bool, ::Any) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/REPL/src/REPL.jl:161\n [19] print_response(::REPL.AbstractREPL, ::Any, ::Bool, ::Bool) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/REPL/src/REPL.jl:146\n [20] (::REPL.var"#do_respond#38"{Bool,REPL.var"#48#57"{REPL.LineEditREPL,REPL.REPLHistoryProvider},REPL.LineEditREPL,REPL.LineEdit.Prompt})(::Any, ::Any, ::Any) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/REPL/src/REPL.jl:729\n [21] #invokelatest#1 at ./essentials.jl:712 [inlined]\n [22] invokelatest at ./essentials.jl:711 [inlined]\n [23] run_interface(::REPL.Terminals.TextTerminal, ::REPL.LineEdit.ModalInterface, ::REPL.LineEdit.MIState) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/REPL/src/LineEdit.jl:2354\n [24] run_frontend(::REPL.LineEditREPL, ::REPL.REPLBackendRef) at 
/buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/REPL/src/REPL.jl:1055\n [25] run_repl(::REPL.AbstractREPL, ::Any) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/REPL/src/REPL.jl:206\n [26] (::Base.var"#764#766"{Bool,Bool,Bool,Bool})(::Module) at ./client.jl:383\n [27] #invokelatest#1 at ./essentials.jl:712 [inlined]\n [28] invokelatest at ./essentials.jl:711 [inlined]\n [29] run_main_repl(::Bool, ::Bool, ::Bool, ::Bool, ::Bool) at ./client.jl:367\n [30] exec_options(::Base.JLOptions) at ./client.jl:305\n [31] _start() at ./client.jl:484')
  File "/media/krishnab/lakshmi/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py", line 687, in __repr__
    show_dimensions=show_dimensions,
  File "/media/krishnab/lakshmi/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py", line 820, in to_string
    return formatter.to_string(buf=buf, encoding=encoding)
  File "/media/krishnab/lakshmi/anaconda3/lib/python3.7/site-packages/pandas/io/formats/format.py", line 914, in to_string
    return self.get_result(buf=buf, encoding=encoding)
  File "/media/krishnab/lakshmi/anaconda3/lib/python3.7/site-packages/pandas/io/formats/format.py", line 521, in get_result
    self.write_result(buf=f)
  File "/media/krishnab/lakshmi/anaconda3/lib/python3.7/site-packages/pandas/io/formats/format.py", line 823, in write_result
    strcols = self._to_str_columns()
  File "/media/krishnab/lakshmi/anaconda3/lib/python3.7/site-packages/pandas/io/formats/format.py", line 759, in _to_str_columns
    fmt_values = self._format_col(i)
  File "/media/krishnab/lakshmi/anaconda3/lib/python3.7/site-packages/pandas/io/formats/format.py", line 954, in _format_col
    decimal=self.decimal,
  File "/media/krishnab/lakshmi/anaconda3/lib/python3.7/site-packages/pandas/io/formats/format.py", line 1172, in format_array
    return fmt_obj.get_result()
  File "/media/krishnab/lakshmi/anaconda3/lib/python3.7/site-packages/pandas/io/formats/format.py", line 1203, in get_result
    fmt_values = self._format_strings()
  File "/media/krishnab/lakshmi/anaconda3/lib/python3.7/site-packages/pandas/io/formats/format.py", line 1268, in _format_strings
    fmt_values.append(tpl.format(v=_format(v)))
  File "/media/krishnab/lakshmi/anaconda3/lib/python3.7/site-packages/pandas/io/formats/format.py", line 1242, in _format
    return "{x}".format(x=formatter(x))
  File "/media/krishnab/lakshmi/anaconda3/lib/python3.7/site-packages/pandas/io/formats/format.py", line 1220, in <lambda>
    else (lambda x: pprint_thing(x, escape_chars=("\t", "\r", "\n")))
  File "/media/krishnab/lakshmi/anaconda3/lib/python3.7/site-packages/pandas/io/formats/printing.py", line 219, in pprint_thing
    elif is_sequence(thing) and _nest_lvl < get_option("display.pprint_nest_depth"):
  File "/media/krishnab/lakshmi/anaconda3/lib/python3.7/site-packages/pandas/core/dtypes/inference.py", line 420, in is_sequence
    iter(obj)  # Can iterate over it.

Stacktrace:
 [1] pyerr_check at /home/krishnab/.julia/packages/PyCall/zqDXB/src/exception.jl:60 [inlined]
 [2] pyerr_check at /home/krishnab/.julia/packages/PyCall/zqDXB/src/exception.jl:64 [inlined]
 [3] _handle_error(::String) at /home/krishnab/.julia/packages/PyCall/zqDXB/src/exception.jl:81
 [4] macro expansion at /home/krishnab/.julia/packages/PyCall/zqDXB/src/exception.jl:95 [inlined]
 [5] #110 at /home/krishnab/.julia/packages/PyCall/zqDXB/src/pyfncall.jl:43 [inlined]
 [6] disable_sigint at ./c.jl:446 [inlined]
 [7] __pycall! at /home/krishnab/.julia/packages/PyCall/zqDXB/src/pyfncall.jl:42 [inlined]
 [8] _pycall!(::PyCall.PyObject, ::PyCall.PyObject, ::Tuple{}, ::Int64, ::Ptr{Nothing}) at /home/krishnab/.julia/packages/PyCall/zqDXB/src/pyfncall.jl:29
 [9] _pycall! at /home/krishnab/.julia/packages/PyCall/zqDXB/src/pyfncall.jl:11 [inlined]
 [10] #_#117 at /home/krishnab/.julia/packages/PyCall/zqDXB/src/pyfncall.jl:86 [inlined]
 [11] (::PyCall.PyObject)() at /home/krishnab/.julia/packages/PyCall/zqDXB/src/pyfncall.jl:86
 [12] show(::IOContext{REPL.Terminals.TTYTerminal}, ::Pandas.DataFrame) at /home/krishnab/.julia/packages/Pandas/rAPmB/src/Pandas.jl:319
 [13] show(::IOContext{REPL.Terminals.TTYTerminal}, ::MIME{Symbol("text/plain")}, ::Pandas.DataFrame) at ./multimedia.jl:47
 [14] display(::REPL.REPLDisplay, ::MIME{Symbol("text/plain")}, ::Any) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/REPL/src/REPL.jl:137
 [15] display(::REPL.REPLDisplay, ::Any) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/REPL/src/REPL.jl:141
 [16] display(::Any) at ./multimedia.jl:323
 [17] #invokelatest#1 at ./essentials.jl:712 [inlined]
 [18] invokelatest at ./essentials.jl:711 [inlined]
 [19] print_response(::IO, ::Any, ::Bool, ::Bool, ::Any) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/REPL/src/REPL.jl:161
 [20] print_response(::REPL.AbstractREPL, ::Any, ::Bool, ::Bool) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/REPL/src/REPL.jl:146
 [21] (::REPL.var"#do_respond#38"{Bool,REPL.var"#48#57"{REPL.LineEditREPL,REPL.REPLHistoryProvider},REPL.LineEditREPL,REPL.LineEdit.Prompt})(::Any, ::Any, ::Any) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/REPL/src/REPL.jl:729
 [22] #invokelatest#1 at ./essentials.jl:712 [inlined]
 [23] invokelatest at ./essentials.jl:711 [inlined]
 [24] run_interface(::REPL.Terminals.TextTerminal, ::REPL.LineEdit.ModalInterface, ::REPL.LineEdit.MIState) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/REPL/src/LineEdit.jl:2354
 [25] run_frontend(::REPL.LineEditREPL, ::REPL.REPLBackendRef) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/REPL/src/REPL.jl:1055
 [26] run_repl(::REPL.AbstractREPL, ::Any) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.4/REPL/src/REPL.jl:206
 [27] (::Base.var"#764#766"{Bool,Bool,Bool,Bool})(::Module) at ./client.jl:383
 [28] #invokelatest#1 at ./essentials.jl:712 [inlined]
 [29] invokelatest at ./essentials.jl:711 [inlined]
 [30] run_main_repl(::Bool, ::Bool, ::Bool, ::Bool, ::Bool) at ./client.jl:367
 [31] exec_options(::Base.JLOptions) at ./client.jl:305
 [32] _start() at ./client.jl:484

In my specific case, I was converting to a Pandas DataFrame from within some Julia code, and I received the error below: a LoadError wrapping a MethodError about an ambiguous convert call.

ERROR: LoadError: MethodError: convert(::Type{Union{}}, ::Float64) is ambiguous. Candidates:
  convert(::Type{Union{}}, x) in Base at essentials.jl:169
  convert(::Type{T}, x::Number) where T<:Number in Base at number.jl:7
  convert(::Type{T}, arg) where T<:VecElement in Base at baseext.jl:8
  convert(::Type{T}, x::Number) where T<:AbstractChar in Base at char.jl:179
Possible fix, define
  convert(::Type{Union{}}, ::Number)
Stacktrace:
 [1] setindex!(::Array{Union{},1}, ::Float64, ::Int64) at ./array.jl:825
 [2] _construct_pandas_from_iterabletable(::DataFrame) at /home/krishnab/.julia/packages/Pandas/rAPmB/src/tabletraits.jl:37
 [3] DataFrame at /home/krishnab/.julia/packages/Pandas/rAPmB/src/Pandas.jl:457 [inlined]
 [4] run_julia_model(::Dict{String,Any}, ::Int64, ::Int64) at /media/krishnab/lakshmi/sandbox/julia/pyjulia/test_julia.jl:6
 [5] top-level scope at /media/krishnab/lakshmi/sandbox/julia/pyjulia/test_julia.jl:24
 [6] include(::Module, ::String) at ./Base.jl:377
 [7] exec_options(::Base.JLOptions) at ./client.jl:288
 [8] _start() at ./client.jl:484

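We traced the error to some code in Pandas.jl (frame [2] of the stack trace above, _construct_pandas_from_iterabletable in tabletraits.jl). It seems that a column containing only missing values, when passed through DataValues, ends up with the element type Union{}. No value can be converted to that type, so when the code tries to store a Float64 into the resulting Array{Union{},1}, the convert call is ambiguous and fails.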
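
To make that explanation concrete, here is a minimal sketch using only Base Julia (no Pandas.jl or DataValues involved). It assumes the Union{} element type comes from stripping Missing out of the column's element type, which is a guess based on the stack trace rather than something confirmed in the Pandas.jl source. The last part shows one possible workaround, widening the column's element type before conversion; whether Pandas.jl then handles the column correctly has not been verified here.

```julia
# An all-missing column has element type Missing.
col = fill(missing, 4)
@assert eltype(col) === Missing

# Stripping Missing from that element type leaves the empty type Union{}.
T = nonmissingtype(eltype(col))
@assert T === Union{}

# A buffer with that element type, like the Array{Union{},1} in frame [1].
buf = Vector{T}(undef, 2)

# Storing any value requires convert(Union{}, x), which cannot succeed.
# The exact exception type differs between Julia versions.
err = try
    buf[1] = 1.0
    nothing
catch e
    e
end
@assert err isa Exception

# Possible workaround (assumption): widen the element type first so that
# a concrete type remains once Missing is stripped.
widened = convert(Vector{Union{Missing,Float64}}, col)
@assert nonmissingtype(eltype(widened)) === Float64
@assert all(ismissing, widened)
```

In a DataFrame, the same widening would be applied to each all-missing column before calling Pandas.DataFrame.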

Anyhow, let me know if you have any questions -- or please take a look at the Discourse conversation. Thanks.
