Jupyter: Data schema generation bug with mixed nullability of similarly named column #1222

@Jolanrensen

Can be reproduced on 1.0.0-Beta2:

val df1 = dataFrameOf(
    "group" to columnOf(
        "a" to columnOf(1, null, 3),
    )
)
val df2 = dataFrameOf(
    "group" to columnOf(
        "a" to columnOf(1, 2, 3),
    )
)

In Jupyter, the generated data schemas are:

// for df1

@DataSchema(isOpen = false)
interface _DataFrameType1 {
    val a: Int?
}

@DataSchema
interface _DataFrameType {
    val group: _DataFrameType1
}

// for df2

@DataSchema(isOpen = false)
interface _DataFrameType3 {
    val a: Int
}

@DataSchema
interface _DataFrameType2 : _DataFrameType {
    override val group: _DataFrameType3 // Type of 'group' is not a subtype of overridden property 'val group: _DataFrameType1' defined in '_DataFrameType'
}

What I suspect was meant to be generated is something like this:

// for df1

@DataSchema(isOpen = true)
interface _DataFrameType1 {
    val a: Int?
}

@DataSchema
interface _DataFrameType {
    val group: _DataFrameType1
}

// for df2

@DataSchema(isOpen = true)
interface _DataFrameType3 : _DataFrameType1 { // now the non-nullable variant extends the nullable variant
    override val a: Int // requires override
}

@DataSchema
interface _DataFrameType2 : _DataFrameType {
    override val group: _DataFrameType3
}

or, if the two schemas are kept disconnected (no inheritance between them):

// for df1

@DataSchema(isOpen = false)
interface _DataFrameType1 {
    val a: Int?
}

@DataSchema
interface _DataFrameType {
    val group: _DataFrameType1
}

// for df2

@DataSchema(isOpen = false)
interface _DataFrameType3 {
    val a: Int
}

@DataSchema
interface _DataFrameType2 {
    val group: _DataFrameType3
}
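
To illustrate why the inheritance-based alternative above would compile, here is a minimal sketch in plain Kotlin, without `@DataSchema` or any DataFrame dependency. The names `NullableA`, `NonNullA`, `Outer`, `OuterNarrowed` and `describe` are hypothetical stand-ins for the generated `_DataFrameType*` interfaces, not something the library generates. It relies only on the Kotlin rule that an overriding `val` may narrow its type to a subtype of the overridden property's type.

```kotlin
// Hypothetical stand-ins for the generated schema interfaces (assumption, not library output).
interface NullableA {             // plays the role of _DataFrameType1
    val a: Int?
}

interface NonNullA : NullableA {  // plays the role of _DataFrameType3
    override val a: Int           // allowed: Int is a subtype of Int?
}

interface Outer {                 // plays the role of _DataFrameType
    val group: NullableA
}

interface OuterNarrowed : Outer { // plays the role of _DataFrameType2
    override val group: NonNullA  // allowed only because NonNullA extends NullableA
}

// Code written against the wider schema keeps working with the narrowed one.
fun describe(outer: Outer): String = "a = ${outer.group.a}"

fun main() {
    val narrowed = object : OuterNarrowed {
        override val group: NonNullA = object : NonNullA {
            override val a: Int = 1
        }
    }
    println(describe(narrowed)) // prints "a = 1"
}
```

If `NonNullA` did not extend `NullableA`, the `override val group: NonNullA` line would be rejected with the same kind of "not a subtype of overridden property" error as in the generated code.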

Labels: bug (Something isn't working)
