Description
I'd argue that KType nullability should always be inferred by checking the actual column values.
https://github.com/Kotlin/dataframe/blob/master/dataframe-jdbc/src/main/kotlin/org/jetbrains/kotlinx/dataframe/io/readJdbc.kt#L597
That is what `infer = Infer.Nulls` does.
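For illustration, a minimal sketch of the difference `Infer.Nulls` makes when creating a column directly (`DataColumn.createValueColumn` and `Infer` are existing DataFrame entry points; the exact behavior is as I understand it):

```kotlin
import org.jetbrains.kotlinx.dataframe.DataColumn
import org.jetbrains.kotlinx.dataframe.api.Infer
import kotlin.reflect.typeOf

// The declared type is String?, but this particular snapshot has no nulls.
val values: List<String?> = listOf("a", "b", "c")

// Infer.Nulls keeps the declared type but re-checks its nullability
// against the actual values: no nulls present -> non-nullable String.
val col = DataColumn.createValueColumn("s", values, typeOf<String?>(), Infer.Nulls)
println(col.type()) // kotlin.String

// If values contained a null, the same call would produce kotlin.String?
```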
My reasoning is mostly related to notebooks.
Pros: you won't have to handle nullable values if the given snapshot doesn't have any! Very convenient if you just want to work with a specific fragment of the data.
Cons: imagine you want to rerun the same notebook, but this time the data has nulls. Now you'll have to modify your code to handle them, or you'll get a compilation error.
So the desirable behavior varies with your use case: exploring data once vs. reusing a notebook.
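To make the trade-off concrete, here is a hypothetical notebook fragment (assumes `%use dataframe`; the `users` table and `age` column are made up, and `age` is the extension property the notebook generates from the inferred schema):

```kotlin
val dbConfig = DatabaseConfiguration(url = "jdbc:h2:mem:test", user = "sa", password = "")
val df = DataFrame.readSqlTable(dbConfig, "users")

// Run 1: this snapshot has no NULLs in "age", so it is inferred as Int
// and the cell compiles as-is, which is convenient for one-off exploration.
df.filter { age > 18 }

// Run 2: the data now contains NULLs, "age" becomes Int?,
// and the same cell no longer compiles until null handling is added, e.g.:
df.filter { (age ?: 0) > 18 }
```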
My suggestion here: to support reusability of notebooks, the JDBC integration should have a method to import a data schema from the DB schema, the same way the OpenAPI support does.
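In a notebook, that could look like the cell below. The first call mirrors the existing OpenAPI `importDataSchema` usage (syntax approximated from memory); the JDBC overload is hypothetical and does not exist today. It would take nullability from the database metadata (NULL/NOT NULL) rather than from the values in the current snapshot:

```kotlin
// Existing: generate typed data schemas from an OpenAPI spec.
val PetStore = importDataSchema("https://petstore3.swagger.io/api/v3/openapi.json")

// Hypothetical JDBC counterpart, deriving the schema from the DB schema:
val Users = importDataSchema(dbConfig, table = "users")
```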
Things to consider here: it's already possible to write (or generate and edit) a data schema to rerun notebooks without problems. There are also other operations that work like this: `add`, `convert`, and other functions create a nullable KType only if there are nulls, as do other data sources (see the discussion in the context of Arrow: #428, which includes an additional argument about KType nullability). For reference, the `add` signature, with sketches of both points after it:
```kotlin
public inline fun <reified R, T> DataFrame<T>.add(
    name: String,
    noinline expression: AddExpression<T, R>
): DataFrame<T> = add(name, Infer.Nulls, expression)
```
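Because of `Infer.Nulls`, the column type produced by `add` depends on what the expression actually returns for this particular DataFrame. A sketch with made-up columns `id` and `name`:

```kotlin
// The expression returns null for odd ids -> the new column is String?
val withTag = df.add("tag") { if (id % 2 == 0) "even" else null }

// The expression can never return null -> the new column is Int
val withLen = df.add("len") { name.length }
```

And the workaround mentioned above, a hand-written data schema (table and columns are made up): declaring columns nullable up front keeps the notebook compiling across reruns, whether or not the current snapshot contains nulls.

```kotlin
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.annotations.DataSchema
import org.jetbrains.kotlinx.dataframe.api.cast
import org.jetbrains.kotlinx.dataframe.api.filter
import org.jetbrains.kotlinx.dataframe.io.readSqlTable

@DataSchema
interface User {
    val id: Int
    val email: String? // nullable up front, even if today's snapshot has no nulls
}

// cast ties the runtime frame to the hand-written schema,
// so the generated accessors stay stable across reruns:
val users = DataFrame.readSqlTable(dbConfig, "users").cast<User>()
val withEmail = users.filter { email != null }
```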