Merged
180 changes: 58 additions & 122 deletions README.md
@@ -37,143 +37,65 @@ You could find the following articles there:

## Setup

### Gradle for JVM
```groovy
// build.gradle

plugins {
// Optional Gradle plugin for enhanced type safety and schema generation
// https://kotlin.github.io/dataframe/gradle.html
id 'org.jetbrains.kotlinx.dataframe' version '0.12.0'
}

repositories {
mavenCentral()
}

dependencies {
implementation 'org.jetbrains.kotlinx:dataframe:0.12.0'
}
```kotlin
implementation("org.jetbrains.kotlinx:dataframe:0.12.1")
```

Optional Gradle plugin for enhanced type safety and schema generation
https://kotlin.github.io/dataframe/schemasgradle.html
```kotlin
// build.gradle.kts

plugins {
// Optional Gradle plugin for enhanced type safety and schema generation
// https://kotlin.github.io/dataframe/gradle.html
id("org.jetbrains.kotlinx.dataframe") version "0.12.0"
}

repositories {
mavenCentral()
}

dependencies {
implementation("org.jetbrains.kotlinx:dataframe:0.12.0")
}
id("org.jetbrains.kotlinx.dataframe") version "0.12.1"
```

### Gradle for Android
```groovy
// build.gradle

plugins {
// Optional Gradle plugin for enhanced type safety and schema generation
// https://kotlin.github.io/dataframe/gradle.html
id 'org.jetbrains.kotlinx.dataframe' version '0.12.0'
}
See the [custom setup page](https://kotlin.github.io/dataframe/gettingstartedgradleadvanced.html) if you don't need some of the formats as dependencies, if you use Groovy, or if you need configurations specific to Android projects.

dependencies {
implementation 'org.jetbrains.kotlinx:dataframe:0.12.0'
}
## Getting started

android {
defaultConfig {
minSdk 26 // Android O+
}
compileOptions {
sourceCompatibility JavaVersion.VERSION_1_8
targetCompatibility JavaVersion.VERSION_1_8
}
kotlinOptions {
jvmTarget = '1.8'
}
packagingOptions {
resources {
pickFirsts = ["META-INF/AL2.0",
"META-INF/LGPL2.1",
"META-INF/ASL-2.0.txt",
"META-INF/LICENSE.md",
"META-INF/NOTICE.md",
"META-INF/LGPL-3.0.txt"]
excludes = ["META-INF/kotlin-jupyter-libraries/libraries.json",
"META-INF/{INDEX.LIST,DEPENDENCIES}",
"{draftv3,draftv4}/schema",
"arrow-git.properties"]
}
}
}

// optional, could be required for KSP
tasks.withType(KotlinCompile).configureEach {
kotlinOptions {
jvmTarget = '1.8'
}
}
```kotlin
import org.jetbrains.kotlinx.dataframe.*
import org.jetbrains.kotlinx.dataframe.api.*
import org.jetbrains.kotlinx.dataframe.io.*
```

```kotlin
// build.gradle.kts
val df = DataFrame.read("https://raw.githubusercontent.com/Kotlin/dataframe/master/data/jetbrains_repositories.csv")
df["full_name"][0] // Indexing https://kotlin.github.io/dataframe/access.html
Collaborator

Thanks, Theodor ;)

Collaborator

It does look a bit odd to just have a link for access and not for reading, filtering, column accessor creation etc.

Collaborator Author

True, but i found it odd to have link for all operations as well :D Soo.. In this case the message it should convey is "hey, we have an API reference, check it out"

Collaborator

Sounds reasonable :)


plugins {
// Optional Gradle plugin for enhanced type safety and schema generation
// https://kotlin.github.io/dataframe/gradle.html
id("org.jetbrains.kotlinx.dataframe") version "0.12.0"
}
df.filter { "stargazers_count"<Int>() > 50 }.print()
```

dependencies {
implementation("org.jetbrains.kotlinx:dataframe:0.12.0")
}
## Getting started with data schema
Collaborator

maybe *generated data schemas? To emphasise that you can write them yourself or have them auto-generated


android {
defaultConfig {
minSdk = 26 // Android O+
}
compileOptions {
sourceCompatibility = JavaVersion.VERSION_1_8
targetCompatibility = JavaVersion.VERSION_1_8
}
kotlinOptions {
jvmTarget = "1.8"
}
packaging {
resources {
pickFirsts += listOf(
"META-INF/AL2.0",
"META-INF/LGPL2.1",
"META-INF/ASL-2.0.txt",
"META-INF/LICENSE.md",
"META-INF/NOTICE.md",
"META-INF/LGPL-3.0.txt",
)
excludes += listOf(
"META-INF/kotlin-jupyter-libraries/libraries.json",
"META-INF/{INDEX.LIST,DEPENDENCIES}",
"{draftv3,draftv4}/schema",
"arrow-git.properties",
)
}
}
}
Requires the Gradle plugin to work:
```kotlin
id("org.jetbrains.kotlinx.dataframe") version "0.12.1"
```

Plugin generates extension properties API for provided sample of data. Column names and their types become discoverable in completion.
Collaborator

The first sentence is a bit odd (and lacks articles): the "Extension properties API" is an abstract concept and an instance of one of the "Access APIs" we offer. So I'd just say that the plugin generates extension properties using the provided sample of data.


// required for KSP
tasks.withType<org.jetbrains.kotlin.gradle.tasks.KotlinCompile> {
kotlinOptions.jvmTarget = "1.8"
```kotlin
// Make sure to place the file annotation above the package directive
@file:ImportDataSchema(
"Repository",
"https://raw.githubusercontent.com/Kotlin/dataframe/master/data/jetbrains_repositories.csv",
)

package example

import org.jetbrains.kotlinx.dataframe.annotations.ImportDataSchema
import org.jetbrains.kotlinx.dataframe.api.*

fun main() {
// execute `assemble` to generate extension properties API
Collaborator

"API" again is a bit odd

val df = Repository.readCSV()
df.fullName[0]

df.filter { stargazersCount > 50 }
}
```

### Jupyter Notebook
## Getting started in Jupyter Notebook / Kotlin Notebook

Install [Kotlin kernel](https://github.com/Kotlin/kotlin-jupyter) for [Jupyter](https://jupyter.org/)

Collaborator

Okay, maybe I'll check the docs and readme in the future. We lack articles all over the place. It looks a bit unprofessional to me :/

@@ -186,14 +108,26 @@ or specific version:
%use dataframe(<version>)
```

```kotlin
val df = DataFrame.read("https://raw.githubusercontent.com/Kotlin/dataframe/master/data/jetbrains_repositories.csv")
df // the last expression in the cell is displayed
```

Once a cell that declares a variable has been executed, `DataFrame` provides extension properties based on its data in the next cell:
```kotlin
df.filter { stargazers_count > 50 }
```
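
Outside a notebook, and without the Gradle plugin, the same filter can be written with string column names, as in the Getting started snippet. The following is a minimal sketch, assuming the `dataframe` dependency from Setup is on the classpath. The two-row sample and the helper names `sampleRepos` and `popularRepos` are hypothetical stand-ins for the CSV and are not part of the library:

```kotlin
import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical in-memory sample standing in for jetbrains_repositories.csv
fun sampleRepos(): AnyFrame = dataFrameOf("full_name", "stargazers_count")(
    "example/repo-a", 120,
    "example/repo-b", 7,
)

// String API: the column is looked up by name and its values are cast to Int at runtime
fun popularRepos(df: AnyFrame): AnyFrame =
    df.filter { "stargazers_count"<Int>() > 50 }

fun main() {
    popularRepos(sampleRepos()).print()
}
```

With string names, a typo or a wrong type argument only surfaces at runtime, which is the gap the generated extension properties are meant to close.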

## Data model
* `DataFrame` is a list of columns with equal sizes and distinct names.
* `DataColumn` is a named list of values. It can be one of three kinds:
* `ValueColumn` — contains data
* `ColumnGroup` — contains columns
* `FrameColumn` — contains dataframes
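
The three column kinds listed above can be observed on a small frame. This is a sketch under stated assumptions rather than a reference: it assumes the `dataframe` dependency is on the classpath, the sample rows are hypothetical, and the helper names `people`, `withNameGroup` and `groupedByAge` are introduced here for illustration only:

```kotlin
import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.api.*

// Hypothetical sample data
fun people(): AnyFrame = dataFrameOf("firstName", "lastName", "age")(
    "Alice", "Cooper", 15,
    "Bob", "Dylan", 45,
    "Charlie", "Daniels", 20,
)

// Moving two columns under "name" makes "name" a ColumnGroup
fun withNameGroup(df: AnyFrame): AnyFrame =
    df.group("firstName", "lastName").into("name")

// Reinterpreting a GroupBy as a DataFrame yields a FrameColumn holding one sub-frame per key
fun groupedByAge(df: AnyFrame): AnyFrame =
    df.groupBy("age").toDataFrame()

fun main() {
    val grouped = withNameGroup(people())
    println(grouped["age"].isValueColumn())   // plain values
    println(grouped["name"].isColumnGroup())  // nested columns
    println(groupedByAge(people()).columns().any { it.isFrameColumn() }) // nested dataframes
}
```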

## Usage example
## Syntax example

Let us show you what data cleaning and aggregation pipelines could look like with DataFrame.

**Create:**
```kotlin
@@ -269,7 +203,9 @@ clean
}
```

[Try it in **Datalore**](https://datalore.jetbrains.com/view/notebook/vq5j45KWkYiSQnACA2Ymij) and explore [**more examples here**](examples).
Check it out on [**Datalore**](https://datalore.jetbrains.com/view/notebook/vq5j45KWkYiSQnACA2Ymij) to get a better visual impression of what happens and what the hierarchical DataFrame structure looks like.

Explore [**more examples here**](examples).

## Kotlin, Kotlin Jupyter, OpenAPI, Arrow and JDK versions

@@ -1,7 +1,7 @@
<html>
<head>
<style type="text/css">
:root {
<html>
<head>
<style type="text/css">
:root {
--background: #fff;
--background-odd: #f5f5f5;
--background-hover: #d9edfd;
@@ -173,34 +173,12 @@
summary {
padding: 6px;
}



















</style>
</head>
<body>


</style>
</head>
<body>
<details>
<summary>df.df[0].name</summary>

<details>
<details>
<summary>Input DataFrame: rowsCount = 7, columnsCount = 5</summary>
<table class="dataframe" id="df_0"></table>

@@ -222,8 +200,7 @@
<br>
<details>
<summary>df.df[3, 5, 6].select { name and age }</summary>

<details>
<details>
<summary>Input DataFrame: rowsCount = 7, columnsCount = 5</summary>
<table class="dataframe" id="df_3"></table>

@@ -245,8 +222,7 @@
<br>
<details>
<summary>df.df[3..5]</summary>

<details>
<details>
<summary>Input DataFrame: rowsCount = 7, columnsCount = 5</summary>
<table class="dataframe" id="df_6"></table>

@@ -260,9 +236,9 @@
</details>
</details>
<br>
</body>
<script>
(function () {
</body>
<script>
(function () {
window.DataFrame = window.DataFrame || new (function () {
this.addTable = function (df) {
let cols = df.cols;
@@ -536,8 +512,6 @@
}
})()



/*<!--*/
call_DataFrame(function() { DataFrame.addTable({ cols: [{ name: "<span title=\"firstName: String\">firstName</span>", children: [], rightAlign: false, values: ["Alice","Bob","Charlie","Charlie","Bob","Alice","Charlie"] },
{ name: "<span title=\"lastName: String\">lastName</span>", children: [], rightAlign: false, values: ["Cooper","Dylan","Daniels","Chaplin","Marley","Wolf","Byrd"] },
@@ -551,7 +525,6 @@

call_DataFrame(function() { DataFrame.renderTable(0) });


/*<!--*/
call_DataFrame(function() { DataFrame.addTable({ cols: [{ name: "<span title=\"firstName: String\">firstName</span>", children: [], rightAlign: false, values: ["Alice"] },
{ name: "<span title=\"lastName: String\">lastName</span>", children: [], rightAlign: false, values: ["Cooper"] },
@@ -565,7 +538,6 @@

call_DataFrame(function() { DataFrame.renderTable(1) });


/*<!--*/
call_DataFrame(function() { DataFrame.addTable({ cols: [{ name: "<span title=\"firstName: String\">firstName</span>", children: [], rightAlign: false, values: ["Alice"] },
{ name: "<span title=\"lastName: String\">lastName</span>", children: [], rightAlign: false, values: ["Cooper"] },
@@ -574,8 +546,6 @@

call_DataFrame(function() { DataFrame.renderTable(2) });



/*<!--*/
call_DataFrame(function() { DataFrame.addTable({ cols: [{ name: "<span title=\"firstName: String\">firstName</span>", children: [], rightAlign: false, values: ["Alice","Bob","Charlie","Charlie","Bob","Alice","Charlie"] },
{ name: "<span title=\"lastName: String\">lastName</span>", children: [], rightAlign: false, values: ["Cooper","Dylan","Daniels","Chaplin","Marley","Wolf","Byrd"] },
@@ -589,7 +559,6 @@

call_DataFrame(function() { DataFrame.renderTable(3) });


/*<!--*/
call_DataFrame(function() { DataFrame.addTable({ cols: [{ name: "<span title=\"firstName: String\">firstName</span>", children: [], rightAlign: false, values: ["Charlie","Alice","Charlie"] },
{ name: "<span title=\"lastName: String\">lastName</span>", children: [], rightAlign: false, values: ["Chaplin","Wolf","Byrd"] },
@@ -603,7 +572,6 @@

call_DataFrame(function() { DataFrame.renderTable(4) });


/*<!--*/
call_DataFrame(function() { DataFrame.addTable({ cols: [{ name: "<span title=\"firstName: String\">firstName</span>", children: [], rightAlign: false, values: ["Charlie","Alice","Charlie"] },
{ name: "<span title=\"lastName: String\">lastName</span>", children: [], rightAlign: false, values: ["Chaplin","Wolf","Byrd"] },
@@ -614,8 +582,6 @@

call_DataFrame(function() { DataFrame.renderTable(5) });



/*<!--*/
call_DataFrame(function() { DataFrame.addTable({ cols: [{ name: "<span title=\"firstName: String\">firstName</span>", children: [], rightAlign: false, values: ["Alice","Bob","Charlie","Charlie","Bob","Alice","Charlie"] },
{ name: "<span title=\"lastName: String\">lastName</span>", children: [], rightAlign: false, values: ["Cooper","Dylan","Daniels","Chaplin","Marley","Wolf","Byrd"] },
@@ -629,7 +595,6 @@

call_DataFrame(function() { DataFrame.renderTable(6) });


/*<!--*/
call_DataFrame(function() { DataFrame.addTable({ cols: [{ name: "<span title=\"firstName: String\">firstName</span>", children: [], rightAlign: false, values: ["Charlie","Bob","Alice"] },
{ name: "<span title=\"lastName: String\">lastName</span>", children: [], rightAlign: false, values: ["Chaplin","Marley","Wolf"] },
@@ -643,6 +608,5 @@

call_DataFrame(function() { DataFrame.renderTable(7) });


</script>
</html>
</script>
</html>