
Commit fb0853b

Add guide for custom SQL database support with HSQLDB (#986)
* Add guide for custom SQL database support with HSQLDB

This commit introduces documentation detailing the process of extending the Kotlin DataFrame library to support custom SQL databases, using HSQLDB as an example. The guide includes prerequisites, implementation of a custom database type, and example code for managing database tables and schemas. Additionally, updates have been made to reflect the possibility of registering custom SQL databases in existing files.

* Add Gradle instructions to custom SQL database guide
1 parent 0478671 commit fb0853b

3 files changed

+200
-54
lines changed

docs/StardustDocs/d.tree

Lines changed: 1 addition & 0 deletions
@@ -46,6 +46,7 @@
4646
<toc-element topic="io.md">
4747
<toc-element topic="read.md"/>
4848
<toc-element topic="readSqlDatabases.md"/>
49+
<toc-element topic="readSqlFromCustomDatabase.md"/>
4950
<toc-element topic="write.md"/>
5051
</toc-element>
5152
<toc-element topic="info.md">

docs/StardustDocs/topics/readSqlDatabases.md

Lines changed: 30 additions & 54 deletions
@@ -32,10 +32,11 @@ Also, there are a few **extension functions** available on `Connection`,
3232
**NOTE:** This is an experimental module, and for now,
3333
we only support the following databases: MS SQL, MariaDB, MySQL, PostgreSQL, and SQLite.
3434

35+
Moreover, since release 0.15, you can register a custom SQL database; read more in our [guide](readSqlFromCustomDatabase.md).
36+
3537
Additionally, support for JSON and date-time types is limited.
3638
Please take this into consideration when using these functions.
3739

38-
3940
## Getting started with reading from SQL database in Gradle Project
4041

4142
First, you need to add a dependency
@@ -70,15 +71,15 @@ implementation("com.mysql:mysql-connector-j:$version")
7071

7172
The Maven Central version can be found [here](https://mvnrepository.com/artifact/com.mysql/mysql-connector-j).
7273

73-
For SQLite:
74+
For **SQLite**:
7475

7576
```kotlin
7677
implementation("org.xerial:sqlite-jdbc:$version")
7778
```
7879

7980
The Maven Central version can be found [here](https://mvnrepository.com/artifact/org.xerial/sqlite-jdbc).
8081

81-
For MS SQL:
82+
For **MS SQL**:
8283

8384
```kotlin
8485
implementation("com.microsoft.sqlserver:mssql-jdbc:$version")
@@ -158,14 +159,17 @@ otherwise, it will be considered non-nullable for the newly created `DataFrame`
158159
These functions read all data from a specific table in the database.
159160
Variants with a limit parameter restrict how many rows will be read from the table.
160161

161-
**readSqlTable(dbConfig: DbConnectionConfig, tableName: String, limit: Int, inferNullability: Boolean): AnyFrame**
162+
**readSqlTable(dbConfig: DbConnectionConfig, tableName: String, limit: Int, inferNullability: Boolean, dbType: DbType?): AnyFrame**
162163

163164
Read all data from a specific table in the SQL database and transform it into an `AnyFrame` object.
164165

165166
The `dbConfig: DbConnectionConfig` parameter represents the configuration for a database connection,
166167
created under the hood and managed by the library.
167168
Typically, it requires a URL, username, and password.
168169

170+
The `dbType` parameter specifies the type of the database. It is optional, defaults to `null`,
171+
and may be a custom object provided by the user. To learn more, read the [guide](readSqlFromCustomDatabase.md).
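
As a minimal sketch of passing `dbType` explicitly, the call below reads the `Users` table and names the built-in `PostgreSql` type instead of leaving the parameter at `null`. The URL, credentials, and table name are placeholders, and the named argument assumes the parameter is called `dbType`, as in the signature above.

```kotlin
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.io.DbConnectionConfig
import org.jetbrains.kotlinx.dataframe.io.db.PostgreSql
import org.jetbrains.kotlinx.dataframe.io.readSqlTable

fun main() {
    // Placeholder connection settings; replace with real values.
    val dbConfig = DbConnectionConfig("URL_TO_CONNECT_DATABASE", "USERNAME", "PASSWORD")

    // Read at most 100 rows and state the database type explicitly.
    val users = DataFrame.readSqlTable(dbConfig, "Users", limit = 100, dbType = PostgreSql)

    println(users.rowsCount())
}
```

A custom database would be passed the same way, with the user-defined object in place of `PostgreSql`.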
172+
169173
```kotlin
170174
import org.jetbrains.kotlinx.dataframe.io.DbConnectionConfig
171175

@@ -180,7 +184,7 @@ The `limit: Int` parameter allows setting the maximum number of records to be re
180184
val users = DataFrame.readSqlTable(dbConfig, "Users", limit = 100)
181185
```
182186

183-
**readSqlTable(connection: Connection, tableName: String, limit: Int, inferNullability: Boolean): AnyFrame**
187+
**readSqlTable(connection: Connection, tableName: String, limit: Int, inferNullability: Boolean, dbType: DbType?): AnyFrame**
184188

185189
Another variant, where instead of `dbConfig: DbConnectionConfig` we use a JDBC connection: `Connection` object.
186190

@@ -210,7 +214,7 @@ val users = connection.readDataFrame("Users", 100)
210214
connection.close()
211215
```
212216

213-
**Connection.readDataFrame(sqlQueryOrTableName: String, limit: Int, inferNullability: Boolean): AnyFrame**
217+
**Connection.readDataFrame(sqlQueryOrTableName: String, limit: Int, inferNullability: Boolean, dbType: DbType?): AnyFrame**
214218

215219
Read all data from a specific table in the SQL database and transform it into an `AnyFrame` object.
216220

@@ -222,7 +226,7 @@ It should not contain `;` symbol.
222226

223227
All other parameters are described above.
224228

225-
**DbConnectionConfig.readDataFrame(sqlQueryOrTableName: String, limit: Int, inferNullability: Boolean): AnyFrame**
229+
**DbConnectionConfig.readDataFrame(sqlQueryOrTableName: String, limit: Int, inferNullability: Boolean, dbType: DbType?): AnyFrame**
226230

227231
If you do not have a connection object or need to run a quick,
228232
isolated experiment reading data from an SQL database,
@@ -233,7 +237,7 @@ you can delegate the creation of the connection to `DbConnectionConfig`.
233237
These functions execute an SQL query on the database and convert the result into a `DataFrame` object.
234238
If a limit is provided, only that many rows will be returned from the result.
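
A short sketch of the limit behaviour, assuming a `Users` table with an `age` column and placeholder connection settings: the query may match many rows, but the resulting `DataFrame` holds at most ten of them.

```kotlin
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.io.DbConnectionConfig
import org.jetbrains.kotlinx.dataframe.io.readSqlQuery

fun main() {
    // Placeholder connection settings; replace with real values.
    val dbConfig = DbConnectionConfig("URL_TO_CONNECT_DATABASE", "USERNAME", "PASSWORD")

    // Only the first 10 rows of the query result are returned.
    val df = DataFrame.readSqlQuery(
        dbConfig,
        "SELECT * FROM Users WHERE age > 35",
        limit = 10,
    )

    println(df.rowsCount())
}
```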
235239

236-
**readSqlQuery(dbConfig: DbConnectionConfig, sqlQuery: String, limit: Int, inferNullability: Boolean): AnyFrame**
240+
**readSqlQuery(dbConfig: DbConnectionConfig, sqlQuery: String, limit: Int, inferNullability: Boolean, dbType: DbType?): AnyFrame**
237241

238242
Execute a specific SQL query on the SQL database and retrieve the resulting data as an AnyFrame.
239243

@@ -249,7 +253,7 @@ val dbConfig = DbConnectionConfig("URL_TO_CONNECT_DATABASE", "USERNAME", "PASSWO
249253
val df = DataFrame.readSqlQuery(dbConfig, "SELECT * FROM Users WHERE age > 35")
250254
```
251255

252-
**readSqlQuery(connection: Connection, sqlQuery: String, limit: Int, inferNullability: Boolean): AnyFrame**
256+
**readSqlQuery(connection: Connection, sqlQuery: String, limit: Int, inferNullability: Boolean, dbType: DbType?): AnyFrame**
253257

254258
Another variant, where instead of `dbConfig: DbConnectionConfig` we use a JDBC connection: `Connection` object.
255259

@@ -301,16 +305,18 @@ The `dbType: DbType` parameter specifies the type of our database (e.g., Postgre
301305
supported by a library.
302306
Currently, the following classes are available: `H2, MsSql, MariaDb, MySql, PostgreSql, Sqlite`.
303307

308+
Users can also pass objects that describe their custom databases; see the [guide](readSqlFromCustomDatabase.md) for more information.
309+
304310
```kotlin
305311
import org.jetbrains.kotlinx.dataframe.io.db.PostgreSql
306312
import java.sql.ResultSet
307313

308314
val df = DataFrame.readResultSet(resultSet, PostgreSql)
309315
```
310316

311-
**readResultSet(resultSet: ResultSet, connection: Connection, limit: Int, inferNullability: Boolean): AnyFrame**
317+
**readResultSet(resultSet: ResultSet, connection: Connection, limit: Int, inferNullability: Boolean, dbType: DbType?): AnyFrame**
312318

313-
Another variant, where instead of `dbType: DbType` we use a JDBC connection: `Connection` object.
319+
Another variant, which uses a JDBC connection: `Connection` object.
314320

315321
```kotlin
316322
import java.sql.Connection
@@ -340,7 +346,7 @@ val df = rs.readDataFrame(connection, 10)
340346
connection.close()
341347
```
342348

343-
**ResultSet.readDataFrame(connection: Connection, limit: Int, inferNullability: Boolean): AnyFrame**
349+
**ResultSet.readDataFrame(connection: Connection, limit: Int, inferNullability: Boolean, dbType: DbType?): AnyFrame**
344350

345351
Reads the data from a `ResultSet` and converts it into a `DataFrame`.
346352

@@ -352,7 +358,7 @@ that the `ResultSet` belongs to.
352358
These functions read all data from all tables in the connected database.
353359
Variants with a limit parameter restrict how many rows will be read from each table.
354360

355-
**readAllSqlTables(dbConfig: DbConnectionConfig, limit: Int, inferNullability: Boolean): Map\<String, AnyFrame>**
361+
**readAllSqlTables(dbConfig: DbConnectionConfig, limit: Int, inferNullability: Boolean, dbType: DbType?): Map\<String, AnyFrame>**
356362

357363
Retrieves data from all the non-system tables in the SQL database and returns them as a map of table names to `AnyFrame` objects.
358364

@@ -368,7 +374,7 @@ val dbConfig = DbConnectionConfig("URL_TO_CONNECT_DATABASE", "USERNAME", "PASSWO
368374
val dataframes = DataFrame.readAllSqlTables(dbConfig)
369375
```
370376

371-
**readAllSqlTables(connection: Connection, limit: Int, inferNullability: Boolean): Map\<String, AnyFrame>**
377+
**readAllSqlTables(connection: Connection, limit: Int, inferNullability: Boolean, dbType: DbType?): Map\<String, AnyFrame>**
372378

373379
Another variant, where instead of `dbConfig: DbConnectionConfig` we use a JDBC connection: `Connection` object.
374380

@@ -389,7 +395,7 @@ The purpose of these functions is to facilitate the retrieval of table schema.
389395
By providing a table name and either a database configuration or connection,
390396
these functions return the [DataFrameSchema](schema.md) of the specified table.
391397

392-
**getSchemaForSqlTable(dbConfig: DbConnectionConfig, tableName: String): DataFrameSchema**
398+
**getSchemaForSqlTable(dbConfig: DbConnectionConfig, tableName: String, dbType: DbType?): DataFrameSchema**
393399

394400
This function captures the schema of a specific table from an SQL database.
395401

@@ -405,7 +411,7 @@ val dbConfig = DbConnectionConfig("URL_TO_CONNECT_DATABASE", "USERNAME", "PASSWO
405411
val schema = DataFrame.getSchemaForSqlTable(dbConfig, "Users")
406412
```
407413

408-
**getSchemaForSqlTable(connection: Connection, tableName: String): DataFrameSchema**
414+
**getSchemaForSqlTable(connection: Connection, tableName: String, dbType: DbType?): DataFrameSchema**
409415

410416
Another variant, where instead of `dbConfig: DbConnectionConfig` we use a JDBC connection: `Connection` object.
411417

@@ -427,7 +433,7 @@ These functions return the schema of an SQL query result.
427433
Once you provide a database configuration or connection and an SQL query,
428434
they return the [DataFrameSchema](schema.md) of the query result.
429435

430-
**getSchemaForSqlQuery(dbConfig: DbConnectionConfig, sqlQuery: String): DataFrameSchema**
436+
**getSchemaForSqlQuery(dbConfig: DbConnectionConfig, sqlQuery: String, dbType: DbType?): DataFrameSchema**
431437

432438
This function executes an SQL query on the database and then retrieves the resulting schema.
433439

@@ -443,7 +449,7 @@ val dbConfig = DbConnectionConfig("URL_TO_CONNECT_DATABASE", "USERNAME", "PASSWO
443449
val schema = DataFrame.getSchemaForSqlQuery(dbConfig, "SELECT * FROM Users WHERE age > 35")
444450
```
445451

446-
**getSchemaForSqlQuery(connection: Connection, sqlQuery: String): DataFrameSchema**
452+
**getSchemaForSqlQuery(connection: Connection, sqlQuery: String, dbType: DbType?): DataFrameSchema**
447453

448454
Another variant, where instead of `dbConfig: DbConnectionConfig` we use a JDBC connection: `Connection` object.
449455

@@ -472,11 +478,11 @@ val schema = connection.getDataFrameSchema("SELECT * FROM Users WHERE age > 35")
472478

473479
connection.close()
474480
```
475-
**Connection.getDataFrameSchema(sqlQueryOrTableName: String): DataFrameSchema**
481+
**Connection.getDataFrameSchema(sqlQueryOrTableName: String, dbType: DbType?): DataFrameSchema**
476482

477483
Retrieves the schema of an SQL query result or an SQL table using the provided JDBC connection.
478484

479-
**DbConnectionConfig.getDataFrameSchema(sqlQueryOrTableName: String): DataFrameSchema**
485+
**DbConnectionConfig.getDataFrameSchema(sqlQueryOrTableName: String, dbType: DbType?): DataFrameSchema**
480486

481487
Retrieves the schema of an SQL query result or an SQL table using the provided database configuration.
482488

@@ -507,49 +513,19 @@ The `dbType: DbType` parameter specifies the type of our database (e.g., Postgre
507513
supported by a library.
508514
Currently, the following classes are available: `H2, MariaDb, MySql, PostgreSql, Sqlite`.
509515

516+
Users can also pass objects that describe their custom databases; see the [guide](readSqlFromCustomDatabase.md) for more information.
517+
510518
```kotlin
511519
import org.jetbrains.kotlinx.dataframe.io.db.PostgreSql
512520
import java.sql.ResultSet
513521

514522
val schema = DataFrame.getSchemaForResultSet(resultSet, PostgreSql)
515523
```
516524

517-
**getSchemaForResultSet(connection: Connection, sqlQuery: String): DataFrameSchema**
518-
519-
Another variant, where instead of `dbType: DbType` we use a JDBC connection: `Connection` object.
520-
521-
```kotlin
522-
import java.sql.Connection
523-
import java.sql.DriverManager
524-
525-
val connection = DriverManager.getConnection("URL_TO_CONNECT_DATABASE")
526-
527-
val schema = DataFrame.getSchemaForResultSet(resultSet, connection)
528-
529-
connection.close()
530-
```
531-
532525
### Extension functions for schema reading from the ResultSet
533526

534527
The same example, rewritten with the extension function:
535528

536-
```kotlin
537-
import java.sql.Connection
538-
import java.sql.DriverManager
539-
540-
val connection = DriverManager.getConnection("URL_TO_CONNECT_DATABASE")
541-
542-
val schema = resultSet.getDataFrameSchema(connection)
543-
544-
connection.close()
545-
```
546-
547-
if you are using this extension function
548-
549-
**ResultSet.getDataFrameSchema(connection: Connection): DataFrameSchema**
550-
551-
or
552-
553529
```kotlin
554530
import org.jetbrains.kotlinx.dataframe.io.db.PostgreSql
555531
import java.sql.ResultSet
@@ -566,7 +542,7 @@ based on
566542
These functions return the [`DataFrameSchema`](schema.md) of every non-system table in the SQL database.
567543
They can be called with either a database configuration or a connection.
568544

569-
**getSchemaForAllSqlTables(dbConfig: DbConnectionConfig): Map\<String, DataFrameSchema>**
545+
**getSchemaForAllSqlTables(dbConfig: DbConnectionConfig, dbType: DbType?): Map\<String, DataFrameSchema>**
570546

571547
This function retrieves the schema of all tables from an SQL database
572548
and returns them as a map of table names to [`DataFrameSchema`](schema.md) objects.
@@ -583,7 +559,7 @@ val dbConfig = DbConnectionConfig("URL_TO_CONNECT_DATABASE", "USERNAME", "PASSWO
583559
val schemas = DataFrame.getSchemaForAllSqlTables(dbConfig)
584560
```
585561

586-
**getSchemaForAllSqlTables(connection: Connection): Map\<String, DataFrameSchema>**
562+
**getSchemaForAllSqlTables(connection: Connection, dbType: DbType?): Map\<String, DataFrameSchema>**
587563

588564
This function retrieves the schema of all tables using a JDBC connection: `Connection` object
589565
and returns them as a map of table names to [`DataFrameSchema`](schema.md) objects.
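
A minimal sketch of the `Connection` variant, assuming the return type is the `Map<String, DataFrameSchema>` shown in the signature above. The connection URL is a placeholder, and `dbType` is left at its default.

```kotlin
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.io.getSchemaForAllSqlTables
import java.sql.DriverManager

fun main() {
    // `use` closes the connection even if schema reading throws.
    DriverManager.getConnection("URL_TO_CONNECT_DATABASE").use { connection ->
        val schemas = DataFrame.getSchemaForAllSqlTables(connection)

        // Print each table name together with its column names.
        for ((tableName, schema) in schemas) {
            println("$tableName: ${schema.columns.keys}")
        }
    }
}
```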
