Commit adb6e05

Merge pull request #1352 from Kotlin/data_sources_docs
Data sources docs
2 parents eebf250 + 8527515 commit adb6e05

18 files changed: 802 additions & 8 deletions

docs/StardustDocs/d.tree

Lines changed: 17 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -187,8 +187,23 @@
187187
<toc-element topic="jupyterRendering.md"/>
188188
</toc-element>
189189
</toc-element>
190-
<toc-element topic="Data-Sources.md" hidden="true">
191-
<toc-element topic="Integrations.md" hidden="true"/>
190+
<toc-element topic="Data-Sources.md">
191+
<toc-element topic="JSON.md">
192+
<toc-element topic="OpenAPI.md"/>
193+
</toc-element>
194+
<toc-element topic="CSV-TSV.md"/>
195+
<toc-element topic="Excel.md"/>
196+
<toc-element topic="ApacheArrow.md"/>
197+
<toc-element topic="SQL.md">
198+
<toc-element topic="PostgreSQL.md"/>
199+
<toc-element topic="MySQL.md"/>
200+
<toc-element topic="Microsoft-SQL-Server.md"/>
201+
<toc-element topic="SQLite.md"/>
202+
<toc-element topic="H2.md"/>
203+
<toc-element topic="MariaDB.md"/>
204+
<toc-element topic="Custom-SQL-Source.md"/>
205+
</toc-element>
206+
<toc-element topic="Integrations.md"/>
192207
</toc-element>
193208
<toc-element topic="_shadow_resources.md" hidden="true"/>
194209
<toc-element topic="Support.md"/>

docs/StardustDocs/topics/Home.topic

Lines changed: 1 addition & 0 deletions
@@ -29,6 +29,7 @@
2929
<title>Featured topics</title>
3030
<a href="Kotlin-DataFrame-Features-in-Kotlin-Notebook.md"/>
3131
<a href="Compiler-Plugin.md"/>
32+
<a href="Data-Sources.md"/>
3233
<a href="readSqlDatabases.md"/>
3334
</secondary>
3435

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
1+
# Apache Arrow
2+
3+
<web-summary>
4+
Read and write Apache Arrow files in Kotlin — efficient binary format support with Kotlin DataFrame.
5+
</web-summary>
6+
7+
<card-summary>
8+
Work with Arrow files in Kotlin for fast I/O — supports both streaming and random access formats.
9+
</card-summary>
10+
11+
<link-summary>
12+
Kotlin DataFrame provides full support for reading and writing Apache Arrow files in high-performance workflows.
13+
</link-summary>
14+
15+
16+
Kotlin DataFrame supports reading from and writing to Apache Arrow files.
17+
18+
Requires the [`dataframe-arrow` module](Modules.md#dataframe-arrow), which is included by
19+
default in the general [`dataframe`](Modules.md#dataframe-general) artifact
20+
and in [`%use dataframe`](gettingStartedKotlinNotebook.md#integrate-kotlin-dataframe) for Kotlin Notebook.
21+
22+
> Make sure to follow the
23+
> [Apache Arrow Java compatibility guide](https://arrow.apache.org/docs/java/install.html#java-compatibility)
24+
> when using Java 9+.
25+
> {style="warning"}
26+
27+
## Read
28+
29+
[`DataFrame`](DataFrame.md) supports both the
30+
[Arrow interprocess streaming format](https://arrow.apache.org/docs/java/ipc.html#writing-and-reading-streaming-format)
31+
and the [Arrow random access format](https://arrow.apache.org/docs/java/ipc.html#writing-and-reading-random-access-files).
32+
33+
You can read a `DataFrame` from Apache Arrow data sources
34+
(via a file path, URL, or stream) using the [`readArrowFeather()`](read.md#read-apache-arrow-formats) method:
35+
36+
```kotlin
37+
val df = DataFrame.readArrowFeather("example.feather")
38+
```
39+
40+
```kotlin
41+
val df = DataFrame.readArrowFeather("https://kotlin.github.io/dataframe/resources/example.feather")
42+
```
43+
44+
## Write
45+
46+
A [`DataFrame`](DataFrame.md) can be written to Arrow format using the interprocess streaming or random access format.
47+
Output targets include `WritableByteChannel`, `OutputStream`, `File`, or `ByteArray`.
48+
49+
See [](write.md#writing-to-apache-arrow-formats) for more details.
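As a minimal sketch of the write options described above (assuming the `dataframe-arrow` module is on the classpath; the file names and sample data are illustrative only), a `DataFrame` can be written in either Arrow format and read back from an in-memory `ByteArray`:

```kotlin
import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.dataFrameOf
import org.jetbrains.kotlinx.dataframe.io.readArrowFeather
import org.jetbrains.kotlinx.dataframe.io.saveArrowFeatherToByteArray
import org.jetbrains.kotlinx.dataframe.io.writeArrowFeather
import org.jetbrains.kotlinx.dataframe.io.writeArrowIPC
import java.io.File

// Serializes the frame to the random access (Feather) format in memory
// and parses the bytes back into a new DataFrame.
fun roundTripThroughArrow(df: AnyFrame): AnyFrame {
    val bytes = df.saveArrowFeatherToByteArray()
    return DataFrame.readArrowFeather(bytes)
}

fun main() {
    // Hypothetical sample data, used only for illustration.
    val df = dataFrameOf("name", "age")("Alice", 15, "Bob", 20)

    // Random access format written to a file.
    df.writeArrowFeather(File("example.feather"))

    // Interprocess streaming format written to a file.
    df.writeArrowIPC(File("example.arrow"))

    val restored = roundTripThroughArrow(df)
    println(restored.rowsCount())
}
```

The `ByteArray` variant is convenient for tests or for passing data between components without touching the file system.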
Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
1+
# CSV / TSV
2+
3+
<web-summary>
4+
Work with CSV and TSV files — read, analyze, and export tabular data using Kotlin DataFrame.
5+
</web-summary>
6+
7+
<card-summary>
8+
Seamlessly load and write CSV or TSV files in Kotlin — perfect for common tabular data workflows.
9+
</card-summary>
10+
11+
<link-summary>
12+
Kotlin DataFrame support for reading and writing CSV and TSV files with simple, type-safe APIs.
13+
</link-summary>
14+
15+
16+
Kotlin DataFrame supports reading from and writing to CSV and TSV files.
17+
18+
Requires the [`dataframe-csv` module](Modules.md#dataframe-csv),
19+
which is included by default in the general [`dataframe`](Modules.md#dataframe-general)
20+
artifact and in [`%use dataframe`](gettingStartedKotlinNotebook.md#integrate-kotlin-dataframe) for Kotlin Notebook.
21+
22+
## Read
23+
24+
You can read a [`DataFrame`](DataFrame.md) from a CSV or TSV file (via a file path or URL)
25+
using the [`readCsv()`](read.md#read-from-csv) or `readTsv()` methods:
26+
27+
```kotlin
28+
val df = DataFrame.readCsv("example.csv")
29+
```
30+
31+
```kotlin
32+
val df = DataFrame.readCsv("https://kotlin.github.io/dataframe/resources/example.csv")
33+
```
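The read methods above also have string-based counterparts, which are handy for quick experiments. The following is a sketch only, assuming the `dataframe-csv` module is available; the sample text is made up, and `delimiter` is used here to parse semicolon-separated input:

```kotlin
import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.io.readCsvStr
import org.jetbrains.kotlinx.dataframe.io.readTsvStr
import org.jetbrains.kotlinx.dataframe.io.toCsvStr

// Parses semicolon-separated text by overriding the default ',' delimiter.
fun parseSemicolonCsv(text: String): AnyFrame =
    DataFrame.readCsvStr(text, delimiter = ';')

// Parses tab-separated text; the file-based equivalent is DataFrame.readTsv("example.tsv").
fun parseTsv(text: String): AnyFrame =
    DataFrame.readTsvStr(text)

fun main() {
    // Hypothetical sample data, used only for illustration.
    val fromCsv = parseSemicolonCsv("name;age\nAlice;15\nBob;20")
    val fromTsv = parseTsv("name\tage\nAlice\t15\nBob\t20")

    println(fromCsv.rowsCount())
    // Renders the TSV-sourced frame as comma-separated text.
    println(fromTsv.toCsvStr())
}
```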
34+
35+
## Write
36+
37+
You can write a [`DataFrame`](DataFrame.md) to a CSV file using the [`writeCsv()`](write.md#writing-to-csv) method:
38+
39+
```kotlin
40+
df.writeCsv("example.csv")
41+
```
42+
43+
## Deephaven CSV
44+
45+
The [`dataframe-csv`](Modules.md#dataframe-csv) module uses the high-performance
46+
[Deephaven CSV library](https://github.com/deephaven/deephaven-csv) under the hood
47+
for fast and efficient CSV reading and writing.
48+
49+
If you're working with large CSV files, you can adjust the parser manually
50+
by [configuring Deephaven-specific parameters](https://kotlin.github.io/dataframe/read.html#unlocking-deephaven-csv-features)
51+
to get the best performance for your use case.
52+
Lines changed: 31 additions & 1 deletion
@@ -1,3 +1,33 @@
11
# Data Sources
22

3-
> This topic is not ready yet.
3+
<web-summary>
4+
Discover all the data formats Kotlin DataFrame can work with — including JSON, CSV, Excel, SQL databases, and more.
5+
</web-summary>
6+
7+
<card-summary>
8+
Explore supported data sources in Kotlin DataFrame and how to integrate them into your data processing workflow.
9+
</card-summary>
10+
11+
<link-summary>
12+
Explore supported data sources in Kotlin DataFrame and how to integrate them into your data processing workflow.
13+
</link-summary>
14+
15+
One of the key aspects of working with data is being able to read from and write to various data sources.
16+
Kotlin DataFrame provides seamless support for a wide range of formats to integrate into your data workflows.
17+
Below you'll find a list of supported sources along with instructions on how to read and write data using them.
18+
19+
- [JSON](JSON.md)
20+
  - [OpenAPI](OpenAPI.md)
21+
- [CSV / TSV](CSV-TSV.md)
22+
- [Excel](Excel.md)
23+
- [Apache Arrow](ApacheArrow.md)
24+
- [SQL](SQL.md):
25+
  - [PostgreSQL](PostgreSQL.md)
26+
  - [MySQL](MySQL.md)
27+
  - [Microsoft SQL Server](Microsoft-SQL-Server.md)
28+
  - [SQLite](SQLite.md)
29+
  - [H2](H2.md)
30+
  - [MariaDB](MariaDB.md)
31+
  - [Custom SQL Source](Custom-SQL-Source.md)
32+
- [Custom integrations with unsupported data sources](Integrations.md)
33+
Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
1+
# Excel
2+
3+
<web-summary>
4+
Read from and write to Excel files in `.xls` or `.xlsx` formats with Kotlin DataFrame for seamless spreadsheet integration.
5+
</web-summary>
6+
7+
<card-summary>
8+
Kotlin DataFrame makes it easy to load and save data from Excel files — perfect for working with spreadsheet-based workflows.
9+
</card-summary>
10+
11+
<link-summary>
12+
Learn how to read and write Excel files using Kotlin DataFrame with just a single line of code.
13+
</link-summary>
14+
15+
16+
Kotlin DataFrame supports reading from and writing to Excel files in both `.xls` and `.xlsx` formats.
17+
18+
Requires the [`dataframe-excel` module](Modules.md#dataframe-excel),
19+
which is included by default in the general [`dataframe`](Modules.md#dataframe-general)
20+
artifact and in [`%use dataframe`](gettingStartedKotlinNotebook.md#integrate-kotlin-dataframe) for Kotlin Notebook.
21+
22+
## Read
23+
24+
You can read a [`DataFrame`](DataFrame.md) from an Excel file (via a file path or URL)
25+
using the [`readExcel()`](read.md#read-from-excel) method:
26+
27+
```kotlin
28+
val df = DataFrame.readExcel("example.xlsx")
29+
```
30+
31+
```kotlin
32+
val df = DataFrame.readExcel("https://kotlin.github.io/dataframe/resources/example.xlsx")
33+
```
34+
35+
## Write
36+
37+
You can write a [`DataFrame`](DataFrame.md) to an Excel file using the
38+
[`writeExcel()`](write.md#writing-to-excel-spreadsheets) method:
39+
40+
```kotlin
41+
df.writeExcel("example.xlsx")
42+
```
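Reading and writing can be combined into a round trip. This is a rough sketch, assuming the `dataframe-excel` module is available; the file name, the sheet name `People`, and the sample data are placeholders chosen for illustration:

```kotlin
import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.api.dataFrameOf
import org.jetbrains.kotlinx.dataframe.io.readExcel
import org.jetbrains.kotlinx.dataframe.io.writeExcel

// Writes the frame to a named sheet and reads that same sheet back.
fun roundTripThroughExcel(df: AnyFrame, path: String, sheet: String): AnyFrame {
    df.writeExcel(path, sheetName = sheet)
    return DataFrame.readExcel(path, sheetName = sheet)
}

fun main() {
    // Hypothetical sample data, used only for illustration.
    val df = dataFrameOf("name", "age")("Alice", 15, "Bob", 20)
    val restored = roundTripThroughExcel(df, "people.xlsx", "People")
    println(restored)
}
```

Column types may differ after the round trip, because Excel stores numeric cells as floating-point values.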
Lines changed: 26 additions & 2 deletions
@@ -1,3 +1,27 @@
1-
# Integrations
1+
# Custom integrations with unsupported data sources
22

3-
> This topic is not ready yet.
3+
<web-summary>
4+
Examples of how to integrate Kotlin DataFrame with other data frameworks like Exposed, Spark, or Multik.
5+
</web-summary>
6+
7+
<card-summary>
8+
Integrate Kotlin DataFrame with unsupported sources — see practical examples with Exposed, Spark, and more.
9+
</card-summary>
10+
11+
<link-summary>
12+
How to connect Kotlin DataFrame with data sources like Exposed, Apache Spark, or Multik.
13+
</link-summary>
14+
15+
Some data sources are not officially supported in the Kotlin DataFrame API yet —
16+
but you can still integrate them easily using custom code.
17+
18+
Below is a list of example integrations with other data frameworks.
19+
These examples demonstrate how to bridge Kotlin DataFrame with external libraries or APIs.
20+
21+
- [Kotlin Exposed](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/src/main/kotlin/org/jetbrains/kotlinx/dataframe/examples/exposed)
22+
- [Apache Spark](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/src/main/kotlin/org/jetbrains/kotlinx/dataframe/examples/spark)
23+
- [Apache Spark (with Kotlin Spark API)](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/src/main/kotlin/org/jetbrains/kotlinx/dataframe/examples/kotlinSpark)
24+
- [Multik](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/src/main/kotlin/org/jetbrains/kotlinx/dataframe/examples/multik)
25+
26+
You can use these examples as templates to create your own integrations
27+
with any data processing library that produces structured tabular data.
Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
1+
# JSON
2+
3+
<web-summary>
4+
Support for working with JSON data — load, explore, and save structured JSON using Kotlin DataFrame.
5+
</web-summary>
6+
7+
<card-summary>
8+
Easily handle JSON data in Kotlin — read from files or URLs, and export your data back to JSON format.
9+
</card-summary>
10+
11+
<link-summary>
12+
Kotlin DataFrame support for reading and writing JSON files in a structured and type-safe way.
13+
</link-summary>
14+
15+
Kotlin DataFrame supports reading from and writing to JSON files.
16+
17+
Requires the [`dataframe-json` module](Modules.md#dataframe-json),
18+
which is included by default in the general [`dataframe`](Modules.md#dataframe-general)
19+
artifact and in [`%use dataframe`](gettingStartedKotlinNotebook.md#integrate-kotlin-dataframe)
20+
for Kotlin Notebook.
21+
22+
> Kotlin DataFrame is suitable only for working with table-like structured JSON —
23+
> a list of objects where each object represents a row and all objects share the same structure.
24+
>
25+
> Experimental support for [OpenAPI JSON schemas](OpenAPI.md) is also available.
26+
> {style="note"}
27+
28+
## Read
29+
30+
You can read a [`DataFrame`](DataFrame.md) or [`DataRow`](DataRow.md)
31+
from a JSON file (via a file path or URL) using the [`readJson()`](read.md#read-from-json) method:
32+
33+
```kotlin
34+
val df = DataFrame.readJson("example.json")
35+
```
36+
37+
```kotlin
38+
val df = DataFrame.readJson("https://kotlin.github.io/dataframe/resources/example.json")
39+
```
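To illustrate the table-like structure mentioned in the note above, here is a small sketch that parses a JSON array of uniformly shaped objects from a string instead of a file. It assumes the `dataframe-json` module is available; the sample JSON is invented for illustration:

```kotlin
import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.io.readJsonStr
import org.jetbrains.kotlinx.dataframe.io.toJson

// Each object in the array becomes a row; its keys become column names.
fun parsePeople(json: String): AnyFrame = DataFrame.readJsonStr(json)

fun main() {
    // Hypothetical sample data, used only for illustration.
    val json = """[{"name": "Alice", "age": 15}, {"name": "Bob", "age": 20}]"""
    val df = parsePeople(json)

    println(df.columnNames())
    println(df.rowsCount())

    // Serializes the frame back to a JSON string.
    println(df.toJson(prettyPrint = true))
}
```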
40+
41+
## Write
42+
43+
You can write a [`DataFrame`](DataFrame.md) to a JSON file using the [`writeJson()`](write.md#writing-to-json) method:
44+
45+
```kotlin
46+
df.writeJson("example.json")
47+
```
Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
1+
# OpenAPI
2+
3+
<web-summary>
4+
Work with JSON data based on OpenAPI 3.0 schemas using Kotlin DataFrame — helpful for consuming structured API responses.
5+
</web-summary>
6+
7+
<card-summary>
8+
Use Kotlin DataFrame to read and write data that conforms to OpenAPI specifications. Great for API-driven data workflows.
9+
</card-summary>
10+
11+
<link-summary>
12+
Learn how to use OpenAPI 3.0 JSON schemas with Kotlin DataFrame to load and manipulate API-defined data.
13+
</link-summary>
14+
15+
16+
> **Experimental**: Support for OpenAPI 3.0.0 schemas is currently experimental
17+
> and may change or be removed in future releases.
18+
> {style="warning"}
19+
20+
Kotlin DataFrame provides support for reading and writing JSON data
21+
that conforms to [OpenAPI 3.0 specifications](https://www.openapis.org).
22+
This feature is useful when working with APIs that expose structured data defined via OpenAPI schemas.
23+
24+
Requires the [`dataframe-openapi` module](Modules.md#dataframe-openapi),
25+
which **is not included** in the general [`dataframe`](Modules.md#dataframe-general) artifact.
26+
27+
To enable it in Kotlin Notebook, use:
28+
29+
```kotlin
30+
%use dataframe(enableExperimentalOpenApi=true)
31+
```
32+
33+
See [the OpenAPI guide notebook](https://github.com/Kotlin/dataframe/blob/master/examples/notebooks/json/KeyValueAndOpenApi.ipynb)
34+
for details on how to work with OpenAPI-based data.
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
1+
# Custom SQL Source
2+
3+
<web-summary>
4+
Connect Kotlin DataFrame to any JDBC-compatible database using a custom SQL source configuration.
5+
</web-summary>
6+
7+
<card-summary>
8+
Easily integrate unsupported SQL databases in Kotlin DataFrame using a flexible custom source setup.
9+
</card-summary>
10+
11+
<link-summary>
12+
Define a custom SQL source in Kotlin DataFrame to work with any JDBC-based database.
13+
</link-summary>
14+
15+
16+
If your SQL database is not officially supported, you can either
17+
[create an issue](https://github.com/Kotlin/dataframe/issues)
18+
or define a simple, configurable custom SQL source.
19+
20+
See the [How to Extend DataFrame Library for Custom SQL Database Support guide](readSqlFromCustomDatabase.md)
21+
for detailed instructions and an example with HSQLDB.
22+
