
Commit 2284185

data schemas topics fix
1 parent 15ea2b4 commit 2284185

13 files changed, with 121 additions and 57 deletions

docs/StardustDocs/d.tree

Lines changed: 11 additions & 11 deletions
@@ -32,19 +32,19 @@
32 32
<toc-element topic="hierarchical.md"/>
33 33
<toc-element topic="nanAndNa.md"/>
34 34
<toc-element topic="numberUnification.md"/>
35-
<toc-element topic="schemas.md">
36-
<toc-element topic="schemasGradle.md"/>
37-
<toc-element topic="schemasJupyter.md"/>
38-
<toc-element topic="schemasInheritance.md"/>
39-
<toc-element topic="schemasCustom.md"/>
40-
<toc-element topic="schemasExternalJupyter.md"/>
41-
<toc-element topic="schemasImportSqlGradle.md"/>
42-
<toc-element topic="schemasImportOpenApiGradle.md"/>
43-
<toc-element topic="schemasImportOpenApiJupyter.md"/>
44-
<toc-element topic="DataSchemaGenerationGradle.md"/>
45-
</toc-element>
46 35
</toc-element>
47 36
<toc-element topic="extensionPropertiesApi.md"/>
37+
<toc-element topic="schemas.md">
38+
<toc-element topic="schemasGradle.md"/>
39+
<toc-element topic="schemasJupyter.md"/>
40+
<toc-element topic="schemasInheritance.md"/>
41+
<toc-element topic="schemasCustom.md"/>
42+
<toc-element topic="schemasExternalJupyter.md"/>
43+
<toc-element topic="schemasImportSqlGradle.md"/>
44+
<toc-element topic="schemasImportOpenApiGradle.md"/>
45+
<toc-element topic="schemasImportOpenApiJupyter.md"/>
46+
<toc-element topic="DataSchemaGenerationGradle.md"/>
47+
</toc-element>
48 48
<toc-element topic="DataSchemaGenerationMethods.md"/>
49 49
<toc-element topic="Compiler-Plugin.md">
50 50
<toc-element topic="staticInterpretation.md"/>

docs/StardustDocs/topics/concepts/schemas.md

Lines changed: 0 additions & 46 deletions
This file was deleted.
File renamed without changes.
File renamed without changes.
Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
1+
[//]: # (title: Data Schemas)
2+
3+
The Kotlin DataFrame library provides typed data access via
4+
[generation of extension properties](extensionPropertiesApi.md) for type
5+
[`DataFrame<T>`](DataFrame.md) (as well as [`DataRow<T>`](DataRow.md)), where
6+
`T` is a marker class that represents the `DataSchema` of the [`DataFrame`](DataFrame.md).
7+
8+
The schema of a [`DataFrame`](DataFrame.md) is a mapping from its column names to its column types.
9+
A data schema can be interpreted as a Kotlin interface or class. If the dataframe is hierarchical (that is, it contains a
10+
[column group](DataColumn.md#columngroup) or a [column of dataframes](DataColumn.md#framecolumn)), the data schema
11+
takes this into account: there is a separate class for each column group or inner `DataFrame`.
12+
13+
For example, consider a simple hierarchical dataframe from
14+
<resource src="example.csv"></resource>.
15+
16+
This dataframe consists of two columns: `name`, which is a `String` column, and `info`,
17+
which is a [**column group**](DataColumn.md#columngroup) containing two nested
18+
[value columns](DataColumn.md#valuecolumn):
19+
`age` of type `Int` and `height` of type `Double`.
20+
21+
<table>
22+
<thead>
23+
<tr>
24+
<th>name</th>
25+
<th colspan="2">info</th>
26+
</tr>
27+
<tr>
28+
<th></th>
29+
<th>age</th>
30+
<th>height</th>
31+
</tr>
32+
</thead>
33+
<tbody>
34+
<tr>
35+
<td>Alice</td>
36+
<td>23</td>
37+
<td>175.5</td>
38+
</tr>
39+
<tr>
40+
<td>Bob</td>
41+
<td>27</td>
42+
<td>160.2</td>
43+
</tr>
44+
</tbody>
45+
</table>
46+
47+
The data schema corresponding to this dataframe can be represented like this:
48+
49+
```kotlin
50+
// Data schema of the "info" column group
51+
@DataSchema
52+
data class Info(
53+
val age: Int,
54+
val height: Double
55+
)
56+
57+
// Data schema of the entire dataframe
58+
@DataSchema
59+
data class Person(
60+
val info: Info,
61+
val name: String
62+
)
63+
```
64+
65+
[Extension properties](extensionPropertiesApi.md) for the `DataFrame<Person>`
66+
are generated according to this schema and can be used to access columns and to refer to them in operations:
67+
68+
```kotlin
69+
// Assuming `df` has type DataFrame<Person>
70+
71+
// Get "age" column from "info" group
72+
df.info.age
73+
74+
// Select "name" and "height" columns
75+
df.select { name and info.height }
76+
77+
// Filter rows by age value
78+
df.filter { info.age >= 18 }
79+
```
80+
81+
82+
## Popular use cases with Data Schemas
83+
84+
Here's a list of the most popular use cases with Data Schemas.
85+
86+
* [**Data Schemas in Gradle projects**](schemasGradle.md) <br/>
87+
If you are developing a server application and building it with Gradle.
88+
89+
* [**DataSchema workflow in Jupyter**](schemasJupyter.md) <br/>
90+
If you prefer Notebooks.
91+
92+
* [**Schema inheritance**](schemasInheritance.md) <br/>
93+
It's worth knowing how to reuse Data Schemas generated earlier.
94+
95+
* [**Custom Data Schemas**](schemasCustom.md) <br/>
96+
Sometimes it is necessary to create your own schema.
97+
98+
* [**Use external Data Schemas in Jupyter**](schemasExternalJupyter.md) <br/>
99+
Sometimes it is convenient to extract reusable code from a Jupyter Notebook into a Kotlin JVM library.
100+
Schema interfaces should also be extracted if this code uses Custom Data Schemas.
101+
102+
* [**Schema Definitions from SQL Databases in Gradle Project**](schemasImportSqlGradle.md) <br/>
103+
When you need to take data from an SQL database.
104+
105+
* [**Import OpenAPI 3.0.0 Schemas (Experimental) in Gradle Project**](schemasImportOpenApiGradle.md) <br/>
106+
When you need to take data from an endpoint described by an OpenAPI schema.
107+
108+
* [**Import Data Schemas, e.g. from OpenAPI 3.0.0 (Experimental), in Jupyter**](schemasImportOpenApiJupyter.md) <br/>
109+
Similar to [importing OpenAPI Data Schemas in Gradle projects](schemasImportOpenApiGradle.md),
110+
you can also do this in Jupyter Notebooks.
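
The schema idea described in this topic (a mapping from column names to column types, with a separate class for each column group) can be mirrored in plain Kotlin. The sketch below is an illustration only: it deliberately does not use the DataFrame library or the `@DataSchema` annotation, and the helper names `agesOf` and `adultsOf` are hypothetical. It only shows why nested, typed access such as `info.age` is checked at compile time.

```kotlin
// Plain-Kotlin illustration only: no DataFrame dependency, no @DataSchema.
// One class per column group, matching the example table in this topic.
data class Info(val age: Int, val height: Double)

// The top-level row type: a `name` column plus the nested `info` group.
data class Person(val name: String, val info: Info)

// Hypothetical helper: typed access to the nested `age` values.
fun agesOf(rows: List<Person>): List<Int> = rows.map { it.info.age }

// Hypothetical helper: keep rows whose nested `age` is at least 18.
fun adultsOf(rows: List<Person>): List<Person> = rows.filter { it.info.age >= 18 }

fun main() {
    // The two rows from the example table.
    val rows = listOf(
        Person("Alice", Info(23, 175.5)),
        Person("Bob", Info(27, 160.2)),
    )

    println(agesOf(rows))                  // prints [23, 27]
    println(adultsOf(rows).map { it.name }) // prints [Alice, Bob]
}
```

Referring to `age` directly on `Person` would not compile here, because the property lives on `Info`; a path through `info` is required.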
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
