Commit 7c3f204

Merge branch 'master' into jupyter-any-detection
2 parents 2941e38 + f1b245e commit 7c3f204

57 files changed: +779 -297 lines changed

README.md

Lines changed: 97 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -5,12 +5,12 @@
55
[![Maven Central](https://img.shields.io/maven-central/v/org.jetbrains.kotlinx/dataframe?color=blue&label=Maven%20Central)](https://search.maven.org/artifact/org.jetbrains.kotlinx/dataframe)
66
[![GitHub License](https://img.shields.io/badge/license-Apache%20License%202.0-blue.svg?style=flat)](http://www.apache.org/licenses/LICENSE-2.0)
77

8-
Kotlin Dataframe aims to reconcile Kotlin static typing with dynamic nature of data by utilizing both the full power of Kotlin language and opportunities provided by intermittent code execution in Jupyter notebooks and REPL.
8+
Kotlin Dataframe aims to reconcile Kotlin's static typing with the dynamic nature of data by utilizing both the full power of the Kotlin language and the opportunities provided by intermittent code execution in Jupyter notebooks and REPL.
99

1010
* **Hierarchical** — represents hierarchical data structures, such as JSON or a tree of JVM objects.
1111
* **Functional** — data processing pipeline is organized in a chain of `DataFrame` transformation operations. Every operation returns a new instance of `DataFrame` reusing underlying storage wherever it's possible.
1212
* **Readable** — data transformation operations are defined in DSL close to natural language.
13-
* **Practical** — provides simple solutions for common problems and ability to perform complex tasks.
13+
* **Practical** — provides simple solutions for common problems and the ability to perform complex tasks.
1414
* **Minimalistic** — simple, yet powerful data model of three column kinds.
1515
* **Interoperable** — convertible with Kotlin data classes and collections.
1616
* **Generic** — can store objects of any type, not only numbers or strings.
@@ -23,23 +23,105 @@ Explore [**documentation**](https://kotlin.github.io/dataframe/overview.html) fo
2323

2424
## Setup
2525

26-
### Gradle
26+
### Gradle for JVM
27+
```groovy
28+
// build.gradle
29+
30+
plugins {
31+
// Optional Gradle plugin for enhanced type safety and schema generation
32+
// https://kotlin.github.io/dataframe/gradle.html
33+
id 'org.jetbrains.kotlinx.dataframe' version '0.10.1'
34+
}
35+
36+
repositories {
37+
mavenCentral()
38+
}
39+
40+
dependencies {
41+
implementation 'org.jetbrains.kotlinx:dataframe:0.10.1'
42+
}
43+
```
44+
2745
```kotlin
46+
// build.gradle.kts
47+
2848
plugins {
2949
// Optional Gradle plugin for enhanced type safety and schema generation
3050
// https://kotlin.github.io/dataframe/gradle.html
31-
id("org.jetbrains.kotlinx.dataframe") version "0.10.0"
51+
id("org.jetbrains.kotlinx.dataframe") version "0.10.1"
3252
}
3353

3454
repositories {
3555
mavenCentral()
3656
}
3757

3858
dependencies {
39-
implementation("org.jetbrains.kotlinx:dataframe:0.10.0")
59+
implementation("org.jetbrains.kotlinx:dataframe:0.10.1")
60+
}
61+
```
62+
63+
### Gradle for Android
64+
```groovy
65+
// build.gradle
66+
67+
plugins {
68+
// Optional Gradle plugin for enhanced type safety and schema generation
69+
// https://kotlin.github.io/dataframe/gradle.html
70+
id 'org.jetbrains.kotlinx.dataframe' version '0.10.1'
71+
}
72+
73+
dependencies {
74+
implementation 'org.jetbrains.kotlinx:dataframe:0.10.1'
75+
}
76+
77+
android {
78+
defaultConfig {
79+
minSdk 26 // Android O+
80+
}
81+
compileOptions {
82+
sourceCompatibility JavaVersion.VERSION_1_8
83+
targetCompatibility JavaVersion.VERSION_1_8
84+
}
85+
kotlinOptions {
86+
jvmTarget = '1.8'
87+
}
88+
packagingOptions {
89+
resources {
90+
pickFirsts = ["META-INF/AL2.0",
91+
"META-INF/LGPL2.1",
92+
"META-INF/ASL-2.0.txt",
93+
"META-INF/LICENSE.md",
94+
"META-INF/NOTICE.md",
95+
"META-INF/LGPL-3.0.txt"]
96+
excludes = ["META-INF/kotlin-jupyter-libraries/libraries.json",
97+
"META-INF/{INDEX.LIST,DEPENDENCIES}",
98+
"{draftv3,draftv4}/schema",
99+
"arrow-git.properties"]
100+
}
101+
}
102+
}
103+
104+
// optional, could be required for KSP
105+
tasks.withType(KotlinCompile).configureEach {
106+
kotlinOptions {
107+
jvmTarget = '1.8'
108+
}
109+
}
110+
```
111+
112+
```kotlin
113+
// build.gradle.kts
114+
115+
plugins {
116+
// Optional Gradle plugin for enhanced type safety and schema generation
117+
// https://kotlin.github.io/dataframe/gradle.html
118+
id("org.jetbrains.kotlinx.dataframe") version "0.10.1"
119+
}
120+
121+
dependencies {
122+
implementation("org.jetbrains.kotlinx:dataframe:0.10.1")
40123
}
41124

42-
// Below only applies to Android projects
43125
android {
44126
defaultConfig {
45127
minSdk = 26 // Android O+
@@ -70,10 +152,13 @@ android {
70152
}
71153
}
72154
}
73-
tasks.withType<org.jetbrains.kotlin.gradle.tasks.KotlinCompile> {
155+
156+
// required for KSP
157+
tasks.withType<org.jetbrains.kotlin.gradle.tasks.KotlinCompile> {
74158
kotlinOptions.jvmTarget = "1.8"
75159
}
76160
```
161+
77162
### Jupyter Notebook
78163

79164
Install [Kotlin kernel](https://github.com/Kotlin/kotlin-jupyter) for [Jupyter](https://jupyter.org/)
@@ -97,7 +182,7 @@ or specific version:
97182

98183
## Kotlin, Kotlin Jupyter, OpenAPI, Arrow and JDK versions
99184

100-
This table shows the mapping between main library components versions and minimum supported Java versions.
185+
This table shows the mapping between main library component versions and minimum supported Java versions.
101186

102187
| Kotlin DataFrame Version | Minimum Java Version | Kotlin Version | Kotlin Jupyter Version | OpenAPI version | Apache Arrow version |
103188
|--------------------------|----------------------|----------------|------------------------|-----------------|----------------------|
@@ -116,6 +201,9 @@ val airline by columnOf("KLM(!)", "{Air France} (12)", "(British Airways. )", "1
116201

117202
// create dataframe
118203
val df = dataFrameOf(fromTo, flightNumber, recentDelays, airline)
204+
205+
// print dataframe
206+
df.print()
119207
```
120208

121209
**Clean:**
@@ -155,7 +243,7 @@ val clean = df
155243
clean
156244
// group by the flight origin renamed into "from"
157245
.groupBy { origin named "from" }.aggregate {
158-
// we are in the context of single data group
246+
// we are in the context of a single data group
159247

160248
// total number of flights from origin
161249
count() into "count"

core/generated-sources/src/main/kotlin/org/jetbrains/kotlinx/dataframe/api/DataRowApi.kt

Lines changed: 108 additions & 0 deletions
@@ -15,6 +15,9 @@ import org.jetbrains.kotlinx.dataframe.index
1515
import org.jetbrains.kotlinx.dataframe.indices
1616
import org.jetbrains.kotlinx.dataframe.ncol
1717
import org.jetbrains.kotlinx.dataframe.nrow
18+
import org.jetbrains.kotlinx.dataframe.util.DIFF_DEPRECATION_MESSAGE
19+
import org.jetbrains.kotlinx.dataframe.util.DIFF_OR_NULL_IMPORT
20+
import org.jetbrains.kotlinx.dataframe.util.DIFF_REPLACE_MESSAGE
1821
import kotlin.experimental.ExperimentalTypeInference
1922
import kotlin.reflect.KProperty
2023
import kotlin.reflect.KType
@@ -74,17 +77,122 @@ public operator fun AnyRow.contains(column: KProperty<*>): Boolean = containsKey
7477

7578
// endregion
7679

80+
/**
81+
* Calculates the difference between the results of a row expression computed on the current and previous DataRow.
82+
*
83+
* @return [firstRowResult] for the first row; difference between expression computed for current and previous row for the following rows
84+
*/
85+
internal interface DiffDocs
86+
87+
/**
88+
* Calculates the difference between the results of a row expression computed on the current and previous DataRow.
89+
*
90+
* @return null for the first row; difference between expression computed for current and previous row for the following rows
91+
*/
92+
internal interface DiffOrNullDocs
93+
94+
@OptIn(ExperimentalTypeInference::class)
95+
@OverloadResolutionByLambdaReturnType
96+
/**
97+
* Calculates the difference between the results of a row expression computed on the current and previous DataRow.
98+
*
99+
* @return [firstRowResult] for the first row; difference between expression computed for current and previous row for the following rows
100+
*/
101+
public fun <T> DataRow<T>.diff(firstRowResult: Double, expression: RowExpression<T, Double>): Double =
102+
prev()?.let { p -> expression(this, this) - expression(p, p) } ?: firstRowResult
103+
104+
// required to resolve `diff(0) { intValue }`
105+
@OptIn(ExperimentalTypeInference::class)
106+
@OverloadResolutionByLambdaReturnType
107+
/**
108+
* Calculates the difference between the results of a row expression computed on the current and previous DataRow.
109+
*
110+
* @return [firstRowResult] for the first row; difference between expression computed for current and previous row for the following rows
111+
*/
112+
public fun <T> DataRow<T>.diff(firstRowResult: Int, expression: RowExpression<T, Int>): Int =
113+
prev()?.let { p -> expression(this, this) - expression(p, p) } ?: firstRowResult
114+
115+
/**
116+
* Calculates the difference between the results of a row expression computed on the current and previous DataRow.
117+
*
118+
* @return [firstRowResult] for the first row; difference between expression computed for current and previous row for the following rows
119+
*/
120+
public fun <T> DataRow<T>.diff(firstRowResult: Long, expression: RowExpression<T, Long>): Long =
121+
prev()?.let { p -> expression(this, this) - expression(p, p) } ?: firstRowResult
122+
123+
/**
124+
* Calculates the difference between the results of a row expression computed on the current and previous DataRow.
125+
*
126+
* @return [firstRowResult] for the first row; difference between expression computed for current and previous row for the following rows
127+
*/
128+
public fun <T> DataRow<T>.diff(firstRowResult: Float, expression: RowExpression<T, Float>): Float =
129+
prev()?.let { p -> expression(this, this) - expression(p, p) } ?: firstRowResult
130+
131+
@OptIn(ExperimentalTypeInference::class)
132+
@OverloadResolutionByLambdaReturnType
133+
/**
134+
* Calculates the difference between the results of a row expression computed on the current and previous DataRow.
135+
*
136+
* @return null for the first row; difference between expression computed for current and previous row for the following rows
137+
*/
138+
public fun <T> DataRow<T>.diffOrNull(expression: RowExpression<T, Double>): Double? =
139+
prev()?.let { p -> expression(this, this) - expression(p, p) }
140+
141+
/**
142+
* Calculates the difference between the results of a row expression computed on the current and previous DataRow.
143+
*
144+
* @return null for the first row; difference between expression computed for current and previous row for the following rows
145+
*/
146+
public fun <T> DataRow<T>.diffOrNull(expression: RowExpression<T, Int>): Int? =
147+
prev()?.let { p -> expression(this, this) - expression(p, p) }
148+
149+
/**
150+
* Calculates the difference between the results of a row expression computed on the current and previous DataRow.
151+
*
152+
* @return null for the first row; difference between expression computed for current and previous row for the following rows
153+
*/
154+
public fun <T> DataRow<T>.diffOrNull(expression: RowExpression<T, Long>): Long? =
155+
prev()?.let { p -> expression(this, this) - expression(p, p) }
156+
157+
/**
158+
* Calculates the difference between the results of a row expression computed on the current and previous DataRow.
159+
*
160+
* @return null for the first row; difference between expression computed for current and previous row for the following rows
161+
*/
162+
public fun <T> DataRow<T>.diffOrNull(expression: RowExpression<T, Float>): Float? =
163+
prev()?.let { p -> expression(this, this) - expression(p, p) }
164+
77165
@OptIn(ExperimentalTypeInference::class)
78166
@OverloadResolutionByLambdaReturnType
167+
@Deprecated(
168+
DIFF_DEPRECATION_MESSAGE,
169+
ReplaceWith(DIFF_REPLACE_MESSAGE, DIFF_OR_NULL_IMPORT),
170+
DeprecationLevel.WARNING
171+
)
79172
public fun <T> DataRow<T>.diff(expression: RowExpression<T, Double>): Double? =
80173
prev()?.let { p -> expression(this, this) - expression(p, p) }
81174

175+
@Deprecated(
176+
DIFF_DEPRECATION_MESSAGE,
177+
ReplaceWith(DIFF_REPLACE_MESSAGE, DIFF_OR_NULL_IMPORT),
178+
DeprecationLevel.WARNING
179+
)
82180
public fun <T> DataRow<T>.diff(expression: RowExpression<T, Int>): Int? =
83181
prev()?.let { p -> expression(this, this) - expression(p, p) }
84182

183+
@Deprecated(
184+
DIFF_DEPRECATION_MESSAGE,
185+
ReplaceWith(DIFF_REPLACE_MESSAGE, DIFF_OR_NULL_IMPORT),
186+
DeprecationLevel.WARNING
187+
)
85188
public fun <T> DataRow<T>.diff(expression: RowExpression<T, Long>): Long? =
86189
prev()?.let { p -> expression(this, this) - expression(p, p) }
87190

191+
@Deprecated(
192+
DIFF_DEPRECATION_MESSAGE,
193+
ReplaceWith(DIFF_REPLACE_MESSAGE, DIFF_OR_NULL_IMPORT),
194+
DeprecationLevel.WARNING
195+
)
88196
public fun <T> DataRow<T>.diff(expression: RowExpression<T, Float>): Float? =
89197
prev()?.let { p -> expression(this, this) - expression(p, p) }
90198
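
The change above splits the row-level difference into two shapes: `diff(firstRowResult) { ... }`, which returns a non-null value and uses the supplied default for the first row, and `diffOrNull { ... }`, which returns `null` for the first row. The following is a minimal plain-Kotlin sketch of that contract over a list, not the library implementation; the helper names `diffWithDefault` and `diffOrNullAt` are hypothetical and exist only for illustration.

```kotlin
// Plain-Kotlin model of the two behaviours; this is NOT the library code.
// `diffWithDefault` and `diffOrNullAt` are hypothetical helper names.

/** Difference to the previous element, or [firstResult] when there is no previous element. */
fun diffWithDefault(values: List<Int>, index: Int, firstResult: Int): Int =
    if (index == 0) firstResult else values[index] - values[index - 1]

/** Difference to the previous element, or null when there is no previous element. */
fun diffOrNullAt(values: List<Int>, index: Int): Int? =
    if (index == 0) null else values[index] - values[index - 1]

fun main() {
    val delays = listOf(10, 13, 7)

    // Non-null result: the first position falls back to the supplied default.
    val withDefault = delays.indices.map { diffWithDefault(delays, it, firstResult = 0) }

    // Nullable result: the first position is null, so callers must handle it explicitly.
    val orNull = delays.indices.map { diffOrNullAt(delays, it) }

    println(withDefault) // [0, 3, -6]
    println(orNull)      // [null, 3, -6]
}
```

The nullable variant makes the missing first value visible in the type, while the default variant keeps the result type non-null, which is presumably why the old nullable `diff` overloads are deprecated in favour of the explicitly named `diffOrNull`.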

core/generated-sources/src/main/kotlin/org/jetbrains/kotlinx/dataframe/api/Nulls.kt

Lines changed: 0 additions & 3 deletions
@@ -35,7 +35,6 @@ internal interface FillNulls {
3535
* | .`[notNull][org.jetbrains.kotlinx.dataframe.api.Update.notNull]` { `[rowExpression][org.jetbrains.kotlinx.dataframe.documentation.ExpressionsGivenRow.RowValueExpression.WithExample]` }
3636
* | .`[perCol][org.jetbrains.kotlinx.dataframe.api.Update.perCol]` { `[colExpression][org.jetbrains.kotlinx.dataframe.documentation.ExpressionsGivenColumn.ColumnExpression.WithExample]` }
3737
* | .`[perRowCol][org.jetbrains.kotlinx.dataframe.api.Update.perRowCol]` { `[rowColExpression][org.jetbrains.kotlinx.dataframe.documentation.ExpressionsGivenRowAndColumn.RowColumnExpression.WithExample]` }
38-
* | .`[withValue][org.jetbrains.kotlinx.dataframe.api.Update.withValue]`(value)
3938
* | .`[withNull][org.jetbrains.kotlinx.dataframe.api.Update.withNull]`()
4039
* | .`[withZero][org.jetbrains.kotlinx.dataframe.api.Update.withZero]`()
4140
* | .`[asFrame][org.jetbrains.kotlinx.dataframe.api.Update.asFrame]` { `[dataFrameExpression][org.jetbrains.kotlinx.dataframe.documentation.ExpressionsGivenDataFrame.DataFrameExpression.WithExample]` }`
@@ -239,7 +238,6 @@ internal interface FillNaNs {
239238
* | .`[notNull][org.jetbrains.kotlinx.dataframe.api.Update.notNull]` { `[rowExpression][org.jetbrains.kotlinx.dataframe.documentation.ExpressionsGivenRow.RowValueExpression.WithExample]` }
240239
* | .`[perCol][org.jetbrains.kotlinx.dataframe.api.Update.perCol]` { `[colExpression][org.jetbrains.kotlinx.dataframe.documentation.ExpressionsGivenColumn.ColumnExpression.WithExample]` }
241240
* | .`[perRowCol][org.jetbrains.kotlinx.dataframe.api.Update.perRowCol]` { `[rowColExpression][org.jetbrains.kotlinx.dataframe.documentation.ExpressionsGivenRowAndColumn.RowColumnExpression.WithExample]` }
242-
* | .`[withValue][org.jetbrains.kotlinx.dataframe.api.Update.withValue]`(value)
243241
* | .`[withNull][org.jetbrains.kotlinx.dataframe.api.Update.withNull]`()
244242
* | .`[withZero][org.jetbrains.kotlinx.dataframe.api.Update.withZero]`()
245243
* | .`[asFrame][org.jetbrains.kotlinx.dataframe.api.Update.asFrame]` { `[dataFrameExpression][org.jetbrains.kotlinx.dataframe.documentation.ExpressionsGivenDataFrame.DataFrameExpression.WithExample]` }`
@@ -413,7 +411,6 @@ internal interface FillNA {
413411
* | .`[notNull][org.jetbrains.kotlinx.dataframe.api.Update.notNull]` { `[rowExpression][org.jetbrains.kotlinx.dataframe.documentation.ExpressionsGivenRow.RowValueExpression.WithExample]` }
414412
* | .`[perCol][org.jetbrains.kotlinx.dataframe.api.Update.perCol]` { `[colExpression][org.jetbrains.kotlinx.dataframe.documentation.ExpressionsGivenColumn.ColumnExpression.WithExample]` }
415413
* | .`[perRowCol][org.jetbrains.kotlinx.dataframe.api.Update.perRowCol]` { `[rowColExpression][org.jetbrains.kotlinx.dataframe.documentation.ExpressionsGivenRowAndColumn.RowColumnExpression.WithExample]` }
416-
* | .`[withValue][org.jetbrains.kotlinx.dataframe.api.Update.withValue]`(value)
417414
* | .`[withNull][org.jetbrains.kotlinx.dataframe.api.Update.withNull]`()
418415
* | .`[withZero][org.jetbrains.kotlinx.dataframe.api.Update.withZero]`()
419416
* | .`[asFrame][org.jetbrains.kotlinx.dataframe.api.Update.asFrame]` { `[dataFrameExpression][org.jetbrains.kotlinx.dataframe.documentation.ExpressionsGivenDataFrame.DataFrameExpression.WithExample]` }`

core/generated-sources/src/main/kotlin/org/jetbrains/kotlinx/dataframe/api/toDataFrame.kt

Lines changed: 1 addition & 1 deletion
@@ -34,7 +34,7 @@ public inline fun <reified T> Iterable<T>.toDataFrame(vararg props: KProperty<*>
3434
properties(roots = props, maxDepth = maxDepth)
3535
}
3636

37-
@Deprecated(DF_READ_DEPRECATION_MESSAGE, ReplaceWith(DF_READ_REPLACE_MESSAGE), DeprecationLevel.ERROR)
37+
@Deprecated(DF_READ_DEPRECATION_MESSAGE, ReplaceWith("this.unfold(columns)"), DeprecationLevel.ERROR)
3838
public fun <T> DataFrame<T>.read(columns: ColumnsSelector<T, *>): DataFrame<T> = unfold(columns)
3939

4040
@Deprecated(DF_READ_DEPRECATION_MESSAGE, ReplaceWith(DF_READ_REPLACE_MESSAGE), DeprecationLevel.ERROR)

core/generated-sources/src/main/kotlin/org/jetbrains/kotlinx/dataframe/api/update.kt

Lines changed: 1 addition & 1 deletion
@@ -50,7 +50,6 @@ public data class Update<T, C>(
5050
* | .`[notNull][Update.notNull]` { `[rowExpression][ExpressionsGivenRow.RowValueExpression.WithExample]` }
5151
* | .`[perCol][Update.perCol]` { `[colExpression][ExpressionsGivenColumn.ColumnExpression.WithExample]` }
5252
* | .`[perRowCol][Update.perRowCol]` { `[rowColExpression][ExpressionsGivenRowAndColumn.RowColumnExpression.WithExample]` }
53-
* | .`[withValue][Update.withValue]`(value)
5453
* | .`[withNull][Update.withNull]`()
5554
* | .`[withZero][Update.withZero]`()
5655
* | .`[asFrame][Update.asFrame]` { `[dataFrameExpression][ExpressionsGivenDataFrame.DataFrameExpression.WithExample]` }`
@@ -764,4 +763,5 @@ public fun <T, C> Update<T, C>.withZero(): DataFrame<T> = updateWithValuePerColu
764763
*
765764
* @param [value] The value to set the selected rows to. In contrast to [with][Update.with], this must be the same exact type.
766765
*/
766+
@Deprecated("Use with { value } instead", ReplaceWith("this.with { value }"))
767767
public fun <T, C> Update<T, C>.withValue(value: C): DataFrame<T> = with { value }
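
The deprecation above keeps `withValue(value)` callable while steering callers to `with { value }` through `ReplaceWith`. A minimal sketch of the same pattern in plain Kotlin follows; `MiniUpdate` is a hypothetical stand-in invented for this example and is not the library's `Update` class.

```kotlin
// Hypothetical stand-in for an update builder; NOT the library's Update class.
class MiniUpdate<C>(private val cells: List<C>) {

    /** Replaces every cell with the result of [expression] applied to the old value. */
    fun with(expression: (C) -> C): List<C> = cells.map(expression)

    /** Kept for source compatibility; simply delegates to [with]. */
    @Deprecated("Use with { value } instead", ReplaceWith("this.with { value }"))
    fun withValue(value: C): List<C> = with { value }
}

fun main() {
    val update = MiniUpdate(listOf(1, 2, 3))

    // Still compiles, but is reported as deprecated; the IDE can offer the replacement.
    @Suppress("DEPRECATION")
    val viaWithValue = update.withValue(0)

    // The form the deprecation message points to.
    val viaWith = update.with { 0 }

    println(viaWithValue == viaWith) // true
}
```

Because the deprecated function only delegates, both calls produce the same result, so the migration suggested by `ReplaceWith` should not change behaviour in this simplified model.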

core/generated-sources/src/main/kotlin/org/jetbrains/kotlinx/dataframe/documentation/DocumentationUrls.kt

Lines changed: 10 additions & 0 deletions
@@ -19,6 +19,16 @@ internal interface DocumentationUrls {
1919
interface RowConditions
2020
}
2121

22+
/** [See `NaN` and `NA` on the documentation website.](https://kotlin.github.io/dataframe/nanAndNa.html) */
23+
interface NanAndNa {
24+
25+
/** [See `NaN` on the documentation website.](https://kotlin.github.io/dataframe/nanAndNa.html#nan) */
26+
interface NaN
27+
28+
/** [See `NA` on the documentation website.](https://kotlin.github.io/dataframe/nanAndNa.html#na) */
29+
interface NA
30+
}
31+
2232
/** [See `update` on the documentation website.](https://kotlin.github.io/dataframe/update.html) */
2333
interface Update
2434

core/generated-sources/src/main/kotlin/org/jetbrains/kotlinx/dataframe/documentation/NA.kt

Lines changed: 2 additions & 0 deletions
@@ -13,6 +13,8 @@ import org.jetbrains.kotlinx.dataframe.api.fillNA
1313
* You can also use [fillNA][fillNA] to replace `NAs` in certain columns with a given value or expression
1414
* or [dropNA][dropNA] to drop rows with `NAs` in them.
1515
*
16+
* For more information: [See `NA` on the documentation website.](https://kotlin.github.io/dataframe/nanAndNa.html#na)
17+
*
1618
* @see NaN
1719
*/
1820
internal interface NA

core/generated-sources/src/main/kotlin/org/jetbrains/kotlinx/dataframe/documentation/NaN.kt

Lines changed: 2 additions & 0 deletions
@@ -11,6 +11,8 @@ import org.jetbrains.kotlinx.dataframe.api.fillNaNs
1111
* You can also use [fillNaNs][fillNaNs] to replace `NaNs` in certain columns with a given value or expression
1212
* or [dropNaNs][dropNaNs] to drop rows with `NaNs` in them.
1313
*
14+
* For more information: [See `NaN` on the documentation website.](https://kotlin.github.io/dataframe/nanAndNa.html#nan)
15+
*
1416
* @see NA
1517
*/
1618
internal interface NaN

core/generated-sources/src/main/kotlin/org/jetbrains/kotlinx/dataframe/util/deprecationMessages.kt

Lines changed: 6 additions & 0 deletions
@@ -5,3 +5,9 @@ internal const val DF_READ_DEPRECATION_MESSAGE = "Replaced with `unfold` operati
55
internal const val DF_READ_REPLACE_MESSAGE = "this.unfold(*columns)"
66

77
internal const val ITERABLE_COLUMNS_DEPRECATION_MESSAGE = "Replaced with `toColumnSet()` operation."
8+
9+
internal const val DIFF_DEPRECATION_MESSAGE = "Replaced to explicitly indicate nullable return value; added a new non-null overload."
10+
11+
internal const val DIFF_REPLACE_MESSAGE = "this.diffOrNull(expression)"
12+
13+
internal const val DIFF_OR_NULL_IMPORT = "org.jetbrains.kotlinx.dataframe.api.diffOrNull"
