29 commits
1ce08a4
Merge branch 'util_functions_docs' into join_docs
AndreiKingsley Sep 8, 2025
4eba809
Merge branch 'master' into join_docs
AndreiKingsley Sep 8, 2025
0fe349a
join docs examples
AndreiKingsley Sep 8, 2025
4e5e60b
korro update
AndreiKingsley Sep 8, 2025
b64b96c
join samples
AndreiKingsley Sep 9, 2025
6a6fcff
join samples comments
AndreiKingsley Sep 9, 2025
a205d04
join samples comments in docs
AndreiKingsley Sep 9, 2025
63b420d
Merge branch 'master' into join_docs
AndreiKingsley Sep 9, 2025
0642463
ktlint format
AndreiKingsley Sep 9, 2025
278b781
Merge branch 'master' into join_docs
AndreiKingsley Sep 10, 2025
ca9e963
improve join description
AndreiKingsley Sep 10, 2025
cf5313c
Merge branch 'master' into join_docs
AndreiKingsley Sep 10, 2025
623238e
formatted ifreames for simple join
AndreiKingsley Sep 10, 2025
3ded9e1
update join examples datasets
AndreiKingsley Sep 10, 2025
1ffbb39
update join examples with colored tables
AndreiKingsley Sep 10, 2025
c206279
update all join examples with colored tables
AndreiKingsley Sep 10, 2025
7a72ccd
ktlint format
AndreiKingsley Sep 11, 2025
3881f60
fix korro
AndreiKingsley Sep 17, 2025
26efbdb
fix korro paths
AndreiKingsley Sep 17, 2025
ebd7b3b
Merge branch 'master' into join_docs
AndreiKingsley Sep 17, 2025
746baa9
Revert "fix korro"
AndreiKingsley Sep 17, 2025
6c7a875
fix korro build error
AndreiKingsley Sep 17, 2025
4b2aefb
Merge branch 'master' into join_docs
AndreiKingsley Oct 3, 2025
ac873c2
Merge branch 'master' into join_docs
AndreiKingsley Oct 3, 2025
34512f4
update Kandy
AndreiKingsley Oct 3, 2025
9cce2a4
join docs colorize headers
AndreiKingsley Oct 3, 2025
15306b4
join docs korro update
AndreiKingsley Oct 3, 2025
75b9a95
increase gradle heap
AndreiKingsley Oct 3, 2025
77fc4a9
update _shadow_resources.md
AndreiKingsley Oct 3, 2025
512 changes: 512 additions & 0 deletions docs/StardustDocs/resources/api/join/notebook_test_join_10.html

Large diffs are not rendered by default.

514 changes: 514 additions & 0 deletions docs/StardustDocs/resources/api/join/notebook_test_join_11.html

Large diffs are not rendered by default.

513 changes: 513 additions & 0 deletions docs/StardustDocs/resources/api/join/notebook_test_join_12.html

Large diffs are not rendered by default.

512 changes: 512 additions & 0 deletions docs/StardustDocs/resources/api/join/notebook_test_join_13.html

Large diffs are not rendered by default.

512 changes: 512 additions & 0 deletions docs/StardustDocs/resources/api/join/notebook_test_join_14.html

Large diffs are not rendered by default.

513 changes: 513 additions & 0 deletions docs/StardustDocs/resources/api/join/notebook_test_join_15.html

Large diffs are not rendered by default.

512 changes: 512 additions & 0 deletions docs/StardustDocs/resources/api/join/notebook_test_join_16.html

Large diffs are not rendered by default.

513 changes: 513 additions & 0 deletions docs/StardustDocs/resources/api/join/notebook_test_join_17.html

Large diffs are not rendered by default.

513 changes: 513 additions & 0 deletions docs/StardustDocs/resources/api/join/notebook_test_join_18.html

Large diffs are not rendered by default.

513 changes: 513 additions & 0 deletions docs/StardustDocs/resources/api/join/notebook_test_join_19.html

Large diffs are not rendered by default.

512 changes: 512 additions & 0 deletions docs/StardustDocs/resources/api/join/notebook_test_join_20.html

Large diffs are not rendered by default.

511 changes: 511 additions & 0 deletions docs/StardustDocs/resources/api/join/notebook_test_join_3.html

Large diffs are not rendered by default.

511 changes: 511 additions & 0 deletions docs/StardustDocs/resources/api/join/notebook_test_join_5.html

Large diffs are not rendered by default.

512 changes: 512 additions & 0 deletions docs/StardustDocs/resources/api/join/notebook_test_join_6.html

Large diffs are not rendered by default.

512 changes: 512 additions & 0 deletions docs/StardustDocs/resources/api/join/notebook_test_join_8.html

Large diffs are not rendered by default.

15 changes: 15 additions & 0 deletions docs/StardustDocs/topics/_shadow_resources.md
@@ -166,6 +166,21 @@
<resource src="notebook_test_generate_docs_1.html"></resource>
<resource src="notebook_test_shuffle_2.html"></resource>
<resource src="notebook_test_shuffle_1.html"></resource>
<resource src="notebook_test_join_8.html"></resource>
<resource src="notebook_test_join_10.html"></resource>
<resource src="notebook_test_join_5.html"></resource>
<resource src="notebook_test_join_11.html"></resource>
<resource src="notebook_test_join_20.html"></resource>
<resource src="notebook_test_join_16.html"></resource>
<resource src="notebook_test_join_17.html"></resource>
<resource src="notebook_test_join_3.html"></resource>
<resource src="notebook_test_join_18.html"></resource>
<resource src="notebook_test_join_14.html"></resource>
<resource src="notebook_test_join_15.html"></resource>
<resource src="notebook_test_join_19.html"></resource>
<resource src="notebook_test_join_12.html"></resource>
<resource src="notebook_test_join_6.html"></resource>
<resource src="notebook_test_join_13.html"></resource>
<resource src="notebook_test_chunked_1.html"></resource>
<resource src="notebook_test_chunked_3.html"></resource>
<resource src="notebook_test_chunked_2.html"></resource>
112 changes: 0 additions & 112 deletions docs/StardustDocs/topics/join.md

This file was deleted.

233 changes: 233 additions & 0 deletions docs/StardustDocs/topics/operations/multiple/join.md
@@ -0,0 +1,233 @@
[//]: # (title: join)

<!---IMPORT org.jetbrains.kotlinx.dataframe.samples.api.multiple.JoinSamples-->

Joins two [`DataFrame`](DataFrame.md) objects by join columns.

Review comment (Collaborator): objects

Review comment (Collaborator): ah you just copied this part from the original file


```kotlin
join(otherDf, type = JoinType.Inner) [ { joinColumns } ]

joinColumns: JoinDsl.(LeftDataFrame) -> Columns

interface JoinDsl: LeftDataFrame {

val right: RightDataFrame

fun DataColumn.match(rightColumn: DataColumn)
}
```

`joinColumns` is a [column selector](ColumnSelectors.md) that defines the column mapping for the join.

Related operations: [](multipleDataFrames.md)

## Examples

<!---FUN notebook_test_join_3-->

```kotlin
dfAges
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_3.html" width="100%" height="500px"></inline-frame>

<!---FUN notebook_test_join_5-->

```kotlin
dfCities
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_5.html" width="100%" height="500px"></inline-frame>

<!---FUN notebook_test_join_6-->

```kotlin
// INNER JOIN on differently named keys:
// Merge a row when dfAges.firstName == dfCities.name.
// With the given data all 3 names match → all rows merge.
dfAges.join(dfCities) { firstName match right.name }
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_6.html" width="100%" height="500px"></inline-frame>

If mapped columns have the same name, just select join columns from the left [`DataFrame`](DataFrame.md):

Review comment (Collaborator): Hmm it's hard to see where the previous example ends and the new one begins. Maybe you could give them a small title?

Review comment (Collaborator): you do exactly that below :) nice


<!---FUN notebook_test_join_8-->

```kotlin
dfLeft
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_8.html" width="100%" height="500px"></inline-frame>


<!---FUN notebook_test_join_10-->

```kotlin
dfRight
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_10.html" width="100%" height="500px"></inline-frame>

<!---FUN notebook_test_join_11-->

```kotlin
// INNER JOIN on "name" only:
// Merge when left.name == right.name.
// Duplicate keys produce multiple merged rows (one per pairing).
dfLeft.join(dfRight) { name }
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_11.html" width="100%" height="500px"></inline-frame>

If `joinColumns` is not specified, columns with the same name from both [`DataFrame`](DataFrame.md)
objects will be used as join columns:

Review comment (Collaborator): You say joinColumns is not specified, yet in the example, you show { name and city }


<!---FUN notebook_test_join_12-->

```kotlin
// INNER JOIN on all same-named columns ("name" and "city"):
// Merge when BOTH name AND city are equal; otherwise the row is dropped.
dfLeft.join(dfRight)
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_12.html" width="100%" height="500px"></inline-frame>
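
As a rough illustration of the default described above, the implicit join keys can be thought of as the column names that occur on both sides. The sketch below works on plain lists of column names; `commonColumnNames` is a hypothetical helper and not part of the Kotlin DataFrame API, and the `age` and `isBusy` column names are assumptions made up for the example.

```kotlin
// Hypothetical helper: NOT a Kotlin DataFrame function.
// Returns the names that occur in both lists, in left-hand order.
fun commonColumnNames(left: List<String>, right: List<String>): List<String> =
    left.filter { it in right }

fun main() {
    // Assumed column names, chosen only for illustration.
    val leftColumns = listOf("name", "city", "age")
    val rightColumns = listOf("name", "city", "isBusy")
    println(commonColumnNames(leftColumns, rightColumns)) // prints [name, city]
}
```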


## Join types

Supported join types:
* `Inner` (default) — only matched rows from left and right [`DataFrame`](DataFrame.md) objects
* `Filter` — only matched rows from left [`DataFrame`](DataFrame.md)
* `Left` — all rows from left [`DataFrame`](DataFrame.md), mismatches from right [`DataFrame`](DataFrame.md) filled with `null`
* `Right` — all rows from right [`DataFrame`](DataFrame.md), mismatches from left [`DataFrame`](DataFrame.md) filled with `null`
* `Full` — all rows from left and right [`DataFrame`](DataFrame.md) objects, any mismatches filled with `null`
* `Exclude` — only mismatched rows from left [`DataFrame`](DataFrame.md)
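
The join types listed above can be modeled in plain Kotlin to make the differences concrete. This is a minimal sketch and not the Kotlin DataFrame implementation: `Row`, `JoinKind`, `joinRows`, and the sample rows are hypothetical names introduced here. It assumes that non-key columns have different names on each side and that `null` key values compare equal; row order in the result is not meant to mirror the library.

```kotlin
// Plain-Kotlin model of the six join types. NOT the Kotlin DataFrame implementation.
typealias Row = Map<String, Any?>

enum class JoinKind { Inner, Filter, Left, Right, Full, Exclude }

fun joinRows(left: List<Row>, right: List<Row>, keys: List<String>, kind: JoinKind): List<Row> {
    // The values of the key columns identify which rows pair up.
    fun keyOf(row: Row): List<Any?> = keys.map { row[it] }

    val rightByKey: Map<List<Any?>, List<Row>> = right.groupBy { keyOf(it) }
    val leftKeys: Set<List<Any?>> = left.map { keyOf(it) }.toSet()

    // Null padding for the side that has no partner row.
    val nullLeft: Row = left.flatMap { it.keys }.distinct()
        .filter { it !in keys }.associateWith { null }
    val nullRight: Row = right.flatMap { it.keys }.distinct()
        .filter { it !in keys }.associateWith { null }

    // Each left row merged with every right row that has the same key values.
    val matched = left.flatMap { l -> rightByKey[keyOf(l)].orEmpty().map { r -> l + r } }
    // Same as `matched`, but left rows without a partner are kept and padded with nulls.
    val leftPadded = left.flatMap { l ->
        rightByKey[keyOf(l)]?.map { r -> l + r } ?: listOf(l + nullRight)
    }
    // Right rows without a partner, padded with nulls for the left-only columns.
    val rightOnlyPadded = right.filter { keyOf(it) !in leftKeys }.map { nullLeft + it }

    return when (kind) {
        JoinKind.Inner -> matched
        JoinKind.Filter -> left.filter { keyOf(it) in rightByKey }
        JoinKind.Exclude -> left.filter { keyOf(it) !in rightByKey }
        JoinKind.Left -> leftPadded
        JoinKind.Right -> matched + rightOnlyPadded
        JoinKind.Full -> leftPadded + rightOnlyPadded
    }
}

// Hypothetical sample rows; these are not the dfLeft/dfRight datasets from the examples.
val sampleLeft: List<Row> = listOf(
    mapOf("name" to "Ann", "city" to "Oslo", "age" to 31),
    mapOf("name" to "Ben", "city" to "Rome", "age" to 45),
)
val sampleRight: List<Row> = listOf(
    mapOf("name" to "Ann", "city" to "Oslo", "isBusy" to true),
    mapOf("name" to "Cid", "city" to "Lima", "isBusy" to false),
)
val sampleKeys = listOf("name", "city")

fun main() {
    for (kind in JoinKind.values()) {
        println("$kind: ${joinRows(sampleLeft, sampleRight, sampleKeys, kind)}")
    }
}
```

In this model `Filter` and `Exclude` return left rows unchanged, so no columns from the right side appear, while `Left`, `Right`, and `Full` add `null` values for the columns of the side that has no matching row.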

For every join type there is a shortcut operation:

```kotlin
df.innerJoin(otherDf) [ { joinColumns } ]
df.filterJoin(otherDf) [ { joinColumns } ]
df.leftJoin(otherDf) [ { joinColumns } ]
df.rightJoin(otherDf) [ { joinColumns } ]
df.fullJoin(otherDf) [ { joinColumns } ]
df.excludeJoin(otherDf) [ { joinColumns } ]
```


### Examples {id="examples_1"}

<!---FUN notebook_test_join_13-->

```kotlin
dfLeft
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_13.html" width="100%" height="500px"></inline-frame>

<!---FUN notebook_test_join_14-->

```kotlin
dfRight
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_14.html" width="100%" height="500px"></inline-frame>

<!---FUN notebook_test_join_15-->

```kotlin
// INNER JOIN:
// Keep only rows where (name, city) match on both sides.
// In this dataset both Charlies match twice (Moscow, Milan) → 2 merged rows.
dfLeft.innerJoin(dfRight) { name and city }
```

Review comment (Collaborator): what do you mean "both charlies"? The result shows Alice and Charlie.

<!---END-->

<inline-frame src="./resources/notebook_test_join_15.html" width="100%" height="500px"></inline-frame>

<!---FUN notebook_test_join_16-->

```kotlin
// FILTER JOIN:
// Keep ONLY left rows that have ANY match on (name, city).
// No right-side columns are added.
dfLeft.filterJoin(dfRight) { name and city }
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_16.html" width="100%" height="500px"></inline-frame>

<!---FUN notebook_test_join_17-->

```kotlin
// LEFT JOIN:
// Keep ALL left rows. If (name, city) matches, attach right columns;
// if not, right columns are null (e.g., Alice–London has no right match).
dfLeft.leftJoin(dfRight) { name and city }
```

Review comment (Collaborator): *columns from the right dataframe

Also, Alice-London does have a match, Bob-Dubai does not, so isBusy == null

<!---END-->

<inline-frame src="./resources/notebook_test_join_17.html" width="100%" height="500px"></inline-frame>

<!---FUN notebook_test_join_18-->

```kotlin
// RIGHT JOIN:
// Keep ALL right rows. If no left match, left columns become null
// (e.g., Alice with city=null exists only on the right).
dfLeft.rightJoin(dfRight) { name and city }
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_18.html" width="100%" height="500px"></inline-frame>

<!---FUN notebook_test_join_19-->

```kotlin
// FULL JOIN:
// Keep ALL rows from both sides. Where there's no match on (name, city),
// the other side is filled with nulls.
dfLeft.fullJoin(dfRight) { name and city }
```

Review comment (Collaborator): could you make all nulls bold? so it's clearer these are new, added to fill in the gaps

Review comment (Collaborator): `.format().with { if (it == null) bold else null }` will do

I still find the examples hard to follow, I'm wondering how we can make it as clear as possible :)

<!---END-->

<inline-frame src="./resources/notebook_test_join_19.html" width="100%" height="500px"></inline-frame>

<!---FUN notebook_test_join_20-->

```kotlin
// EXCLUDE JOIN:
// Keep ONLY left rows that have NO match on (name, city).
// Useful to find "unpaired" left rows.
dfLeft.excludeJoin(dfRight) { name and city }
```

<!---END-->

<inline-frame src="./resources/notebook_test_join_20.html" width="100%" height="500px"></inline-frame>

4 changes: 3 additions & 1 deletion samples/build.gradle.kts
@@ -63,13 +63,14 @@ kotlin.sourceSets {

korro {
docs = fileTree(rootProject.rootDir) {
include("docs/StardustDocs/topics/DataSchema-Data-Classes-Generation.md")
include("docs/StardustDocs/topics/schemas/*.md")
include("docs/StardustDocs/topics/read.md")
include("docs/StardustDocs/topics/write.md")
include("docs/StardustDocs/topics/rename.md")
include("docs/StardustDocs/topics/format.md")
include("docs/StardustDocs/topics/guides/*.md")
include("docs/StardustDocs/topics/operations/utils/*.md")
include("docs/StardustDocs/topics/operations/multiple/*.md")
include("docs/StardustDocs/topics/operations/column/*.md")
include("docs/StardustDocs/topics/collectionsInterop/*.md")
include("docs/StardustDocs/topics/dataSources/sql/*.md")
@@ -80,6 +81,7 @@ korro {
include("src/test/kotlin/org/jetbrains/kotlinx/dataframe/samples/*.kt")
include("src/test/kotlin/org/jetbrains/kotlinx/dataframe/samples/api/*.kt")
include("src/test/kotlin/org/jetbrains/kotlinx/dataframe/samples/api/utils/*.kt")
include("src/test/kotlin/org/jetbrains/kotlinx/dataframe/samples/api/multiple/*.kt")
include("src/test/kotlin/org/jetbrains/kotlinx/dataframe/samples/api/collectionsInterop/*.kt")
include("src/test/kotlin/org/jetbrains/kotlinx/dataframe/samples/api/column/*.kt")
include("src/test/kotlin/org/jetbrains/kotlinx/dataframe/samples/api/info/*.kt")