Commit 93ce411

adding number unification docs
1 parent f469c84 commit 93ce411

3 files changed: 47 additions and 3 deletions

core/generated-sources/src/main/kotlin/org/jetbrains/kotlinx/dataframe/documentation/UnifyingNumbers.kt

Lines changed: 2 additions & 1 deletion
@@ -40,6 +40,7 @@ import org.jetbrains.kotlinx.dataframe.impl.UnifiedNumberTypeOptions
  *
  * See [UnifiedNumberTypeOptions] for these settings.
  *
- * At the bottom of the graph is [Nothing]. This can be interpreted as `null`.
+ * At the bottom of the graph is [Nothing?][Nothing].
+ * This can be interpreted as `null`.
  */
 public interface UnifyingNumbers

core/src/main/kotlin/org/jetbrains/kotlinx/dataframe/documentation/UnifyingNumbers.kt

Lines changed: 3 additions & 1 deletion
@@ -22,7 +22,8 @@ import org.jetbrains.kotlinx.dataframe.impl.UnifiedNumberTypeOptions
  *
  * See [UnifiedNumberTypeOptions] for these settings.
  *
- * At the bottom of the graph is [Nothing]. This can be interpreted as `null`.
+ * At the bottom of the graph is [Nothing?][Nothing].
+ * This can be interpreted as `null`.
  */
 public interface UnifyingNumbers {
 
@@ -48,5 +49,6 @@ public interface UnifyingNumbers {
      * ```
      */
     @ExcludeFromSources
+    @ExportAsHtml
     private interface Graph
 }
Lines changed: 42 additions & 1 deletion
@@ -1,3 +1,44 @@
 [//]: # (title: Number Unification)
 
-// TODO
+Unifying numbers means converting them to a common number type without losing information.
+
+This is an internal part of the library for now, but its logic can be encountered in multiple places, such as
+[statistics](summaryStatistics.md) and [reading JSON](read.md#read-from-json).
+
+The following graph shows the hierarchy of number types in Kotlin DataFrame.
+
+<inline-frame src="kdocs/org.jetbrains.kotlinx.dataframe.documentation.UnifyingNumbers.Graph.html" />
+
+The order is top-down from the most complex type to the simplest one.
+
+For each number type in the graph, a number of that type can be expressed losslessly by
+a number of a more complex type (any of its parents).
+This is because the more complex type has either a larger range or higher precision (in terms of bits).
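
The lossless property described above can be checked with a round trip: widen a value to the wider type, convert it back, and compare it with the original. The following is a minimal sketch that uses only Kotlin standard-library conversions. The helper names are hypothetical, they are not part of Kotlin DataFrame, and the sketch does not call the library's internal unification logic.

```kotlin
// Hypothetical helpers (not part of Kotlin DataFrame): each one widens a value,
// narrows it back, and reports whether the original value survived.
fun survivesViaLong(value: Int): Boolean = value.toLong().toInt() == value

fun survivesViaDouble(value: Float): Boolean = value.toDouble().toFloat() == value

fun main() {
    // Larger range: every Int fits in a Long.
    println(survivesViaLong(Int.MAX_VALUE))
    println(survivesViaLong(Int.MIN_VALUE))

    // Higher precision: every finite Float is exactly representable as a Double.
    println(survivesViaDouble(0.1f))
    println(survivesViaDouble(Float.MAX_VALUE))
}
```

All four checks are expected to print `true`, which is what "expressed losslessly" means for these pairs of types.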
+
+Nullability, while not displayed everywhere in the graph, is also taken into account.
+This means that `Int?` and `Float` will be unified to `Double?`.
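
One way to see why `Int?` and `Float` meet at `Double?` rather than at `Float?`: a `Float` cannot hold every `Int` exactly, a `Double` can hold every `Int` and every `Float`, and the `?` is needed to keep the `null` values. The sketch below uses plain Kotlin standard-library conversions. The function `unifyToDouble` is a hypothetical name for illustration and is not the library's internal implementation.

```kotlin
// Hypothetical helper (not part of Kotlin DataFrame): converts nullable numbers
// to Double?, leaving nulls untouched.
fun unifyToDouble(values: List<Number?>): List<Double?> = values.map { it?.toDouble() }

fun main() {
    val ints: List<Int?> = listOf(1, null, 16_777_217)
    val floats: List<Float> = listOf(0.5f, 1.5f)

    // Float has a 24-bit significand, so 16_777_217 (2^24 + 1) is rounded
    // when converted to Float, but it is exact as a Double.
    println(16_777_217.toFloat().toInt() == 16_777_217)   // false
    println(16_777_217.toDouble().toInt() == 16_777_217)  // true

    // Both inputs end up in a single List<Double?>; the null is preserved.
    val unified: List<Double?> = unifyToDouble(ints) + unifyToDouble(floats)
    println(unified)
}
```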
+
+At the bottom of the graph is `Nothing?`. This can be interpreted as `null`.
+
+> There may be parts of the library that "unify" numbers, such as [`readCsv`](read.md#column-type-inference-from-csv)
+> or [`readExcel`](read.md#read-from-excel).
+> However, because they rely on another library (like [Deephaven CSV](https://github.com/deephaven/deephaven-csv)),
+> they may behave slightly differently.
+
+### Unified Number Type Options
+
+There are variants of this graph that exclude some types, such as `BigDecimal` and `BigInteger`, or
+allow some slightly lossy conversions, like from `Long` to `Double`.
+
+Which variant applies depends on the setting: either `UnifiedNumberTypeOptions.PRIMITIVES_ONLY` or
+`UnifiedNumberTypeOptions.DEFAULT`.
+
+For `PRIMITIVES_ONLY`, used by [statistics](summaryStatistics.md), big numbers are excluded from the graph.
+Additionally, `Double` is considered the most complex type,
+meaning `Long`/`ULong` and `Double` can be joined to `Double`,
+potentially losing a little precision(!).
+
+For `DEFAULT`, used by [`readJson`](read.md#read-from-json), big numbers can appear.
+`BigDecimal` is considered the most complex type, meaning that `Long`/`ULong` and `Double` will be joined
+to `BigDecimal` instead.
+
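
The practical difference between joining to `Double` and joining to `BigDecimal` shows up for `Long` values above 2^53, where `Double` no longer has enough significand bits. The sketch below assumes a JVM target (for `java.math.BigDecimal`) and uses only standard conversions. The helper names are hypothetical, and the code does not call `UnifiedNumberTypeOptions` or any other Kotlin DataFrame API.

```kotlin
import java.math.BigDecimal

// Hypothetical helpers (not part of Kotlin DataFrame) that mimic the two join targets
// by converting a value to the target type and back.
fun longSurvivesAsDouble(value: Long): Boolean = value.toDouble().toLong() == value

fun longSurvivesAsBigDecimal(value: Long): Boolean =
    BigDecimal.valueOf(value).longValueExact() == value

// Only valid for finite doubles: BigDecimal(Double) throws for NaN and infinities.
fun doubleSurvivesAsBigDecimal(value: Double): Boolean =
    BigDecimal(value).toDouble() == value

fun main() {
    val justAbove2Pow53 = (1L shl 53) + 1  // 9_007_199_254_740_993

    // Joining to Double: the value is rounded to the nearest representable Double.
    println(longSurvivesAsDouble(justAbove2Pow53))      // false

    // Joining to BigDecimal: both the Long and a Double are kept exactly.
    println(longSurvivesAsBigDecimal(justAbove2Pow53))  // true
    println(doubleSurvivesAsBigDecimal(0.1))            // true
}
```

Smaller `Long` values, such as `42L`, survive the conversion to `Double` unchanged, which is why the loss is only potential.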
