| 1 | +[//]: # (title: Data visualization with Lets-Plot for Kotlin) |
| 2 | + |
| 3 | +[Lets-Plot for Kotlin (LPK)](https://lets-plot.org/kotlin/get-started.html) is a multiplatform plotting library that ports R's [ggplot2 library](https://ggplot2.tidyverse.org/) to |
| 4 | +Kotlin. LPK brings the feature-rich ggplot2 API to the Kotlin ecosystem, |
| 5 | +making it suitable for scientists and statisticians who require sophisticated data visualization capabilities. |
| 6 | + |
| 7 | +LPK targets various platforms, including [Kotlin notebooks](data-analysis-overview.md#notebooks), [Kotlin/JS](js-overview.md), [JVM's Swing](https://docs.oracle.com/javase/8/docs/technotes/guides/swing/), [JavaFX](https://openjfx.io/), and [Compose Multiplatform](https://www.jetbrains.com/lp/compose-multiplatform/). |
| 8 | +LPK also integrates with [IntelliJ](https://www.jetbrains.com/idea/), [DataGrip](https://www.jetbrains.com/datagrip/), [DataSpell](https://www.jetbrains.com/dataspell/), and [PyCharm](https://www.jetbrains.com/pycharm/). |
| 9 | + |
| 10 | +{width=700} |
| 11 | + |
| 12 | +This tutorial demonstrates how to create different plot types with |
| 13 | +the LPK and [Kotlin DataFrame](https://kotlin.github.io/dataframe/gettingstarted.html) libraries using Kotlin Notebook in IntelliJ IDEA. |
| 14 | + |
| 15 | +## Before you start |
| 16 | + |
| 17 | +1. Download and install the latest version of [IntelliJ IDEA Ultimate](https://www.jetbrains.com/idea/download/?section=mac). |
| 18 | +2. Install the [Kotlin Notebook plugin](https://plugins.jetbrains.com/plugin/16340-kotlin-notebook) in IntelliJ IDEA. |
| 19 | + |
| 20 | + > Alternatively, access the Kotlin Notebook plugin from **Settings** | **Plugins** | **Marketplace** within IntelliJ IDEA. |
| 21 | + > |
| 22 | + {type="tip"} |
| 23 | + |
| 24 | +3. Create a new notebook by selecting **File** | **New** | **Kotlin Notebook**. |
| 25 | +4. In your notebook, import the LPK and Kotlin DataFrame libraries by running the following command: |
| 26 | + |
| 27 | + ```kotlin |
| 28 | + %use lets-plot |
| 29 | + %use dataframe |
| 30 | + ``` |
| 31 | + |
| 32 | +## Prepare the data |
| 33 | + |
| 34 | +Let's create a DataFrame that stores simulated monthly average temperatures for three cities: Berlin, Madrid, and Caracas. |
| 35 | +
|
| 36 | +Use the [`dataFrameOf()`](https://kotlin.github.io/dataframe/createdataframe.html#dataframeof) function from the Kotlin DataFrame library to generate the DataFrame. Paste and run the following code snippet in your Kotlin Notebook: |
| 37 | +
|
| 38 | +```kotlin |
| 39 | +// The months variable stores a list of the 12 months of the year |
| 40 | +val months = listOf( |
| 41 | + "January", "February", |
| 42 | + "March", "April", "May", |
| 43 | + "June", "July", "August", |
| 44 | + "September", "October", "November", |
| 45 | + "December" |
| 46 | +) |
| 47 | +// The tempBerlin, tempMadrid, and tempCaracas variables each store a list of temperature values, one per month |
| 48 | +val tempBerlin = |
| 49 | + listOf(-0.5, 0.0, 4.8, 9.0, 14.3, 17.5, 19.2, 18.9, 14.5, 9.7, 4.7, 1.0) |
| 50 | +val tempMadrid = |
| 51 | + listOf(6.3, 7.9, 11.2, 12.9, 16.7, 21.1, 24.7, 24.2, 20.3, 15.4, 9.9, 6.6) |
| 52 | +val tempCaracas = |
| 53 | + listOf(27.5, 28.9, 29.6, 30.9, 31.7, 35.1, 33.8, 32.2, 31.3, 29.4, 28.9, 27.6) |
| 54 | +
|
| 55 | +// The df variable stores a DataFrame with three columns: Month, Temperature, and City |
| 56 | +val df = dataFrameOf( |
| 57 | + "Month" to months + months + months, |
| 58 | + "Temperature" to tempBerlin + tempMadrid + tempCaracas, |
| 59 | + "City" to List(12) { "Berlin" } + List(12) { "Madrid" } + List(12) { "Caracas" } |
| 60 | +) |
| 61 | +df.head(4) |
| 62 | +``` |
| 63 | +
|
| 64 | +You can see that the DataFrame has three columns: Month, Temperature, and City. The first four rows of the DataFrame |
| 65 | +contain records of the temperature in Berlin from January to April: |
| 66 | +
|
| 67 | +{width=600} |
| 68 | +
|
| 69 | +To create a plot using the LPK library, you need to convert your data (`df`) into a `Map` type that stores the |
| 70 | +data in key-value pairs. You can easily convert a DataFrame into a `Map` using the [`.toMap()`](https://kotlin.github.io/dataframe/tomap.html) function: |
| 71 | +
|
| 72 | +```kotlin |
| 73 | +val data = df.toMap() |
| 74 | +``` |
| 75 | +
|
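To make the shape of this `Map` concrete, here is a minimal hand-built sketch that uses only the Kotlin standard library. It assumes the map is keyed by column name, with each key holding one value per row. The shortened sample values and the `columnsHaveEqualLength()` helper are illustrative and are not part of the LPK or Kotlin DataFrame APIs:

```kotlin
// A shortened, hand-built version of the data: three months for two cities
val sampleMonths = listOf("January", "February", "March")
val sampleBerlin = listOf(-0.5, 0.0, 4.8)
val sampleMadrid = listOf(6.3, 7.9, 11.2)

// Each key is a column name, and each value is that column's list of rows
val sampleData: Map<String, List<Any>> = mapOf(
    "Month" to sampleMonths + sampleMonths,
    "Temperature" to sampleBerlin + sampleMadrid,
    "City" to List(3) { "Berlin" } + List(3) { "Madrid" }
)

// Hypothetical helper: returns true if every column has the same number of rows
fun columnsHaveEqualLength(data: Map<String, List<Any>>): Boolean =
    data.values.map { it.size }.distinct().size <= 1
```

Because the map mirrors the DataFrame's columns, a check like `columnsHaveEqualLength(sampleData)` can be a quick sanity test before plotting data that you assembled by hand.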
| 76 | +## Create a scatter plot |
| 77 | +
|
| 78 | +Let's create a scatter plot in Kotlin Notebook with the LPK library. |
| 79 | + |
| 80 | +Once you have your data in the `Map` format, use the [`geomPoint()`](https://lets-plot.org/kotlin/api-reference/-lets--plot--kotlin/org.jetbrains.letsPlot.geom/geom-point/index.html) function from the LPK library to generate the scatter plot. |
| 81 | +You can specify the values for the X and Y axes, as well as define categories and their color. Additionally, |
| 82 | +you can [customize](https://lets-plot.org/kotlin/aesthetics.html#point-shapes) the plot's size and point shapes to suit your needs: |
| 83 | + |
| 84 | +```kotlin |
| 85 | +// Specifies X and Y axes, categories and their color, plot size, and plot type |
| 86 | +val scatterPlot = |
| 87 | + letsPlot(data) { x = "Month"; y = "Temperature"; color = "City" } + ggsize(600, 500) + geomPoint(shape = 15) |
| 88 | +scatterPlot |
| 89 | +``` |
| 90 | + |
| 91 | +Here's the result: |
| 92 | +
|
| 93 | +{width=600} |
| 94 | +
|
| 95 | +## Create a box plot |
| 96 | +
|
| 97 | +Let's visualize the [data](#prepare-the-data) in a box plot. Use the [`geomBoxplot()`](https://lets-plot.org/kotlin/api-reference/-lets--plot--kotlin/org.jetbrains.letsPlot.geom/geom-boxplot.html) |
| 98 | +function from the LPK library to generate the plot and [customize](https://lets-plot.org/kotlin/aesthetics.html#point-shapes) colors with the [`scaleFillManual()`](https://lets-plot.org/kotlin/api-reference/-lets--plot--kotlin/org.jetbrains.letsPlot.scale/scale-fill-manual.html) |
| 99 | +function: |
| 100 | + |
| 101 | +```kotlin |
| 102 | +// Specifies X and Y axes, categories, plot size, and plot type |
| 103 | +val boxPlot = ggplot(data) { x = "City"; y = "Temperature" } + ggsize(700, 500) + geomBoxplot { fill = "City" } + |
| 104 | + // Customizes colors |
| 105 | + scaleFillManual(values = listOf("light_yellow", "light_magenta", "light_green")) |
| 106 | +boxPlot |
| 107 | +``` |
| 108 | + |
| 109 | +Here's the result: |
| 110 | +
|
| 111 | +{width=600} |
| 112 | +
|
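A box plot summarizes the distribution of each category, and the line inside each box conventionally marks the median. As a rough cross-check, the sketch below computes the median, minimum, and maximum of the Berlin values by hand. The `berlinTemperatures` list repeats the values from the data preparation step so that the snippet is self-contained, and `median()` is a hypothetical helper, not an LPK function:

```kotlin
// Same values as tempBerlin in the data preparation step
val berlinTemperatures =
    listOf(-0.5, 0.0, 4.8, 9.0, 14.3, 17.5, 19.2, 18.9, 14.5, 9.7, 4.7, 1.0)

// Hypothetical helper: returns the middle value of a sorted copy of the list,
// or the mean of the two middle values when the list has an even size
fun median(values: List<Double>): Double {
    require(values.isNotEmpty()) { "Cannot compute the median of an empty list" }
    val sorted = values.sorted()
    val middle = sorted.size / 2
    return if (sorted.size % 2 == 1) sorted[middle] else (sorted[middle - 1] + sorted[middle]) / 2
}

val berlinMedian = median(berlinTemperatures)
val berlinMin = berlinTemperatures.minOf { it }
val berlinMax = berlinTemperatures.maxOf { it }
```

For these values, the median is about 9.35, which should roughly match the position of the middle line in the Berlin box.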
| 113 | +## Create a 2D density plot |
| 114 | +
|
| 115 | +Now, let's create a 2D density plot to visualize the distribution and concentration of some random data. |
| 116 | + |
| 117 | +### Prepare the data for the 2D density plot |
| 118 | + |
| 119 | +1. Import the dependencies to process the data and generate the plot: |
| 120 | + |
| 121 | + ```kotlin |
| 122 | + %use lets-plot |
| 123 | + |
| 124 | + @file:DependsOn("org.apache.commons:commons-math3:3.6.1") |
| 125 | + import org.apache.commons.math3.distribution.MultivariateNormalDistribution |
| 126 | + ``` |
| 127 | + |
| 128 | + > For more information about importing dependencies to Kotlin Notebook, see the [Kotlin Notebook documentation](https://www.jetbrains.com/help/idea/kotlin-notebook.html#add-dependencies). |
| 129 | + > {type="tip"} |
| 130 | + |
| 131 | +2. Paste and run the following code snippet in your Kotlin Notebook to create sets of 2D data points: |
| 132 | + |
| 133 | + ```kotlin |
| 134 | + // Defines covariance matrices for three distributions |
| 135 | + val cov0: Array<DoubleArray> = arrayOf( |
| 136 | + doubleArrayOf(1.0, -.8), |
| 137 | + doubleArrayOf(-.8, 1.0) |
| 138 | + ) |
| 139 | + |
| 140 | + val cov1: Array<DoubleArray> = arrayOf( |
| 141 | + doubleArrayOf(1.0, .8), |
| 142 | + doubleArrayOf(.8, 1.0) |
| 143 | + ) |
| 144 | + |
| 145 | + val cov2: Array<DoubleArray> = arrayOf( |
| 146 | + doubleArrayOf(10.0, .1), |
| 147 | + doubleArrayOf(.1, .1) |
| 148 | + ) |
| 149 | + |
| 150 | + // Defines the number of samples |
| 151 | + val n = 400 |
| 152 | + |
| 153 | + // Defines means for three distributions |
| 154 | + val means0: DoubleArray = doubleArrayOf(-2.0, 0.0) |
| 155 | + val means1: DoubleArray = doubleArrayOf(2.0, 0.0) |
| 156 | + val means2: DoubleArray = doubleArrayOf(0.0, 1.0) |
| 157 | + |
| 158 | + // Generates random samples from three multivariate normal distributions |
| 159 | + val xy0 = MultivariateNormalDistribution(means0, cov0).sample(n) |
| 160 | + val xy1 = MultivariateNormalDistribution(means1, cov1).sample(n) |
| 161 | + val xy2 = MultivariateNormalDistribution(means2, cov2).sample(n) |
| 162 | + ``` |
| 163 | + |
| 164 | + In the code above, the `xy0`, `xy1`, and `xy2` variables each store an array of 2D (`x, y`) data points. |
| 165 | + |
| 166 | +3. Convert your data into a `Map` type: |
| 167 | + |
| 168 | + ```kotlin |
| 169 | + val data = mapOf( |
| 170 | + "x" to (xy0.map { it[0] } + xy1.map { it[0] } + xy2.map { it[0] }).toList(), |
| 171 | + "y" to (xy0.map { it[1] } + xy1.map { it[1] } + xy2.map { it[1] }).toList() |
| 172 | + ) |
| 173 | + ``` |
| 174 | + |
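The off-diagonal entries of each covariance matrix determine how strongly `x` and `y` move together, which shapes the clusters in the plot. As a sketch, the correlation coefficient can be read from a 2×2 covariance matrix as the covariance divided by the product of the two standard deviations. The `correlation()` helper and the `sampleCov` names are hypothetical and use only the Kotlin standard library:

```kotlin
import kotlin.math.sqrt

// Same values as cov0, cov1, and cov2 in the data preparation step
val sampleCov0 = arrayOf(doubleArrayOf(1.0, -0.8), doubleArrayOf(-0.8, 1.0))
val sampleCov1 = arrayOf(doubleArrayOf(1.0, 0.8), doubleArrayOf(0.8, 1.0))
val sampleCov2 = arrayOf(doubleArrayOf(10.0, 0.1), doubleArrayOf(0.1, 0.1))

// Hypothetical helper: correlation = cov(x, y) / (sd(x) * sd(y)),
// where the variances sit on the diagonal of the 2x2 covariance matrix
fun correlation(cov: Array<DoubleArray>): Double =
    cov[0][1] / sqrt(cov[0][0] * cov[1][1])
```

With these matrices, the first two distributions have correlations of -0.8 and 0.8, so their clusters tilt in opposite directions. The third has a correlation of 0.1 and a much larger variance along `x` than along `y`, so it forms a wide, flat band.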
| 175 | +### Generate the 2D density plot |
| 176 | + |
| 177 | +Using the `Map` from the previous step, create a 2D density plot (`geomDensity2D`) with a scatter plot (`geomPoint`) in the background to better visualize the |
| 178 | +data points and outliers. You can use the [`scaleColorGradient()`](https://lets-plot.org/kotlin/api-reference/-lets--plot--kotlin/org.jetbrains.letsPlot.scale/scale-color-gradient.html) function to customize the scale of colors: |
| 179 | + |
| 180 | +```kotlin |
| 181 | +val densityPlot = letsPlot(data) { x = "x"; y = "y" } + ggsize(600, 300) + geomPoint( |
| 182 | + color = "black", |
| 183 | + alpha = .1 |
| 184 | +) + geomDensity2D { color = "..level.." } + |
| 185 | + scaleColorGradient(low = "dark_green", high = "yellow", guide = guideColorbar(barHeight = 10, barWidth = 300)) + |
| 186 | + theme().legendPositionBottom() |
| 187 | +densityPlot |
| 188 | +``` |
| 189 | + |
| 190 | +Here's the result: |
| 191 | +
|
| 192 | +{width=600} |
| 193 | +
|
| 194 | +## What's next |
| 195 | + |
| 196 | +* Explore more plot examples in the [Lets-Plot for Kotlin documentation](https://lets-plot.org/kotlin/charts.html). |
| 197 | +* Check the Lets-Plot for Kotlin [API reference](https://lets-plot.org/kotlin/api-reference/). |
| 198 | +* Learn about transforming and visualizing data with Kotlin in the [Kotlin DataFrame](https://kotlin.github.io/dataframe/info.html) and [Kandy](https://kotlin.github.io/kandy/welcome.html) library documentation. |
| 199 | +* Find additional information about the [Kotlin Notebook's usage and key features](https://www.jetbrains.com/help/idea/kotlin-notebook.html). |