|
1 | | -# TypedFrames |
| 1 | +# Iskra |
2 | 2 |
|
3 | | -TypedFrames is a Scala 3 wrapper around Apache Spark API which allows writing typesafe and boilerplate-free but still efficient Spark code. |
| 3 | +Iskra is a Scala 3 wrapper around Apache Spark API which allows writing typesafe and boilerplate-free but still efficient Spark code. |
4 | 4 |
|
5 | 5 | ## How is it possible to write Spark applications in Scala 3? |
6 | 6 |
|
7 | 7 | Starting from the release of 3.2.0, Spark is cross-compiled also for Scala 2.13, which opens a way to using Spark from Scala 3 code, as Scala 3 projects can depend on Scala 2.13 artifacts. |
8 | 8 |
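For readers who want to see what "depending on Scala 2.13 artifacts" looks like in practice, here is a minimal `build.sbt` sketch. It assumes an sbt build that uses Spark directly, which is not part of this change; Iskra already pulls in Spark transitively. The versions are the ones named in this README. `CrossVersion.for3Use2_13` is sbt's standard setting for resolving the `_2.13` build of a library from a Scala 3 project.

```scala
// build.sbt (illustrative sketch, not taken from this repository)
scalaVersion := "3.1.3"

// Resolve the Scala 2.13 build of Spark (spark-sql_2.13) from a Scala 3 project
libraryDependencies += ("org.apache.spark" %% "spark-sql" % "3.2.0").cross(CrossVersion.for3Use2_13)
```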
|
9 | 9 | However, one might run into problems when trying to call a method requiring an implicit instance of Spark's `Encoder` type. Derivation of instances of `Encoder` relies on presence of a `TypeTag` for a given type. However `TypeTag`s are not generated by Scala 3 compiler anymore (and there are no plans to support this) so instances of `Encoder` cannot be automatically synthesized in most cases. |
10 | 10 |
|
11 | | -TypedFrames tries to work around this problem by using its own encoders (unrelated to Spark's `Encoder` type) generated using Scala 3's new metaprogramming API. |
| 11 | +Iskra tries to work around this problem by using its own encoders (unrelated to Spark's `Encoder` type) generated using Scala 3's new metaprogramming API. |
12 | 12 |
|
13 | | -## How does TypedFrames make things typesafe and efficient at the same time? |
| 13 | +## How does Iskra make things typesafe and efficient at the same time? |
14 | 14 |
|
15 | 15 | TypedFrames provides thin (but strongly typed) wrappers around `DataFrame`s, which track types and names of columns at compile time but let Catalyst perform all of its optimizations at runtime. |
16 | 16 |
|
17 | | -TypedFrames uses structural types rather than case classes as data models, which gives us a lot of flexibility (no need to explicitly define a new case class when a column is added/removed/renamed!) but we still get compilation errors when we try to refer to a column which doesn't exist or can't be used in a given context. |
| 17 | +Iskra uses structural types rather than case classes as data models, which gives us a lot of flexibility (no need to explicitly define a new case class when a column is added/removed/renamed!) but we still get compilation errors when we try to refer to a column which doesn't exist or can't be used in a given context. |
18 | 18 |
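To illustrate the language feature behind this, here is a minimal, self-contained sketch of Scala 3 structural types. It does not reproduce Iskra's actual implementation or API: `Record`, `Person`, and `demo` are hypothetical names made up for the illustration. The point is only that field names and types live in a type refinement rather than in a case class, and the compiler still rejects references to fields that are not declared.

```scala
// Hypothetical illustration of structural types; not Iskra's real data model.
class Record(fields: Map[String, Any]) extends Selectable:
  // Called by the compiler for every member access on a refined Record type
  def selectDynamic(name: String): Any = fields(name)

// The "schema" is a refinement type, so no case class has to be defined
type Person = Record { val name: String; val age: Int }

@main def demo(): Unit =
  val p = Record(Map("name" -> "Ada", "age" -> 36)).asInstanceOf[Person]
  println(p.name) // compiles: `name` is declared in the refinement
  println(p.age)  // compiles: `age` is declared in the refinement
  // p.salary     // would not compile: `salary` is not a member of Person
```

Adding or renaming a field here means changing only the refinement type, which is the flexibility the paragraph above refers to.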
|
19 | 19 | ## Usage |
20 | 20 |
|
21 | 21 | :warning: This library is in its early stage of development - the syntax and type hierarchy might still change, |
22 | 22 | the coverage of Spark's API is far from being complete and more tests are needed. |
23 | 23 |
|
24 | | -1) Add TypedFrames as a dependency to your project, e.g. |
| 24 | +1) Add Iskra as a dependency to your project, e.g. |
25 | 25 |
|
26 | 26 | * in a file compiled with Scala CLI: |
27 | 27 | ```scala |
28 | | -//> using lib "org.virtuslab::typed-frames:0.0.1" |
| 28 | +//> using lib "org.virtuslab::iskra:0.0.1" |
29 | 29 | ``` |
30 | 30 |
|
31 | 31 | * when starting Scala CLI REPL: |
32 | 32 | ```shell |
33 | | -scala-cli repl --dep org.virtuslab::typed-frames:0.0.1 |
| 33 | +scala-cli repl --dep org.virtuslab::iskra:0.0.1 |
34 | 34 | ``` |
35 | 35 |
|
36 | 36 | * in `build.sbt` in an sbt project: |
37 | 37 | ```scala |
38 | | -libraryDependencies += "org.virtuslab" %% "typed-frames" % "0.0.1" |
| 38 | +libraryDependencies += "org.virtuslab" %% "iskra" % "0.0.1" |
39 | 39 | ``` |
40 | 40 |
|
41 | | -TypedFrames is built with Scala 3.1.3 so it's compatible with Scala 3.1.x and newer minor releases (starting from 3.2.0-RC1 you'll get code completions for names of columns in REPL and Metals!). |
42 | | -TypedFrames transitively depends on Spark 3.2.0. |
| 41 | +Iskra is built with Scala 3.1.3 so it's compatible with Scala 3.1.x and newer minor releases (starting from 3.2.0-RC1 you'll get code completions for names of columns in REPL and Metals!). |
| 42 | +Iskra transitively depends on Spark 3.2.0. |
43 | 43 |
|
44 | 44 | 2) Import the basic definitions from the API |
45 | 45 | ```scala |
46 | | -import org.virtuslab.typedframes.api.* |
| 46 | +import org.virtuslab.iskra.api.* |
47 | 47 | ``` |
48 | 48 |
|
49 | 49 | 3) Get a Spark session, e.g. |
@@ -87,9 +87,9 @@ foos.innerJoin(bars).on($.foos.barId === $.bars.id).select(...) |
87 | 87 | ``` |
88 | 88 | * As you might have noticed above, the aliases for `foos` and `bars` were automatically inferred |
89 | 89 |
|
90 | | -6) For reference look at the [examples](src/test/example/) and the [API docs](https://virtuslab.github.io/typed-frames/) |
| 90 | +6) For reference look at the [examples](src/test/example/) and the [API docs](https://virtuslab.github.io/iskra/) |
91 | 91 |
|
92 | 92 | ## Local development |
93 | 93 |
|
94 | 94 |
|
95 | | -This project is built using [scala-cli](https://scala-cli.virtuslab.org/) so just use the traditional commands with `.` as root like `scala-cli compile .` or `scala-cli test .`. |
| 95 | +This project is built using [scala-cli](https://scala-cli.virtuslab.org/) so just use the traditional commands with `.` as root like `scala-cli compile .` or `scala-cli test .`. |