Minimal Scala sandbox for local development on Ubuntu.
Purpose: quick refresh of Scala syntax and functional patterns, with optional Apache Spark.
- run Scala locally
- demonstrate functional style
- show case classes and Options
- basic collections processing
- optional local Spark job
- no external services required
Requirements:
- Java 17+
- coursier installed
- scala-cli available
Check:
java -version
scala-cli version

Project layout:
scala-sandbox/
├── README.md
├── build.sbt
├── hello.sc
├── wordcount.sc
├── case_class.sc
├── spark_hello.sc
├── dataset_transformation.sc
├── etl_pipeline.sc
├── src/
│   ├── main/scala/
│   │   ├── WordCounter.scala
│   │   └── PersonUtils.scala
│   └── test/scala/
│       ├── WordCounterTest.scala
│       └── PersonUtilsTest.scala
└── project/
Run hello world:
scala-cli hello.sc

Run word count example:
scala-cli wordcount.sc

Run case class demo:
scala-cli case_class.sc

Run Spark example (optional, heavier):
scala-cli spark_hello.sc

Run dataset transformation:
scala-cli dataset_transformation.sc

Run ETL pipeline:
scala-cli etl_pipeline.sc

Build with sbt:
sbt compile
sbt test
sbt run

hello.sc demonstrates:
- main entry point
- simple function
- string interpolation
@main def hello(): Unit =
  val name = "MJ"
  println(s"Hello, $name")

wordcount.sc demonstrates:
- collections
- map and filter
- groupBy
- Option handling
def wordCount(text: String): Map[String, Int] =
  text
    .toLowerCase
    .split("\\W+")
    .filter(_.nonEmpty)
    .groupBy(identity)
    .view
    .mapValues(_.length)
    .toMap

@main def runWordCount(): Unit =
  val input = "Scala is great and Scala is fast"
  val counts = wordCount(input)
  counts.toSeq.sortBy(_._1).foreach(println)

Create case_class.sc if you want to extend the sandbox.
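The Option handling listed above can be shown with a small lookup on the counts map; `countOf` is a hypothetical helper, a sketch only:

```scala
// Map#get returns Option[Int] instead of throwing; getOrElse supplies a safe default
def countOf(counts: Map[String, Int], word: String): Int =
  counts.get(word.toLowerCase).getOrElse(0)

@main def runCountOf(): Unit =
  val counts = Map("scala" -> 2, "is" -> 2, "fast" -> 1)
  println(countOf(counts, "Scala")) // 2
  println(countOf(counts, "java"))  // 0
```

This avoids the `NoSuchElementException` that a bare `counts(word)` would throw for a missing key.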
case_class.sc demonstrates:
- case class
- immutability
- pattern matching
case class Person(name: String, age: Int)

def describe(p: Person): String =
  p match
    case Person(name, age) if age >= 18 => s"$name is an adult"
    case Person(name, _)                => s"$name is a minor"

@main def runCaseClass(): Unit =
  val people = List(
    Person("Alice", 30),
    Person("Bob", 15)
  )
  people.map(describe).foreach(println)

Only needed if practicing Spark.
Create spark_hello.sc:
//> using dep "org.apache.spark::spark-sql:3.5.1"
import org.apache.spark.sql.SparkSession
@main def sparkHello(): Unit =
  val spark = SparkSession.builder()
    .appName("LocalHello")
    .master("local[*]")
    .getOrCreate()

  import spark.implicits._
  val ds = Seq(1, 2, 3, 4).toDS()
  ds.show()
  spark.stop()

Notes:
- runs fully local
- suitable for laptop testing
- increase RAM if needed
What is sbt?
sbt (Scala Build Tool) is a build and task runner for Scala projects. It manages:
- Compilation: Compile Scala code to bytecode
- Dependencies: Download and manage libraries (from Maven Central, etc.)
- Testing: Run unit tests with frameworks like ScalaTest
- Packaging: Create JAR files for distribution
Key concepts:
- build.sbt: Configuration file defining project name, version, dependencies, Scala version
- src/main/scala/: Production code
- src/test/scala/: Test code
- target/: Compiled output (auto-generated, don't edit)
Common commands:
sbt compile # Compile code
sbt test # Run all tests
sbt "testOnly *WordCounterTest" # Run a specific test
sbt run # Run main class
sbt clean # Remove compiled artifacts
sbt package # Create JAR file

Example build.sbt:
ThisBuild / scalaVersion := "3.3.1"
ThisBuild / version := "0.1.0-SNAPSHOT"
lazy val root = (project in file("."))
  .settings(
    name := "scala-sandbox",
    libraryDependencies ++= Seq(
      "org.scalatest" %% "scalatest" % "3.2.18" % Test,
      // Spark is not published for Scala 3, so depend on the 2.13 artifacts
      ("org.apache.spark" %% "spark-sql" % "3.5.1" % Provided)
        .cross(CrossVersion.for3Use2_13)
    )
  )

Run tests:
sbt test # All tests
sbt "testOnly *WordCounterTest" # Specific test

Tests in src/test/scala/ use the ScalaTest framework:
import org.scalatest.flatspec.AnyFlatSpec
import org.scalatest.matchers.should.Matchers

class WordCounterTest extends AnyFlatSpec with Matchers:
  "wordCount" should "count words correctly" in {
    val result = WordCounter.wordCount("Scala is great and Scala is fast")
    result("scala") should be(2)
  }

Demonstrates Spark DataFrame/Dataset operations:
scala-cli dataset_transformation.sc

Features:
- Create datasets from case classes
- Filter and map operations
- GroupBy and aggregations
- Complex transformations with mapGroups
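For comparison, the groupBy/aggregation step can be mirrored in plain Scala collections with no Spark dependency; this `Employee` case class is an assumption mirroring the snippet below:

```scala
case class Employee(id: Int, name: String, department: String, salary: Double)

// groupBy + mapValues reproduces the Spark groupBy/agg(avg) result in memory
def avgSalaryByDept(employees: List[Employee]): Map[String, Double] =
  employees
    .groupBy(_.department)
    .view
    .mapValues(es => es.map(_.salary).sum / es.size)
    .toMap

@main def runAvgByDept(): Unit =
  val employees = List(
    Employee(1, "Alice", "Engineering", 120000),
    Employee(2, "Bob", "Sales", 90000)
  )
  println(avgSalaryByDept(employees))
```

Useful as a sanity check: the in-memory result should match what the Spark job prints for small inputs.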
import org.apache.spark.sql.functions.{avg, col}

case class Employee(id: Int, name: String, department: String, salary: Double)

// (assumes a SparkSession in scope with spark.implicits._ imported)
val employees = Seq(
  Employee(1, "Alice", "Engineering", 120000),
  Employee(2, "Bob", "Sales", 90000)
).toDS()

// Filter
employees.filter(_.department == "Engineering").show()

// GroupBy: average salary by department
employees
  .groupBy(col("department"))
  .agg(avg(col("salary")))
  .show()

Extract-Transform-Load demo with Spark:
scala-cli etl_pipeline.sc

Three steps:
- Extract: Load raw log data
- Transform: Clean, validate, aggregate
- Load: Generate summary reports
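The three steps above can be sketched without Spark on an assumed "LEVEL message" log format; the names and the format are hypothetical, for illustration only:

```scala
case class LogLine(level: String, message: String)

// Extract: parse raw lines, keeping malformed ones as None
def extract(raw: List[String]): List[Option[LogLine]] =
  raw.map { line =>
    line.split(" ", 2) match
      case Array(level, message) => Some(LogLine(level, message))
      case _                     => None
  }

// Transform: drop malformed lines, keep only errors
def transform(parsed: List[Option[LogLine]]): List[LogLine] =
  parsed.flatten.filter(_.level == "ERROR")

// Load: summarize as a count per error message
def load(errors: List[LogLine]): Map[String, Int] =
  errors.groupBy(_.message).view.mapValues(_.size).toMap

@main def runEtlSketch(): Unit =
  val raw = List("ERROR disk full", "INFO started", "garbage", "ERROR disk full")
  println(load(transform(extract(raw))))
```

The same extract/transform/load shape carries over to the Spark version, with Datasets in place of Lists.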
Format code:
scalafmt .

Start REPL:
scala-cli repl

Compile only:
scala-cli compile hello.sc

What you practice:
- modern Scala 3 syntax
- functional collection processing
- case classes and pattern matching
- Option-safe style
- sbt build system and dependency management
- unit testing with ScalaTest
- local Apache Spark execution
- Dataset transformations
- ETL pipeline patterns
- reproducible local setup
✅ unit tests (ScalaTest framework)
✅ sbt version with full project structure
✅ Dataset transformations (Spark DataFrames)
✅ Small ETL pipeline example