| 1 | += Zero-Setup All-in-One Java Tooling via Mill Bootstrap Scripts |
| 2 | + |
| 3 | +// tag::header[] |
| 4 | +:author: Li Haoyi |
| 5 | +:revdate: 24 September 2025 |
| 6 | + |
| 7 | +_{author}, {revdate}_ |
| 8 | + |
| 9 | +Getting the software you need installed onto your machine is a common point of |
| 10 | +friction, whether you're on macOS finding |
| 11 | +https://github.com/orgs/Homebrew/discussions/1177[Homebrew being terribly slow] or on Ubuntu finding |
| 12 | +https://www.reddit.com/r/Ubuntu/comments/1j3ldpm/why_are_all_my_apt_programs_so_outdated/[the versions available are all outdated]. |
| 13 | +Setting up Java projects in particular often involves a multi-step process to install `mvn`, |
| 14 | +`sdkman`, `jenv`, and the `java` version you need. |
| 15 | + |
| 16 | +The Mill build tool does something interesting here: it requires no system-wide installation |
| 17 | +at all to build your Java projects! You can check out any codebase built with Mill on a bare |
| 18 | +Linux/Mac/Windows machine, build it without any prior setup using its `./mill` bootstrap |
| 19 | +script, and Mill will automatically download and cache itself, any JVMs, and any third-party |
| 20 | +libraries and tools necessary. For example, the `./mill __.compile` below is all that is needed |
| 21 | +to compile all modules in a newly-checked-out Mill project on a clean machine, greatly |
| 22 | +simplifying building your project on diverse dev and CI environments: |
| 23 | + |
| 24 | +```console |
| 25 | +> curl -L https://github.com/com-lihaoyi/cask/archive/refs/heads/master.zip -o cask.zip |
| 26 | + |
| 27 | +> unzip cask.zip && cd cask-master |
| 28 | + |
| 29 | +> ./mill __.compile |
| 30 | +``` |
| 31 | + |
| 32 | +This blog post explores how Mill's zero-install workflow works: the status quo, |
| 33 | +the interesting innovations that Mill builds upon, and Mill's unique ideas |
| 34 | +that let it achieve this zero-setup usage to greatly simplify getting started |
| 35 | +working with Java, Scala, or Kotlin projects. |
| 36 | + |
| 37 | +// end::header[] |
| 38 | + |
| 39 | +== 1-Step and Multi-Step Installation |
| 40 | + |
| 41 | +Perhaps the most common way software is installed is via package managers like `apt`, `yum`, or |
| 42 | +`brew`. For example, the incantation to install `git` on an Amazon Linux machine is: |
| 43 | + |
| 44 | +```console |
| 45 | +> sudo yum install git |
| 46 | +``` |
| 47 | + |
| 48 | +Depending on how nicely the software you are installing is packaged, this may or may not require |
| 49 | +additional commands to install transitive dependencies. For example, when setting up a codebase |
| 50 | +for development, you may need to: |
| 51 | + |
| 52 | +- `apt install` the Python version you want to use |
| 53 | +- `pip install` the libraries you want to use |
| 54 | +- Also `apt install` any native dependencies your Python code needs to run. |
| 55 | + |
| 56 | +In the JVM ecosystem, it is common to need to: |
| 57 | + |
| 58 | +* `apt install openjdk-17-jdk` and then |
| 59 | +* `apt install maven` |
| 60 | +* It's also common to install https://sdkman.io/[SdkMan] or https://github.com/jenv/jenv[JEnv] |
| 61 | + to help manage your JVM, e.g. |
| 62 | +** `curl -s "https://get.sdkman.io" | bash`, `sdk install java 17-tem` |
| 63 | +** `sudo apt install jenv`, `jenv local 17` |
| 64 | + |
| 65 | +Such multi-step workflows are common when building a software project, as the codebase and |
| 66 | +its dependencies are never as nicely packaged as distributed binaries like `git`. Using a |
| 67 | +language-specific package manager or build tool can help, but since the build tool itself |
| 68 | +needs to be installed it remains a multi-step workflow getting everything set up and ready to use. |
| 69 | + |
| 70 | +There are other ways to install things apart from package managers: `curl <url> | bash` is common, |
| 71 | +as is manually downloading binaries to put on your `PATH`. But all of these have a similar problem: |
| 72 | +the installation must be done _manually_ and happen _before_ you can begin working on your project. |
| 73 | +This gives a lot of room for things to go wrong, for example: |
| 74 | + |
| 75 | +1. **Things falling out of sync**: the installation commands on macOS using `brew` will be different |
| 76 | + from Amazon Linux using `yum` or Ubuntu using `apt`, so it's terribly easy to end up with |
| 77 | + subtly different sets of packages on each. This results in tedious busy-work trying to keep the |
| 78 | + various environments in sync. |
| 79 | + |
| 80 | +2. **Steps done manually and not reproduced**: it is always tempting to `apt install` or |
| 81 | + `pip install` something locally to get things working, but that leaves you open |
| 82 | + for your code failing on CI workers due to missing installs, or failing on your co-workers' |
| 83 | + laptops which are missing some manual steps. |
| 84 | + |
| 85 | +3. **The number of steps growing**: while a 1-step install may seem fine, a large codebase |
| 86 | + may have many packages and tools requiring 1-step installs, resulting in an installation |
| 87 | + process with dozens of steps that can be tedious if run manually and fragile if scripted. |
| 88 | + |
| 89 | +Multi-step setup workflows are the norm, and 1-step setup workflows are something people often |
| 90 | +strive towards. But it's worth asking: could we do better? |
| 91 | + |
| 92 | +== Maven & Gradle Bootstrap Scripts |
| 93 | + |
| 94 | +One interesting innovation on the installation process is the use of a _bootstrap script_. These |
| 95 | +were popularized by the https://gradle.org/[Gradle build tool] as a |
| 96 | +https://docs.gradle.org/current/userguide/gradle_wrapper.html[./gradlew] bootstrap script you |
| 97 | +commit to the repository root. The bootstrap script embeds the version of Gradle you |
| 98 | +want to use, and downloads and caches that specific version when it is invoked. That means |
| 99 | +you can check out a project's code and run: |
| 100 | + |
| 101 | +```console |
| 102 | +> ./gradlew build |
| 103 | +``` |
| 104 | + |
| 105 | +And be sure you are using the same version of Gradle that everyone else is also using |
| 106 | +to build that project. This can be very handy: you now no longer need to worry about installing |
| 107 | +the "right version" of Gradle on your colleagues' laptops, on CI, etc. The bootstrap |
| 108 | +script ensures that anyone working on the project - human or otherwise - will be using the |
| 109 | +same version. |
| 110 | + |
| 111 | +Furthermore, as tools like Gradle automatically resolve the application-level |
| 112 | +dependencies required by the project they are building, the user does not need to install |
| 113 | +those manually. Any `build` or `install` or `test` command results in all necessary |
| 114 | +dependencies being automatically downloaded and cached as necessary. More recently, the |
| 115 | +https://maven.apache.org/[Maven build tool] has adopted a similar convention with |
| 116 | +https://maven.apache.org/tools/wrapper/[./mvnw] scripts serving the same purpose. |
| 117 | + |
| 118 | +However, one limitation of the Maven and Gradle approach to bootstrap scripts is that they rely |
| 119 | +on `java` being pre-installed to begin the bootstrapping process. Without `java`, they cannot |
| 120 | +run at all, as shown below: |
| 121 | + |
| 122 | +```console |
| 123 | +> curl -L https://github.com/netty/netty/archive/refs/heads/4.2.zip -o netty.zip |
| 124 | +> unzip netty.zip |
| 125 | +> ./mvnw clean install |
| 126 | +/usr/bin/which: no javac in (/home/ec2-user/.local/bin:/home/ec2-user/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin) |
| 127 | +Error: JAVA_HOME is not defined correctly. |
| 128 | +``` |
| 129 | + |
| 130 | +So even with the `./gradlew` or `./mvnw` bootstrap scripts, working with Gradle or Maven still |
| 131 | +ends up being a 1-step installation process: you need to install `java` (and the right version |
| 132 | +of Java!) before you begin, possibly using SdkMan or JEnv, each of which itself needs to |
| 133 | +be installed first. Thus although these bootstrap scripts mitigate |
| 134 | +the problem - differing Java versions tend to be more forgiving than differing Maven/Gradle |
| 135 | +versions - they haven't completely solved it. |
| 136 | + |
| 137 | +Why do these bootstrap scripts require `java` to be installed? It's |
| 138 | +because they don't want to put non-trivial bootstrapping logic into `.sh` or `.bat` scripts, |
| 139 | +and for JVM build tools, writing their bootstrapping logic in Java running on the JVM makes sense. |
| 140 | +But that doesn't seem like a hard requirement, and it should be possible to make a bootstrapping |
| 141 | +binary that can run without `java` or any other runtime pre-installed. That is the approach |
| 142 | +that Mill takes. |
| 143 | + |
| 144 | +== Mill's Zero-Setup Bootstrap Scripts |
| 145 | + |
| 146 | +Mill's xref:mill::cli/installation-ide.adoc#_bootstrap_scripts[./mill bootstrap scripts] are |
| 147 | +similar to `./mvnw` or `./gradlew`, but differ in that |
| 148 | +by default they do not require `java` pre-installed in order to run. Instead, `./mill` downloads |
| 149 | +a native platform-specific binary that then performs the bootstrapping process: |
| 150 | + |
| 151 | +``` |
| 152 | +mill-dist-native-linux-aarch64-1.0.5.exe |
| 153 | +mill-dist-native-linux-amd64-1.0.5.exe |
| 154 | +mill-dist-native-mac-aarch64-1.0.5.exe |
| 155 | +mill-dist-native-mac-amd64-1.0.5.exe |
| 156 | +``` |
| 157 | + |
| 158 | +These `.exe` files are JVM executables, but compiled to native platform-specific binaries using |
| 159 | +the xref:7-graal-native-executables.adoc[Graal Native Image compiler]. Apart from the benefits |
| 160 | +of reduced startup time and memory usage, the key property we care about is that native image |
| 161 | +binaries can also run on bare environments without a `java` runtime pre-installed. This lets |
| 162 | +us write our non-trivial bootstrapping logic in Java and run it without needing a |
| 163 | +system-wide `java` distribution pre-installed on the machine. |
| 164 | + |
| 165 | +As native image binaries are OS/CPU-specific, we need some logic to pick the right binary for the |
| 166 | +machine the bootstrap script is running on, and that logic needs to run in the `.sh` or `.bat` |
| 167 | +bootstrap script because we need it to run _before_ the native image binary has been downloaded. |
| 168 | +The `.sh` version of this implemented using `uname` is as follows: |
| 169 | + |
| 170 | +```bash |
| 171 | +ARTIFACT_SUFFIX="" |
| 172 | +set_artifact_suffix(){ |
| 173 | + if [ "$(expr substr $(uname -s) 1 5 2>/dev/null)" = "Linux" ]; then |
| 174 | + if [ "$(uname -m)" = "aarch64" ]; then |
| 175 | + ARTIFACT_SUFFIX="-native-linux-aarch64" |
| 176 | + else |
| 177 | + ARTIFACT_SUFFIX="-native-linux-amd64" |
| 178 | + fi |
| 179 | + elif [ "$(uname)" = "Darwin" ]; then |
| 180 | + if [ "$(uname -m)" = "arm64" ]; then |
| 181 | + ARTIFACT_SUFFIX="-native-mac-aarch64" |
| 182 | + else |
| 183 | + ARTIFACT_SUFFIX="-native-mac-amd64" |
| 184 | + fi |
| 185 | + else |
| 186 | + echo "This native mill launcher supports only Linux and macOS." 1>&2 |
| 187 | + exit 1 |
| 188 | + fi |
| 189 | +} |
| 190 | +``` |
| 191 | + |
| 192 | +The bootstrap script can then assemble this into a download URL to `curl` down the relevant file |
| 193 | +from the Maven Central package repository: |
| 194 | + |
| 195 | +```bash |
| 196 | +DOWNLOAD_URL="https://repo1.maven.org/maven2/com/lihaoyi/mill-dist${ARTIFACT_SUFFIX}/${MILL_VERSION}/mill-dist${ARTIFACT_SUFFIX}-${MILL_VERSION}.${DOWNLOAD_EXT}" |
| 197 | +curl -f -L -o "${DOWNLOAD_FILE}" "${DOWNLOAD_URL}" |
| 198 | +``` |
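The download only needs to happen once per version. As a minimal sketch of that caching step, assuming a hypothetical helper named `fetch_cached` and an assumed `FETCH_CMD` override hook (neither is taken from Mill's real script), the bootstrap script could skip the download whenever the target file already exists:

```bash
# Hypothetical helper, not taken from Mill's real script: download a file
# only if it is not already present at the target path.
fetch_cached() {
  url="$1"
  target="$2"
  if [ -f "$target" ]; then
    return 0  # cache hit: nothing to download
  fi
  mkdir -p "$(dirname "$target")"
  # Download to a temp file first, so an interrupted download is never
  # mistaken for a complete cached binary.
  tmp="${target}.tmp.$$"
  # FETCH_CMD is an assumed override hook; it defaults to curl.
  if "${FETCH_CMD:-curl}" -f -L -o "$tmp" "$url"; then
    mv "$tmp" "$target"
  else
    rm -f "$tmp"
    return 1
  fi
}

# Example call (commented out because it needs network access):
# fetch_cached "${DOWNLOAD_URL}" "${DOWNLOAD_FILE}" && chmod +x "${DOWNLOAD_FILE}"
```

Putting the version into the target path, as the download URL above already does, would let binaries for different Mill versions sit side by side in the same cache directory.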
| 199 | + |
| 200 | +We can then execute the downloaded file, taking any command line arguments given to the bootstrap |
| 201 | +script and forwarding them to the native binary: |
| 202 | + |
| 203 | +```bash |
| 204 | +exec "${DOWNLOAD_FILE}" "$@" |
| 205 | +``` |
| 206 | + |
| 207 | +The snippets above are somewhat simplified - the |
| 208 | +https://github.com/com-lihaoyi/mill/blob/1.0.5/dist/scripts/src/mill.sh[actual bootstrap script] |
| 209 | +contains a lot more logic to handle backwards compatibility, version configuration, Windows |
| 210 | +support, and other necessary details. But at a high level, they illustrate what Mill's |
| 211 | +bootstrap script does: it picks and downloads the native binary of the configured version, |
| 212 | +operating system, and CPU architecture, and executes it to begin the Mill bootstrapping process. |
| 213 | +This lets it bootstrap from _shell/bat script_ to _native image binary_ without any prior |
| 214 | +installation of `java` or other system-wide dependencies, and from there we can bootstrap the |
| 215 | +rest of the way. |
| 216 | + |
| 217 | +== Bootstrapping a Full JVM Environment |
| 218 | + |
| 219 | +Once we execute our native image binary, we then have an opportunity to run real JVM code (as |
| 220 | +opposed to sketchy shell scripts) to proceed with bootstrapping. When someone runs |
| 221 | +`./mill __.compile` to compile all modules in a repository, and the native image bootstrap |
| 222 | +launcher has been downloaded as described above, we can then use it to: |
| 223 | + |
| 224 | +1. **Download the JVM that Mill needs to run**, as Graal Native Images have limitations around |
| 225 | + classloading that make them unsuitable for the Mill daemon process |
| 226 | + |
| 227 | +2. **Download the `.jar` files that make up the Mill daemon process**, since Mill is implemented |
| 228 | + as a mixed Java/Scala codebase which compiles to `.class` files and is distributed as ``.jar``s |
| 229 | + |
| 230 | +3. **Start the Mill daemon process, which runs those `.jar` files on the downloaded JVM** |
| 231 | + |
| 232 | +Once we have the Mill daemon process running, further steps are necessary to bootstrap the Mill |
| 233 | +build dependencies and user code dependencies: |
| 234 | + |
| 235 | +1. **Resolve any `.jar` files necessary for Mill's build logic, and any user-configured plugins**, |
| 236 | + and load them into a classloader to invoke the build |
| 237 | + |
| 238 | +2. **Resolve any `.jar` files or JVM necessary for user modules to compile and run** |
| 239 | + |
| 240 | +3. Finally, **compile the user code using any `.jar` files and any custom JVM that they require**. |
| 241 | + |
| 242 | +The various `.jar` files are typically downloaded from |
| 243 | +https://central.sonatype.com/[Maven Central], which is the standard package repository for JVM libraries. |
| 244 | +The JVMs themselves come from the various provider download URLs that we reference via |
| 245 | +the https://github.com/coursier/jvm-index[Coursier JVM Index]. Apart from libraries and JVMs, |
| 246 | +all tools necessary for your Java/Scala/Kotlin development are also bootstrapped the |
| 247 | +same way - xref:mill::javalib/linting.adoc#_linting_with_checkstyle[Checkstyle], |
| 248 | +xref:mill::javalib/linting.adoc#_linting_with_errorprone[ErrorProne], |
| 249 | +xref:mill::scalalib/linting.adoc#_autoformatting_with_scalafmt[ScalaFmt], |
| 250 | +xref:mill::kotlinlib/linting.adoc#_linting_with_ktlint[KtLint], etc. - so you can use them |
| 251 | +without needing prior system-wide setup or installation. |
| 252 | + |
| 253 | +Note that we only do these steps once the native image bootstrap launcher has been downloaded |
| 254 | +as they require non-trivial logic: resolving JVM versions to download URLs, resolving `.jar` |
| 255 | +files from https://maven.apache.org/repositories/artifacts.html[group-artifact-version coordinates], |
| 256 | +adjudicating version conflicts, etc. This is too complicated to implement in `.sh` and `.bat` |
| 257 | +scripts, so Mill handles that using https://github.com/coursier/coursier[Coursier] which is |
| 258 | +a common JVM dependency resolution library also used by https://bazel.build/[Bazel] and |
| 259 | +https://www.scala-sbt.org/[SBT]. |
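To illustrate only the simplest piece of that logic, the sketch below maps a group-artifact-version coordinate to the conventional `.jar` URL on Maven Central, following the same `repo1.maven.org/maven2` layout as the launcher download URL shown earlier. The function name `gav_to_url` and the example coordinate are illustrative and are not part of Mill or Coursier. Real resolution also has to read `.pom` metadata, follow transitive dependencies, and reconcile conflicting versions, which this sketch does not attempt.

```bash
# Hypothetical helper: turn "group:artifact:version" into the conventional
# Maven Central URL of the main .jar. No network access is performed.
gav_to_url() {
  coords="$1"
  group="${coords%%:*}"      # text before the first colon
  rest="${coords#*:}"        # "artifact:version"
  artifact="${rest%%:*}"
  version="${rest#*:}"
  # Dots in the group id become directory separators in the repository path
  group_path="$(printf '%s' "$group" | tr '.' '/')"
  printf '%s\n' "https://repo1.maven.org/maven2/${group_path}/${artifact}/${version}/${artifact}-${version}.jar"
}

# Illustrative coordinate only
gav_to_url "com.google.guava:guava:33.0.0-jre"
```

Even this small step involves string handling that is awkward to keep identical across `.sh` and `.bat`, which hints at why the heavier resolution work is left to JVM code.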
| 260 | + |
| 261 | +The final bootstrapping process of `./mill __.compile` looks something like this, with the |
| 262 | +solid lines indicating local steps in the bootstrapping process, and the dashed lines |
| 263 | +indicating downloads from package repositories: |
| 264 | + |
| 265 | +```graphviz |
| 266 | +digraph G { |
| 267 | + node [shape=box width=0 height=0 style=filled fillcolor=white] |
| 268 | + subgraph cluster0{ |
| 269 | + color=white |
| 270 | + |
| 271 | + "./mill" -> "native image launcher binary" -> "daemon jars" -> "daemon process" -> "build jars" -> "build classloader" -> "user code dependency jars" |
| 272 | + "native image launcher binary" -> "daemon JVM" -> "daemon process" |
| 273 | + "build classloader" -> "user code JVM" |
| 274 | + |
| 275 | + "user code JVM" -> "__.compile" |
| 276 | + "user code dependency jars" -> "__.compile" |
| 277 | + "user code sources" -> "__.compile" |
| 278 | + } |
| 279 | + "JVM Vendor" [style=dashed] |
| 280 | + |
| 281 | + "Maven Central" [style=dashed] |
| 282 | + "Maven Central" -> "native image launcher binary" [style=dashed arrowhead=empty weight=0] |
| 283 | + "Maven Central" -> "daemon jars" [style=dashed arrowhead=empty weight=0] |
| 284 | + "JVM Vendor" -> "daemon JVM" [style=dashed arrowhead=empty weight=0] |
| 285 | + "Maven Central" -> "build jars" [style=dashed arrowhead=empty weight=0] |
| 286 | + "Maven Central" -> "user code dependency jars" [style=dashed arrowhead=empty weight=0] |
| 287 | + "JVM Vendor" -> "user code JVM" [style=dashed arrowhead=empty weight=0] |
| 288 | + {"Maven Central"; "JVM Vendor"; "./mill"; rank=same} |
| 289 | +} |
| 290 | +``` |
| 291 | + |
| 292 | +Although this may seem like a lot of steps, all of them are completely automatic, and generally |
| 293 | +invisible to the user: |
| 294 | + |
| 295 | +* Jars and JVMs are downloaded when needed, in parallel where possible, and cached for future use. |
| 296 | + |
| 297 | +* Different versions of libraries and packages are assigned different caches on disk and can |
| 298 | + co-exist on the same machine. |
| 299 | + |
| 300 | +* Even different versions of the JVM can be downloaded and used |
| 301 | + at the same time without issue, e.g. if different user modules need to compile and run with |
| 302 | + different library or JVM versions. |
| 303 | + |
| 304 | +This is unlike packages installed via `brew` or `apt` or `yum`, where installation often |
| 305 | +has to be done manually, and typically only a single version of a package can be "installed" |
| 306 | +or "active" globally on a system at any one point in time. While traditional package management |
| 307 | +and program installation often involves manual work to set up and maintain, Mill's handling |
| 308 | +of dependencies in this bootstrap process is largely hands-off and automated. |
| 309 | + |
| 310 | +Despite the complexity described above, Mill's zero-install bootstrap process means that the user |
| 311 | +never needs to deal with any of it. They can immediately start using `./mill __.compile` |
| 312 | +or any other command on a clean system, and the only noticeable |
| 313 | +difference would be the first command taking longer than normal and logging indicating that |
| 314 | +these downloads are happening. And once caches are warm, running `./mill` feels just as fast |
| 315 | +as running any pre-installed binary or executable. |
| 316 | + |
| 317 | + |
| 318 | +== Conclusion |
| 319 | + |
| 320 | +In this article, we discussed how the Mill build tool implements its zero-step setup |
| 321 | +process. This removes the zoo of manual installs that a Java developer would traditionally |
| 322 | +need to set up and maintain (`mvn`, `jenv`, `sdkman`, `java`, etc.), and replaces it with a single |
| 323 | +`./mill` script that automatically bootstraps all necessary tools and runtimes for the user, |
| 324 | +letting them begin their work on a codebase without any prior setup. |
| 325 | + |
| 326 | +This is done by carefully arranging the bootstrapping |
| 327 | +process for the Mill project: starting from a `.sh` script (or `.bat` on Windows), using it |
| 328 | +to bootstrap a native binary, using the native binary to bootstrap a JVM, and using the JVM |
| 329 | +to bootstrap the user-defined dependencies they need to build their project. Although |
| 330 | +the Mill build tool itself and user projects built with Mill may both have large transitive |
| 331 | +dependency trees, the bootstrapping process is arranged in a way that it can all be handled |
| 332 | +entirely automatically. |
| 333 | + |
| 334 | +For the purposes of this article, we simplified and skimmed over a lot of things: |
| 335 | + |
| 336 | +- The intricacies of writing equivalent `.sh` and `.bat` scripts to start bootstrapping |
| 337 | + |
| 338 | +- https://github.com/oracle/graal/issues/9215[Graal native image not working on windows-aarch64], |
| 339 | + meaning such systems still need `java` pre-installed |
| 340 | + |
| 341 | +- xref:mill::javalib/dependencies.adoc#_repository_config[Using a different package repository] |
| 342 | + instead of the default Maven Central |
| 343 | + |
| 344 | +- xref:mill::fundamentals/bundled-libraries.adoc#_requests_scala[Downloading and |
| 345 | + caching external non-Maven-Central resources] as part of your build |
| 346 | + |
| 347 | +- xref:mill::cli/build-header.adoc#_mill_jvm_version[Explicitly pinning the JVM version] |
| 348 | + to ensure consistency regardless of what may be installed locally |
| 349 | + |
| 350 | +- Use of `./mill __.prepareOffline`, to force Mill to download dependencies up-front so they |
| 351 | + can be used later without further downloads (e.g. in an internet-restricted environment) |
| 352 | + |
| 353 | +Although this article covers bootstrapping Java and JVM |
| 354 | +applications, the same principles could apply to bootstrap any non-trivial project and its |
| 355 | +dependencies: starting from a shell script, bootstrapping a native binary, which then |
| 356 | +bootstraps the messy dependencies that are required for any real-world project. |
| 357 | +With Mill, we take advantage of this to try and simplify Java development: codebases built |
| 358 | +using Mill can be built via `./mill` out of the box, providing everything you need for |
| 359 | +development without any prior setup. We hope that this will make it easier for people to |
| 360 | +contribute to such projects, whether in a proprietary setting or open-source. |
| 361 | + |
| 362 | +Zero-step installation workflows are really the only thing that scales as a project grows. |
| 363 | +While multiple 1-step installs can add up and become a long N-step installation process, |
| 364 | +multiple zero-step installs will always remain zero-step even if added together, regardless |
| 365 | +of how large and messy the project gets. Hopefully you've come away from this article |
| 366 | +with an appreciation for how Mill builds upon prior art to come up with its zero-step setup |
| 367 | +process, so next time the opportunity arises you can implement something similar in your |
| 368 | +own projects. |