Commit 8c738df

Zero-Setup All-in-One Java Tooling via Mill Bootstrap Scripts (#5903)
1 parent 3f29205 commit 8c738df

3 files changed

+372
-0
lines changed

website/blog/modules/ROOT/nav.adoc

Lines changed: 1 addition & 0 deletions
@@ -1,4 +1,5 @@
11
* xref:blog::index.adoc[_The Mill Build Engineering Blog_]
2+
* xref:16-zero-setup.adoc[]
23
* xref:15-android-build-flow.adoc[]
34
* xref:14-bash-zsh-completion.adoc[]
45
* xref:13-mill-build-tool-v1-0-0.adoc[]
Lines changed: 368 additions & 0 deletions
@@ -0,0 +1,368 @@
1+
= Zero-Setup All-in-One Java Tooling via Mill Bootstrap Scripts
2+
3+
// tag::header[]
4+
:author: Li Haoyi
5+
:revdate: 24 September 2025
6+
7+
_{author}, {revdate}_
8+
9+
Getting the software you need installed onto your machine is a common point of
10+
friction, whether you're on macOS finding
11+
https://github.com/orgs/Homebrew/discussions/1177[Homebrew being terribly slow] or on Ubuntu finding
12+
https://www.reddit.com/r/Ubuntu/comments/1j3ldpm/why_are_all_my_apt_programs_so_outdated/[the versions available are all outdated].
13+
Setting up Java projects in particular often involves a multi-step process to install `mvn`,
14+
`sdkman`, `jenv`, and the `java` version you need.
15+
16+
The Mill build tool does something interesting here: it requires no system-wide installation
17+
at all to build your Java projects! You can check out any codebase built with Mill on a bare
18+
Linux/Mac/Windows machine, build it without any prior setup using its `./mill` bootstrap
19+
script, and Mill will automatically download and cache itself, any JVMs, and any third-party
20+
libraries and tools necessary. For example, the `./mill __.compile` below is all that is needed
21+
to compile all modules in a newly-checked-out Mill project on a clean machine, greatly
22+
simplifying building your project on diverse dev and CI environments:
23+
24+
```console
> curl -L https://github.com/com-lihaoyi/cask/archive/refs/heads/master.zip -o cask.zip

> unzip cask.zip && cd cask-master

> ./mill __.compile
```
31+
32+
This blog post explores how Mill's zero-install workflow works: the status quo,
33+
the interesting innovations that Mill builds upon, and Mill's unique ideas
34+
that let it achieve this zero-setup usage to greatly simplify getting started
35+
working with Java, Scala, or Kotlin projects.
36+
37+
// end::header[]
38+
39+
== 1-Step and Multi-Step Installation
40+
41+
Perhaps the most common way software is installed is via package managers like `apt`, `yum`, or
42+
`brew`. For example, the incantation to install `git` on an Amazon Linux machine is:
43+
44+
```console
> sudo yum install git
```
47+
48+
Depending on how nicely the software you are installing is packaged, this may or may not require
49+
additional commands to install transitive dependencies. For example, when setting up a codebase
50+
for development, you may need to:
51+
52+
- `apt install` the Python version you want to use
53+
- `pip install` the libraries you want to use
54+
- Also `apt install` any native dependencies your Python code needs to run
55+
56+
In the JVM ecosystem, it is common to need to:
57+
58+
* `apt install openjdk-17-jdk` and then
59+
* `apt install maven` (which provides the `mvn` command)
60+
* It's also common to install https://sdkman.io/[SdkMan] or https://github.com/jenv/jenv[JEnv]
61+
to help manage your JVM, e.g.
62+
** `curl -s "https://get.sdkman.io" | bash`, `sdk install java 17-tem`
63+
** `sudo apt install jenv`, `jenv local 17`
64+
65+
Such multi-step workflows are common when building a software project, as the codebase and
66+
its dependencies are never as nicely packaged as distributed binaries like `git`. Using a
67+
language-specific package manager or build tool can help, but since the build tool itself
68+
needs to be installed it remains a multi-step workflow getting everything set up and ready to use.
69+
70+
There are other ways to install things apart from package managers: `curl <url> | bash` is common,
71+
as is manually downloading binaries to put on your `PATH`. But all of these have a similar problem:
72+
the installation must be done _manually_ and happen _before_ you can begin working on your project.
73+
This gives a lot of room for things to go wrong, for example:
74+
75+
1. **Things falling out of sync**: the installation commands on macOS using `brew` will be different
76+
from Amazon Linux using `yum` or Ubuntu using `apt`, so it's terribly easy to end up with
77+
subtly different sets of packages on each. This results in tedious busy-work trying to keep the
78+
various environments in sync.
79+
80+
2. **Steps done manually and not reproduced**: it is always tempting to `apt install` or
81+
`pip install` something locally to get things working, but that leaves you open
82+
for your code failing on CI workers due to missing installs, or failing on your co-workers'
83+
laptops which are missing some manual steps.
84+
85+
3. **The number of steps growing**: while a 1-step install may seem fine, a large codebase
86+
may have many packages and tools requiring 1-step installs, resulting in an installation
87+
process with dozens of steps that can be tedious if run manually and fragile if scripted.
88+
89+
Multi-step setup workflows are the norm, and 1-step setup workflows are something people often
90+
strive towards. But it's worth asking: could we do better?
91+
92+
== Maven & Gradle Bootstrap Scripts
93+
94+
One interesting innovation on the installation process is the use of a _bootstrap script_. These
95+
were popularized by the https://gradle.org/[Gradle build tool] as a
96+
https://docs.gradle.org/current/userguide/gradle_wrapper.html[./gradlew] bootstrap script you
97+
commit to the repository root. The bootstrap script embeds the version of Gradle you
98+
want to use, and downloads and caches that specific version when it is invoked. That means
99+
you can check out a project's code and run:
100+
101+
```console
> ./gradlew build
```
104+
105+
And be sure you are using the same version of Gradle that everyone else is also using
106+
to build that project. This can be very handy: you now no longer need to worry about installing
107+
the "right version" of Gradle on your colleagues' laptops, on CI, etc. The bootstrap
108+
script ensures that anyone working on the project - human or otherwise - will be using the
109+
same version.
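As a rough illustration of how that pinning works: in a typical Gradle wrapper setup, the distribution to download is recorded under the `distributionUrl` key in `gradle/wrapper/gradle-wrapper.properties`, which is committed alongside `./gradlew`. The real lookup is performed by the wrapper's own Java code; the `pinned_distribution_url` function below is a hypothetical helper, written only to show where the pinned version lives.

```bash
# Hypothetical helper (not part of Gradle): print the distribution URL
# pinned in a wrapper properties file.
pinned_distribution_url() {
  local props_file="$1"
  # Java .properties files escape ':' as '\:', so undo that escaping.
  sed -n 's/^distributionUrl=//p' "$props_file" | sed 's/\\:/:/g'
}
```

Running `pinned_distribution_url gradle/wrapper/gradle-wrapper.properties` in a Gradle project should print the URL of the exact distribution that every contributor and CI worker ends up downloading.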
110+
111+
Furthermore, as tools like Gradle automatically resolve the application-level
112+
dependencies required by the project they are building, the user does not need to install
113+
those manually. Any `build` or `install` or `test` command results in all necessary
114+
dependencies being automatically downloaded and cached as necessary. More recently, the
115+
https://maven.apache.org/[Maven build tool] has adopted a similar convention with
116+
https://maven.apache.org/tools/wrapper/[./mvnw] scripts serving the same purpose.
117+
118+
However, one limitation of the Maven and Gradle approach to bootstrap scripts is that they rely
119+
on `java` being pre-installed to begin the bootstrapping process. Without `java`, they cannot
120+
run at all, as shown below:
121+
122+
```console
> curl -L https://github.com/netty/netty/archive/refs/heads/4.2.zip -o netty.zip
> unzip netty.zip && cd netty-4.2
> ./mvnw clean install
/usr/bin/which: no javac in (/home/ec2-user/.local/bin:/home/ec2-user/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin)
Error: JAVA_HOME is not defined correctly.
```
129+
130+
So even with the `./gradlew` or `./mvnw` bootstrap scripts, working with Gradle or Maven still
131+
ends up being a 1-step installation process: you need to install `java` (and the right version
132+
of Java!) before you begin, possibly using SdkMan or JEnv, each of which itself needs to
133+
be installed first. Thus although these bootstrap scripts mitigate
134+
the problem - differing Java versions tend to be more forgiving than differing Maven/Gradle
135+
versions - they haven't completely solved it.
136+
137+
Why do these bootstrap scripts require `java` to be installed? It's
138+
because they don't want to put non-trivial bootstrapping logic into `.sh` or `.bat` scripts,
139+
and, as JVM build tools, writing their bootstrapping logic in Java running on the JVM makes sense.
140+
But that doesn't seem like a hard requirement, and it should be possible to make a bootstrapping
141+
binary that can run without `java` or any other runtime pre-installed. That is the approach
142+
that Mill takes.
143+
144+
== Mill's Zero-Setup Bootstrap Scripts
145+
146+
Mill's xref:mill::cli/installation-ide.adoc#_bootstrap_scripts[./mill bootstrap scripts] are
147+
similar to `./mvnw` or `./gradlew`, but differ in that
148+
by default they do not require `java` pre-installed in order to run. Instead, `./mill` downloads
149+
a native platform-specific binary that then performs the bootstrapping process:
150+
151+
```
mill-dist-native-linux-aarch64-1.0.5.exe
mill-dist-native-linux-amd64-1.0.5.exe
mill-dist-native-mac-aarch64-1.0.5.exe
mill-dist-native-mac-amd64-1.0.5.exe
```
157+
158+
These `.exe` files are JVM executables, but compiled to native platform-specific binaries using
159+
the xref:7-graal-native-executables.adoc[Graal Native Image compiler]. Apart from the benefits
160+
of reduced startup time and memory usage, the key property we care about is that native image
161+
binaries also can run on bare environments without a `java` runtime pre-installed. This lets
162+
us write our non-trivial bootstrapping logic in Java and run it without needing a
163+
system-wide `java` distribution pre-installed on the machine.
164+
165+
As native image binaries are OS/CPU-specific, we need some logic to pick the right binary for the
166+
machine the bootstrap script is running on, and that logic needs to run in the `.sh` or `.bat`
167+
bootstrap script because we need it to run _before_ the native image binary has been downloaded.
168+
The `.sh` version of this implemented using `uname` is as follows:
169+
170+
```bash
ARTIFACT_SUFFIX=""
set_artifact_suffix(){
  # `expr substr` is not available everywhere (e.g. macOS), hence the 2>/dev/null:
  # on failure the comparison is simply false and we fall through to the Darwin check.
  if [ "$(expr substr $(uname -s) 1 5 2>/dev/null)" = "Linux" ]; then
    if [ "$(uname -m)" = "aarch64" ]; then
      ARTIFACT_SUFFIX="-native-linux-aarch64"
    else
      ARTIFACT_SUFFIX="-native-linux-amd64"
    fi
  elif [ "$(uname)" = "Darwin" ]; then
    # macOS reports Apple Silicon as "arm64" rather than "aarch64"
    if [ "$(uname -m)" = "arm64" ]; then
      ARTIFACT_SUFFIX="-native-mac-aarch64"
    else
      ARTIFACT_SUFFIX="-native-mac-amd64"
    fi
  else
    echo "This native mill launcher supports only Linux and macOS." 1>&2
    exit 1
  fi
}
```
191+
192+
The bootstrap script can then assemble this into a download URL to `curl` down the relevant file
193+
from the Maven Central package repository:
194+
195+
```bash
DOWNLOAD_URL="https://repo1.maven.org/maven2/com/lihaoyi/mill-dist${ARTIFACT_SUFFIX}/${MILL_VERSION}/mill-dist${ARTIFACT_SUFFIX}-${MILL_VERSION}.${DOWNLOAD_EXT}"
curl -f -L -o "${DOWNLOAD_FILE}" "${DOWNLOAD_URL}"
```
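To make the URL template concrete, here is a small sketch that wraps it in a function so it can be tried with explicit values. The function name `mill_download_url` and its argument order are assumptions made for this example and do not appear in the real script.

```bash
# Sketch: assemble the launcher download URL from its three parts.
# Function name and argument order are hypothetical.
mill_download_url() {
  local artifact_suffix="$1"   # e.g. "-native-linux-amd64"
  local mill_version="$2"      # e.g. "1.0.5"
  local download_ext="$3"      # e.g. "exe"
  local artifact="mill-dist${artifact_suffix}"
  echo "https://repo1.maven.org/maven2/com/lihaoyi/${artifact}/${mill_version}/${artifact}-${mill_version}.${download_ext}"
}
```

For example, `mill_download_url -native-linux-amd64 1.0.5 exe` prints a URL ending in `mill-dist-native-linux-amd64-1.0.5.exe`, matching the file names listed earlier.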
199+
200+
We can then execute the downloaded file, taking any command line arguments given to the bootstrap
201+
script and forwarding them to the native binary:
202+
203+
```bash
exec "${DOWNLOAD_FILE}" "$@"
```
206+
207+
The snippets above are somewhat simplified - the
208+
https://github.com/com-lihaoyi/mill/blob/1.0.5/dist/scripts/src/mill.sh[actual bootstrap script]
209+
contains a lot more logic to handle backwards compatibility, version configuration, Windows
210+
support, and other necessary details. But at a high level, they illustrate what Mill's
211+
bootstrap script does: it picks and downloads the native binary of the configured version,
212+
operating system, and CPU architecture, and executes it to begin the Mill bootstrapping process.
213+
This lets it bootstrap from _shell/bat script_ to _native image binary_ without any prior
214+
installation of `java` or other system-wide dependencies, and from there we can bootstrap the
215+
rest of the way.
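The download-once-then-reuse behaviour can be sketched as follows. This is not the code from the actual bootstrap script: the `ensure_launcher` and `fetch_to` names, the `CACHE_DIR` variable, and the cache layout are all assumptions chosen for illustration.

```bash
# Sketch only: download the launcher once, then reuse the cached copy.
# fetch_to, ensure_launcher, CACHE_DIR and the directory layout are hypothetical.
fetch_to() {
  # $1 = URL, $2 = destination file
  curl -f -L -o "$2" "$1"
}

ensure_launcher() {
  local url="$1" version="$2" suffix="$3"
  # Keying the cache on the version lets several versions coexist on disk.
  local cache_dir="${CACHE_DIR:-$HOME/.cache/mill-sketch}/${version}"
  local target="${cache_dir}/mill${suffix}"
  if [ ! -x "$target" ]; then
    mkdir -p "$cache_dir"
    # Download under a temporary name first so an interrupted download
    # never leaves a half-written file at the final path.
    local tmp="${target}.tmp.$$"
    fetch_to "$url" "$tmp" || { rm -f "$tmp"; return 1; }
    chmod +x "$tmp"
    mv "$tmp" "$target"
  fi
  echo "$target"
}

# Usage (commented out so the sketch can be sourced safely):
# launcher="$(ensure_launcher "$DOWNLOAD_URL" "$MILL_VERSION" "$ARTIFACT_SUFFIX")" || exit 1
# exec "$launcher" "$@"
```

Because the final path is only written by `mv`, a second invocation either finds a complete executable or downloads again, which is the property that makes the first run slow and every later run fast.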
216+
217+
== Bootstrapping a Full JVM Environment
218+
219+
Once we execute our native image binary, we then have an opportunity to run real JVM code (as
220+
opposed to sketchy shell scripts) to proceed with bootstrapping. When someone runs
221+
`./mill __.compile` to compile all modules in a repository, and the native image bootstrap
222+
launcher has been downloaded as described above, we can then use it to:
223+
224+
1. **Download the JVM that Mill needs to run**, as Graal Native Images have limitations around
225+
classloading that make them unsuitable for the Mill daemon process
226+
227+
2. **Download the `.jar` files that make up the Mill daemon process**, since Mill is implemented
228+
as a mixed Java/Scala codebase which compiles to `.class` files and is distributed as ``.jar``s
229+
230+
3. **Start the Mill daemon process, which runs those `.jar` files on the downloaded JVM**
231+
232+
Once we have the Mill daemon process running, further steps are necessary to bootstrap the Mill
233+
build dependencies and user code dependencies:
234+
235+
1. **Resolve any `.jar` files necessary for Mill's build logic, and any user-configured plugins**,
236+
and load them into a classloader to invoke the build
237+
238+
2. **Resolve any `.jar` files or JVM necessary for user modules to compile and run**
239+
240+
3. Finally, **compile the user code using any `.jar` files and any custom JVM that it requires**.
241+
242+
The various `.jar` files are typically downloaded from
243+
https://central.sonatype.com/[Maven Central], which is the standard package repository for JVM libraries.
244+
The JVMs themselves come from the various provider download URLs that we reference via
245+
the https://github.com/coursier/jvm-index[Coursier JVM Index]. Apart from libraries and JVMs,
246+
all tools necessary for your Java/Scala/Kotlin development are also bootstrapped the
247+
same way - xref:mill::javalib/linting.adoc#_linting_with_checkstyle[Checkstyle],
248+
xref:mill::javalib/linting.adoc#_linting_with_errorprone[ErrorProne],
249+
xref:mill::scalalib/linting.adoc#_autoformatting_with_scalafmt[ScalaFmt],
250+
xref:mill::kotlinlib/linting.adoc#_linting_with_ktlint[KtLint], etc. - so you can use them
251+
without needing prior system-wide setup or installation.
252+
253+
Note that we only do these steps once the native image bootstrap launcher has been downloaded
254+
as they require non-trivial logic: resolving JVM versions to download URLs, resolving `.jar`
255+
files from https://maven.apache.org/repositories/artifacts.html[group-artifact-version coordinates],
256+
adjudicating version conflicts, etc. This is too complicated to implement in `.sh` and `.bat`
257+
scripts, so Mill handles that using https://github.com/coursier/coursier[Coursier] which is
258+
a common JVM dependency resolution library also used by https://bazel.build/[Bazel] and
259+
https://www.scala-sbt.org/[SBT].
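To give a feel for the simplest part of that resolution, the sketch below maps a single group-artifact-version coordinate to the URL of its `.jar` under the standard Maven repository layout. The function name `maven_jar_url` is an assumption for this example. A real resolver such as Coursier does far more (reading POM metadata, following transitive dependencies, reconciling version conflicts, caching), which is exactly why this logic lives in JVM code rather than in the shell script.

```bash
# Sketch: map "group:artifact:version" to the jar URL under the standard
# Maven repository layout. Function name and default repo are hypothetical choices.
maven_jar_url() {
  local coord="$1" repo="${2:-https://repo1.maven.org/maven2}"
  local group artifact version group_path
  IFS=: read -r group artifact version <<< "$coord"
  # Dots in the group ID become directory separators.
  group_path="$(printf '%s' "$group" | tr '.' '/')"
  echo "${repo}/${group_path}/${artifact}/${version}/${artifact}-${version}.jar"
}
```

For instance, `maven_jar_url org.apache.commons:commons-lang3:3.14.0` prints `https://repo1.maven.org/maven2/org/apache/commons/commons-lang3/3.14.0/commons-lang3-3.14.0.jar`.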
260+
261+
The final bootstrapping process of `./mill __.compile` looks something like this, with the
262+
solid lines indicating local steps in the bootstrapping process, and the dashed lines
263+
indicating downloads from package repositories:
264+
265+
```graphviz
digraph G {
  node [shape=box width=0 height=0 style=filled fillcolor=white]
  subgraph cluster0{
    color=white

    "./mill" -> "native image launcher binary" -> "daemon jars" -> "daemon process" -> "build jars" -> "build classloader" -> "user code dependency jars"
    "native image launcher binary" -> "daemon JVM" -> "daemon process"
    "build classloader" -> "user code JVM"

    "user code JVM" -> "__.compile"
    "user code dependency jars" -> "__.compile"
    "user code sources" -> "__.compile"
  }
  "JVM Vendor" [style=dashed]

  "Maven Central" [style=dashed]
  "Maven Central" -> "native image launcher binary" [style=dashed arrowhead=empty weight=0]
  "Maven Central" -> "daemon jars" [style=dashed arrowhead=empty weight=0]
  "JVM Vendor" -> "daemon JVM" [style=dashed arrowhead=empty weight=0]
  "Maven Central" -> "build jars" [style=dashed arrowhead=empty weight=0]
  "Maven Central" -> "user code dependency jars" [style=dashed arrowhead=empty weight=0]
  "JVM Vendor" -> "user code JVM" [style=dashed arrowhead=empty weight=0]
  {"Maven Central"; "JVM Vendor"; "./mill"; rank=same}
}
```
291+
292+
Although this may seem like a lot of steps, all of them are completely automatic, and generally
293+
invisible to the user:
294+
295+
* Jars and JVMs are downloaded when needed, in parallel where possible, and cached for future use.
296+
297+
* Different versions of libraries and packages are assigned different caches on disk and can
298+
co-exist on the same machine.
299+
300+
* Even different versions of the JVM can be downloaded and used
301+
at the same time without issue, e.g. if different user modules need to compile and run with
302+
different library or JVM versions.
303+
304+
This is unlike packages installed via `brew` or `apt` or `yum`, where installation often
305+
has to be done manually, and typically only a single version of a package can be "installed"
306+
or "active" globally on a system at any one point in time. While traditional package management
307+
and program installation often involves manual work to set up and maintain, Mill's handling
308+
of dependencies in this bootstrap process is largely hands-off and automated.
309+
310+
Despite the complexity described above, Mill's zero-install bootstrap process means that the user
311+
never needs to deal with any of it. They can immediately start using `./mill __.compile` or
312+
any other command on a clean system, and the only noticeable
313+
difference would be the first command taking longer than normal and logging indicating that
314+
these downloads are happening. And once caches are warm, running `./mill` feels just as fast
315+
as running any pre-installed binary or executable.
316+
317+
318+
== Conclusion
319+
320+
In this article, we discussed how the Mill build tool implements its zero-step setup
321+
process. This removes the zoo of manual installs that a Java developer would traditionally
322+
need to set up and maintain (`mvn`, `jenv`, `sdkman`, `java`, etc.), and replaces it with a single
323+
`./mill` script that automatically bootstraps all necessary tools and runtimes for the user,
324+
letting them begin their work on a codebase without any prior setup.
325+
326+
This is done by carefully arranging the bootstrapping
327+
process for the Mill project: starting from a `.sh` script (or `.bat` on Windows), using it
328+
to bootstrap a native binary, using the native binary to bootstrap a JVM, and using the JVM
329+
to bootstrap the user-defined dependencies they need to build their project. Although both
330+
the Mill build tool itself and user projects built with Mill may have large transitive
331+
dependency trees, the bootstrapping process is arranged in a way that it can all be handled
332+
entirely automatically.
333+
334+
For the purposes of this article, we simplified and skimmed over a lot of things:
335+
336+
- The intricacies of writing equivalent `.sh` and `.bat` scripts to start bootstrapping
337+
338+
- https://github.com/oracle/graal/issues/9215[Graal native image not working on windows-aarch64],
339+
meaning such systems still need `java` pre-installed
340+
341+
- xref:mill::javalib/dependencies.adoc#_repository_config[Using a different package repository]
342+
instead of the default Maven Central
343+
344+
- xref:mill::fundamentals/bundled-libraries.adoc#_requests_scala[Downloading and
345+
caching external non-Maven-Central resources] as part of your build
346+
347+
- xref:mill::cli/build-header.adoc#_mill_jvm_version[Explicitly pinning the JVM version]
348+
to ensure consistency regardless of what may be installed locally
349+
350+
- Use of `./mill __.prepareOffline`, to force Mill to download dependencies up-front so they
351+
can be used later without further downloads (e.g. in an internet-restricted environment)
352+
353+
Although this article covers bootstrapping Java and JVM
354+
applications, the same principles could apply to bootstrap any non-trivial project and its
355+
dependencies: starting from a shell script, bootstrapping a native binary, which then
356+
bootstraps the messy dependencies that are required for any real-world project.
357+
With Mill, we take advantage of this to try and simplify Java development: codebases built
358+
using Mill can be built via `./mill` out of the box, providing everything you need for
359+
development without any prior setup. We hope that this will make it easier for people to
360+
contribute to such projects, whether in a proprietary setting or open-source.
361+
362+
Zero-step installation workflows are really the only thing that scales as a project grows.
363+
While multiple 1-step installs can add up and become a long N-step installation process,
364+
multiple zero-step installs will always remain zero-step even if added together, regardless
365+
of how large and messy the project gets. Hopefully you've come away from this article
366+
with an appreciation for how Mill builds upon prior art to come up with its zero-step setup
367+
process, so next time the opportunity arises you can implement something similar in your
368+
own projects.
