Description
The goal of this ticket is to make the out/ folder contents more reproducible, such that it contains the same bytes and hashes regardless of the user's filesystem layout outside of that folder. This would allow re-using the out/ folder as a build cache between different machines that may have the checkout in different places (e.g. /Users/alice/my-repository vs /Users/charlie/my-repository), both coarse-grained (e.g. by sending over a zip file) and fine-grained (via the Bazel remote cache protocol).
The main thing that needs to happen is that every os.Path and mill.api.PathRef that is serialized within a "known" directory needs to be normalized to a path relative to an abstract reference to that known directory. e.g.
- /Users/alice/my-repository/out/foo/bar.dest/qux should be serialized as $WORKSPACE/out/foo/bar.dest/qux
- /Users/lihaoyi/Library/Caches/Coursier/v1/https/repo1.maven.org/maven2/org/scala-lang/scala-library/2.13.14/scala-library-2.13.14.jar should be serialized as $COURSIER_CACHE/v1/https/repo1.maven.org/maven2/org/scala-lang/scala-library/2.13.14/scala-library-2.13.14.jar
- /Users/alice/thing-outside-repository should be serialized as $HOME/thing-outside-repository
AFAIK the necessary known roots should all be available globally (e.g. mill.api.workspace.WorkspaceRoot.workspaceRoot, os.home, sys.env("COURSIER_CACHE")). It should be easy enough to add to the serialization logic:
mill.api.PathRef serialization, in mill/main/api/src/mill/api/PathRef.scala (lines 175 to 197 in e0a2c93):

    implicit def jsonFormatter: RW[PathRef] = upickle.default.readwriter[String].bimap[PathRef](
      p => p.toString(),
      s => {
        val Array(prefix, valid0, hex, pathString) = s.split(":", 4)

        val path = os.Path(pathString)
        val quick = prefix match {
          case "qref" => true
          case "ref" => false
        }
        val validOrig = valid0 match {
          case "v0" => Revalidate.Never
          case "v1" => Revalidate.Once
          case "vn" => Revalidate.Always
        }
        // Parsing to a long and casting to an int is the only way to make
        // round-trip handling of negative numbers work =(
        val sig = java.lang.Long.parseLong(hex, 16).toInt
        val pr = PathRef(path, quick, sig, revalidate = validOrig)
        validatedPaths.value.revalidateIfNeededOrThrow(pr)
        pr
      }
    )

os.Path serialization, in mill/main/api/src/mill/api/JsonFormatters.scala (lines 27 to 31 in e0a2c93):

    implicit val pathReadWrite: RW[os.Path] = upickle.default.readwriter[String]
      .bimap[os.Path](
        _.toString,
        os.Path(_)
      )
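As a rough illustration of the normalization described above, the sketch below maps absolute paths to placeholder-prefixed strings and back. It is not Mill code: `PathNormalizer`, its method names, and the example roots are all hypothetical, and it uses `java.nio.file.Path` instead of `os.Path` so that it runs without extra dependencies. It assumes that the known roots are passed in explicitly and that more specific roots are listed before broader ones.

```scala
import java.nio.file.{Path, Paths}

// Hypothetical helper, not part of Mill. Roots are checked in order, so more
// specific roots (workspace, Coursier cache) must come before broader ones
// (the home directory).
final class PathNormalizer(roots: Seq[(String, Path)]) {

  // Absolute path -> placeholder-prefixed string, e.g. "$WORKSPACE/out/foo".
  // Paths outside every known root are returned unchanged.
  def normalize(p: Path): String =
    roots.collectFirst {
      case (name, root) if p.startsWith(root) =>
        val rel = root.relativize(p).toString.replace('\\', '/')
        if (rel.isEmpty) name else s"$name/$rel"
    }.getOrElse(p.toString)

  // Placeholder-prefixed string -> absolute path on the current machine.
  def denormalize(s: String): Path =
    roots.collectFirst {
      case (name, root) if s == name => root
      case (name, root) if s.startsWith(name + "/") =>
        root.resolve(s.drop(name.length + 1))
    }.getOrElse(Paths.get(s))
}

// Example roots for one machine (assumed values, for illustration only).
val normalizer = new PathNormalizer(Seq(
  "$WORKSPACE" -> Paths.get("/Users/alice/my-repository"),
  "$COURSIER_CACHE" -> Paths.get("/Users/alice/Library/Caches/Coursier"),
  "$HOME" -> Paths.get("/Users/alice")
))
```

In the formatters quoted above, something like `normalize` could presumably be applied where the path is currently written with `toString`, and `denormalize` where `os.Path(...)` parses it back. For `PathRef`, only the path portion after the third `:` would need to change.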
Apart from PathRef and Path, we will also need to deal with:
- Files in out/ which are naturally non-deterministic: mill-profile.json, mill-chrome-profile.json, mill-server/* and mill-no-server/*, etc.
- Modified times are also expected to vary. These may need to be zeroed out in the process of making zip and jar files such that they do not affect the byte contents, and ignored as part of any equivalence comparison.
- Any foo.json files belonging to workers can also be expected to differ since they contain the toString of the worker, and may need to be renamed to foo.worker.json or similar to make them identifiable.
- There will also be inherent differences between files generated on different platforms (e.g. native binaries). This is fine for now, and likely unavoidable.
- There may be other files that need to be made reproducible that are not listed here.
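For the modified-time point, one possible approach is to write every archive entry with the same fixed timestamp and in a stable order. The sketch below is only an illustration of that idea, not how Mill builds its jars: `deterministicZip` and `FixedTimestamp` are hypothetical names, and it assumes Java 9 or newer for `ZipEntry.setTimeLocal`, which stores the given local date-time in the entry's DOS date and time fields. The date is deliberately set a little after 1980-01-01, on the assumption that timestamps at the very start of the DOS range may be treated as out of range and get an extra time-zone-dependent field.

```scala
import java.io.ByteArrayOutputStream
import java.time.LocalDateTime
import java.util.zip.{ZipEntry, ZipOutputStream}

// Arbitrary fixed timestamp (an assumption of this sketch), chosen inside the
// range of the MS-DOS date/time format used by zip entries.
val FixedTimestamp: LocalDateTime = LocalDateTime.of(1980, 2, 1, 0, 0, 0)

// Hypothetical helper: writes entries sorted by name, each with the same
// timestamp, so that equal inputs give equal archive bytes on the same JDK.
def deterministicZip(entries: Map[String, Array[Byte]]): Array[Byte] = {
  val bytes = new ByteArrayOutputStream()
  val zip = new ZipOutputStream(bytes)
  try {
    for ((name, content) <- entries.toSeq.sortBy(_._1)) {
      val entry = new ZipEntry(name)
      entry.setTimeLocal(FixedTimestamp) // replaces the default "now" timestamp
      zip.putNextEntry(entry)
      zip.write(content)
      zip.closeEntry()
    }
  } finally zip.close()
  bytes.toByteArray
}
```

Sorting the entries matters as much as the timestamp here, since filesystem listing order can also vary between machines.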
The success criteria would be a test in integration/feature/ that:
- Copies the code in example/scalalib/web/5-webapp-scalajs-shared into two separate subfolders.
  - The choice of example/scalalib/web/5-webapp-scalajs-shared is somewhat arbitrary, but should give us good coverage of a variety of Mill module and task types, exercising a wide range of code paths
- Runs ./mill runBackground && ./mill clean runBackground && ./mill jar && ./mill assembly in each folder
  - (one with a custom COURSIER_CACHE and -Duser.home passed in),
- Does a file-by-file and byte-for-byte comparison against the two out folders with some normalization criteria (ignoring the expected-to-differ files and ignoring mtimes), to assert that the out/ folder is byte-for-byte identical
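The comparison step could look roughly like the sketch below. It is a sketch under stated assumptions, not the actual integration test: `relativeFiles`, `differingFiles`, and the `ignoredPrefixes` list are hypothetical, the ignore list only covers the files named in this ticket, and `scala.jdk.CollectionConverters` assumes Scala 2.13 or newer. Because only file contents are read, modified times are ignored automatically.

```scala
import java.nio.file.{Files, Path}
import scala.jdk.CollectionConverters._

// Hypothetical ignore list: paths (relative to out/) that are expected to differ.
val ignoredPrefixes = Seq(
  "mill-profile.json",
  "mill-chrome-profile.json",
  "mill-server/",
  "mill-no-server/"
)

// All regular files under `root`, keyed by their '/'-separated relative path.
def relativeFiles(root: Path): Map[String, Path] = {
  val stream = Files.walk(root)
  try {
    stream.iterator().asScala
      .filter(p => Files.isRegularFile(p))
      .map(p => root.relativize(p).toString.replace('\\', '/') -> p)
      .filterNot { case (rel, _) => ignoredPrefixes.exists(prefix => rel.startsWith(prefix)) }
      .toMap
  } finally stream.close()
}

// Relative paths that exist on only one side or whose bytes differ.
def differingFiles(a: Path, b: Path): Seq[String] = {
  val filesA = relativeFiles(a)
  val filesB = relativeFiles(b)
  (filesA.keySet ++ filesB.keySet).toSeq.sorted.filter { rel =>
    (filesA.get(rel), filesB.get(rel)) match {
      case (Some(x), Some(y)) =>
        !java.util.Arrays.equals(Files.readAllBytes(x), Files.readAllBytes(y))
      case _ => true // missing on one side
    }
  }
}
```

A test could then assert that `differingFiles(outA, outB)` is empty, and print the returned paths on failure to show which files still need to be made reproducible.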
Related issues with prior discussion: