Commit 46ff7e2

DavidSpickett authored and asl committed
Use <mark> for highlights.
1 parent ef3cbf2 commit 46ff7e2

1 file changed: 13 additions, 13 deletions

content/posts/2019-11-07-deterministic-builds-with-clang-and-lld.md

Lines changed: 13 additions & 13 deletions
@@ -45,19 +45,19 @@ Basic determinism needs tools (compiler, linker, etc) that are deterministic. To
4545

4646
The C standard defines the predefined macros `__TIME__` and `__DATE__` that expand to the time a source file is compiled. Several compilers, including clang, also define the non-standard `__TIMESTAMP__`. This is inherently nondeterministic. You should not use these macros, and you can use `-Wdate-time` to make the compiler emit a warning when they are used.
4747

48-
If they are used in third-party code you don’t control, you can use `-Wno-builtin-macro-redefined -D__DATE__= -D__TIME__= -D__TIMESTAMP__=` to make them expand to nothing.
48+
If they are used in third-party code you don’t control, you can use <mark>-Wno-builtin-macro-redefined -D__DATE__= -D__TIME__= -D__TIMESTAMP__=</mark> to make them expand to nothing.
4949

50-
When targeting Windows, clang and clang-cl by default also embed the current time in a timestamp field in the output .obj file, because Microsoft’s link.exe in `/incremental` mode silently mislinks files if that field isn’t set correctly. If you don’t use link.exe’s `/incremental` flag, or if you link with lld-link, you should pass `/Brepro` to clang-cl to make it not write the current timestamp into its output.
50+
When targeting Windows, clang and clang-cl by default also embed the current time in a timestamp field in the output .obj file, because Microsoft’s link.exe in `/incremental` mode silently mislinks files if that field isn’t set correctly. If you don’t use link.exe’s `/incremental` flag, or if you link with lld-link, you should pass <mark>/Brepro</mark> to clang-cl to make it not write the current timestamp into its output.
5151

52-
Both link.exe and lld-link also write the current timestamp into output .dll or .exe files. To make them instead write a hash of the binary into this field, you can pass `/Brepro` to the linker as well. However, some tools, such as Windows 7’s app compatibility database, try to interpret that field as an actual timestamp and can get confused if it’s set to a hash of the binary. For this case, lld-link also offers a `/timestamp:` flag that you can give an explicit timestamp that’s written into the output. You could use this, for example, to write the time of the commit the code is built at instead of the current time, which keeps the output deterministic. (But see the footnote on embedding commit hashes below.)
52+
Both link.exe and lld-link also write the current timestamp into output .dll or .exe files. To make them instead write a hash of the binary into this field, you can pass `/Brepro` to the linker as well. However, some tools, such as Windows 7’s app compatibility database, try to interpret that field as an actual timestamp and can get confused if it’s set to a hash of the binary. For this case, lld-link also offers a <mark>/timestamp:</mark> flag that you can give an explicit timestamp that’s written into the output. You could use this, for example, to write the time of the commit the code is built at instead of the current time, which keeps the output deterministic. (But see the footnote on embedding commit hashes below.)
5353

54-
Visual Studio’s assemblers ml.exe and ml64.exe also insist on writing the current time into their output. In situations like this, where you can’t easily fix the tool to write the right output in the first place, you need to write wrappers that fix up the file after the fact. As an example, [ml.py](https://cs.chromium.org/chromium/src/build/toolchain/win/ml.py) is the wrapper the Chromium project uses to make ml’s output deterministic.
54+
Visual Studio’s assemblers ml.exe and ml64.exe also insist on writing the current time into their output. In situations like this, where you can’t easily fix the tool to write the right output in the first place, you need to write wrappers that fix up the file after the fact. As an example, <mark>[ml.py](https://cs.chromium.org/chromium/src/build/toolchain/win/ml.py)</mark> is the wrapper the Chromium project uses to make ml’s output deterministic.
5555

56-
macOS’s libtool and ld64 also insist on writing timestamps into their outputs. You can set the environment variable `ZERO_AR_DATE` to 1 in a wrapper to make their output deterministic, but that confuses lldb of older Xcode versions.
56+
macOS’s libtool and ld64 also insist on writing timestamps into their outputs. You can set the environment variable <mark>ZERO_AR_DATE</mark> to 1 in a wrapper to make their output deterministic, but that confuses lldb of older Xcode versions.
5757

5858
Gcc sometimes uses random numbers in certain symbol mangling situations. Clang does not do this, so there’s no need to pass `-frandom-seed` to clang.
5959

60-
It’s a good idea to make your build independent of environment variables as much as possible, so that accidental local changes in the environment don’t affect the build output. You should pass `/X` to clang-cl to make it ignore `%INCLUDE%` and explicitly pass system include directories via the `-imsvc` switch instead. Likewise, very new lld-link versions (LLVM 10 and newer, at the time of this writing still unreleased) understand the `/lldignoreenv` flag, which makes lld-link ignore the `%LIB%` environment variable; explicitly pass system library directories via `/libpath:`.
60+
It’s a good idea to make your build independent of environment variables as much as possible, so that accidental local changes in the environment don’t affect the build output. You should pass <mark>/X</mark> to clang-cl to make it ignore `%INCLUDE%` and explicitly pass system include directories via the <mark>-imsvc</mark> switch instead. Likewise, very new lld-link versions (LLVM 10 and newer, at the time of this writing still unreleased) understand the <mark>/lldignoreenv</mark> flag, which makes lld-link ignore the `%LIB%` environment variable; explicitly pass system library directories via <mark>/libpath:</mark>.
6161
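
As a rough illustration of how the flags discussed in this hunk fit together, the Python sketch below assembles clang-cl and lld-link argument lists. Only the flags themselves come from the post. The helper names, the directory arguments, and the choice to build the lists in Python are assumptions made for illustration, and `/timestamp:` is assumed here to take an integer number of seconds since the epoch.

```python
def clang_cl_flags(include_dirs, third_party=False):
    """Return clang-cl flags that keep timestamps and %INCLUDE% out of the output.

    `include_dirs` and `third_party` are hypothetical parameters for this sketch.
    """
    # /Brepro: do not write the current time into the .obj file.
    # /X: ignore the %INCLUDE% environment variable.
    flags = ["/Brepro", "/X"]
    if third_party:
        # Code we do not control: make the date macros expand to nothing.
        flags += [
            "-Wno-builtin-macro-redefined",
            "-D__DATE__=",
            "-D__TIME__=",
            "-D__TIMESTAMP__=",
        ]
    else:
        # Our own code: warn whenever a date macro is used.
        flags.append("-Wdate-time")
    # System include directories are passed explicitly instead of via %INCLUDE%.
    for directory in include_dirs:
        flags += ["-imsvc", directory]
    return flags


def lld_link_flags(lib_dirs, timestamp=None):
    """Return lld-link flags that avoid the current time and %LIB%."""
    # /lldignoreenv: ignore the %LIB% environment variable.
    flags = ["/lldignoreenv"]
    if timestamp is None:
        # Write a hash of the binary into the timestamp field.
        flags.append("/Brepro")
    else:
        # Write an explicit, fixed timestamp instead (for example a commit time).
        flags.append(f"/timestamp:{int(timestamp)}")
    flags += [f"/libpath:{directory}" for directory in lib_dirs]
    return flags


if __name__ == "__main__":
    print(clang_cl_flags(["sdk/include"]))
    print(lld_link_flags(["sdk/lib"], timestamp=1573084800))
```

Keeping `/Brepro` and `/timestamp:` mutually exclusive in the sketch mirrors the text, which presents the explicit timestamp as the alternative for tools that cannot cope with a hash in that field.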

6262
## Footnote on embedding git hashes into the binary
6363

@@ -89,22 +89,22 @@ Making build outputs independent of the names of the checkout or build directory
8989

9090
A possible way to arrange for that is to put all build directories into the checkout directory. For example, if your code is at `path/to/src`, then you could have “out” in your `.gitignore` and build directories at `path/to/src/out/debug`, `path/to/src/out/release`, and so on. The relative path from each build artifact to the source is then `../../` followed by the path of the source file in the source directory, which is identical for each build directory.
9191

92-
The C standard defines the predefined macro `__FILE__` that expands to the name of the current source file. Clang expands this to an absolute path if it is invoked with an absolute path (`clang -c /absolute/path/to/my/file.cc`), and to a relative path if it is invoked with a relative path (`clang ../../path/to/my/file.cc`). To make your build locally deterministic, pass relative paths to your .cc files to clang.
92+
The C standard defines the predefined macro `__FILE__` that expands to the name of the current source file. Clang expands this to an absolute path if it is invoked with an absolute path (`clang -c /absolute/path/to/my/file.cc`), and to a relative path if it is invoked with a relative path (`clang ../../path/to/my/file.cc`). To make your build locally deterministic, <mark>pass relative paths to your .cc files to clang</mark>.
9393

94-
By default, clang will internally use absolute paths to refer to compiler-internal headers. Pass `-no-canonical-prefixes` to make clang use relative paths for these internal files.
94+
By default, clang will internally use absolute paths to refer to compiler-internal headers. Pass <mark>-no-canonical-prefixes</mark> to make clang use relative paths for these internal files.
9595

96-
Passing relative paths to clang makes clang expand `__FILE__` to a relative path, but paths in debug information are still absolute by default. Pass `-fdebug-compilation-dir .` to make paths in debug information relative to the build directory. (Before LLVM 9, this is an internal clang flag that must be used as `-Xclang -fdebug-compilation-dir -Xclang .`) When using clang’s integrated assembler (the default), `-Wa,-fdebug-compilation-dir,.` will do the same for object files created from assembly input. (For ml.exe / ml64.exe, see the script linked to from the “Basic determinism” section above.)
96+
Passing relative paths to clang makes clang expand `__FILE__` to a relative path, but paths in debug information are still absolute by default. Pass <mark>-fdebug-compilation-dir .</mark> to make paths in debug information relative to the build directory. (Before LLVM 9, this is an internal clang flag that must be used as `-Xclang -fdebug-compilation-dir -Xclang .`) When using clang’s integrated assembler (the default), <mark>-Wa,-fdebug-compilation-dir,.</mark> will do the same for object files created from assembly input. (For ml.exe / ml64.exe, see the script linked to from the “Basic determinism” section above.)
9797

9898
Using this means that debuggers won’t automatically find the source code belonging to your binary. At the moment, there’s no way to tell debuggers to resolve relative paths relative to the location of the binary ([DWARF proposal](http://dwarfstd.org/ShowIssue.php?issue=171130.2), [gdb patch](https://gnutoolchain-gerrit.osci.io/r/c/binutils-gdb/+/402)). See the end of this section for how to configure common debuggers to work correctly.
9999

100-
There are a few flags that try to make compilers produce relative paths in outputs even if the filename passed to the compiler is absolute (`-fdebug-prefix-map`, `-ffile-prefix-map`, `-fmacro-prefix-map`). Do not use these flags.
100+
There are a few flags that try to make compilers produce relative paths in outputs even if the filename passed to the compiler is absolute (`-fdebug-prefix-map`, `-ffile-prefix-map`, `-fmacro-prefix-map`). <mark>Do not use these flags</mark>.
101101

102102
- They work by adding lhs=rhs replacement patterns, and the lhs must be an absolute path to remove the absolute path from the output. That means that while they make the compile output path-independent, they make the compile command itself path-dependent, which hinders distributed compile caching. With `-grecord-gcc-switches` or `-frecord-gcc-switches` the compile command is embedded in debug info or even the object file itself, so in that case the flags even break local determinism. (Both `-grecord-gcc-switches` and `-frecord-gcc-switches` default to false in clang.)
103103
- They don’t affect the paths in dwo files when using fission; passing relative paths to the compiler is the only way to make these paths relative.
104104

105-
On Windows, it’s very unusual to have PDBs with relative paths. You can pass `/pdbsourcepath:X:\fake\prefix` to lld-link to make it resolve all relative paths in object files against a fixed absolute path to make sure your final PDBs contain absolute paths. Since the absolute path is against a fixed prefix, this doesn’t impair determinism. With this, both binaries and PDBs created by clang-cl and lld-link will be fully deterministic and build path independent.
105+
On Windows, it’s very unusual to have PDBs with relative paths. You can pass <mark>/pdbsourcepath:X:\fake\prefix</mark> to lld-link to make it resolve all relative paths in object files against a fixed absolute path to make sure your final PDBs contain absolute paths. Since the absolute path is against a fixed prefix, this doesn’t impair determinism. With this, both binaries and PDBs created by clang-cl and lld-link will be fully deterministic and build path independent.
106106

107-
Also on Windows, the linker by default puts the absolute path to the generated PDB file in the output binary. Pass `/pdbaltpath:%_PDB%` when you pass `/debug` to make the linker write a relative path to the generated PDB file instead. If you have custom build steps that extract PDB names from binaries, you have to make sure these scripts work with relative paths. Microsoft’s tools (debuggers, ETW) work fine with this set in most situations, and you can add a symbol search path in the cases where they don’t (when the binaries are copied before being run).
107+
Also on Windows, the linker by default puts the absolute path to the generated PDB file in the output binary. Pass <mark>/pdbaltpath:%_PDB%</mark> when you pass `/debug` to make the linker write a relative path to the generated PDB file instead. If you have custom build steps that extract PDB names from binaries, you have to make sure these scripts work with relative paths. Microsoft’s tools (debuggers, ETW) work fine with this set in most situations, and you can add a symbol search path in the cases where they don’t (when the binaries are copied before being run).
108108
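
The relative-path advice in this hunk can be sketched in Python as follows. The function name, the example paths, and the output file name are hypothetical. The flags `-no-canonical-prefixes` and `-fdebug-compilation-dir .` are the ones named in the text, and `posixpath` is used so that the result does not depend on the host platform.

```python
import posixpath


def relative_compile_command(source, build_dir, output):
    """Build a clang command line that only mentions relative paths.

    `source` and `build_dir` are absolute POSIX-style paths; the command is
    meant to be run with `build_dir` as the current directory.
    """
    # Path from the build directory to the source file, e.g. ../../foo/bar.cc.
    rel_source = posixpath.relpath(source, build_dir)
    return [
        "clang",
        "-c", rel_source,
        "-o", output,
        # Refer to compiler-internal headers by relative path.
        "-no-canonical-prefixes",
        # Make paths in debug information relative to the build directory.
        "-fdebug-compilation-dir", ".",
    ]


if __name__ == "__main__":
    # Two checkouts in different places yield the same command line.
    first = relative_compile_command(
        "/home/a/src/foo/bar.cc", "/home/a/src/out/debug", "bar.o")
    second = relative_compile_command(
        "/work/b/src/foo/bar.cc", "/work/b/src/out/release", "bar.o")
    print(first == second)
    print(" ".join(first))
```

Because the command line is identical across checkouts and build directories, it can also serve as a cache key for distributed compile caching, which is the concern the text raises about the prefix-map flags.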

109109
## Getting debuggers to work well with locally deterministic builds
110110

@@ -153,7 +153,7 @@ It also versions the compiler, linker, and SDK used within your code, which mean
153153

154154
You need to store the currently-used compiler, linker, and SDK versions in a file in your source control repository, and from some kind of hook that runs after pulling the newest version of the source, download compiler, linker, and SDK of the right version from some kind of cloud storage service.
155155

156-
You then need to modify your build files to use `--sysroot` (Linux), `-isysroot` (macOS), `-imsvc` (Windows) to use these hermetic SDKs for builds. They need to be somewhere below your source root to not regress build directory name invariance.
156+
You then need to modify your build files to use <mark>--sysroot</mark> (Linux), <mark>-isysroot</mark> (macOS), <mark>-imsvc</mark> (Windows) to use these hermetic SDKs for builds. They need to be somewhere below your source root to not regress build directory name invariance.
157157

158158
You also want to make sure your build doesn’t depend on environment variables, as already mentioned in the “Getting to incremental determinism” section, since environments between different machines can be very different and difficult to control.
159159
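
A minimal sketch of choosing the hermetic SDK flag per platform, under stated assumptions: the platform keys (`linux`, `mac`, `win`), the function name, and the example directory are hypothetical. The post only requires the SDK to live below the source root; insisting on a relative path, as this sketch does, is one possible way to keep the command line independent of the checkout location.

```python
import posixpath


def hermetic_sdk_flags(platform, sdk_dir):
    """Return the compiler flags that point a build at a checked-in SDK.

    `sdk_dir` is expected to be relative to the build directory so that the
    command line does not encode where the checkout lives.
    """
    if posixpath.isabs(sdk_dir):
        raise ValueError("SDK path should be relative: " + sdk_dir)
    if platform == "linux":
        return ["--sysroot=" + sdk_dir]
    if platform == "mac":
        return ["-isysroot", sdk_dir]
    if platform == "win":
        # On Windows, -imsvc takes a system include directory inside the SDK.
        return ["-imsvc", sdk_dir]
    raise ValueError("unknown platform: " + platform)


if __name__ == "__main__":
    print(hermetic_sdk_flags("linux", "../../third_party/sysroot"))
```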
