-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New rustc and Cargo options to allow path sanitisation by default #3127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 5 commits
139d7f0
bcbd131
97e4104
d8344ef
408dc50
f92a321
2bd2792
998ecf4
ba6b2d8
d790948
d33e029
0688580
aee42a6
8e33a46
0b59e5c
604dcb0
2d49c09
23394fc
e15308c
dec1901
bb079f4
7c533d3
6e45014
34d4386
785c229
a357827
5286008
8ba3510
3f59b7b
16edfb2
cbcb1df
451f163
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,234 @@ | ||
- Feature Name: trim-paths | ||
- Start Date: 2021-05-24 | ||
- RFC PR: [rust-lang/rfcs#3127](https://github.com/rust-lang/rfcs/pull/3127) | ||
- Rust Issue: N/A | ||
|
||
# Summary | ||
[summary]: #summary | ||
|
||
Cargo should have a [profile setting](https://doc.rust-lang.org/cargo/reference/profiles.html#profile-settings) named `trim-paths` | ||
to sanitise absolute paths introduced during compilation that may be embedded in the compiled binary executable or library, and optionally in | ||
the separate debug symbols file (depending on `split-debuginfo` settings). | ||
|
||
`cargo build` with the default `release` profile should not produce any host filesystem dependent paths into binary executable or library. But | ||
it will retain the paths in separate debug symbols file, if one exists, to help debuggers and profilers locate the source files. | ||
|
||
To facilitate this, a new flag named `--remap-scope` should be added to `rustc` controlling the behaviour of `--remap-path-prefix`, allowing us to fine | ||
tune the scope of remapping, speicifying paths under which context (in marco expansion, in debuginfo or in diagnostics) | ||
should or shouldn't be remapped. | ||
|
||
# Motivation | ||
[motivation]: #motivation | ||
|
||
## Sanitising local paths that are currently embedded | ||
Currently, executables and libraies built by Cargo have a lot of embedded absolute paths. They most frequently appear in debug information and | ||
cbeuw marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
panic messages (pointing to the panic location source file). As an example, consider the following package: | ||
|
||
`Cargo.toml`: | ||
|
||
```toml | ||
[package] | ||
name = "rfc" | ||
version = "0.1.0" | ||
edition = "2018" | ||
|
||
[dependencies] | ||
rand = "0.8.0" | ||
``` | ||
`src/main.rs` | ||
|
||
```rust | ||
use rand::prelude::*; | ||
|
||
fn main() { | ||
let r: f64 = rand::thread_rng().gen(); | ||
println!("{}", r); | ||
} | ||
``` | ||
|
||
Then run | ||
|
||
```bash | ||
$ cargo build --release | ||
$ strings target/release/rfc | grep $HOME | ||
``` | ||
|
||
We will see some absolute paths pointing to dependency crates downloaded by Cargo, containing our username: | ||
|
||
``` | ||
could not initialize thread_rng: /home/username/.cargo/registry/src/github.com-1ecc6299db9ec823/rand-0.8.3/src/rngs/thread.rs | ||
/home/username/.cargo/registry/src/github.com-1ecc6299db9ec823/rand_chacha-0.3.0/src/guts.rsdescription() is deprecated; use Display | ||
/home/username/.cargo/registry/src/github.com-1ecc6299db9ec823/getrandom-0.2.2/src/util_libc.rs | ||
``` | ||
|
||
This is undesirable for the following reasons: | ||
|
||
1. **Privacy**. `release` binaries may be distributed, and anyone could then see the builder's local OS account username. | ||
Additionally, some CI (such as [GitLab CI](https://docs.gitlab.com/runner/best_practice/#build-directory)) checks out the repo under a path where | ||
it may include things that really aren't meant to be public. Without sanitising the path by default, this may be inadvertently leaked. | ||
2. **Build reproducibility**. We would like to make it easier to reproduce binary equivalent builds. While it is not required to maintain | ||
reproducibility across different environments, removing environment-sensitive information from the build will increase tolerance on the inevitable | ||
environment differences when trying to verify builds. | ||
|
||
cbeuw marked this conversation as resolved.
Show resolved
Hide resolved
|
||
## Handling sysroot paths | ||
At the moment, paths to the source files of standard and core libraries, even when they are present, always begin with a virtual prefix in the form | ||
of `/rustc/[SHA1 hash]/library`. This is not an issue when the source files are not present (i.e. when `rust-src` component is not installed), but | ||
when a user installs `rust-src` they may want the path to their local copy of source files to be visible. Hence the default behaviour when `rust-src` | ||
is installed should be to embed the local path. These local paths should be then affected by path remappings in the usual way. | ||
|
||
## Preserving debuginfo to help debuggers | ||
At the moment, `--remap-path-prefix` will cause paths to source files in debuginfo to be remapped. On platforms where the debuginfo resides in a | ||
separate file from the distributable binary, this may be unnecessary and it prevents debuggers from being able to find the source. Hence `rustc` | ||
should support finer grained control over paths in which contexts should be remapped. | ||
|
||
# Guide-level explanation | ||
[guide-level-explanation]: #guide-level-explanation | ||
|
||
## The rustc book: Command-line arguments | ||
|
||
### `--remap-scope`: configure the scope of path remapping | ||
cbeuw marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
|
||
When the `--remap-path-prefix` option is passed to rustc then source path prefixes in all output will be affected. | ||
The `--remap-scope` argument can be used in conjunction with `--remap-path-prefix` to determine paths in which output context should be affected. | ||
This flag accepts a comma-separated list of values and may be specified multiple times. The valid scopes are: | ||
cbeuw marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
|
||
- `macro` - apply remappings to the expansion of `std::file!()` macro. This is where paths in embedded panic messages come from | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is not really a descriptive name. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I don't have any good naming suggestions (maybe There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
- `debuginfo` - apply remappings to debug information | ||
- `diagnostics` - apply remappings to printed compiler diagnostics | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is this useful on it's own? It can't prevent the original path from ending up in the crate metadata without also requiring all other remappings. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Maybe people won't use it on its own, but it can't be merged with other options either so it has to be there There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. since analyzer is popular that means nothing is stripped by default then? will it be a single option and not a mess of remap-prefix options then to strip the paths? otherwise it will be a choice of privacy/reproducibility or the usefulness of analyzer There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
|
||
## Cargo | ||
|
||
`trim-paths` is a profile setting which controls the sanitisation of file paths in compilation outputs. It has three valid options: | ||
|
||
- `0` or `false`: no sanitisation at all | ||
- `1`: sanitise only the paths in emitted executable or library binaries. It always affects paths from macros such as panic messages, and in debug information | ||
only if they will be embedded together with the binary (the default on platforms with ELF binaries, such as Linux and windows-gnu), | ||
but will not touch them if they are in a separate symbols file (the default on Windows MSVC and macOS) | ||
- `2` or `ture`: sanitise paths in all compilation outputs, including compiled executable/library, separate symbols file (if one exists), and compiler diagnostics. | ||
|
||
|
||
The default release profile uses option `1`. You can also manually override it by specifying this option in `Cargo.toml`: | ||
```toml | ||
[profile.dev] | ||
trim-paths = 2 | ||
|
||
[profile.release] | ||
trim-paths = 0 | ||
``` | ||
|
||
When a path is in scope for sanitisation, it is replaced with the following rules: | ||
|
||
1. Path to the source files of the standard and core library (sysroot) will begin with `/rustc/[rustc commit hash]`. | ||
E.g. `/home/username/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/result.rs` -> | ||
`/rustc/fe72845f7bb6a77b9e671e6a4f32fe714962cec4/library/core/src/result.rs` | ||
2. Path to the working directory will be replaced with `.`. E.g. `/home/username/crate/src/lib.rs` -> `./src/lib.rs`. | ||
cbeuw marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
3. Path to packages outside of the working directory will be replaced with `[package name]-[version]`. E.g. `/home/username/deps/foo/src/lib.rs` -> `foo-0.1.0/src/lib.rs` | ||
|
||
|
||
When a path to the source files of the standard and core library is *not* in scope for sanitisation, the emitted path will depend on if `rust-src` component | ||
is present. If it is, then the real path pointing to a copy of the source files on your file system will be emitted; if it isn't, then they will | ||
show up as `/rustc/[rustc commit hash]/library/...` (just like when it is selected for sanitisation). Paths to all other source files will not be affected. | ||
|
||
This will not affect any hard-coded paths in the source code. | ||
|
||
# Reference-level explanation | ||
[reference-level-explanation]: #reference-level-explanation | ||
|
||
## `trim-paths` implementation in Cargo | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Reading this entire section, it doesn't seem to me that this functionality needs to live in Cargo. rustc could directly provide a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is rustc aware of the path to the current package? This is relevant to dependencies living under There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hmmm, fair point. I thought that rustc had the necessary information, but perhaps not. It might make sense to pass the requisite information into rustc, but then let rustc make the decision for how to use that information. I'd rather not have cargo making that decision. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. rustc always knows the absolute path to files, but only the build tool can tell which part of it is environment-sensitive. E.g. rustc knows it's using This "requisite information" is already included in provided mappings in |
||
|
||
We only need to change the behaviour for `Test` and `Build` compile modes. | ||
|
||
If `trim-paths` is `0` (`false`), no extra flag is supplied to `rustc`. | ||
|
||
|
||
If `trip-paths` is `1` or `2` (`true`), then two `--remap-path-prefix` arguments are supplied to `rustc`: | ||
- From the path of the local sysroot to `/rustc/[commit hash]`. | ||
- If the compilation unit is under the working directory, from the absolute path to the working directory to `.`. | ||
If it's outside the working directory, from the absolute path of the package root to `[package name]-[package version]`. | ||
|
||
A further `--remap-scope` is also supplied for options `1` and `2`: | ||
|
||
If `trim-path` is `1`, then it depends on the setting of `split-debuginfo` (whether the setting is explicitly supplied or from the default) | ||
cbeuw marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
- If `split-debuginfo` is `off`, then `--remap-scope=macro,debuginfo`. | ||
- If `split-debuginfo` is `packed` or `unpacked`, then `--remap-scope=macro` | ||
This is because we always want to remap panic messages as they will always be embedded in executable/library, but we don't need to touch the separate | ||
symbols file | ||
|
||
If `trim-path` is `2` (`true`), all paths will be affected, equivalent to `--remap-scope=macro,debuginfo,diagnostics` | ||
|
||
|
||
Some interactions with compiler-intrinstic macros need to be considered: | ||
1. Path (of the current file) introduced by [`file!()`](https://doc.rust-lang.org/std/macro.file.html) *will* be remapped. **Things may break** if | ||
the code interacts with its own source file at runtime by using this macro. | ||
2. Path introduced by [`include!()`](https://doc.rust-lang.org/std/macro.include.html) *will* be remapped, given that the included file is under | ||
the current working directory or a dependency package. | ||
|
||
|
||
If the user further supplies custom `--remap-path-prefix` arguments via `RUSTFLAGS` | ||
or similar mechanisms, they will take precedence over the one supplied by `trim-paths`. This means that the user-defined remapping arguments must be | ||
supplied *after* Cargo's own remapping. | ||
|
||
|
||
Additionally, when using MSVC linker, Cargo should emit `/PDBALTPATH:%_PDB%` to the linker via `-C link-arg`. This makes the linker embed | ||
only the file name of the .pdb file without the path to it. | ||
|
||
|
||
## Changing handling of sysroot path in `rustc` | ||
|
||
The virtualisation of sysroot files to `/rustc/[commit hash]/library/...` was done at compiler bootstraping, specifically when | ||
`remap-debuginfo = true` in `config.toml`. This is done for Rust distribution on all channels. | ||
|
||
At `rustc` runtime (i.e. compiling some code), we try to correlate this virtual path to a real path pointing to the file on the local file system | ||
Currently the result is represented internally as if the path was remapped by a `--remap-path-prefix`, from local `rust-src` path to the virtual path. | ||
Only the virtual name is ever emitted for metadata or codegen. We want to change this behaviour such that, when `rust-src` source files can be | ||
discovered, the virtual path is discarded and therefore the local path will be embedded, unless there is a `--remap-path-prefix` that causes this | ||
local path to be remapped in the usual way. | ||
michaelwoerister marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
|
||
# Drawbacks | ||
[drawbacks]: #drawbacks | ||
|
||
The user will not be able to `Ctrl+click` on any paths provided in panic messages or backtraces outside of the working directory. But | ||
there shouldn't be any confusion as the combination of pacakge name and version can be used to pinpoint the file. | ||
cbeuw marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
|
||
As mentioned above, `trim-paths` may break code that relies on `std::file!()` to evaluate to an accessible path to the file. Hence enabling | ||
it by default for release builds may be a technically breaking change. Occurances of such use should be extremely rare but should be investigated | ||
cbeuw marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
via a Crater run. In case this breakage is unacceptable, `trim-paths` can be made an opt-in option rather than default in any build profile. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Rust-analyzer uses expect-test which does exactly this to update snapshot tests. Most of the time tests would run in debug mode I think though. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||
|
||
# Rationale and alternatives | ||
[rationale-and-alternatives]: #rationale-and-alternatives | ||
|
||
There has been an issue (https://github.com/rust-lang/rust/issues/40552) asking for path sanitisation to be implemented and enabled by default for | ||
release builds. It has, over the past 4 years, gained a decent amount of popular support. The remapping rule proposed here is very simple to | ||
implement. | ||
|
||
Path to sysroot crates are specially handled by `rustc`. Due to this, the behaviour we currently have is that all such paths are virtualised. | ||
Although good for privacy and reproducibility, some people find it a hinderance for debugging: https://github.com/rust-lang/rust/issues/85463. | ||
Hence the user should be given control on if they want the virtual or local path. | ||
|
||
An alternative to `--remap-scope` is to have individual `--remap-path-prefxi`-like flags, one each for macro, debuginfo and diagnostics, requiring | ||
cbeuw marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
the full mapping to be given for each context. This is similar to what GCC and Clang does as described below, but we have added a third context | ||
for diagnostics. This technically enables for even finer grained control, allowing different paths in different | ||
contexts to be remapped differently. However it will cause the command line to be very verbose under most normal use cases. | ||
cbeuw marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
|
||
# Prior art | ||
[prior-art]: #prior-art | ||
|
||
The name `trim-paths` came from the [similar feature](https://golang.org/cmd/go/#hdr-Compile_packages_and_dependencies) in Go. An alternative name | ||
`sanitize-paths` was first considered but the spelling of "sanitise" differs across the pond and down under. It is also not as short and concise. | ||
|
||
Go does not enable this by default. Since Go does not differ between debug and release builds, removing absolute paths for all build would be | ||
a hassle for debugging. However this is not an issue for Rust as we have separate debug build profile. | ||
|
||
GCC and Clang both have a flag equivalent to `--remap-path-prefix`, but they also both have two separate flags one for only macro expansion and | ||
the other for only debuginfo: https://reproducible-builds.org/docs/build-path/. This is the origin of the `--remap-scope` idea. | ||
|
||
# Unresolved questions | ||
[unresolved-questions]: #unresolved-questions | ||
|
||
- Should we treat the current working directory the same as other packages? We could have one fewer remapping rule by remapping all | ||
package roots to `[package name]-[version]`. A minor downside to this is not being able to `Ctrl+click` on paths to files the user is working | ||
on from panic messages. | ||
- Should we use a slightly more complex remapping rule, like distinguishing packages from registry, git and path, as mentioned in | ||
https://github.com/rust-lang/rust/issues/40552? | ||
- Will these cover all potentially embedded paths? Have we missed anything? | ||
- Should we make this affect more `CompileMode`s, such as `Check`, where the emitted `rmeta` file will also contain absolute paths? | ||
|
||
# Future possibilities | ||
[future-possibilities]: #future-possibilities | ||
|
||
N/A | ||
cbeuw marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
Uh oh!
There was an error while loading. Please reload this page.