Skip to content

Conversation

@davidtwco
Copy link
Owner

@davidtwco davidtwco commented Nov 17, 2025

Early draft of an alternative to the target-spec-json format for discussion/feedback from the compiler team.

Rendered

- These features overwrite `-Ctarget-cpu` but can be overwritten with
`-Ctarget-features`
- Corresponds to `llc -mattr=$features`
- Accepts LLVM feature names, not Rust feature names
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LLVM feature names do sometimes change, so I think we did have to do the same translation as we do for -Ctarget-feature and #[target_feature(enable = "...")].

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed this line - it was just copied from the comment on the target spec field. Per #5 (comment), we could choose to have the values of this field be unstable, but we could also use our own feature names. I don't feel too strongly about it and it's something I'd like to see worked out prior to stabilisation of the format and when final documentation is being written, the key thing at the moment is just that the key makes sense to have, in this table, and included in the initial version. Changed this in 9868a6a.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the fact that our target specs currently use LLVM feature names is a holdover from when those were the only feature names that exist. There was never a reason to change this. But anything we intend to stabilize should definitely use Rust feature names and go through the same handling as -Ctarget-feature.

The target feature code is a mess so it might need a bit of a rewrite for this. ;)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened rust-lang/rust#149173 for this.

- **`data-layout` -> `backends.llvm.data_layout`**
- [Data layout][llvm-datalayout] to pass to LLVM
- Mandatory (if LLVM backend used), overrides
- Used in 306/306 built-in targets
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does LLVM provide any stability guarantees around the data-layout format?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The data-layout has definitely changed in backwards-incompatible ways multiple times in the past.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I intend for the values of these keys to have their own stability policy, like we informally have with compiler flags, where for some flags like -Cllvm-args we have no guarantees for the values, but others like -Csplit-debuginfo we do. For backends.llvm.data_layout, we would document that the format can change depending on the LLVM version and so we only guarantee that the key will exist in the target specification format. Mentioned this in e6d0ae7.

- **`archive-format` -> `unstable.target.archive_format`**
- Format that archives should be emitted in
- This affects whether we use LLVM to assemble an archive or fall back to
the system linker, and currently only "gnu" is used to fall into LLVM
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We always use the ar_archive_writer crate for assembling archives. Defaulting to GNU without making this option stable will make it impossible to create a custom target for BSDs. Also be aware that setting the wrong archive format is likely to silently corrupt archives if they are passed through a tool like ranlib that uses a different archive format. They are all incompatible extensions to the base ar archive format.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like in the other comment (#5 (comment)), I just copied this description from the doc comment in the target specification, so I've just removed this line for now. I don't have a preference as to what the default is here, just intending to indicate where I think this key should go, and that we probably shouldn't include it in an initial stable format until we've thought about it more. Changed this in 19c7f19.

format.

# Motivation
[motivation]: #motivation

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is a missing paragraph in this section on what motivates us to have a stable strategy for extending rustc with new targets. The current language describes largely the history of why JSON isn't great and what we've done so far, but it's not a motivation for why we'd want this capability.

I think questions I'd like to see answered:

  • To what extent are they deviating from an existing target? E.g., should the things they want to tweak be target modifiers rather than new target names?
  • What level of stability are these users looking for? Obviously we can't guarantee that much if we're less than a tier 3 target... even building core via stable -Zbuild-std can't going to be guaranteed to work I imagine for a user-defined target that's sufficiently different from our in-tree targets.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point - I shouldn't assume we want custom targets. I've reached out to some people to learn how they're using custom targets today, and until I hear back, added some generic motivations that I expect are close to correct in 34c28b3.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think one question that I'd like to see us answer is why the mechanism here isn't (roughly) a file that contains command line arguments to rustc (like target modifiers) - what makes the name actually useful?

For the case of entirely new target that can't be upstreamed I think it's hard for me to believe the compiler is usable for it with no code changes if it doesn't fit into the previous paragraph.

But looking forward to having more concrete examples.

New keys can be added to the target specification format by the compiler team in
the `unstable` table and this is not a breaking change. Unstable keys can be
stabilised and moved outside of the `unstable` table. The `unstable` table
serves the same purpose as the `-Z` rustc flag group.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the problem here is that for those who care most about something like this (I am guessing RfL people would be amongst them?) they have expressed that the -Z flag group is actually kind of a pain, because things "jump" from -Z to -C. IMO, the -Z flag group mostly serves the needs of compiler devs: it's a great place to put flags we kinda never intend to stabilize.

Do we need a structural marker in the toml format like a new table?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible, of course, that the unstable table might happen to be nested in the right way to just make this easy, I guess? Unsure.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm open to doing something else, especially having previously argued that we should do somethnig different for flags (rust-lang/compiler-team#787). Maybe a unstable-options = ["backends.llvm.frame_pointer"] key where you list the unstable keys you want to use, or just unstable-options = true, or we allow stabilised keys to be used under the unstable table too? I think unstable-options = true might be cleanest?

Comment on lines 51 to 52
Built-in targets are not subject to the same consistency checks as custom
targets ([rust#133459]).
Copy link

@RalfJung RalfJung Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is correct in the sense that builtin targets are subject to more consistency checks -- but your wording makes it sound like the opposite. Originally, custom targets did not have any consistency checks. rust-lang/rust#133409 fixed that. Now consistency checks generally apply to both, with the exception of some dynamic linker checks which seem like they are merely "odd" but not "wrong" (but I am by far not a dynamic linker expert so before we expose this on stable, someone should carefully look over this).

Also that issue is about the default values being inconsistent, it's not about JSON vs builtin. So I don't understand it as a reference for this sentence.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I misunderstood the issue, I interpreted it as us having consistency checks for the custom targets as that was an external input we were reading but because our normal targets don't go through that codepath, we just define the types directly, they weren't getting the consistency check. Changed in bb2e174.

- For example, `backends.llvm`
- `cfgs`
- Keys that configure cfgs that are expected for targets
- For example, `cfgs.endian`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Endianess is not just a cfg, it's a fundamental property of the target that affects the AM, just like pointer width. It would be extremely strange if endianess and pointer_width do not end up next to each other.

cgfs makes sense for things that are only cfgs but have no other language-level consequences, like target_abi.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, I've moved the endian key to target and also renamed it endianness in 51345e7.

- For example, `linker.supports_dynamic`
- `target`
- Keys describing properties inherent to the target
- For example, `target.pointer_width`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some targets have multiple ABI variants, and this is a global choice as it affects all extern "C" functions. Would that also go here? It's different from pointer_width in that pointer_width exists at the level of the Rust language / Abstract Machine, whereas ABI variants are not visible inside Rust and only exist on the assembly level.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so, I intend target to be anything that isn't backend-specific, basically. It's tricky though, because I don't know what we'd do if there were multiple ABI variants but LLVM supported both and GCC supported one, for example. It's a similar trade-off as to something like the split-debuginfo key, is that something the target supports or is that something the backend supports? It could be either? It's just a judgement call, I think.

Copy link

@RalfJung RalfJung Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW this is rustc_abi which you currently have put under llvm. I'm not sure that's the right place. Maybe we have to separate controlling the extern "C" ABI from controlling the "Rust" ABI.

This is anyway largely a hack to work around LLVM's ABI handling being kind of a mess for some targets. Ideally we'd just have one "ABI name" field that subsumes all of llvm_abiname, llvm_floatabi, and rustc_abi, and then the backends know how to ensure they are generating code with the right ABI. But LLVM has three different ways for how the ABI is controlled and then we have three fields for dealing with that -- and one of them interacts with target features so a bunch of that logic isn't even in the codegen backend.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants