Add function signature based labeling scheme for landing pad#434
Add function signature based labeling scheme for landing pad#434kito-cheng wants to merge 22 commits intocfi-propfrom
Conversation
|
Regarding the mechanism to pass the calculated lpad labels to static linker for PLT generation, maybe we can just add a symbol for each called function? Currently, the LLVM KCFI mechanism adds similar symbols to allow assembly codes to reference labels computed from C function declarations. This approach has the benefit of being easily human readable when examining the assembly text and object dumps (through symbol table), and it does not require us to invent a new format for this purpose, which means it can already be accepted by existing assemblers and compiling pipelines that utilizes independent assemblers. The downside of this may be that it adds quite a lot of additional entries to the symbol table and the data structure of symbol table entries are a bit too bloated for this purpose, but if we decide to pass the symbols along in the relocatables instead of fetching from shared objects at static link time, maybe we can just advise programmers to strip away the symbol tables after linking so the program size can be reduced. |
|
@mylai-mtk thanks for the inputs, and share few options in my mind:
typedef struct {
ElfNN_Half ss_boundto; /* Direct bindings, symbol bound to */
ElfNN_Word ss_sig; /* Signature string, string index in .riscv.ssstr section . */
} ElfNN_SymSig;However option 1 and option 2 will has some problem when dealing with undefined weak symbol For options 3: build attribute for symbol is kinda good fit for current usage, but...all linker seems NOT implement that at all, so I'm a little hesitant to choose this option, also it's bind to symbol, so it could handle undefined weak symbol well in theory. For option 4: similar to option 3 for many aspect, but use a customized section.
|
|
Hi @kito-cheng, thanks for the reply.
Regarding this option, I don't see any structure that looks like the "build attribute" you mentioned that works with symbols in the "Attributes" section of the linked document. Can you be more specific on what existing format we have that you're referring to?
I don't think we need LTO with KCFI. KCFI works much like Zicfilp, except that it's software emulated. In KCFI, the label used at the caller site comes from the called target's apparent signature as seen from caller, which should be available in C programs for most of the time. The only exception I know is K&R-style functions, which technically do not have signatures and accept a plethora of argument type combinations. However, intended K&R-style functions are rare IMHO, and probably should use a special label rule if we are to handle them.
I wasn't talking about generating symbol at the same address as the function symbol. In KCFI, the label symbols are associated to the called functions by name, e.g. for a What I proposed was to add something like |
|
We need to have a way (define macro?) to know which labeling scheme (simple/complex) is in use for the current compilation, so assembly files can know which label to use. This is needed for libc implementation. |
|
(Nitpicking...) I propose that we move away from the name of "complex" when using function signatures as the label content. Though we have a "simple" labeling scheme and it's natural to use its opposite term "complex" to name our new scheme, which is not simple, I think the term "complex" covers too many possibilities and thus reveals too little about what it actually is. Also, avoiding the "complex" term allows us to define more label schemes in the future without the embarrassment of feeling like defining another new "complex" scheme. Based on these points, I propose to name this current new label scheme using the name "func_sig" or "mangled_sig" (the underscore may be removed), which is precise and tells what it really does. |
Good suggestion, let me rename it to function signature / func_sig :) |
|
Create a PR for adding macro riscv-non-isa/riscv-c-api-doc#76 |
e4ae5de to
c3c4adc
Compare
|
Changes:
|
| 1: lpad <hash-value-for-function> | ||
| auipc t3, %pcrel_hi(function@.got.plt) | ||
| l[w|d] t3, %pcrel_lo(1b)(t3) | ||
| lui t2, <hash-value-for-function> |
There was a problem hiding this comment.
Its needed for direct call to PLT followed by indirect tail call from PLT to target...
|
Regarding functions having no prototype, which are still widely but probably unintendedly used in new C codes, should we impose some restrictions on the usage of them so that we can properly label them? To be specific, I propose that we treat types like |
|
Changes:
|
|
Changes:
|
Changes: - Rename `GNU_PROPERTY_RISCV_FEATURE_1_CFI_LP_SIMPLE` to `GNU_PROPERTY_RISCV_FEATURE_1_CFI_LP_UNLABELED` - Fix wrong offset in the first PLT stubs for the simple landing pad PLT.
Function signature based labeling scheme, follow the "Function types" mangling
rule defeind in Itanium C++ ABI.
With few specific rules:
- `main` funciton is using signature of
`(int, pointer to pointer to char) returning int` (`FiiPPcE`).
- `_dl_runtime_resolve` use zero for the landing pad.
- {Cpp} member functions should use the "Pointer-to-member types" mangling rule
defined in the _Itanium {Cpp} ABI_ <<itanium-cxx-abi>>.
- Virtual functions in {Cpp} should use the member function type of the base
class that first defined the virtual function.
- If a virtual function is inherited from more than one base class, it should
use the type of the first base class. Thunk functions will use the type of
the corresponding base class.
Co-authored-by: Ming-Yi Lai <Ming-yi.Lai@mediatek.com>
Changes: - Rename complex labeling scheme to function signature based labeling scheme - Fix the PLT stubs - Add labeling rule for `main` and `_dl_runtime_resolve`. - Clarify the rule for those virtual function from more than one base class.
- Speical rule for return type of member function. - Speical rule for class destructors - <exception-spec> should be ignored. - Static functions should follow the same rules as normal functions. - wchar_t is platform dependent. - Functions with an empty parameter list are treated as `void` (`v`).
- Add note to mention covariant return types
Use zero-filled value if remain bits is less than 20 bits
b523e16 to
1f96d3e
Compare
|
llvm/llvm-project#111661 (comment) void f(int(*x)[]) {}; // Zicfilp mangles as `FvPA_iE`
typedef void PtrType(int(*)[12]);
void g(PtrType p) { p(0); } // Zicfilp mangles `PtrType` as `FvPA12_iE`
void h() { g(f); } // Compatibly passing `f` as `PtrType`I think a case, which is indeed not properly handled by the current draft mangling scheme, is raised. Perhaps we should strip away the element numbers of arrays, just like what AArch64 PtrAuth did (ref. https://github.com/llvm/llvm-project/blob/63de20c0de05ce7b8b3968a9d210aa0f3d01acd4/clang/lib/AST/ASTContext.cpp#L3333)? |
…filp/Zicfiss features (#127193) + When all relocatable files contain a `.note.gnu.property` section (with `NT_GNU_PROPERTY_TYPE_0` notes) which contains a `GNU_PROPERTY_RISCV_FEATURE_1_AND` property in which the `GNU_PROPERTY_RISCV_FEATURE_1_CFI_LP_UNLABELED`/`GNU_PROPERTY_RISCV_FEATURE_1_CFI_LP_FUNC_SIG`/`GNU_PROPERTY_RISCV_FEATURE_1_CFI_LP_SS` bit is set: + The output file will contain a `.note.gnu.property` section with the bit set + A `PT_GNU_PROPERTY` program header is created to encompass the `.note.gnu.property` section + If `-z zicfilp-unlabeled-report=[warning|error]`/`-z zicfilp-func-sig-report=[warning|error]`/`-z zicfiss-report=[warning|error]` is specified, the linker will report a warning or error for any relocatable file lacking the feature bit RISC-V Zicfilp/Zicfiss features indicate their adoptions as bits in the `.note.gnu.property` section of ELF files. This patch enables LLD to process the information correctly by parsing, checking and merging the bits from all input ELF files and writing the merged result to the output ELF file. These feature bits are encoded as a mask in each input ELF files and intended to be "and"-ed together to check that all input files support a particular feature. For RISC-V Zicfilp features, there are 2 conflicting bits allocated: `GNU_PROPERTY_RISCV_FEATURE_1_CFI_LP_UNLABELED` and `GNU_PROPERTY_RISCV_FEATURE_1_CFI_LP_FUNC_SIG`. They represent the adoption of the forward edge protection of control-flow integrity with the "unlabeled" or "func-sig" policy. Since these 2 policies conflicts with each other, these 2 bits also conflict with each other. This patch adds the `-z zicfilp-unlabeled-report=[none|warning|error]` and `-z zicfilp-func-sig-report=[none|warning|error]` commandline options to make LLD report files that do not have the expected bits toggled on. For RISC-V Zicfiss feature, there's only one bit allocated: `GNU_PROPERTY_RISCV_FEATURE_1_CFI_SS`. This bit indicates that the ELF file supports Zicfiss-based shadow stack. This patch adds the `-z zicfiss-report=[none|warning|error]` commandline option to make LLD report files that do not have the expected bit toggled on. The adoption of the `.note.gnu.property` section for RISC-V targets can be found in the psABI PR <riscv-non-isa/riscv-elf-psabi-doc#417> (`CFI_LP_UNLABELED` and `CFI_SS`) and PR <riscv-non-isa/riscv-elf-psabi-doc#434> (`CFI_LP_FUNC_SIG`).
…ons for Zicfilp/Zicfiss features (#127193) + When all relocatable files contain a `.note.gnu.property` section (with `NT_GNU_PROPERTY_TYPE_0` notes) which contains a `GNU_PROPERTY_RISCV_FEATURE_1_AND` property in which the `GNU_PROPERTY_RISCV_FEATURE_1_CFI_LP_UNLABELED`/`GNU_PROPERTY_RISCV_FEATURE_1_CFI_LP_FUNC_SIG`/`GNU_PROPERTY_RISCV_FEATURE_1_CFI_LP_SS` bit is set: + The output file will contain a `.note.gnu.property` section with the bit set + A `PT_GNU_PROPERTY` program header is created to encompass the `.note.gnu.property` section + If `-z zicfilp-unlabeled-report=[warning|error]`/`-z zicfilp-func-sig-report=[warning|error]`/`-z zicfiss-report=[warning|error]` is specified, the linker will report a warning or error for any relocatable file lacking the feature bit RISC-V Zicfilp/Zicfiss features indicate their adoptions as bits in the `.note.gnu.property` section of ELF files. This patch enables LLD to process the information correctly by parsing, checking and merging the bits from all input ELF files and writing the merged result to the output ELF file. These feature bits are encoded as a mask in each input ELF files and intended to be "and"-ed together to check that all input files support a particular feature. For RISC-V Zicfilp features, there are 2 conflicting bits allocated: `GNU_PROPERTY_RISCV_FEATURE_1_CFI_LP_UNLABELED` and `GNU_PROPERTY_RISCV_FEATURE_1_CFI_LP_FUNC_SIG`. They represent the adoption of the forward edge protection of control-flow integrity with the "unlabeled" or "func-sig" policy. Since these 2 policies conflicts with each other, these 2 bits also conflict with each other. This patch adds the `-z zicfilp-unlabeled-report=[none|warning|error]` and `-z zicfilp-func-sig-report=[none|warning|error]` commandline options to make LLD report files that do not have the expected bits toggled on. For RISC-V Zicfiss feature, there's only one bit allocated: `GNU_PROPERTY_RISCV_FEATURE_1_CFI_SS`. This bit indicates that the ELF file supports Zicfiss-based shadow stack. This patch adds the `-z zicfiss-report=[none|warning|error]` commandline option to make LLD report files that do not have the expected bit toggled on. The adoption of the `.note.gnu.property` section for RISC-V targets can be found in the psABI PR <riscv-non-isa/riscv-elf-psabi-doc#417> (`CFI_LP_UNLABELED` and `CFI_SS`) and PR <riscv-non-isa/riscv-elf-psabi-doc#434> (`CFI_LP_FUNC_SIG`).
| sub t1, t1, t3 # shifted .got.plt offset + hdr size + 20 | ||
| auipc t3, %pcrel_hi(.got.plt) |
There was a problem hiding this comment.
I found that the order of these 2 instructions is different in the current PLT scheme. Is this reordering intentional?
|
Kito, I and some other folks have discussed another approach for finer grained forward cfi which would impact PLT generation specifics here. Want to ensure that folks here are also aware. One major point
More details here Please take a look and poke holes |
NOTE: it's chained PR, which based on #417
TODO: We don't add mechanism for generating right PLT yet, we may need a specialized section to record the function signature or hash, so that static linker can generate right label at PLT entries.
Function signature based labeling scheme, follow the "Function types" mangling rule defined in Itanium C++ ABI.
With few specific rules:
mainfunction is using signature of(int, pointer to pointer to char) returning int(FiiPPcE)._dl_runtime_resolveuse zero for the landing pad.defined in the Itanium C++ ABI.
class that first defined the virtual function.
use the type of the first base class. Thunk functions will use the type of
the corresponding base class.