-
Notifications
You must be signed in to change notification settings - Fork 14.9k
[LLVM][MC][DecoderEmitter] Add support to specialize decoder per bitwidth #154865
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
46bbe78
to
3297671
Compare
Hi @s-barannikov and @topperc, this is the new simplified version of the earlier change to compile out decoders that are unused for a particular bitwidth. This approach of using a type-trait seems simpler and is much less code changes than before. Note that it includes both the per-bitwidth decoder specialization as well as numbering decoders for each bitwidth, yielding some savings on the size of the decoder tables as well. I'll fill in the description soon to explain the motivation and changes. |
For AMDGPU, for example, the generated code looks like:
|
69c8e70
to
0930ffe
Compare
Just fixing some comments. |
Looks good to me overall. One thing I'm not sure of is using std::bitset where a larger integer type would do (e.g., using std::bitset<48> instead of uint64_t). Probably not a big deal, but I thought it is worth pointing out. Secondly, what will happen if I use a "wrong" type of instruction when calling decodeInstruction? Will it silently use the wrong decodeToMCInst? If so, this is not great, and it should be possible to add a runtime check that calling decodeInstruction with a particular table resolves to the correct decodeToMCInst. |
Right, for the bitwidth specialization to work though, we need different types, so using the same type for different bitwidth (like uint64_t for both 48 and 64 bit instructions) is not supported. I guess for RISCV, if 64-bit instructions are not supported, there is a choice is just using 64-bits. I'd hope though that std::bitset<48> would internally just use a uint64_t. One thing I thought of is generalizing this to allow > 1 bitwidth for the same type. The trait can be a per-type function like:
This will allow using the same C++ type for > 1 bitwidths. Then we would need to remove the current specializations in MCDecoder.h as they assume a specific usage and then backends can choose. Not sure if that generalization is needed.
Yeah, @topperc had suggested that in the earlier version. I plan to do that in a follow-on MR. The first byte of the generated decoder table will be the supported bitwidth and we can do an initial runtime check. |
LMK if this is something to pursue. I suspect it may not be too much effort to do this and offers additional flexibility for backends. |
I did and now RISCV can continue using uint64_t for 48-bit insts. Will update the PR soon |
The windows build failure seems to be this : https://stackoverflow.com/questions/29502052/template-specialization-and-enable-if-problems |
The specific form that seems to work with MSVC (and clang and gcc) was derived here: https://godbolt.org/z/Eqdf4Tehz And this was the solution: https://stackoverflow.com/questions/75686697/how-to-work-around-function-template-has-already-been-defined-when-using-std |
Looks like MSVC builds with compiler version 14.x do not support either form, so maybe not possible to pursue this at the moment. |
Oh, the bot does incremental builds. It got rebuilt in the first build with this PR: http://45.33.8.238/macm1/113315/step_3.txt |
Thanks. I'm not sure what's happening. Basically the output or running llvm-tblgen for AMDGPU with the specialize flags seems to be just the record dumps. Can you show the tablegen command the produced AMDGPUGenDisassemblerTables.inc? |
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/168/builds/15779 Here is the relevant piece of the build log for the reference
|
Thanks, that was enough to figure it out (9f42ba3). Hopefully better now :) |
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/138/builds/18408 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/174/builds/23753 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/95/builds/17763 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/186/builds/12066 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/45/builds/15988 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/145/builds/9505 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/76/builds/12580 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/66/builds/18677 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/88/builds/15612 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/64/builds/5648 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/33/builds/22152 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/127/builds/4604 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/8/builds/20363 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/141/builds/11240 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/204/builds/20593 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/42/builds/5972 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/203/builds/21781 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/205/builds/20570 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/152/builds/4830 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/202/builds/3140 Here is the relevant piece of the build log for the reference
|
One the topic of using the same underlying type for > 1 bitwidths: I had earlier prototyped using something like:
This ran into MSVC issues and I wasn't able to find a code pattern that worked. However, it seems we can achieve the same by having a wrapper type to do this:
We can do this for RISCV if we do need 48 vs 64-bit distinction and do not want to use |
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/49/builds/2141 Here is the relevant piece of the build log for the reference
|
This change adds an option to specialize decoders per bitwidth, which can help reduce the (compiled) code size of the decoder code.
Current state:
Currently, the code generated by the decoder emitter consists of two key functions:
decodeInstruction
which is the entry point into the generated code anddecodeToMCInst
which is invoked when a decode op is reached while traversing through the decoder table. Both functions are templated onInsnType
which is the raw instruction bits that are supplied todecodeInstruction
.Several backends call
decodeInstruction
with differentInsnType
types, leading to several template instantiations of these functions in the final code. As an example, AMDGPU instantiates this function with typeDecoderUInt128
type for decoding 96/128-bit instructions,uint64_t
for decoding 64-bit instructions, anduint32_t
for decoding 32-bit instructions. Since there is just onedecodeToMCInst
in the generated code, it has code that handles decoding for all instruction sizes. However, the decoders emitted for different instructions sizes rarely have any intersection with each other. That means, in the AMDGPU case, the instantiation with InsnType == DecoderUInt128 has decoder code for 32/64-bit instructions that is never exercised. Conversely, the instantiation with InsnType == uint64_t has decoder code for 128/96/32-bit instructions that is never exercised. This leads to unnecessary dead code in the generated disassembler binary (that the compiler cannot eliminate by itself).New state:
With this change, we introduce an option
specialize-decoders-per-bitwidth
. Under this mode, the DecoderEmitter will generate several versions ofdecodeToMCInst
function, one for each bitwidth. The code is still templated, but will require backends to specify, for eachInsnType
used, the bitwidth of the instruction that the type is used to represent using a type-traitInsnBitWidth
. This will enable the templated code to choose the right variant ofdecodeToMCInst
. Under this mode, a particular instantiation will only end up instantiating a single variant ofdecodeToMCInst
generated and that will include only those decoders that are applicable to a single bitwidth, resulting in elimination of the code duplication through instantiation and a reduction in code size.Additionally, under this mode, decoders are uniqued only within a given bitwidth (as opposed to across all bitwidths without this option), so the decoder index values assigned are smaller, and consume less bytes in their ULEB128 encoding. As a result, the generated decoder tables can also reduce in size.
Adopt this feature for the AMDGPU and RISCV backend. In a release build, this results in a net 55% reduction in the .text size of libLLVMAMDGPUDisassembler.so and a 5% reduction in the .rodata size. For RISCV, which today uses a single
uint64_t
type, this results in a 3.7% increase in code size (expected as we instantiate the code 3 times now).Actual measured sizes are as follows: