Skip to content

Conversation

@OCHyams
Copy link
Contributor

@OCHyams OCHyams commented Jul 1, 2025

DILocations that are not attached to instructions are encoded using METADATA_LOCATION records which have an abbrev. DILocations attached to instructions are interleaved with instruction records as FUNC_CODE_DEBUG_LOC records, which do not have an abbrev (and FUNC_CODE_DEBUG_LOC_AGAIN which have no operands).

Add a new FUNCTION_BLOCK abbrev FUNCTION_DEBUG_LOC_ABBREV for FUNC_CODE_DEBUG_LOC records.

This reduces the bc file size by up to 7% in CTMark, with many between 2-4% smaller.

per-file size comparison in compile-time-tracker (go to stage1-ReleaseLTO-g).

This optimisation is motivated by #144102, which adds the new Key Instructions fields to bitcode records. The combined patches still overall look to be a slight improvement over the base.

OCHyams added 2 commits July 1, 2025 10:29
DILocations that are not attached to instructions are encoded using
METADATA_LOCATION records which have an abbrev. DILocations attached to
instructions are interleaved with instruction records as FUNC_CODE_DEBUG_LOC
records, which do not have an abbrev (and FUNC_CODE_DEBUG_LOC_AGAIN which have
no operands).

Add a new FUNCTION_BLOCK abbrev FUNCTION_DEBUG_LOC_ABBREV for
FUNC_CODE_DEBUG_LOC records.

This reduces the bc file size by up to 7% in CTMark, with many between 2-4%
smaller.

[per-file file size
compile-time-tracker](https://llvm-compile-time-tracker.com/compare.php?from=75cf826849713c00829cdf657e330e24c1a2fd03&to=1e268ebd0a581016660d9d7e942495c1be041f7d&stat=size-file&details=on)
(go to stage1-ReleaseLTO-g).

This optimisation is motivated by llvm#144102, which adds the new Key Instructions
fields to bitcode records. The combined patches still overall look to be a
slight improvement over the base.
@jmorse
Copy link
Member

jmorse commented Jul 1, 2025

IMO this is good -- although it directly trades-off debug-info builds against normal builds because it's expanding the size of abbrev-records to five bits instead of four. By eye, it look like between a 0.5% and 1% cost per object file [edit: in a normal build].

I've previously discarded some removing-debug-intrinsics optimisations because of that limitation. I'm fine with saying that we should push through it to enable benefits in non-normal builds. With more effort (and the extra abbrev numberspace) we might recover the cost being added now. Further opinions would be most welcome!

@dwblaikie
Copy link
Collaborator

By eye, it look like between a 0.5% and 1% cost per object file [edit: in a normal build].

Huh, that seems like a lot for this one little record...

I wonder if there's any way to reduce that impact - can't make the abbrev condition on debug info, I guess - too early to know that. The bitcode format does dynamically create abbrevs - but I guess if you don't create explicit ones like this, they aren't shared? Each record has its own abbrev? What if we had a way to add shared abbrevs dynamically?

@jmorse
Copy link
Member

jmorse commented Jul 1, 2025

Huh, that seems like a lot for this one little record...

Alas I believe (96% confidence) it's the width of every abbreviated-code in a function-block, thus a very large number of records in the program grow by one bit, in both debug and non-debug modes.

I wonder if there's any way to reduce that impact - can't make the abbrev condition on debug info, I guess - too early to know that.

Hhmmm -- the ModuleBitcodeWriterBase class does get constructed with a Module handy, and we could examine whether that Module had any compile-units? There would be a new failure mode, when a program contains a DILocation that gets emitted as a large abbrev but we didn't correctly determine that in advance and configured abbrevs to be small. I suppose we can scatter assertions to stop that.

Customising the abbrev width based on whether the module contains debug-info feels awkward; but on the other hand, this is the benefit of bitcode abbrevs being emitted inline, we get to customise them without any kind of compatibility issues. Plus with that precedent, we can add some more abbrevs for RemoveDIs records to reduce filesize further, without non-debug builds paying a price.

@nikic
Copy link
Contributor

nikic commented Jul 1, 2025

I feel like we'd probably gain more long term by switching to 5 bits now and thus enabling use of more abbreviations.

But if you want to preserve the size, I strongly suspect that you can kick out FUNCTION_INST_UNOP_ABBREV and FUNCTION_INST_UNOP_FLAGS_ABBREV without much impact. These are used exclusively for encoding the fneg instruction, which shouldn't be particularly common...

@OCHyams
Copy link
Contributor Author

OCHyams commented Jul 2, 2025

@dblaikie:

I wonder if there's any way to reduce that impact - can't make the abbrev condition on debug info, I guess - too early to know that. The bitcode format does dynamically create abbrevs - but I guess if you don't create explicit ones like this, they aren't shared? Each record has its own abbrev? What if we had a way to add shared abbrevs dynamically?

I just tried dynamic abbrevs (to be clear, creating them in function blocks on the fly rather than declaring them up front in function block-info) but it seems you still need to bump the CodeLen from 4 to 5 bits. And looking at the file sizes it seems Jeremy is right that it's the CodeLen increase that's causing the size increase for non-debug builds. Using dynamic abbrevs reduces the savings in debug info builds slightly compared to the initial patch, but it's still pretty good wins overall:

@jmorse:

Customising the abbrev width based on whether the module contains debug-info feels awkward

I'll happily give this a go, but this may not be necessary depending on the results below.

@nikita:

But if you want to preserve the size, I strongly suspect that you can kick out FUNCTION_INST_UNOP_ABBREV and FUNCTION_INST_UNOP_FLAGS_ABBREV without much impact.

base vs remove-unop-abbrevs

There's a slight saving in most non-debug files, but a reasonable regression in a few, e.g. stage1-ReleaseThinLTO CMakeFiles/lencod.dir/context_ini.c.o 83KiB -> 83KiB (+0.48%) (up to 0.9% for gim_tri_collision.cpp which is a small - 500loc - file with a bunch of fp stuff). By my count the average across the .bc files is -0.047% for the non-debug thinLTO build and -0.021% for the debug LTO build. So overall it's a slight win for CTMark looking at percentages (I didn't get numbers for sum size difference, but can if needed).

If everyone's happy with us deferring smart conditional-on-debug-info stuff to a later date, in favour of Nikita's suggestion, then this seems like a good option to me. (I'll put up a PR shortly)

OCHyams added a commit to OCHyams/llvm-project that referenced this pull request Jul 2, 2025
Discussed in llvm#146497.

Seems to have a positive impact on bitcode file size in CTMark LTO builds, and
makes space for more abbrevs without bumping the CodeLen for function blocks.

Alternatively, we could just bump the CodeLen to 5 bits.
@OCHyams
Copy link
Contributor Author

OCHyams commented Jul 2, 2025

Or we just bump it to 5 bits as @nikic says, I don't have a strong opinion either way - #146717 for the abbrev removal.

@OCHyams
Copy link
Contributor Author

OCHyams commented Jul 3, 2025

Thanks @nikic for those numbers, that looks good. It seems like making CodeLen 5 bits is sensible and should be fairly uncontroversial in that case?

@jmorse @dwblaikie are you happy with this patch as is?

Copy link
Member

@jmorse jmorse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO that clearly justifies this course of action. LGTM!

(If there are stronger opinions out there we can revert back).

@jmorse
Copy link
Member

jmorse commented Jul 6, 2025

NB, as Orlando is away, I'm going to land this via #147211 with a test fixup.

jmorse added a commit that referenced this pull request Jul 6, 2025
DILocations that are not attached to instructions are encoded using
METADATA_LOCATION records which have an abbrev. DILocations attached to
instructions are interleaved with instruction records as FUNC_CODE_DEBUG_LOC
records, which do not have an abbrev (and FUNC_CODE_DEBUG_LOC_AGAIN
which have no operands).

Add a new FUNCTION_BLOCK abbrev FUNCTION_DEBUG_LOC_ABBREV for
FUNC_CODE_DEBUG_LOC records.

This reduces the bc file size by up to 7% in CTMark, with many between 2-4%
smaller.

[per-file file size
compile-time-tracker](https://llvm-compile-time-tracker.com/compare.php?from=75cf826849713c00829cdf657e330e24c1a2fd03&to=1e268ebd0a581016660d9d7e942495c1be041f7d&stat=size-file&details=on)
(go to stage1-ReleaseLTO-g).

This optimisation is motivated by #144102, which adds the new Key Instructions
fields to bitcode records. The combined patches still overall look to be a
slight improvement over the base.

(Originally reviewed in PR #146497)

Co-authored-by: Orlando Cazalet-Hyams <[email protected]>
@jmorse
Copy link
Member

jmorse commented Jul 6, 2025

Merged in 51f4e2c , as discussed I think the size-cost is worth it, and it sounds like we'll gain it back with the additional abbrevs anyway.

@jmorse jmorse closed this Jul 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants