-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[llvm-remarkutil] Introduce filter command #159784
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[llvm-remarkutil] Introduce filter command #159784
Conversation
Created using spr 1.3.7-wip
Created using spr 1.3.7-wip [skip ci]
Created using spr 1.3.7-wip
Currently there are two serialization modes for bitstream Remarks: standalone and separate. The separate mode splits remark metadata (e.g. the string table) from actual remark data. The metadata is written into the object file by the AsmPrinter, while the remark data is stored in a separate remarks file. This means we can't use bitstream remarks with tools like opt that don't generate an object file. Also, it is confusing to post-process bitstream remarks files, because only the standalone files can be read by llvm-remarkutil. We always need to use dsymutil to convert the separate files to standalone files, which only works for MachO. It is not possible for clang/opt to directly emit bitstream remark files in standalone mode, because the string table can only be serialized after all remarks were emitted. Therefore, this change completely removes the separate serialization mode. Instead, the remark string table is now always written to the end of the remarks file. This requires us to tell the serializer when to finalize remark serialization. This automatically happens when the serializer goes out of scope. However, often the remark file goes out of scope before the serializer is destroyed. To diagnose this, I have added an assert to alert users that they need to explicitly call finalizeLLVMOptimizationRemarks. This change paves the way for further improvements to the remark infrastructure, including more tooling (e.g. #159784), size optimizations for bitstream remarks, and more. Pull Request: #156715
#include "RemarkUtilHelpers.h" | ||
#include "llvm/ADT/MapVector.h" | ||
#include "llvm/Support/Regex.h" | ||
#include <map> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need to pull this in here now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed unnecessary includes from RemarkUtilHelpers. Previously this was pulled in by YAMLRemarkSerializer.h
thru YAMLTraits.h
in RemarkUtilHelpers.
for (; MaybeRemark; MaybeRemark = Parser.next()) { | ||
Remark &Remark = **MaybeRemark; | ||
if (!Filter.filterRemark(Remark)) | ||
continue; | ||
Serializer.emit(Remark); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it be possible to create a iterator adopter that takes a parser and returns an iterator over filtered remarks and use it here and in RemarkCounter.cpp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The iterator/range needs to be fallible and must be able to take ownership of the unique_ptr<Remark>
. This means we can't use a combination of iterator_range and fallible_iterator to implement this. We would have to roll this from scratch. My prototype is the same amount of code as the entire filter command. I don't think that's worth it, at least it's out of scope for this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's OK, although this might be worth more thought in the future, to avoid duplicating the logic in every remarks tool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tobias-stadler @fhahn I'm seeing failures in some external modules builds due to reuse of the filter
identifier here. E.g.
llvm-project/llvm/tools/llvm-remarkutil/RemarkFilter.cpp:23:11: error: redefinition of 'filter' as different kind of symbol
05:35:57 23 | namespace filter {
05:35:57 | ^
05:35:57 /<sdk>/usr/include/curses.h:686:29: note: previous definition is here
05:35:57 686 | extern NCURSES_EXPORT(void) filter (void);
Is there a reasonable alternative name that we could use for this namespace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created using spr 1.3.7-wip [skip ci]
Created using spr 1.3.7-wip
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks
for (; MaybeRemark; MaybeRemark = Parser.next()) { | ||
Remark &Remark = **MaybeRemark; | ||
if (!Filter.filterRemark(Remark)) | ||
continue; | ||
Serializer.emit(Remark); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's OK, although this might be worth more thought in the future, to avoid duplicating the logic in every remarks tool
Add a filter command to llvm-remarkutil. This can be used to extract remarks for a certain function, pass, type, etc. from a large remarks file to a new remarks file. This uses the same filter arguments as the count command. Depends on #156715. Thanks to this change, we don't need to buffer all remarks before reserializing them, so we should be able to process arbitrarily large files. Pull Request: llvm/llvm-project#159784
Remove the filter namespace, because `filter` is used by `curses.h`, causing some external build failures (llvm#159784 (comment)). We don't really need the namespace here anyways, because everything is static. This was just following what some of the other commands in llvm-remarkutil are doing. Pull Request: llvm#160802
Add a filter command to llvm-remarkutil. This can be used to extract remarks for a certain function, pass, type, etc. from a large remarks file to a new remarks file. This uses the same filter arguments as the count command. Depends on llvm#156715. Thanks to this change, we don't need to buffer all remarks before reserializing them, so we should be able to process arbitrarily large files. Pull Request: llvm#159784
Remove the filter namespace, because `filter` is used by `curses.h`, causing some external build failures (#159784 (comment)). We don't really need the namespace here anyways, because everything is static. This was just following what some of the other commands in llvm-remarkutil are doing.
… (#160802) Remove the filter namespace, because `filter` is used by `curses.h`, causing some external build failures (llvm/llvm-project#159784 (comment)). We don't really need the namespace here anyways, because everything is static. This was just following what some of the other commands in llvm-remarkutil are doing.
Add a filter command to llvm-remarkutil. This can be used to extract
remarks for a certain function, pass, type, etc. from a large remarks
file to a new remarks file. This uses the same filter arguments as the
count command.
Depends on #156715. Thanks to this change, we don't need to buffer all
remarks before reserializing them, so we should be able to process
arbitrarily large files.