[llvm-remarkutil] Introduce filter command #159784

tobias-stadler · 2025-09-19T14:34:46Z

Add a filter command to llvm-remarkutil. This can be used to extract
remarks for a certain function, pass, type, etc. from a large remarks
file to a new remarks file. This uses the same filter arguments as the
count command.

Depends on #156715. Thanks to this change, we don't need to buffer all
remarks before reserializing them, so we should be able to process
arbitrarily large files.

Created using spr 1.3.7-wip

Currently there are two serialization modes for bitstream Remarks: standalone and separate. The separate mode splits remark metadata (e.g. the string table) from actual remark data. The metadata is written into the object file by the AsmPrinter, while the remark data is stored in a separate remarks file. This means we can't use bitstream remarks with tools like opt that don't generate an object file. Also, it is confusing to post-process bitstream remarks files, because only the standalone files can be read by llvm-remarkutil. We always need to use dsymutil to convert the separate files to standalone files, which only works for MachO. It is not possible for clang/opt to directly emit bitstream remark files in standalone mode, because the string table can only be serialized after all remarks were emitted.

Therefore, this change completely removes the separate serialization mode. Instead, the remark string table is now always written to the end of the remarks file. This requires us to tell the serializer when to finalize remark serialization. This automatically happens when the serializer goes out of scope. However, often the remark file goes out of scope before the serializer is destroyed. To diagnose this, I have added an assert to alert users that they need to explicitly call finalizeLLVMOptimizationRemarks.

This change paves the way for further improvements to the remark infrastructure, including more tooling (e.g. #159784), size optimizations for bitstream remarks, and more.

Pull Request: #156715
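For readers unfamiliar with the "string table at the end" layout, here is a minimal toy sketch of the idea. It is not LLVM's serializer: `ToyRemarkSerializer`, its text format, and `serializeExample` are hypothetical names made up for illustration. Remarks are written as soon as they are emitted and refer to strings by index; the table itself can only be written once no more remarks will arrive, which is why a finalize step is needed and why the output stream must outlive it.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical toy serializer; the names and the text format are
// illustrative only and do not mirror LLVM's bitstream serializer.
class ToyRemarkSerializer {
  std::ostream &OS;
  std::map<std::string, std::size_t> Index; // string -> table slot
  std::vector<std::string> Table;           // table slot -> string
  bool Finalized = false;

  // Returns the table slot for S, adding S to the table if it is new.
  std::size_t intern(const std::string &S) {
    auto It = Index.find(S);
    if (It != Index.end())
      return It->second;
    Index.emplace(S, Table.size());
    Table.push_back(S);
    return Table.size() - 1;
  }

public:
  explicit ToyRemarkSerializer(std::ostream &Out) : OS(Out) {}

  // Each remark is written immediately and refers to strings by index.
  void emit(const std::string &Pass, const std::string &Function) {
    assert(!Finalized && "emit() called after finalize()");
    std::size_t PassIdx = intern(Pass);
    std::size_t FuncIdx = intern(Function);
    OS << "remark " << PassIdx << ' ' << FuncIdx << '\n';
  }

  // The table is only complete after the last remark, so it is
  // appended to the end of the stream.
  void finalize() {
    if (Finalized)
      return;
    Finalized = true;
    for (const std::string &S : Table)
      OS << "str " << S << '\n';
  }

  // Finalizing here is only safe while the stream is still alive.
  ~ToyRemarkSerializer() { finalize(); }
};

std::string serializeExample() {
  std::ostringstream OS;
  {
    ToyRemarkSerializer S(OS);
    S.emit("inline", "foo");
    S.emit("inline", "bar");
  } // The destructor appends the string table while OS still exists.
  return OS.str();
}
```

If the stream were destroyed before the serializer, the destructor would write to a dead object. That is the situation the assert described above is meant to catch, by pushing callers toward an explicit finalize call.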

fhahn · 2025-09-23T08:45:13Z

llvm/tools/llvm-remarkutil/RemarkCounter.h

 #include "RemarkUtilHelpers.h"
 #include "llvm/ADT/MapVector.h"
 #include "llvm/Support/Regex.h"
+#include <map>


Why do we need to pull this in here now?

I removed unnecessary includes from RemarkUtilHelpers.h. Previously, <map> was pulled in transitively by YAMLRemarkSerializer.h (through YAMLTraits.h), which RemarkUtilHelpers.h included.

fhahn · 2025-09-23T08:50:32Z

llvm/tools/llvm-remarkutil/RemarkFilter.cpp

+  for (; MaybeRemark; MaybeRemark = Parser.next()) {
+    Remark &Remark = **MaybeRemark;
+    if (!Filter.filterRemark(Remark))
+      continue;
+    Serializer.emit(Remark);
+  }


Would it be possible to create an iterator adaptor that takes a parser and returns an iterator over filtered remarks, and use it here and in RemarkCounter.cpp?

The iterator/range needs to be fallible and must be able to take ownership of the unique_ptr<Remark>. This means we can't use a combination of iterator_range and fallible_iterator to implement this; we would have to roll it from scratch. My prototype is the same amount of code as the entire filter command. I don't think that's worth it, and at the very least it's out of scope for this PR.

Yeah that's OK, although this might be worth more thought in the future, to avoid duplicating the logic in every remarks tool
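As a rough illustration of how the duplicated loop could be shared without a full iterator adaptor, here is a callback-based sketch. It is deliberately not the fallible iterator discussed above, and it does not use LLVM types: `ToyRemark`, `ToyParser`, `ParseResult`, and `forEachFilteredRemark` are all hypothetical stand-ins. The helper keeps ownership of each parsed remark inside the loop body and reports parse failure through its return value.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in types; hypothetical, not LLVM's Remark or RemarkParser.
struct ToyRemark {
  std::string Pass;
  std::string Function;
};

enum class ParseStatus { Ok, EndOfFile, Error };

struct ParseResult {
  ParseStatus Status;
  std::unique_ptr<ToyRemark> Remark; // non-null only when Status is Ok
};

class ToyParser {
  std::vector<ToyRemark> Input;
  std::size_t Pos = 0;

public:
  explicit ToyParser(std::vector<ToyRemark> In) : Input(std::move(In)) {}

  // Hands out one owned remark per call until the input is exhausted.
  ParseResult next() {
    if (Pos == Input.size())
      return {ParseStatus::EndOfFile, nullptr};
    return {ParseStatus::Ok, std::make_unique<ToyRemark>(Input[Pos++])};
  }
};

// Shared loop: pulls remarks from the parser, skips those rejected by
// Keep, and passes the rest to Action. Returns false on a parse error.
template <typename ParserT>
bool forEachFilteredRemark(
    ParserT &Parser, const std::function<bool(const ToyRemark &)> &Keep,
    const std::function<void(const ToyRemark &)> &Action) {
  for (;;) {
    ParseResult R = Parser.next();
    if (R.Status == ParseStatus::EndOfFile)
      return true;
    if (R.Status == ParseStatus::Error)
      return false;
    if (Keep(*R.Remark))
      Action(*R.Remark);
  }
}

// Example client: collect the function names of remarks from one pass.
std::vector<std::string> functionsWithPass(const std::string &Pass) {
  std::vector<ToyRemark> Input = {
      {"inline", "foo"}, {"licm", "bar"}, {"inline", "baz"}};
  ToyParser Parser(std::move(Input));
  std::vector<std::string> Out;
  bool Ok = forEachFilteredRemark(
      Parser, [&](const ToyRemark &R) { return R.Pass == Pass; },
      [&](const ToyRemark &R) { Out.push_back(R.Function); });
  assert(Ok && "the toy parser never reports an error");
  (void)Ok;
  return Out;
}
```

A counting tool and a filtering tool would then differ only in the `Action` they pass. The trade-off is that callers cannot `break` out early or compose the result with range algorithms, which is what an iterator adaptor would offer.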

lhames · Sep 26, 2025

@tobias-stadler @fhahn I'm seeing failures in some external modules builds due to reuse of the filter identifier here. E.g.

llvm-project/llvm/tools/llvm-remarkutil/RemarkFilter.cpp:23:11: error: redefinition of 'filter' as different kind of symbol
05:35:57    23 | namespace filter {
05:35:57       |           ^
05:35:57 /<sdk>/usr/include/curses.h:686:29: note: previous definition is here
05:35:57   686 | extern NCURSES_EXPORT(void) filter (void);

Is there a reasonable alternative name that we could use for this namespace?

fhahn

LGTM, thanks

Remove the filter namespace, because `filter` is used by `curses.h`, causing some external build failures (llvm#159784 (comment)). We don't really need the namespace here anyways, because everything is static. This was just following what some of the other commands in llvm-remarkutil are doing. Pull Request: llvm#160802
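The collision and the fix can be reproduced in a few lines. This is a sketch under assumptions: the `extern "C"` declaration below merely stands in for what `curses.h` puts into the global namespace, and `matchesPass` / `keepRemark` are hypothetical helper names, not the functions in RemarkFilter.cpp.

```cpp
#include <string>

// Stand-in for the declaration that <curses.h> places in the global
// namespace (the real header wraps it in NCURSES_EXPORT).
extern "C" void filter(void);

// With the declaration above visible, the following is rejected as a
// redefinition of 'filter' as a different kind of symbol:
//   namespace filter { /* ... */ }

// Internal linkage keeps the helper private to this translation unit
// without introducing any new entity named 'filter'.
static bool matchesPass(const std::string &Pass, const std::string &Wanted) {
  return Wanted.empty() || Pass == Wanted;
}

// Hypothetical entry point; an empty Wanted string means "keep all".
bool keepRemark(const std::string &Pass, const std::string &Wanted) {
  return matchesPass(Pass, Wanted);
}
```

Since every helper in the file already has internal linkage, dropping the namespace loses no encapsulation. Renaming the namespace would also have avoided the clash.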

tstellar · 2025-10-21T22:26:02Z

@tobias-stadler I'm seeing this test fail when trying to use the release binary script on the GitHub runners. Do you have any idea why it might be failing? It looks like it could be a line-ending or whitespace issue. (edit: This is on Windows).

https://github.com/llvm/llvm-project/actions/runs/18698571133/job/53322121191?pr=150793

tobias-stadler · 2025-10-21T22:51:05Z

Interesting, this passed the Windows CI. My best guess is that this is a line-ending issue with the diff command. We might need to pass -Z or avoid the diff command altogether. This could depend on whether git is configured to check out CRLF line endings. I don't have a Windows machine set up, so I have no good way to quickly troubleshoot this.

tstellar · 2025-10-21T22:54:30Z

@tobias-stadler Ok, thanks. I'll try to debug this a little more and see what I can find out.

tobias-stadler · 2025-10-21T23:18:05Z

@tstellar I've posted #164516. How can we check if this fixes the release build?

tobias-stadler added 2 commits September 19, 2025 15:34

[spr] initial version

6819c32

[spr] changes to main this commit is based on

d97fe62

tobias-stadler requested review from anemet, fhahn and jroelofs September 19, 2025 14:38

fhahn mentioned this pull request Sep 22, 2025

[Remarks] Restructure bitstream remarks to be fully standalone #156715

Merged

Hopefully fix SerializerFormat

b27faa2

jroelofs approved these changes Sep 22, 2025

fhahn reviewed Sep 23, 2025

tobias-stadler added 3 commits September 24, 2025 00:19

[spr] changes introduced through rebase

eff6b51

Rebase

7ee6eae

Fix stray whitespace

626eb4a

fhahn approved these changes Sep 24, 2025

tobias-stadler changed the base branch from users/tobias-stadler/spr/main.llvm-remarkutil-introduce-filter-command to main September 24, 2025 14:07

tobias-stadler merged commit 6e6a3d8 into main Sep 24, 2025
9 of 10 checks passed

tobias-stadler deleted the users/tobias-stadler/spr/llvm-remarkutil-introduce-filter-command branch September 24, 2025 14:07

tobias-stadler mentioned this pull request Sep 26, 2025

[llvm-remarkutil] filter: Fix curses.h namespace pollution #160802

Merged
