[analyzer] Disable graph-trim-interval by default #111843

NagyDonat · 2024-10-10T14:33:51Z

Because this hack is responsible for use-after-free errors that can trigger assertion failures, as reported e.g. in the github issue #105512

Disabling this performance optimization can cause severe slowdowns (up to +200% runtime) on a few specific translation units, but its effects are barely significant when they are averaged over a whole project: even on projects that are "severely affected" the slowdowns are <20 seconds.

If someone happens to be heavily affected by this performance loss (and doesn't fear the crashes) they can re-enable ExplodedNode reclamation by passing -analyzer-option graph-trim-interval=1000 (the old default) to the analyzer.

This commit is a temporary workaround to eliminate the crashes ASAP; in addition to this, we're also working to develop a long-term solution that can hopefully remove 'graph-trim-interval' altogether and replace it with different solutions that guarantee a good runtime without breaking the invariants of the ExplodedGraph. (It would be much easier to reason about the execution paths if we didn't have to think about the case that some nodes can just disappear.)

Because this hack is responsible for use-after-free errors that can trigger assertion failures, as reported e.g. in the github issue llvm#105512 Disabling this performance optimization can cause severe slowdowns (up to +200% runtime) on a few specific translation units, but its effects are barely significant when they are averaged over a whole project: even on projects that are "severely affected" the slowdowns are <20 seconds. If someone happens to be heavily affected by this performance loss (and doesn't fear the crashes) they can re-enable ExplodedNode reclamation by passing -analyzer-option graph-trim-interval=1000 (the old default) to the analyzer. This commit is a temporary workaround to eliminate the crashes ASAP; in addition to this, we're also working to develop a long-term solution that can hopefully remove 'graph-trim-interval' altogether and replace it with different solutions that guarantee a good runtime without breaking the invariants of the ExplodedGraph. (It would be much easier to reason about the execution paths if we didn't have to think about the case that some nodes can just disappear.)

llvmbot · 2024-10-10T14:34:27Z

@llvm/pr-subscribers-clang-static-analyzer-1

@llvm/pr-subscribers-clang

Author: Donát Nagy (NagyDonat)

Changes

Because this hack is responsible for use-after-free errors that can trigger assertion failures, as reported e.g. in the github issue #105512

Disabling this performance optimization can cause severe slowdowns (up to +200% runtime) on a few specific translation units, but its effects are barely significant when they are averaged over a whole project: even on projects that are "severely affected" the slowdowns are <20 seconds.

If someone happens to be heavily affected by this performance loss (and doesn't fear the crashes) they can re-enable ExplodedNode reclamation by passing -analyzer-option graph-trim-interval=1000 (the old default) to the analyzer.

This commit is a temporary workaround to eliminate the crashes ASAP; in addition to this, we're also working to develop a long-term solution that can hopefully remove 'graph-trim-interval' altogether and replace it with different solutions that guarantee a good runtime without breaking the invariants of the ExplodedGraph. (It would be much easier to reason about the execution paths if we didn't have to think about the case that some nodes can just disappear.)

Full diff: https://github.com/llvm/llvm-project/pull/111843.diff

2 Files Affected:

(modified) clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def (+5-3)
(modified) clang/test/Analysis/analyzer-config.c (+1-1)

diff --git a/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def b/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def
index 737bc8e86cfb6a..ab7d696e595bce 100644
--- a/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def
+++ b/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def
@@ -393,9 +393,11 @@ ANALYZER_OPTION(
 
 ANALYZER_OPTION(
     unsigned, GraphTrimInterval, "graph-trim-interval",
-    "How often nodes in the ExplodedGraph should be recycled to save memory. "
-    "To disable node reclamation, set the option to 0.",
-    1000)
+    "[DEPRECATED] When set to a number N > 0, this enables reclamation of "
+    "'unimportant' nodes in the ExplodedGraph once per N node creation steps. "
+    "This reduces the memory usage, but can cause use-after-free errors or "
+    "crashes, so will be phased out.",
+    0)
 
 ANALYZER_OPTION(
     unsigned, MinCFGSizeTreatFunctionsAsLarge,
diff --git a/clang/test/Analysis/analyzer-config.c b/clang/test/Analysis/analyzer-config.c
index 47594e8317bc79..2b6230f0921e60 100644
--- a/clang/test/Analysis/analyzer-config.c
+++ b/clang/test/Analysis/analyzer-config.c
@@ -84,7 +84,7 @@
 // CHECK-NEXT: experimental-enable-naive-ctu-analysis = false
 // CHECK-NEXT: exploration_strategy = unexplored_first_queue
 // CHECK-NEXT: faux-bodies = true
-// CHECK-NEXT: graph-trim-interval = 1000
+// CHECK-NEXT: graph-trim-interval = 0
 // CHECK-NEXT: ignore-bison-generated-files = true
 // CHECK-NEXT: ignore-flex-generated-files = true
 // CHECK-NEXT: inline-lambdas = true

NagyDonat · 2024-10-10T14:41:27Z

Feedback about this topic in the parent issue:

@steakhal at #105512 (comment)

I can stand by disabling node reclamation given that our measurements were correct - so I'd like someone else to repeat the measurements at some scale to gain confidence that our decision is based of real datapoints. That measurements should also include ffmpeg (as a project showcasing the slowdown), and confirm that even in worst case translation unit analysis times behave as we described.

Assuming the observed numbers align with the expectation, we can turn it off by default.

@xazax at #105512 (comment)

Note that this is not just about performance, but about memory. If the change in the memory consumption is significant, that might be the difference whether some users can do -j 12 or -j 8 before running out of memory which would be a huge difference in the throughput. I don't think we can make this decision without looking at the memory consumption.

steakhal · 2024-10-10T15:03:39Z

If someone happens to be heavily affected by this performance loss (and doesn't fear the crashes) they can re-enable ExplodedNode reclamation by passing -analyzer-option graph-trim-interval=1000 (the old default) to the analyzer.

Correct me if I'm wrong, but this is an assertion failure. That said, these nodes come from a pool that doesn't get deallocated, but all at once (BumpPtrAllocator). Consequently, all pointers point to allocated memory, regardless if the object's destructor was called. In practice, this means that this issue can't cause segfaults. We may get surprising (but valid) states super rarely, but that can't manifest crashes. The issue the user reported underpins this as that refers to the assertion.

So, to me, this isn't an urgent or important issue that would deserve immediate actions. One such equally good option would be removing that assert.
Despite all I said here, I'm still very much looking for the day disabling node reclamation. I just don't get the urgency.

steakhal · 2024-10-10T15:07:14Z

And the message I wanted to share xD

Is it a possible way forward dropping that assert?

NagyDonat · 2024-10-10T16:15:13Z

Is it a possible way forward dropping that assert?

The function that performs the assertion is not part of the static analyzer, it's a generic graph algorithm from an LLVM support library and the assertion seems to be a really obvious sanity check. I don't think that it's reasonable to remove the assertion months after it was added.

steakhal · 2024-10-10T17:05:55Z

Is it a possible way forward dropping that assert?

The function that performs the assertion is not part of the static analyzer, it's a generic graph algorithm from an LLVM support library and the assertion seems to be a really obvious sanity check. I don't think that it's reasonable to remove the assertion months after it was added.

Ah, yes. Well, thats makes this choice easier. Somehow I misremembered. Thanks for insisting!

NagyDonat · 2024-11-21T15:05:17Z

It turns out that this commit increases the runtime by ~3% on a set of 6-8 open source projects that we used as a benchmark. (The testing was done on our local fork by @gamesh411, he can provide more accurate data if needed.) Based on this, we decided to temporarily disable the assertion on our downstream branch (instead of using this patch), so I'm closing this PR because if we don't use it ourselves, then I won't try to sell it to the open source community.

I hope that eventually there will be a permanent solution for this issue, but I don't know exactly when. This task is assigned to an intern (Gábor Tóthvári) who is capable and diligent, but has other responsibilities (e.g. university studies) so doesn't work on this full time.

llvmbot added clang Clang issues not falling into any other category clang:static analyzer labels Oct 10, 2024

NagyDonat requested review from Xazax-hun, dkrupp, haoNoQ and steakhal October 10, 2024 14:34

NagyDonat mentioned this pull request Oct 10, 2024

clang -cc1 -analyze crashes with Assertion `(!BB || Parent == NodeTrait::getParent(const_cast<NodeT *>(BB))) && "cannot get DomTreeNode of block with different parent"' failed. #105512

Open

NagyDonat closed this Nov 21, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[analyzer] Disable graph-trim-interval by default #111843

[analyzer] Disable graph-trim-interval by default #111843

Uh oh!

NagyDonat commented Oct 10, 2024

Uh oh!

llvmbot commented Oct 10, 2024 •

edited

Loading

Uh oh!

NagyDonat commented Oct 10, 2024

Uh oh!

steakhal commented Oct 10, 2024

Uh oh!

steakhal commented Oct 10, 2024

Uh oh!

NagyDonat commented Oct 10, 2024

Uh oh!

steakhal commented Oct 10, 2024

Uh oh!

NagyDonat commented Nov 21, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

[analyzer] Disable graph-trim-interval by default #111843

[analyzer] Disable graph-trim-interval by default #111843

Uh oh!

Conversation

NagyDonat commented Oct 10, 2024

Uh oh!

llvmbot commented Oct 10, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

NagyDonat commented Oct 10, 2024

Uh oh!

steakhal commented Oct 10, 2024

Uh oh!

steakhal commented Oct 10, 2024

Uh oh!

NagyDonat commented Oct 10, 2024

Uh oh!

steakhal commented Oct 10, 2024

Uh oh!

NagyDonat commented Nov 21, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

llvmbot commented Oct 10, 2024 •

edited

Loading