introduced a cache for `followAllReferences()` calls #7192

firewave · 2025-01-07T20:24:06Z

No description provided.

firewave · 2025-01-07T20:28:21Z

This essentially eliminates any meaningful impact by followAllReferences() at all.

-D__GNUC__ --check-level=exhaustive ../lib/utils.cpp

Clang 19 652,987,030 -> ~~624,476,510~~ 618,089,977

followAllReferences() calls from isAliasOf() - 350,100 -> 1,581

The example from https://trac.cppcheck.net/ticket/10765#comment:4:

Clang 19 3,056,382,003 -> ~~2,838,708,731~~ 2,815,165,117

followAllReferences() calls from isAliasOf() - 2,592,565 -> 641

firewave · 2025-01-07T20:33:25Z

lib/valueflow.cpp

            [&] {
                // Follow references
-                auto refs = followAllReferences(tok);
+                auto refs = tok->refs();


needs to be const auto&.

Done.

I filed https://trac.cppcheck.net/ticket/13533 about detecting this.

firewave · 2025-01-07T20:34:32Z

lib/vf_analyzers.cpp

            return Action::Invalid;
        // Follow references
-        auto refs = followAllReferences(tok);
+        auto refs = tok->refs();


This copy is necessary since an additional entry is being added. But I think this is not necessary and I will try to refactor the code to avoid it.

I adjusted this by unfortunately there is some redundant code introduced.

pfultz2 · 2025-01-07T20:40:42Z

lib/token.cpp

+    if (!mImpl->mRefs)
+        mImpl->mRefs = new SmallVector<ReferenceToken>(followAllReferences(this));
+    return *mImpl->mRefs;
+}


This is not the right way to do this. This is a const method that is modifying the token. Instead followAllReferences should be moved to the SymbolDatabase and there should be a pass that fills this in for all of the tokens.

Yes. I was about to add a comment about this. This violates the technical const and if we would not allow this (I hope some day I will finish up that change) this would require mutable (which from my experience is acceptable for caches inside objects).

I am not sure how easy it would be to implement an earlier pass since it is not done for all tokens but there are lots of checks which are performed before we actually end up following references. That would need to be replicated I reckon - and that also has a certain visible overhead and we would need to run through that twice then.

Actually I would also have the ValueFlow behave this way so we might avoid running it for code which is not relevant.

This is not the right way to do this. This is a const method that is modifying the token.

That should be totally fine (by precedent). We modify const Token objects all over the place in the ValueFlow and symbol database via const_cast. Obviously it would be better if we didn't but here it is much cleaner and in a single place and as stated before I think this is acceptable practice.

Actually I would also have the ValueFlow behave this way so we might avoid running it for code which is not relevant.

Please disregard this. This is wishful thinking as this would not be possible the way the ValueFlow is working. I totally forgot I already looked into this.

The const_cast should be fixed, but we shouldn't add more code that needs to be fixed.

Also this is called in ValueFlowForward and ValueFlowReverse so its already called on almost every token in functionScopes, so it really won't help performance being a cache.

Furthermore, in copcheck we update the tokens through passes rather than using a cache, this makes it easier to debug and we can provide this information to addons later on. So doing a pass in SymbolDatabase would be consistent with the rest of the code.

Furthermore, in copcheck we update the tokens through passes rather than using a cache, this makes it easier to debug and we can provide this information to addons later on. So doing a pass in SymbolDatabase would be consistent with the rest of the code.

Will give it a try and check how it impacts performance.

Do we know if it causes a perf impact or how much? It seems we are making it worse for premature optimizations.

There are other advantages to doing it the correct way too such as better debugging and addons can take advantage of this information (this seems like a useful analysis for addons). So if we enable it for addons then we will beed to run a pass regardless.

Also you could consider skipping this for functions we are skipping analysis for, if the performance is too bad, but it would be good to see some actual numbers to make this decision.

So the current approach seems like the best approach.

I meant to say "It seems like the currently best approach".

Do we know if it causes a perf impact or how much? It seems we are making it worse for premature optimizations.

Various performance numbers are in the PR. It is a massive improvement. It would also help with the runtime of the CI.

Also you could consider skipping this for functions we are skipping analysis for, if the performance is too bad, but it would be good to see some actual numbers to make this decision.

That was an idea regarding the ValueFlow (see https://trac.cppcheck.net/ticket/12528) but that won't work since not all passes are based on function scopes. But that is currently out-of-scope and is something I am looking at within another context hopefully soon.

It might actually not an issue after all because with the duplicated calls eliminated it basically no longer has any footprint. The only issue could be that we perform it for more tokens than we actually need so that would introduce new overhead but it might not be much. Will test that. Although I would prefer not to have that at all since all the overhead adds up - a lot.

I just realized this is called when setting exprids, so it always called on every token regardless of ValueFlow analysis.

I just gave it a small spin and it does not increase the Ir so that looks feasible.

pfultz2 · 2025-01-07T20:43:40Z

lib/token.h

    void setCppcheckAttribute(CppcheckAttributes::Type type, MathLib::bigint value);
    bool getCppcheckAttribute(CppcheckAttributes::Type type, MathLib::bigint &value) const;

+    SmallVector<ReferenceToken>* mRefs{};


If this is going to be a pointer, you should use std::unique_ptr or std::shared_ptr.

I modeled it after mValues which is also just a raw pointer.

You should model it after mAttributeAlignas, thats a much safer choice. I think the raw pointers were used before managed pointers were available.

That requires ReferenceToken to be a complete type and the astutils.h to be added.

You can just move the ReferenceToken to the token.h header.

True, but followAllReferences() is the code which produces the ReferenceToken objects so I would prefer to keep those together.

It is unfortunate enough that smallvector.h spilled into more code.

The spillage could be addressed by moving TokenImpl into the implementation (hence the name). I tried that in the past but unfortunately the code relies on the implementation being exposed - I will give it another try.

Onward into the next rabbithole: #7831.

It actually compiles without the full type. It is just the highlighting of my IDE which shows errors.

Not so in the CI. I have no idea how it builds locally - with various compilers...with and without Boost.

It builds in the CI by making the constructor = default though.

So if this passes the CI we should finally be able to merge.

firewave · 2025-01-07T20:59:27Z

Something really weird is going on here in the UBSAN job:

Check time: cli/threadexecutor.cpp: 0.53017s
Check time: cli/processexecutor.cpp: 1.41327s
Check time: lib/addoninfo.cpp: 0.172107s
Check time: lib/analyzerinfo.cpp: 0.636273s

The timing information for cli/cmdlineparser.cpp is missing ...

firewave · 2025-01-07T21:20:25Z

Before

Check time: cli/cmdlineparser.cpp: 1091.73s
[...]
Check time: lib/checkio.cpp: 219.069s
[...]
Check time: lib/symboldatabase.cpp: 191.785s
[...]
Check time: lib/tokenize.cpp: 290.026s

After

Check time: cli/cmdlineparser.cpp: 760.299s
[...]
Check time: lib/checkio.cpp: 168.103s
[...]
Check time: lib/symboldatabase.cpp: 145.913s
[...]
Check time: lib/tokenize.cpp: 236.561s

firewave · 2025-03-15T09:15:58Z

Could we merge this without the debug integration for now (I will file a ticket about doing that)? I would like to clean up and test the existing debug output first (see #7258 - as usual stalled) before I add new data. And it would greatly speed up the CI as well as giving a baseline to compare against if running it as a pass would have significant performance impact.

pfultz2 · 2025-04-05T17:04:55Z

This should be done in the symboldatabase either before or during exprids.

pfultz2 · 2025-04-05T17:07:01Z

lib/vf_analyzers.cpp

+                return a;
+        }
+        if (analyzeTok) {
+            auto a = analyze_f(tok);


I dont really like this at all, it is harder to follow and harder to extend in the future.

I dropped it for now.

pfultz2 · 2025-04-05T17:18:36Z

lib/astutils.h

-                                                bool inconclusive = true,
-                                                ErrorPath errors = ErrorPath{},
-                                                int depth = 20);
+SmallVector<ReferenceToken> followAllReferences(const Token* tok, bool temporary = true);


Since we are removing the inconclusive flag, then we should add that to the ReferenceToken so we can know if it was inconclusive or not.

The removed flags were not used outside of astutils.cpp itself. If that information is need in the future we can make the suggested change.

pfultz2 · 2025-04-05T17:19:21Z

lib/token.cpp

    return mTokensFrontBack.list.getFiles()[mImpl->mFileIndex];
 }
+
+const SmallVector<ReferenceToken>& Token::refs() const


This is missing the bool temporary parameter.

That was intentional - added.

firewave · 2025-04-07T09:06:22Z

Thanks for the feedback. Some of the shortcomings were intentional since it seemed to make things simpler and I wanted to get the performance improvement in. But I remembered the changes to be simpler which is not the case so I need a bit to get into it again.

firewave · 2025-05-07T06:42:23Z

@pfultz2 How should the information be presented in the debug output? I also assume it should be a separate section.

firewave · 2025-06-03T12:26:01Z

This should be done in the symboldatabase either before or during exprids.

In the limited cases I tested that produced considerably more calls to followAllReferences() but did not affect the overall performance much. This still needs to be tested with the selfcheck.

…for the GUI (#7570)

firewave · 2025-08-14T09:18:09Z

Given the improvement this change provides and how it affects the CI (we are actually getting even slower and not having this applied makes it harder to pinpoint the other hot spots) I would really prefer if we could just merge the very first version and do the adjustments as a follow-ups.

firewave · 2025-08-14T09:27:21Z

Having this done in incremental commits would also make it possible to bisect differences in behavior/performance this might have based on the stage this cache is generated. Especially after I tried to add a different cache in #7573 which relies on that the results are not being cached beyond a point (which might also cause issues here if this is stored too early - but might not be reflected in our tests).

firewave · 2025-09-15T18:49:40Z

I dropped the changes to generate these beforehand as it was too unspecific. The performance gains are just too big to delay this any longer. I will file a ticket and publish a PR with the WIP code later.

And I would really prefer that to land it in separate commits anyways so it will make it easier to distinguish possible issues caused by adding the cache and by precomputation of the cache.

I am not also confident the data might actually correct when precomputated after working on #7573 (I am trying to improving guardrails so we could detect that).

I also confirmed that adding the cache shows no differences with #7800.

lib/astutils.cpp

danmar · 2025-10-19T15:08:55Z

The performance gains are just too big to delay this any longer.

It looks really nice!

Did you measure in Windows also?
Do you have access to a Windows computer?
I would expect that you get performance gain in windows also, just checking.

As far as I know we have had a focus on Linux and I fear it might have hurt Windows experience. I am talking with a customer, they say it takes 40 minutes for them to scan their project in Linux and 17-18 hours in Windows, using similar hardware. Some more info:

it is valueflow that takes most time.
single threaded analysis is not that bad as I understand it. It's the multithreaded analysis that is much slower. However it is ValueFlow that is very slow and as far as I know it does not rely on disk i/o nor mutexes (except Settings::terminated()).

so well it might be an idea to start measuring performance both with -j1 and -jX.

I can also reproduce that Cppcheck is much slower in windows than in Linux.

firewave · 2025-10-19T15:27:03Z

As far as I know we have had a focus on Linux and I fear it might have hurt Windows experience.

Official Windows release binaries are still built without Boost - see https://trac.cppcheck.net/ticket/13823. All pieces are in place and the only thing left was a step which downloads it and only extracts it partially (will have a look later). Are the Linux binaries built with Boost?

single threaded analysis is not that bad as I understand it. It's the multithreaded analysis that is much slower. However it is ValueFlow that is very slow and as far as I know it does not rely on disk i/o nor mutexes (except Settings::terminated()).

That should only be the case if you have a lot of very small files which generates a lot of output (i.e. debug warnings enabled) when using the process executor (caused by the IPC). Multi-threaded being slower than single-threaded with a long running analysis seems impossible to me. The worst case should be the longest running file in a multi-threaded analysis.

And since Windows has no process executor that does not even apply.

From my own experience I would assume be that there is an issue in the configuration leading to Windows having to process more code from includes than Linux. Enabling information should highlight any missing includes.

It could also be the case that all the system includes are provided and that the Windows versions of them (or the Windows SDK includes which are not used on Linux) are much heavier. Adapting the simplecpp selfcheck for Windows might be a first step to get a baseline difference of the cost of the system includes.

danmar · 2025-10-19T15:53:02Z

Are the Linux binaries built with Boost?

they used Cppcheck Premium binaries and these do not use boost.

From my own experience I would assume be that there is an issue in the configuration leading to Windows having to process more code from includes than Linux.

That is very reasonable but I do not think that is the case. We have significant multithreading slowdown when checking cppcheck source code with such command also:

cd path/to/cppcheck
cppcheck -D__CPPCHECK__ -D__GNUC__ --showtime=file-total -jX lib

danmar · 2025-10-19T16:49:38Z

cppcheck -D__CPPCHECK__ -D__GNUC__ --showtime=file-total -jX lib

For information.. when I run this command on my machine I get such timings:

Linux:
j1: Check time: lib/tokenize.cpp: 5.02091s
j16: Check time: lib/tokenize.cpp: 6.39582s
Linux multithreading overhead: +27.4%

Windows:
j1: Check time: lib/tokenize.cpp: 45.634s
j16: Check time: lib/tokenize.cpp: 79.749s
Windows multithreading overhead: +74.8%

So analysis in Windows is more than 10x slower than analysis in Linux on my machine if -j16 is used.

The open source cppcheck windows release binary is a bit faster but it's not much.

As far as I see, switching from msvc to Clang make it even slower. :-(

firewave · 2025-10-19T17:13:00Z

Please also run Linux with --executor=thread (quite some data has been added to the IPC recently) and Windows with Boost.

firewave · 2025-10-19T17:20:31Z

I did some short test on my system with a smaller file and I did not see any differences between using thread or not.

The process executor code also has some other overhead unrelated to the IPC which will go away when I am done with reworking the executors. That getting awfully close but needs the remaining suppression issues resolved first since those require changes on how we set up certain things for the analysis. If no new blockers show up there I hope it will finally doe by the end of the year (main blocker is the time it takes to get stuff reviewed though).

danmar · 2025-10-20T07:30:43Z

That --executor=thread output is really interesting.

I wrote this script:

/opt/cppcheckpremium/cppcheck --executor=thread -D__CPPCHECK__ -D__GNUC__ --showtime=file-total -j1 lib > t1
/opt/cppcheckpremium/cppcheck --executor=thread -D__CPPCHECK__ -D__GNUC__ --showtime=file-total -j16 lib > t16

/opt/cppcheckpremium/cppcheck -D__CPPCHECK__ -D__GNUC__ --showtime=file-total -j1 lib > p1
/opt/cppcheckpremium/cppcheck -D__CPPCHECK__ -D__GNUC__ --showtime=file-total -j16 lib > p16

grep "Check time: lib/tokenize.cpp:" t1 t16 p1 p16

And got this output:

t1:Check time: lib/tokenize.cpp: 5.17966s
t16:Check time: lib/tokenize.cpp: 49.8277s
p1:Check time: lib/tokenize.cpp: 4.84888s
p16:Check time: lib/tokenize.cpp: 7.88717s

Does this indicate that if we implement a process executor in windows we might get 10x performance boost?

firewave · 2025-10-20T11:47:15Z

Does this indicate that if we implement a process executor in windows we might get 10x performance boost?

No - it indicates there is an concurrency issue when using std::thread. I have only ever profiled for instructions which does not account for any wait times. I will take a look later.

It should be the other way around (albeit not as drastic). That is also apparent from the unit tests where tests using the process executor are much slower than tests using the thread one.

But none of this has to do with the changes in this PR which will speed up things no matter the platform or threading.

firewave · 2025-10-20T12:09:58Z

Do you actually have 16 thread available on your system? If you specify more threads than available the analysis will obviously be much slower since you are overloading the system. As a rule of thumb you should only utilize about 80% of the available threads if you are actively using the system,

firewave · 2025-10-20T13:57:11Z

Do you actually have 16 thread available on your system? If you specify more threads than available the analysis will obviously be much slower since you are overloading the system. As a rule of thumb you should only utilize about 80% of the available threads if you are actively using the system,

Please disregard. In that case it should also happen when using the processes.

firewave · 2025-10-20T19:53:42Z

I am not able to reproduce this locally.

I also thought it might have been caused by your CPU potentially having P- and E-cores but it not occurring with processes also rules this out.

danmar · 2025-10-25T11:28:44Z

I don't know what happened but right now the times looks better. Here are times measured with time:

t1:real 1m5,429s
t8:real 0m15,339s
t16:real        0m13,304s
p1:real 1m7,956s
p8:real 0m16,380s
p16:real        0m13,441s

So the p16 does not stand out this time.

danmar

this looks good to me.

sonarqubecloud · 2025-10-28T11:45:03Z

Quality Gate passed

Issues
2 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

firewave · 2025-10-28T12:45:23Z

So the p16 does not stand out this time.

So maybe it is E-cores? You should possibly look at the runtimes of all files.

During another test I saw a difference between using -j1 and -j2. The CTU analysis is run - even if there is only single file. I wonder if that is either unnecessary (with -j2) or might lead to false negatives (with -j1).

firewave commented Jan 7, 2025

View reviewed changes

pfultz2 reviewed Jan 7, 2025

View reviewed changes

firewave force-pushed the followref-cache branch from 42d1ec9 to b38e8ba Compare January 8, 2025 13:04

firewave marked this pull request as ready for review January 8, 2025 13:38

firewave force-pushed the followref-cache branch from b38e8ba to a36380e Compare January 14, 2025 15:38

firewave marked this pull request as draft January 19, 2025 16:48

firewave force-pushed the followref-cache branch from a36380e to f54680c Compare April 5, 2025 08:08

firewave marked this pull request as ready for review April 5, 2025 08:08

pfultz2 reviewed Apr 5, 2025

View reviewed changes

firewave marked this pull request as draft April 7, 2025 07:33

firewave force-pushed the followref-cache branch 2 times, most recently from fb74dde to 6699af7 Compare June 3, 2025 12:18

firewave force-pushed the followref-cache branch from 6699af7 to 4e3a9ce Compare June 4, 2025 10:09

firewave changed the title ~~introduced a cache for followAllReferences() calls with default parameters~~ introduced a cache for followAllReferences() calls Jun 4, 2025

firewave force-pushed the followref-cache branch from 4e3a9ce to 270a9c8 Compare June 5, 2025 22:01

firewave referenced this pull request Jul 14, 2025

fixed #13909 / refs #13914 - some (enforced) language handling fixes …

1af769e

…for the GUI (#7570)

firewave force-pushed the followref-cache branch 2 times, most recently from 1031202 to 92ad6f3 Compare September 15, 2025 18:39

firewave force-pushed the followref-cache branch from 92ad6f3 to 90c3d46 Compare September 15, 2025 18:49

firewave marked this pull request as ready for review September 15, 2025 18:49

github-advanced-security bot found potential problems Sep 15, 2025

View reviewed changes

lib/astutils.cpp Fixed Show fixed Hide fixed

firewave force-pushed the followref-cache branch 2 times, most recently from e8ba4f3 to fdef434 Compare October 13, 2025 01:31

danmar approved these changes Oct 25, 2025

View reviewed changes

firewave added 3 commits October 28, 2025 12:38

introduced a cache for followAllReferences() calls

5ac4be5

astutils.cpp: avoid potential NULL dereference in isAliasOf()

0c80684

Token: use std::unique_ptr for mRef and mRefTemp

364da72

firewave force-pushed the followref-cache branch from fdef434 to 364da72 Compare October 28, 2025 11:38

firewave merged commit 5f301d3 into danmar:main Oct 28, 2025
55 checks passed

firewave deleted the followref-cache branch October 28, 2025 12:48

Uh oh!

introduced a cache for followAllReferences() calls #7192

introduced a cache for followAllReferences() calls #7192

Conversation

firewave commented Jan 7, 2025

Uh oh!

firewave commented Jan 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

This comment was marked as duplicate.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

firewave commented Jan 7, 2025

Uh oh!

firewave commented Jan 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

firewave commented Mar 15, 2025

Uh oh!

pfultz2 commented Apr 5, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

firewave commented Apr 7, 2025

Uh oh!

introduced a cache for `followAllReferences()` calls #7192

introduced a cache for `followAllReferences()` calls #7192

firewave commented Jan 7, 2025 •

edited

Loading

firewave commented Jan 7, 2025 •

edited

Loading

danmar commented Oct 19, 2025 •

edited

Loading

danmar commented Oct 19, 2025 •

edited

Loading

danmar commented Oct 19, 2025 •

edited

Loading

danmar commented Oct 20, 2025 •

edited

Loading

firewave commented Oct 20, 2025 •

edited

Loading