| 1 | +========================= |
| 2 | +Performance Investigation |
| 3 | +========================= |
| 4 | + |
| 5 | +Multiple factors contribute to the time it takes to analyze a file with Clang Static Analyzer. |
| 6 | +A translation unit contains multiple entry points, each of which takes multiple steps to analyze. |
| 7 | + |
| 8 | +You can add the ``-ftime-trace=file.json`` option to break down the analysis time into individual entry points and steps within each entry point. |
| 9 | +You can explore the generated JSON file in a Chromium browser using the ``chrome://tracing`` URL, |
| 10 | +or using `speedscope <https://speedscope.app>`_. |
| 11 | +Once you narrow down to specific analysis steps you are interested in, you can more effectively employ heavier profilers, |
| 12 | +such as `Perf <https://perfwiki.github.io/main/>`_ and `Callgrind <https://valgrind.org/docs/manual/cl-manual.html>`_. |
| 13 | + |
| 14 | +Each analysis step has a time scope in the trace, corresponds to processing of an exploded node, and is designated with a ``ProgramPoint``. |
| 15 | +If the ``ProgramPoint`` is associated with a location, you can see it on the scope metadata label. |
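
Before opening a trace in a viewer, it can help to rank the time scopes by total duration. The following is a minimal sketch, not part of the analyzer itself. It assumes the generated file follows the Chrome Trace Event format, with a top-level ``traceEvents`` array whose complete events (``"ph": "X"``) carry ``name`` and ``dur`` (microseconds) fields. The function names ``summarize_trace`` and ``summarize_file`` are hypothetical helpers introduced here for illustration.

```python
import json
import sys
from collections import defaultdict


def summarize_trace(trace, top=10):
    """Rank scope names by total duration (microseconds).

    Assumes `trace` is a dict in Chrome Trace Event format with a
    "traceEvents" list. Durations are inclusive, so a parent scope's
    total also contains the time of its nested sub-scopes.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for event in trace.get("traceEvents", []):
        # Only complete events ("ph": "X") have a duration; skip metadata.
        if event.get("ph") != "X" or "dur" not in event:
            continue
        name = event.get("name", "<unnamed>")
        totals[name] += event["dur"]
        counts[name] += 1
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, total, counts[name]) for name, total in ranked[:top]]


def summarize_file(path, top=10):
    """Load a trace JSON file and summarize it."""
    with open(path) as f:
        return summarize_trace(json.load(f), top)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name): python summarize_trace.py trace.json
    for name, total_us, count in summarize_file(sys.argv[1]):
        print(f"{total_us:>12} us  {count:>8} x  {name}")
```

Because the totals are inclusive, an outer scope always ranks at least as high as the scopes nested inside it, so the list is best read as a hint about where to zoom in rather than as a precise profile.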
| 16 | + |
| 17 | +Here is an example of a time trace produced with the following invocation: |
| 18 | + |
| 19 | +.. code-block:: bash |
| 20 | +   :caption: Clang Static Analyzer invocation to generate a time trace of string.c analysis. |
| 21 | + |
| 22 | +   clang -cc1 -nostdsysteminc -analyze -analyzer-constraints=range \ |
| 23 | +       -setup-static-analyzer -analyzer-checker=core,unix,alpha.unix.cstring,debug.ExprInspection \ |
| 24 | +       -verify ./clang/test/Analysis/string.c \ |
| 25 | +       -ftime-trace=trace.json -ftime-trace-granularity=1 |
| 26 | + |
| 27 | +.. image:: ../images/speedscope.png |
| 28 | + |
| 29 | +On the speedscope screenshot above, under the first time ruler is the bird's-eye view of the entire trace that spans a little over 60 milliseconds. |
| 30 | +Under the second ruler (focused on the 18.09-18.13ms time point) you can see a narrowed-down portion. |
| 31 | +The second box ("HandleCode memset...") that spans the entire screen (and actually extends beyond it) corresponds to the analysis of the ``memset16_region_cast()`` entry point, which is defined in the "string.c" test file on line 1627. |
| 32 | +Below it, you can find multiple sub-scopes each corresponding to processing of a single exploded node. |
| 33 | + |
| 34 | +- First: a ``PostStmt`` for some statement on line 1634. This scope has a selected subscope "CheckerManager::runCheckersForCallEvent (Pre)" that takes 5 microseconds. |
| 35 | +- Four other nodes, too small to be discernible at this zoom level. |
| 36 | +- Last on this screenshot: another ``PostStmt`` for a statement on line 1635. |
| 37 | + |
| 38 | +In addition to the ``-ftime-trace`` option, you can use ``-ftime-trace-granularity`` to fine-tune the time trace. |
| 39 | + |
| 40 | +- ``-ftime-trace-granularity=NN`` dumps only time scopes that are longer than NN microseconds. |
| 41 | +- ``-ftime-trace-verbose`` enables some additional dumps in the frontend related to template instantiations. |
| 42 | +  At the moment, it has no effect on the traces from the static analyzer. |
| 43 | + |
| 44 | +Note: Both Chrome-tracing and speedscope tools might struggle with time traces above 100 MB in size. |
| 45 | +Luckily, in most cases the default max-steps boundary of 225 000 produces traces of approximately that size |
| 46 | +for a single entry point. |
| 47 | +You can use ``-analyze-function=get_global_options`` together with ``-ftime-trace`` to narrow down analysis to a specific entry point. |
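
If an existing trace is still too large for the viewer, one option is to drop the shortest scopes after the fact instead of re-running the analysis with a higher ``-ftime-trace-granularity``. The sketch below is an illustration under the same assumption as before: the file is in Chrome Trace Event format with a top-level ``traceEvents`` array and durations in microseconds. ``drop_short_scopes`` and ``shrink_file`` are hypothetical helper names, not part of Clang.

```python
import json


def drop_short_scopes(trace, min_dur_us):
    """Return a copy of `trace` without complete events shorter than `min_dur_us`.

    Events that are not complete events ("ph" other than "X"), such as
    metadata, are kept unchanged so that viewers can still label the
    process and threads.
    """
    kept = [
        event
        for event in trace.get("traceEvents", [])
        if event.get("ph") != "X" or event.get("dur", 0) >= min_dur_us
    ]
    result = dict(trace)  # shallow copy keeps the other top-level keys
    result["traceEvents"] = kept
    return result


def shrink_file(src_path, dst_path, min_dur_us):
    """Read a trace from `src_path`, filter it, and write it to `dst_path`."""
    with open(src_path) as src:
        trace = json.load(src)
    with open(dst_path, "w") as dst:
        json.dump(drop_short_scopes(trace, min_dur_us), dst)
```

For example, ``shrink_file("trace.json", "trace.small.json", 10)`` would keep only scopes that last at least 10 microseconds. Filtering this way loses the short scopes for good, so keep the original file if you may need to zoom in later.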