Commit bc3cf59

Merge branch 'main' into ita9naiwa/nd-transpose-to-shape-cast
2 parents 2de6639 + df1bee0 commit bc3cf59

551 files changed: +23614 -6672 lines changed

bolt/test/AArch64/exceptions-plt.cpp

Lines changed: 3 additions & 1 deletion
@@ -2,7 +2,9 @@
 
 // REQUIRES: system-linux
 
-// RUN: %clangxx %cxxflags -O1 -Wl,-q,-znow %s -o %t.exe
+// RUN: %clang %cflags -fpic -shared -xc /dev/null -o %t.so
+// Link against a DSO to ensure PLT entries.
+// RUN: %clangxx %cxxflags -O1 -Wl,-q,-znow %s %t.so -o %t.exe
 // RUN: llvm-bolt %t.exe -o %t.bolt.exe --plt=all --print-only=.*main.* \
 // RUN: --print-finalized 2>&1 | FileCheck %s
 

bolt/test/AArch64/plt-call.test

Lines changed: 3 additions & 1 deletion
@@ -1,6 +1,8 @@
 // Verify that PLTCall optimization works.
 
-RUN: %clang %cflags %p/../Inputs/plt-tailcall.c \
+RUN: %clang %cflags -fpic -shared -xc /dev/null -o %t.so
+// Link against a DSO to ensure PLT entries.
+RUN: %clang %cflags %p/../Inputs/plt-tailcall.c %t.so \
 RUN: -o %t -Wl,-q
 RUN: llvm-bolt %t -o %t.bolt --plt=all --print-plt --print-only=foo | FileCheck %s
 

bolt/test/X86/callcont-fallthru.s

Lines changed: 3 additions & 1 deletion
@@ -1,7 +1,9 @@
 ## Ensures that a call continuation fallthrough count is set when using
 ## pre-aggregated perf data.
 
-# RUN: %clangxx %cxxflags %s -o %t -Wl,-q -nostdlib
+# RUN: %clang %cflags -fpic -shared -xc /dev/null -o %t.so
+## Link against a DSO to ensure PLT entries.
+# RUN: %clangxx %cxxflags %s %t.so -o %t -Wl,-q -nostdlib
 # RUN: link_fdata %s %t %t.pa1 PREAGG
 # RUN: link_fdata %s %t %t.pa2 PREAGG2
 # RUN: link_fdata %s %t %t.pa3 PREAGG3

bolt/test/X86/cfi-instrs-reordered.s

Lines changed: 3 additions & 1 deletion
@@ -3,7 +3,9 @@
 
 # RUN: llvm-mc -filetype=obj -triple x86_64-unknown-unknown %s -o %t.o
 # RUN: llvm-strip --strip-unneeded %t.o
-# RUN: %clangxx %cflags %t.o -o %t.exe
+# RUN: %clang %cflags -fpic -shared -xc /dev/null -o %t.so
+## Link against a DSO to ensure PLT entries.
+# RUN: %clangxx %cflags %t.o %t.so -o %t.exe
 # RUN: llvm-bolt %t.exe -o %t --reorder-blocks=cache --print-after-lowering \
 # RUN: --print-only=_Z10SolveCubicddddPiPd 2>&1 | FileCheck %s
 #

bolt/test/X86/plt-call.test

Lines changed: 3 additions & 1 deletion
@@ -1,6 +1,8 @@
 // Verify that PLTCall optimization works.
 
-RUN: %clang %cflags %p/../Inputs/plt-tailcall.c \
+RUN: %clang %cflags -fpic -shared -xc /dev/null -o %t.so
+// Link against a DSO to ensure PLT entries.
+RUN: %clang %cflags %p/../Inputs/plt-tailcall.c %t.so \
 RUN: -o %t -Wl,-q
 RUN: llvm-bolt %t -o %t.bolt --plt=all --print-plt --print-only=foo | FileCheck %s
 

bolt/test/runtime/exceptions-plt.cpp

Lines changed: 3 additions & 1 deletion
@@ -2,7 +2,9 @@
 
 // REQUIRES: system-linux
 
-// RUN: %clangxx %cxxflags -O1 -Wl,-q,-znow %s -o %t.exe
+// RUN: %clang %cflags -fpic -shared -xc /dev/null -o %t.so
+// Link against a DSO to ensure PLT entries.
+// RUN: %clangxx %cxxflags -O1 -Wl,-q,-znow %s %t.so -o %t.exe
 // RUN: llvm-bolt %t.exe -o %t.bolt.exe --plt=all
 // RUN: %t.bolt.exe
 

bolt/test/runtime/plt-lld.test

Lines changed: 4 additions & 3 deletions
@@ -1,14 +1,15 @@
 // This test checks that the pointers to PLT are properly updated.
-// The test is using lld linker.
+// The test uses lld and links against a DSO to ensure PLT entries.
+RUN: %clang %cflags -fpic -shared -xc /dev/null -o %t.so
 
 // Non-PIE:
-RUN: %clang %cflags -no-pie %p/../Inputs/plt.c -fuse-ld=lld \
+RUN: %clang %cflags -no-pie %p/../Inputs/plt.c %t.so -fuse-ld=lld \
 RUN: -o %t.lld.exe -Wl,-q
 RUN: llvm-bolt %t.lld.exe -o %t.lld.bolt.exe --use-old-text=0 --lite=0
 RUN: %t.lld.bolt.exe | FileCheck %s
 
 // PIE:
-RUN: %clang %cflags -fPIC -pie %p/../Inputs/plt.c -fuse-ld=lld \
+RUN: %clang %cflags -fPIC -pie %p/../Inputs/plt.c %t.so -fuse-ld=lld \
 RUN: -o %t.lld.pie.exe -Wl,-q
 RUN: llvm-bolt %t.lld.pie.exe -o %t.lld.bolt.pie.exe --use-old-text=0 --lite=0
 RUN: %t.lld.bolt.pie.exe | FileCheck %s

clang/docs/ReleaseNotes.rst

Lines changed: 11 additions & 0 deletions
@@ -112,6 +112,8 @@ Removed Compiler Flags
 Attribute Changes in Clang
 --------------------------
 
+- The ``no_sanitize`` attribute now accepts both ``gnu`` and ``clang`` names.
+
 Improvements to Clang's diagnostics
 -----------------------------------
 
@@ -143,12 +145,16 @@ Bug Fixes to Attribute Support
 Bug Fixes to C++ Support
 ^^^^^^^^^^^^^^^^^^^^^^^^
 
+- Clang is now better at keeping track of friend function template instance contexts. (#GH55509)
+
 Bug Fixes to AST Handling
 ^^^^^^^^^^^^^^^^^^^^^^^^^
 
 Miscellaneous Bug Fixes
 ^^^^^^^^^^^^^^^^^^^^^^^
 
+- HTML tags in comments that span multiple lines are now parsed correctly by Clang's comment parser. (#GH120843)
+
 Miscellaneous Clang Crashes Fixed
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
@@ -237,6 +243,11 @@ Static Analyzer
 New features
 ^^^^^^^^^^^^
 
+A new flag - `-static-libclosure` was introduced to support statically linking
+the runtime for the Blocks extension on Windows. This flag currently only
+changes the code generation, and even then, only on Windows. This does not
+impact the linker behaviour like the other `-static-*` flags.
+
 Crash and bug fixes
 ^^^^^^^^^^^^^^^^^^^
 

clang/docs/analyzer/developer-docs.rst

Lines changed: 1 addition & 0 deletions
@@ -11,3 +11,4 @@ Contents:
 developer-docs/InitializerLists
 developer-docs/nullability
 developer-docs/RegionStore
+developer-docs/PerformanceInvestigation

Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
+=========================
+Performance Investigation
+=========================
+
+Multiple factors contribute to the time it takes to analyze a file with Clang Static Analyzer.
+A translation unit contains multiple entry points, each of which take multiple steps to analyze.
+
+You can add the ``-ftime-trace=file.json`` option to break down the analysis time into individual entry points and steps within each entry point.
+You can explore the generated JSON file in a Chromium browser using the ``chrome://tracing`` URL,
+or using `speedscope <https://speedscope.app>`_.
+Once you narrow down to specific analysis steps you are interested in, you can more effectively employ heavier profilers,
+such as `Perf <https://perfwiki.github.io/main/>`_ and `Callgrind <https://valgrind.org/docs/manual/cl-manual.html>`_.
+
+Each analysis step has a time scope in the trace, corresponds to processing of an exploded node, and is designated with a ``ProgramPoint``.
+If the ``ProgramPoint`` is associated with a location, you can see it on the scope metadata label.
+
+Here is an example of a time trace produced with
+
+.. code-block:: bash
+   :caption: Clang Static Analyzer invocation to generate a time trace of string.c analysis.
+
+   clang -cc1 -nostdsysteminc -analyze -analyzer-constraints=range \
+         -setup-static-analyzer -analyzer-checker=core,unix,alpha.unix.cstring,debug.ExprInspection \
+         -verify ./clang/test/Analysis/string.c \
+         -ftime-trace=trace.json -ftime-trace-granularity=1
+
+.. image:: ../images/speedscope.png
+
+On the speedscope screenshot above, under the first time ruler is the bird's-eye view of the entire trace that spans a little over 60 milliseconds.
+Under the second ruler (focused on the 18.09-18.13ms time point) you can see a narrowed-down portion.
+The second box ("HandleCode memset...") that spans entire screen (and actually extends beyond it) corresponds to the analysis of ``memset16_region_cast()`` entry point that is defined in the "string.c" test file on line 1627.
+Below it, you can find multiple sub-scopes each corresponding to processing of a single exploded node.
+
+- First: a ``PostStmt`` for some statement on line 1634. This scope has a selected subscope "CheckerManager::runCheckersForCallEvent (Pre)" that takes 5 microseconds.
+- Four other nodes, too small to be discernible at this zoom level
+- Last on this screenshot: another ``PostStmt`` for a statement on line 1635.
+
+In addition to the ``-ftime-trace`` option, you can use ``-ftime-trace-granularity`` to fine-tune the time trace.
+
+- ``-ftime-trace-granularity=NN`` dumps only time scopes that are longer than NN microseconds.
+- ``-ftime-trace-verbose`` enables some additional dumps in the frontend related to template instantiations.
+  At the moment, it has no effect on the traces from the static analyzer.
+
+Note: Both Chrome-tracing and speedscope tools might struggle with time traces above 100 MB in size.
+Luckily, in most cases the default max-steps boundary of 225 000 produces the traces of approximately that size
+for a single entry point.
+You can use ``-analyze-function=get_global_options`` together with ``-ftime-trace`` to narrow down analysis to a specific entry point.
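
When a trace is too large to open comfortably in `chrome://tracing` or speedscope, a small script can give a first overview before narrowing down. The sketch below is not part of the commit. It assumes the file written by `-ftime-trace` follows the Chrome Trace Event layout: a top-level `traceEvents` array whose complete events carry `"ph": "X"`, a `name`, and a `dur` in microseconds. The helper names `summarize_trace` and `print_summary` are hypothetical.

```python
import json
from collections import defaultdict


def summarize_trace(trace, top=10):
    """Return up to `top` tuples (name, total_us, count), largest total first.

    `trace` is the already-parsed JSON object. Assumption: events live in
    trace["traceEvents"] and complete events use "ph": "X" with "dur" in
    microseconds (Chrome Trace Event format).
    """
    totals = defaultdict(lambda: [0, 0])  # name -> [total_us, count]
    for event in trace.get("traceEvents", []):
        # Skip metadata and other non-duration events.
        if event.get("ph") != "X":
            continue
        entry = totals[event.get("name", "<unnamed>")]
        entry[0] += event.get("dur", 0)
        entry[1] += 1
    ranked = sorted(totals.items(), key=lambda kv: kv[1][0], reverse=True)
    return [(name, total, count) for name, (total, count) in ranked[:top]]


def print_summary(path, top=10):
    """Load a trace file from `path` and print the heaviest scope names."""
    with open(path, encoding="utf-8") as f:
        trace = json.load(f)
    for name, total_us, count in summarize_trace(trace, top):
        print(f"{total_us:>12} us  {count:>8}x  {name}")
```

Calling `print_summary("trace.json")` on the file produced by the invocation above would list the scope names with the largest accumulated time. Scopes nest, so a parent's total includes the time of its children. Treat the ranking as a pointer to where to zoom in, not as an exclusive-time profile.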
