content/learning-paths/smartphones-and-mobile/function-multiversioning/examples.md
24 additions & 24 deletions
@@ -10,12 +10,12 @@ layout: learningpathall

 #### Code Generation example

-In this example we have specified two versions of `foo` using the `target_clones` attribute (the order in which they are listed does not matter). At certain optimization levels compilers can decide to perform loop vectorization depending on the target's vector capabilities. Our intention is to enable the compiler to use SVE instructions in the specialized case, whilst restricting it to use only Armv8 instructions in the default one.
+In this example we have specified two versions of `sumPosEltsScaledByIndex` using the `target_clones` attribute (the order in which they are listed does not matter). At certain optimization levels compilers can decide to perform loop vectorization depending on the target's vector capabilities. Our intention is to enable the compiler to use SVE instructions in the specialized case, whilst restricting it to use only Armv8 instructions in the default one.
 Note that when using the `clang` compiler, the option `--rtlib=compiler-rt` should be specified on the command line. This allows the compiler to generate runtime checks for detecting the presence of hardware features on your host target.

-Here is the generated compiler output for the SVE version of `foo` (using `clang`):
+Here is the generated compiler output for the SVE version of `sumPosEltsScaledByIndex` (using `clang`):
 ```
 .text
-.globl foo._Msve
+.globl sumPosEltsScaledByIndex._Msve
 .p2align 2
-.type foo._Msve,@function
-foo._Msve:
+.type sumPosEltsScaledByIndex._Msve,@function
+sumPosEltsScaledByIndex._Msve:
 cbz w1, .LBB0_3
 mov w9, w1
 cnth x8
@@ -96,7 +96,7 @@ foo._Msve:
 ret
 ```

-This is the default version of `foo`:
+This is the default version of `sumPosEltsScaledByIndex`:
 ```
 .section .rodata.cst16,"aM",@progbits,16
 .p2align 4, 0x0
@@ -106,10 +106,10 @@ This is the default version of `foo`:
 .word 2
 .word 3
 .text
-.globl foo.default
+.globl sumPosEltsScaledByIndex.default
 .p2align 2
-.type foo.default,@function
-foo.default:
+.type sumPosEltsScaledByIndex.default,@function
+sumPosEltsScaledByIndex.default:
 cbz w1, .LBB2_3
 cmp w1, #8
 mov w9, w1
@@ -164,35 +164,35 @@ foo.default:
 ret
 ```

-Any calls to `foo` are routed through `foo.resolver`. This is the function which contains the runtime checks for feature detection. More on this later.
+Any calls to `sumPosEltsScaledByIndex` are routed through `sumPosEltsScaledByIndex.resolver`. This is the function which contains the runtime checks for feature detection. More on this later.
-The names `foo._Msve` and `foo.default` correspond to the function versions of `foo`. See the [Arm C Language Extensions](https://arm-software.github.io/acle/main/acle.html#name-mangling) document for more details on the name mangling rules.
+The names `sumPosEltsScaledByIndex._Msve` and `sumPosEltsScaledByIndex.default` correspond to the function versions of `sumPosEltsScaledByIndex`. See the [Arm C Language Extensions](https://arm-software.github.io/acle/main/acle.html#name-mangling) document for more details on the name mangling rules.
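
For readers following the rename, here is a minimal sketch of what a `target_clones` definition of `sumPosEltsScaledByIndex` might look like. The function body is not visible in this diff, so the loop is an assumption inferred from the function name, and `FMV_CLONES` is a hypothetical helper macro. The guard relies on `__HAVE_FUNCTION_MULTI_VERSIONING`, which ACLE describes as defined when function multiversioning is supported and enabled, so the file still compiles as plain C elsewhere.

```c
/* FMV_CLONES is a hypothetical helper macro, not part of the changed files.
 * It applies target_clones only when the compiler reports (via the ACLE
 * macro) that function multiversioning is available on an AArch64 target. */
#if defined(__aarch64__) && defined(__HAVE_FUNCTION_MULTI_VERSIONING)
#define FMV_CLONES __attribute__((target_clones("sve", "default")))
#else
#define FMV_CLONES
#endif

/* Assumed body: sum the positive elements of v, each scaled by its index.
 * With multiversioning enabled the compiler emits an SVE version and a
 * default version, plus a resolver that picks one at run time. */
FMV_CLONES
int sumPosEltsScaledByIndex(const int *v, unsigned n) {
    int sum = 0;
    for (unsigned i = 0; i < n; ++i) {
        if (v[i] > 0)
            sum += (int)i * v[i];
    }
    return sum;
}
```

As the note in the diff says, `clang` builds should pass `--rtlib=compiler-rt` so that the runtime feature checks can be generated. A simple loop like this one is the kind of code the vectorizer may treat differently in the SVE and default versions at higher optimization levels.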
content/learning-paths/smartphones-and-mobile/function-multiversioning/implementation-details.md
1 addition & 1 deletion
@@ -18,7 +18,7 @@ In order to select the most appropriate version of a function, each call to a ve

 #### Resolver emission with LLVM

-When using the LLVM compiler the resolver is emitted in the translation unit which contains the definition of the default version. To correctly generate a resolver the compiler must be aware of all the versions of a function. Therefore, the user must declare every function version in the TU where the default version resides. For example:
+When using the LLVM compiler, the resolver is emitted in the translation unit which contains the definition of the default version. To correctly generate a resolver the compiler must be aware of all the versions of a function. Therefore, the user must declare every function version in the TU where the default version resides. For example: