content/learning-paths/cross-platform/cpp-loop-size-context/Example.md
3 additions & 3 deletions
@@ -8,7 +8,7 @@ layout: learningpathall
 ## Example

-The following `C++` snippet takes user input as the loop size so that the loop size, `max_loop_size`, is only known at runtime. This initialises an array of size, , `max_loop_size` with the value for each element corresponding to the index position. The function, `foo`, loops of each element to print out the sum of all elements.
+The following `C++` snippet takes user input as the loop size so that the loop size, `max_loop_size`, is only known at runtime. This initialises an array of size, , `max_loop_size` with the value for each element corresponding to the index position. The function, `foo`, loops through each element to print out the sum of all elements.

 Copy the snippet below into a file named, `no-context.cpp`.
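The diff context stops before the snippet itself. As a rough illustration of what the description above implies, here is a minimal sketch; it is not the file from the learning path. The helper `make_index_array` is a hypothetical name, and this `foo` returns the sum rather than printing it (an assumption made so the result is easy to check).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper (not from the PR): element i holds the value i.
std::vector<int> make_index_array(std::size_t max_loop_size) {
    std::vector<int> data(max_loop_size);
    std::iota(data.begin(), data.end(), 0);
    return data;
}

// Sums the first max_loop_size elements. The loop bound is an ordinary
// runtime value, so the compiler knows nothing about its divisibility.
// Assumption: returns the sum instead of printing it.
std::int64_t foo(const std::vector<int>& data, std::size_t max_loop_size) {
    assert(max_loop_size <= data.size());  // stay within the array
    std::int64_t sum = 0;
    for (std::size_t i = 0; i < max_loop_size; ++i) {
        sum += data[i];
    }
    return sum;
}
```

Calling `foo(make_index_array(n), n)` yields `n * (n - 1) / 2`, so an input of 40000 gives 799980000, which agrees with the `Sum:` line shown later in this diff.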
-Running the example with the number 4000 leads to the following results. Naturally you will see variability depending on which platform you run this on.
+Running the example with the number 4000 leads to the following results. You will see runtime variability depending on which platform you run this on.
content/learning-paths/cross-platform/cpp-loop-size-context/Introduction.md
4 additions & 4 deletions
@@ -8,15 +8,15 @@ layout: learningpathall
 ## Introduction

-Often the programmer will have a better understanding of their software and the inputs than the compiler. For example, if the loop size is calculated at runtime, the compiler will have to account for a variable size. However, a developer may have knowledge of the runtime profile, for example if the loop size is always a multiple of a specific number.
+Often, the programmer has deeper insights into their software's behavior and its inputs than the compiler does. For instance, if a loop's size is determined at runtime, the compiler must conservatively handle the possibility of variable sizes, potentially limiting optimization opportunities. However, a developer might know more about the application's runtime characteristics—such as the fact that the loop size always adheres to specific constraints, like being a multiple of a particular number.

-To provide this context to the compiler we will use a simple example written in C++.
+To illustrate how you can explicitly provide this valuable context to the compiler, we'll walk through a simple C++ example.

 ## Setup

-In this learning path I will be using an Arm-based `r7g.large` instance from AWS but any Arm-based machine can be used.
+In this learning path, I will be demonstrating the examples using an Arm-based `r7g.large` instance from AWS; however, you're welcome to follow along using any Arm-based machine that suits your environment or preference.

-Install the `g++` compiler with the following commands. Adjust to the appropriate commands for your operating system.
+To get started, you'll first need to install the `g++` compiler on your system. Use the following commands as a guide, adjusting them accordingly based on the operating system or distribution you're working with.
content/learning-paths/cross-platform/cpp-loop-size-context/providing-inside-knowledge.md
10 additions & 14 deletions
@@ -8,21 +8,23 @@ layout: learningpathall
 ## Adding Inside Knowledge

-To make the compiler aware that the input will be a multiple of 4 we will rewrite our loop size as the following.
+To explicitly inform the compiler that our input will always be a multiple of 4, we can rewrite the loop size calculation as follows:

 ```output
 ((max_loop_size/4)*4)
 ```

-Mathematically this may seem redundant. However since `(max_loop_size/4)` will be truncated to an integer this guarantees `(max_loop_size/4)*4` is a multiple of 4.
+At first glance, this calculation might seem mathematically redundant. However, since the expression `(max_loop_size/4)` is an integer division, it truncates the result, effectively guaranteeing that `(max_loop_size/4)*4` will always yield a number divisible by 4. The compiler can pick up on this information and optimise accordingly.

-As slightly easier to read method that avoids confusion when arguments are passed in is dividing the variable before passing it in. For example.
+As slightly easier to read method that avoids confusion when passing arguments is to divide the variable and rename before it is passed in. For example.

 ```output
 (max_loop_size_div_4 * 4)
 ```

-## Adding Insider Knowledge
+## Improved Example
+
+Copy the snippet below and paste into a file named `context.cpp`.
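The snippet for `context.cpp` falls outside this hunk. The following is a minimal sketch of the two forms described above, under stated assumptions: `round_down_to_multiple_of_4` and `foo_with_context` are hypothetical names, and the function returns the sum instead of printing it. It shows only how the loop bound is expressed; whether a given compiler exploits the information depends on the compiler and flags used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Integer division truncates, so (x / 4) * 4 is always a multiple of 4.
constexpr std::size_t round_down_to_multiple_of_4(std::size_t max_loop_size) {
    return (max_loop_size / 4) * 4;
}

// Hypothetical variant of foo: the caller passes the size already divided
// by 4, and the bound is rebuilt as max_loop_size_div_4 * 4 so that its
// divisibility is visible in the expression itself.
std::int64_t foo_with_context(const std::vector<int>& data,
                              std::size_t max_loop_size_div_4) {
    const std::size_t loop_size = max_loop_size_div_4 * 4;
    assert(loop_size <= data.size());  // stay within the array
    std::int64_t sum = 0;
    for (std::size_t i = 0; i < loop_size; ++i) {
        sum += data[i];
    }
    return sum;
}
```

For example, `round_down_to_multiple_of_4(10)` evaluates to 8, while an input that is already a multiple of 4, such as 40000, is returned unchanged.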
@@ -73,17 +75,11 @@ Enter a value for max_loop_size (must be a multiple of 4): 40000
 Sum: 799980000
 Time taken by foo: 24650 nanoseconds
 ```
+In this particular run, the time taken has significantly reduced compared to our previous example.

 ## Comparison

-To compare we will use compiler explorer to see the assembly.
-
-First, looking at the example without context [here](https://godbolt.org/z/qPaW5Kjxa).
-Second, looking at the example with context [here](https://godbolt.org/z/rhj65Pe4v).
-
-
-[Here](https://godbolt.org/z/nvx4j1vTK).
-
-As the assembly shows we have fewer lines of assembly corresponding to the function `foo` as there is less setup code to account given the insider knowledge.
+To compare we will use compiler explorer to see the assembly [here](https://godbolt.org/z/nvx4j1vTK).

+As the assembly shows we have fewer lines of assembly corresponding to the function `foo` when context is added. This is because the compiler can optimise the conditional checking and any clean up code given the context.