Commit 762f3ae

author
Your Name
committed
final check
1 parent c9b3031 commit 762f3ae

4 files changed

+20
-24
lines changed

content/learning-paths/cross-platform/cpp-loop-size-context/Example.md

Lines changed: 3 additions & 3 deletions
@@ -8,7 +8,7 @@ layout: learningpathall
88

99
## Example
1010

11-
The following `C++` snippet takes user input as the loop size so that the loop size, `max_loop_size`, is only known at runtime. This initialises an array of size, , `max_loop_size` with the value for each element corresponding to the index position. The function, `foo`, loops of each element to print out the sum of all elements.
11+
The following `C++` snippet takes user input as the loop size so that the loop size, `max_loop_size`, is only known at runtime. This initialises an array of size `max_loop_size`, with the value of each element corresponding to its index position. The function, `foo`, loops through each element to print out the sum of all elements.
1212

1313
Copy the snippet below into a file named `no_context.cpp`.
1414

@@ -51,10 +51,10 @@ int main() {
5151
Compile using the following command.
5252
5353
```bash
54-
g++ -O3 -march=armv8-a+simd -o no_context
54+
g++ -O3 -march=armv8-a+simd no_context.cpp -o no_context
5555
```
5656

57-
Running the example with the number 4000 leads to the following results. Naturally you will see variability depending on which platform you run this on.
57+
Running the example with the number 4000 leads to the following results. You will see runtime variability depending on which platform you run this on.
5858

5959
```output
6060
./no_context

content/learning-paths/cross-platform/cpp-loop-size-context/Introduction.md

Lines changed: 4 additions & 4 deletions
@@ -8,15 +8,15 @@ layout: learningpathall
88

99
## Introduction
1010

11-
Often the programmer will have a better understanding of their software and the inputs than the compiler. For example, if the loop size is calculated at runtime, the compiler will have to account for a variable size. However, a developer may have knowledge of the runtime profile, for example if the loop size is always a multiple of a specific number.
11+
Often, the programmer has deeper insights into their software's behavior and its inputs than the compiler does. For instance, if a loop's size is determined at runtime, the compiler must conservatively handle the possibility of variable sizes, potentially limiting optimization opportunities. However, a developer might know more about the application's runtime characteristics—such as the fact that the loop size always adheres to specific constraints, like being a multiple of a particular number.
1212

13-
To provide this context to the compiler we will use a simple example written in C++.
13+
To illustrate how you can explicitly provide this valuable context to the compiler, we'll walk through a simple C++ example.
1414

1515
## Setup
1616

17-
In this learning path I will be using an Arm-based `r7g.large` instance from AWS but any Arm-based machine can be used.
17+
In this learning path, I will be demonstrating the examples using an Arm-based `r7g.large` instance from AWS; however, you're welcome to follow along using any Arm-based machine that suits your environment or preference.
1818

19-
Install the `g++` compiler with the following commands. Adjust to the appropriate commands for your operating system.
19+
To get started, you'll first need to install the `g++` compiler on your system. Use the following commands as a guide, adjusting them accordingly based on the operating system or distribution you're working with.
2020

2121
```bash
2222
sudo apt update

content/learning-paths/cross-platform/cpp-loop-size-context/_index.md

Lines changed: 3 additions & 3 deletions
@@ -1,9 +1,9 @@
11
---
2-
title: Learn to improve for loop run time with loop size context
2+
title: Learn to Optimize C++ Loops with Size Context
33

44
minutes_to_complete: 15
55

6-
who_is_this_for: C++ developers
6+
who_is_this_for: C++ developers who want to improve the runtime of for loops with basic insider knowledge of the loop size
77

88
learning_objectives:
99
- Learn how to add preexisting knowledge of loop sizes to for loops
@@ -16,7 +16,7 @@ author: Kieran Hejmadi
1616

1717
### Tags
1818
skilllevels: Introductory
19-
subjects: C++
19+
subjects: ML
2020
armips:
2121
- Neoverse
2222
tools_software_languages:

content/learning-paths/cross-platform/cpp-loop-size-context/providing-inside-knowledge.md

Lines changed: 10 additions & 14 deletions
@@ -8,21 +8,23 @@ layout: learningpathall
88

99
## Adding Inside Knowledge
1010

11-
To make the compiler aware that the input will be a multiple of 4 we will rewrite our loop size as the following.
11+
To explicitly inform the compiler that our input will always be a multiple of 4, we can rewrite the loop size calculation as follows:
1212

1313
```output
1414
((max_loop_size/4)*4)
1515
```
1616

17-
Mathematically this may seem redundant. However since `(max_loop_size/4)` will be truncated to an integer this guarantees `(max_loop_size/4)*4` is a multiple of 4.
17+
At first glance, this calculation might seem mathematically redundant. However, since the expression `(max_loop_size/4)` is an integer division, it truncates the result, effectively guaranteeing that `(max_loop_size/4)*4` will always yield a number divisible by 4. The compiler can pick up on this information and optimise accordingly.
1818

19-
As slightly easier to read method that avoids confusion when arguments are passed in is dividing the variable before passing it in. For example.
19+
A slightly easier-to-read method that avoids confusion when passing arguments is to divide the variable and rename it before it is passed in. For example:
2020

2121
```output
2222
(max_loop_size_div_4 * 4)
2323
```
2424

25-
## Adding Insider Knowledge
25+
## Improved Example
26+
27+
Copy the snippet below and paste into a file named `context.cpp`.
2628

2729
```cpp
2830
#include <iostream>
@@ -64,7 +66,7 @@ int main() {
6466
Again compile with the same compiler flags.
6567
6668
```bash
67-
g++ -O3 -march=armv8-a+simd -o context
69+
g++ -O3 -march=armv8-a+simd context.cpp -o context
6870
```
6971

7072
```output
@@ -73,17 +75,11 @@ Enter a value for max_loop_size (must be a multiple of 4): 40000
7375
Sum: 799980000
7476
Time taken by foo: 24650 nanoseconds
7577
```
78+
In this particular run, the time taken has significantly reduced compared to our previous example.
7679

7780
## Comparison
7881

79-
To compare we will use compiler explorer to see the assembly.
80-
81-
First, looking at the example without context [here](https://godbolt.org/z/qPaW5Kjxa).
82-
Second, looking at the example with context [here](https://godbolt.org/z/rhj65Pe4v).
83-
84-
85-
[Here](https://godbolt.org/z/nvx4j1vTK).
86-
87-
As the assembly shows we have fewer lines of assembly corresponding to the function `foo` as there is less setup code to account given the insider knowledge.
82+
To compare we will use compiler explorer to see the assembly [here](https://godbolt.org/z/nvx4j1vTK).
8883

84+
As the assembly shows, there are fewer lines of assembly corresponding to the function `foo` when context is added. This is because, given the context, the compiler can optimise away the conditional checking and any clean-up code.
8985
