|
| 1 | +--- |
| 2 | +title: Basics of Compilers |
| 3 | +weight: 2 |
| 4 | + |
| 5 | +### FIXED, DO NOT MODIFY |
| 6 | +layout: learningpathall |
| 7 | +--- |
| 8 | + |
| 9 | +## Introduction to C++ and Compilers |
| 10 | + |
| 11 | +The C++ language gives programmers the freedom to be expressive in the way they write code, allowing low-level manipulation of memory and data structures. Unlike managed languages such as Java, which run on a virtual machine, C++ is compiled ahead of time to native machine code, so the source must be recompiled for the target Arm architecture. When optimizing C++ workloads on Arm, significant performance improvements can often be achieved without modifying the source code, simply by using the compiler effectively. |
| 12 | + |
| 13 | +Writing performant C++ code is a topic in itself and is out of scope for this learning path. Instead, the focus is on how to use the compiler effectively to target Arm instances in a cloud environment. |
| 14 | + |
| 15 | +## Purpose of a Compiler |
| 16 | + |
| 17 | +The g++ compiler is part of the GNU Compiler Collection (GCC), which is a set of compilers for various programming languages, including C++. The primary objective of the g++ compiler is to translate C++ source code into machine code that can be executed by a computer. This process involves several high-level stages: |
| 18 | + |
| 19 | +- Preprocessing: In this initial stage, the preprocessor handles directives that start with a # symbol, such as `#include`, `#define`, and `#if`. It expands included header files, replaces macros, and processes conditional compilation statements. |
| 20 | + |
| 21 | +- Compilation: The compiler parses the preprocessed source code, performs syntax checking and semantic analysis, and reports errors for any issues it finds. It then optimizes the code through internal intermediate representations and emits assembly language for the target processor architecture. |
| 22 | + |
| 23 | +- Assembly: The assembly language, which uses mnemonics and syntax specific to the target processor architecture, is converted by the assembler into object code (machine code). |
| 24 | + |
| 25 | +- Linking: The final stage involves linking the object code with necessary libraries and other object files. The linker merges multiple object files and libraries, resolves external references, allocates memory addresses for functions and variables, and generates an executable file that can be run on the target platform. |
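The stages above can be observed on a small source file. The sketch below is illustrative only: the file names `stages.cpp`, `main.o`, and `app`, the function `greeting`, and the macros `GREETING` and `TARGET_NAME` are hypothetical. The options `-E`, `-S`, and `-c` are standard g++ options that stop the pipeline after preprocessing, compilation, and assembly respectively.

```cpp
// stages.cpp: a hypothetical file name used only for this illustration.
#include <string>

// Preprocessing: this macro is replaced textually before compilation starts.
#define GREETING "Hello from "

// Preprocessing: only one branch of this conditional reaches the compiler.
#if defined(__aarch64__)
#define TARGET_NAME "AArch64"
#else
#define TARGET_NAME "a non-AArch64 target"
#endif

// Compilation and assembly turn this function into target-specific object code.
std::string greeting() {
    // Adjacent string literals are joined into one literal during compilation.
    return GREETING TARGET_NAME;
}

// Each stage can be stopped early with standard g++ options:
//   g++ -E stages.cpp -o stages.ii   # preprocess only
//   g++ -S stages.cpp -o stages.s    # stop after compilation (assembly output)
//   g++ -c stages.cpp -o stages.o    # stop after assembly (object file)
//   g++ stages.o main.o -o app       # link; main.o is a hypothetical object file that provides main()
```

Inspecting `stages.ii` shows the expanded header and the single surviving `TARGET_NAME` branch, while `stages.s` contains the assembly mnemonics for the target architecture.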
| 26 | + |
| 27 | +The g++ compiler can optimize for both execution speed and code size. It applies optimizations based on what it can determine about the program, and it can be configured to prioritize a smaller executable, for example with `-Os`. |
| 28 | + |
| 29 | + |
| 30 | +### Compiler Versioning |
| 31 | + |
| 32 | +Two popular C++ compilers are the GNU Compiler Collection (GCC) and LLVM. Both are open source and receive contributions from Arm engineers to support the latest architectures. Proprietary or vendor-specific compilers, such as `nvcc` for NVIDIA GPUs, are often based on these open-source compilers. Other proprietary compilers are designed for specific use cases. For example, safety-critical applications may need to comply with ISO standards that also place requirements on the compiler. The functional safety [Arm Compiler for Embedded](https://developer.arm.com/Tools%20and%20Software/Arm%20Compiler%20for%20Embedded%20FuSa) is one such C/C++ compiler. |
| 33 | + |
| 34 | +Most application developers do not work in this safety-qualification domain, so this learning path uses the open-source GCC compiler and its C++ front end, g++. |
| 35 | + |
| 36 | +There are multiple Linux distributions to choose from, and each distribution has its own default compiler version. For example, after installing g++ from the default package repository on an `r8g` AWS instance running Ubuntu 24.04, the default version as of January 2025 is shown below. |
| 37 | + |
| 38 | +```output |
| 39 | +g++ --version |
| 40 | +g++ (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 |
| 41 | +Copyright (C) 2023 Free Software Foundation, Inc. |
| 42 | +This is free software; see the source for copying conditions. There is NO |
| 43 | +warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. |
| 44 | +``` |
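The compiler version can also be checked from inside the program, which helps confirm which compiler actually produced a binary. The sketch below is a minimal illustration; the function names `compiler_version` and `built_for_aarch64` are hypothetical, while the macros it reads are predefined by GCC and Clang.

```cpp
#include <string>

// Returns a short description of the compiler that built this translation unit.
std::string compiler_version() {
#if defined(__clang__)
    // Clang also defines __GNUC__, so it is checked first.
    return "clang " + std::to_string(__clang_major__) + "." +
           std::to_string(__clang_minor__) + "." +
           std::to_string(__clang_patchlevel__);
#elif defined(__GNUC__)
    return "gcc " + std::to_string(__GNUC__) + "." +
           std::to_string(__GNUC_MINOR__) + "." +
           std::to_string(__GNUC_PATCHLEVEL__);
#else
    return "unknown compiler";
#endif
}

// True when the code was compiled for 64-bit Arm (AArch64).
constexpr bool built_for_aarch64() {
#if defined(__aarch64__)
    return true;
#else
    return false;
#endif
}
```

With the g++ 13.3.0 installation shown above, `compiler_version()` would be expected to return `gcc 13.3.0`.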
| 45 | + |
| 46 | +The table below shows the default (*) and available compilers for a variety of Linux distributions. It is taken from the [AWS Graviton performance runbook](https://github.com/aws/aws-graviton-getting-started/blob/main/c-c%2B%2B.md). |
| 47 | + |
| 48 | + |
| 49 | +Distribution | GCC | Clang/LLVM |
| 50 | +----------------|----------------------|------------- |
| 51 | +Amazon Linux 2023 | 11* | 15* |
| 52 | +Amazon Linux 2 | 7*, 10 | 7, 11* |
| 53 | +Ubuntu 24.04 | 9, 10, 11, 12, 13*, 14 | 14, 15, 16, 17, 18* |
| 54 | +Ubuntu 22.04 | 9, 10, 11*, 12 | 11, 12, 13, 14* |
| 55 | +Ubuntu 20.04 | 7, 8, 9*, 10 | 6, 7, 8, 9, 10, 11, 12 |
| 56 | +Ubuntu 18.04 | 4.8, 5, 6, 7*, 8 | 3.9, 4, 5, 6, 7, 8, 9, 10 |
| 57 | +Debian 10 | 7, 8* | 6, 7, 8 |
| 58 | +Red Hat EL8 | 8*, 9, 10 | 10 |
| 59 | +SUSE Linux ES15 | 7*, 9, 10 | 7 |
| 60 | + |
| 61 | + |
| 62 | +One of the simplest ways to gain performance is to use the most recent compiler available, because newer releases include the latest optimisations and support for newer Arm architectures and CPUs. |
| 63 | + |
| 64 | +For example, the [release notes](https://gcc.gnu.org/gcc-14/changes.html) for GCC 14, the most recent release series at the time of writing (version 14.2), list the following additions. |
| 65 | + |
| 66 | +```output |
| 67 | +A number of new CPUs are supported through the -mcpu and -mtune options (GCC identifiers in parentheses). |
| 68 | + - Ampere-1B (ampere1b). |
| 69 | + - Arm Cortex-A520 (cortex-a520). |
| 70 | + - Arm Cortex-A720 (cortex-a720). |
| 71 | + - Arm Cortex-X4 (cortex-x4). |
| 72 | + - Microsoft Cobalt-100 (cobalt-100). |
| 73 | +... |
| 74 | +``` |
| 75 | + |
| 76 | +Take care when updating your C++ compiler, because the process may reveal bugs in your source code. These bugs are often cases of undefined behaviour, where the code does not adhere to the C++ standard and only happened to work with the older compiler. It is rare for the compiler itself to introduce a bug. When this does happen, known bugs are listed publicly in the compiler documentation. |
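As an illustration of how undefined behaviour can surface after a compiler update, consider signed integer overflow. The sketch below is an assumed example, not taken from any particular code base, and the function names `next_overflows_ub` and `next_overflows` are hypothetical.

```cpp
#include <limits>

// Undefined behaviour: if x is the largest int, x + 1 overflows. Because the
// standard lets the compiler assume signed overflow never happens, an
// optimising compiler may simplify this whole function to "return false".
bool next_overflows_ub(int x) {
    return x + 1 < x;
}

// Well-defined alternative: compare against the limit without overflowing.
bool next_overflows(int x) {
    return x == std::numeric_limits<int>::max();
}
```

The first function may appear to work with one compiler version or optimisation level and then stop working with another, even though the source has not changed. The second function behaves the same regardless of compiler version.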
| 77 | + |
| 78 | +## Basic g++ Optimisation Levels |
| 79 | + |
| 80 | +Using the g++ compiler as an example, the most coarse-grained dial you can adjust is the optimisation level, denoted with `-O<x>`. Each level enables a set of lower-level optimisation flags at the expense of longer compilation time, higher memory use during compilation, and reduced debuggability. When aggressive optimisation is used, the optimised binary may not behave as expected when attached to a debugger such as `gdb`. This is because the generated code may no longer match the original source code or program order, for example as a result of loop unrolling and vectorisation. |
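One way to confirm how a translation unit was built is to read the macros that GCC predefines according to the optimisation mode. The sketch below is a minimal illustration; the function names `optimisation_mode` and `fast_math_enabled` are hypothetical, while `__OPTIMIZE__`, `__OPTIMIZE_SIZE__`, and `__FAST_MATH__` are macros documented by GCC.

```cpp
#include <string>

// Describes the optimisation mode using macros predefined by GCC.
std::string optimisation_mode() {
#if defined(__OPTIMIZE_SIZE__)
    // Defined when optimising for size, for example with -Os.
    return "optimising for size";
#elif defined(__OPTIMIZE__)
    // Defined in all optimising compilations.
    return "optimising";
#else
    // Neither macro is defined, for example with -O0.
    return "not optimising";
#endif
}

// True when -ffast-math is in effect, which -Ofast turns on.
constexpr bool fast_math_enabled() {
#if defined(__FAST_MATH__)
    return true;
#else
    return false;
#endif
}
```

These macros do not distinguish between individual levels such as `-O2` and `-O3`, so they are best treated as a quick sanity check rather than a full record of the build flags.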
| 81 | + |
| 82 | +A few of the most common optimization levels are in the table below. |
| 83 | + |
| 84 | +| Optimization Level | Description | |
| 85 | +|--------------------|----------------------------------------------------------------------------------------------| |
| 86 | +| `-O0` | No optimization; useful for debugging. | |
| 87 | +| `-O1` | Basic optimizations that improve performance without significantly increasing compilation time. | |
| 88 | +| `-O2`              | Nearly all supported optimizations that do not involve a trade-off between code size and speed. | |
| 89 | +| `-O3`              | Adds further optimizations on top of `-O2`, such as more aggressive loop transformations, that can improve execution speed. | |
| 90 | +| `-Os` | Optimizes code size, reducing the overall binary size. | |
| 91 | +| `-Ofast`           | Enables all `-O3` optimizations plus others, such as `-ffast-math`, that disregard strict standards compliance. | |
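To see why `-Ofast` can change results, recall that floating-point addition is not associative. Under strict compliance the compiler must keep the order written in the source, whereas `-ffast-math` (enabled by `-Ofast`) allows it to reorder such expressions. The sketch below is an illustrative example with hypothetical function names `sum_left` and `sum_right`.

```cpp
// Adds three values left to right: (a + b) + c.
double sum_left(double a, double b, double c) {
    return (a + b) + c;
}

// Adds the same values right to left: a + (b + c).
double sum_right(double a, double b, double c) {
    return a + (b + c);
}
```

When compiled without fast-math options, calling both functions with `a = 1e20`, `b = -1e20`, and `c = 1.0` gives `1.0` for `sum_left` and `0.0` for `sum_right`, because `1.0` is too small to change `-1e20` in double precision. If the compiler is free to reassociate, either result could be produced for either expression.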
| 92 | + |
| 93 | +Refer to your compiler documentation for full details on each optimisation level, for example the [GCC documentation](https://gcc.gnu.org/onlinedocs/gcc-14.2.0/gcc/Optimize-Options.html). |