
Commit d19de47

author
Your Name
committed
final tidy before PR
1 parent de3624c commit d19de47

4 files changed

+11
-11
lines changed
  • content/learning-paths/servers-and-cloud-computing/using-and-porting-performance-libs


content/learning-paths/servers-and-cloud-computing/using-and-porting-performance-libs/1.md

Lines changed: 5 additions & 6 deletions
Original file line number | Diff line number | Diff line change
@@ -8,11 +8,11 @@ layout: learningpathall
88

99
## Introduction to Performance Libraries
1010

11-
The C++ Standard Library provides a collection of classes and functions that are essential for everyday programming tasks, such as data structures, algorithms, and input/output operations. It is designed to be versatile and easy to use, ensuring compatibility and portability across different platforms. However as a result of this portability, standard libraries introduces some limitations. Performance sensitive applications may wish to take maximum advantage of the hardware's capabilities. This is where performance libraries come in.
11+
The C++ Standard Library provides a collection of classes and functions that are essential for everyday programming tasks, such as data structures, algorithms, and input/output operations. It is designed to be versatile and easy to use, ensuring compatibility and portability across different platforms. However, as a result of this portability, standard libraries introduce some limitations. Performance-sensitive applications may wish to take maximum advantage of the hardware's capabilities, and this is where performance libraries come in.
1212

13-
Performance libraries like OpenRNG are specialized for high-performance computing tasks and are often tailored to the microarchitecture of a specific processor. These libraries are optimized for speed and efficiency, often leveraging hardware-specific features such as vector units to achieve maximum performance. Performance libraries are crafted through extensive benchmarking and optimization, and can be domain-specific, such as genomics libraries, or produced by Arm for general-purpose computing. For example, OpenRNG focuses on generating random numbers quickly and efficiently, which is crucial for simulations and scientific computations, whereas the C++ Standard Library offers a more general-purpose approach with functions like std::mt19937 for random number generation.
13+
Performance libraries are specialized for high-performance computing tasks and are often tailored to the microarchitecture of a specific processor. These libraries are optimized for speed and efficiency, often leveraging hardware-specific features such as vector units to achieve maximum performance. Performance libraries are crafted through extensive benchmarking and optimization, and can be domain-specific, such as genomics libraries, or produced by Arm for general-purpose computing. For example, OpenRNG focuses on generating random numbers quickly and efficiently, which is crucial for simulations and scientific computations, whereas the C++ Standard Library offers a more general-purpose approach with facilities like `std::mt19937` for random number generation.
1414

15-
Performance libraries for Arm CPUs, such as the Arm Performance Libraries (APL), provide highly optimized mathematical functions for scientific computing, similar to how cuBLAS are a set of optimised libaries specifically for NVIDIA GPUs. These libraries can be linked dynamically at runtime or statically during compilation, offering flexibility in deployment. They are designed to support multiple versions of the Arm architecture, including those with NEON and SVE extensions. Generally, minimal source code changes are required to support these libraries, making them easy to integrate.
15+
Performance libraries for Arm CPUs, such as the Arm Performance Libraries (APL), provide highly optimized mathematical functions for scientific computing. An analogous library for accelerating routines on GPUs is cuBLAS, which targets NVIDIA GPUs. Performance libraries such as APL can be linked dynamically at runtime or statically during compilation, offering flexibility in deployment. They are designed to support multiple versions of the Arm architecture, including those with NEON and SVE extensions. Generally, minimal source code changes are required to support these libraries, making them straightforward to adopt when porting and optimising.
1616

1717
### Choosing the right version of a library
1818

@@ -35,13 +35,12 @@ Multiple performance libraries coexist to cater to the diverse needs of differen
3535
- **Hardware Specialization** Some libraries are designed to be cross-platform, supporting multiple hardware architectures to provide flexibility and broader usability. For example, the OpenBLAS library supports both Arm and x86 architectures, allowing developers to use the same library across different systems.
3636

3737
- **Domain-Specific Libraries**: Libraries are often created to handle specific domains or types of computations more efficiently. For instance, libraries like cuDNN are optimized for deep learning tasks, providing specialized functions that significantly speed up neural network training and inference.
38-
These factors contribute to the existence of multiple performance libraries, each tailored to meet the specific demands of various hardware and applications.
3938

4039
- **Commercial Libraries**: Alternatively, some highly performant libraries require a license to use. This is more common in domain specific libraries such as computations chemistry or fluid dynamics.
4140

41+
These factors contribute to the existence of multiple performance libraries, each tailored to meet the specific demands of various hardware and applications.
4242

43-
44-
Invariably, there will be performance differences between each library and the best way to observe it to use the library within your own program. For more information please read [this blog](https://community.arm.com/arm-community-blogs/b/servers-and-cloud-computing-blog/posts/arm-performance-libraries-24-10).
43+
Invariably, there will be performance differences between libraries, and the best way to observe them is to use each library within your own program. For more information on performance benchmarking, please read [this blog](https://community.arm.com/arm-community-blogs/b/servers-and-cloud-computing-blog/posts/arm-performance-libraries-24-10).
4544

4645
### What performance libraries are available on Arm?
4746

content/learning-paths/servers-and-cloud-computing/using-and-porting-performance-libs/2.md

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ layout: learningpathall
88

99
## Setting Up Your Environment
1010

11-
In this initial example we will use an Arm-based AWS `t4g.2xlarge` instance along with the Arm Performance Libraries. For instructions to connect to an AWS instance, please see our [getting started guide](https://learn.arm.com/learning-paths/servers-and-cloud-computing/intro/).
11+
In this initial example we will use an Arm-based AWS `t4g.2xlarge` instance running Ubuntu 22.04 LTS along with the Arm Performance Libraries. For instructions to connect to an AWS instance, please see our [getting started guide](https://learn.arm.com/learning-paths/servers-and-cloud-computing/intro/).
1212

1313
Once connected via `ssh`, install the required packages with the following commands.
1414

content/learning-paths/servers-and-cloud-computing/using-and-porting-performance-libs/3.md

Lines changed: 3 additions & 3 deletions
@@ -8,7 +8,7 @@ layout: learningpathall
88

99
## Example using Optimised Math library
1010

11-
The libamath library from Arm is an optimized subset of the standard library math functions for Arm-based CPUs, providing both scalar and vector functions at different levels of precision. It includes vectorized versions (Neon and SVE) of common math functions found in the standard library, such as those in the `<cmath>` header.
11+
The `libamath` library from Arm is an optimized subset of the standard library math functions for Arm-based CPUs, providing both scalar and vector functions at different levels of precision. It includes vectorized versions (Neon and SVE) of common math functions found in the standard library, such as those in the `<cmath>` header.
1212

1313
The trivial snippet below uses the `<cmath>` standard cmath header to calculate the base exponential of a scalar value. Copy and paste the code sample below into a file named `basic_math.cpp`.
1414

@@ -65,13 +65,13 @@ int main() {
6565
}
6666
```
6767

68-
Compiling using the following g++ command. Again we can use the `ldd` command to print the shared objects for dynamic linking. Now we can opbserve the `libamath.so` shared object is linked.
68+
Compile using the following g++ command. Again, we can use the `ldd` command to print the shared objects for dynamic linking.
6969

7070
```bash
7171
g++ optimised_math.cpp -o optimised_math -lamath -lm
7272
ldd optimised_math
7373
```
74-
You should see the following output.
74+
Now we can observe that the `libamath.so` shared object is linked.
7575

7676
```output
7777

content/learning-paths/servers-and-cloud-computing/using-and-porting-performance-libs/_index.md

Lines changed: 2 additions & 1 deletion
@@ -9,7 +9,8 @@ learning_objectives:
99
- Learn how to incorporate optimised libraries
1010
- Learn how to port a basic application from x86 to AArch64
1111
prerequisites:
12-
- Understanding of C++, Linux and the GCC/G++ compiler
12+
- Access to an Arm / x86-based cloud instance
13+
- Intermediate understanding of C++, Linux and compilation
1314

1415
author_primary: Kieran Hejmadi
1516
