| 1 | +--- |
| 2 | +title: Moving from x86 to AArch64 |
| 3 | +weight: 5 |
| 4 | + |
| 5 | +### FIXED, DO NOT MODIFY |
| 6 | +layout: learningpathall |
| 7 | +--- |
| 8 | + |
| 9 | +## Example: porting an application that uses Intel's Vector Statistics Library |
| 10 | + |
| 11 | +OpenRNG is an open-source Random Number Generator (RNG) library, initially released with Arm Performance Libraries 24.04, designed to improve performance when porting applications to Arm. It serves as a drop-in replacement for Intel's Vector Statistics Library (VSL). OpenRNG supports several RNG types, including pseudorandom, quasirandom, and nondeterministic generators, and offers tools for efficient multithreading and for converting random sequences into common probability distributions. A vector of random numbers is a sequence of numbers that appears random. Such vectors are used in applications including simulating unpredictable natural processes, modeling financial markets, and creating unpredictable AI behaviors in gaming. |
| 12 | + |
| 13 | + |
| 14 | +## Run on an x86 Instance |
| 15 | + |
| 16 | +To demonstrate porting, we start with an application running on an x86_64 AWS `t3.2xlarge` instance with 32 GB of storage. To set up an instance, refer to the [Getting started with Servers and Cloud computing](https://learn.arm.com/learning-paths/servers-and-cloud-computing/intro/) guide and select an x86 instance type. |
| 17 | + |
| 18 | +Install the oneAPI toolkit using [Intel's instructions](https://www.intel.com/content/www/us/en/docs/oneapi/installation-guide-linux/2023-0/apt.html#GUID-560A487B-1B5B-4406-BB93-22BC7B526BCD). |
| 19 | + |
| 20 | +The following source code uses a classic algorithm to estimate pi. Copy and paste the source code below into a file named `pi_x86.c`. |
| 21 | + |
| 22 | +```c |
| 23 | +/* |
| 24 | + * SPDX-FileCopyrightText: <text>Copyright 2024 Arm Limited and/or its |
| 25 | + * affiliates <[email protected]></text> |
| 26 | + * |
| 27 | + * SPDX-License-Identifier: MIT OR Apache-2.0 WITH LLVM-exception |
| 28 | + */ |
| 29 | + |
| 30 | +#include <mkl.h> // Using Vector Statistics Library |
| 31 | +#include <stdio.h> |
| 32 | +#include <stdlib.h> |
| 33 | + |
| 34 | +void assert_message(int condition, const char *message) { |
| 35 | +  if (!condition) { |
| 36 | +    printf("Error: %s\n", message); |
| 37 | +    exit(EXIT_FAILURE); |
| 38 | +  } |
| 39 | +} |
| 40 | + |
| 41 | +int main() { |
| 42 | + |
| 43 | +  const size_t nIterations = 1000 * 1000; |
| 44 | +  const size_t nRandomNumbers = 2 * nIterations; |
| 45 | + |
| 46 | +  // |
| 47 | +  // Declare and initialise the stream. |
| 48 | +  // |
| 49 | +  // In this example, we've selected the PHILOX4X32X10 generator and seeded it |
| 50 | +  // with 42. We can then check that the method executed successfully by checking |
| 51 | +  // the return value for VSL_ERROR_OK. Most methods return VSL_ERROR_OK on |
| 52 | +  // success. |
| 53 | +  // |
| 54 | +  VSLStreamStatePtr stream; |
| 55 | +  int errcode = vslNewStream(&stream, VSL_BRNG_PHILOX4X32X10, 42); |
| 56 | +  assert_message(errcode == VSL_ERROR_OK, "vslNewStream failed"); |
| 57 | + |
| 58 | +  // |
| 59 | +  // Allocate a buffer for storing random numbers. |
| 60 | +  // |
| 61 | +  float *randomNumbers = malloc(nRandomNumbers * sizeof(float)); |
| 62 | +  assert_message(randomNumbers != NULL, "malloc failed"); |
| 63 | + |
| 64 | +  // |
| 65 | +  // Generate a uniform distribution between 0 and 1. |
| 66 | +  // |
| 67 | +  // First, we select the method used to generate the uniform distribution; in |
| 68 | +  // this example, we use the standard method. We pass in a pointer to an |
| 69 | +  // initialised stream, the amount of random numbers we want, followed by a |
| 70 | +  // pointer to a buffer big enough to hold all the random numbers requested. |
| 71 | +  // Finally, we pass in parameters specific to the distribution, in this case, |
| 72 | +  // 0 and 1, meaning we want the range [0, 1). |
| 73 | +  // |
| 74 | +  errcode = vsRngUniform(VSL_RNG_METHOD_UNIFORM_STD, stream, nRandomNumbers, |
| 75 | +                         randomNumbers, 0, 1); |
| 76 | +  assert_message(errcode == VSL_ERROR_OK, "vsRngUniform failed"); |
| 77 | + |
| 78 | +  // |
| 79 | +  // Use the random numbers. |
| 80 | +  // |
| 81 | +  // This is a classic algorithm used for estimating the value of pi. We imagine |
| 82 | +  // a unit square overlapping a quarter of a circle with unit radius. We then |
| 83 | +  // treat pairs of successive random numbers as points on the unit square. We |
| 84 | +  // can check if the point is inside the quarter circle by measuring the |
| 85 | +  // distance between the point and the centre of the circle; if the distance is |
| 86 | +  // less than 1, the point is inside the circle. The proportion of points |
| 87 | +  // inside the circle should be |
| 88 | +  // |
| 89 | +  //   (area of quarter circle) / (area of square) := pi / 4. |
| 90 | +  // |
| 91 | +  // so |
| 92 | +  // |
| 93 | +  //   pi = 4 * (proportion of points inside circle) |
| 94 | +  // |
| 95 | +  int count = 0; |
| 96 | +  for (size_t i = 0; i < nIterations; i++) { |
| 97 | +    float x = randomNumbers[2 * i + 0]; |
| 98 | +    float y = randomNumbers[2 * i + 1]; |
| 99 | + |
| 100 | +    if (x * x + y * y < 1) { |
| 101 | +      count++; |
| 102 | +    } |
| 103 | +  } |
| 104 | +  float estimateOfPi = 4.0f * count / nIterations; |
| 105 | + |
| 106 | +  printf("Estimate of pi: %f\n", estimateOfPi); |
| 107 | +  printf("Number of iterations: %zu\n", nIterations); |
| 108 | + |
| 109 | +  // |
| 110 | +  // The buffer passed into vsRngUniform is still owned by the user. |
| 111 | +  // |
| 112 | +  free(randomNumbers); |
| 113 | + |
| 114 | +  // |
| 115 | +  // Release any resources held by the stream. |
| 116 | +  // |
| 117 | +  errcode = vslDeleteStream(&stream); |
| 118 | +  assert_message(errcode == VSL_ERROR_OK, "vslDeleteStream failed"); |
| 119 | + |
| 120 | +  return EXIT_SUCCESS; |
| 121 | +} |
| 122 | + |
| 123 | +``` |
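
The estimation loop in the listing above does not depend on VSL itself, only on a supply of uniform numbers in the range [0, 1). The following minimal sketch isolates that step so it can be tried without oneMKL or OpenRNG installed. The generator here is an assumption made purely for illustration: a small 64-bit linear congruential generator, which is not Philox and is not part of either library. The names `next_uniform` and `estimate_pi` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in generator: a 64-bit linear congruential generator.
 * It is NOT Philox and NOT part of VSL or OpenRNG. */
static float next_uniform(uint64_t *state) {
  *state = *state * 6364136223846793005ULL + 1442695040888963407ULL;
  /* Keep the top 24 bits so the value is exactly representable as a float
   * and always lies in the range [0, 1). */
  return (float)(*state >> 40) / 16777216.0f;
}

/* Estimate pi by counting how many of n_points random points in the unit
 * square fall inside the quarter circle of unit radius. */
double estimate_pi(size_t n_points, uint64_t seed) {
  assert(n_points > 0);
  uint64_t state = seed;
  size_t inside = 0;
  for (size_t i = 0; i < n_points; i++) {
    float x = next_uniform(&state);
    float y = next_uniform(&state);
    if (x * x + y * y < 1.0f) {
      inside++;
    }
  }
  return 4.0 * (double)inside / (double)n_points;
}
```

Calling `estimate_pi(1000000, 42)` should return a value close to 3.14. The exact digits depend on the generator, so they will not match the VSL or OpenRNG results.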
| 124 | + |
| 125 | + |
| 126 | +Compile the source code by running the following commands. You may need to change the oneAPI version from 2025.0 to the version installed on your system. |
| 127 | + |
| 128 | +```bash |
| 129 | +export LD_LIBRARY_PATH=/opt/intel/oneapi/2025.0/lib:$LD_LIBRARY_PATH |
| 130 | +gcc -o pi_x86 pi_x86.c -lmkl_rt -I/opt/intel/oneapi/2025.0/include -L/opt/intel/oneapi/2025.0/lib |
| 131 | +``` |
| 132 | + |
| 133 | +Using the `ldd` command to print the shared objects, we can see that the binary links against `libmkl_rt`. |
| 134 | + |
| 135 | +```output |
| 136 | +ldd ./pi_x86 |
| 137 | + linux-vdso.so.1 (0x00007fff9ddc7000) |
| 138 | + libmkl_rt.so.2 => /opt/intel/oneapi/2025.0/lib/libmkl_rt.so.2 (0x0000748c46400000) |
| 139 | + libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x0000748c46000000) |
| 140 | + libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x0000748c4711a000) |
| 141 | + /lib64/ld-linux-x86-64.so.2 (0x0000748c4712c000) |
| 142 | +``` |
| 143 | +## Porting to use OpenRNG |
| 144 | + |
| 145 | +In most cases, OpenRNG is a drop-in replacement for the Vector Statistics Library. Refer to the OpenRNG reference guide for full information on which functions are supported. To build this source code on Arm, the only change needed is to replace the `mkl.h` include with `openrng.h`. |
| 146 | + |
| 147 | +```c |
| 148 | +// from |
| 149 | +#include <mkl.h> |
| 150 | +// to |
| 151 | +#include <openrng.h> |
| 152 | +``` |
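
If the same source file has to build on both architectures, one option is to select the header at compile time instead of editing it by hand. The following is a minimal sketch of that idea and not something the original example does. It assumes that AArch64 builds use OpenRNG and all other builds use oneMKL, and it relies on the `__aarch64__` macro that GCC and Clang predefine on AArch64 targets. The `RNG_BACKEND` macro and the two helper functions are hypothetical names used only for illustration. The real include lines are commented out so the sketch compiles without either library installed.

```c
#include <stdio.h>

/* Assumption for this sketch: AArch64 builds use OpenRNG and all other
 * builds use oneMKL. The include lines are commented out so the sketch
 * compiles without either library; uncomment them in a real project. */
#if defined(__aarch64__)
/* #include <openrng.h> */
#define RNG_BACKEND "OpenRNG"
#else
/* #include <mkl.h> */
#define RNG_BACKEND "oneMKL VSL"
#endif

/* Hypothetical helper: name of the library selected at compile time. */
const char *rng_backend(void) { return RNG_BACKEND; }

/* Hypothetical helper: report the selection, for example at start-up. */
void print_rng_backend(void) { printf("RNG backend: %s\n", rng_backend()); }
```

Because the VSL function names used in the example are the same in both headers, the rest of the program would not need any conditional code under this approach.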
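| 153 | +Save the modified source as `pi.c`, then compile it and link it against Arm Performance Libraries. You may need to change the version from 24.10 to the version installed on your system. |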
| 154 | +``` |
| 155 | +gcc -c -mcpu=native -I/opt/arm/armpl_24.10_gcc/include -std=c99 pi.c -o pi.o |
| 156 | +gcc -mcpu=native pi.o -L/opt/arm/armpl_24.10_gcc/lib -larmpl -lamath -lm -o pi.exe |
| 157 | + |
| 158 | +Running program openrng.exe: |
| 159 | +Estimate of pi: 3.142112 |
| 160 | +Number of iterations: 1000000 |
| 161 | +``` |