Skip to content

Commit 93d87c7

Browse files
authored
Merge pull request #1738 from pareenaverma/content_review
Tech review of thread sync LP
2 parents 0f8338f + c95f627 commit 93d87c7

File tree

5 files changed

+224
-88
lines changed

5 files changed

+224
-88
lines changed

content/learning-paths/servers-and-cloud-computing/memory_consistency/_index.md

Lines changed: 0 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -29,13 +29,6 @@ tools_software_languages:
2929
- Litmus7
3030
- Arm ISA
3131

32-
test_images:
33-
- ubuntu:latest
34-
test_link: null
35-
test_maintenance: false
36-
test_status:
37-
- passed
38-
3932
further_reading:
4033
- resource:
4134
title: Arm Architecture Reference Manual for A-profile architecture

content/learning-paths/servers-and-cloud-computing/memory_consistency/arm_mem_model.md

Lines changed: 47 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -6,47 +6,80 @@ layout: "learningpathall"
66

77
## CPU Memory Model vs Language/Runtime Memory Models
88

9-
It's important to understand that the majority of developers will not need to concern themselves with the memory consistency model of the CPU their code will execute on. This is because programming languages and runtime engines abstract away the CPUs memory model by presenting the programmer with a language/runtime memory model. This abstraction is achieved by providing developers with a set of language/runtime specific memory ordering rules, synchronization constructs, and supporting libraries. So long as the developer uses these correctly, language compilers and runtime engines will make sure the code executes correctly on any CPU regardless of how strong or weakly ordered it is.
9+
Majority of developers will not need to be deeply familiar with the memory consistency model of the CPU their code will execute on. This is because programming languages and runtime engines abstract away the CPUs memory model by presenting the programmer with a language/runtime memory model. This abstraction is achieved by providing developers with a set of language/runtime specific memory ordering rules, synchronization constructs, and supporting libraries. As long as the developer uses these correctly, language compilers and runtime engines will make sure the code executes correctly on any CPU regardless of how strong or weakly ordered it is.
1010

1111
That said, developers may want to dig deeper into this topic for various reasons including:
1212

13-
- Extract more performance from weakly ordered CPUs (like Arm).
14-
- Keep in mind that compilers and runtimes will do a good job of maximizing performance. Only in well understood niche cases will there be a potential for performance gain by going beyond what the compiler/runtime would do to higher level code. For most cases, all it takes to get more performance is to use the latest compilers, compiler switches, and runtimes.
13+
- Extract more performance from weakly ordered CPUs (like Arm CPUs).
14+
- Compilers and runtimes will do a good job of maximizing performance. Only in well understood niche cases will there be a potential for performance gain by going beyond what the compiler/runtime would do to higher level code. For most cases, all it takes to get more performance is to use the latest compilers, compiler switches, and runtimes.
1515
- Develop confidence in the correctness of synchronization constructs.
1616
- Develop an understanding of how compilers and runtimes select different machine instructions while still honoring the memory ordering rules of the language/runtime and CPU.
17-
- General learning.
1817

19-
## What We Will Do
20-
21-
In this Learning Path, we will use publicly available tools to explore thread synchronization on Arm CPUs. This will provide the reader with enough working knowledge of the tools to be able to explore on their own. At the end of this Learning Path, we will provide more information on where readers can find a more formal (and complex) treatment of this subject.
18+
In this Learning Path, you will use publicly available tools to explore thread synchronization on Arm CPUs. You will gain enough working knowledge of the tools to be able to explore thread synchronization concepts. At the end of this Learning Path, you will be able access more information to get a deeper understanding of this subject.
2219

2320
## The Formal Definition of the Arm Memory Model
2421

25-
The formal definition of the Arm memory model is in the [Arm Architecture Reference Manual for A-profile architecture](https://developer.arm.com/documentation/ddi0487/la) (Arm ARM) under the section called `The AArch64 Application Level Memory Model`. The ordering requirements defined in the Arm ARM is a transliteration of the `aarch64.cat` file hosted on the Arm [herd7 simulator tool](https://developer.arm.com/herd7). In fact, this `aarch64.cat` file is the authoritative definition of the Arm memory model. Since it is a formal definition, it is complex.
22+
The formal definition of the Arm memory model is in the [Arm Architecture Reference Manual for A-profile architecture](https://developer.arm.com/documentation/ddi0487/la) (Arm ARM) under the section called `The AArch64 Application Level Memory Model`. The ordering requirements defined in the Arm ARM is a transliteration of the `aarch64.cat` file hosted on the Arm [herd7 simulator tool](https://developer.arm.com/herd7). In fact, this `aarch64.cat` file is the authoritative definition of the Arm memory model. As it is a formal definition, it is complex.
2623

2724
## Herd7 Simulator & Litmus7 Tool
2825

29-
The herd7 simulator provides a way to test snippets of Arm assembly against the formal definition of the Arm memory model (the `aarch64.cat` file mentioned above). The litmus7 tool can take the same snippets of assembly that run on herd7 and run them on actual Arm HW. This allows for comparing the formal memory model to the memory model of an actual Arm CPU. These snippets of assembly are called litmus tests.
26+
The herd7 simulator provides a way to test snippets of Arm assembly against the formal definition of the Arm memory model (the `aarch64.cat` file mentioned above). The litmus7 tool can take the same snippets of assembly that run on herd7 and run them on actual Arm hardware. This allows for comparing the formal memory model to the memory model of an actual Arm CPU. These snippets of assembly are called litmus tests.
3027

3128
It's important to understand that it is possible for an implementation of an Arm CPU to be more strongly ordered than the formally defined Arm memory model. This case is not a violation of the memory model because it will still execute code in a way that is compliant with the memory ordering rules.
3229

33-
## Installing the Tools
30+
## Install the Tools
31+
32+
Herd7 and Litmus7 are part of the [diy7](http://diy.inria.fr/) tool suite. The diy7 tool suite can be installed by following the [installation instructions](http://diy.inria.fr/sources/index.html). You will install the tools on an Arm Linux system so that both herd7 and litmus7 can be compared side by side on the same system.
33+
34+
Start an Arm-based cloud instance. This example uses a `t4g.xlarge` AWS instance running Ubuntu 22.04 LTS, but other instances types are possible.
35+
36+
If you are new to cloud-based virtual machines, refer to [Get started with Servers and Cloud Computing](/learning-paths/servers-and-cloud-computing/intro/).
37+
38+
First confirm you are using a Arm-based instance with the following command.
39+
40+
```bash
41+
uname -m
42+
```
43+
You should see the following output.
44+
45+
```output
46+
aarch64
47+
```
48+
49+
Next, install OCaml Package Manager (opam). You will need `opam` to install the `herdtools7` tool suite:
50+
```bash
51+
sudo apt update
52+
sudo apt install opam -y
53+
```
54+
55+
Setup `opam` to install the tools:
56+
```bash
57+
opam init
58+
opam update
59+
eval $(opam env)
60+
```
3461

35-
Herd7 and Litmus7 are part of the [diy7](http://diy.inria.fr/) tool suite. The diy7 tool suite can be installed by following the [installation instructions](http://diy.inria.fr/sources/index.html). We suggest installing this on an Arm system so that both herd7 and litmus7 can be compared side by side on the same system.
62+
Now install the `herdtool7` tool suite which include both `litmus7` and `herd7`:
3663

64+
```bash
65+
opam install herdtools7
66+
```
3767

38-
## Running Herd7 and Litmus7 Example Commands
3968

40-
The test file is assumed to be called `test.litmus` and is in the current directory.
69+
## Herd7 and Litmus7 Example Commands
4170

42-
The help menu shows all options.
71+
You can run `--help` on both the tools to review all the options available.
4372
```
4473
herd7 --help
4574
```
4675
```
4776
litmus7 --help
4877
```
4978

79+
The input to both `herd7` and `litmus7` tools are snippets of assembly code, called litmus tests.
80+
81+
Shown below are some example of running the tools with a litmus test. In the next section, you will go through an actual litmus test example.
82+
5083
Example of running herd7.
5184
```
5285
herd7 ./test.litmus

0 commit comments

Comments
 (0)