content/learning-paths/servers-and-cloud-computing/llama_cpp_streamline/1_overview.md
Lines changed: 2 additions & 2 deletions
@@ -14,7 +14,7 @@ Frameworks such as [**llama.cpp**](https://github.com/ggml-org/llama.cpp), provi

 To analyze their execution and use profiling insights for optimization, you need both a basic understanding of transformer architectures and the right analysis tools.

-This Learning Path demonstrates how to use `llama-cli`application from llama.cpp together with Arm Streamline to analyze the efficiency of LLM inference on Arm CPUs.
+This Learning Path demonstrates how to use `llama-cli` from the command line together with Arm Streamline to analyze the efficiency of LLM inference on Arm CPUs.

 You will learn how to:
 - Profile token generation at the Prefill and Decode stages
@@ -23,4 +23,4 @@ You will learn how to:

 You will run the `Qwen1_5-0_5b-chat-q4_0.gguf` model using `llama-cli` on Arm Linux and use Streamline for analysis.

-The same method can also be applied to Android platforms.
content/learning-paths/servers-and-cloud-computing/llama_cpp_streamline/3_llama.cpp_annotation.md
Lines changed: 23 additions & 24 deletions
@@ -20,36 +20,33 @@ You can either build natively on an Arm platform, or cross-compile on another ar

 ### Step 1: Build Streamline Annotation library

-Install [Arm DS](https://developer.arm.com/Tools%20and%20Software/Arm%20Development%20Studio) or [Arm Streamline](https://developer.arm.com/Tools%20and%20Software/Streamline%20Performance%20Analyzer) on your development machine first.
+Download and install [Arm Performance Studio](https://developer.arm.com/Tools%20and%20Software/Arm%20Performance%20Studio#Downloads) on your development machine.

-Streamline Annotation support code is in the installation directory such as `Arm/Development Studio 2024.1/sw/streamline/gator/annotate`.
-
-For installation guidance, refer to the [Streamline installation guide](/install-guides/streamline/).
-
-Clone the gator repository that matches your Streamline version and build the `Annotation support library`.
+{{% notice Note %}}
+You can also download and install [Arm Development Studio](https://developer.arm.com/Tools%20and%20Software/Arm%20Development%20Studio#Downloads), as it also includes Streamline.

-The installation step depends on your development machine.
+{{% /notice %}}

-For Arm native build, you can use the following instructions to install the packages.
+Streamline Annotation support code is in the `streamline/gator/annotate` directory of the Arm Performance Studio installation.

-For other machines, you need to set up the cross compiler environment by installing [Arm GNU toolchain](https://developer.arm.com/downloads/-/arm-gnu-toolchain-downloads).
+Clone the gator repository that matches your Streamline version and build the `Annotation support library`. You can build it natively on your current machine using the native build instructions, or cross-compile it for another Arm computer using the cross-compile instructions.

-You can refer to the [GCC install guide](https://learn.arm.com/install-guides/gcc/cross/) for cross-compiler installation.
+If you need to set up a cross compiler, you can review the [GCC install guide](/install-guides/gcc/cross/).
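
Which of the two build flows applies depends on the machine you are building on. As a minimal sketch, assuming a shell where `uname -m` reports `aarch64` or `arm64` on 64-bit Arm hosts, a small helper can suggest a default flow. The function name `build_flow` is hypothetical and is not part of gator or llama.cpp.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of gator or llama.cpp): map a machine
# architecture string to the build flow that usually applies.
build_flow() {
  case "$1" in
    aarch64|arm64) echo "native" ;;  # 64-bit Arm host: build on this machine
    *)             echo "cross"  ;;  # any other host: use a cross compiler
  esac
}

# Report the suggested flow for the current machine.
echo "Build flow for $(uname -m): $(build_flow "$(uname -m)")"
```

This is only a default suggestion: an Arm host can still cross-compile for a different Arm computer, in which case the cross-compile instructions apply.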
-To link the `libstreamline_annotate.a` library when building llama-cli, add the following lines at the end of `llama.cpp/tools/main/CMakeLists.txt`.
+To link the `libstreamline_annotate.a` library when building llama-cli, use an editor to add the following lines at the end of `llama.cpp/tools/main/CMakeLists.txt`.
content/learning-paths/servers-and-cloud-computing/llama_cpp_streamline/4_analyze_token_prefill_decode.md
Lines changed: 11 additions & 9 deletions
@@ -8,24 +8,25 @@ layout: learningpathall

 ## Run llama-cli and analyze the data with Streamline

-After successfully building llama-cli, the next step is to set up the runtime environment on your Arm platform.
+After successfully building llama-cli, the next step is to set up the runtime environment on your Arm platform. This can be your development machine or another Arm system.

-### Set up gatord
+### Set up the gator daemon

-The gator daemon (gatord) is the Streamline collection agent that runs on the target device. It captures performance data including CPU metrics, PMU events, and annotations, then sends this data to the Streamline analysis tool running on your host machine. The daemon needs to be running on your target device before you can capture performance data.
+The gator daemon, `gatord`, is the Streamline collection agent that runs on the target device. It captures performance data including CPU metrics, PMU events, and annotations, then sends this data to the Streamline analysis tool running on your host machine. The daemon needs to be running on your target device before you can capture performance data.

 Depending on how you built llama.cpp:

 For the cross-compiled build flow:

 - Copy the `llama-cli` executable to your Arm target.
-- Also copy the `gatord` binary from the Arm DS or Streamline installation:
-- Linux: `Arm\Development Studio 2024.1\sw\streamline\bin\linux\arm64`
-- Android: `Arm\Development Studio 2024.1\sw\streamline\bin\android\arm64`
+- Copy the `gatord` binary from the Arm Performance Studio release. If you are targeting Linux, take it from `streamline\bin\linux\arm64`; if you are targeting Android, take it from `streamline\bin\android\arm64`.
+
+Put both of these programs in your home directory on the target system.

 For the native build flow:
+- Use the `llama-cli` from your local build in `llama.cpp/build/bin` and the `gatord` you compiled earlier at `~/gator/build-native-gcc-rel/gatord`.

-- Use the `llama-cli`from your local build and the `gatord` you compiled earlier (`~/gator/build-native-gcc-rel/gatord`).
+You now have `gatord` and `llama-cli` on the computer you want to profile.

 ### Download a lightweight model

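
For the cross-compiled flow in the hunk above, the prebuilt `gatord` has to be located inside the Arm Performance Studio release before it can be copied to the target. The following is a minimal sketch under stated assumptions: the `PERF_STUDIO_DIR` variable and its default value, the helper name `gatord_path`, the file name `gatord` inside the listed directories, and the use of forward slashes are all illustrative and not taken from the Learning Path.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build the path to the prebuilt gatord binary inside an
# Arm Performance Studio installation.
# $1 = installation directory (assumption), $2 = target OS: linux or android
gatord_path() {
  case "$2" in
    linux|android)
      printf '%s/streamline/bin/%s/arm64/gatord\n' "$1" "$2"
      ;;
    *)
      echo "unsupported target OS: $2" >&2
      return 1
      ;;
  esac
}

# Example with a placeholder installation directory.
PERF_STUDIO_DIR="${PERF_STUDIO_DIR:-/opt/arm-performance-studio}"
echo "Linux gatord:   $(gatord_path "$PERF_STUDIO_DIR" linux)"
echo "Android gatord: $(gatord_path "$PERF_STUDIO_DIR" android)"
```

With a Linux target reachable over SSH, the printed path and your cross-compiled `llama-cli` could then be handed to a copy command such as `scp`, with a placeholder destination like `user@arm-target:~/`.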
@@ -49,8 +50,9 @@ Start the gator daemon on your Arm target:
 You should see similar messages to those shown below:

```bash
-Streamline Data Recorder v9.4.0 (Build 9b1e8f8)
-Copyright (c) 2010-2024 Arm Limited. All rights reserved.
+Streamline Data Recorder v9.6.0 (Build oss)
+Copyright (c) 2010-2025 Arm Limited. All rights reserved.