Commit e234be4

Merge branch 'uvision-review' of https://github.com/pareenaverma/arm-learning-paths into uvision-review
2 parents b3e7ac9 + 1d94548 commit e234be4

File tree

33 files changed

+778
-99
lines changed

content/install-guides/ams.md

Lines changed: 71 additions & 16 deletions
@@ -12,6 +12,7 @@ additional_search_terms:
1212
- mali
1313
- immortalis
1414
- cortex-a
15+
- Install Arm Mobile Studio
1516

1617

1718
### Estimated completion time in minutes (please use integer multiple of 5)
@@ -32,41 +33,95 @@ test_maintenance: true
3233
test_images:
3334
- ubuntu:latest
3435
---
35-
[Arm Performance Studio](https://developer.arm.com/Tools%20and%20Software/Arm%20Performance%20Studio%20for%20Mobile) (formally known as `Arm Mobile Studio`) is a performance analysis tool suite for various application developers:
36+
[Arm Performance Studio](https://developer.arm.com/Tools%20and%20Software/Arm%20Performance%20Studio) is a performance analysis tool suite for Android and Linux application developers.
3637

37-
* Android application developers
38-
* Linux application developers in Embedded and Cloud segments
38+
It comprises a suite of easy-to-use tools that show you how well your game or app performs on production devices, so that you can identify problems that might cause slow performance, overheat devices, or drain the battery.
3939

40-
It comprises of a suite of easy-to-use tools that show you how well your game or app performs on production devices, so that you can identify problems that might cause slow performance, overheat the device, or drain the battery.
4140

42-
[Frame Advisor](https://developer.arm.com/Tools%20and%20Software/Frame%20Advisor) is available in `2023.5` and later.
41+
| Component | Functionality |
42+
|----------|-------------|
43+
| [Streamline](https://developer.arm.com/Tools%20and%20Software/Streamline%20Performance%20Analyzer) | Capture a performance profile that shows all the performance counter activity from the device. |
44+
| [Performance Advisor](https://developer.arm.com/Tools%20and%20Software/Performance%20Advisor) | Generate an easy-to-read performance summary from an annotated Streamline capture, and get actionable advice about where you should optimize. |
45+
| [Frame Advisor](https://developer.arm.com/Tools%20and%20Software/Frame%20Advisor) | Capture the API calls and rendering from a problem frame and get comprehensive geometry metrics to discover what might be slowing down your application. |
46+
| [Mali Offline Compiler](https://developer.arm.com/Tools%20and%20Software/Mali%20Offline%20Compiler) | Analyze how efficiently your shader programs perform on a range of Mali GPUs. |
47+
| [RenderDoc for Arm GPUs](https://developer.arm.com/Tools%20and%20Software/RenderDoc%20for%20Arm%20GPUs) | The industry-standard tool for debugging Vulkan graphics applications, including early support for Arm GPU extensions and Android features. |
4348

44-
[RenderDoc for Arm GPUs](https://community.arm.com/arm-community-blogs/b/graphics-gaming-and-vr-blog/posts/beyond-mobile-arm-mobile-studio-is-now-arm-performance-studio) is available in `2024.0` and later.
4549

46-
[Graphics Analyzer](https://developer.arm.com/Tools%20and%20Software/Graphics%20Analyzer) is no longer provided. The final release was provided in the `2024.2` release.
47-
48-
All features of Arm Performance Studio are available free of charge without any additional license as of the `2022.4` release.
50+
All features of Arm Performance Studio are available free of charge without any additional license.
4951

5052
## How do I install Arm Performance Studio?
5153

5254
Arm Performance Studio is supported on Windows, Linux, and macOS hosts. Download the appropriate installer from [Arm Performance Studio Downloads](https://developer.arm.com/Tools%20and%20Software/Arm%20Performance%20Studio#Downloads).
5355

54-
Full installation and application launch instructions are given in the Arm Performance Studio [Release Notes](https://developer.arm.com/documentation/107649).
56+
Full details about the supported OS and Android versions are given in the Arm Performance Studio [Release Notes](https://developer.arm.com/documentation/107649).
5557

5658
### How do I install Arm Performance Studio on Windows?
5759

58-
Run the supplied `Arm_Performance_Studio_<version>_windows_x86-64.exe` installer, and follow on-screen instructions.
60+
Run the downloaded `Arm_Performance_Studio_<version>_windows_x86-64.exe` installer, and follow the on-screen instructions.
5961

60-
### How do I install Arm Performance Studio on Linux?
62+
To open Streamline, Frame Advisor or RenderDoc for Arm GPUs, go to the Windows Start menu and search for the name of the tool you want to open.
63+
64+
Performance Advisor is a feature of the Streamline command-line application. To generate a performance report, you must first run the provided Python script to enable Streamline to collect frame data from the device. This process is described in detail in the [Get started with Performance Advisor tutorial](https://developer.arm.com/documentation/102478/latest). After you have captured a profile with Streamline, run `Streamline-cli` on the Streamline capture file. This command is added to your `PATH` environment variable during installation, so it can be used from anywhere.
65+
66+
```console
67+
Streamline-cli.exe -pa <options> my_capture.apc
68+
```
69+
70+
To run Mali Offline Compiler, open a command terminal, navigate to your work directory, and run the `malioc` command on a shader program. The `malioc` command is added to your `PATH` environment variable during installation, so it can be used from anywhere.
6171

62-
Unpack the supplied `Arm Performance Studio` bundle to the desired location. For example:
6372
```console
64-
tar -xf Arm_Performance_Studio_2024.3_linux_x86-64.tgz
73+
malioc.exe <options> my_shader.frag
6574
```
75+
6676
### How do I install Arm Performance Studio on macOS?
6777

68-
Run the supplied `Arm_Performance_Studio_<version>_macos_x86-64.dmg` installer, and follow on-screen instructions.
78+
Arm Performance Studio is provided as a `.dmg` package. To mount it, double-click the `.dmg` package and follow the instructions. The Arm Performance Studio directory tree is copied to the Applications directory on your local file system for easy access.
79+
80+
You can remove write permission from the installation directory to prevent other users from writing to it. This is done with the `chmod` command. For example:
81+
82+
```console
83+
chmod go-w <dest_dir>
84+
```
85+
86+
Open Streamline, Frame Advisor or RenderDoc for Arm GPUs directly from the Arm Performance Studio directory in your Applications directory. For example, to open Streamline, go to the `<installation_directory>/streamline` directory and open the `Streamline.app` file.
87+
88+
To run Performance Advisor, go to the `<installation_directory>/streamline` directory, and double-click the `Streamline-cli-launcher` file. Your computer will ask you to allow Streamline to control the Terminal application. Allow this. The Performance Advisor launcher opens the Terminal application and updates your `PATH` environment variable so you can run Performance Advisor from any directory.
89+
90+
Performance Advisor is a feature of the Streamline command-line application. To generate a performance report, you must first run the provided Python script to enable Streamline to collect frame data from the device. This process is described in detail in the [Get started with Performance Advisor tutorial](https://developer.arm.com/documentation/102478/latest). After you have captured a profile with Streamline, run the `Streamline-cli` command on the Streamline capture file to generate a performance report:
91+
92+
```console
93+
Streamline-cli -pa <options> my_capture.apc
94+
```
95+
96+
To run Mali Offline Compiler, go to the `<installation_directory>/mali_offline_compiler` directory, and double-click the `mali_offline_compiler_launcher` file. The Mali Offline Compiler launcher opens the Terminal application and updates your `PATH` environment variable so you can run the `malioc` command from any directory. To generate a shader analysis report, run the `malioc` command on a shader program:
97+
98+
```console
99+
malioc <options> my_shader.frag
100+
```
101+
102+
On some versions of macOS, you might see a message that Mali Offline Compiler is not recognized as an application from an identified developer. To enable Mali Offline Compiler, cancel this message, then open **System Preferences > Security and Privacy** and select **Allow Anyway** for the `malioc` application.
103+
104+
### How do I install Arm Performance Studio on Linux?
105+
106+
Arm Performance Studio is provided as a gzipped tar archive. Extract this tar archive to your preferred location, using version 1.13 or later of GNU tar:
107+
108+
```console
109+
tar xvzf Arm_Performance_Studio_<version>_linux.tgz
110+
```
111+
112+
You can remove write permission from the installation directory to prevent other users from writing to it. This is done with the `chmod` command. For example:
113+
114+
```console
115+
chmod go-w <dest_dir>
116+
```
117+
118+
You might find it useful to add the paths to the `Streamline-cli` and `malioc` executables to your `PATH` environment variable so that you can run them from any directory. Add the following lines to the `.bashrc` file in your home directory, so that they are set whenever you initialize a shell session:
119+
120+
```console
121+
PATH=$PATH:<installation_directory>/streamline
122+
PATH=$PATH:<installation_directory>/mali_offline_compiler
123+
```
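
After you update `.bashrc` and open a new shell, one way to confirm that both tools resolve from `PATH` is a short Python check. This is a minimal sketch and not part of Arm Performance Studio: the helper name `find_tools` is hypothetical, and only the executable names `Streamline-cli` and `malioc` come from the text above.

```python
import shutil


def find_tools(names):
    """Map each executable name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools(["Streamline-cli", "malioc"]).items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If either tool is reported as not found, check that `<installation_directory>` in the `PATH` lines matches the location where you extracted the archive.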
69124

70125
## How do I get started with Arm Performance Studio?
71126

72-
See the [Get started with Arm Performance Studio for Mobile](/learning-paths/mobile-graphics-and-gaming/ams/) learning path for a collection of tutorials for each component of Performance Studio.
127+
Refer to [Get started with Arm Performance Studio](/learning-paths/mobile-graphics-and-gaming/ams/) for an overview of how to run each tool in Arm Performance Studio.
Lines changed: 121 additions & 0 deletions
@@ -0,0 +1,121 @@
1+
---
2+
### Title the install tools article with the name of the tool to be installed
3+
### Include vendor name where appropriate
4+
title: PyTorch for Windows on Arm
5+
6+
### Optional additional search terms (one per line) to assist in finding the article
7+
additional_search_terms:
8+
- python
9+
- windows
10+
- woa
11+
- windows on arm
12+
- open source windows on arm
13+
- pytorch
14+
15+
### Estimated completion time in minutes (please use integer multiple of 5)
16+
minutes_to_complete: 15
17+
18+
### Link to official documentation
19+
official_docs: https://pytorch.org/docs/stable/index.html
20+
21+
author: Pareena Verma
22+
23+
### PAGE SETUP
24+
weight: 1 # Defines page ordering. Must be 1 for first (or only) page.
25+
tool_install: true # Set to true to be listed in main selection page, else false
26+
multi_install: false # Set to true if first page of multi-page article, else false
27+
multitool_install_part: false # Set to true if a sub-page of a multi-page article, else false
28+
layout: installtoolsall # DO NOT MODIFY. Always true for tool install articles
29+
---
30+
31+
PyTorch has native support for [Windows on Arm](https://learn.microsoft.com/en-us/windows/arm/overview). Starting with the PyTorch 2.7 release, Arm native builds of PyTorch for Windows are available for Python 3.12.
32+
33+
A number of developer-ready Windows on Arm [devices](/learning-paths/laptops-and-desktops/intro/find-hardware/) are available.
34+
35+
Windows on Arm instances are available with Microsoft Azure. For further information, see [Deploy a Windows on Arm virtual machine on Microsoft Azure](/learning-paths/cross-platform/woa_azure/).
36+
37+
## How do I install PyTorch for Windows on Arm?
38+
39+
Before you install PyTorch on your Windows on Arm machine, install [Python for Windows on Arm](/install-guides/py-woa).
40+
41+
Verify your Python installation at a Windows Command prompt or a PowerShell prompt:
42+
43+
```command
44+
python --version
45+
```
46+
The output should look like:
47+
48+
```output
49+
Python 3.12.9
50+
```
51+
Once you have installed Python, you can install the PyTorch stable release (2.7.0) on your Windows on Arm machine:
52+
53+
```command
54+
pip3 install torch==2.7.0 --index-url https://download.pytorch.org/whl/cpu
55+
```
56+
57+
You will see that the `arm64` wheel for PyTorch is installed on your machine:
58+
```output
59+
Downloading https://download.pytorch.org/whl/cpu/torch-2.7.0%2Bcpu-cp312-cp312-win_arm64.whl (107.9 MB)
60+
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 107.9/107.9 MB 29.7 MB/s eta 0:00:00
61+
Downloading https://download.pytorch.org/whl/sympy-1.13.3-py3-none-any.whl (6.2 MB)
62+
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.2/6.2 MB 47.4 MB/s eta 0:00:00
63+
Downloading https://download.pytorch.org/whl/typing_extensions-4.12.2-py3-none-any.whl (37 kB)
64+
Downloading https://download.pytorch.org/whl/filelock-3.13.1-py3-none-any.whl (11 kB)
65+
Downloading https://download.pytorch.org/whl/fsspec-2024.6.1-py3-none-any.whl (177 kB)
66+
Downloading https://download.pytorch.org/whl/Jinja2-3.1.4-py3-none-any.whl (133 kB)
67+
Downloading https://download.pytorch.org/whl/networkx-3.3-py3-none-any.whl (1.7 MB)
68+
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.7/1.7 MB 30.6 MB/s eta 0:00:00
69+
```
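
To confirm which version pip installed without importing the package, you can query the package metadata from Python. The sketch below uses the standard-library `importlib.metadata` module; the helper name `installed_version` is an assumption for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # After the stable install above, this should report a 2.7.0 build.
    print("torch:", installed_version("torch") or "not installed")
```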
70+
71+
You can also install the nightly preview versions of PyTorch on your Windows on Arm machine:
72+
73+
```command
74+
pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cpu
75+
```
76+
77+
## How can I run a PyTorch example?
78+
79+
To run a PyTorch example, and confirm that PyTorch is working, use a text editor to save the code below to a file named `pytorch_woa.py`.
80+
81+
```python
82+
import torch
83+
import platform
84+
85+
# Print PyTorch version
86+
print("PyTorch version:", torch.__version__)
87+
88+
# Check if CUDA is available
89+
if torch.cuda.is_available():
90+
    print("CUDA is available. PyTorch can use the GPU.")
91+
else:
92+
    print("CUDA is not available. PyTorch will use the CPU.")
93+
94+
# Detect system architecture
95+
architecture = platform.machine()
96+
if "ARM" in architecture.upper() or "AARCH" in architecture.upper():
97+
    print("PyTorch is running on Arm:", architecture)
98+
else:
99+
    print("PyTorch is not running on Arm. Detected architecture:", architecture)
100+
101+
# Perform a basic PyTorch operation to confirm it's working
102+
try:
103+
    tensor = torch.tensor([1.0, 2.0, 3.0])
104+
    print("PyTorch is operational. Tensor created:", tensor)
105+
except Exception as e:
106+
    print("An error occurred while testing PyTorch:", e)
107+
```
108+
Run the code:
109+
110+
```console
111+
python pytorch_woa.py
112+
```
113+
Running on a Windows on Arm machine produces an output similar to:
114+
115+
```output
116+
PyTorch version: 2.7.0+cpu
117+
CUDA is not available. PyTorch will use the CPU.
118+
PyTorch is running on Arm: ARM64
119+
PyTorch is operational. Tensor created: tensor([1., 2., 3.])
120+
```
121+
You are now ready to use PyTorch on your Windows on Arm device.

content/learning-paths/embedded-and-microcontrollers/cmsisdsp-dev-with-python/_index.md

Lines changed: 2 additions & 2 deletions
@@ -3,9 +3,9 @@ title: Getting Started with CMSIS-DSP Using Python
33

44
minutes_to_complete: 30
55

6-
draft: true
6+
draft: false
77
cascade:
8-
draft: true
8+
draft: false
99

1010
who_is_this_for: Developers who want to learn how the CMSIS-DSP package can be integrated into their applications
1111

content/learning-paths/embedded-and-microcontrollers/cmsisdsp-dev-with-python/how-to-4.md

Lines changed: 2 additions & 2 deletions
@@ -67,11 +67,11 @@ First, you need to compute the signal energy from audio in Q15 format using CMSI
6767

6868
If you look at the CMSIS-DSP documentation, you'll see that the `power` and `vlog` functions don't produce results in Q15 format. Tracking the fixed-point format throughout all lines of an algorithm can be challenging. In this example, this means that:
6969

70-
* Subtracting the mean to center the signal - as you did in the reference implementation - is handled in CMSIS-DSP by negating the mean and applying it as an offset to the window. Because the mean is small and the shift is minor relative to the Q15 range, this adjustment won't cause saturation.
70+
* Subtracting the mean to center the signal - as you did in the reference implementation - is handled in CMSIS-DSP by negating the mean and applying it as an offset to the window. Use `arm_negate_q15` for the negation: it saturates, so `0x8000` becomes `0x7FFF`, whereas a plain two's-complement negation leaves `0x8000` unchanged and the sign never flips. In practice, the mean should be small, so `-` and `dsp.arm_negate_q15` give the same result. However, it is good practice to avoid `-` and `+` in a fixed-point algorithm that you translate to CMSIS-DSP function calls.
7171
* The resulting `energy` and `dB` values are not in Q15 format because the `power` and `vlog` functions are used
7272
* The multiplication by 10 from the reference implementation is missing
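
The `0x8000` case from the first bullet can be reproduced in plain Python. This is an illustrative sketch that models 16-bit arithmetic by hand rather than calling CMSIS-DSP; the helper names `wrap_q15`, `negate_wrapping`, and `negate_saturating` are hypothetical.

```python
Q15_MIN = -0x8000  # -32768, stored as the bit pattern 0x8000
Q15_MAX = 0x7FFF   # 32767


def wrap_q15(value):
    """Interpret the low 16 bits of value as a signed two's-complement integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def negate_wrapping(x):
    """Negate with 16-bit wraparound, as unary minus on a 16-bit integer would."""
    return wrap_q15(-x)


def negate_saturating(x):
    """Negate and clamp to the Q15 range (a model of a saturating negation)."""
    return max(Q15_MIN, min(Q15_MAX, -x))


if __name__ == "__main__":
    print(hex(negate_wrapping(Q15_MIN) & 0xFFFF))    # 0x8000: the sign did not change
    print(hex(negate_saturating(Q15_MIN) & 0xFFFF))  # 0x7fff: clamped to the largest positive value
```

For every other Q15 input the two functions agree, which is why the difference only matters at the extreme of the range.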
7373

74-
This means that the `signal_energy_q15` will have a different output than the above implementation. Instead of trying to determine the exact fixed-point format of the output and applying the necessary shift to adjust the output's fixed-point format, you will address it in the next step. By tuning the threshold of the detection function, the comparison with the reference VAD can be valid.
74+
This means that the `signal_energy_q15` will have a different output than the above implementation. Instead of trying to determine the exact fixed-point format of the output and applying the necessary shift to adjust the output's fixed-point format, you will address it in the next step by tuning the threshold of the detection function.
7575

7676

7777
```python

content/learning-paths/embedded-and-microcontrollers/cmsisdsp-dev-with-python/how-to-5.md

Lines changed: 2 additions & 4 deletions
@@ -128,7 +128,7 @@ The constructor for `NoiseSuppression`:
128128
- Computes the FFT length that can be used for each slice
129129
- Computes the padding needed for the FFT
130130

131-
The FFT length must be a power of 2. The slice length is not necessarily a power of 2. The constructor therefore computes the closest usable power of 2, and the audio slices are padded with zeros on both sides to match the required FFT length. To make the implementation more robust, this could be computed from by taking the smaller power of two greater than the signal length.
131+
The FFT length must be a power of 2, but the slice length is not necessarily one. The constructor therefore computes the smallest power of two greater than the signal length, and the audio slices are padded with zeros on both sides to match the required FFT length.
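
The length computation and padding described above can be sketched in a few lines of plain Python. The helper names `next_power_of_two` and `pad_both_sides` are hypothetical and are not the constructor's actual code. The sketch also assumes that a slice whose length is already a power of two needs no padding, and that any odd leftover zero goes on the right.

```python
def next_power_of_two(n):
    """Return the smallest power of two that is >= n, for n >= 1."""
    return 1 << (n - 1).bit_length()


def pad_both_sides(samples, fft_length):
    """Zero-pad samples on both sides so that the result has fft_length elements."""
    samples = list(samples)
    total = fft_length - len(samples)
    if total < 0:
        raise ValueError("fft_length is shorter than the slice")
    left = total // 2
    right = total - left
    return [0] * left + samples + [0] * right


if __name__ == "__main__":
    slice_length = 400  # hypothetical slice length, not taken from the learning path
    fft_length = next_power_of_two(slice_length)
    padded = pad_both_sides(range(slice_length), fft_length)
    print(fft_length, len(padded))  # 512 512
```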
132132

133133
#### NoiseSuppressionReference constructor
134134

@@ -153,8 +153,7 @@ The constructor for `NoiseSuppressionReference`:
153153

154154
#### subnoise
155155

156-
Calculates the approximate Wiener gain and is applied to all frequency bands of the FFT. The `v` argument is a vector.
157-
156+
Calculates the approximate Wiener gain and applies it to all frequency bands of the FFT. The `v` argument is a vector. If the gain is negative, it is set to 0. A small value is added to the energy to avoid division by zero.
158157

159158
```python
160159
def subnoise(self,v):
@@ -170,7 +169,6 @@ def subnoise(self,v):
170169
#### remove_noise
171170

172171
Computes the FFT (with padding) and reduces noise in the frequency bands using the approximate Wiener gain.
173-
If the gain is negative, it is set to zero. A small value is added to the energy to avoid division by zero.
174172

175173
The function also uses `window_and_pad`, which is implemented in the final code-block later.
176174
At a glance, this helper method takes care of padding the signal for a basic even-length window, ensuring it runs smoothly with the FFT.

content/learning-paths/embedded-and-microcontrollers/cmsisdsp-dev-with-python/how-to-6.md

Lines changed: 3 additions & 3 deletions
@@ -82,9 +82,9 @@ if status==0:
8282
3. CMSIS-DSP fixed-point division represents 1 exactly. So in Q31, instead of using `0x7FFFFFFF`, `1` is represented as `0x40000000` with a shift of `1`. This behavior is handled in the algorithm when converting the scaling factor to an approximate Q31 value.
8383

8484
Several safeguards are applied:
85-
* It is assumed that |energy - noise| ≤ energy. If this condition is violated (i.e., noise is greater than energy), the gain is capped at 1 to prevent overflow.
85+
* The Wiener gain is capped at 1 to prevent overflow.
8686
* If the energy is zero, the gain is also set to 1 to avoid divide-by-zero errors.
87-
* When energy == noise, the result should be exactly 1. In this case, `arm_divide_q31` will return a quotient of 0x40000000 and shiftVal of 1. The algorithm detects this specific representation and overrides it, setting quotient = 0x7FFFFFFF and shiftVal = 0, which is a closer approximation to full-scale gain in Q31 without the need for additional shifts.
87+
* When energy == noise, the result should be exactly 1. In this case, `arm_divide_q31` will return a quotient of `0x40000000` and shiftVal of 1. The algorithm detects this specific representation and overrides it, setting quotient = `0x7FFFFFFF` and shiftVal = 0, which is a closer approximation to full-scale gain in Q31 without the need for additional shifts.
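
To see why the override in the last safeguard is a close approximation, the sketch below converts both representations to floating point. It assumes that the represented value is the Q31 quotient scaled by two to the power of the shift; the helper name `q31_with_shift_to_float` is hypothetical.

```python
def q31_with_shift_to_float(quotient, shift):
    """Convert a Q31 quotient and a left-shift count to a floating-point value."""
    return (quotient / 2**31) * 2**shift


if __name__ == "__main__":
    exact_one = q31_with_shift_to_float(0x40000000, 1)
    approx_one = q31_with_shift_to_float(0x7FFFFFFF, 0)
    print(exact_one)         # 1.0
    print(1.0 - approx_one)  # about 4.66e-10, a single Q31 step
```

The second form gives up one least-significant bit of accuracy in exchange for a shift of 0, so no further shifting is needed when the gain is applied.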
8888

8989
```python
9090
quotient=0x7FFFFFFF
@@ -173,7 +173,7 @@ The noise estimation function performs both noise estimation and noise suppressi
173173

174174
## The final Q15 implementation
175175

176-
Try the final implementation first, and then we’ll analyze the differences from the reference implementation.
176+
Try the final implementation:
177177

178178
```python
179179
class NoiseSuppressionQ15(NoiseSuppression):
