ArmDeveloperEcosystem
diff --git a/‎content/install-guides/ams.md‎
Lines changed: 71 additions & 16 deletions b/‎content/install-guides/ams.md‎
diff --git a/‎content/install-guides/pytorch-woa.md‎
Lines changed: 121 additions & 0 deletions b/‎content/install-guides/pytorch-woa.md‎
diff --git a/‎content/learning-paths/embedded-and-microcontrollers/cmsisdsp-dev-with-python/_index.md‎
Lines changed: 2 additions & 2 deletions b/‎content/learning-paths/embedded-and-microcontrollers/cmsisdsp-dev-with-python/_index.md‎
diff --git a/‎content/learning-paths/embedded-and-microcontrollers/cmsisdsp-dev-with-python/how-to-4.md‎
Lines changed: 2 additions & 2 deletions b/‎content/learning-paths/embedded-and-microcontrollers/cmsisdsp-dev-with-python/how-to-4.md‎
diff --git a/‎content/learning-paths/embedded-and-microcontrollers/cmsisdsp-dev-with-python/how-to-5.md‎
Lines changed: 2 additions & 4 deletions b/‎content/learning-paths/embedded-and-microcontrollers/cmsisdsp-dev-with-python/how-to-5.md‎
diff --git a/‎content/learning-paths/embedded-and-microcontrollers/cmsisdsp-dev-with-python/how-to-6.md‎
Lines changed: 3 additions & 3 deletions b/‎content/learning-paths/embedded-and-microcontrollers/cmsisdsp-dev-with-python/how-to-6.md‎
@@ -12,6 +12,7 @@ additional_search_terms:
 - mali
 - immortalis
 - cortex-a
+- Install Arm Mobile Studio
 
 
 ### Estimated completion time in minutes (please use integer multiple of 5)
@@ -32,41 +33,95 @@ test_maintenance: true
 test_images:
   - ubuntu:latest
 ---
-[Arm Performance Studio](https://developer.arm.com/Tools%20and%20Software/Arm%20Performance%20Studio%20for%20Mobile) (formally known as `Arm Mobile Studio`) is a performance analysis tool suite for various application developers:
+[Arm Performance Studio](https://developer.arm.com/Tools%20and%20Software/Arm%20Performance%20Studio) is a performance analysis tool suite for Android and Linux application developers.
 
-* Android application developers
-* Linux application developers in Embedded and Cloud segments
+It comprises a suite of easy-to-use tools that show you how well your game or app performs on production devices, so that you can identify problems that might cause slow performance, overheat devices, or drain the battery. 
 
-It comprises of a suite of easy-to-use tools that show you how well your game or app performs on production devices, so that you can identify problems that might cause slow performance, overheat the device, or drain the battery.
 
-[Frame Advisor](https://developer.arm.com/Tools%20and%20Software/Frame%20Advisor) is available in `2023.5` and later.
+| Component | Functionality |
+|----------|-------------|
+| [Streamline](https://developer.arm.com/Tools%20and%20Software/Streamline%20Performance%20Analyzer) | Capture a performance profile that shows all the performance counter activity from the device. |
+| [Performance Advisor](https://developer.arm.com/Tools%20and%20Software/Performance%20Advisor) | Generate an easy-to-read performance summary from an annotated Streamline capture, and get actionable advice about where you should optimize. |
+| [Frame Advisor](https://developer.arm.com/Tools%20and%20Software/Frame%20Advisor) | Capture the API calls and rendering from a problem frame and get comprehensive geometry metrics to discover what might be slowing down your application. |
+| [Mali Offline Compiler](https://developer.arm.com/Tools%20and%20Software/Mali%20Offline%20Compiler) | Analyze how efficiently your shader programs perform on a range of Mali GPUs. |
+| [RenderDoc for Arm GPUs](https://developer.arm.com/Tools%20and%20Software/RenderDoc%20for%20Arm%20GPUs) | The industry-standard tool for debugging Vulkan graphics applications, including early support for Arm GPU extensions and Android features. |
 
-[RenderDoc for Arm GPUs](https://community.arm.com/arm-community-blogs/b/graphics-gaming-and-vr-blog/posts/beyond-mobile-arm-mobile-studio-is-now-arm-performance-studio) is available in `2024.0` and later.
 
-[Graphics Analyzer](https://developer.arm.com/Tools%20and%20Software/Graphics%20Analyzer) is no longer provided. The final release was provided in the `2024.2` release.
-
-All features of Arm Performance Studio are available free of charge without any additional license as of the `2022.4` release.
+All features of Arm Performance Studio are available free of charge without any additional license.
 
 ## How do I install Arm Performance Studio?
 
 Arm Performance Studio is supported on Windows, Linux, and macOS hosts. Download the appropriate installer from [Arm Performance Studio Downloads](https://developer.arm.com/Tools%20and%20Software/Arm%20Performance%20Studio#Downloads).
 
-Full installation and application launch instructions are given in the Arm Performance Studio [Release Notes](https://developer.arm.com/documentation/107649).
+Full details about the supported OS and Android versions are given in the Arm Performance Studio [Release Notes](https://developer.arm.com/documentation/107649).
 
 ### How do I install Arm Performance Studio on Windows?
 
-Run the supplied `Arm_Performance_Studio_<version>_windows_x86-64.exe` installer, and follow on-screen instructions.
+Run the downloaded `Arm_Performance_Studio_<version>_windows_x86-64.exe` installer, and follow the on-screen instructions.
 
-### How do I install Arm Performance Studio on Linux?
+To open Streamline, Frame Advisor or RenderDoc for Arm GPUs, go to the Windows Start menu and search for the name of the tool you want to open.
+
+Performance Advisor is a feature of the Streamline command-line application. To generate a performance report, you must first run the provided Python script to enable Streamline to collect frame data from the device. This process is described in detail in the [Get started with Performance Advisor tutorial](https://developer.arm.com/documentation/102478/latest). After you have captured a profile with Streamline, run `Streamline-cli` on the Streamline capture file. This command is added to your `PATH` environment variable during installation, so it can be used from anywhere.
+
+```console
+Streamline-cli.exe -pa <options> my_capture.apc
+```
+
+To run Mali Offline Compiler, open a command terminal, navigate to your work directory, and run the `malioc` command on a shader program. The `malioc` command is added to your `PATH` environment variable during installation, so it can be used from anywhere.
 
-Unpack the supplied `Arm Performance Studio` bundle to the desired location. For example:
 ```console
-tar -xf Arm_Performance_Studio_2024.3_linux_x86-64.tgz
+malioc.exe <options> my_shader.frag
 ```
+
 ### How do I install Arm Performance Studio on macOS?
 
-Run the supplied `Arm_Performance_Studio_<version>_macos_x86-64.dmg` installer, and follow on-screen instructions.
+Arm Performance Studio is provided as a `.dmg` package. To mount it, double-click the `.dmg` package and follow the instructions. The Arm Performance Studio directory tree is copied to the Applications directory on your local file system for easy access.
+
+You can remove write permission from the installation directory to prevent other users from writing to it. This is done with the `chmod` command. For example:
+
+```console
+chmod go-w <dest_dir>
+```
+
+Open Streamline, Frame Advisor or RenderDoc for Arm GPUs directly from the Arm Performance Studio directory in your Applications directory. For example, to open Streamline, go to the `<installation_directory>/streamline` directory and open the `Streamline.app` file.
+
+To run Performance Advisor, go to the `<installation_directory>/streamline` directory, and double-click the `Streamline-cli-launcher` file. Your computer will ask you to allow Streamline to control the Terminal application. Allow this. The Performance Advisor launcher opens the Terminal application and updates your `PATH` environment variable so you can run Performance Advisor from any directory.
+
+Performance Advisor is a feature of the Streamline command-line application. To generate a performance report, you must first run the provided Python script to enable Streamline to collect frame data from the device. This process is described in detail in the [Get started with Performance Advisor tutorial](https://developer.arm.com/documentation/102478/latest). After you have captured a profile with Streamline, run the `Streamline-cli` command on the Streamline capture file to generate a performance report:
+
+```console
+Streamline-cli -pa <options> my_capture.apc
+```
+
+To run Mali Offline Compiler, go to the `<installation_directory>/mali_offline_compiler` directory, and double-click the `mali_offline_compiler_launcher` file. The Mali Offline Compiler launcher opens the Terminal application and updates your `PATH` environment variable so you can run the `malioc` command from any directory. To generate a shader analysis report, run the `malioc` command on a shader program:
+
+```console
+malioc <options> my_shader.frag
+```
+
+On some versions of macOS, you might see a message that Mali Offline Compiler is not recognized as an application from an identified developer. To enable Mali Offline Compiler, cancel this message, then open **System Preferences > Security and Privacy** and select **Allow Anyway** for the `malioc` application.
+
+### How do I install Arm Performance Studio on Linux?
+
+Arm Performance Studio is provided as a gzipped tar archive. Extract this tar archive to your preferred location, using version 1.13 or later of GNU tar:
+
+```console
+tar xvzf Arm_Performance_Studio_<version>_linux.tgz
+```
+
+You can remove write permission from the installation directory to prevent other users from writing to it. This is done with the `chmod` command. For example:
+
+```console
+chmod go-w <dest_dir>
+```
+
+You might find it useful to edit your `PATH` environment variable to add the paths to the `Streamline-cli` and `malioc` executables so that you can run them from any directory. Add the following commands to the `.bashrc` file in your home directory, so that they are set whenever you initialize a shell session:
+
+```console
+PATH=$PATH:<installation_directory>/streamline
+PATH=$PATH:<installation_directory>/mali_offline_compiler
+```
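
To check that the `PATH` changes took effect, a short Python sketch along these lines could report which tools resolve from the current shell. The `find_tools` helper is a hypothetical name used for illustration; only the executable names `Streamline-cli` and `malioc` come from this guide.

```python
import shutil


def find_tools(names=("Streamline-cli", "malioc")):
    """Map each tool name to its resolved path, or None if it is not on PATH.

    `find_tools` is a hypothetical helper, not part of Arm Performance Studio.
    """
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for tool, path in find_tools().items():
        status = path if path else "not found on PATH"
        print(f"{tool}: {status}")
```

Run it in a new shell session so that the updated `.bashrc` has been read.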
 
 ## How do I get started with Arm Performance Studio?
 
-See the [Get started with Arm Performance Studio for Mobile](/learning-paths/mobile-graphics-and-gaming/ams/) learning path for a collection of tutorials for each component of Performance Studio.
+Refer to [Get started with Arm Performance Studio](/learning-paths/mobile-graphics-and-gaming/ams/) for an overview of how to run each tool in Arm Performance Studio.
@@ -0,0 +1,121 @@
+---
+### Title the install tools article with the name of the tool to be installed
+### Include vendor name where appropriate
+title: PyTorch for Windows on Arm
+
+### Optional additional search terms (one per line) to assist in finding the article
+additional_search_terms:
+- python
+- windows
+- woa
+- windows on arm
+- open source windows on arm
+- pytorch
+
+### Estimated completion time in minutes (please use integer multiple of 5)
+minutes_to_complete: 15
+
+### Link to official documentation
+official_docs: https://pytorch.org/docs/stable/index.html
+
+author: Pareena Verma
+
+### PAGE SETUP
+weight: 1                       # Defines page ordering. Must be 1 for first (or only) page.
+tool_install: true              # Set to true to be listed in main selection page, else false
+multi_install: false            # Set to true if first page of multi-page article, else false
+multitool_install_part: false   # Set to true if a sub-page of a multi-page article, else false
+layout: installtoolsall         # DO NOT MODIFY. Always true for tool install articles
+---
+
+PyTorch has native support for [Windows on Arm](https://learn.microsoft.com/en-us/windows/arm/overview). Starting with the PyTorch 2.7 release, Arm native builds of PyTorch for Windows are available for Python 3.12.
+
+A number of developer-ready Windows on Arm [devices](/learning-paths/laptops-and-desktops/intro/find-hardware/) are available.
+
+Windows on Arm instances are available with Microsoft Azure. For further information, see [Deploy a Windows on Arm virtual machine on Microsoft Azure](/learning-paths/cross-platform/woa_azure/).
+
+## How do I install PyTorch for Windows on Arm?
+
+Before you install PyTorch on your Windows on Arm machine, you will need to install [Python for Windows on Arm](/install-guides/py-woa).
+
+Verify your Python installation at a Windows Command prompt or a PowerShell prompt:
+
+```command
+python --version
+```
+The output should look like:
+
+```output
+Python 3.12.9
+```
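
The version string alone does not show whether the interpreter is a native Arm build. As a minimal sketch, assuming that `sysconfig.get_platform()` reports `win-arm64` for native Windows on Arm builds of Python, the check below could be run before installing PyTorch. The function names are hypothetical, and the `(3, 12)` requirement reflects the Python version named in this guide.

```python
import sys
import sysconfig


def is_native_windows_arm64(platform_tag=None):
    """Return True if the platform tag identifies a native Windows Arm64 build.

    Assumption: native Windows on Arm interpreters report "win-arm64".
    """
    tag = platform_tag if platform_tag is not None else sysconfig.get_platform()
    return tag == "win-arm64"


def is_supported_version(version_info=None, required=(3, 12)):
    """Return True if the interpreter's major.minor version equals `required`."""
    info = version_info if version_info is not None else sys.version_info
    return tuple(info[:2]) == tuple(required)


if __name__ == "__main__":
    print("Platform tag:", sysconfig.get_platform())
    print("Native Windows Arm64 build:", is_native_windows_arm64())
    print("Python 3.12:", is_supported_version())
```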
+Once you have installed Python, you can install the PyTorch Stable release (2.7.0) on your Windows on Arm machine:
+
+```command
+pip3 install torch==2.7.0 --index-url https://download.pytorch.org/whl/cpu
+```
+
+You will see that the `arm64` wheel for PyTorch is installed on your machine:
+```output
+Downloading https://download.pytorch.org/whl/cpu/torch-2.7.0%2Bcpu-cp312-cp312-win_arm64.whl (107.9 MB)
+   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 107.9/107.9 MB 29.7 MB/s eta 0:00:00
+Downloading https://download.pytorch.org/whl/sympy-1.13.3-py3-none-any.whl (6.2 MB)
+   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.2/6.2 MB 47.4 MB/s eta 0:00:00
+Downloading https://download.pytorch.org/whl/typing_extensions-4.12.2-py3-none-any.whl (37 kB)
+Downloading https://download.pytorch.org/whl/filelock-3.13.1-py3-none-any.whl (11 kB)
+Downloading https://download.pytorch.org/whl/fsspec-2024.6.1-py3-none-any.whl (177 kB)
+Downloading https://download.pytorch.org/whl/Jinja2-3.1.4-py3-none-any.whl (133 kB)
+Downloading https://download.pytorch.org/whl/networkx-3.3-py3-none-any.whl (1.7 MB)
+   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.7/1.7 MB 30.6 MB/s eta 0:00:00
+```
+
+You can also install the nightly preview versions of PyTorch on your Windows on Arm machine:
+
+```command
+pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cpu
+```
+
+## How can I run a PyTorch example?
+
+To run a PyTorch example, and confirm that PyTorch is working, use a text editor to save the code below to a file named `pytorch_woa.py`.
+
+```python
+import torch
+import platform
+
+# Print PyTorch version
+print("PyTorch version:", torch.__version__)
+
+# Check if CUDA is available
+if torch.cuda.is_available():
+    print("CUDA is available. PyTorch can use the GPU.")
+else:
+    print("CUDA is not available. PyTorch will use the CPU.")
+
+# Detect system architecture
+architecture = platform.machine()
+if "ARM" in architecture.upper() or "AARCH" in architecture.upper():
+    print("PyTorch is running on Arm:", architecture)
+else:
+    print("PyTorch is not running on Arm. Detected architecture:", architecture)
+
+# Perform a basic PyTorch operation to confirm it's working
+try:
+    tensor = torch.tensor([1.0, 2.0, 3.0])
+    print("PyTorch is operational. Tensor created:", tensor)
+except Exception as e:
+    print("An error occurred while testing PyTorch:", e)
+```
+Run the code:
+
+```console
+python pytorch_woa.py
+```
+Running on a Windows on Arm machine produces an output similar to:
+
+```output
+PyTorch version: 2.7.0+cpu
+CUDA is not available. PyTorch will use the CPU.
+PyTorch is running on Arm: ARM64
+PyTorch is operational. Tensor created: tensor([1., 2., 3.])
+```
+You are now ready to use PyTorch on your Windows on Arm device.
@@ -3,9 +3,9 @@ title: Getting Started with CMSIS-DSP Using Python
 
 minutes_to_complete: 30
 
-draft: true
+draft: false
 cascade:
-    draft: true
+    draft: false
 
 who_is_this_for: Developers who want to learn how the CMSIS-DSP package can be integrated into their applications
 
 
@@ -67,11 +67,11 @@ First, you need to compute the signal energy from audio in Q15 format using CMSI
 
 If you look at the CMSIS-DSP documentation, you'll see that the `power` and `vlog` functions don't produce results in Q15 format. Tracking the fixed-point format throughout all lines of an algorithm can be challenging. In this example, this means that:
 
-* Subtracting the mean to center the signal - as you did in the reference implementation - is handled in CMSIS-DSP by negating the mean and applying it as an offset to the window. Because the mean is small and the shift is minor relative to the Q15 range, this adjustment won't cause saturation.
+* Subtracting the mean to center the signal - as you did in the reference implementation - is handled in CMSIS-DSP by negating the mean and applying it as an offset to the window. With CMSIS-DSP, use `arm_negate_q15` for the negation: it saturates, so the sign always changes, whereas a plain `-` could leave `0x8000` unchanged as `0x8000`. In practice, the mean should be small, and there should be no difference between `-` and `dsp.arm_negate_q15`. However, it is good practice to avoid using `-` or `+` in a fixed-point algorithm when translating it to CMSIS-DSP function calls.
 * The resulting `energy` and `dB` values are not in Q15 format because the `power` and `vlog` functions are used
 * The multiplication by 10 from the reference implementation is missing
 
-This means that the `signal_energy_q15` will have a different output than the above implementation. Instead of trying to determine the exact fixed-point format of the output and applying the necessary shift to adjust the output's fixed-point format, you will address it in the next step. By tuning the threshold of the detection function, the comparison with the reference VAD can be valid.
+This means that the `signal_energy_q15` will have a different output than the above implementation. Instead of trying to determine the exact fixed-point format of the output and applying the necessary shift to adjust the output's fixed-point format, you will address it in the next step by tuning the threshold of the detection function.
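
The `0x8000` corner case mentioned above can be illustrated with a small pure-Python model. This is a sketch of 16-bit arithmetic, not the CMSIS-DSP implementation; the helper names `negate_wrapping` and `negate_saturating` are hypothetical.

```python
Q15_MIN = -0x8000  # -32768, bit pattern 0x8000
Q15_MAX = 0x7FFF   # 32767


def negate_wrapping(x):
    """Two's-complement negation truncated to 16 bits (models a plain `-`)."""
    result = (-x) & 0xFFFF
    # Reinterpret the 16-bit pattern as a signed value.
    return result - 0x10000 if result >= 0x8000 else result


def negate_saturating(x):
    """Negation clamped to the Q15 range (models a saturating negate)."""
    return max(Q15_MIN, min(Q15_MAX, -x))


if __name__ == "__main__":
    print(hex(negate_wrapping(Q15_MIN) & 0xFFFF))    # sign does not change
    print(hex(negate_saturating(Q15_MIN) & 0xFFFF))  # clamps to the maximum
```

For every other Q15 value the two functions agree, which is why the difference rarely shows up when the mean is small.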
 
 
 ```python
 
@@ -128,7 +128,7 @@ The constructor for `NoiseSuppression`:
 - Computes the FFT length that can be used for each slice
 - Computes the padding needed for the FFT
 
-The FFT length must be a power of 2. The slice length is not necessarily a power of 2. The constructor therefore computes the closest usable power of 2, and the audio slices are padded with zeros on both sides to match the required FFT length. To make the implementation more robust, this could be computed from by taking the smaller power of two greater than the signal length.
+The FFT length must be a power of 2. The slice length is not necessarily a power of 2. The constructor therefore computes the smallest power of two greater than the signal length, and the audio slices are padded with zeros on both sides to match the required FFT length.
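
As an illustration of the length computation and padding, here is a minimal pure-Python sketch. It assumes that a slice whose length is already a power of two needs no padding (greater than or equal), and that any odd leftover zero goes on the right. Both helper names are hypothetical and are not taken from the learning path code.

```python
def next_power_of_two(n):
    """Return the smallest power of two that is >= n (for n >= 1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 << (n - 1).bit_length()


def pad_both_sides(samples, fft_length):
    """Zero-pad `samples` on both sides so the result has `fft_length` items."""
    total = fft_length - len(samples)
    if total < 0:
        raise ValueError("fft_length is shorter than the signal")
    left = total // 2
    right = total - left  # the extra zero, if any, goes on the right
    return [0] * left + list(samples) + [0] * right


if __name__ == "__main__":
    audio_slice = [1, 2, 3, 4, 5]
    fft_length = next_power_of_two(len(audio_slice))
    print(fft_length, pad_both_sides(audio_slice, fft_length))
```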
 
 #### NoiseSuppressionReference constructor
 
@@ -153,8 +153,7 @@ The constructor for `NoiseSuppressionReference`:
 
 #### subnoise
 
-Calculates the approximate Wiener gain and is applied to all frequency bands of the FFT. The `v` argument is a vector.
-
+Calculates the approximate Wiener gain and applies it to all frequency bands of the FFT. The `v` argument is a vector. If the gain is negative, it is set to 0. A small value is added to the energy to avoid division by zero.
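
The two safeguards can be sketched independently of the class. The example below assumes the approximate gain has the form `(energy - noise) / energy` per frequency band and that the noise estimate is also given per band. The function name, the `epsilon` default, and the list-based interface are illustrative assumptions, not the learning path's implementation.

```python
def approximate_wiener_gains(band_energies, noise_energies, epsilon=1e-9):
    """Return one gain per band, clamped at 0 and protected against 0/0."""
    gains = []
    for energy, noise in zip(band_energies, noise_energies):
        # epsilon keeps the denominator non-zero when a band is silent.
        gain = (energy - noise) / (energy + epsilon)
        # A negative gain means the noise estimate exceeds the energy.
        gains.append(max(gain, 0.0))
    return gains


if __name__ == "__main__":
    print(approximate_wiener_gains([4.0, 1.0, 0.0], [1.0, 2.0, 0.0]))
```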
 
 ```python
 def subnoise(self,v):
@@ -170,7 +169,6 @@ def subnoise(self,v):
 #### remove_noise
 
 Computes the FFT (with padding) and reduces noise in the frequency bands using the approximate Wiener gain.
-If the gain is negative, it is set to zero. A small value is added to the energy to avoid division by zero.
 
 The function also uses `window_and_pad`, which is implemented in the final code-block later.
 At a glance, this helper method takes care of padding the signal for a basic even-length window, ensuring it runs smoothly with the FFT.
 
@@ -82,9 +82,9 @@ if status==0:
 3. CMSIS-DSP fixed-point division represents 1 exactly. So in Q31, instead of using `0x7FFFFFFF`, `1` is represented as `0x40000000` with a shift of `1`. This behavior is handled in the algorithm when converting the scaling factor to an approximate Q31 value.
 
 Several safeguards are applied:
-* It is assumed that |energy - noise| ≤ energy. If this condition is violated (i.e., noise is greater than energy), the gain is capped at 1 to prevent overflow.
+* The Wiener gain is capped at 1 to prevent overflow.
 * If the energy is zero, the gain is also set to 1 to avoid divide-by-zero errors.
-* When energy == noise, the result should be exactly 1. In this case, `arm_divide_q31` will return a quotient of 0x40000000 and shiftVal of 1. The algorithm detects this specific representation and overrides it, setting quotient = 0x7FFFFFFF and shiftVal = 0, which is a closer approximation to full-scale gain in Q31 without the need for additional shifts.
+* When energy == noise, the result should be exactly 1. In this case, `arm_divide_q31` will return a quotient of `0x40000000` and shiftVal of 1. The algorithm detects this specific representation and overrides it, setting quotient = `0x7FFFFFFF` and shiftVal = 0, which is a closer approximation to full-scale gain in Q31 without the need for additional shifts.
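
To see why the override is harmless, the quotient and shift pair can be converted back to a real number. The sketch below is a pure-Python model of that representation; `q31_to_float` and `normalize_unity` are hypothetical helper names, and nothing here calls CMSIS-DSP.

```python
def q31_to_float(quotient, shift):
    """Interpret a Q31 quotient with a left shift as a real number."""
    return (quotient / 2**31) * 2**shift


def normalize_unity(quotient, shift):
    """Replace the exact representation of 1 with the closest plain Q31 value."""
    if quotient == 0x40000000 and shift == 1:
        return 0x7FFFFFFF, 0
    return quotient, shift


if __name__ == "__main__":
    exact = (0x40000000, 1)
    approx = normalize_unity(*exact)
    # The difference between the two values is about 4.7e-10.
    print(q31_to_float(*exact), q31_to_float(*approx))
```

The approximation error is one least significant bit in Q31, which is far below the precision of the Q15 signal path.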
 
 ```python
 quotient=0x7FFFFFFF
@@ -173,7 +173,7 @@ The noise estimation function performs both noise estimation and noise suppressi
 
 ## The final Q15 implementation
 
-Try the final implementation first, and then we’ll analyze the differences from the reference implementation.
+Try the final implementation:
 
 ```python
 class NoiseSuppressionQ15(NoiseSuppression):