diff --git a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/1_Overview.md b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/1_overview.md
similarity index 100%
rename from content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/1_Overview.md
rename to content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/1_overview.md
diff --git a/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/2_build_kernel_image.md b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/2_build_kernel_image.md
new file mode 100644
index 0000000000..79efa9bd44
--- /dev/null
+++ b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/2_build_kernel_image.md
@@ -0,0 +1,98 @@
+---
+title: Build Linux image
+weight: 3
+
+### FIXED, DO NOT MODIFY
+layout: learningpathall
+---
+
+## Install packages
+
+```
+sudo apt update
+sudo apt install -y which sed make binutils build-essential diffutils gcc g++ bash patch gzip \
+bzip2 perl tar cpio unzip rsync file bc findutils gawk libncurses-dev python-is-python3 \
+gcc-arm-none-eabi
+```
+
+## Build a debuggable kernel image
+
+In this Learning Path, you will use [Buildroot](https://github.com/buildroot/buildroot) to build a Linux image for the Raspberry Pi 3B+ with a debuggable Linux kernel. You will profile Linux kernel modules built out-of-tree and Linux device drivers built in the Linux source code tree.
+
+1. Clone the Buildroot repository and initialize the build system with the default configuration.
+
+```bash
+git clone https://github.com/buildroot/buildroot.git
+cd buildroot
+export BUILDROOT_HOME=$(pwd)
+make raspberrypi3_64_defconfig
+```
+{{% notice Using a different board %}}
+If you're not using a Raspberry Pi 3 for this Learning Path, change `raspberrypi3_64_defconfig` to the option in `$BUILDROOT_HOME/configs` that matches your hardware.
+{{% /notice %}}
+
+2. You will use `menuconfig` to configure the setup. Invoke it with the following command:
+
+```
+make menuconfig
+```
+
+![Menuconfig UI for Buildroot configuration](./images/menuconfig.png)
+
+Change the Buildroot configuration to enable debugging symbols and SSH access.
+
+```plaintext
+Build options --->
+    [*] build packages with debugging symbols
+        gcc debug level (debug level 3)
+    [*] build packages with runtime debugging info
+        gcc optimization level (optimize for debugging) --->
+
+System configuration --->
+    [*] Enable root login with password
+        (****) Root password # Choose root password here
+
+Kernel --->
+    Linux Kernel Tools --->
+        [*] perf
+
+Target packages --->
+    Networking applications --->
+        [*] openssh
+            [*] server
+            [*] key utilities
+```
+
+You might also need to change your default `sshd_config` file according to your network settings. To do that, modify System configuration → Root filesystem overlay directories to add a directory that contains your modified `sshd_config` file.
+
+3. By default, Linux kernel images are stripped. You need to make the image debuggable because you will use it later.
+
+Invoke `linux-menuconfig` and uncheck the option as shown.
+
+```bash
+make linux-menuconfig
+```
+
+```plaintext
+Kernel hacking --->
+    -*- Kernel debugging
+    Compile-time checks and compiler options --->
+        Debug information (Rely on the toolchain's implicit default DWARF version)
+        [ ] Reduce debugging information # uncheck
+```
+
+4. Now you can build the Linux image and flash it to the SD card to run it on the Raspberry Pi.
+
+```bash
+make -j$(nproc)
+```
+
+It will take some time to build the Linux image. When it completes, the output will be in `$BUILDROOT_HOME/output/images/sdcard.img`:
+
+```bash
+ls $BUILDROOT_HOME/output/images/ | grep sdcard.img
+```
+
+For details on flashing the SD card image, see [this helpful article](https://www.ev3dev.org/docs/tutorials/writing-sd-card-image-ubuntu-disk-image-writer/).
+
+Now that you have a target running Linux with a debuggable kernel image, you can start writing the kernel module that you want to profile.
diff --git a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/3_OOT_module.md b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/3_oot_module.md
similarity index 80%
rename from content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/3_OOT_module.md
rename to content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/3_oot_module.md
index 420bb00662..578a52f9b4 100644
--- a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/3_OOT_module.md
+++ b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/3_oot_module.md
@@ -8,15 +8,15 @@ layout: learningpathall
 
 ## Creating the Linux Kernel Module
 
-We will now learn how to create an example Linux kernel module (Character device) that demonstrates a cache miss issue caused by traversing a 2D array in column-major order. This access pattern is not cache-friendly, as it skips over most of the neighboring elements in memory during each iteration.
+You will now create an example Linux kernel module (a character device) that demonstrates a cache miss issue caused by traversing a 2D array in column-major order. This access pattern is not cache-friendly, as it skips over most of the neighboring elements in memory during each iteration.
 
-To build the Linux kernel module, start by creating a new directory—We will call it **example_module**—in any location of your choice. Inside this directory, add two files: `mychardrv.c` and `Makefile`.
+To build the Linux kernel module, start by creating a new directory, for example `example_module`. Inside this directory, add two files: `mychardrv.c` and `Makefile`.
 
 **Makefile**
 
 ```makefile
 obj-m += mychardrv.o
-BUILDROOT_OUT := /opt/rpi-linux/buildroot/output # Change this to your buildroot output directory
+BUILDROOT_OUT := $(BUILDROOT_HOME)/output
 KDIR := $(BUILDROOT_OUT)/build/linux-custom
 CROSS_COMPILE := $(BUILDROOT_OUT)/host/bin/aarch64-buildroot-linux-gnu-
 ARCH := arm64
@@ -29,7 +29,7 @@ clean:
 ```
 
 {{% notice Note %}}
-Change **BUILDROOT_OUT** to the correct buildroot output directory on your host machine
+Change **BUILDROOT_OUT** to the correct Buildroot output directory on your host machine.
 {{% /notice %}}
 
 **mychardrv.c**
@@ -201,40 +201,45 @@ MODULE_AUTHOR("Yahya Abouelseoud");
 MODULE_DESCRIPTION("A simple char driver with cache misses issue");
 ```
 
-The module above receives the size of a 2D array as a string through the `char_dev_write()` function, converts it to an integer, and passes it to the `char_dev_cache_traverse()` function. This function then creates the 2D array, initializes it with simple data, traverses it in a column-major (cache-unfriendly) order, computes the sum of its elements, and prints the result to the kernel log.
+The module above receives the size of a 2D array as a string through the `char_dev_write()` function, converts it to an integer, and passes it to the `char_dev_cache_traverse()` function. This function then creates the 2D array, initializes it with simple data, traverses it in a column-major (cache-unfriendly) order, computes the sum of its elements, and prints the result to the kernel log. This cache-unfriendly access pattern gives you a bottleneck to inspect with Streamline in the next section.
 
 ## Building and Running the Kernel Module
 
 1. To compile the kernel module, run make inside the example_module directory. This will generate the output file `mychardrv.ko`.
 
-2. Transfer the .ko file to the target using scp command and then insert it using insmod command. After inserting the module, we create a character device node using mknod command. Finally, we can test the module by writing a size value (e.g., 10000) to the device file and measuring the time taken for the operation using the `time` command.
+2. Transfer the `.ko` file to the target using the `scp` command, and then insert it using the `insmod` command. After inserting the module, create a character device node using the `mknod` command. Finally, test the module by writing a size value (for example, 10000) to the device file and measuring the time taken for the operation using the `time` command.
 
     ```bash
     scp mychardrv.ko root@:/root/
     ```
 
     {{% notice Note %}}
-    Replace \ with your own target IP address
+    Replace \ with your target's IP address.
     {{% /notice %}}
 
-3. To run the module on the target, we need to run the following commands on the target:
+3. SSH into your target device:
 
     ```bash
     ssh root@
-
-    #The following commands should be running on target device
-
+    ```
+
+4. Execute the following commands on the target to run the module:
+    ```bash
     insmod /root/mychardrv.ko
     mknod /dev/mychardrv c 42 0
     ```
 
     {{% notice Note %}}
-    42 and 0 are the major and minor number we chose in our module code above
+    42 and 0 are the major and minor numbers specified in the module code above.
     {{% /notice %}}
 
-4. Now if you run dmesg you should see something like:
+5. To verify that the module is active, run `dmesg`. The output should look similar to the following:
+
+    ```bash
+    dmesg
+    ```
 
-    ```log
+    ```output
     [12381.654983] mychardrv is open - Major(42) Minor(0)
     ```
 
@@ -249,4 +254,4 @@ The module above receives the size of a 2D array as a string through the `char_d
 
 The command above passes 10000 to the module, which specifies the size of the 2D array to be created and traversed. The **echo** command takes a long time to complete (around 38 seconds) due to the cache-unfriendly traversal implemented in the `char_dev_cache_traverse()` function.
 
-With the kernel module built, the next step is to profile it using Arm Streamline. We will use it to capture runtime behavior, highlight performance bottlenecks, and help identifying issues such as the cache-unfriendly traversal in our module.
+With the kernel module built, the next step is to profile it using Arm Streamline. You will use it to capture runtime behavior, highlight performance bottlenecks, and help identify issues such as the cache-unfriendly traversal in your module.
diff --git a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/4_sl_profile_OOT.md b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/4_sl_profile_oot.md
similarity index 72%
rename from content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/4_sl_profile_OOT.md
rename to content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/4_sl_profile_oot.md
index a5950cd2ac..d10aa1d78f 100644
--- a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/4_sl_profile_OOT.md
+++ b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/4_sl_profile_oot.md
@@ -10,25 +10,33 @@ layout: learningpathall
 
 Arm Streamline is a tool that uses sampling to measure system performance. Instead of recording every single event (like instrumentation does, which can slow things down), it takes snapshots of hardware counters and system registers at regular intervals. This gives a statistical view of how the system runs, while keeping the overhead small.
 
-Streamline tracks many performance metrics such as CPU usage, execution cycles, memory access, cache hits and misses, and GPU activity. By putting this information together, it helps developers see how their code is using the hardware. Captured data is presented on a timeline, so you can see how performance changes as your program runs. This makes it easier to notice patterns, find bottlenecks, and link performance issues to specific parts of your application.
+Streamline tracks performance metrics such as CPU usage, execution cycles, memory access, cache hits and misses, and GPU activity. By putting this information together, it helps developers see how their code is using the hardware. Captured data is presented on a timeline, so you can see how performance changes as your program runs. This makes it easier to notice patterns, find bottlenecks, and link performance issues to specific parts of your application.
 
 For more details about Streamline and its features, refer to the [Streamline user guide](https://developer.arm.com/documentation/101816/latest/Getting-started-with-Streamline/Introduction-to-Streamline).
 
-Streamline is included with Arm Performance Studio, which you can download and use for free from [Arm Performance Studio downloads](https://developer.arm.com/Tools%20and%20Software/Arm%20Performance%20Studio#Downloads).
+### Download Streamline
+
+Streamline is included with Arm Performance Studio, which you can download and use for free. Download it from the following link:
+
+[Arm Performance Studio downloads](https://developer.arm.com/Tools%20and%20Software/Arm%20Performance%20Studio#Downloads)
 
 For step-by-step guidance on setting up Streamline on your host machine, follow the installation instructions provided in [Streamline installation guide](https://developer.arm.com/documentation/101816/latest/Getting-started-with-Streamline/Install-Streamline).
 
 ### Pushing Gator to the Target and Making a Capture
 
-Once Streamline is installed on the host machine, you can capture trace data of our Linux kernel module.
+Once Streamline is installed on the host machine, you can capture trace data of your Linux kernel module. On Linux, the binaries are installed where you extracted the package.
 
 1. To communicate with the target, Streamline requires a daemon, called **gatord**, to be installed and running on the target. gatord must be running before you can capture trace data. There are two pre-built gatord binaries available in Streamline's install directory, one for *Armv7 (AArch32)* and one for *Armv8 or later(AArch64)*. Push **gatord** to the target device using **scp**.
 
     ```bash
     scp /streamline/bin/linux/arm64/gatord root@:/root/gatord
-    # use arm instead of arm64, if your are using an AArch32 target
     ```
 
+{{% notice Note %}}
+If you are using an AArch32 target, use `arm` instead of `arm64`.
+{{% /notice %}}
+
+
 2. Run gator on the target to start system-wide capture mode.
 
     ```bash
@@ -42,17 +50,19 @@ Once Streamline is installed on the host machine, you can capture trace data of
 4. Enter your target hostname or IP address.
 ![Streamline TCP settings#center](./images/img02_streamline_tcp.png)
 
-5. Click on *Select counters* to open the counter configuration dialogue, to learn more about counters and how to configure them please refer to [counter configuration guide](https://developer.arm.com/documentation/101816/latest/Capture-a-Streamline-profile/Counter-Configuration)
+5. Click on *Select counters* to open the counter configuration dialog.
 
 6. Add `L1 data Cache: Refill` and `L1 Data Cache: Access` and enable Event-Based Sampling (EBS) for both of them as shown in the screenshot and click *Save*.
 
-    {{% notice %}}
+    {{% notice Further reading %}}
+    To learn more about counters and how to configure them, refer to the [counter configuration guide](https://developer.arm.com/documentation/101816/latest/Capture-a-Streamline-profile/Counter-Configuration).
+
     To learn more about EBS, please refer to [Streamline user guide](https://developer.arm.com/documentation/101816/9-7/Capture-a-Streamline-profile/Counter-Configuration/Setting-up-event-based-sampling)
     {{% /notice %}}
 
 ![Counter configuration#center](./images/img03_counter_config.png)
 
-7. In the Command section, we will add the same shell command we used earlier to test our Linux module.
+7. In the Command section, add the same shell command you used earlier to test your Linux module.
 
     ```bash
     sh -c "echo 10000 > /dev/mychardrv"
@@ -60,7 +70,7 @@ Once Streamline is installed on the host machine, you can capture trace data of
 
 ![Streamline command#center](./images/img04_streamline_cmd.png)
 
-8. In the Capture settings dialog, select Add image, add your kernel module file `mychardrv.ko` and click Save.
+8. In the Capture settings dialog, select Add image, add the absolute path of your kernel module file `mychardrv.ko`, and click Save.
 ![Capture settings#center](./images/img05_capture_settings.png)
 
 9. Start the capture and enter a name and location for the capture file. Streamline will start collecting data and the charts will show activity being captured from the target.
@@ -70,21 +80,21 @@ Once Streamline is installed on the host machine, you can capture trace data of
 
 Once the capture is stopped, Streamline automatically analyzes the collected data and provides insights to help identify performance issues and bottlenecks. This section describes how to view these insights, starting with locating the functions related to our kernel module and narrowing down to the exact lines of code that may be responsible for the performance problems.
 
-1. Open the *Functions tab*. In the counters list, select one of the counters we selected earlier in the counter configuration dialog, as shown:
+1. Open the *Functions tab*. In the counters list, select one of the counters you selected earlier in the counter configuration dialog, as shown:
 
 ![Counter selection#center](./images/img07_select_datasource.png)
 
-2. In the Functions tab, observe that the function `char_dev_cache_traverse()` has the highest L1 Cache refill rate, which we already expected.
+2. In the Functions tab, observe that the function `char_dev_cache_traverse()` has the highest L1 Cache refill rate, which is expected.
 Also notice the Image name on the right, which is our module file name `mychardrv.ko`:
 ![Functions tab#center](./images/img08_Functions_Tab.png)
 
 3. To view the call path of this function, right click on the function name and choose *Select in Call Paths*.
 
-4. You can now see the exact function that called `char_dev_cache_traverse()`. In the Locations column, notice that the function calls started in the userspace (echo command) and terminated in the kernel space module `mychardrv.ko`:
+4. You can now see the exact function that called `char_dev_cache_traverse()`. In the Locations column, notice that the function calls started in user space (the `echo` command) and ended in the kernel-space module `mychardrv.ko`:
 
 ![Call paths tab#center](./images/img09_callpaths_tab.png)
 
-5. Since we compiled our kernel module with debug info, we will be able to see the exact code lines that are causing these cache misses.
+5. Because you compiled the kernel module with debug info, you can see the exact lines of code that are causing these cache misses.
 To do so, double-click on the function name and the *Code tab* opens. This view shows you how much each code line contributed to the cache misses and in bottom half of the code view, you can also see the disassembly of these lines with the counter values of each assembly instruction:
 
 ![Code tab#center](./images/img10_code_tab.png)
diff --git a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/5_inTree_kernel_driver.md b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/5_intree_kernel_driver.md
similarity index 52%
rename from content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/5_inTree_kernel_driver.md
rename to content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/5_intree_kernel_driver.md
index cfa99ef04d..93d3713813 100644
--- a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/5_inTree_kernel_driver.md
+++ b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/5_intree_kernel_driver.md
@@ -8,13 +8,13 @@ layout: learningpathall
 
 ## Build an in-tree Linux kernel driver
 
-Now that we have learned how to build and profile an out-of-tree kernel module, we will move on to building a driver statically into the Linux kernel. We will then profile it by adding the kernel’s vmlinux file as an image in Streamline’s capture settings. This allows us to view function calls and call paths as before, and also inspect specific sections of the kernel code that may be contributing to performance issues.
+Now that you have learned how to build and profile an out-of-tree kernel module, you will move on to building a driver statically into the Linux kernel. You will then profile it by adding the kernel’s `vmlinux` file as an image in Streamline’s capture settings, rather than the kernel module object file. This allows you to view function calls and call paths as before, and also inspect specific sections of the kernel code that may be contributing to performance issues.
 
 ### Creating an in-tree simple character device driver
 
-We will use the same example character driver we used earlier `mychardrv` except that this time we will be statically linking it to the kernel.
+Use the same example character driver you used earlier, `mychardrv`. This time, you will statically link it into the kernel.
 
-1. Go to your kernel source directory, in our case, it's located in Buildroot's output directory in `/output/build/linux-custom`.
+1. Go to your kernel source directory. In this case, it is located in Buildroot's output directory at `$BUILDROOT_HOME/output/build/linux-custom`.
 
 2. Copy the `mychardrv.c` file created earlier to `drivers/char` directory.
 
@@ -23,7 +23,7 @@ We will use the same example character driver we used earlier `mychardrv` except
     cp ./mychardrv.c
     ```
 
-3. Add the following configuration to the bottom of the `Kconfig` file to make the kernel configuration system aware of the the new driver we just added.
+3. Add the following configuration to the bottom of the `Kconfig` file to make the kernel configuration system aware of the new driver you just added.
 
     ```plaintext
     config MYCHAR_DRIVER
@@ -34,7 +34,7 @@ We will use the same example character driver we used earlier `mychardrv` except
     endmenu
     ```
 
-4. We also need to modify the `Makefile` in the current directory to make it build the object file for `mychardrv.c`, so we'll add the following line to it.
+4. You also need to modify the `Makefile` in the current directory to make it build the object file for `mychardrv.c`. Add the following line to it:
 
     ```Makefile
     obj-$(CONFIG_MYCHAR_DRIVER) += mychardrv.o
@@ -45,16 +45,22 @@ We will use the same example character driver we used earlier `mychardrv` except
 You can rebuild the Linux image simply by running the **make** command in your Buildroot directory. This rebuilds the Linux kernel including our new device driver and produce a debuggable `vmlinux` ELF file.
 
 ```bash
-cd
+cd "$BUILDROOT_HOME"
 make -j$(nproc)
 ```
 
 To verify that our driver was compiled into the kernel, you can run the following command:
 
 ```bash
-find -iname "mychardrv.o"
+find "$BUILDROOT_HOME" -iname "mychardrv.o"
 ```
 
 This should return the full path of the object file produced from compiling our character device driver.
 
-Now you can flash the new `sdcard.img` file produced to your target's SD card. To learn how to flash the sdcard.img file to your SD card, you can look at [this helpful article](https://www.ev3dev.org/docs/tutorials/writing-sd-card-image-ubuntu-disk-image-writer/). This time our driver will be automatically loaded when Linux is booted.
+Now you can flash the new `sdcard.img` file to your target's SD card.
+
+{{% notice %}}
+To learn how to flash the `sdcard.img` file to your SD card, see [this helpful article](https://www.ev3dev.org/docs/tutorials/writing-sd-card-image-ubuntu-disk-image-writer/).
+{{% /notice %}}
+
+This time, your driver will be loaded automatically when Linux boots.
diff --git a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/6_sl_profile_inTree.md b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/6_sl_profile_intree.md
similarity index 70%
rename from content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/6_sl_profile_inTree.md
rename to content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/6_sl_profile_intree.md
index 18a729bf8c..4481bd3b50 100644
--- a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/6_sl_profile_inTree.md
+++ b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/6_sl_profile_intree.md
@@ -10,19 +10,20 @@ layout: learningpathall
 
 Profiling in-tree drivers follows almost the same process as profiling an out-of-tree kernel module. The steps include:
 
-1. Transferring gator to the target device using scp.
+1. Transferring `gatord` to the target device using `scp`.
 2. Launching Streamline, selecting TCP view, and entering the target’s IP or hostname.
 3. Setting up counters and enabling Event-Based Sampling (EBS).
 
-The main difference is that, instead of adding the kernel module’s object file as the capture image in Capture settings, we now use the Linux ELF file (vmlinux) generated by Buildroot.
+The main difference is that, instead of adding the kernel module’s object file as the capture image in Capture settings, you use the Linux ELF file (`vmlinux`) generated by Buildroot.
 
 ![Vmlinux capture settings#center](./images/img11_vmlinux_capture_settings.png)
 
-After clicking Save in Capture settings dialog, you can start the capture and analyze it as we did before.
+After clicking Save in the Capture settings dialog, you can start the capture and analyze it as you did before.
 
 ![Vmlinux function tab#center](./images/img12_vmlinux_function_tab.png)
 
-Since we used vmlinux image we can view our driver functions as well as all other kernel functions that were sampled during our capture.
+Because you used the `vmlinux` image, you can view the driver functions as well as all other kernel functions that were sampled during the capture.
+
 You can also view the full Call path of any sampled function within the kernel.
 
 ![Vmlinux call paths tab#center](./images/img13_vmlinux_callpaths_tab.png)
diff --git a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/7_sl_SPE.md b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/7_sl_spe.md
similarity index 81%
rename from content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/7_sl_SPE.md
rename to content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/7_sl_spe.md
index abb5729d4e..7671ee25c8 100644
--- a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/7_sl_SPE.md
+++ b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/7_sl_spe.md
@@ -12,11 +12,11 @@ With periodic sampling, Streamline collects CPU performance data using hardware
 
 The Statistical Profiling Extension (SPE) removes these limits. It samples the PC in hardware, directly inside the CPU pipeline. This adds almost no overhead, so the sampling rate can be much higher. SPE also records extra details about each sampled instruction, giving a much clearer view of how the code runs. For more details on SPE and how it works in Streamline see [this blog post](https://community.arm.com/arm-community-blogs/b/tools-software-ides-blog/posts/introduction-to-statistical-profiling-support-in-streamline).
 
-To find out if your target supports SPE, please see [Streamline user guide](https://developer.arm.com/documentation/101816/9-7/Capture-a-Streamline-profile/Counter-Configuration/Configure-SPE-counters).
+To find out if your target supports SPE, see the [Streamline user guide](https://developer.arm.com/documentation/101816/9-7/Capture-a-Streamline-profile/Counter-Configuration/Configure-SPE-counters).
 
 ### Profiling Kernel Module Using SPE
 
-To profile both in-tree and out-of-tree kernel modules, we can use the same setup steps as before. The only change is to add “Arm Statistical Profiling Extension” to the Events to Collect list in the Counter Configuration dialog.
+To profile both in-tree and out-of-tree kernel modules, you can use the same setup steps as before. The only change is to add “Arm Statistical Profiling Extension” to the Events to Collect list in the Counter Configuration dialog.
 ![SPE counter selection#center](./images/img14_spe_select_counters.png)
 
 After saving the counter configurations, Click Start capture to begin data collection, then wait for Streamline to analyze results.
diff --git a/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/8_summary.md b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/8_summary.md
new file mode 100644
index 0000000000..22f46c211b
--- /dev/null
+++ b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/8_summary.md
@@ -0,0 +1,12 @@
+---
+title: Summary
+weight: 9
+
+### FIXED, DO NOT MODIFY
+layout: learningpathall
+---
+## Summary
+
+In this Learning Path, you learned how to build and profile Linux kernel modules step by step. You started with an out-of-tree character driver that had a cache performance issue and then used Arm Streamline to spot where the problem was. Later, you tried the same idea with an in-tree driver and saw how profiling works with the full kernel. Although the example problem was simple, the same methods apply to complex, real-world drivers and scenarios.
+
+The key takeaway is that profiling isn’t just about making code faster; it’s about understanding how your code interacts with the hardware. Streamline gives you a clear picture of what’s happening inside the CPU so you can write better, more efficient drivers. By learning to identify bottlenecks, you will be more confident in fixing them and avoiding common mistakes in kernel programming.
diff --git a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/_index.md b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/_index.md
similarity index 99%
rename from content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/_index.md
rename to content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/_index.md
index 56f917249c..866f9098c4 100644
--- a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/_index.md
+++ b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/_index.md
@@ -28,7 +28,6 @@ skilllevels: Advanced
 subjects: Performance and Architecture
 armips:
     - Cortex-A
-    - Neoverse
 tools_software_languages:
     - Arm Streamline
     - Arm Performance Studio
diff --git a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/_next-steps.md b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/_next-steps.md
similarity index 100%
rename from content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/_next-steps.md
rename to content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/_next-steps.md
diff --git a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img01_gator_cmd.png b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img01_gator_cmd.png
similarity index 100%
rename from content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img01_gator_cmd.png
rename to content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img01_gator_cmd.png
diff --git a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img02_streamline_tcp.png b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img02_streamline_tcp.png
similarity index 100%
rename from content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img02_streamline_tcp.png
rename to content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img02_streamline_tcp.png
diff --git a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img03_counter_config.png b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img03_counter_config.png
similarity index 100%
rename from content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img03_counter_config.png
rename to content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img03_counter_config.png
diff --git a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img04_streamline_cmd.png b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img04_streamline_cmd.png
similarity index 100%
rename from content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img04_streamline_cmd.png
rename to content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img04_streamline_cmd.png
diff --git a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img05_capture_settings.png b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img05_capture_settings.png
similarity index 100%
rename from content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img05_capture_settings.png
rename to content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img05_capture_settings.png
diff --git a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img06_streamline_timeline.png b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img06_streamline_timeline.png
similarity index 100%
rename from content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img06_streamline_timeline.png
rename to content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img06_streamline_timeline.png
diff --git a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img07_select_datasource.png b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img07_select_datasource.png
similarity index 100%
rename from content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img07_select_datasource.png
rename to content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img07_select_datasource.png
diff --git a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img08_Functions_Tab.png b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img08_Functions_Tab.png
similarity index 100%
rename from content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img08_Functions_Tab.png
rename to content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img08_Functions_Tab.png
diff --git a/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img08_functions_tab.png b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img08_functions_tab.png
new file mode 100644
index 0000000000..cd23986177
Binary files /dev/null and b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img08_functions_tab.png differ
diff --git a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img09_callpaths_tab.png b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img09_callpaths_tab.png
similarity index 100%
rename from content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img09_callpaths_tab.png
rename to content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img09_callpaths_tab.png
diff --git a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img10_code_tab.png b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img10_code_tab.png
similarity index 100%
rename from content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img10_code_tab.png
rename to content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img10_code_tab.png
diff --git a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img11_vmlinux_capture_settings.png b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img11_vmlinux_capture_settings.png
similarity index 100%
rename from content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img11_vmlinux_capture_settings.png
rename to content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img11_vmlinux_capture_settings.png
diff --git a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img12_vmlinux_function_tab.png b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img12_vmlinux_function_tab.png
similarity index 100%
rename from content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img12_vmlinux_function_tab.png
rename to content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img12_vmlinux_function_tab.png
diff --git a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img13_vmlinux_callpaths_tab.png b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img13_vmlinux_callpaths_tab.png
similarity index 100% rename from content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img13_vmlinux_callpaths_tab.png rename to content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img13_vmlinux_callpaths_tab.png diff --git a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img14_spe_select_counters.png b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img14_spe_select_counters.png similarity index 100% rename from content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img14_spe_select_counters.png rename to content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img14_spe_select_counters.png diff --git a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img15_spe_function_tab.gif b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img15_spe_function_tab.gif similarity index 100% rename from content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/images/img15_spe_function_tab.gif rename to content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/img15_spe_function_tab.gif diff --git a/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/menuconfig.png b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/menuconfig.png new file mode 100644 index 0000000000..75f13205bc Binary files /dev/null and b/content/learning-paths/embedded-and-microcontrollers/streamline-kernel-module/images/menuconfig.png differ diff --git a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/2_build_kernel_image.md b/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/2_build_kernel_image.md deleted file mode 100644 index 03860d4453..0000000000 --- 
a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/2_build_kernel_image.md +++ /dev/null @@ -1,71 +0,0 @@ ---- -title: Build Linux image -weight: 3 - -### FIXED, DO NOT MODIFY -layout: learningpathall ---- - -## Build a debuggable kernel image - -For this learning path we will be using [Buildroot](https://github.com/buildroot/buildroot) to build a Linux image for Raspberry Pi 3B+ with a debuggable Linux kernel. We will profile Linux kernel modules built out-of-tree and Linux device drivers built in the Linux source code tree. - -1. Clone the Buildroot Repository and initialize the build system with the default configurations. - - ```bash - git clone https://github.com/buildroot/buildroot.git - cd buildroot - make raspberrypi3_64_defconfig - make menuconfig - make -j$(nproc) - ``` - -2. Change Buildroot configurations to enable debugging symbols and SSH access. - - ```plaintext - Build options ---> - [*] build packages with debugging symbols - gcc debug level (debug level 3) - [*] build packages with runtime debugging info - gcc optimization level (optimize for debugging) ---> - - System configuration ---> - [*] Enable root login with password - (****) Root password # Choose root password here - - Kernel ---> - Linux Kernel Tools ---> - [*] perf - - Target packages ---> - Networking applications ---> - [*] openssh - [*] server - [*] key utilities - ``` - - You might also need to change your default `sshd_config` file according to your network settings. To do that, you need to modify System configuration→ Root filesystem overlay directories to add a directory that contains your modified `sshd_config` file. - -3. By default the Linux kernel images are stripped so we will need to make the image debuggable as we'll be using it later. 
- - ```bash - make linux-menuconfig - ``` - - ```plaintext - Kernel hacking ---> - -*- Kernel debugging - Compile-time checks and compiler options ---> - Debug information (Rely on the toolchain's implicit default DWARF version) - [ ] Reduce debugging information #un-check - ``` - -4. Now we can build the Linux image and flash it to the the SD card to run it on the Raspberry Pi. - - ```bash - make -j$(nproc) - ``` - -It will take some time to build the Linux image. When it completes, the output will be in `/output/images/sdcard.img` -For details on flashing the SD card image, see [this helpful article](https://www.ev3dev.org/docs/tutorials/writing-sd-card-image-ubuntu-disk-image-writer/). -Now that we have a target running Linux with a debuggable kernel image, we can start writing our kernel module that we want to profile. diff --git a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/8_summary.md b/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/8_summary.md deleted file mode 100644 index ef91418e51..0000000000 --- a/content/learning-paths/servers-and-cloud-computing/streamline-kernel-module/8_summary.md +++ /dev/null @@ -1,12 +0,0 @@ ---- -title: Summary -weight: 9 - -### FIXED, DO NOT MODIFY -layout: learningpathall ---- -## Summary - -In this learning path, we learned how to build and profile Linux kernel modules step by step. We started with an out-of-tree character driver that had a cache performance issue and then used Arm Streamline to spot where the problem was. Later, we tried the same idea with an in-tree driver and saw how profiling works with the full kernel. Although the example problem was simple, the same methods apply to complex, real-world drivers and scenarios. - -The key takeaway is that profiling isn’t just about making code faster—it’s about understanding how your code talks to the hardware. 
Streamline gives us a clear picture of what’s happening inside the CPU so we can write better, more efficient drivers. By learning to identify bottlenecks, you will be more confident in fixing them and avoiding common mistakes in kernel programming.