---
title: Overview
weight: 4

### FIXED, DO NOT MODIFY
layout: learningpathall
---

## KleidiAI

[KleidiAI](https://gitlab.arm.com/kleidi/kleidiai) is an open-source library of optimized, performance-critical routines (micro-kernels) for AI workloads on Arm CPUs. These routines are tuned for specific Arm microarchitectures to maximize performance and are designed for straightforward integration into C/C++ machine learning (ML) and AI frameworks.

Several popular AI frameworks already take advantage of KleidiAI to improve performance on Arm platforms.

## KleidiCV

[KleidiCV](https://gitlab.arm.com/kleidi/kleidicv) is an open-source library that provides high-performance image-processing functions for AArch64. It is lightweight and simple to integrate, and computer-vision frameworks such as OpenCV can leverage KleidiCV to accelerate image processing on Arm devices.

## AI camera pipelines

This Learning Path provides three example applications that combine AI and computer vision (CV) techniques:

- Background blur
- Low-light enhancement (LLE)
- Neural denoising

## Background blur and low-light enhancement

Both applications follow the same image-handling steps (sketched in code below):

- Use input and output images in **PNG** format with three **RGB** channels, 8 bits per channel (commonly referred to as **RGB8**)
- Convert the images to **YUV 4:2:0** for processing
- Apply the relevant effect (background blur or low-light enhancement)
- Convert the processed images back to **RGB8** and save them as **PNG** files
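
As an illustration of this shared flow, the sketch below uses OpenCV (which can be accelerated by KleidiCV on Arm). It is a minimal example rather than the Learning Path code, and the file names `input.png` and `output.png` are placeholders.

```cpp
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>

int main() {
    // Load an 8-bit, 3-channel PNG (OpenCV uses BGR channel order by default).
    cv::Mat image = cv::imread("input.png", cv::IMREAD_COLOR);
    if (image.empty()) return 1;

    // Convert to planar YUV 4:2:0 (I420): a single-channel buffer of height 3H/2.
    cv::Mat yuv420;
    cv::cvtColor(image, yuv420, cv::COLOR_BGR2YUV_I420);

    // ... apply the effect here (background blur or low-light enhancement) ...

    // Convert back to 8-bit BGR and save the result as a PNG file.
    cv::Mat output;
    cv::cvtColor(yuv420, output, cv::COLOR_YUV2BGR_I420);
    cv::imwrite("output.png", output);
    return 0;
}
```
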
## Background blur

The background blur pipeline is implemented as follows:

![example image alt-text#center](blur_pipeline.png "Background Blur Pipeline Diagram")
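
A common way to structure the final stage of such a pipeline is to blur the whole frame and then blend the blurred and original pixels using a foreground mask. The sketch below only illustrates that blending step; it assumes a mask (for example, from a person-segmentation model) is already available and is not taken from the Learning Path code.

```cpp
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>

// Blend a blurred copy of the frame with the original, using a mask where
// 255 keeps the pixel sharp (subject) and 0 uses the blurred pixel
// (background). Producing the mask is outside the scope of this sketch.
cv::Mat blurBackground(const cv::Mat& frame, const cv::Mat& mask) {
    cv::Mat blurred;
    cv::GaussianBlur(frame, blurred, cv::Size(25, 25), 0);

    // Per-pixel blend weights in [0, 1], expanded to three channels.
    cv::Mat alpha, alpha3;
    mask.convertTo(alpha, CV_32FC1, 1.0 / 255.0);
    cv::cvtColor(alpha, alpha3, cv::COLOR_GRAY2BGR);

    cv::Mat frameF, blurredF;
    frame.convertTo(frameF, CV_32FC3);
    blurred.convertTo(blurredF, CV_32FC3);

    // output = alpha * sharp + (1 - alpha) * blurred
    cv::Mat invAlpha3 = cv::Scalar::all(1.0) - alpha3;
    cv::Mat outF = frameF.mul(alpha3) + blurredF.mul(invAlpha3);

    cv::Mat output;
    outF.convertTo(output, CV_8UC3);
    return output;
}
```
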
## Low-light enhancement

The low-light enhancement pipeline is adapted from the LiveHDR+ method proposed by Google Research (2017):

![example image alt-text#center](lle_pipeline.png "Low Light Enhancement Pipeline Diagram")

The low-resolution coefficient prediction network (implemented with LiteRT; see the invocation sketch below) performs operations such as:

- Strided convolutions
- Local feature extraction using convolutional layers
- Global feature extraction using convolutional and fully connected layers
- Add, convolve, and reshape operations
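
LiteRT retains the TensorFlow Lite C++ interpreter API, so invoking a network like this generally follows the pattern below. This is a minimal sketch: the model file name is a placeholder, and the real pipeline's preprocessing and coefficient handling are not shown.

```cpp
#include <memory>

#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model.h"

int main() {
    // Placeholder model file name; the application bundles its own model.
    auto model =
        tflite::FlatBufferModel::BuildFromFile("coefficient_prediction.tflite");
    if (!model) return 1;

    tflite::ops::builtin::BuiltinOpResolver resolver;
    std::unique_ptr<tflite::Interpreter> interpreter;
    tflite::InterpreterBuilder(*model, resolver)(&interpreter);
    interpreter->AllocateTensors();

    // Copy the downscaled (low-resolution) frame into the input tensor.
    float* input = interpreter->typed_input_tensor<float>(0);
    // ... fill `input` with normalized pixel values ...
    (void)input;

    // Run the graph of convolutional and fully connected layers.
    interpreter->Invoke();

    // The output tensor holds the predicted enhancement coefficients.
    const float* coefficients = interpreter->typed_output_tensor<float>(0);
    (void)coefficients;
    return 0;
}
```
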
## Neural denoising

Every smartphone photographer has experienced it: images that look sharp in daylight but degrade in dim lighting. This is because the **signal-to-noise ratio (SNR)** drops sharply when sensors capture fewer photons. At 1000 lux, the signal dominates and images look clean; at 1 lux, readout noise becomes visible as grain, color speckling, and loss of fine detail.

That's why **neural camera denoising** is a critical and computationally demanding stage in modern camera pipelines. Done well, it can transform noisy frames into sharp, vibrant captures; done poorly, it leaves smudges and artifacts.

As shown in the diagram below, the neural denoising pipeline can process frames with two algorithms:

- **Temporal** denoising, named `ultralite` in the code repository, which uses previous frames as history
- **Spatial** denoising, named `collapsenet` in the code repository
- A combination of both

![example image alt-text#center](denoising_pipeline.png "Neural Denoising Pipeline Diagram")

The neural denoising application works on frames as emitted by a camera sensor in Bayer format (the two memory layouts are sketched in code below):

- The input frames are in RGGB 1080x1920x4 format (channels last)
- The output frames are in YGGV 4x1080x1920 format (channels first)
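
The denoising network produces the YGGV output itself; the sketch below only illustrates the difference between the two memory layouts (channels-last input versus channels-first output) with a plain repacking loop. The 16-bit sample type is an assumption, and no denoising or color conversion is performed.

```cpp
#include <cstdint>
#include <vector>

constexpr int kHeight = 1080;
constexpr int kWidth = 1920;
constexpr int kChannels = 4;  // the four Bayer samples packed per position

// Repack a channels-last frame (1080 x 1920 x 4, as delivered on the input
// side) into a channels-first buffer (4 x 1080 x 1920, as used on the output
// side). Only the memory layout changes; sample values are untouched.
std::vector<uint16_t> toChannelsFirst(const std::vector<uint16_t>& interleaved) {
    std::vector<uint16_t> planar(kChannels * kHeight * kWidth);
    for (int y = 0; y < kHeight; ++y) {
        for (int x = 0; x < kWidth; ++x) {
            for (int c = 0; c < kChannels; ++c) {
                planar[(c * kHeight + y) * kWidth + x] =
                    interleaved[(y * kWidth + x) * kChannels + c];
            }
        }
    }
    return planar;
}
```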