
Commit fecad7a

davidplowman and naushir authored and committed
AI camera documentation updates
Principally: a new "Overview" to start the "Under the Hood" section. Model Conversion substantially overhauled, though more work is still needed.
1 parent c08575b commit fecad7a

File tree

4 files changed: +125 additions, -48 deletions


documentation/asciidoc/accessories/ai-camera/details.adoc

Lines changed: 45 additions & 17 deletions
@@ -1,6 +1,34 @@
 == Under the Hood

+=== Overview
+
+The Raspberry Pi AI Camera works rather differently to more traditional AI-based camera image processing systems, as shown in the diagram below.
+
+image::images/imx500-comparison.svg[Traditional versus IMX500 AI camera systems]
+
+On the left is a diagram of a more traditional AI camera system. Here, the camera delivers only images to the Raspberry Pi. The Raspberry Pi processes the images and is then responsible for performing AI inferencing. This may use an optional external AI accelerator, as shown, or it may happen (more slowly) on the CPU.
+
+On the right we have the IMX500-based system. The camera module contains a small ISP which turns the raw camera image data into an _input tensor_ that is fed directly to the AI accelerator within the camera. In turn, this produces an _output tensor_, containing the inferencing results, which is fed back to the Raspberry Pi itself. There is no need for an external accelerator, nor for the Raspberry Pi to run neural network software on the CPU.
+
+Some helpful concepts to understand:
+
+==== The _Input Tensor_
+
+This is the part of the sensor image that is passed to the AI engine for inferencing. It is produced by a small on-board ISP which crops and scales the camera image to the dimensions expected by the neural network that has been loaded. The input tensor is not normally made available to applications, though it can be accessed for debugging purposes.
+
+==== The _Region of Interest_
+
+The Region of Interest (or _ROI_) specifies exactly which part of the sensor image is cropped out before being rescaled to the size demanded by the neural network. It can be queried and set by an application. The units used are always pixels in the full-resolution sensor output.
+
+By default, the ROI is set to the full image received from the sensor, so that nothing is actually cropped out.
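To illustrate the ROI units, here is a small, self-contained Python sketch (not part of the Picamera2 API) that maps an ROI expressed in full-resolution sensor pixels into the coordinate space of a scaled output image. The 4056x3040 sensor resolution used here is an assumption for the example:

```python
def roi_to_output_coords(roi, sensor_size, output_size):
    """Map an ROI given in full-resolution sensor pixels (the units the
    camera uses) into the coordinate space of a scaled output image."""
    x, y, w, h = roi
    scale_x = output_size[0] / sensor_size[0]
    scale_y = output_size[1] / sensor_size[1]
    return (round(x * scale_x), round(y * scale_y),
            round(w * scale_x), round(h * scale_y))

# Assumed full sensor resolution for this example.
SENSOR = (4056, 3040)

# The default ROI covers the whole sensor image (nothing cropped out).
default_roi = (0, 0, *SENSOR)
print(roi_to_output_coords(default_roi, SENSOR, (2028, 1520)))
# -> (0, 0, 2028, 1520)
```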
+
+==== The _Output Tensor_
+
+These are the results of inferencing performed by the neural network. The precise number and shape of the outputs depend on the neural network, and application code needs to understand how to handle them.
+
+=== System Architecture
+
 The diagram below shows the various camera software components (in green) used during our imaging/inference use case with the Raspberry Pi AI Camera module hardware (in red).

 image::images/imx500-block-diagram.svg[IMX500 block diagram]
@@ -204,39 +232,39 @@ The IMX500 class in Picamera2 provides the following helper functions:
 | `IMX500.get_full_sensor_resolution()`
 | Returns the full sensor resolution of the IMX500.

-| `IMX500.config()`
+| `IMX500.config`
 | Returns a dictionary of the neural network configuration.

-| `IMX500.convert_inference_coords()`
-| Converts from the input tensor coordinate space to the final ISP output image space.
+| `IMX500.convert_inference_coords(coords, metadata, picamera2)`
+| Converts the coordinates _coords_ from the input tensor coordinate space to the final ISP output image space. Must be passed Picamera2's image metadata for the image, and the Picamera2 object.

 There are a number of scaling/cropping/translation operations occurring from the original sensor image to the fully processed ISP output image. This function converts coordinates provided by the output tensor to the equivalent coordinates after performing these operations.

 | `IMX500.show_network_fw_progress_bar()`
 | Displays a progress bar on the console showing the progress of the neural network firmware upload to the IMX500.

-| `IMX500.get_roi_scaled()`
-| Returns the region of interest (ROI) in the ISP output coordinate space.
+| `IMX500.get_roi_scaled(request)`
+| Returns the region of interest (ROI) in the ISP output image coordinate space.

-| `IMX500.get_isp_output_size()`
+| `IMX500.get_isp_output_size(picamera2)`
 | Returns the ISP output image size.

-| `IMX500.get_input_w_h()`
+| `IMX500.get_input_size()`
 | Returns the input tensor size based on the neural network model used.

-| `IMX500.get_outputs()`
-| Returns the output tensors for a given frame request.
+| `IMX500.get_outputs(metadata)`
+| Returns the output tensors from the given Picamera2 image metadata.

-| `IMX500.get_output_shapes()`
-| Returns the shape of the output tensors for the neural network model used.
+| `IMX500.get_output_shapes(metadata)`
+| Returns the shapes of the output tensors from the Picamera2 image metadata for the neural network model used.

-| `IMX500.set_inference_roi_abs()`
-| Sets an absolute region of interest (ROI) crop rectangle on the sensor image to use for inferencing on the IMX500.
+| `IMX500.set_inference_roi_abs(rectangle)`
+| Sets the region of interest (ROI) crop rectangle which determines the part of the sensor image that is converted to the input tensor used for inferencing on the IMX500. The ROI should be given in units of pixels at the full sensor resolution, as an `(x_offset, y_offset, width, height)` tuple.

-| `IMX500.set_inference_aspect_ratio()`
-| Automatically calculates region of interest (ROI) crop rectangle on the sensor image to preserve the input tensor aspect ratio for a given neural network.
+| `IMX500.set_inference_aspect_ratio(aspect_ratio)`
+| Automatically calculates a region of interest (ROI) crop rectangle on the sensor image that preserves the given aspect ratio. To make the ROI aspect ratio exactly match the input tensor for this network, use `imx500.set_inference_aspect_ratio(imx500.get_input_size())`.

-| `IMX500.get_kpi_info()`
-| Returns the frame level performance indicators logged by the IMX500.
+| `IMX500.get_kpi_info(metadata)`
+| Returns the frame-level performance indicators logged by the IMX500 for the given image metadata.

 |===
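The scaling, cropping, and translation performed by `convert_inference_coords()` can be illustrated with a simplified, self-contained Python sketch. This is only a model of the idea, not the real implementation (which also accounts for additional ISP cropping); all the sizes used below are assumptions for the example:

```python
def tensor_to_isp_coords(box, roi, tensor_size, sensor_size, isp_size):
    """Toy model of the tensor-to-ISP mapping: the input tensor is the
    ROI crop of the sensor image scaled to the tensor size, and the ISP
    output is the sensor image scaled to the output size."""
    x, y, w, h = box            # detection box in input-tensor pixels
    rx, ry, rw, rh = roi        # ROI in full-resolution sensor pixels
    tw, th = tensor_size
    # Undo the crop-and-scale to recover full-resolution sensor coordinates.
    sx = rx + x * rw / tw
    sy = ry + y * rh / th
    sw = w * rw / tw
    sh = h * rh / th
    # Scale sensor coordinates down to the ISP output image.
    fx = isp_size[0] / sensor_size[0]
    fy = isp_size[1] / sensor_size[1]
    return (round(sx * fx), round(sy * fy), round(sw * fx), round(sh * fy))

# A detection in a 300x300 input tensor, with a full-sensor ROI.
print(tensor_to_isp_coords((150, 150, 30, 30), (0, 0, 4056, 3040),
                           (300, 300), (4056, 3040), (2028, 1520)))
# -> (1014, 760, 203, 152)
```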

documentation/asciidoc/accessories/ai-camera/getting-started.adoc

Lines changed: 21 additions & 12 deletions
@@ -19,15 +19,17 @@ The AI camera must download runtime firmware onto the IMX500 sensor during start
 [source,console]
 ----
-$ sudo apt install imx500-firmware imx500-models rpicam-apps-imx500-postprocess python3-opencv
+$ sudo apt install imx500-all
 ----

 This command:

-* installs the `/lib/firmware/imx500_loader.fpk` and `/lib/firmware/imx500_main.fpk` firmware files required to operate the IMX500 sensor
+* installs the `/lib/firmware/imx500_loader.fpk` and `/lib/firmware/imx500_firmware.fpk` firmware files required to operate the IMX500 sensor
 * places a number of neural network model firmware files in `/usr/share/imx500-models/`
+* installs the IMX500 post-processing software stages in `rpicam-apps`
+* installs the Sony network model packaging tools

-NOTE: The IMX500 kernel device driver loads all the firmware files (loader, main, and network) when the camera starts. This may take several minutes if the neural network model firmware has not been previously cached. The demos below display a progress bar on the console to indicate firmware loading progress.
+NOTE: The IMX500 kernel device driver loads all the firmware files when the camera starts. This may take several minutes if the neural network model firmware has not been previously cached. The demos below display a progress bar on the console to indicate firmware loading progress.

 === Reboot
@@ -44,13 +46,13 @@ Once all the system packages are updated and firmware files installed, we can st
 === `rpicam-apps`

-The xref:../computers/camera_software.adoc#rpicam-apps[`rpicam-apps` camera applications] include IMX500 object inference and pose estimation stages that can be run in the post-processing pipeline. For more information about the post-processing pipeline, see xref:../computers/camera_software.adoc#post-process-file[the post-processing documentation].
+The xref:../computers/camera_software.adoc#rpicam-apps[`rpicam-apps` camera applications] include IMX500 object detection and pose estimation stages that can be run in the post-processing pipeline. For more information about the post-processing pipeline, see xref:../computers/camera_software.adoc#post-process-file[the post-processing documentation].

 The examples on this page use post-processing JSON files located in `/usr/share/rpicam-assets/`.

-==== Object inference
+==== Object detection

-The MobileNet SSD neural network performs basic object detection, providing bounding boxes and confidence values for each object found. `imx500_mobilenet_ssd.json` contains the configuration parameters for the IMX500 object inferencing post-processing stage using the MobileNet SSD neural network.
+The MobileNet SSD neural network performs basic object detection, providing bounding boxes and confidence values for each object found. `imx500_mobilenet_ssd.json` contains the configuration parameters for the IMX500 object detection post-processing stage using the MobileNet SSD neural network.

 `imx500_mobilenet_ssd.json` declares a post-processing pipeline that contains two stages:

@@ -77,7 +79,7 @@ To record video with object detection overlays, use `rpicam-vid` instead. The fo
 $ rpicam-vid -t 10s -o output.264 --post-process-file /usr/share/rpicam-assets/imx500_mobilenet_ssd.json --width 1920 --height 1080 --framerate 30
 ----

-You can configure the `imx500_object_inference` stage in many ways.
+You can configure the `imx500_object_detection` stage in many ways.

 For example, `max_detections` defines the maximum number of objects that the pipeline will detect at any given time. `threshold` defines the minimum confidence value required for the pipeline to consider any input as an object.
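For illustration only, a minimal configuration for this stage might look like the following sketch. The values are examples, and the shipped `/usr/share/rpicam-assets/imx500_mobilenet_ssd.json` contains the full set of fields, including those for the second (drawing) stage:

```json
{
    "imx500_object_detection":
    {
        "max_detections": 5,
        "threshold": 0.55
    }
}
```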
@@ -105,15 +107,22 @@ image::images/imx500-posenet.jpg[IMX500 PoseNet]
 You can configure the `imx500_posenet` stage in many ways.

-For example, `max_detections` defines the maximum number of body points that the pipeline will detect at any given time. `threshold` defines the minimum confidence value required for the pipeline to consider input as a body point.
+For example, `max_detections` defines the maximum number of bodies that the pipeline will detect at any given time. `threshold` defines the minimum confidence value required for the pipeline to consider input as a body.

 === Picamera2

-For examples of image classification, object inference, object segmentation, and pose estimation using Picamera2, see https://github.com/raspberrypi/picamera2-imx500/blob/main/examples/imx500/[the `picamera2-imx500` GitHub repository].
+For examples of image classification, object detection, object segmentation, and pose estimation using Picamera2, see https://github.com/raspberrypi/picamera2/blob/main/examples/imx500/[the `picamera2` GitHub repository].

-Download the repository to your Raspberry Pi to run the examples. You'll find example files in the root directory, with additional information in the `README.md` file.
+Most of the examples use OpenCV for some additional processing, so if you haven't done so previously, please run:

-Run the following script from the repository to run YOLOv8 object inference:
+[source,console]
+----
+$ sudo apt install python3-opencv python3-munkres
+----
+
+Now download the https://github.com/raspberrypi/picamera2[`picamera2` repository] to your Raspberry Pi to run the examples. You'll find example files in the root directory, with additional information in the `README.md` file.
+
+Run the following script from the repository to run YOLOv8 object detection:

 [source,console]
 ----
@@ -124,5 +133,5 @@ To try pose estimation in Picamera2, run the following script from the repositor

 [source,console]
 ----
-$ python imx500_pose_estimation_yolov8n_demo.py --model /usr/share/imx500-models/imx500_network_yolov8n_pose.rpk
+$ python imx500_pose_estimation_higherhrnet_demo.py
 ----

documentation/asciidoc/accessories/ai-camera/images/imx500-comparison.svg

Lines changed: 1 addition & 0 deletions
Lines changed: 58 additions & 19 deletions
@@ -1,48 +1,87 @@
-== Model Conversion
+== Model Deployment

-Sony provides tools that enable users to convert pre-existing TensorFlow or PyTorch models to run on the Raspberry Pi AI Camera. Additionally, users can also build and train entirely new models for the IMX500.
+The process of deploying a new neural network model to the Raspberry Pi AI Camera normally consists of the following steps:

-=== Install the IMX500 tools package
+. A neural network model must be provided.
+. The model must be quantised and compressed so that it can run using the resources available on the IMX500 camera.
+. The compressed model must be converted to IMX500 format.
+. Finally, the model must be packaged into a firmware file that can be loaded at runtime into the camera.

-First, install the necessary tools:
+The first three steps will normally be performed on a more powerful computer, such as a desktop or server, whilst the final packaging step must be performed on a Raspberry Pi.
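In outline, and using only the commands covered in the sections that follow, the flow looks like this (paths in angle brackets are placeholders; the TensorFlow variant is shown):

```console
# Step 2: quantise and compress the model (desktop/server).
# Sony's Model Compression Toolkit produces a quantised Keras or ONNX model.
$ pip install model_compression_toolkit

# Step 3: convert the compressed model to IMX500 format (desktop/server).
# PyTorch users install imx500-converter[pt] and run imxconv-pt instead.
$ pip install imx500-converter[tf]
$ imxconv-tf -i <compressed Keras model> -o <output folder>

# Step 4: package the converted model into a firmware file (on the Raspberry Pi).
$ sudo apt install imx500-tools
$ imx500-package.sh -i <path to packerOut.zip> -o <output folder>
```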
+
+=== Model Creation
+
+The creation of neural network models is beyond the scope of this guide. Existing models can be re-used, or new ones created using popular frameworks like TensorFlow or PyTorch.
+
+For more information, readers are referred to the official https://developer.aitrios.sony-semicon.com/en/raspberrypi-ai-camera[AITRIOS Developer] website.
+
+=== Quantisation and Compression
+
+Models are quantised and compressed using Sony's _Model Compression Toolkit_. This can be installed with:

 [source,console]
 ----
-$ sudo apt install imx500-tools
+$ pip install model_compression_toolkit
 ----

+Information and tutorials can be found on the project's https://github.com/sony/model_optimization[GitHub page].
+
+The _Model Compression Toolkit_ will generate a quantised model in either Keras (for TensorFlow) or ONNX (for PyTorch) format.
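To give a feel for what quantisation means, here is a toy, self-contained Python sketch of 8-bit affine quantisation. It is purely illustrative and bears no relation to the Model Compression Toolkit's actual algorithms, which choose quantisation parameters far more carefully:

```python
def quantise_uint8(weights):
    """Toy 8-bit affine quantisation: map floats onto integers 0..255
    with a scale and offset, then map back to show the approximation."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 if hi != lo else 1.0
    quantised = [round((w - lo) / scale) for w in weights]
    approx = [q * scale + lo for q in quantised]   # dequantised values
    return quantised, approx

# Each float weight now fits in one byte, at a small cost in accuracy.
q, approx = quantise_uint8([-1.0, -0.25, 0.0, 0.5, 1.0])
print(q)       # -> [0, 96, 128, 191, 255]
```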
+
+=== Conversion
+
+First, we must install the necessary converter tools. If you are using TensorFlow, please run:
+
+[source,console]
+----
+$ pip install imx500-converter[tf]
+----
+
-Next, run the tools to convert and package the model.
+TIP: Be careful to install the same version of TensorFlow that you used to compress your model. This avoids problems where the command above may install a more recent version of TensorFlow that is not compatible with your model.

-The following command converts a model file stored in the `<model-folder>` directory into a converted, IMX500-compatible model stored in `<converted-model-folder>`:
+Or, if you are using PyTorch, please run:

 [source,console]
 ----
-$ imx500-convert.sh -i <model-folder> -o <converted-model-folder>
+$ pip install imx500-converter[pt]
 ----

+TIP: If you need to install both of these packages, we strongly recommend doing so in separate Python virtual environments (for example, created with `python -m venv <virtual-environment-name>`). This avoids TensorFlow and PyTorch causing conflicts with one another.

-Then, run the following command to package the converted model stored in the `<converted-model-folder>` directory into a package stored in `<packaged-model-folder>`.
+Next, we can convert the model. For TensorFlow, use:

 [source,console]
 ----
-$ imx500-package.sh -i <converted-model-folder> -o <packaged-model-folder>
+$ imxconv-tf -i <compressed Keras model> -o <output folder>
 ----

-=== Prepare firmware for deployment
+and for PyTorch, use:

-Finally, prepare the firmware for the Raspberry Pi AI Camera. This preparation swaps the Endian-ness of the byte ordering, then appends some sensor register information provided by the model conversion steps above into the firmware file.
+[source,console]
+----
+$ imxconv-pt -i <compressed ONNX model> -o <output folder>
+----
+
+In both cases, the output folder will contain, among other things, a memory usage report and a `packerOut.zip` file, which is what we need to copy to the Raspberry Pi for the final step.
+
+Again, for more information on the model conversion process, please refer to the official https://developer.aitrios.sony-semicon.com/en/raspberrypi-ai-camera/documentation/imx500-converter[IMX500 Converter] documentation.
+
+=== Packaging
+
+The final step, which we run on a Raspberry Pi, packages the model into an _RPK_ file. This _RPK_ file is then uploaded to the IMX500 camera when running the neural network model. Before proceeding, we must install the necessary tools:
+
+[source,console]
+----
+$ sudo apt install imx500-tools
+----

-Run the following commands to prepare the firmware into a file named `imx500_network.fpk`:
+Now we can run:

 [source,console]
 ----
-$ objcopy -I binary -O binary --reverse-bytes=4 /<packaged-model>/network.fpk network.fpk.REVERSED
-$ ni_to_reg /<packaged-model>/network_info.txt > registers.bin
-$ cat network.fpk.REVERSED registers.bin > imx500_network.fpk
+$ imx500-package.sh -i <path to packerOut.zip> -o <output folder>
 ----

-You can then load the prepared `imx500_network.fpk` file onto your Raspberry Pi AI Camera using the helper functions described above.
+The output folder should finally contain a file named `network.rpk`, whose name we pass to our IMX500 camera applications.
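As a sketch of using the result, the Picamera2 demo scripts shown earlier accept a `--model` argument, so a freshly packaged network could be tried with something like the following (the script name and paths are placeholders, and the exact option depends on the application):

```console
$ python <demo script>.py --model <output folder>/network.rpk
```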
4786

48-
For more information about the AI Camera and the tools used to work with it, visit the https://developer.sony.com/imx500/[Sony IMX500 developer website].
87+
More specific instructions on all these tools, and their constraints is out of scope for this tutorial. For a more comprehensive set of instructions and further specifics on the tools used, please see the official https://developer.aitrios.sony-semicon.com/en/raspberrypi-ai-camera/documentation/imx500-packager[IMX500 Packager] documentation.
