# MaixPy: Object Detection with YOLOv5 / YOLOv8 / YOLO11 / YOLO26 Models

## Concept of Object Detection

Object detection refers to identifying the positions and categories of targets in images or videos, for example, detecting objects such as apples and airplanes in an image and marking their locations.

Unlike image classification, object detection includes positional information, so its result is usually a bounding box that outlines the object's position.
10 | | - |
11 | | -## Object Detection in MaixPy |
12 | | - |
13 | | -MaixPy provides `YOLOv5`, `YOLOv8`, and `YOLO11` models by default, which can be used directly: |
| 7 | +## Using Object Detection in MaixPy |
| 8 | +MaixPy natively supports the **YOLOv5**, **YOLOv8**, **YOLO11** and **YOLO26** models, which can be used directly: |
14 | 9 | > YOLOv8 requires MaixPy >= 4.3.0. |
15 | 10 | > YOLO11 requires MaixPy >= 4.7.0. |
16 | | -
|
| 11 | +> YOLO26 requires MaixPy >= 4.12.5. |
```python
from maix import camera, display, image, nn, app

# Load the detector; dual_buff=True enables double-buffered inference
detector = nn.YOLOv5(model="/root/models/yolov5s.mud", dual_buff=True)
# detector = nn.YOLOv8(model="/root/models/yolov8n.mud", dual_buff=True)
# detector = nn.YOLO11(model="/root/models/yolo11n.mud", dual_buff=True)
# detector = nn.YOLO26(model="/root/models/yolo26n.mud", dual_buff=True)

cam = camera.Camera(detector.input_width(), detector.input_height(), detector.input_format())
disp = display.Display()

while not app.need_exit():
    img = cam.read()
    objs = detector.detect(img, conf_th=0.5, iou_th=0.45)
    for obj in objs:
        img.draw_rect(obj.x, obj.y, obj.w, obj.h, color=image.COLOR_RED)
        msg = f'{detector.labels[obj.class_id]}: {obj.score:.2f}'
        img.draw_string(obj.x, obj.y, msg, color=image.COLOR_RED)
    disp.show(img)
```

Demo video:

<div>
<video playsinline controls autoplay loop muted preload src="/static/video/detector.mp4" type="video/mp4">
</video>
</div>

The code above captures images from the camera, passes them to the `detector` for inference, and then displays the detection results (category names and positions) on the screen.
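
For intuition about the `conf_th` and `iou_th` arguments: detections scoring below `conf_th` are discarded, and `iou_th` is the overlap threshold typically applied during non-maximum suppression to drop duplicate boxes. Here is a standalone sketch (plain Python, not a MaixPy API) of the intersection-over-union metric being thresholded:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x, y, w, h)."""
    ax1, ay1, aw, ah = box_a
    bx1, by1, bw, bh = box_b
    ax2, ay2 = ax1 + aw, ay1 + ah
    bx2, by2 = bx1 + bw, by1 + bh
    # Overlap rectangle (zero if the boxes do not intersect)
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

# Two heavily overlapping boxes: IoU ~0.68, above a 0.45 iou_th,
# so one of the two would be suppressed as a duplicate
print(iou((0, 0, 100, 100), (10, 10, 100, 100)))
```

Raising `conf_th` trades recall for fewer false positives; raising `iou_th` keeps more overlapping boxes.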

You can switch between **YOLOv5 / YOLOv8 / YOLO11 / YOLO26** simply by replacing the corresponding model initialization line; remember to update the model file path as well.

See the appendix of this article for the list of 80 object categories supported by the pre-trained models.

For more API details, refer to the documentation of the [maix.nn](/api/maix/nn.html) module.

## Dual-Buffer Acceleration (`dual_buff`)

You may notice that the `dual_buff` parameter is used during model initialization (it is `True` by default). Enabling it can improve runtime efficiency and frame rate. For the underlying principle and usage notes, see the [Introduction to dual_buff](./dual_buff.md).

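As a rough illustration of the principle (a standalone Python sketch, not how MaixPy implements it): with double buffering, a background thread fills the next frame buffer while the main loop runs inference on the current one, so capture and inference overlap instead of alternating.

```python
import threading
import queue
import time

frames = queue.Queue(maxsize=1)  # the "second buffer": one frame staged ahead

def capture(n):
    # Producer thread: stands in for the camera filling the spare buffer
    for i in range(n):
        time.sleep(0.01)      # pretend a capture takes ~10 ms
        frames.put(i)
    frames.put(None)          # sentinel: capture finished

threading.Thread(target=capture, args=(5,), daemon=True).start()

processed = []
while (frame := frames.get()) is not None:
    time.sleep(0.01)          # pretend inference takes ~10 ms; capture overlaps it
    processed.append(frame)

print(processed)  # all frames handled in order
```

Because the two ~10 ms stages overlap, the loop finishes in roughly half the time a strictly serial capture-then-infer loop would take.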
## More Input Resolutions

The default model input resolution is **320x224** on MaixCAM and **640x480** on MaixCAM2, since these aspect ratios are close to the native screen resolutions of the devices. You can also download models with other input resolutions:

YOLOv5: [https://maixhub.com/model/zoo/365](https://maixhub.com/model/zoo/365)
YOLOv8: [https://maixhub.com/model/zoo/400](https://maixhub.com/model/zoo/400)
YOLO11: [https://maixhub.com/model/zoo/453](https://maixhub.com/model/zoo/453)

Higher resolutions yield higher detection accuracy but take longer to run; choose the appropriate resolution based on your application scenario.

## Which Model to Choose: YOLOv5, YOLOv8, YOLO11, or YOLO26?

The pre-provided models are **YOLOv5s**, **YOLOv8n**, **YOLO11n**, and **YOLO26n**. YOLOv5s is larger, while the n-series models run slightly faster. According to official data, the accuracy ranking is **YOLO26n > YOLO11n > YOLOv8n > YOLOv5s**; test them yourself and pick the one that fits your needs.

You can also try the **YOLOv8s** or **YOLO11s** models. Their frame rates are slightly lower (for example, `yolov8s_320x224` runs about 10 ms slower per frame than `yolov8n_320x224`), but their accuracy is higher than the n versions. These models can be downloaded from the model zoo linked above or exported yourself from the official YOLO repositories.

## Can the Camera and Model Use Different Resolutions?

When you call `detector.detect(img)`, if the resolution of `img` differs from the model's input resolution, the function automatically calls `img.resize` to scale the image to the model's input size. The default resize method is `image.Fit.FIT_CONTAIN`, which preserves the aspect ratio and fills the surrounding area with black. The detected bounding-box coordinates are automatically mapped back to the coordinate system of the original `img`.
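
To make that mapping concrete, here is a standalone sketch (plain Python with illustrative helper names, not MaixPy APIs) of the `FIT_CONTAIN` letterbox math: compute the scale and padding used to fit the source frame into the model input, then map a detected box back to source coordinates.

```python
def fit_contain(src_w, src_h, dst_w, dst_h):
    """Scale and padding for letterboxing src into dst while keeping
    the aspect ratio (the leftover border is filled with black)."""
    scale = min(dst_w / src_w, dst_h / src_h)
    pad_x = (dst_w - src_w * scale) / 2
    pad_y = (dst_h - src_h * scale) / 2
    return scale, pad_x, pad_y

def map_box_back(box, scale, pad_x, pad_y):
    """Map an (x, y, w, h) box from model-input coordinates back to the
    original image's coordinate system."""
    x, y, w, h = box
    return ((x - pad_x) / scale, (y - pad_y) / scale, w / scale, h / scale)

# Example: a 640x480 camera frame fed to a 320x224 model input
scale, pad_x, pad_y = fit_contain(640, 480, 320, 224)

# Round trip: project a source-space box into model space, then map it back
src_box = (100, 50, 200, 150)
model_box = (src_box[0] * scale + pad_x, src_box[1] * scale + pad_y,
             src_box[2] * scale, src_box[3] * scale)
recovered = map_box_back(model_box, scale, pad_x, pad_y)
print([round(v, 4) for v in recovered])  # recovers the original source box
```

For a 640x480 frame and a 320x224 model, the height is the limiting dimension, so the image is scaled by 224/480 and padded only left and right.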

## Training a Custom Object Detection Model Online with MaixHub

If you need to detect specific objects not covered by the pre-trained 80-class models, visit [MaixHub](https://maixhub.com) to learn about and train a custom object detection model; select **Object Detection Model** when creating the project. For details, refer to the [MaixHub Online Training Documentation](./maixhub_train.md).

You can also find models shared by the community in the [MaixHub Model Zoo](https://maixhub.com/model/zoo?platform=maixcam).

## Training a Custom Object Detection Model Offline

We strongly recommend starting with MaixHub online training; offline training is more complex and not recommended for beginners.

Offline training also assumes some basic machine-learning background that is not covered in this article; search online for solutions if you encounter problems.

See [Offline Training of YOLOv5 Models](./customize_model_yolov5.md) or [Offline Training of YOLOv8/YOLO11/YOLO26 Models](./customize_model_yolov8.md) for details.

## Appendix: The 80 Object Categories

The 80 object categories of the COCO dataset are:

```txt
person
bicycle
car
motorcycle
airplane
bus
train
truck
boat
traffic light
fire hydrant
stop sign
parking meter
bench
bird
cat
dog
horse
sheep
cow
elephant
bear
zebra
giraffe
backpack
umbrella
handbag
tie
suitcase
frisbee
skis
snowboard
sports ball
kite
baseball bat
baseball glove
skateboard
surfboard
tennis racket
bottle
wine glass
cup
fork
knife
spoon
bowl
banana
apple
sandwich
orange
broccoli
carrot
hot dog
pizza
donut
cake
chair
couch
potted plant
bed
dining table
toilet
tv
laptop
mouse
remote
keyboard
cell phone
microwave
oven
toaster
sink
refrigerator
book
clock
vase
scissors
teddy bear
hair dryer
toothbrush
```