
Commit ab8801a

Author: nullptr (committed)

chore: fix rtmdet export shape, update docs

Squashed commits:

commit e55152f
Author: nullptr <nullptr@localhost>
Date: Wed Dec 25 10:05:08 2024 +0000

    refactor: rtmdet forward shape

commit a88f7f1
Author: nullptr <nullptr@localhost>
Date: Wed Dec 25 08:21:20 2024 +0000

    docs: update contents

commit f291538
Author: nullptr <nullptr@localhost>
Date: Wed Dec 25 08:21:10 2024 +0000

    docs: update contents

1 parent: 3f5274d · commit: ab8801a

File tree

5 files changed: +252 −12 lines changed


README.md

Lines changed: 116 additions & 0 deletions
@@ -0,0 +1,116 @@
<div align="center">
<img width="20%" src="https://files.seeedstudio.com/sscma/docs/images/SSCMA-Hero.png"/>

<h1>
SenseCraft Model Assistant by Seeed Studio
</h1>

[![docs-build](https://github.com/Seeed-Studio/ModelAssistant/actions/workflows/docs-build.yml/badge.svg)](https://github.com/Seeed-Studio/ModelAssistant/actions/workflows/docs-build.yml)
![GitHub Release](https://img.shields.io/github/v/release/Seeed-Studio/ModelAssistant)
[![license](https://img.shields.io/github/license/Seeed-Studio/ModelAssistant.svg)](https://github.com/Seeed-Studio/ModelAssistant/blob/main/LICENSE)
[![Average time to resolve an issue](http://isitmaintained.com/badge/resolution/Seeed-Studio/ModelAssistant.svg)](http://isitmaintained.com/project/Seeed-Studio/ModelAssistant "Average time to resolve an issue")
[![Percentage of issues still open](http://isitmaintained.com/badge/open/Seeed-Studio/ModelAssistant.svg)](http://isitmaintained.com/project/Seeed-Studio/ModelAssistant "Percentage of issues still open")

<h3>
<a href="https://sensecraftma.seeed.cc"> Documentation </a> |
<a href="https://sensecraftma.seeed.cc/introduction/installation"> Installation </a> |
<a href="https://github.com/Seeed-Studio/ModelAssistant/tree/main/notebooks"> Colab </a> |
<a href="https://github.com/Seeed-Studio/sscma-model-zoo"> Model Zoo </a> |
<a href="https://seeed-studio.github.io/SenseCraft-Web-Toolkit"> Deploy </a> -
<a href="README_zh-CN.md"> 简体中文 </a>
</h3>

</div>

## Introduction

**S**eeed **S**ense**C**raft **M**odel **A**ssistant is an open-source project focused on providing state-of-the-art AI algorithms for embedded devices. It is designed to help developers and makers easily deploy various AI models on low-cost hardware, such as microcontrollers and single-board computers (SBCs).

<div align="center">

<img width="98%" src="https://files.seeedstudio.com/sscma/docs/images/SSCMA-Deploy.gif"/>

</div>

*Real-world deployment examples on MCUs with less than 0.3 W of power consumption.*

### 🤝 User-friendly

SSCMA provides a user-friendly platform that lets users easily train on collected data and better understand algorithm performance through the visualizations generated during training.

### 🔋 Low-compute, high-performance models

SSCMA focuses on edge-side AI algorithm research. Its models can be deployed on microprocessors such as the [ESP32](https://www.espressif.com.cn/en/products/socs/esp32), on some [Arduino](https://arduino.cc) development boards, and even on embedded SBCs such as the [Raspberry Pi](https://www.raspberrypi.org).

### 🗂️ Multiple model export formats

[TensorFlow Lite](https://www.tensorflow.org/lite) is mainly used on microcontrollers, while [ONNX](https://onnx.ai) is mainly used on devices running embedded Linux. Special formats such as [TensorRT](https://developer.nvidia.com/tensorrt) and [OpenVINO](https://docs.openvino.ai) are already well supported by OpenMMLab. SSCMA adds TFLite model export for microcontrollers; exported models can be directly converted to [TensorRT](https://developer.nvidia.com/tensorrt) or [UF2](https://github.com/microsoft/uf2) format, and UF2 files can be dragged and dropped onto the device for deployment.
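As a rough illustration of the TFLite target format, the sketch below uses the standard TensorFlow converter API rather than SSCMA's own export tooling; the model path and quantization settings are placeholder assumptions.

```python
# A generic TFLite conversion sketch using the standard TensorFlow API.
# This is not SSCMA's exporter; "saved_model/" is a placeholder path.
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model/")
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # enable default quantization

tflite_model = converter.convert()
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```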
## Features

We have optimized excellent algorithms from [OpenMMLab](https://github.com/open-mmlab) for real-world scenarios, making the implementations more user-friendly and achieving faster, more accurate inference. We currently support the following algorithm directions, with a sketch of the anomaly-detection idea after this section:

### 🔍 Anomaly Detection

In the real world, anomalous data is often difficult to identify, and even when it can be identified, labeling it carries a very high cost. Anomaly detection algorithms instead collect normal data in a low-cost way and treat anything outside the normal data as anomalous.
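A minimal sketch of that idea, assuming a simple statistical model of "normal" and an illustrative 3-sigma threshold (SSCMA's own implementation uses learned models such as a VAE):

```python
# Minimal "train on normal data only" sketch: model normal samples with a
# mean and standard deviation, then flag anything far outside that range.
# The data and the 3-sigma threshold are illustrative, not SSCMA's method.
import numpy as np

rng = np.random.default_rng(0)
normal_data = rng.normal(loc=0.0, scale=1.0, size=1000)  # "normal" readings
mean, std = normal_data.mean(), normal_data.std()

def is_anomalous(x: float, k: float = 3.0) -> bool:
    """Flag values more than k standard deviations from the normal mean."""
    return abs(x - mean) > k * std

print(is_anomalous(0.5))  # False: within the normal range
print(is_anomalous(7.0))  # True: far outside anything seen during "training"
```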
### 👁️ Computer Vision

We provide a number of computer vision algorithms, including **object detection, image classification, image segmentation, and pose estimation**. Out of the box, these algorithms typically cannot run on low-cost hardware; SSCMA optimizes them to achieve good running speed and accuracy on low-end devices.

### ⏱️ Scenario Specific

SSCMA provides customized solutions for specific production environments, such as the recognition of analog instruments and traditional digital meters, and audio classification. We will continue to add more algorithms for such scenarios in the future.

## What's New

SSCMA is committed to providing cutting-edge AI algorithms with the best possible performance and accuracy. Guided by community feedback, we keep updating and optimizing the algorithms to meet users' actual needs. Here are some of the latest updates:

### 🔥 RTMDet, VAE, QAT

We have added the RTMDet algorithm for real-time multi-object detection, VAE for anomaly detection, and QAT for quantization-aware training. These algorithms are optimized for low-cost hardware and can be deployed on microcontrollers.

![RTMDet COCO Benchmark](docs/images/rtmdet_coco_eval.png)

We have also optimized the training process for these algorithms; training is now much faster than before.
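For readers unfamiliar with QAT, the sketch below shows the general pattern in eager-mode PyTorch (fake-quantize during training, convert to int8 afterwards). It is a generic example, not SSCMA's training pipeline, and the tiny model is an illustrative assumption.

```python
# Generic quantization-aware training (QAT) pattern in eager-mode PyTorch.
# Not SSCMA's pipeline; TinyNet and the backend choice are illustrative.
import torch
import torch.nn as nn
import torch.ao.quantization as tq

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()      # marks where tensors become quantized
        self.conv = nn.Conv2d(3, 8, 3)
        self.relu = nn.ReLU()
        self.dequant = tq.DeQuantStub()  # marks where tensors become float again

    def forward(self, x):
        return self.dequant(self.relu(self.conv(self.quant(x))))

model = TinyNet().train()
model.qconfig = tq.get_default_qat_qconfig("qnnpack")  # ARM-friendly backend
tq.prepare_qat(model, inplace=True)  # insert fake-quantization observers

# ... run the normal training loop here so the observers are calibrated ...

model.eval()
int8_model = tq.convert(model)  # fold observers into a real int8 model
```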
### YOLOv8, YOLOv8 Pose, NVIDIA TAO Models, and ByteTrack

With [SSCMA-Micro](https://github.com/Seeed-Studio/SSCMA-Micro), you can now deploy the latest [YOLOv8](https://github.com/ultralytics/ultralytics), YOLOv8 Pose, and [NVIDIA TAO Models](https://docs.nvidia.com/tao/tao-toolkit/text/model_zoo/cv_models/index.html) on microcontrollers. We have also added the [ByteTrack](https://github.com/ifzhang/ByteTrack) algorithm to enable real-time object tracking on low-cost hardware.

<div align="center"><img width="98%" src="https://files.seeedstudio.com/sscma/docs/images/SSCMA-WebCam-Tracking.gif"/></div>

### Swift YOLO

We implemented a lightweight object detection algorithm called Swift YOLO, designed to run on low-cost hardware with limited computing power. The visualization tool and the model training and export command-line interfaces have now been refactored.

<div align="center"><img width="98%" src="https://files.seeedstudio.com/sscma/docs/static/esp32/images/person_detection.png"/></div>

### Meter Recognition

Meters are common instruments in daily life and industrial production, including analog meters, digital meters, and more. SSCMA provides meter recognition algorithms that can be used to read various kinds of meters.

<div align="center"><img width="98%" src="https://files.seeedstudio.com/sscma/docs/static/grove/images/pfld_meter.gif"/></div>

## The SSCMA Toolchain

SSCMA provides a complete toolchain for easily deploying AI models on low-cost hardware, including:

- [SSCMA-Model-Zoo](https://sensecraft.seeed.cc/ai/#/model) A series of pre-trained models for different application scenarios. The source code for this site is [hosted here](https://github.com/Seeed-Studio/sscma-model-zoo).
- [SSCMA-Web-Toolkit, now renamed SenseCraft AI](https://sensecraft.seeed.cc/ai/#/home) A web-based tool that makes training and deploying machine learning models (currently focused on vision models) fast, easy, and accessible to everyone.
- [SSCMA-Micro](https://github.com/Seeed-Studio/SSCMA-Micro) A cross-platform framework that deploys and runs SSCMA models on microcontroller devices.
- [Seeed-Arduino-SSCMA](https://github.com/Seeed-Studio/Seeed_Arduino_SSCMA) An Arduino library for devices running the SSCMA-Micro firmware.
- [Python-SSCMA](https://github.com/Seeed-Studio/python-sscma) A Python library for interacting with microcontrollers running SSCMA-Micro and for building higher-level deep learning applications.

## Acknowledgement

SSCMA is a joint effort of many developers and contributors. We would like to thank the following projects and organizations, whose work SSCMA referenced in its implementation:

- [OpenMMLab](https://openmmlab.com/)
- [ONNX](https://github.com/onnx/onnx)
- [NCNN](https://github.com/Tencent/ncnn)
- [TinyNN](https://github.com/alibaba/TinyNeuralNetwork)

## License

This project is released under the [Apache 2.0 license](LICENSE).

README_zh-CN.md

Lines changed: 116 additions & 0 deletions
@@ -0,0 +1,116 @@
<div align="center">
<img width="20%" src="https://files.seeedstudio.com/sscma/docs/images/SSCMA-Hero.png"/>

<h1>
SenseCraft Model Assistant by Seeed Studio
</h1>

[![docs-build](https://github.com/Seeed-Studio/ModelAssistant/actions/workflows/docs-build.yml/badge.svg)](https://github.com/Seeed-Studio/ModelAssistant/actions/workflows/docs-build.yml)
![GitHub Release](https://img.shields.io/github/v/release/Seeed-Studio/ModelAssistant)
[![license](https://img.shields.io/github/license/Seeed-Studio/ModelAssistant.svg)](https://github.com/Seeed-Studio/ModelAssistant/blob/main/LICENSE)
[![Average time to resolve an issue](http://isitmaintained.com/badge/resolution/Seeed-Studio/ModelAssistant.svg)](http://isitmaintained.com/project/Seeed-Studio/ModelAssistant "Average time to resolve an issue")
[![Percentage of issues still open](http://isitmaintained.com/badge/open/Seeed-Studio/ModelAssistant.svg)](http://isitmaintained.com/project/Seeed-Studio/ModelAssistant "Percentage of issues still open")

<h3>
<a href="https://sensecraftma.seeed.cc"> Documentation </a> |
<a href="https://sensecraftma.seeed.cc/introduction/installation"> Installation </a> |
<a href="https://github.com/Seeed-Studio/ModelAssistant/tree/main/notebooks"> Colab </a> |
<a href="https://github.com/Seeed-Studio/sscma-model-zoo"> Model Zoo </a> |
<a href="https://seeed-studio.github.io/SenseCraft-Web-Toolkit"> Deploy </a> -
<a href="README.md"> English </a>
</h3>

</div>

## Introduction

**S**eeed **S**ense**C**raft **M**odel **A**ssistant is an open-source project focused on providing state-of-the-art AI algorithms for embedded devices. It aims to help developers and makers easily deploy various AI models on low-cost hardware, such as microcontrollers and single-board computers (SBCs).

<div align="center">

<img width="98%" src="https://files.seeedstudio.com/sscma/docs/images/SSCMA-Deploy.gif"/>

</div>

*Real-world deployment examples on microcontrollers with less than 0.3 W of power consumption.*

### 🤝 User-friendly

SenseCraft Model Assistant provides a user-friendly platform that makes it easy to train on collected data and to better understand algorithm performance through the visualizations generated during training.

### 🔋 Low-compute, high-performance models

SenseCraft Model Assistant focuses on edge-side AI algorithm research. Its models can be deployed on microprocessors such as the [ESP32](https://www.espressif.com.cn/en/products/socs/esp32), on some [Arduino](https://arduino.cc) development boards, and even on embedded SBCs such as the [Raspberry Pi](https://www.raspberrypi.org).

### 🗂️ Multiple model export formats

[TensorFlow Lite](https://www.tensorflow.org/lite) is mainly used on microcontrollers, while [ONNX](https://onnx.ai) is mainly used on embedded Linux devices. Special formats such as [TensorRT](https://developer.nvidia.com/tensorrt) and [OpenVINO](https://docs.openvino.ai) are already well supported by OpenMMLab. SenseCraft Model Assistant adds TFLite model export; exported models can be directly converted to [TensorRT](https://developer.nvidia.com/tensorrt) or [UF2](https://github.com/microsoft/uf2) format and dragged and dropped onto the device for deployment.

## Features

We have optimized excellent algorithms from [OpenMMLab](https://github.com/open-mmlab) for real-world scenarios, making the implementations more user-friendly and achieving faster, more accurate inference. We currently support the following algorithm directions:

### 🔍 Anomaly Detection

In the real world, anomalous data is often difficult to identify, and even when it can be identified, labeling it carries a very high cost. Anomaly detection algorithms collect normal data in a low-cost way and treat anything outside the range of normal data as anomalous.

### 👁️ Computer Vision

We provide many computer vision algorithms, such as object detection, image classification, image segmentation, and pose estimation. Out of the box, these algorithms typically cannot run on low-cost hardware; SenseCraft Model Assistant optimizes them to achieve good running speed and accuracy.

### ⏱️ Scenario Specific

SenseCraft Model Assistant provides customized solutions for specific production environments, such as the recognition of analog instruments and traditional digital meters, and audio classification. We will continue to add more algorithms for such scenarios in the future.

## What's New

SSCMA is committed to providing cutting-edge AI algorithms with the best possible performance and accuracy. Guided by community feedback, we keep updating and optimizing the algorithms to meet users' actual needs. Here are some of the latest updates:

### 🔥 RTMDet, VAE, QAT

We have added the RTMDet algorithm for real-time multi-object detection, VAE for anomaly detection, and QAT for quantization-aware training. These algorithms are optimized for low-cost hardware and can be deployed on microcontrollers.

![RTMDet COCO Benchmark](docs/images/rtmdet_coco_eval.png)

We have also optimized the training process for these algorithms; training is now much faster than before.

### YOLOv8, YOLOv8 Pose, NVIDIA TAO Models, and ByteTrack

With [SSCMA-Micro](https://github.com/Seeed-Studio/SSCMA-Micro), you can now deploy the latest [YOLOv8](https://github.com/ultralytics/ultralytics), YOLOv8 Pose, and [NVIDIA TAO Models](https://docs.nvidia.com/tao/tao-toolkit/text/model_zoo/cv_models/index.html) on microcontrollers. We have also added the [ByteTrack](https://github.com/ifzhang/ByteTrack) algorithm to enable real-time object tracking on low-cost hardware.

<div align="center"><img width="98%" src="https://files.seeedstudio.com/sscma/docs/images/SSCMA-WebCam-Tracking.gif"/></div>

### Swift YOLO

We implemented a lightweight object detection algorithm called Swift YOLO, designed to run on low-cost hardware with limited computing power. The visualization tool and the model training and export command-line interfaces have now been refactored.

<div align="center"><img width="98%" src="https://files.seeedstudio.com/sscma/docs/static/esp32/images/person_detection.png"/></div>

### Meter Recognition

Meters are common instruments in daily life and industrial production, including analog meters, digital meters, and more. SSCMA provides meter recognition algorithms that can be used to read various kinds of meters.

<div align="center"><img width="98%" src="https://files.seeedstudio.com/sscma/docs/static/grove/images/pfld_meter.gif"/></div>

## The SSCMA Toolchain

SSCMA provides a complete toolchain for easily deploying AI models on low-cost hardware, including:

- [SSCMA-Model-Zoo](https://github.com/Seeed-Studio/sscma-model-zoo) The SSCMA Model Zoo provides a series of pre-trained models for different application scenarios.
- [SSCMA-Micro](https://github.com/Seeed-Studio/SSCMA-Micro) A cross-platform framework that deploys and runs SSCMA models on microcontroller devices.
- [Seeed-Arduino-SSCMA](https://github.com/Seeed-Studio/Seeed_Arduino_SSCMA) An Arduino library for devices running the SSCMA-Micro firmware.
- [SSCMA-Web-Toolkit](https://seeed-studio.github.io/SenseCraft-Web-Toolkit) A web-based tool for updating device firmware, SSCMA models, and parameters.
- [Python-SSCMA](https://github.com/Seeed-Studio/python-sscma) A Python library for interacting with microcontrollers running SSCMA-Micro and for building higher-level deep learning applications.

## Acknowledgement

SSCMA is a joint effort of many developers and contributors. We would like to thank the following projects and organizations, whose work SSCMA referenced in its implementation:

- [OpenMMLab](https://openmmlab.com/)
- [ONNX](https://github.com/onnx/onnx)
- [NCNN](https://github.com/Tencent/ncnn)
- [TinyNN](https://github.com/alibaba/TinyNeuralNetwork)

## License

This project is released under the [Apache 2.0 license](LICENSE).

docs/images/rtmdet_coco_eval.png

169 KB

sscma/models/detectors/rtmdet.py

Lines changed: 3 additions & 0 deletions
```diff
@@ -1,9 +1,12 @@
 # Copyright (c) OpenMMLab. All rights reserved.
+from typing import List, Tuple, Union
 import torch
+from torch import Tensor
 
 from mmengine.dist import get_world_size
 from mmengine.logging import print_log
 
+from sscma.structures import SampleList
 from sscma.utils.typing_utils import ConfigType, OptConfigType, OptMultiConfig
 from .single_stage import SingleStageDetector
```

sscma/models/heads/rtmdet_head.py

Lines changed: 17 additions & 12 deletions
```diff
@@ -1,6 +1,7 @@
 # Copyright (c) OpenMMLab. All rights reserved.
 from typing import List, Optional, Tuple, Union, Sequence
 import copy
+import math
 import torch
 import torch.nn as nn
 from torch import Tensor
@@ -213,6 +214,7 @@ def forward(self, feats: Tuple[Tensor, ...]) -> tuple:
 
         cls_scores = []
         bbox_preds = []
+
         for idx, x in enumerate(feats):
             cls_feat = x
             reg_feat = x
@@ -225,13 +227,15 @@ def forward(self, feats: Tuple[Tensor, ...]) -> tuple:
                 reg_feat = reg_layer(reg_feat)
 
             reg_dist = self.rtm_reg[idx](reg_feat)
-            # cls_scores.append(cls_score.permute(0, 2, 3, 1).reshape(1, -1, self.num_classes))
-            # bbox_preds.append(reg_dist.permute(0, 2, 3, 1).reshape(1, -1, 4))
-            cls_scores.append(cls_score)
-            bbox_preds.append(reg_dist)
+            cls_scores.append(cls_score.permute(0, 2, 3, 1).reshape(1, -1, self.num_classes))
+            bbox_preds.append(reg_dist.permute(0, 2, 3, 1).reshape(1, -1, 4))
+            # cls_scores.append(cls_score)
+            # bbox_preds.append(reg_dist)  # raw maps, e.g. (32, 6, 6, 2)
         return tuple(cls_scores), tuple(bbox_preds)
 
 
+
+
 class YOLOv5Head(BaseDenseHead):
     """YOLOv5Head head used in `YOLOv5`.
```
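The key change above is in `forward`: each level's raw `(N, C, H, W)` map is now permuted and flattened to `(1, N*H*W, C)` before being returned, which gives the exported model a fixed, rank-3 output shape. A small standalone sketch of the equivalent tensor math, with illustrative sizes:

```python
# Standalone illustration of the reshape now done in forward(); the batch
# size, 6x6 grid, and 2 classes are illustrative values only.
import torch

num_imgs, num_classes = 4, 2
cls_score = torch.randn(num_imgs, num_classes, 6, 6)  # (N, C, H, W) conv output

# Channels-last, then flatten batch and spatial grid together: (1, N*H*W, C).
flat = cls_score.permute(0, 2, 3, 1).reshape(1, -1, num_classes)
print(flat.shape)  # torch.Size([1, 144, 2])
```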
```diff
@@ -444,7 +448,7 @@ def predict_by_feat(
         cfg.multi_label = multi_label
 
         num_imgs = len(batch_img_metas)
-        featmap_sizes = [cls_score.shape[2:] for cls_score in cls_scores]
+        featmap_sizes = [[int(math.sqrt(featmap.shape[1] // num_imgs))] * 2 for featmap in cls_scores]
 
         # If the shape does not change, use the previous mlvl_priors
         if featmap_sizes != self.featmap_sizes:
@@ -456,18 +460,18 @@ def predict_by_feat(
 
         mlvl_strides = [
             flatten_priors.new_full(
-                (featmap_size.numel() * self.num_base_priors,), stride
+                (featmap_size[0] * featmap_size[1] * self.num_base_priors,), stride
             )
             for featmap_size, stride in zip(featmap_sizes, self.featmap_strides)
         ]
         flatten_stride = torch.cat(mlvl_strides)
 
         flatten_cls_scores = [
-            cls_score.permute(0, 2, 3, 1).reshape(num_imgs, -1, self.num_classes)
+            cls_score.reshape(num_imgs, -1, self.num_classes)
             for cls_score in cls_scores
         ]
         flatten_bbox_preds = [
-            bbox_pred.permute(0, 2, 3, 1).reshape(num_imgs, -1, 4)
+            bbox_pred.reshape(num_imgs, -1, 4)
             for bbox_pred in bbox_preds
         ]
```
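Because the head now emits flattened `(1, N*H*W, C)` tensors, `predict_by_feat` can no longer read the grid size off dims 2 and 3; it instead recovers it as `sqrt(shape[1] // num_imgs)`, which implicitly assumes square feature maps. A quick standalone check of that round trip, with illustrative sizes:

```python
# Round-trip check for the feature-map-size recovery used above.
# Assumes square feature maps; all sizes here are illustrative.
import math
import torch

num_imgs, num_classes = 2, 80
h = w = 20  # one 20x20 feature level

raw = torch.randn(num_imgs, num_classes, h, w)
flat = raw.permute(0, 2, 3, 1).reshape(1, -1, num_classes)  # as in forward()

side = int(math.sqrt(flat.shape[1] // num_imgs))   # as in predict_by_feat()
per_image = flat.reshape(num_imgs, -1, num_classes)

print(side)             # 20
print(per_image.shape)  # torch.Size([2, 400, 80])
```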

```diff
@@ -1157,7 +1161,7 @@ def loss_by_feat(
             dict[str, Tensor]: A dictionary of loss components.
         """
         num_imgs = len(batch_img_metas)
-        featmap_sizes = [featmap.size()[-2:] for featmap in cls_scores]
+        featmap_sizes = [[int(math.sqrt(featmap.shape[1] // num_imgs))] * 2 for featmap in cls_scores]
         assert len(featmap_sizes) == self.prior_generator.num_levels
 
         gt_info = gt_instances_preprocess(batch_gt_instances, num_imgs)
@@ -1177,7 +1181,7 @@ def loss_by_feat(
 
         flatten_cls_scores = torch.cat(
             [
-                cls_score.permute(0, 2, 3, 1).reshape(
+                cls_score.reshape(
                     num_imgs, -1, self.cls_out_channels
                 )
                 for cls_score in cls_scores
@@ -1187,11 +1191,12 @@ def loss_by_feat(
 
         flatten_bboxes = torch.cat(
             [
-                bbox_pred.permute(0, 2, 3, 1).reshape(num_imgs, -1, 4)
+                bbox_pred.reshape(num_imgs, -1, 4)
                 for bbox_pred in bbox_preds
             ],
             1,
-        )
+        ).contiguous()
+
         flatten_bboxes = flatten_bboxes * self.flatten_priors_train[..., -1, None]
         flatten_bboxes = distance2bbox(
             self.flatten_priors_train[..., :2], flatten_bboxes
```
