Commit b481efc

[Docs] Add docs and README for MinkUnet (#2358)
* add readme
* rename
* fix miou typo
* add link
* fix backbone name
* add torchsparse link
* revise link
1 parent 20987e5 commit b481efc

5 files changed, +159 -52 lines changed

README.md

Lines changed: 29 additions & 26 deletions
@@ -134,6 +134,7 @@ Results and models are available in the [model zoo](docs/en/model_zoo.md).
 <li><a href="configs/dgcnn">DGCNN (TOG'2019)</a></li>
 <li>DLA (CVPR'2018)</li>
 <li>MinkResNet (CVPR'2019)</li>
+<li><a href="configs/minkunet">MinkUNet (CVPR'2019)</a></li>
 <li><a href="configs/cylinder3d">Cylinder3D (CVPR'2021)</a></li>
 </ul>
 </td>
@@ -221,6 +222,7 @@ Results and models are available in the [model zoo](docs/en/model_zoo.md).
 <td>
 <li><b>Outdoor</b></li>
 <ul>
+<li><a href="configs/minkunet">MinkUNet (CVPR'2019)</a></li>
 <li><a href="configs/cylinder3d">Cylinder3D (CVPR'2021)</a></li>
 </ul>
 <li><b>Indoor</b></li>
@@ -237,32 +239,33 @@ Results and models are available in the [model zoo](docs/en/model_zoo.md).
 </tbody>
 </table>

-| | ResNet | PointNet++ | SECOND | DGCNN | RegNetX | DLA | MinkResNet | Cylinder3D |
-| :-----------: | :----: | :--------: | :----: | :---: | :-----: | :-: | :--------: | :--------: |
-| SECOND |||||||||
-| PointPillars |||||||||
-| FreeAnchor |||||||||
-| VoteNet |||||||||
-| H3DNet |||||||||
-| 3DSSD |||||||||
-| Part-A2 |||||||||
-| MVXNet |||||||||
-| CenterPoint |||||||||
-| SSN |||||||||
-| ImVoteNet |||||||||
-| FCOS3D |||||||||
-| PointNet++ |||||||||
-| Group-Free-3D |||||||||
-| ImVoxelNet |||||||||
-| PAConv |||||||||
-| DGCNN |||||||||
-| SMOKE |||||||||
-| PGD |||||||||
-| MonoFlex |||||||||
-| SA-SSD |||||||||
-| FCAF3D |||||||||
-| PV-RCNN |||||||||
-| Cylinder3D |||||||||
+| | ResNet | PointNet++ | SECOND | DGCNN | RegNetX | DLA | MinkResNet | Cylinder3D | MinkUNet |
+| :-----------: | :----: | :--------: | :----: | :---: | :-----: | :-: | :--------: | :--------: | :------: |
+| SECOND ||||||||||
+| PointPillars ||||||||||
+| FreeAnchor ||||||||||
+| VoteNet ||||||||||
+| H3DNet ||||||||||
+| 3DSSD ||||||||||
+| Part-A2 ||||||||||
+| MVXNet ||||||||||
+| CenterPoint ||||||||||
+| SSN ||||||||||
+| ImVoteNet ||||||||||
+| FCOS3D ||||||||||
+| PointNet++ ||||||||||
+| Group-Free-3D ||||||||||
+| ImVoxelNet ||||||||||
+| PAConv ||||||||||
+| DGCNN ||||||||||
+| SMOKE ||||||||||
+| PGD ||||||||||
+| MonoFlex ||||||||||
+| SA-SSD ||||||||||
+| FCAF3D ||||||||||
+| PV-RCNN ||||||||||
+| Cylinder3D ||||||||||
+| MinkUNet ||||||||||

 **Note:** All of the **300+ models and methods from 40+ papers** for 2D detection supported by [MMDetection](https://github.com/open-mmlab/mmdetection/blob/3.x/docs/en/model_zoo.md) can be trained or used in this codebase.

README_zh-CN.md

Lines changed: 29 additions & 26 deletions
@@ -131,6 +131,7 @@ MMDetection3D is an open-source object detection toolbox based on PyTorch, the next-generation
 <li><a href="configs/dgcnn">DGCNN (TOG'2019)</a></li>
 <li>DLA (CVPR'2018)</li>
 <li>MinkResNet (CVPR'2019)</li>
+<li><a href="configs/minkunet">MinkUNet (CVPR'2019)</a></li>
 <li><a href="configs/cylinder3d">Cylinder3D (CVPR'2021)</a></li>
 </ul>
 </td>
@@ -217,6 +218,7 @@ MMDetection3D is an open-source object detection toolbox based on PyTorch, the next-generation
 <td>
 <li><b>Outdoor</b></li>
 <ul>
+<li><a href="configs/minkunet">MinkUNet (CVPR'2019)</a></li>
 <li><a href="configs/cylinder3d">Cylinder3D (CVPR'2021)</a></li>
 </ul>
 <li><b>Indoor</b></li>
@@ -233,32 +235,33 @@ MMDetection3D is an open-source object detection toolbox based on PyTorch, the next-generation
 </tbody>
 </table>

-| | ResNet | PointNet++ | SECOND | DGCNN | RegNetX | DLA | MinkResNet | Cylinder3D |
-| :-----------: | :----: | :--------: | :----: | :---: | :-----: | :-: | :--------: | :--------: |
-| SECOND |||||||||
-| PointPillars |||||||||
-| FreeAnchor |||||||||
-| VoteNet |||||||||
-| H3DNet |||||||||
-| 3DSSD |||||||||
-| Part-A2 |||||||||
-| MVXNet |||||||||
-| CenterPoint |||||||||
-| SSN |||||||||
-| ImVoteNet |||||||||
-| FCOS3D |||||||||
-| PointNet++ |||||||||
-| Group-Free-3D |||||||||
-| ImVoxelNet |||||||||
-| PAConv |||||||||
-| DGCNN |||||||||
-| SMOKE |||||||||
-| PGD |||||||||
-| MonoFlex |||||||||
-| SA-SSD |||||||||
-| FCAF3D |||||||||
-| PV-RCNN |||||||||
-| Cylinder3D |||||||||
+| | ResNet | PointNet++ | SECOND | DGCNN | RegNetX | DLA | MinkResNet | Cylinder3D | MinkUNet |
+| :-----------: | :----: | :--------: | :----: | :---: | :-----: | :-: | :--------: | :--------: | :------: |
+| SECOND ||||||||||
+| PointPillars ||||||||||
+| FreeAnchor ||||||||||
+| VoteNet ||||||||||
+| H3DNet ||||||||||
+| 3DSSD ||||||||||
+| Part-A2 ||||||||||
+| MVXNet ||||||||||
+| CenterPoint ||||||||||
+| SSN ||||||||||
+| ImVoteNet ||||||||||
+| FCOS3D ||||||||||
+| PointNet++ ||||||||||
+| Group-Free-3D ||||||||||
+| ImVoxelNet ||||||||||
+| PAConv ||||||||||
+| DGCNN ||||||||||
+| SMOKE ||||||||||
+| PGD ||||||||||
+| MonoFlex ||||||||||
+| SA-SSD ||||||||||
+| FCAF3D ||||||||||
+| PV-RCNN ||||||||||
+| Cylinder3D ||||||||||
+| MinkUNet ||||||||||

 **Note:** The **300+ models and algorithms from 40+ papers** for 2D detection supported by [MMDetection](https://github.com/open-mmlab/mmdetection/blob/3.x/docs/zh_cn/model_zoo.md) can all be trained or used in MMDetection3D.

configs/minkunet/README.md

Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
+# 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks
+
+> [4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks](https://arxiv.org/abs/1904.08755)
+
+<!-- [ALGORITHM] -->
+
+## Abstract
+
+In many robotics and VR/AR applications, 3D-videos are readily-available sources of input (a continuous sequence of depth images, or LIDAR scans). However, those 3D-videos are processed frame-by-frame either through 2D convnets or 3D perception algorithms. In this work, we propose 4-dimensional convolutional neural networks for spatio-temporal perception that can directly process such 3D-videos using high-dimensional convolutions. For this, we adopt sparse tensors and propose the generalized sparse convolution that encompasses all discrete convolutions. To implement the generalized sparse convolution, we create an open-source auto-differentiation library for sparse tensors that provides extensive functions for high-dimensional convolutional neural networks. We create 4D spatio-temporal convolutional neural networks using the library and validate them on various 3D semantic segmentation benchmarks and proposed 4D datasets for 3D-video perception. To overcome challenges in the 4D space, we propose the hybrid kernel, a special case of the generalized sparse convolution, and the trilateral-stationary conditional random field that enforces spatio-temporal consistency in the 7D space-time-chroma space. Experimentally, we show that convolutional neural networks with only generalized 3D sparse convolutions can outperform 2D or 2D-3D hybrid methods by a large margin. Also, we show that on 3D-videos, 4D spatio-temporal convolutional neural networks are robust to noise, outperform 3D convolutional neural networks and are faster than the 3D counterpart in some cases.
+
+<div align=center>
+<img src="https://user-images.githubusercontent.com/72679458/225243534-cd0ed738-4224-4e7c-bcac-4f4c8d89f3a9.png" width="800"/>
+</div>
+
+## Introduction
+
+We implement MinkUNet with the [TorchSparse](https://github.com/mit-han-lab/torchsparse) backend and provide results and checkpoints on the SemanticKITTI dataset.
+
+## Results and models
+
+### SemanticKITTI
+
+| Method | Lr schd | Mem (GB) | mIoU | Download |
+| :----------: | :-----: | :------: | :--: | :------: |
+| MinkUNet-W16 | 15e | 3.4 | 60.3 | [model](https://download.openmmlab.com/mmdetection3d/v1.1.0_models/minkunet/minkunet_w16_8xb2-15e_semantickitti/minkunet_w16_8xb2-15e_semantickitti_20230309_160737-0d8ec25b.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.1.0_models/minkunet/minkunet_w16_8xb2-15e_semantickitti/minkunet_w16_8xb2-15e_semantickitti_20230309_160737.log) |
+| MinkUNet-W20 | 15e | 3.7 | 61.6 | [model](https://download.openmmlab.com/mmdetection3d/v1.1.0_models/minkunet/minkunet_w20_8xb2-15e_semantickitti/minkunet_w20_8xb2-15e_semantickitti_20230309_160718-c3b92e6e.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.1.0_models/minkunet/minkunet_w20_8xb2-15e_semantickitti/minkunet_w20_8xb2-15e_semantickitti_20230309_160718.log) |
+| MinkUNet-W32 | 15e | 4.9 | 63.1 | [model](https://download.openmmlab.com/mmdetection3d/v1.1.0_models/minkunet/minkunet_w32_8xb2-15e_semantickitti/minkunet_w32_8xb2-15e_semantickitti_20230309_160710-7fa0a6f1.pth) \| [log](https://download.openmmlab.com/mmdetection3d/v1.1.0_models/minkunet/minkunet_w32_8xb2-15e_semantickitti/minkunet_w32_8xb2-15e_semantickitti_20230309_160710.log) |
+
+**Note:** We follow the implementation in the original SPVNAS [repo](https://github.com/mit-han-lab/spvnas); W16, W20, and W32 indicate different numbers of channels.
+
+**Note:** With the TorchSparse backend, model performance is unstable and may fluctuate by about 1.5 mIoU across different random seeds.
+
+## Citation
+
+```latex
+@inproceedings{choy20194d,
+  title={4d spatio-temporal convnets: Minkowski convolutional neural networks},
+  author={Choy, Christopher and Gwak, JunYoung and Savarese, Silvio},
+  booktitle={Proceedings of the IEEE/CVF conference on computer vision and pattern recognition},
+  pages={3075--3084},
+  year={2019}
+}
+```
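
For readers comparing the mIoU column in the SemanticKITTI table of the new `configs/minkunet/README.md`, the following is a minimal sketch of how mean intersection-over-union is commonly computed from per-point labels. It is a generic illustration, not the evaluation code used to produce the reported numbers; the function names, the `ignore_index` convention, and the choice to skip classes that never occur are all assumptions made for this example.

```python
from typing import List, Sequence


def confusion_matrix(gt: Sequence[int], pred: Sequence[int],
                     num_classes: int, ignore_index: int = -1) -> List[List[int]]:
    """Count (ground truth, prediction) pairs; rows are ground-truth classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for g, p in zip(gt, pred):
        if g == ignore_index:  # assumed convention for unlabeled points
            continue
        matrix[g][p] += 1
    return matrix


def mean_iou(matrix: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                       # class c points predicted as another class
        fp = sum(matrix[r][c] for r in range(n)) - tp  # other classes predicted as c
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both labels and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0
```

The table reports mIoU as a percentage, so the fraction returned here would be multiplied by 100. Given the seed-to-seed fluctuation noted in the README, averaging this value over several training runs gives a fairer comparison than a single run.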
