[Docs] Add BEV-based detection pipeline in NuScenes Dataset tutorial (#2672)
* update the part of in doc of nuScenes dataset
* update nuScenes tutorial
* add alternative bev sample code and necessary description for the nuscenes dataset
* update nuscenes tutorial
* update nuscenes tutorial
* update nuscenes tutorial
* use two subsections to introduce monocular and BEV
* use two subsections to introduce monocular and BEV
* use two subsections to introduce monocular and BEV
* update NuScenes dataset BEV based tutorial
* update NuScenes dataset BEV based tutorial
docs/en/advanced_guides/datasets/nuscenes.md
Lines changed: 65 additions & 1 deletion
@@ -153,7 +153,9 @@ Intensity is not used by default due to its yielded noise when concatenating the
### Vision-Based Methods
-A typical training pipeline of image-based 3D detection on nuScenes is as below.
+#### Monocular-based
+
+On nuScenes, which provides multi-view images, this paradigm typically runs 3D object detection on each image separately and then merges the per-image results through post-processing (such as NMS) to obtain the final detections. In essence, it extends monocular 3D detection directly to the multi-view setting. A typical training pipeline for image-based monocular 3D detection on nuScenes is as below.
```python
train_pipeline = [
@@ -184,6 +186,68 @@ It follows the general pipeline of 2D detection while differs in some details:
- Some data augmentation techniques need to be adjusted, such as `RandomFlip3D`.
Currently we do not support more augmentation methods, because how to transfer and apply other techniques is still underexplored.
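
The monocular paradigm described above (detect per image, then merge across views in post-processing) can be illustrated with a minimal sketch. This is not the MMDetection3D implementation: the function name `merge_multiview_detections`, the `(x, y, score, label)` tuple layout, and the `dist_thresh` parameter are all hypothetical, and a greedy center-distance suppression stands in for a real rotated-box NMS. It also assumes every detection has already been transformed into a shared ego frame.

```python
import math


def merge_multiview_detections(per_camera_dets, dist_thresh=1.0):
    """Merge per-camera detections with a greedy center-distance suppression.

    per_camera_dets: dict mapping camera name -> list of (x, y, score, label)
    tuples, with x and y assumed to be in a shared ego frame (hypothetical
    layout chosen for illustration).
    """
    # Pool the detections from every camera into a single list.
    pooled = [det for dets in per_camera_dets.values() for det in dets]
    # Visit the highest-scoring detections first.
    pooled.sort(key=lambda d: d[2], reverse=True)

    kept = []
    for x, y, score, label in pooled:
        # Suppress a detection if a kept one of the same class is close by.
        duplicate = any(
            label == k_label and math.hypot(x - kx, y - ky) < dist_thresh
            for kx, ky, _, k_label in kept
        )
        if not duplicate:
            kept.append((x, y, score, label))
    return kept


# The same car is seen by two overlapping cameras; only one copy should remain.
dets = {
    "CAM_FRONT": [(10.0, 0.5, 0.9, "car"), (25.0, -3.0, 0.6, "pedestrian")],
    "CAM_FRONT_LEFT": [(10.2, 0.6, 0.7, "car")],
}
merged = merge_multiview_detections(dets)
print(merged)
```

The sketch only shows why a cross-view merge step is needed: objects in the overlap between adjacent cameras are detected more than once. A production system would typically suppress duplicates using box overlap rather than center distance.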
+#### BEV-based
+
+BEV (Bird's-Eye-View) is another popular 3D detection paradigm. It takes the multi-view images directly as input to perform 3D detection. For nuScenes, these views are the front (`CAM_FRONT`), front-left (`CAM_FRONT_LEFT`), front-right (`CAM_FRONT_RIGHT`), back (`CAM_BACK`), back-left (`CAM_BACK_LEFT`), and back-right (`CAM_BACK_RIGHT`) cameras. A basic training pipeline for BEV-based 3D detection on nuScenes is as below.