Commit ec42990
[Feature] Support TOOD: Task-aligned One-stage Object Detection (ICCV 2021 Oral) (#6746)
* [Feature] Support TOOD.
* update
* use assign result
* use assign result
* clean assigner
* add config
* add tood head unit test and fix device bug
* test assigner and fix empty gt error
* test hook
* add anchor-based cfg and readme
* update readme
* resolve comments
* resolve comment
* add metafile
* fix model index
* copyright
* resolve comments
* resolve comments
1 parent d3fcce5 commit ec42990

26 files changed, +1573 −15 lines

README.md

Lines changed: 1 addition & 0 deletions
@@ -149,6 +149,7 @@ Results and models are available in the [model zoo](docs/en/model_zoo.md).
 - [x] [YOLOX (ArXiv'2021)](configs/yolox/README.md)
 - [x] [SOLO (ECCV'2020)](configs/solo/README.md)
 - [x] [QueryInst (ICCV'2021)](configs/queryinst/README.md)
+- [x] [TOOD (ICCV'2021)](configs/tood/README.md)
 </details>

 Some other methods are also supported in [projects using MMDetection](./docs/en/projects.md).

README_zh-CN.md

Lines changed: 1 addition & 0 deletions
@@ -146,6 +146,7 @@ MMDetection is an open-source object detection toolbox based on PyTorch. It is [Ope
 - [x] [YOLOX (ArXiv'2021)](configs/yolox/README.md)
 - [x] [SOLO (ECCV'2020)](configs/solo/README.md)
 - [x] [QueryInst (ICCV'2021)](configs/queryinst/README.md)
+- [x] [TOOD (ICCV'2021)](configs/tood/README.md)
 </details>

 Some other supported algorithms are listed in [projects based on MMDetection](./docs/zh_cn/projects.md).

configs/tood/README.md

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
# TOOD: Task-aligned One-stage Object Detection

## Abstract

<!-- [ABSTRACT] -->

One-stage object detection is commonly implemented by optimizing two sub-tasks: object classification and localization, using heads with two parallel branches, which might lead to a certain level of spatial misalignment in predictions between the two tasks. In this work, we propose a Task-aligned One-stage Object Detection (TOOD) that explicitly aligns the two tasks in a learning-based manner. First, we design a novel Task-aligned Head (T-Head) which offers a better balance between learning task-interactive and task-specific features, as well as a greater flexibility to learn the alignment via a task-aligned predictor. Second, we propose Task Alignment Learning (TAL) to explicitly pull closer (or even unify) the optimal anchors for the two tasks during training via a designed sample assignment scheme and a task-aligned loss. Extensive experiments are conducted on MS-COCO, where TOOD achieves a 51.1 AP at single-model single-scale testing. This surpasses the recent one-stage detectors by a large margin, such as ATSS (47.7 AP), GFL (48.2 AP), and PAA (49.0 AP), with fewer parameters and FLOPs. Qualitative results also demonstrate the effectiveness of TOOD for better aligning the tasks of object classification and localization.

<!-- [IMAGE] -->
<div align=center>
<img src="https://user-images.githubusercontent.com/12907710/145400075-e08191f5-8afa-4335-9b3b-27926fc9a26e.png"/>
</div>

<!-- [PAPER_TITLE: TOOD: Task-aligned One-stage Object Detection] -->
<!-- [PAPER_URL: https://arxiv.org/abs/2108.07755] -->

## Citation

<!-- [ALGORITHM] -->

```latex
@inproceedings{feng2021tood,
  title={TOOD: Task-aligned One-stage Object Detection},
  author={Feng, Chengjian and Zhong, Yujie and Gao, Yu and Scott, Matthew R and Huang, Weilin},
  booktitle={ICCV},
  year={2021}
}
```

## Results and Models

| Backbone          | Style   | Anchor Type  | Lr schd | Multi-scale Training | Mem (GB) | Inf time (fps) | box AP | Config | Download |
|:-----------------:|:-------:|:------------:|:-------:|:--------------------:|:--------:|:--------------:|:------:|:------:|:--------:|
| R-50              | pytorch | Anchor-free  | 1x      | N                    | 4.1      |                | 42.4   | [config](./tood_r50_fpn_1x_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/tood/tood_r50_fpn_1x_coco/tood_r50_fpn_1x_coco_20211210_103425-20e20746.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/tood/tood_r50_fpn_1x_coco/tood_r50_fpn_1x_coco_20211210_103425.log) |
| R-50              | pytorch | Anchor-based | 1x      | N                    | 4.1      |                | 42.4   | [config](./tood_r50_fpn_anchor_based_1x_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/tood/tood_r50_fpn_anchor_based_1x_coco/tood_r50_fpn_anchor_based_1x_coco_20211214_100105-b776c134.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/tood/tood_r50_fpn_anchor_based_1x_coco/tood_r50_fpn_anchor_based_1x_coco_20211214_100105.log) |
| R-50              | pytorch | Anchor-free  | 2x      | Y                    | 4.1      |                | 44.5   | [config](./tood_r50_fpn_mstrain_2x_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/tood/tood_r50_fpn_mstrain_2x_coco/tood_r50_fpn_mstrain_2x_coco_20211210_144231-3b23174c.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/tood/tood_r50_fpn_mstrain_2x_coco/tood_r50_fpn_mstrain_2x_coco_20211210_144231.log) |
| R-101             | pytorch | Anchor-free  | 2x      | Y                    | 6.0      |                | 46.1   | [config](./tood_r101_fpn_mstrain_2x_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/tood/tood_r101_fpn_mstrain_2x_coco/tood_r101_fpn_mstrain_2x_coco_20211210_144232-a18f53c8.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/tood/tood_r101_fpn_mstrain_2x_coco/tood_r101_fpn_mstrain_2x_coco_20211210_144232.log) |
| R-101-dcnv2       | pytorch | Anchor-free  | 2x      | Y                    | 6.2      |                | 49.3   | [config](./tood_r101_fpn_dconv_c3-c5_mstrain_2x_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/tood/tood_r101_fpn_dconv_c3-c5_mstrain_2x_coco/tood_r101_fpn_dconv_c3-c5_mstrain_2x_coco_20211210_213728-4a824142.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/tood/tood_r101_fpn_dconv_c3-c5_mstrain_2x_coco/tood_r101_fpn_dconv_c3-c5_mstrain_2x_coco_20211210_213728.log) |
| X-101-64x4d       | pytorch | Anchor-free  | 2x      | Y                    | 10.2     |                | 47.6   | [config](./tood_x101_64x4d_fpn_mstrain_2x_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/tood/tood_x101_64x4d_fpn_mstrain_2x_coco/tood_x101_64x4d_fpn_mstrain_2x_coco_20211211_003519-a4f36113.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/tood/tood_x101_64x4d_fpn_mstrain_2x_coco/tood_x101_64x4d_fpn_mstrain_2x_coco_20211211_003519.log) |
| X-101-64x4d-dcnv2 | pytorch | Anchor-free  | 2x      | Y                    |          |                |        | [config](./tood_x101_64x4d_fpn_dconv_c4-c5_mstrain_2x_coco.py) | [model]() &#124; [log]() |

[1] *1x and 2x mean the model is trained for 90K and 180K iterations, respectively.* \
[2] *All results are obtained with a single model and without any test-time augmentation such as multi-scale testing or flipping.* \
[3] *`dcnv2` denotes deformable convolutional networks v2.*
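
As a quick sanity check, the released checkpoints can be loaded with MMDetection's high-level inference API. A minimal sketch, assuming mmdet v2.20+ is installed, the repo is cloned locally, and the R-50 1x checkpoint from the table above has been downloaded (the `checkpoints/` path is an assumption):

```python
from mmdet.apis import init_detector, inference_detector

config_file = 'configs/tood/tood_r50_fpn_1x_coco.py'
# Checkpoint filename taken from the table above; the local path is illustrative.
checkpoint_file = 'checkpoints/tood_r50_fpn_1x_coco_20211210_103425-20e20746.pth'

model = init_detector(config_file, checkpoint_file, device='cuda:0')
# Returns one array of [x1, y1, x2, y2, score] rows per COCO class.
result = inference_detector(model, 'demo/demo.jpg')
```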

configs/tood/metafile.yml

Lines changed: 95 additions & 0 deletions
@@ -0,0 +1,95 @@
Collections:
  - Name: TOOD
    Metadata:
      Training Data: COCO
      Training Techniques:
        - SGD
      Training Resources: 8x V100 GPUs
      Architecture:
        - TOOD
    Paper:
      URL: https://arxiv.org/abs/2108.07755
      Title: 'TOOD: Task-aligned One-stage Object Detection'
    README: configs/tood/README.md
    Code:
      URL: https://github.com/open-mmlab/mmdetection/blob/v2.20.0/mmdet/models/detectors/tood.py#L7
      Version: v2.20.0

Models:
  - Name: tood_r101_fpn_mstrain_2x_coco
    In Collection: TOOD
    Config: configs/tood/tood_r101_fpn_mstrain_2x_coco.py
    Metadata:
      Training Memory (GB): 6.0
      Epochs: 24
    Results:
      - Task: Object Detection
        Dataset: COCO
        Metrics:
          box AP: 46.1
    Weights: https://download.openmmlab.com/mmdetection/v2.0/tood/tood_r101_fpn_mstrain_2x_coco/tood_r101_fpn_mstrain_2x_coco_20211210_144232-a18f53c8.pth

  - Name: tood_x101_64x4d_fpn_mstrain_2x_coco
    In Collection: TOOD
    Config: configs/tood/tood_x101_64x4d_fpn_mstrain_2x_coco.py
    Metadata:
      Training Memory (GB): 10.2
      Epochs: 24
    Results:
      - Task: Object Detection
        Dataset: COCO
        Metrics:
          box AP: 47.6
    Weights: https://download.openmmlab.com/mmdetection/v2.0/tood/tood_x101_64x4d_fpn_mstrain_2x_coco/tood_x101_64x4d_fpn_mstrain_2x_coco_20211211_003519-a4f36113.pth

  - Name: tood_r101_fpn_dconv_c3-c5_mstrain_2x_coco
    In Collection: TOOD
    Config: configs/tood/tood_r101_fpn_dconv_c3-c5_mstrain_2x_coco.py
    Metadata:
      Training Memory (GB): 6.2
      Epochs: 24
    Results:
      - Task: Object Detection
        Dataset: COCO
        Metrics:
          box AP: 49.3
    Weights: https://download.openmmlab.com/mmdetection/v2.0/tood/tood_r101_fpn_dconv_c3-c5_mstrain_2x_coco/tood_r101_fpn_dconv_c3-c5_mstrain_2x_coco_20211210_213728-4a824142.pth

  - Name: tood_r50_fpn_anchor_based_1x_coco
    In Collection: TOOD
    Config: configs/tood/tood_r50_fpn_anchor_based_1x_coco.py
    Metadata:
      Training Memory (GB): 4.1
      Epochs: 12
    Results:
      - Task: Object Detection
        Dataset: COCO
        Metrics:
          box AP: 42.4
    Weights: https://download.openmmlab.com/mmdetection/v2.0/tood/tood_r50_fpn_anchor_based_1x_coco/tood_r50_fpn_anchor_based_1x_coco_20211214_100105-b776c134.pth

  - Name: tood_r50_fpn_1x_coco
    In Collection: TOOD
    Config: configs/tood/tood_r50_fpn_1x_coco.py
    Metadata:
      Training Memory (GB): 4.1
      Epochs: 12
    Results:
      - Task: Object Detection
        Dataset: COCO
        Metrics:
          box AP: 42.4
    Weights: https://download.openmmlab.com/mmdetection/v2.0/tood/tood_r50_fpn_1x_coco/tood_r50_fpn_1x_coco_20211210_103425-20e20746.pth

  - Name: tood_r50_fpn_mstrain_2x_coco
    In Collection: TOOD
    Config: configs/tood/tood_r50_fpn_mstrain_2x_coco.py
    Metadata:
      Training Memory (GB): 4.1
      Epochs: 24
    Results:
      - Task: Object Detection
        Dataset: COCO
        Metrics:
          box AP: 44.5
    Weights: https://download.openmmlab.com/mmdetection/v2.0/tood/tood_r50_fpn_mstrain_2x_coco/tood_r50_fpn_mstrain_2x_coco_20211210_144231-3b23174c.pth
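
This metafile is what MMDetection's model-index tooling consumes. A small sketch (assuming PyYAML is available) that lists each TOOD model and its reported box AP:

```python
import yaml

with open('configs/tood/metafile.yml') as f:
    meta = yaml.safe_load(f)

# Each Models entry carries its config path, training metadata and results.
for m in meta['Models']:
    box_ap = m['Results'][0]['Metrics']['box AP']
    print(f"{m['Name']}: box AP {box_ap} ({m['Metadata']['Epochs']} epochs)")
```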

configs/tood/tood_r101_fpn_dconv_c3-c5_mstrain_2x_coco.py

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
_base_ = './tood_r101_fpn_mstrain_2x_coco.py'

model = dict(
    backbone=dict(
        dcn=dict(type='DCNv2', deformable_groups=1, fallback_on_stride=False),
        stage_with_dcn=(False, True, True, True)),
    bbox_head=dict(num_dcn=2))

configs/tood/tood_r101_fpn_mstrain_2x_coco.py

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
_base_ = './tood_r50_fpn_mstrain_2x_coco.py'

model = dict(
    backbone=dict(
        depth=101,
        init_cfg=dict(type='Pretrained',
                      checkpoint='torchvision://resnet101')))

configs/tood/tood_r50_fpn_1x_coco.py

Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
_base_ = [
    '../_base_/datasets/coco_detection.py',
    '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
]
model = dict(
    type='TOOD',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=True,
        style='pytorch',
        init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        start_level=1,
        add_extra_convs='on_output',
        num_outs=5),
    bbox_head=dict(
        type='TOODHead',
        num_classes=80,
        in_channels=256,
        stacked_convs=6,
        feat_channels=256,
        anchor_type='anchor_free',
        anchor_generator=dict(
            type='AnchorGenerator',
            ratios=[1.0],
            octave_base_scale=8,
            scales_per_octave=1,
            strides=[8, 16, 32, 64, 128]),
        bbox_coder=dict(
            type='DeltaXYWHBBoxCoder',
            target_means=[.0, .0, .0, .0],
            target_stds=[0.1, 0.1, 0.2, 0.2]),
        initial_loss_cls=dict(
            type='FocalLoss',
            use_sigmoid=True,
            activated=True,  # use probability instead of logit as input
            gamma=2.0,
            alpha=0.25,
            loss_weight=1.0),
        loss_cls=dict(
            type='QualityFocalLoss',
            use_sigmoid=True,
            activated=True,  # use probability instead of logit as input
            beta=2.0,
            loss_weight=1.0),
        loss_bbox=dict(type='GIoULoss', loss_weight=2.0)),
    train_cfg=dict(
        initial_epoch=4,
        initial_assigner=dict(type='ATSSAssigner', topk=9),
        assigner=dict(type='TaskAlignedAssigner', topk=13),
        alpha=1,
        beta=6,
        allowed_border=-1,
        pos_weight=-1,
        debug=False),
    test_cfg=dict(
        nms_pre=1000,
        min_bbox_size=0,
        score_thr=0.05,
        nms=dict(type='nms', iou_threshold=0.6),
        max_per_img=100))
# optimizer
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)

# custom hooks
custom_hooks = [dict(type='SetEpochInfoHook')]
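
The `alpha=1` / `beta=6` pair in `train_cfg` parameterizes the task alignment metric from the paper, t = s^alpha * u^beta, where s is the classification score for the ground-truth class and u is the IoU between the predicted and ground-truth boxes; `TaskAlignedAssigner` ranks its `topk=13` candidates by this metric once the ATSS warm-up ends after `initial_epoch=4`. A minimal sketch of the metric itself (not the assigner):

```python
import torch

def task_alignment_metric(cls_score, iou, alpha=1.0, beta=6.0):
    """t = s**alpha * u**beta, TOOD's per-anchor alignment metric.

    cls_score: predicted score for the GT class at each anchor, in [0, 1]
    iou:       IoU between each anchor's predicted box and its GT box, in [0, 1]
    """
    return cls_score.pow(alpha) * iou.pow(beta)

# With beta=6, a confident but poorly localized anchor ranks low:
# 0.9 * 0.5**6 ~= 0.014, so localization quality dominates the ranking.
t = task_alignment_metric(torch.tensor([0.9]), torch.tensor([0.5]))
```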

configs/tood/tood_r50_fpn_anchor_based_1x_coco.py

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
_base_ = './tood_r50_fpn_1x_coco.py'
model = dict(bbox_head=dict(anchor_type='anchor_based'))
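
This two-line override works because mmcv merges child configs into `_base_` recursively, so everything except `anchor_type` is inherited from `tood_r50_fpn_1x_coco.py`. A quick way to verify the merged result, assuming mmcv is installed and the command runs from the repo root:

```python
from mmcv import Config

cfg = Config.fromfile('configs/tood/tood_r50_fpn_anchor_based_1x_coco.py')
print(cfg.model.bbox_head.anchor_type)  # 'anchor_based' (overridden here)
print(cfg.model.bbox_head.type)         # 'TOODHead' (inherited from the base)
```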

configs/tood/tood_r50_fpn_mstrain_2x_coco.py

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
_base_ = './tood_r50_fpn_1x_coco.py'
# learning policy
lr_config = dict(step=[16, 22])
runner = dict(type='EpochBasedRunner', max_epochs=24)
# multi-scale training
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(
        type='Resize',
        img_scale=[(1333, 480), (1333, 800)],
        multiscale_mode='range',
        keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
data = dict(train=dict(pipeline=train_pipeline))
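
With `multiscale_mode='range'`, the `Resize` step resamples the training scale every iteration: the short edge is drawn uniformly from [480, 800] while the long edge stays capped at 1333. A simplified sketch of that sampling, not mmdet's actual `Resize` implementation:

```python
import random

def sample_train_scale(img_scale=((1333, 480), (1333, 800))):
    # 'range' mode (simplified): both endpoints share the long-edge cap,
    # so only the short edge varies between 480 and 800.
    long_edge = img_scale[0][0]
    short_edge = random.randint(img_scale[0][1], img_scale[1][1])
    return (long_edge, short_edge)

print(sample_train_scale())  # e.g. (1333, 652)
```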

configs/tood/tood_x101_64x4d_fpn_dconv_c4-c5_mstrain_2x_coco.py

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
_base_ = './tood_x101_64x4d_fpn_mstrain_2x_coco.py'
model = dict(
    backbone=dict(
        dcn=dict(type='DCNv2', deformable_groups=1, fallback_on_stride=False),
        stage_with_dcn=(False, False, True, True),
    ),
    bbox_head=dict(num_dcn=2))
