[Feature] Support Class Aware Sampler (#7436)

BIGWangYuDong · web-flow · commit 4aaaf4dccfe2 · 2022-04-23T10:10:00.000+08:00
* [Feature] Support Class Aware Sampler

* minor fix

* minor fix

* rename get_label_dict to get_index_dict

* fix cas logic

* minor fix

* minor fix

* minor fix

* minor fix

* minor fix
diff --git a/configs/openimages/README.md b/configs/openimages/README.md
@@ -1,6 +1,8 @@
 # Open Images Dataset
-<!-- [DATASET] -->
 
+> [Open Images Dataset](https://arxiv.org/abs/1811.00982)
+
+<!-- [DATASET] -->
 ## Abstract
 
 <!-- [ABSTRACT] -->
@@ -90,14 +92,14 @@ training/testing by using `tools/misc/get_image_metas.py`.
     │   │   │   ├── class-descriptions-boxable.csv
     │   │   │   ├── oidv6-train-annotations-bbox.csv
     │   │   │   ├── validation-annotations-bbox.csv
-    │   │   │   ├── validation-annotations-human-imagelabels-boxable.csv    # is not necessary
+    │   │   │   ├── validation-annotations-human-imagelabels-boxable.csv
     │   │   │   ├── validation-image-metas.pkl      # get from script
     │   │   ├── challenge2019
     │   │   │   ├── challenge-2019-train-detection-bbox.txt
     │   │   │   ├── challenge-2019-validation-detection-bbox.txt
     │   │   │   ├── class_label_tree.np
     │   │   │   ├── class_sample_train.pkl
-    │   │   │   ├── challenge-2019-validation-detection-human-imagelabels.csv       # download from official website, not necessary
+    │   │   │   ├── challenge-2019-validation-detection-human-imagelabels.csv       # download from official website
     │   │   │   ├── challenge-2019-validation-metas.pkl     # get from script
     │   │   ├── OpenImages
     │   │   │   ├── train           # training images
@@ -112,14 +114,30 @@ Open Images v6, but the test images are different.
 You can also download the annotations from [official website](https://storage.googleapis.com/openimages/web/challenge2019_downloads.html),
 and set data.train.type=OpenImagesDataset, data.val.type=OpenImagesDataset, and data.test.type=OpenImagesDataset in the config
 3. If users do not want to use `validation-annotations-human-imagelabels-boxable.csv` and `challenge-2019-validation-detection-human-imagelabels.csv`
-users can should set `data.val.load_image_level_labels=False` and `data.test.load_image_level_labels=False` in the config .
-
+users can set `data.val.load_image_level_labels=False` and `data.test.load_image_level_labels=False` in the config.
+Please note that loading image-level labels is the default behavior of the Open Images evaluation metric.
+For more details, please refer to the [official website](https://storage.googleapis.com/openimages/web/evaluation.html).
 
 ## Results and Models
 
 | Architecture | Backbone  | Style   | Lr schd | Sampler | Mem (GB) | Inf time (fps) | box AP | Config | Download |
 |:------------:|:---------:|:-------:|:-------:|:-------:|:--------:|:--------------:|:------:|:------:|:--------:|
 | Faster R-CNN | R-50      | pytorch | 1x      |     Group Sampler    |  7.7   | -          | 51.6 |[config](https://github.com/open-mmlab/mmdetection/tree/master/configs/openimages/faster_rcnn_r50_fpn_32x2_1x_openimages.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/openimages/faster_rcnn_r50_fpn_32x2_1x_openimages/faster_rcnn_r50_fpn_32x2_1x_openimages_20211130_231159-e87ab7ce.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/openimages/faster_rcnn_r50_fpn_32x2_1x_openimages/faster_rcnn_r50_fpn_32x2_1x_openimages_20211130_231159.log.json) |
-| Faster R-CNN (Challenge 2019) | R-50  | pytorch | 1x |   Group Sampler  |  7.7  | -          | 54.5 |[config](https://github.com/open-mmlab/mmdetection/tree/master/configs/openimages/faster_rcnn_r50_fpn_32x2_1x_openimages_challenge.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/openimages/faster_rcnn_r50_fpn_32x2_1x_openimages_challenge/faster_rcnn_r50_fpn_32x2_1x_openimages_challenge_20211229_071252-46380cde.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/openimages/faster_rcnn_r50_fpn_32x2_1x_openimages_challenge/faster_rcnn_r50_fpn_32x2_1x_openimages_challenge_20211229_071252.log.json) |
+| Faster R-CNN | R-50      | pytorch | 1x      |     Class Aware Sampler    |  7.7   | -          | 60.0 |[config](https://github.com/open-mmlab/mmdetection/tree/master/configs/openimages/faster_rcnn_r50_fpn_32x2_cas_1x_openimages.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/openimages/faster_rcnn_r50_fpn_32x2_cas_1x_openimages/faster_rcnn_r50_fpn_32x2_cas_1x_openimages_20220306_202424-98c630e5.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/openimages/faster_rcnn_r50_fpn_32x2_1x_openimages/faster_rcnn_r50_fpn_32x2_cas_1x_openimages_20220306_202424.log.json) |
+| Faster R-CNN (Challenge 2019) | R-50  | pytorch | 1x |   Group Sampler  |  7.7  | -          | 54.9 |[config](https://github.com/open-mmlab/mmdetection/tree/master/configs/openimages/faster_rcnn_r50_fpn_32x2_1x_openimages_challenge.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/openimages/faster_rcnn_r50_fpn_32x2_1x_openimages_challenge/faster_rcnn_r50_fpn_32x2_1x_openimages_challenge_20220114_045100-0e79e5df.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/openimages/faster_rcnn_r50_fpn_32x2_1x_openimages_challenge/faster_rcnn_r50_fpn_32x2_1x_openimages_challenge_20220114_045100.log.json) |
+| Faster R-CNN (Challenge 2019) | R-50  | pytorch | 1x |   Class Aware Sampler  |  7.1  | -          | 65.0 |[config](https://github.com/open-mmlab/mmdetection/tree/master/configs/openimages/faster_rcnn_r50_fpn_32x2_cas_1x_openimages_challenge.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/openimages/faster_rcnn_r50_fpn_32x2_cas_1x_openimages_challenge/faster_rcnn_r50_fpn_32x2_cas_1x_openimages_challenge_20220221_192021-34c402d9.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/openimages/faster_rcnn_r50_fpn_32x2_cas_1x_openimages_challenge/faster_rcnn_r50_fpn_32x2_cas_1x_openimages_challenge_20220221_192021.log.json) |
 | Retinanet    | R-50      | pytorch | 1x      |    Group Sampler     |  6.6   | -          | 61.5 |[config](https://github.com/open-mmlab/mmdetection/tree/master/configs/openimages/retinanet_r50_fpn_32x2_1x_openimages.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/openimages/retinanet_r50_fpn_32x2_1x_openimages/retinanet_r50_fpn_32x2_1x_openimages_20211223_071954-d2ae5462.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/openimages/retinanet_r50_fpn_32x2_1x_openimages/retinanet_r50_fpn_32x2_1x_openimages_20211223_071954.log.json) |
-| SSD          | VGG16     | pytorch | 36e     |    Group Sampler     |  10.8  | -          | 35.4 |[config](https://github.com/open-mmlab/mmdetection/tree/master/configs/openimages/ssd300_32x8_36e_openimages.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/openimages/ssd300_32x8_36e_openimages/ssd300_32x8_36e_openimages_20211224_000232-dce93846.pth) &#124; [log](ttps://download.openmmlab.com/mmdetection/v2.0/openimages/ssd300_32x8_36e_openimages/ssd300_32x8_36e_openimages_20211224_000232.log.json) |
+| SSD          | VGG16     | pytorch | 36e     |    Group Sampler     |  10.8  | -          | 35.4 |[config](https://github.com/open-mmlab/mmdetection/tree/master/configs/openimages/ssd300_32x8_36e_openimages.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/openimages/ssd300_32x8_36e_openimages/ssd300_32x8_36e_openimages_20211224_000232-dce93846.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/openimages/ssd300_32x8_36e_openimages/ssd300_32x8_36e_openimages_20211224_000232.log.json) |
+
+**Notes:**
+
+- 'cas' is short for 'Class Aware Sampler'
+
+### Results of considering image-level labels
+
+| Architecture | Sampler | Consider Image Level Labels | box AP|
+|:------------:|:-------:|:---------------------------:|:-----:|
+|Faster R-CNN r50 (Challenge 2019)| Group Sampler| w/o | 62.19 |
+|Faster R-CNN r50 (Challenge 2019)| Group Sampler| w/ | 54.87 |
+|Faster R-CNN r50 (Challenge 2019)| Class Aware Sampler| w/o | 71.77 |
+|Faster R-CNN r50 (Challenge 2019)| Class Aware Sampler| w/ | 64.98 |
diff --git a/configs/openimages/faster_rcnn_r50_fpn_32x2_cas_1x_openimages.py b/configs/openimages/faster_rcnn_r50_fpn_32x2_cas_1x_openimages.py
@@ -0,0 +1,5 @@
+_base_ = ['faster_rcnn_r50_fpn_32x2_1x_openimages.py']
+
+# Use ClassAwareSampler
+data = dict(
+    train_dataloader=dict(class_aware_sampler=dict(num_sample_class=1)))
diff --git a/configs/openimages/faster_rcnn_r50_fpn_32x2_cas_1x_openimages_challenge.py b/configs/openimages/faster_rcnn_r50_fpn_32x2_cas_1x_openimages_challenge.py
@@ -0,0 +1,5 @@
+_base_ = ['faster_rcnn_r50_fpn_32x2_1x_openimages_challenge.py']
+
+# Use ClassAwareSampler
+data = dict(
+    train_dataloader=dict(class_aware_sampler=dict(num_sample_class=1)))
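Both configs above only switch the sampler on via `class_aware_sampler=dict(num_sample_class=1)`; the sampler's own source is not shown in this diff. The following is a minimal sketch of the general class-aware sampling idea, not mmdet's implementation. It assumes a per-class image index of the kind `get_cat2imgs` returns, and it assumes `num_sample_class` means the number of images drawn from a class each time that class is visited. The function name `class_aware_indices` is hypothetical.

```python
import random


def class_aware_indices(cat2imgs, num_samples, num_sample_class=1, seed=0):
    """Illustrative class-aware sampling (NOT mmdet's implementation).

    Visits the non-empty classes in a freshly shuffled order on every pass
    and, for each class visited, draws `num_sample_class` image indices
    from that class (with replacement).
    """
    rng = random.Random(seed)
    # Classes without any image cannot be sampled, so skip them.
    classes = [c for c, imgs in cat2imgs.items() if imgs]
    if not classes:
        return []
    indices = []
    while len(indices) < num_samples:
        rng.shuffle(classes)
        for cat in classes:
            for _ in range(num_sample_class):
                indices.append(rng.choice(cat2imgs[cat]))
    return indices[:num_samples]


# Toy long-tailed data: class 0 appears in 8 images, class 1 in only one,
# and class 2 in none.
cat2imgs = {0: [0, 1, 2, 3, 4, 5, 6, 7], 1: [8], 2: []}
sampled = class_aware_indices(cat2imgs, num_samples=10)
rare_count = sampled.count(8)  # image 8 is the only image of class 1
```

In this toy run every pass draws one image per non-empty class, so the single image of the rare class fills half of the ten sampled slots, whereas uniform sampling over images would pick it about one time in nine. This rebalancing toward rare classes is the effect the sampler is meant to provide on long-tailed data.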
diff --git a/configs/openimages/metafile.yml b/configs/openimages/metafile.yml
@@ -1,20 +1,14 @@
-Collections:
-  - Name: Open Images Dataset
-    Paper:
-      URL: https://arxiv.org/abs/1811.00982
-      Title: 'The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale'
-    README: configs/openimages/README.md
-    Code:
-      URL: https://github.com/open-mmlab/mmdetection/blob/v2.20.0/mmdet/datasets/openimages.py#L21
-      Version: v2.20.0
-
 Models:
   - Name: faster_rcnn_r50_fpn_32x2_1x_openimages
-    In Collection: Open Images Dataset
+    In Collection: Faster R-CNN
     Config: configs/openimages/faster_rcnn_r50_fpn_32x2_1x_openimages.py
     Metadata:
       Training Memory (GB): 7.7
       Epochs: 12
+      Training Data: Open Images v6
+      Training Techniques:
+        - SGD with Momentum
+        - Weight Decay
     Results:
       - Task: Object Detection
         Dataset: Open Images v6
@@ -23,11 +17,15 @@ Models:
     Weights: https://download.openmmlab.com/mmdetection/v2.0/openimages/faster_rcnn_r50_fpn_32x2_1x_openimages/faster_rcnn_r50_fpn_32x2_1x_openimages_20211130_231159-e87ab7ce.pth
 
   - Name: retinanet_r50_fpn_32x2_1x_openimages
-    In Collection: Open Images Dataset
+    In Collection: RetinaNet
     Config: configs/openimages/retinanet_r50_fpn_32x2_1x_openimages.py
     Metadata:
       Training Memory (GB): 6.6
       Epochs: 12
+      Training Data: Open Images v6
+      Training Techniques:
+        - SGD with Momentum
+        - Weight Decay
     Results:
       - Task: Object Detection
         Dataset: Open Images v6
@@ -36,11 +34,15 @@ Models:
     Weights: https://download.openmmlab.com/mmdetection/v2.0/openimages/retinanet_r50_fpn_32x2_1x_openimages/retinanet_r50_fpn_32x2_1x_openimages_20211223_071954-d2ae5462.pth
 
   - Name: ssd300_32x8_36e_openimages
-    In Collection: Open Images Dataset
+    In Collection: SSD
     Config: configs/openimages/ssd300_32x8_36e_openimages
     Metadata:
       Training Memory (GB): 10.8
-      Epochs: 12
+      Epochs: 36
+      Training Data: Open Images v6
+      Training Techniques:
+        - SGD with Momentum
+        - Weight Decay
     Results:
       - Task: Object Detection
         Dataset: Open Images v6
@@ -49,14 +51,52 @@ Models:
     Weights: https://download.openmmlab.com/mmdetection/v2.0/openimages/ssd300_32x8_36e_openimages/ssd300_32x8_36e_openimages_20211224_000232-dce93846.pth
 
   - Name: faster_rcnn_r50_fpn_32x2_1x_openimages_challenge
-    In Collection: Open Images Dataset
+    In Collection: Faster R-CNN
     Config: configs/openimages/faster_rcnn_r50_fpn_32x2_1x_openimages_challenge.py
     Metadata:
       Training Memory (GB): 7.7
       Epochs: 12
+      Training Data: Open Images Challenge 2019
+      Training Techniques:
+        - SGD with Momentum
+        - Weight Decay
+    Results:
+      - Task: Object Detection
+        Dataset: Open Images Challenge 2019
+        Metrics:
+          box AP: 54.9
+    Weights: https://download.openmmlab.com/mmdetection/v2.0/openimages/faster_rcnn_r50_fpn_32x2_1x_openimages_challenge/faster_rcnn_r50_fpn_32x2_1x_openimages_challenge_20220114_045100-0e79e5df.pth
+
+  - Name: faster_rcnn_r50_fpn_32x2_cas_1x_openimages
+    In Collection: Faster R-CNN
+    Config: configs/openimages/faster_rcnn_r50_fpn_32x2_cas_1x_openimages.py
+    Metadata:
+      Training Memory (GB): 7.7
+      Epochs: 12
+      Training Data: Open Images v6
+      Training Techniques:
+        - SGD with Momentum
+        - Weight Decay
+    Results:
+      - Task: Object Detection
+        Dataset: Open Images v6
+        Metrics:
+          box AP: 60.0
+    Weights: https://download.openmmlab.com/mmdetection/v2.0/openimages/faster_rcnn_r50_fpn_32x2_cas_1x_openimages/faster_rcnn_r50_fpn_32x2_cas_1x_openimages_20220306_202424-98c630e5.pth
+
+  - Name: faster_rcnn_r50_fpn_32x2_cas_1x_openimages_challenge
+    In Collection: Faster R-CNN
+    Config: configs/openimages/faster_rcnn_r50_fpn_32x2_cas_1x_openimages_challenge.py
+    Metadata:
+      Training Memory (GB): 7.1
+      Epochs: 12
+      Training Data: Open Images Challenge 2019
+      Training Techniques:
+        - SGD with Momentum
+        - Weight Decay
     Results:
       - Task: Object Detection
-        Dataset: Open Images Challenge 2019W
+        Dataset: Open Images Challenge 2019
         Metrics:
-          box AP: 54.5
-    Weights: https://download.openmmlab.com/mmdetection/v2.0/openimages/faster_rcnn_r50_fpn_32x2_1x_openimages_challenge/faster_rcnn_r50_fpn_32x2_1x_openimages_challenge_20211229_071252-46380cde.pth
+          box AP: 65.0
+    Weights: https://download.openmmlab.com/mmdetection/v2.0/openimages/faster_rcnn_r50_fpn_32x2_cas_1x_openimages_challenge/faster_rcnn_r50_fpn_32x2_cas_1x_openimages_challenge_20220221_192021-34c402d9.pth
diff --git a/mmdet/datasets/builder.py b/mmdet/datasets/builder.py
@@ -12,8 +12,8 @@
 from mmcv.utils import TORCH_VERSION, Registry, build_from_cfg, digit_version
 from torch.utils.data import DataLoader
 
-from .samplers import (DistributedGroupSampler, DistributedSampler,
-                       GroupSampler, InfiniteBatchSampler,
+from .samplers import (ClassAwareSampler, DistributedGroupSampler,
+                       DistributedSampler, GroupSampler, InfiniteBatchSampler,
                        InfiniteGroupBatchSampler)
 
 if platform.system() != 'Windows':
@@ -93,6 +93,7 @@ def build_dataloader(dataset,
                      seed=None,
                      runner_type='EpochBasedRunner',
                      persistent_workers=False,
+                     class_aware_sampler=None,
                      **kwargs):
     """Build PyTorch DataLoader.
 
@@ -115,6 +116,8 @@ def build_dataloader(dataset,
             the worker processes after a dataset has been consumed once.
             This allows to maintain the workers `Dataset` instances alive.
             This argument is only valid when PyTorch>=1.7.0. Default: False.
+        class_aware_sampler (dict, optional): Config for `ClassAwareSampler`;
+            if None, the sampler is not used. Default: None.
         kwargs: any keyword argument to be used to initialize DataLoader
 
     Returns:
@@ -153,7 +156,18 @@ def build_dataloader(dataset,
         batch_size = 1
         sampler = None
     else:
-        if dist:
+        if class_aware_sampler is not None:
+            # ClassAwareSampler can be used in both distributed and
+            # non-distributed training.
+            num_sample_class = class_aware_sampler.get('num_sample_class', 1)
+            sampler = ClassAwareSampler(
+                dataset,
+                samples_per_gpu,
+                world_size,
+                rank,
+                seed=seed,
+                num_sample_class=num_sample_class)
+        elif dist:
             # DistributedGroupSampler will definitely shuffle the data to
             # satisfy that images on each GPU are in the same group
             if shuffle:
diff --git a/mmdet/datasets/custom.py b/mmdet/datasets/custom.py
@@ -285,6 +285,25 @@ def get_classes(cls, classes=None):
 
         return class_names
 
+    def get_cat2imgs(self):
+        """Get a dict with class as key and img_ids as values, which will be
+        used in :class:`ClassAwareSampler`.
+
+        Returns:
+            dict[int, list[int]]: A dict of per-label image lists. Each key
+            is a label index, and its value is the list of indices of the
+            images that contain that label.
+        """
+        if self.CLASSES is None:
+            raise ValueError('self.CLASSES can not be None')
+        # sort the label index
+        cat2imgs = {i: [] for i in range(len(self.CLASSES))}
+        for i in range(len(self)):
+            cat_ids = set(self.get_cat_ids(i))
+            for cat in cat_ids:
+                cat2imgs[cat].append(i)
+        return cat2imgs
+
     def format_results(self, results, **kwargs):
         """Place holder to format result to dataset specific output."""
 
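To make the shape of the `get_cat2imgs` return value concrete, here is a small self-contained sketch. `ToyDataset` is a hypothetical stand-in for illustration only (it is not an mmdet class); its `get_cat2imgs` body mirrors the logic of the method added in this hunk.

```python
class ToyDataset:
    """Hypothetical stand-in exposing CLASSES, __len__ and get_cat_ids."""

    CLASSES = ('cat', 'dog', 'bird')

    def __init__(self, labels_per_image):
        # labels_per_image[i] holds the label indices annotated in image i.
        self.labels_per_image = labels_per_image

    def __len__(self):
        return len(self.labels_per_image)

    def get_cat_ids(self, idx):
        return self.labels_per_image[idx]

    def get_cat2imgs(self):
        # Same logic as the method added in the diff: one entry per class,
        # each image listed at most once per class thanks to set().
        cat2imgs = {i: [] for i in range(len(self.CLASSES))}
        for i in range(len(self)):
            for cat in set(self.get_cat_ids(i)):
                cat2imgs[cat].append(i)
        return cat2imgs


# Image 0 has two 'cat' boxes and one 'dog' box, image 2 has no boxes.
dataset = ToyDataset([[0, 0, 1], [1], [], [0]])
cat2imgs = dataset.get_cat2imgs()
print(cat2imgs)  # {0: [0, 3], 1: [0, 1], 2: []}
```

Note that a class with no images still gets an empty list, so a consumer of this dict has to handle empty entries itself.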
diff --git a/mmdet/datasets/openimages.py b/mmdet/datasets/openimages.py
@@ -601,6 +601,17 @@ def denormalize_gt_bboxes(self, annotations):
             annotations[i]['bboxes'][:, 1::2] *= h
         return annotations
 
+    def get_cat_ids(self, idx):
+        """Get category ids by index.
+
+        Args:
+            idx (int): Index of data.
+
+        Returns:
+            list[int]: All categories in the image of specified index.
+        """
+        return self.get_ann_info(idx)['labels'].astype(int).tolist()
+
     def evaluate(self,
                  results,
                  metric='mAP',
diff --git a/mmdet/datasets/pipelines/loading.py b/mmdet/datasets/pipelines/loading.py
@@ -256,12 +256,11 @@ def _load_bboxes(self, results):
         results['gt_bboxes'] = ann_info['bboxes'].copy()
 
         if self.denorm_bbox:
-            h, w = results['img_shape'][:2]
             bbox_num = results['gt_bboxes'].shape[0]
             if bbox_num != 0:
+                h, w = results['img_shape'][:2]
                 results['gt_bboxes'][:, 0::2] *= w
                 results['gt_bboxes'][:, 1::2] *= h
-            results['gt_bboxes'] = results['gt_bboxes'].astype(np.float32)
 
         gt_bboxes_ignore = ann_info.get('bboxes_ignore', None)
         if gt_bboxes_ignore is not None:
diff --git a/mmdet/datasets/samplers/__init__.py b/mmdet/datasets/samplers/__init__.py
@@ -1,9 +1,10 @@
 # Copyright (c) OpenMMLab. All rights reserved.
+from .class_aware_sampler import ClassAwareSampler
 from .distributed_sampler import DistributedSampler
 from .group_sampler import DistributedGroupSampler, GroupSampler
 from .infinite_sampler import InfiniteBatchSampler, InfiniteGroupBatchSampler
 
 __all__ = [
     'DistributedSampler', 'DistributedGroupSampler', 'GroupSampler',
-    'InfiniteGroupBatchSampler', 'InfiniteBatchSampler'
+    'InfiniteGroupBatchSampler', 'InfiniteBatchSampler', 'ClassAwareSampler'
 ]
diff --git a/mmdet/datasets/samplers/class_aware_sampler.py b/mmdet/datasets/samplers/class_aware_sampler.py
diff --git a/model-index.yml b/model-index.yml