
Commit 373a7a1

oindrilasaha, harsha-simhadri, and Ubuntu authored
Merge RNNPool Codes (#201)
* added bidirectional
* bidirectional in BaseRNN
* updated all rnn for new bidirectional
* debugging for fastgrnncuda
* fix for fastgrnncuda
* visual wakeword
* visual wakewords evaluation
* visual wakeword evaluation
* updated readme for eval
* face detection
* update face detection
* rnn edit
* test update
* model loading change
* update readme
* update eval tools in readme
* update train
* readme
* readme
* readme
* requirements
* requirements
* readme
* rnn
* update s3fd_net
* update s3fd_net
* update s3fd_net
* train
* train
* add additional args
* add additional args
* arg changes in wider_test
* added arg for using new ckpt
* readme
* remove old modelsupport
* readme
* readme
* requirements
* readme
* eval on all format
* support for calculating MAP
* update readme
* Update README.md
* readme
* readme
* readme
* readme
* fix for warnings
* readme
* readme scores
* add dump weights and traces support
* readme
* remove eval warnings
* eval remove import warnings
* readme changes
* readme changes
* support for qvga monochrome
* readme update
* readme update
* Update README.md
* readme update
* environment key update
* config files
* update both config files text
* change architecture
* readme update
* quantized cpp rnnpool
* Update README.md
* Update README.md
* Update README.md
* smaller model for qvga
* Update RPool_Face_QVGA_monochrome.py
* update to ssd code
* update to init
* update to init
* update to dataloader
* tf code for face detection
* tf code for face detection
* add tf face detection code
* eval file
* fix weights and detect function
* Delete factory.py
* Update RPool_Face_QVGA_monochrome.py
* Update RPool_Face_QVGA_monochrome.py
* Update RPool_Face_C.py
* Update RPool_Face_Quant.py
* Update augmentations.py
* Update detection.py
* Update eval.py
* vww updates
* removed tf code
* Update fastcell_example.py
* Update widerface.py
* Update rnnpool.py
* Update fastTrainer.py
* Update fastTrainer.py
* Update fastTrainer.py
* Update model_mobilenet_2rnnpool.py
* Update model_mobilenet_rnnpool.py
* Delete top_level.txt
* Delete dependency_links.txt
* remove egg-info
* Update fastcell_example.py
* file copyright edits
* Update README.md
* delete output blank file
* remove input trace file

Co-authored-by: Harsha Vardhan Simhadri <[email protected]>
Co-authored-by: Ubuntu <harshasi@GPUnode1.n14uw44gbsdu3cvfvcchcfucod.xx.internal.cloudapp.net>
Co-authored-by: Harsha Vardhan Simhadri <[email protected]>
1 parent 2c89850 commit 373a7a1


42 files changed (+14078, −44 lines)
Lines changed: 146 additions & 0 deletions
@@ -0,0 +1,146 @@
# Code for Face Detection experiments with RNNPool

## Requirements
1. Follow the instructions [here](https://github.com/microsoft/EdgeML/blob/master/pytorch/README.md) to install the requirements for the EdgeML PyTorch operators.
2. Install the requirements for the face detection model using
``` pip install -r requirements.txt ```
We have tested the installation and the code on Ubuntu 18.04 with CUDA 10.2 and cuDNN 7.6.

## Dataset
1. Download the WIDER face dataset images and annotations from http://shuoyang1213.me/WIDERFACE/ and place them all in a folder named 'WIDER_FACE'. That is, download WIDER_train.zip, WIDER_test.zip, WIDER_val.zip, and wider_face_split.zip, place them in the WIDER_FACE folder, and unzip them using:

```shell
cd WIDER_FACE
unzip WIDER_train.zip
unzip WIDER_test.zip
unzip WIDER_val.zip
unzip wider_face_split.zip
cd ..
```

2. In `data/config.py`, set _C.HOME to the parent directory of the above folder and set _C.FACE.WIDER_DIR to the folder path. That is, if the WIDER_FACE folder is created in the /mnt folder, then _C.HOME='/mnt' and _C.FACE.WIDER_DIR='/mnt/WIDER_FACE'. Similarly, set _C.HOME and _C.FACE.WIDER_DIR in `data/config_qvga.py`. A sanity-check sketch follows this list.
3. Run
``` python prepare_wider_data.py ```

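As a quick sanity check for step 2 (an illustrative sketch, not part of the repository; the paths are the example values above):

```python
# Hypothetical check that the dataset paths set in data/config.py exist.
import os

WIDER_DIR = '/mnt/WIDER_FACE'  # the value assigned to _C.FACE.WIDER_DIR
for sub in ('WIDER_train', 'WIDER_val', 'WIDER_test', 'wider_face_split'):
    assert os.path.isdir(os.path.join(WIDER_DIR, sub)), f'missing {sub}'
```
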
# Usage

## Training

```shell
IS_QVGA_MONO=0 python train.py --batch_size 32 --model_arch RPool_Face_Quant --cuda True --multigpu True --save_folder weights/ --epochs 300 --save_frequency 5000
```

For QVGA:
```shell
IS_QVGA_MONO=1 python train.py --batch_size 64 --model_arch RPool_Face_QVGA_monochrome --cuda True --multigpu True --save_folder weights/ --epochs 300 --save_frequency 5000
```

This will save a checkpoint every '--save_frequency' iterations in a weight file ending in 'checkpoint.pth', and the weights for the best state in a file ending in 'best_state.pth'; both are saved in '--save_folder'. To resume training from a checkpoint, add '--resume <checkpoint_name>.pth' to the above command. For example,

```shell
IS_QVGA_MONO=1 python train.py --batch_size 64 --model_arch RPool_Face_QVGA_monochrome --cuda True --multigpu True --save_folder weights/ --epochs 300 --save_frequency 5000 --resume <checkpoint_name>.pth
```

If IS_QVGA_MONO is 0, training input images will be 640x640 RGB.
If IS_QVGA_MONO is 1, training input images will be 320x320 and converted to monochrome.

Input images for training are cropped and reshaped to squares to maintain consistency with [S3FD](https://arxiv.org/abs/1708.05237). Testing, however, can be done on images of any size: we resize each test image so its area equals that of VGA (640x480) or QVGA (320x240), keeping the aspect ratio unchanged.

The architecture RPool_Face_QVGA_monochrome is for the QVGA monochrome format, while RPool_Face_C and RPool_Face_Quant are for the VGA RGB format.

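The resizing rule can be sketched in a few lines (an illustrative helper, not the repository's code; it assumes OpenCV-style HxWxC arrays):

```python
# Illustrative sketch: resize an image to a target area, preserving aspect ratio.
import math

import cv2  # assumption: OpenCV is available in the environment

def resize_to_area(image, target_area=640 * 480):  # use 320 * 240 for QVGA
    h, w = image.shape[:2]
    scale = math.sqrt(target_area / float(h * w))
    return cv2.resize(image, (int(round(w * scale)), int(round(h * scale))))
```
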
## Test
There are two modes of testing the trained model: evaluation mode, which generates bounding boxes for a set of sample images, and test mode, which computes statistics like mAP scores.

#### Evaluation Mode

Given a set of images in <your_image_folder>, `eval.py` generates bounding boxes around faces (where the confidence is higher than a certain threshold) and writes the annotated images to <your_save_folder>. To evaluate the `rpool_face_best_state.pth` model (stored in ./weights), execute the following command:

```shell
IS_QVGA_MONO=0 python eval.py --model_arch RPool_Face_Quant --model ./weights/RPool_Face_Quant_best_state.pth --image_folder <your_image_folder> --save_dir <your_save_folder>
```

For QVGA:
```shell
IS_QVGA_MONO=1 python eval.py --model_arch RPool_Face_QVGA_monochrome --model ./weights/RPool_Face_QVGA_monochrome_best_state.pth --image_folder <your_image_folder> --save_dir <your_save_folder>
```

This saves images in <your_save_folder> with bounding boxes drawn around high-confidence faces. Here is an example image with a single bounding box.

![Camera: Himax0360](imrgb20ft.png)

If IS_QVGA_MONO=0, the evaluation code accepts an image of any size and resizes it so its area matches 640x480x3 while preserving the original aspect ratio.

If IS_QVGA_MONO=1, the evaluation code accepts an image of any size, then resizes it and converts it to monochrome so its area matches 320x240x1, again preserving the original aspect ratio.

#### WIDER Set Test
In this mode, we test the generated model against the provided WIDER_FACE validation and test datasets.

First, run the following to generate the model's predictions and store the output in the '--save_folder' folder:

```shell
IS_QVGA_MONO=0 python wider_test.py --model_arch RPool_Face_Quant --model ./weights/RPool_Face_Quant_best_state.pth --save_folder rpool_face_quant_val --subset val
```

For QVGA:
```shell
IS_QVGA_MONO=1 python wider_test.py --model_arch RPool_Face_QVGA_monochrome --model ./weights/RPool_Face_QVGA_monochrome_best_state.pth --save_folder rpool_face_qvgamono_val --subset val
```

The above command generates predictions for each image in the "validation" dataset, writing a separate prediction file per image (an image_name.txt file in the appropriate folder). The first line of the prediction file contains the total number of boxes identified; each subsequent line corresponds to one identified box, with five numbers per box: width of the box, height of the box, x-axis offset, y-axis offset, and the confidence value for the presence of a face in the box.

If IS_QVGA_MONO=1, testing is done by converting images to monochrome QVGA; if IS_QVGA_MONO=0, testing is done on VGA RGB images.

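To make this file format concrete, here is a small illustrative parser (the helper name is ours, and the field order simply follows the description above):

```python
# Illustrative: read one prediction file in the format described above.
def parse_prediction_file(path):
    boxes = []
    with open(path) as f:
        n_boxes = int(f.readline())          # first line: total number of boxes
        for _ in range(n_boxes):             # then one box per line
            w, h, x, y, conf = map(float, f.readline().split())
            boxes.append((w, h, x, y, conf))
    return boxes
```
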
###### For calculating mAP scores:
Using these boxes, we can compute the standard mAP score that is widely used in this literature (see [here](https://medium.com/@jonathan_hui/map-mean-average-precision-for-object-detection-45c121a31173) for more details) as follows:

1. Download eval_tools.zip from http://shuoyang1213.me/WIDERFACE/support/eval_script/eval_tools.zip and unzip it into a folder of the same name in this directory.

Example code:

```shell
wget http://shuoyang1213.me/WIDERFACE/support/eval_script/eval_tools.zip
unzip eval_tools.zip
```

2. Set up scripts to use the Matlab '.mat' ground-truth files in the eval_tools/ground_truth folder for mAP calculation. The following installs Python files that provide the same functionality as the '.m' Matlab scripts in the eval_tools folder:
```
cd eval_tools
git clone https://github.com/wondervictor/WiderFace-Evaluation.git
cd WiderFace-Evaluation
python3 setup.py build_ext --inplace
```

3. Run ```python3 evaluation.py -p <your_save_folder> -g <ground truth dir>``` in the WiderFace-Evaluation folder, where <your_save_folder> is the '--save_folder' passed to `wider_test.py` above and <ground truth dir> is the subfolder `eval_tools/ground_truth`. That is, in the WiderFace-Evaluation directory, run:

```shell
python3 evaluation.py -p <your_save_folder> -g ../ground_truth
```

This script outputs the mAP for the WIDER-easy, WIDER-medium, and WIDER-hard subsets of the dataset. Our best performance using the RPool_Face_Quant model is: 0.80 (WIDER-easy), 0.78 (WIDER-medium), 0.53 (WIDER-hard).

##### Dump RNNPool Input Output Traces and Weights

To save model weights and/or input-output pairs for each patch passing through RNNPool in numpy format, use the command below. Put the images you want to save traces for in <your_image_folder>. Specify the output folder for saving model weights in numpy format in <your_save_model_numpy_folder>, and the output folder for saving RNNPool input-output traces in numpy format in <your_save_traces_numpy_folder>. Note that input traces will be saved in a folder named 'inputs' and output traces in a folder named 'outputs' inside <your_save_traces_numpy_folder>.

```shell
python3 dump_model.py --model ./weights/RPool_Face_QVGA_monochrome_best_state.pth --model_arch RPool_Face_Quant --image_folder <your_image_folder> --save_model_npy_dir <your_save_model_numpy_folder> --save_traces_npy_dir <your_save_traces_numpy_folder>
```
If you wish to save only model weights, do not specify --save_traces_npy_dir. If you wish to save only traces, do not specify --save_model_npy_dir.
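The dumped traces can be read back with numpy. A sketch (illustrative; it assumes the files in 'inputs' and 'outputs' share names, which the dump script's actual naming scheme may or may not follow):

```python
# Illustrative: iterate over dumped RNNPool input/output trace pairs.
import os
import numpy as np

traces_dir = '<your_save_traces_numpy_folder>'  # as passed to dump_model.py
for name in sorted(os.listdir(os.path.join(traces_dir, 'inputs'))):
    patch_in = np.load(os.path.join(traces_dir, 'inputs', name))
    patch_out = np.load(os.path.join(traces_dir, 'outputs', name))
    print(name, patch_in.shape, '->', patch_out.shape)
```
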
This code is built upon https://github.com/yxlijun/S3FD.pytorch.
Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT license.

from .widerface import WIDERDetection

from data.choose_config import cfg
cfg = cfg.cfg

import torch


def detection_collate(batch):
    """Custom collate fn for dealing with batches of images that have a different
    number of associated object annotations (bounding boxes).

    Arguments:
        batch: (tuple) A tuple of tensor images and lists of annotations

    Return:
        A tuple containing:
        1) (tensor) batch of images stacked on their 0 dim
        2) (list of tensors) annotations for a given image are stacked on
           0 dim
    """
    targets = []
    imgs = []
    for sample in batch:
        imgs.append(sample[0])
        targets.append(torch.FloatTensor(sample[1]))
    return torch.stack(imgs, 0), targets
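
A typical way to wire this collate function into a PyTorch DataLoader (a usage sketch; the WIDERDetection constructor arguments are assumptions, not taken from this diff):

```python
# Hypothetical usage of detection_collate with a DataLoader.
from torch.utils.data import DataLoader

dataset = WIDERDetection(cfg.FACE.TRAIN_FILE, mode='train')  # assumed signature
loader = DataLoader(dataset, batch_size=32, shuffle=True,
                    collate_fn=detection_collate)
images, targets = next(iter(loader))
# images: (32, C, H, W) tensor; targets: list of 32 FloatTensors of boxes
```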
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT license.

import os
from importlib import import_module

# Requires IS_QVGA_MONO to be set, as in the README commands:
# '0' selects the VGA RGB config, '1' the QVGA monochrome config.
IS_QVGA_MONO = os.environ['IS_QVGA_MONO']

name = 'config'
if IS_QVGA_MONO == '1':
    name = name + '_qvga'

cfg = import_module('data.' + name)
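
Usage sketch (illustrative): the environment variable must be set before Python starts, matching the README's command lines.

```python
# Run as: IS_QVGA_MONO=1 python your_script.py   (your_script.py is hypothetical)
from data.choose_config import cfg
cfg = cfg.cfg  # unwrap the imported module to get the EasyDict itself
print(cfg.INPUT_SIZE)  # 320 for the QVGA config, 640 for the VGA config
```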
Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT license.

import os
from easydict import EasyDict
import numpy as np


_C = EasyDict()
cfg = _C

# data augmentation config
_C.expand_prob = 0.5
_C.expand_max_ratio = 4
_C.hue_prob = 0.5
_C.hue_delta = 18
_C.contrast_prob = 0.5
_C.contrast_delta = 0.5
_C.saturation_prob = 0.5
_C.saturation_delta = 0.5
_C.brightness_prob = 0.5
_C.brightness_delta = 0.125
_C.data_anchor_sampling_prob = 0.5
_C.min_face_size = 6.0
_C.apply_distort = True
_C.apply_expand = False
_C.img_mean = np.array([104., 117., 123.])[:, np.newaxis, np.newaxis].astype(
    'float32')
_C.resize_width = 640
_C.resize_height = 640
_C.scale = 1 / 127.0
_C.anchor_sampling = True
_C.filter_min_face = True

_C.IS_MONOCHROME = False

# anchor config
_C.FEATURE_MAPS = [160, 80, 40, 20, 10, 5]
_C.INPUT_SIZE = 640
_C.STEPS = [4, 8, 16, 32, 64, 128]
_C.ANCHOR_SIZES = [16, 32, 64, 128, 256, 512]
_C.CLIP = False
_C.VARIANCE = [0.1, 0.2]

# detection config
_C.NMS_THRESH = 0.3
_C.NMS_TOP_K = 5000
_C.TOP_K = 750
_C.CONF_THRESH = 0.01

# loss config
_C.NEG_POS_RATIOS = 3
_C.NUM_CLASSES = 2
_C.USE_NMS = True

# dataset config
_C.HOME = '/mnt/'  ## change here ----------

# face config
_C.FACE = EasyDict()
_C.FACE.TRAIN_FILE = './data/face_train.txt'
_C.FACE.VAL_FILE = './data/face_val.txt'
_C.FACE.WIDER_DIR = '/mnt/WIDER_FACE'  ## change here ---------
_C.FACE.OVERLAP_THRESH = [0.1, 0.35, 0.5]
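
One relationship worth noting in the anchor config: each feature-map resolution is INPUT_SIZE divided by the corresponding stride in STEPS. A small illustrative check (not repository code):

```python
# Illustrative: feature-map sizes equal the input size divided by each stride.
INPUT_SIZE = 640
STEPS = [4, 8, 16, 32, 64, 128]
FEATURE_MAPS = [160, 80, 40, 20, 10, 5]
assert all(INPUT_SIZE // s == f for s, f in zip(STEPS, FEATURE_MAPS))
```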
Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT license.

import os
from easydict import EasyDict
import numpy as np


_C = EasyDict()
cfg = _C

# data augmentation config
_C.expand_prob = 0.5
_C.expand_max_ratio = 2
_C.hue_prob = 0.5
_C.hue_delta = 18
_C.contrast_prob = 0.5
_C.contrast_delta = 0.5
_C.saturation_prob = 0.5
_C.saturation_delta = 0.5
_C.brightness_prob = 0.5
_C.brightness_delta = 0.125
_C.data_anchor_sampling_prob = 0.5
_C.min_face_size = 1.0
_C.apply_distort = True
_C.apply_expand = False
_C.img_mean = np.array([104., 117., 123.])[:, np.newaxis, np.newaxis].astype(
    'float32')
_C.resize_width = 320
_C.resize_height = 320
_C.scale = 1 / 127.0
_C.anchor_sampling = True
_C.filter_min_face = True

_C.IS_MONOCHROME = True

# anchor config
_C.FEATURE_MAPS = [40, 40, 20, 20]
_C.INPUT_SIZE = 320
_C.STEPS = [8, 8, 16, 16]
_C.ANCHOR_SIZES = [8, 16, 32, 48]
_C.CLIP = False
_C.VARIANCE = [0.1, 0.2]

# detection config
_C.NMS_THRESH = 0.3
_C.NMS_TOP_K = 5000
_C.TOP_K = 750
_C.CONF_THRESH = 0.05

# loss config
_C.NEG_POS_RATIOS = 3
_C.NUM_CLASSES = 2
_C.USE_NMS = True

# dataset config
_C.HOME = '/mnt/'

# face config
_C.FACE = EasyDict()
_C.FACE.TRAIN_FILE = './data/face_train.txt'
_C.FACE.VAL_FILE = './data/face_val.txt'
_C.FACE.WIDER_DIR = '/mnt/WIDER_FACE'
_C.FACE.OVERLAP_THRESH = [0.1, 0.35, 0.5]
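
The QVGA config differs from the VGA one mainly in resolution-dependent fields (input size, feature maps, strides, anchor sizes) plus min_face_size, expand_max_ratio, CONF_THRESH, and IS_MONOCHROME. An illustrative way to diff the two (assuming both modules are importable as data.config and data.config_qvga):

```python
# Illustrative: print the fields that differ between the VGA and QVGA configs.
from data import config, config_qvga

for key in sorted(set(config.cfg) & set(config_qvga.cfg)):
    vga, qvga = config.cfg[key], config_qvga.cfg[key]
    if repr(vga) != repr(qvga):  # repr() so numpy arrays compare safely
        print(f'{key}: VGA={vga!r}  QVGA={qvga!r}')
```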
