
Commit e4d5255

Authored by oindrilasaha, harsha-simhadri, and ShikharJ
RNNPool face detection (#215)
* add m4 model
* add rnnpool sparsity
* revert to previous+remove basenet
* scut training and testing
* augmentations and data file changes
* update readme and eval files
* evaluation code
* fix bugs
* remove lists
* data prep script
* merge face detection and m4
* eval arch options
* finetune
* readme changes
* readme update
* readme update
* newlines and slashes
* add dataset directory as environment variable
* rpool face c detect bug
* support for multigpu
* multigpu fixes
* remove subset option
* readme edit
* mkdir
* readme changes
* fix warning
* rnnpool device
* add arch
* trace generation
* Update eval.py
* Update prior_box.py
* Update multibox_loss.py
* Update train.py
* Update scut_test.py
* eval bug + newlines
* Remove stray newline

Co-authored-by: Harsha Vardhan Simhadri <[email protected]>
Co-authored-by: ShikharJ <[email protected]>
1 parent 5f0b6e8 commit e4d5255

25 files changed: +3645 -90 lines changed

examples/pytorch/vision/Face_Detection/README.md

Lines changed: 28 additions & 22 deletions
@@ -1,9 +1,10 @@
-# Code for Face Detection experiments with RNNPool
+# Code for Face Detection Experiments with RNNPool
+Refer to README_M4.md for instructions related to the M4 model
 ## Requirements
-1. Follow instructions to install requirements for EdgeML operators and the EdgeML operators [here](https://github.com/microsoft/EdgeML/blob/master/pytorch/README.md).
+1. Follow instructions to install EdgeML operators and their pre-requisites [here](https://github.com/microsoft/EdgeML/blob/master/pytorch/README.md).
 2. Install requirements for face detection model using
 ``` pip install -r requirements.txt ```
-We have tested the installation and the code on Ubuntu 18.04 with Cuda 10.2 and CuDNN 7.6
+We have tested the installation and the code on Ubuntu 18.04 with Python 3.6, Cuda 10.2 and CuDNN 7.6

 ## Dataset
 1. Download WIDER face dataset images and annotations from http://shuoyang1213.me/WIDERFACE/ and place them all in a folder with name 'WIDER_FACE'. That is, download WIDER_train.zip, WIDER_test.zip, WIDER_val.zip, wider_face_split.zip and place it in WIDER_FACE folder, and unzip files using:
@@ -18,12 +19,17 @@ cd ..

 ```

-2. In `data/config.py` , set _C.HOME to the parent directory of the above folder, and set the _C.FACE.WIDER_DIR to the folder path.
-That is, if the WIDER_FACE folder is created in /mnt folder, then _C.HOME='/mnt'
-_C.FACE.WIDER_DIR='/mnt/WIDER_FACE'.
-Similarly, change `data/config_qvga.py` to set _C.HOME and _C.FACE.WIDER_DIR.
+2. Set the environment variable DATA_HOME to the parent directory of the above folder.
+That is, if the WIDER_FACE folder is created in the /mnt folder:
+
+``` export DATA_HOME='/mnt' ```
+
+Note that for Windows '/' should be replaced by '\'.
+For all following commands, the environment variable IS_QVGA_MONO has to be set to 0 for using config.py (RGB 640x480 images) or to 1 for using config_qvga.py (monochrome 320x240 images) as the configuration file. A combined example is given below.
+
+
 3. Run
-``` python prepare_wider_data.py ```
+``` IS_QVGA_MONO=1 python prepare_wider_data.py ```


 # Usage
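The setup above can be chained into a single sequence. In the sketch below, the Linux commands are taken directly from the updated README text, while the Windows `set` form and the `D:\data` path are illustrative assumptions based only on the note that '/' becomes '\' on Windows:

```shell
# Linux/macOS: DATA_HOME points at the parent of the WIDER_FACE folder
export DATA_HOME='/mnt'
IS_QVGA_MONO=1 python prepare_wider_data.py

# Windows (cmd) sketch -- assumed equivalent, not taken from the README; note the backslash path
set DATA_HOME=D:\data
set IS_QVGA_MONO=1
python prepare_wider_data.py
```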
@@ -64,15 +70,15 @@ There are two modes of testing the trained model -- the evaluation mode to gener

 #### Evaluation Mode

-Given a set of images in <your_image_folder>, `eval/py` generates bounding boxes around faces (where the confidence is higher than certain threshold) and write the images in <your_save_folder>. To evaluate the `rpool_face_best_state.pth` model (stored in ./weights), execute the following command:
+Given a set of images in <your_image_folder>, `eval.py` generates bounding boxes around faces (where the confidence is higher than a certain threshold) and writes the images to <your_save_folder>. Use --multigpu to specify whether the model was trained in a multi-GPU setting. To evaluate the `rpool_face_best_state.pth` model (stored in ./weights), execute the following command:

 ```shell
-IS_QVGA_MONO=0 python eval.py --model_arch RPool_Face_Quant --model ./weights/RPool_Face_Quant_best_state.pth --image_folder <your_image_folder> --save_dir <your_save_folder>
+IS_QVGA_MONO=0 python eval.py --model_arch RPool_Face_Quant --model ./weights/RPool_Face_Quant_best_state.pth --image_folder <your_image_folder> --save_dir <your_save_folder> --multigpu True
 ```

 For QVGA:
 ```shell
-IS_QVGA_MONO=1 python eval.py --model_arch RPool_Face_QVGA_monochrome --model ./weights/RPool_Face_QVGA_monochrome_best_state.pth --image_folder <your_image_folder> --save_dir <your_save_folder>
+IS_QVGA_MONO=1 python eval.py --model_arch RPool_Face_QVGA_monochrome --model ./weights/RPool_Face_QVGA_monochrome_best_state.pth --image_folder <your_image_folder> --save_dir <your_save_folder> --multigpu True
 ```

 This will save images in <your_save_folder> with bounding boxes around faces, where the confidence is high. Here is an example image with a single bounding box.
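If the checkpoint was instead trained on a single GPU, the natural counterpart is to turn the flag off. The variant below is an assumption (the README only shows --multigpu True) and the folder names are placeholders:

```shell
# Assumed single-GPU variant of the evaluation command; --multigpu False is not shown in the README
IS_QVGA_MONO=0 python eval.py --model_arch RPool_Face_Quant --model ./weights/RPool_Face_Quant_best_state.pth --image_folder ./sample_images --save_dir ./eval_outputs --multigpu False
```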
@@ -86,15 +92,15 @@ If IS_QVGA_MONO=1 the evaluation code accepts an image of any size and resizes a
 #### WIDER Set Test
 In this mode, we test the generated model against the provided WIDER_FACE validation and test dataset.

-For this, first run the following to generate predictions of the model and store output in the '--save_folder' folder.
+For this, first run the following to generate predictions of the model and store the output in the '--save_folder' folder. Use --multigpu to specify whether the model was trained in a multi-GPU setting.

 ```shell
-IS_QVGA_MONO=0 python wider_test.py --model_arch RPool_Face_Quant --model ./weights/RPool_Face_Quant_best_state.pth --save_folder rpool_face_quant_val --subset val
+IS_QVGA_MONO=0 python wider_test.py --model_arch RPool_Face_Quant --model ./weights/RPool_Face_Quant_best_state.pth --save_folder rpool_face_quant_val --subset val --multigpu True
 ```

 For QVGA:
 ```shell
-IS_QVGA_MONO=1 python wider_test.py --model_arch RPool_Face_QVGA_monochrome --model ./weights/RPool_Face_QVGA_monochrome_best_state.pth --save_folder rpool_face_qvgamono_val --subset val
+IS_QVGA_MONO=1 python wider_test.py --model_arch RPool_Face_QVGA_monochrome --model ./weights/RPool_Face_QVGA_monochrome_best_state.pth --save_folder rpool_face_qvgamono_val --subset val --multigpu True
 ```

 The above command generates predictions for each image in the "validation" dataset. For each image, a separate prediction file is provided (image_name.txt file in appropriate folder). The first line of the prediction file contains the total number of boxes identified.
@@ -104,8 +110,8 @@ If IS_QVGA_MONO=1 then testing is done by converting images to monochrome and QV

 The architecture RPool_Face_QVGA_monochrome is for QVGA monochrome format while RPool_Face_C and RPool_Face_Quant are for VGA RGB format.

-###### For calculating MAP scores:
-Now using these boxes, we can compute the standard MAP score that is widely used in this literature (see [here](https://medium.com/@jonathan_hui/map-mean-average-precision-for-object-detection-45c121a31173) for more details) as follows:
+###### For calculating mAP scores:
+Now using these boxes, we can compute the standard mAP score that is widely used in this literature (see [here](https://medium.com/@jonathan_hui/map-mean-average-precision-for-object-detection-45c121a31173) for more details) as follows:

 1. Download eval_tools.zip from http://shuoyang1213.me/WIDERFACE/support/eval_script/eval_tools.zip and unzip in a folder of same name in this directory.

@@ -116,7 +122,7 @@ wget http://shuoyang1213.me/WIDERFACE/support/eval_script/eval_tools.zip
 unzip eval_tools.zip
 ```

-2. Set up scripts to use the Matlab '.mat' data files in eval_tools/ground_truth folder for MAP calculation: The following installs python files that provide the same functionality as the '.m' matlab scripts in eval_tools folder.
+2. Set up scripts to use the Matlab '.mat' data files in eval_tools/ground_truth folder for mAP calculation: The following installs python files that provide the same functionality as the '.m' matlab scripts in eval_tools folder.
 ```
 cd eval_tools
 git clone https://github.com/wondervictor/WiderFace-Evaluation.git
@@ -126,20 +132,20 @@ python3 setup.py build_ext --inplace

 3. Run ```python3 evaluation.py -p <your_save_folder> -g <ground truth dir>``` in WiderFace-Evaluation folder

-where `prediction_dir` is the '--save_folder' used for `wider_test.py` above and <groud truth dir> is the subfolder `eval_tools/ground_truth`. That is in, WiderFace-Evaluation directory, run:
+where `-p` is the '--save_folder' used for `wider_test.py` above and <ground truth dir> is the subfolder `eval_tools/ground_truth`. That is, in the WiderFace-Evaluation directory, run:

 ```shell
-python3 evaluation.py -p <your_save_folder> -g ../ground_truth
+python3 evaluation.py -p ../../rpool_face_qvgamono_val -g ../ground_truth
 ```
-This script should output the MAP for the WIDER-easy, WIDER-medium, and WIDER-hard subsets of the dataset. Our best performance using RPool_Face_Quant model is: 0.80 (WIDER-easy), 0.78 (WIDER-medium), 0.53 (WIDER-hard).
+This script should output the mAP for the WIDER-easy, WIDER-medium, and WIDER-hard subsets of the dataset. Our best performance using the RPool_Face_Quant model is: 0.80 (WIDER-easy), 0.78 (WIDER-medium), 0.53 (WIDER-hard).


 ##### Dump RNNPool Input Output Traces and Weights

-To save model weights and/or input output pairs for each patch through RNNPool in numpy format use the command below. Put images which you want to save traces for in <your_image_folder> . Specify output folder for saving model weights in numpy format in <your_save_model_numpy_folder>. Specify output folder for saving input output traces of RNNPool in numpy format in <your_save_traces_numpy_folder>. Note that input traces will be saved in a folder named 'inputs' and output traces in a folder named 'outputs' inside <your_save_traces_numpy_folder>.
+For saving model weights and/or input-output pairs for each patch through RNNPool in numpy format, use the command below. Put the images you want to save traces for in <your_image_folder>. Specify the output folder for saving model weights in numpy format in <your_save_model_numpy_folder>, and the output folder for saving input-output traces of RNNPool in numpy format in <your_save_traces_numpy_folder>. Note that input traces will be saved in a folder named 'inputs' and output traces in a folder named 'outputs' inside <your_save_traces_numpy_folder>.

 ```shell
-python3 dump_model.py --model ./weights/RPool_Face_QVGA_monochrome_best_state.pth --model_arch RPool_Face_Quant --image_folder <your_image_folder> --save_model_npy_dir <your_save_model_numpy_folder> --save_traces_npy_dir <your_save_traces_numpy_folder>
+python3 dump_model.py --model ./weights/RPool_Face_QVGA_monochrome_best_state.pth --model_arch RPool_Face_QVGA_monochrome --image_folder <your_image_folder> --save_model_npy_dir <your_save_model_numpy_folder> --save_traces_npy_dir <your_save_traces_numpy_folder>
 ```
 If you wish to save only model weights, do not specify --save_traces_npy_dir. If you wish to save only traces, do not specify --save_model_npy_dir.
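As a concrete illustration of the weights-only case, the command is simply the one above with --save_traces_npy_dir omitted; the remaining folder names stay as placeholders:

```shell
# Weights-only dump: omitting --save_traces_npy_dir means no input/output traces are written
python3 dump_model.py --model ./weights/RPool_Face_QVGA_monochrome_best_state.pth --model_arch RPool_Face_QVGA_monochrome --image_folder <your_image_folder> --save_model_npy_dir <your_save_model_numpy_folder>
```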

examples/pytorch/vision/Face_Detection/README_M4.md

Lines changed: 141 additions & 0 deletions
@@ -0,0 +1,141 @@
# Code for Face Detection Experiments with RNNPool
## Requirements
1. Follow instructions to install EdgeML operators and their pre-requisites [here](https://github.com/microsoft/EdgeML/blob/master/pytorch/README.md).
2. Install requirements for face detection model using
``` pip install -r requirements.txt ```
We have tested the installation and the code on Ubuntu 18.04 with Python 3.6, Cuda 10.2 and CuDNN 7.6

## Dataset - WIDER Face
1. Download WIDER face dataset images and annotations from http://shuoyang1213.me/WIDERFACE/ and place them all in a folder with name 'WIDER_FACE'. That is, download WIDER_train.zip, WIDER_test.zip, WIDER_val.zip, wider_face_split.zip and place it in WIDER_FACE folder, and unzip files using:

```shell
cd WIDER_FACE
unzip WIDER_train.zip
unzip WIDER_test.zip
unzip WIDER_val.zip
unzip wider_face_split.zip
cd ..

```

2. Set the environment variable DATA_HOME to the parent directory of the above folder.
That is, if the WIDER_FACE folder is created in the /mnt folder:

``` export DATA_HOME='/mnt' ```

Note that for Windows '/' should be replaced by '\'.


3. Run
``` IS_QVGA_MONO=1 python prepare_wider_data.py ```

## Dataset - SCUT Head B
Download SCUT Head Part B dataset images and annotations from https://github.com/HCIILAB/SCUT-HEAD-Dataset-Release. Unzipping will create a folder by the name 'SCUT_HEAD_Part_B'. Place this folder in the same parent directory as the WIDER_FACE folder, as sketched below.
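For orientation, the resulting directory layout would look roughly like the sketch below, assuming DATA_HOME='/mnt'; the SCUT Head subfolder names are an assumption based on its Pascal-VOC-style release and are not stated in this README:

```
/mnt/
├── WIDER_FACE/
│   └── WIDER_train/  WIDER_val/  WIDER_test/  wider_face_split/
└── SCUT_HEAD_Part_B/
    ├── JPEGImages/
    └── Annotations/
```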

# Usage

## Training

```shell

IS_QVGA_MONO=1 python train.py --batch_size 128 --model_arch RPool_Face_M4 --cuda True --multigpu True --save_folder weights/ --epochs 300 --save_frequency 5000

```
This will save checkpoints after every '--save_frequency' number of iterations in a weight file ending in 'checkpoint.pth', and the weights for the best state in a file ending in 'best_state.pth'. These will be saved in '--save_folder'. For resuming training from a checkpoint, use '--resume <checkpoint_name>.pth' with the above command. For example,


```shell

IS_QVGA_MONO=1 python train.py --batch_size 128 --model_arch RPool_Face_M4 --cuda True --multigpu True --save_folder weights/ --epochs 300 --save_frequency 5000 --resume <checkpoint_name>.pth

```

If IS_QVGA_MONO is 0 then training input images will be 640x640 and RGB.
If IS_QVGA_MONO is 1 then training input images will be 320x320 and converted to monochrome.

Input images for training are cropped and reshaped to square to maintain consistency with [S3FD](https://arxiv.org/abs/1708.05237). However, testing can be done on images of any size; the test input is resized to have area equal to VGA (640x480)/QVGA (320x240) so that the aspect ratio is not changed.

The architectures RPool_Face_QVGA_monochrome and RPool_Face_M4 are for the QVGA monochrome format, while RPool_Face_C and RPool_Face_Quant are for the VGA RGB format.

## Finetuning

To obtain a model better suited for conference room scenarios, we finetune our model on the SCUT Head B dataset. Set --finetune to True and pass the model pretrained on WIDER_FACE in --resume, as follows:

```shell

IS_QVGA_MONO=1 python train.py --batch_size 64 --model_arch RPool_Face_M4 --cuda True --multigpu True --save_folder weights/ --epochs 300 --save_frequency 5000 --resume ./weights/RPool_Face_M4_best_state.pth --finetune True

```


## Test
There are two modes of testing the trained model -- the evaluation mode to generate bounding boxes for a set of sample images, and the test mode to compute statistics like mAP scores.

#### Evaluation Mode

Given a set of images in <your_image_folder>, `eval.py` generates bounding boxes around faces (where the confidence is higher than a certain threshold, 0.5 in this case) and writes the images to <your_save_folder>. Use --multigpu to specify whether the model was trained in a multi-GPU setting. To evaluate the `rpool_face_best_state.pth` model (stored in ./weights), execute the following command:

```shell
IS_QVGA_MONO=1 python eval.py --model_arch RPool_Face_M4 --model ./weights/RPool_Face_M4_best_state.pth --image_folder <your_image_folder> --save_dir <your_save_folder> --thresh 0.5 --multigpu True
```

This will save images in <your_save_folder> with bounding boxes around faces, where the confidence is high. It is recommended to use the model finetuned on SCUT Head for evaluation.

If IS_QVGA_MONO=0 the evaluation code accepts an image of any size and resizes it to 640x480x3 while preserving the original image aspect ratio.

If IS_QVGA_MONO=1 the evaluation code accepts an image of any size and resizes and converts it to monochrome to make it 320x240x1 while preserving the original image aspect ratio.

#### Saving Full Model Traces
Setting the flag --save_traces to True will save input-output traces in two separate .npy files for each image in <your_image_folder>, given the architecture and trained model. Run:


```shell
IS_QVGA_MONO=1 python eval.py --model_arch RPool_Face_M4 --model ./weights/RPool_Face_M4_best_state.pth --image_folder <your_image_folder> --save_dir <your_save_folder> --thresh 0.5 --multigpu True --save_traces True
```

For generating traces on SCUT Head images, set <your_image_folder> to $DATA_HOME/SCUT_HEAD_Part_B/JPEGImages/ as in the example below.
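Concretely, that amounts to substituting the SCUT Head image directory into the trace-generation command above; the save directory remains a placeholder:

```shell
# Trace generation over the SCUT Head Part B images (image path taken from the note above)
IS_QVGA_MONO=1 python eval.py --model_arch RPool_Face_M4 --model ./weights/RPool_Face_M4_best_state.pth --image_folder $DATA_HOME/SCUT_HEAD_Part_B/JPEGImages/ --save_dir <your_save_folder> --thresh 0.5 --multigpu True --save_traces True
```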

#### SCUT Head Validation Set Test
In this mode, we test the generated model against the provided SCUT Head Part B validation dataset. Use the SCUT Head finetuned model for this step.

For this, first run the following to generate predictions of the model and store the output in the '--save_folder' folder. Use --multigpu to specify whether the model was trained in a multi-GPU setting.

```shell
IS_QVGA_MONO=1 python scut_test.py --model_arch RPool_Face_M4 --model ./weights/RPool_Face_M4_best_state.pth --save_folder rpool_face_m4_val --multigpu True
```

The above command generates predictions for each image in the "validation" dataset. For each image, a separate prediction file is provided (image_name.txt file in the appropriate folder). The first line of the prediction file contains the total number of boxes identified.
Then each line in the file corresponds to an identified box. For each box, five numbers are generated: length of the box, height of the box, x-axis offset, y-axis offset, and the confidence value for the presence of a face in the box. A hypothetical example is shown below.
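To make the file format concrete, a prediction file might look like the following; the numbers are purely illustrative and the columns follow the order just described (length, height, x-offset, y-offset, confidence):

```
2
104 131 58 67 0.98
66 72 310 140 0.87
```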

If IS_QVGA_MONO=1 then testing is done by converting images to monochrome and QVGA; if IS_QVGA_MONO=0 then testing is done on VGA RGB images.

###### For calculating mAP scores:
Now using these boxes, we can compute the standard mAP score that is widely used in this literature (see [here](https://medium.com/@jonathan_hui/map-mean-average-precision-for-object-detection-45c121a31173) for more details).

In the current Face_Detection directory run:
```
git clone https://github.com/wondervictor/WiderFace-Evaluation.git
cd WiderFace-Evaluation
python3 setup.py build_ext --inplace
mv ../scut_evaluation.py ./
```

Run ```IS_QVGA_MONO=1 python3 scut_evaluation.py -p ../rpool_face_m4_val``` in the WiderFace-Evaluation folder,

where `-p` is the '--save_folder' used for `scut_test.py` above.

This script should output the mAP on the SCUT Head Part B validation set. Our best performance using the RPool_Face_M4 model is: 0.61.


##### Dump RNNPool Input Output Traces and Weights

For saving model weights and/or input-output pairs for each patch through RNNPool in numpy format, use the command below. Put the images you want to save traces for in <your_image_folder>. Specify the output folder for saving model weights in numpy format in <your_save_model_numpy_folder>, and the output folder for saving input-output traces of RNNPool in numpy format in <your_save_traces_numpy_folder>. Note that input traces will be saved in a folder named 'inputs' and output traces in a folder named 'outputs' inside <your_save_traces_numpy_folder>.

```shell
python3 dump_model.py --model ./weights/RPool_Face_M4_best_state.pth --model_arch RPool_Face_M4 --image_folder <your_image_folder> --save_model_npy_dir <your_save_model_numpy_folder> --save_traces_npy_dir <your_save_traces_numpy_folder>
```
If you wish to save only model weights, do not specify --save_traces_npy_dir. If you wish to save only traces, do not specify --save_model_npy_dir.

Code has been built upon https://github.com/yxlijun/S3FD.pytorch

examples/pytorch/vision/Face_Detection/data/choose_config.py

Lines changed: 1 addition & 1 deletion
@@ -12,4 +12,4 @@
 name = name + '_qvga'
 
 
-cfg = import_module('data.' + name)
+cfg = import_module('data.' + name)

examples/pytorch/vision/Face_Detection/data/config.py

Lines changed: 0 additions & 4 deletions
@@ -54,12 +54,8 @@
 _C.NUM_CLASSES = 2
 _C.USE_NMS = True
 
-# dataset config
-_C.HOME = '/mnt/' ## change here ----------
-
 # face config
 _C.FACE = EasyDict()
 _C.FACE.TRAIN_FILE = './data/face_train.txt'
 _C.FACE.VAL_FILE = './data/face_val.txt'
-_C.FACE.WIDER_DIR = '/mnt/WIDER_FACE' ## change here ---------
 _C.FACE.OVERLAP_THRESH = [0.1, 0.35, 0.5]

examples/pytorch/vision/Face_Detection/data/config_qvga.py

Lines changed: 0 additions & 4 deletions
@@ -53,12 +53,8 @@
 _C.NUM_CLASSES = 2
 _C.USE_NMS = True
 
-# dataset config
-_C.HOME = '/mnt/'
-
 # face config
 _C.FACE = EasyDict()
 _C.FACE.TRAIN_FILE = './data/face_train.txt'
 _C.FACE.VAL_FILE = './data/face_val.txt'
-_C.FACE.WIDER_DIR = '/mnt/WIDER_FACE'
 _C.FACE.OVERLAP_THRESH = [0.1, 0.35, 0.5]
