
### 1. 📣 Dataset Introduction

Once the business scenario is settled, a large amount of data has to be collected (I once took part in a safety-helmet detection competition, but its data could not be downloaded from the platform for reuse). Broadly there are two sources: web data, which can be gathered with Baidu or Google image crawlers, and video recordings from the actual user scenario. The latter is usually far larger in volume, but is almost never released for commercial reasons. The open-source helmet detection dataset used here ([SafetyHelmetWearing-Dataset, SHWD](https://github.com/njvisionpower/Safety-Helmet-Wearing-Dataset)) was collected mainly by crawlers. It contains 7581 images in total, with 9044 bounding boxes of heads wearing helmets (positive class) and 111514 bounding boxes of heads without helmets (negative class); all images were annotated with labelimg to mark target regions and classes. Each bounding box is labeled "hat" when a helmet is worn, or "person" for the head region of an ordinary pedestrian without a helmet. Most of the "person"-labeled data comes from the [SCUT-HEAD](https://github.com/HCIILAB/SCUT-HEAD-Dataset-Release) dataset and is used to identify people not wearing helmets. The dataset was constructed roughly as follows:

1. Data crawling

Packages:

- opencv-python
- tqdm

Download the pretrained darknet weights from <https://pjreddie.com/media/files/yolov3.weights> and copy the weight file into `./data/darknet_weights/`. Since these are darknet-format weights, they need to be converted into a TensorFlow-compatible checkpoint by running:

```shell
python convert_weight.py
```

The converted TensorFlow checkpoint file is stored in the `./data/darknet_weights/` directory. You can also download the already converted model:

[Google Drive](https://drive.google.com/drive/folders/1mXbNgNxyXPi7JNsnBaxEv1-nWr7SVoQt?usp=sharing) [GitHub Release](https://github.com/wizyoung/YOLOv3_TensorFlow/releases/)

### 3. 🔰 Building the Training Data
```shell
python data_pro.py
```
Split the data into training, validation, and test sets, and generate `train.txt/val.txt/test.txt` under `./data/my_data/`. Each image corresponds to one line containing `image_index`, `image_absolute_path`, `img_width`, `img_height`, `box_1`, `box_2`, ..., `box_n`, with fields separated by spaces, where:

+ `image_index` is the line number in the txt file
+ `image_absolute_path` must be an absolute path
+ all numeric values in `img_width`, `img_height`, `box_1`, `box_2`, ..., `box_n` must be integers
+ each `box_x` has the form `label_index x_min y_min x_max y_max` (note that the coordinate origin is the top-left corner of the image)
+ `label_index` is the index of the label (ranging from 0 to class_num-1); note that, unlike SSD, YOLO-series training labels do not include a background class

Example:

```
0 xxx/xxx/a.jpg 1920 1080 0 453 369 473 391 1 588 245 608 268
1 xxx/xxx/b.jpg 1920 1080 1 466 403 485 422 2 793 300 809 320
...
```
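The line format above can be handled with a few lines of plain Python. This is only an illustrative sketch (the `parse_line` helper is hypothetical, not part of the repo's code):

```python
def parse_line(line):
    """Parse one annotation line: image_index image_absolute_path
    img_width img_height, then repeating groups of
    (label_index x_min y_min x_max y_max), all space-separated."""
    parts = line.split()
    idx, path = int(parts[0]), parts[1]
    width, height = int(parts[2]), int(parts[3])
    boxes = []
    rest = parts[4:]
    for i in range(0, len(rest), 5):
        label = int(rest[i])
        x_min, y_min, x_max, y_max = map(int, rest[i + 1:i + 5])
        boxes.append((label, x_min, y_min, x_max, y_max))
    return idx, path, width, height, boxes

line = '0 xxx/xxx/a.jpg 1920 1080 0 453 369 473 391 1 588 245 608 268'
idx, path, w, h, boxes = parse_line(line)
print(boxes)  # [(0, 453, 369, 473, 391), (1, 588, 245, 608, 268)]
```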

```shell
python get_kmeans.py
```

![](docs/kmeans.png)

This produces 9 anchors along with the average IoU; the anchors are saved to the text file `./data/yolo_anchors.txt`.

**Note: the YOLO anchors computed by k-means are on the resized image scale; the default resizing keeps the image's aspect ratio.**
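The training config later loads this file via `parse_anchors(anchor_path)`. Assuming the common YOLO anchor-file layout of comma-separated `w,h` values on a single line (an assumption here, since the file is generated by `get_kmeans.py`), a minimal stand-in parser could look like:

```python
import os
import tempfile

def parse_anchors(path):
    # Assumed format: comma/space-separated numbers forming (w, h) pairs,
    # e.g. "10,13, 16,30, 33,23, ..." on one line.
    with open(path) as f:
        nums = [float(x) for x in f.read().replace(',', ' ').split()]
    return [(nums[i], nums[i + 1]) for i in range(0, len(nums), 2)]

# Hypothetical anchor values, written only for illustration:
path = os.path.join(tempfile.gettempdir(), 'yolo_anchors.txt')
with open(path, 'w') as f:
    f.write('10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326')
print(len(parse_anchors(path)))  # 9
```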
<summary><mark><font color=darkred>Modify arg.py</font></mark></summary>
<pre><code>
### Some paths
train_file = './data/my_data/label/train.txt'  # The path of the training txt file.
val_file = './data/my_data/label/val.txt'  # The path of the validation txt file.
restore_path = './data/darknet_weights/yolov3.ckpt'  # The path of the weights to restore.
save_dir = './checkpoint/'  # The directory of the weights to save.
log_dir = './data/logs/'  # The directory to store the tensorboard log files.
progress_log_path = './data/progress.log'  # The path to record the training progress.
anchor_path = './data/yolo_anchors.txt'  # The path of the anchor txt file.
class_name_path = './data/coco.names'  # The path of the class names.
### Training related numbers
batch_size = 32  # 6
img_size = [416, 416]  # Images will be resized to `img_size` and fed to the network, size format: [width, height]
letterbox_resize = True  # Whether to use the letterbox resize, i.e., keep the original aspect ratio in the resized image.
total_epoches = 500
train_evaluation_step = 100  # Evaluate on the training batch after some steps.
val_evaluation_epoch = 50  # Evaluate on the whole validation dataset after some epochs. Set to None to evaluate every epoch.
save_epoch = 10  # Save the model after some epochs.
batch_norm_decay = 0.99  # decay in bn ops
weight_decay = 5e-4  # l2 weight decay
global_step = 0  # used when resuming training
num_threads = 10  # Number of threads for image processing used in tf.data pipeline.
prefetech_buffer = 5  # Prefetech_buffer used in tf.data pipeline.
### Learning rate and optimizer
optimizer_name = 'momentum'  # Chosen from [sgd, momentum, adam, rmsprop]
save_optimizer = True  # Whether to save the optimizer parameters into the checkpoint file.
learning_rate_init = 1e-4
lr_type = 'piecewise'  # Chosen from [fixed, exponential, cosine_decay, cosine_decay_restart, piecewise]
lr_decay_epoch = 5  # Epochs after which learning rate decays. Int or float. Used when chosen `exponential` and `cosine_decay_restart` lr_type.
lr_decay_factor = 0.96  # The learning rate decay factor. Used when chosen `exponential` lr_type.
lr_lower_bound = 1e-6  # The minimum learning rate.
# only used in piecewise lr type
pw_boundaries = [30, 50]  # epoch based boundaries
pw_values = [learning_rate_init, 3e-5, 1e-5]
### Load and finetune
# Choose the parts you want to restore the weights. List form.
# restore_include: None, restore_exclude: None  => restore the whole model
# restore_include: None, restore_exclude: scope  => restore the whole model except `scope`
# restore_include: scope1, restore_exclude: scope2  => if scope1 contains scope2, restore scope1 and not restore scope2 (scope1 - scope2)
# choice 1: only restore the darknet body
# restore_include = ['yolov3/darknet53_body']
# restore_exclude = None
# choice 2: restore all layers except the last 3 conv2d layers in 3 scale
restore_include = None
restore_exclude = ['yolov3/yolov3_head/Conv_14', 'yolov3/yolov3_head/Conv_6', 'yolov3/yolov3_head/Conv_22']
# Choose the parts you want to finetune. List form.
# Set to None to train the whole model.
update_part = ['yolov3/yolov3_head']
### other training strategies
multi_scale_train = True  # Whether to apply multi-scale training strategy. Image size varies from [320, 320] to [640, 640] by default.
use_label_smooth = True  # Whether to use class label smoothing strategy.
use_focal_loss = True  # Whether to apply focal loss on the conf loss.
use_mix_up = True  # Whether to use mix up data augmentation strategy.
use_warm_up = True  # whether to use warm up strategy to prevent from gradient exploding.
warm_up_epoch = 3  # Warm up training epochs. Set to a larger value if gradient explodes.
### some constants in validation
# nms
nms_threshold = 0.45  # iou threshold in nms operation
score_threshold = 0.01  # threshold of the probability of the classes in nms operation, i.e. score = pred_confs * pred_probs. set lower for higher recall.
nms_topk = 150  # keep at most nms_topk outputs after nms
# mAP eval
eval_threshold = 0.5  # the iou threshold applied in mAP evaluation
use_voc_07_metric = False  # whether to use voc 2007 evaluation metric, i.e. the 11-point metric
### parse some params
anchors = parse_anchors(anchor_path)
classes = read_class_names(class_name_path)
class_num = len(classes)
train_img_cnt = len(open(train_file, 'r').readlines())
val_img_cnt = len(open(val_file, 'r').readlines())
train_batch_num = int(math.ceil(float(train_img_cnt) / batch_size))
lr_decay_freq = int(train_batch_num * lr_decay_epoch)
pw_boundaries = [float(i) * train_batch_num + global_step for i in pw_boundaries]
</code></pre>
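The last two lines of the config turn epoch-based settings into global-step values. With hypothetical dataset numbers substituted for the values parsed from `train.txt`, the conversion works like this:

```python
import math

# Hypothetical values standing in for the ones parsed from the dataset/config:
train_img_cnt = 6000          # number of lines in train.txt
batch_size = 32
global_step = 0               # nonzero only when resuming training
learning_rate_init = 1e-4
pw_boundaries = [30, 50]      # epoch-based boundaries
pw_values = [learning_rate_init, 3e-5, 1e-5]

# Iterations per epoch, rounding up so the last partial batch counts.
train_batch_num = int(math.ceil(float(train_img_cnt) / batch_size))  # 188

# Convert epoch boundaries into global-step boundaries for the piecewise schedule.
pw_boundaries = [float(i) * train_batch_num + global_step for i in pw_boundaries]
print(pw_boundaries)  # [5640.0, 9400.0]
```

So with these numbers the learning rate would drop from 1e-4 to 3e-5 at step 5640 and to 1e-5 at step 9400.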