Commit de6a4f3

committed
Optim code and update README
1 parent 43dc25d · commit de6a4f3

File tree: 6 files changed, +231 / -155 lines


README.md

Lines changed: 143 additions & 120 deletions

@@ -9,142 +9,162 @@
   <a href="./LICENSE"><img src="https://img.shields.io/badge/License-Apache%202-dfd.svg"></a>
 </p>
 

The old versions of the labelImg → YOLOV5, YOLOV5 → COCO, YOLOV5 YAML → COCO and darknet → COCO sections are removed here and re-added in revised form: each converter section is now wrapped in a collapsible <details> block, a few comments are reworded, and the labelImg_2_yolov5.py invocation, which previously took only --src_dir and --out_dir, gains split-ratio options. The added content, translated from the Chinese original:

#### labelImg-annotated YOLO data → YOLOV5 format
<details>

- Convert YOLO-format data annotated with the [labelImg](https://github.com/tzutalin/labelImg) tool to YOLOV5-format data in one step.
- The labelImg annotation directory looks like this (see `dataset/labelImg_dataset` for details):
```text
labelImg_dataset
├── classes.txt
├── images(13).jpg
├── images(13).txt
├── images(3).jpg
├── images(3).txt
├── images4.jpg
├── images4.txt
├── images5.jpg
├── images5.txt
├── images6.jpg   # note: this image has no annotation
├── images7.jpg
└── images7.txt
```
- Conversion:
```shell
python labelImg_2_yolov5.py --src_dir dataset/labelImg_dataset \
                            --out_dir dataset/labelImg_dataset_output \
                            --val_ratio 0.2 \
                            --have_test true \
                            --test_ratio 0.2
```
  - `--src_dir`: directory containing the labelImg annotations
  - `--out_dir`: where the converted data is written
  - `--val_ratio`: fraction of the whole dataset used for the validation set, default `0.2`
  - `--have_test`: whether to also generate a test split, default `True`
  - `--test_ratio`: fraction of the whole dataset used for the test split, default `0.2`

- Directory structure after conversion (see `dataset/labelImg_dataset_output` for details):
```text
labelImg_dataset_output/
├── classes.txt
├── images
│   ├── images(13).jpg
│   ├── images(3).jpg
│   ├── images4.jpg
│   ├── images5.jpg
│   └── images7.jpg
├── labels
│   ├── images(13).txt
│   ├── images(3).txt
│   ├── images4.txt
│   ├── images5.txt
│   └── images7.txt
├── non_labels   # directory for images without annotations
│   └── images6.jpg
├── test.txt
├── train.txt
└── val.txt
```
- The `dataset/labelImg_dataset_output` directory can then be converted straight to COCO:
```shell
python yolov5_2_coco.py --dir_path dataset/labelImg_dataset_output
```
</details>
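
For reference, the splitting controlled by `--val_ratio` and `--test_ratio` comes down to shuffling the image list and slicing it. Below is a minimal sketch of that idea (illustrative only, not the actual code of `labelImg_2_yolov5.py`; the function name and defaults are assumptions):

```python
import random
from pathlib import Path

def split_dataset(image_dir, val_ratio=0.2, test_ratio=0.2, seed=0):
    """Shuffle the images under image_dir and slice them into train/val/test lists."""
    images = sorted(p for p in Path(image_dir).iterdir()
                    if p.suffix.lower() in {".jpg", ".jpeg", ".png"})
    random.Random(seed).shuffle(images)

    n_val = int(len(images) * val_ratio)
    n_test = int(len(images) * test_ratio)
    val = images[:n_val]
    test = images[n_val:n_val + n_test]
    train = images[n_val + n_test:]  # everything left over goes to training
    return train, val, test

if __name__ == "__main__":
    train, val, test = split_dataset("dataset/labelImg_dataset")
    for name, part in (("train", train), ("val", val), ("test", test)):
        # Each txt file simply lists one image path per line.
        Path(f"{name}.txt").write_text("\n".join(str(p) for p in part), encoding="utf-8")
```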

#### YOLOV5-format data → COCO
<details>

- Background images can be added to training: simply place them in the `background_images` directory.
- The conversion script scans that directory automatically and adds those images to the training set, so the output plugs straight into subsequent [YOLOX](https://github.com/Megvii-BaseDetection/YOLOX) training.
- YOLOV5 training-format directory structure (see `dataset/YOLOV5` for details):
```text
YOLOV5
├── classes.txt
├── background_images   # usually images that are easily confused with the objects to be detected
│   └── bg1.jpeg
├── images
│   ├── images(13).jpg
│   └── images(3).jpg
├── labels
│   ├── images(13).txt
│   └── images(3).txt
├── train.txt
└── val.txt
```

- Conversion:
```shell
python yolov5_2_coco.py --dir_path dataset/YOLOV5 --mode_list train,val
```
  - `--dir_path`: directory of the prepared dataset
  - `--mode_list`: which splits to generate JSON for; the corresponding txt file must exist, and the splits can be listed individually (e.g. `train,val,test`)

- Directory structure after conversion (see `dataset/YOLOV5_COCO_format` for details):
```text
YOLOV5_COCO_format
├── annotations
│   ├── instances_train2017.json
│   └── instances_val2017.json
├── train2017
│   ├── 000000000001.jpg
│   └── 000000000002.jpg   # this one is a background image
└── val2017
    └── 000000000001.jpg
```
</details>
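
At the heart of any YOLO → COCO conversion is the bounding-box re-encoding: a YOLOV5 label line stores the class id plus the box center and size normalized to the image dimensions, while COCO stores absolute `[x_min, y_min, width, height]`. A small sketch of that transformation (a hypothetical helper, not the repository's actual function):

```python
def yolo_to_coco_bbox(cx, cy, w, h, img_w, img_h):
    """Convert a normalized YOLO box (cx, cy, w, h) into a COCO [x_min, y_min, width, height] box."""
    box_w = w * img_w
    box_h = h * img_h
    x_min = cx * img_w - box_w / 2
    y_min = cy * img_h - box_h / 2
    return [x_min, y_min, box_w, box_h]

# A box centered in a 640x480 image and covering half of it in each dimension:
print(yolo_to_coco_bbox(0.5, 0.5, 0.5, 0.5, 640, 480))  # [160.0, 120.0, 320.0, 240.0]
```

Background images need no such boxes: in COCO they are simply extra entries in the `images` list with no matching entries in `annotations`.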

#### YOLOV5 YAML description file → COCO
<details>

- The YOLOV5 yaml dataset directory structure is as follows (see `dataset/YOLOV5_yaml` for details):
```text
YOLOV5_yaml
├── images
│   ├── train
│   │   ├── images(13).jpg
│   │   └── images(3).jpg
│   └── val
│       ├── images(13).jpg
│       └── images(3).jpg
├── labels
│   ├── train
│   │   ├── images(13).txt
│   │   └── images(3).txt
│   └── val
│       ├── images(13).txt
│       └── images(3).txt
└── sample.yaml
```

- Conversion:
```shell
python yolov5_yaml_2_coco.py --yaml_path dataset/YOLOV5_yaml/sample.yaml
```
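
A YOLOV5 dataset yaml conventionally points at the train/val image locations and lists the class names, so reading it only takes PyYAML. A sketch assuming the standard `train`/`val`/`nc`/`names` keys (the exact keys used by `sample.yaml` may differ):

```python
import yaml  # PyYAML

with open("dataset/YOLOV5_yaml/sample.yaml", encoding="utf-8") as f:
    cfg = yaml.safe_load(f)

train_dir = cfg["train"]    # e.g. a path such as images/train
val_dir = cfg["val"]        # e.g. a path such as images/val
class_names = cfg["names"]  # list of class names, typically of length cfg["nc"]
print(train_dir, val_dir, class_names)
```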

#### darknet-format data → COCO
- darknet training-data directory structure (see `dataset/darknet` for details):
```text
darknet
├── class.names
├── gen_config.data
├── gen_train.txt
├── gen_valid.txt
└── images
    ├── train
    └── valid
```

- Conversion:
```shell
python darknet2coco.py --data_path dataset/darknet/gen_config.data
```
</details>
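
A darknet `.data` file such as `gen_config.data` is, by convention, a list of `key = value` lines (`classes`, `train`, `valid`, `names`, `backup`), and those paths are all a converter needs. A minimal parser sketch under that assumption (not the repository's implementation):

```python
def read_darknet_data(path):
    """Parse a darknet .data file into a dict mapping keys to value strings."""
    cfg = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks, comments and malformed lines
            key, value = line.split("=", 1)
            cfg[key.strip()] = value.strip()
    return cfg

cfg = read_darknet_data("dataset/darknet/gen_config.data")
print(cfg.get("train"), cfg.get("valid"), cfg.get("names"))
```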

The rest of the diff leaves the visualization section unchanged apart from wrapping it in a <details> block:

 #### Visualize images in COCO format
+<details>
+
 ```shell
 python coco_visual.py --vis_num 1 \
     --json_path dataset/YOLOV5_COCO_format/annotations/instances_train2017.json \
@@ -155,5 +175,8 @@
 - `--json_path`: path of the JSON file whose images you want to inspect
 - `--img_dir`: directory containing those images
 
+</details>
+
+
 #### Related reading
 - [MSCOCO annotation format explained](https://blog.csdn.net/wc781708249/article/details/79603522)
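
The same check that `coco_visual.py` performs can be reproduced ad hoc with `pycocotools` and `matplotlib`. The snippet below is only a sketch of the idea, not the script itself, and assumes the `YOLOV5_COCO_format` layout shown above:

```python
from pycocotools.coco import COCO
from PIL import Image
import matplotlib.pyplot as plt
import matplotlib.patches as patches

coco = COCO("dataset/YOLOV5_COCO_format/annotations/instances_train2017.json")
img_id = coco.getImgIds()[0]                       # first image in the training set
info = coco.loadImgs(img_id)[0]
anns = coco.loadAnns(coco.getAnnIds(imgIds=img_id))

image = Image.open(f"dataset/YOLOV5_COCO_format/train2017/{info['file_name']}")
fig, ax = plt.subplots()
ax.imshow(image)
for ann in anns:
    x, y, w, h = ann["bbox"]                       # COCO boxes are [x_min, y_min, width, height]
    ax.add_patch(patches.Rectangle((x, y), w, h, fill=False, linewidth=2, edgecolor="red"))
plt.show()
```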

dataset/labelImg_dataset_output/test.txt

Lines changed: 1 addition & 1 deletion

@@ -1 +1 @@
-dataset\labelImg_dataset_output\images\images(3).jpg
+dataset\labelImg_dataset_output\images\images5.jpg

dataset/labelImg_dataset_output/train.txt

Lines changed: 2 additions & 1 deletion

@@ -1,3 +1,4 @@
+dataset\labelImg_dataset_output\images\images(13).jpg
 dataset\labelImg_dataset_output\images\images4.jpg
+dataset\labelImg_dataset_output\images\images(3).jpg
 dataset\labelImg_dataset_output\images\images5.jpg
-dataset\labelImg_dataset_output\images\images7.jpg
dataset/labelImg_dataset_output/val.txt

Lines changed: 1 addition & 1 deletion

@@ -1 +1 @@
-dataset\labelImg_dataset_output\images\images(13).jpg
+dataset\labelImg_dataset_output\images\images7.jpg

docs/README_en.md

Lines changed: 33 additions & 7 deletions

@@ -9,7 +9,10 @@ English | [简体中文](../README.md)
   <a href="./LICENSE"><img src="https://img.shields.io/badge/License-Apache%202-dfd.svg"></a>
 </p>
 
-#### LabelImg label data → YOLOV5 format
+#### labelImg label data → YOLOV5 format
+<details>
+<summary>Click to expand</summary>
+
 - Convert the yolo data format annotated with the [labelImg](https://github.com/tzutalin/labelImg) library to YOLOV5 format data with one click
 - The labelImg label data directory structure is as follows (see `dataset/labelImg_dataset` for details):
 ```text
@@ -28,11 +31,19 @@ English | [简体中文](../README.md)
 └── images7.txt
 ```
 - Convert
-```shell
-python labelImg_2_yolov5.py --src_dir dataset/labelImg_dataset --out_dir dataset/labelImg_dataset_output
-```
-- `--src_dir`: the directory where labelImg is marked
-- `--out_dir`: The data storage location after conversion
+```shell
+python labelImg_2_yolov5.py --src_dir dataset/labelImg_dataset \
+                            --out_dir dataset/labelImg_dataset_output \
+                            --val_ratio 0.2 \
+                            --have_test true \
+                            --test_ratio 0.2
+```
+- `--src_dir`: the directory where labelImg is stored after labeling.
+- `--out_dir`: the location where the data is stored after conversion.
+- `--val_ratio`: the ratio of the generated validation set to the whole data, default is `0.2`.
+- `--have_test`: whether to generate the test part of the data, the default is `True`.
+- `--test_ratio`: percentage of the whole data used for the test data, default is `0.2`.
+
 - Converted directory structure (see `dataset/labelImg_dataset_output` for details):
 ```text
 labelImg_dataset_output/
@@ -49,7 +60,7 @@ English | [简体中文](../README.md)
 │   ├── images4.txt
 │   ├── images5.txt
 │   └── images7.txt
-├── non_labels # This is the image without label txt
+├── non_labels # This is the directory for images without annotations.
 │   └── images6.jpg
 ├── test.txt
 ├── train.txt
@@ -59,8 +70,12 @@ English | [简体中文](../README.md)
 ```shell
 python yolov5_2_coco.py --dir_path dataset/labelImg_dataset_output
 ```
+</details>
 
 #### YOLOV5 format data → COCO
+<details>
+<summary>Click to expand</summary>
+
 - Some background images can be added to the training by directly placing them into the `background_images` directory.
 - The conversion program will automatically scan this directory and add it to the training set, allowing seamless integration with subsequent [YOLOX](https://github.com/Megvii-BaseDetection/YOLOX) training.
 - YOLOV5 training format directory structure (see `dataset/YOLOV5` for details):
@@ -98,8 +113,12 @@ English | [简体中文](../README.md)
 └── val2017
     └── 000000000001.jpg
 ```
+</details>
 
 #### YOLOV5 YAML description file → COCO
+<details>
+<summary>Click to expand</summary>
+
 - The YOLOV5 yaml data file needs to contain:
 ```text
 YOLOV5_yaml
@@ -142,8 +161,12 @@ English | [简体中文](../README.md)
 ```shell
 python darknet2coco.py --data_path dataset/darknet/gen_config.data
 ```
+</details>
 
 #### Visualize images in COCO format
+<details>
+<summary>Click to expand</summary>
+
 ```shell
 python coco_visual.py --vis_num 1 \
     --json_path dataset/YOLOV5_COCO_format/annotations/instances_train2017.json \
@@ -154,5 +177,8 @@ python coco_visual.py --vis_num 1 \
 - `--json_path`: path to the json file of the image to view
 - `--img_dir`: the directory where the images are located
 
+</details>
+
+
 #### Related information
 - [MSCOCO Data Annotation Details](https://blog.csdn.net/wc781708249/article/details/79603522)
