Commit ab5e13f

update readme (#790)

1 parent 8baba51 commit ab5e13f

File tree

20 files changed: +169 -1361 lines

configs/cls/mobilenetv3/README.md

Lines changed: 18 additions & 37 deletions
@@ -2,9 +2,9 @@ English | [中文](README_CN.md)

 # MobileNetV3 for text direction classification

-## 1. Introduction
+## Introduction

-### 1.1 MobileNetV3: [Searching for MobileNetV3](https://arxiv.org/abs/1905.02244)
+### MobileNetV3: [Searching for MobileNetV3](https://arxiv.org/abs/1905.02244)

 MobileNetV3[[1](#references)], published in 2019, combines the depthwise separable convolutions of V1, the inverted residuals and linear bottlenecks of V2, and the SE (Squeeze-and-Excitation) module, and uses NAS (Neural Architecture Search) to search the network configuration and parameters. MobileNetV3 first performs a coarse structure search with MnasNet, then uses reinforcement learning to select the optimal configuration from a set of discrete choices, and finally fine-tunes the architecture with NetAdapt. Overall, MobileNetV3 is a lightweight network with good performance on classification, detection and segmentation tasks.
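The SE (Squeeze-and-Excitation) module mentioned in the paragraph above can be illustrated with a short NumPy sketch. This is not MindOCR code; the FC weights below are toy placeholders standing in for learned parameters:

```python
import numpy as np

def se_block(x, reduction=4):
    """Squeeze-and-Excitation on a feature map x of shape (C, H, W).

    Squeeze: global average pooling reduces each channel to one scalar.
    Excitation: two small fully connected layers (ReLU then sigmoid)
    produce a per-channel weight in (0, 1) that rescales the channels.
    """
    c = x.shape[0]
    squeezed = x.mean(axis=(1, 2))                    # (C,)
    # Toy weights for illustration; in a real network these are learned.
    w1 = np.full((c // reduction, c), 1.0 / c)        # reduce C -> C/r
    w2 = np.full((c, c // reduction), reduction / c)  # expand back to C
    hidden = np.maximum(w1 @ squeezed, 0.0)           # ReLU
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))      # sigmoid, (C,)
    return x * gates[:, None, None]                   # channel-wise rescale

out = se_block(np.ones((8, 4, 4)))
print(out.shape)  # (8, 4, 4)
```

For efficiency, MobileNetV3 itself replaces the sigmoid gate with a hard-sigmoid, but the squeeze-then-rescale structure is the same.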

@@ -16,7 +16,7 @@ MobileNetV3[[1](#references)], published in 2019, combines the depthwise separab
 </p>


-### 1.2 Text direction classifier
+### Text direction classifier

 The text direction in some images is reversed, so the text cannot be recognized correctly. Therefore, we use a text direction classifier to classify and rectify the text direction. The MobileNetV3 paper releases two versions of MobileNetV3: *MobileNetV3-Large* and *MobileNetV3-Small*. Trading off efficiency against accuracy, we adopt *MobileNetV3-Small* as the text direction classifier.

@@ -32,32 +32,34 @@ Currently we support the 0 and 180 degree classification. You can update the par
 </div>


-## 2. Results
+## Results
+
+| mindspore | ascend driver | firmware    | cann toolkit/kernel |
+|:---------:|:-------------:|:-----------:|:-------------------:|
+| 2.3.1     | 24.1.RC2      | 7.3.0.1.231 | 8.0.RC2.beta1       |

 MobileNetV3 is pretrained on ImageNet. For the text direction classification task, we further train MobileNetV3 on the RCTW17, MTWI and LSVT datasets.

+Experiments are run on Ascend 910* with MindSpore 2.3.1 in graph mode.
 <div align="center">

-| **Model** | **Context** | **Specification** | **Pretrained dataset** | **Training dataset** | **Accuracy** | **Train T.** | **Throughput** | **Recipe** | **Download** |
-|-----------|-------------|-------------------|------------------------|----------------------|--------------|--------------|----------------|------------|--------------|
-| MobileNetV3 | D910x4-MS2.0-G | small | ImageNet | RCTW17, MTWI, LSVT | 94.59% | 154.2 s/epoch | 5923.5 img/s | [yaml](cls_mv3.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/cls/cls_mobilenetv3-92db9c58.ckpt) |
+| **model name** | **cards** | **batch size** | **img/s** | **accuracy** | **config** | **weight** |
+|----------------|-----------|----------------|-----------|--------------|------------|------------|
+| MobileNetV3 | 4 | 256 | 5923.5 | 94.59% | [yaml](cls_mv3.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/cls/cls_mobilenetv3-92db9c58.ckpt) |
 </div>


-#### Notes
-- Context: Training context denoted as {device}x{pieces}-{MS version}{MS mode}, where MS (MindSpore) mode can be G - graph mode or F - pynative mode with ms function. For example, D910x8-G is for training on 8 pieces of Ascend 910 NPU using graph mode.


-## 3. Quick Start
+## Quick Start

-### 3.1 Installation
+### Installation

 Please refer to the [installation instruction](https://github.com/mindspore-lab/mindocr#installation) in MindOCR.

-### 3.2 Dataset preparation
+### Dataset preparation

-Please download [RCTW17](https://rctw.vlrlab.net/dataset), [MTWI](https://tianchi.aliyun.com/competition/entrance/231684/introduction), and [LSVT](https://rrc.cvc.uab.es/?ch=16&com=introduction) datasets, and then process the images and labels in desired format referring to [dataset_converters](https://github.com/mindspore-lab/mindocr/blob/main/tools/dataset_converters/README.md) (Coming soon...).
+Please download the [RCTW17](https://rctw.vlrlab.net/dataset), [MTWI](https://tianchi.aliyun.com/competition/entrance/231684/introduction), and [LSVT](https://rrc.cvc.uab.es/?ch=16&com=introduction) datasets, and then process the images and labels into the desired format, referring to [dataset_converters](https://github.com/mindspore-lab/mindocr/blob/main/tools/dataset_converters/README.md).

 The prepared dataset file structure is suggested to be as follows.

@@ -75,7 +77,7 @@ The prepared dataset file structure is suggested to be as follows.
 > If you want to use your own dataset for training, please convert the images and labels to the desired format referring to [dataset_converters](https://github.com/mindspore-lab/mindocr/blob/main/tools/dataset_converters/README.md).


-### 3.3 Update yaml config file
+### Update yaml config file

 Update the dataset directories in the yaml config file. The `dataset_root` will be concatenated with `data_dir` and `label_file` respectively to form the complete image directory and label file path.

@@ -117,29 +119,8 @@ model:
   num_classes: *num_classes # 2 or 4
 ```
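To make the path concatenation described above concrete, a hypothetical excerpt of the dataset section might look as follows. The key layout follows the `dataset_root`/`data_dir`/`label_file` convention in the text; all paths are placeholders, and the exact nesting may differ in `cls_mv3.yaml`:

```yaml
train:
  dataset:
    dataset_root: /data/ocr              # common prefix (placeholder)
    data_dir: cls/train/images           # -> /data/ocr/cls/train/images
    label_file: cls/train/gt_labels.txt  # -> /data/ocr/cls/train/gt_labels.txt
```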
-### 3.4 Training
-
-* Standalone training
-
-Please set `distribute` in yaml config file to be `False`.
-
-```shell
-python tools/train.py -c configs/cls/mobilenetv3/cls_mv3.yaml
-```
-
-* Distributed training
-
-Please set `distribute` in yaml config file to be `True`.
-
-```shell
-# n is the number of NPUs
-mpirun --allow-run-as-root -n 4 python tools/train.py -c configs/cls/mobilenetv3/cls_mv3.yaml
-```
-
-The training result (including checkpoints, per-epoch performance and curves) will be saved in the directory parsed by the arg `ckpt_save_dir` in yaml config file. The default directory is `./tmp_cls`.

-### 3.5 Evaluation
+### Evaluation

 Please set the checkpoint path to the arg `ckpt_load_path` in the `eval` section of yaml config file, set `distribute` to be `False`, and then run:
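For reference, the eval-related settings described in that last context line could look roughly like this. This is a hedged sketch: the checkpoint path is a placeholder, and the assumption that `distribute` lives under a `system` section may not match the actual layout of `cls_mv3.yaml`:

```yaml
system:
  distribute: False                      # single-device evaluation
eval:
  ckpt_load_path: ./tmp_cls/best.ckpt    # placeholder checkpoint path
```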

configs/cls/mobilenetv3/README_CN.md

Lines changed: 19 additions & 40 deletions
@@ -2,9 +2,9 @@

 # MobileNetV3 for text direction classification

-## 1. Overview
+## Overview

-### 1.1 MobileNetV3: [Searching for MobileNetV3](https://arxiv.org/abs/1905.02244)
+### MobileNetV3: [Searching for MobileNetV3](https://arxiv.org/abs/1905.02244)

 MobileNetV3[[1](#references)] was released in 2019. It combines the depthwise separable convolutions of V1, the inverted residuals and linear bottlenecks of V2, and the SE (Squeeze-and-Excitation) module, and uses NAS (Neural Architecture Search) to search for the optimal network configuration and parameters. MobileNetV3 first performs a coarse-grained structure search with MnasNet, then uses reinforcement learning to select the optimal configuration from a set of discrete choices, and further fine-tunes the architecture with NetAdapt. In short, MobileNetV3 is a lightweight network that performs well on classification, detection and segmentation tasks.

@@ -16,7 +16,7 @@ MobileNetV3[[1](#references)] was released in 2019. It combines the depthwise se
 <em>Figure 1. Overall architecture of MobileNetV3 [<a href="#references">1</a>] </em>
 </p>

-### 1.2 Text direction classifier
+### Text direction classifier

 In some images the text direction is reversed or otherwise incorrect, so the text cannot be recognized correctly. We therefore use a text direction classifier to classify and rectify the text direction. The MobileNetV3 paper provides two versions: *MobileNetV3-Large* and *MobileNetV3-Small*. To balance efficiency and accuracy, we adopt *MobileNetV3-Small* as the text direction classifier.

@@ -32,32 +32,34 @@ MobileNetV3[[1](#references)] was released in 2019. It combines the depthwise se
 </div>


-## 2. Results
+## Results
+
+| mindspore | ascend driver | firmware    | cann toolkit/kernel |
+|:---------:|:-------------:|:-----------:|:-------------------:|
+| 2.3.1     | 24.1.RC2      | 7.3.0.1.231 | 8.0.RC2.beta1       |

 MobileNetV3 is pretrained on ImageNet. We further train it on the RCTW17, MTWI and LSVT datasets for the text direction classification task.

+Experiments are run on Ascend 910* in graph mode with MindSpore 2.3.1.
 <div align="center">

-| **Model** | **Context** | **Specification** | **Pretrained dataset** | **Training dataset** | **Accuracy** | **Training time** | **Throughput** | **Config** | **Weights** |
-|-----------|-------------|-------------------|------------------------|----------------------|--------------|-------------------|----------------|------------|-------------|
-| MobileNetV3 | D910x4-MS2.0-G | small | ImageNet | RCTW17, MTWI, LSVT | 94.59% | 154.2 s/epoch | 5923.5 img/s | [yaml](cls_mv3.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/cls/cls_mobilenetv3-92db9c58.ckpt) |
+| **model name** | **cards** | **batch size (per card)** | **img/s** | **accuracy** | **config** | **weight** |
+|----------------|-----------|---------------------------|-----------|--------------|------------|------------|
+| MobileNetV3 | 4 | 256 | 5923.5 | 94.59% | [yaml](cls_mv3.yaml) | [ckpt](https://download.mindspore.cn/toolkits/mindocr/cls/cls_mobilenetv3-92db9c58.ckpt) |
 </div>


-#### Notes:
-- Context: the training context is denoted {device}x{count}-{MS mode}, where the MS (MindSpore) mode is G (graph mode) or F (pynative mode).
-
-## 3. Quick Start
+## Quick Start

-### 3.1 Installation
+### Installation

 Please refer to the MindOCR [installation guide](https://github.com/mindspore-lab/mindocr#installation).

-### 3.2 Dataset preparation
+### Dataset preparation

-#### 3.2.1 ICDAR2015 dataset
+#### ICDAR2015 dataset

-Please download the [RCTW17](https://rctw.vlrlab.net/dataset), [MTWI](https://tianchi.aliyun.com/competition/entrance/231684/introduction) and [LSVT](https://rrc.cvc.uab.es/?ch=16&com=introduction) datasets, then convert the datasets and annotations into the required format following the [data conversion](https://github.com/mindspore-lab/mindocr/blob/main/tools/dataset_converters/README_CN.md) section (coming soon).
+Please download the [RCTW17](https://rctw.vlrlab.net/dataset), [MTWI](https://tianchi.aliyun.com/competition/entrance/231684/introduction) and [LSVT](https://rrc.cvc.uab.es/?ch=16&com=introduction) datasets, then convert the datasets and annotations into the required format following the [data conversion](https://github.com/mindspore-lab/mindocr/blob/main/tools/dataset_converters/README_CN.md) section.

 After data preparation, the data directory structure should look as follows:

@@ -75,7 +77,7 @@ MobileNetV3 is pretrained on ImageNet. We further train it on the RCTW17, MTWI a
 > To train with your own dataset, please follow [data conversion](https://github.com/mindspore-lab/mindocr/blob/main/tools/dataset_converters/README_CN.md) to convert the dataset and annotations into the required format.


-### 3.3 Configuration
+### Configuration


 Update the dataset paths in the config file. `dataset_root` is concatenated with `data_dir` and `label_file` respectively to form the full dataset directory and label file path.
@@ -118,30 +120,7 @@ model:
   num_classes: *num_classes # 2 or 4
 ```
-
-### 3.4 Training
-
-* Standalone training
-
-Please make sure `distribute` in the yaml config file is set to `False`.
-
-``` shell
-python tools/train.py -c configs/cls/mobilenetv3/cls_mv3.yaml
-```
-
-* Distributed training
-
-Please make sure `distribute` in the yaml config file is set to `True`.
-
-```shell
-# n is the number of NPUs
-mpirun --allow-run-as-root -n 4 python tools/train.py -c configs/cls/mobilenetv3/cls_mv3.yaml
-```
-
-The training results (including checkpoints, per-epoch performance and curves) will be saved in the directory set by `ckpt_save_dir` in the yaml config file, which defaults to `./tmp_cls`.

-### 3.5 Evaluation
+### Evaluation

 For evaluation, set `ckpt_load_path` in the yaml config file to the checkpoint file path, set `distribute` to `False`, and then run:
