configs/cls/mobilenetv3/README.md
English | [中文](README_CN.md)
# MobileNetV3 for text direction classification
## Introduction
### MobileNetV3: [Searching for MobileNetV3](https://arxiv.org/abs/1905.02244)
MobileNetV3[[1](#references)], published in 2019, combines the depthwise separable convolutions of V1, the inverted residuals and linear bottlenecks of V2, and the SE (Squeeze-and-Excitation) module, and uses NAS (Neural Architecture Search) to find the network configuration and parameters. MobileNetV3 first uses MnasNet to perform a coarse structure search, and then uses reinforcement learning to select the optimal configuration from a set of discrete choices. Finally, MobileNetV3 fine-tunes the architecture using NetAdapt. Overall, MobileNetV3 is a lightweight network with good performance in classification, detection, and segmentation tasks.
</p>
### Text direction classifier
The text direction in some images is reversed, so the text cannot be recognized correctly. Therefore, we use a text direction classifier to classify and rectify the text direction. The MobileNetV3 paper releases two versions of MobileNetV3: *MobileNetV3-Large* and *MobileNetV3-Small*. Considering the tradeoff between efficiency and accuracy, we adopt *MobileNetV3-Small* as the text direction classifier.
Currently we support 0 and 180 degree classification.
- Context: Training context denoted as {device}x{pieces}-{MS version}{MS mode}, where MS (MindSpore) mode can be G - graph mode or F - pynative mode with ms function. For example, D910x8-G is for training on 8 pieces of Ascend 910 NPU using graph mode.
## Quick Start
### Installation
Please refer to the [installation instructions](https://github.com/mindspore-lab/mindocr#installation) in MindOCR.
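As a hedged sketch, installing from source typically follows the clone-and-install pattern below; verify the exact steps (and the required MindSpore version) against the linked instructions.

```shell
# Sketch of a source install; assumes MindSpore is already installed
# per the official instructions for your device and CUDA/CANN version.
git clone https://github.com/mindspore-lab/mindocr.git
cd mindocr
pip install -e .
```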
### Dataset preparation
Please download the [RCTW17](https://rctw.vlrlab.net/dataset), [MTWI](https://tianchi.aliyun.com/competition/entrance/231684/introduction), and [LSVT](https://rrc.cvc.uab.es/?ch=16&com=introduction) datasets, and then process the images and labels into the desired format referring to [dataset_converters](https://github.com/mindspore-lab/mindocr/blob/main/tools/dataset_converters/README.md).
The suggested file structure of the prepared dataset is as follows.
> If you want to use your own dataset for training, please convert the images and labels to the desired format referring to [dataset_converters](https://github.com/mindspore-lab/mindocr/blob/main/tools/dataset_converters/README.md).
### Update yaml config file
Update the dataset directories in the yaml config file. The `dataset_root` will be concatenated with `data_dir` and `label_file` respectively to form the complete image directory and label file path.
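The concatenation above amounts to simple path joining. A minimal sketch with hypothetical values (`/data/ocr`, `training/images`, and `training/gt.txt` are placeholders for illustration, not paths from this repository):

```shell
# Hypothetical config values, for illustration only
dataset_root=/data/ocr
data_dir=training/images
label_file=training/gt.txt

# MindOCR joins them to form the full paths it reads from
echo "${dataset_root}/${data_dir}"    # complete image directory
echo "${dataset_root}/${label_file}"  # complete label file path
```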
```yaml
model:
  ...
  num_classes: *num_classes # 2 or 4
```
### Training

* Standalone training

Please set `distribute` in the yaml config file to be `False`.
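Training is launched through the repository's training script. A hedged sketch of the typical standalone invocation follows; the exact config filename is an assumption, so check `configs/cls/mobilenetv3/` in your checkout:

```shell
# Standalone training of the text direction classifier
# (the config path is an assumption; verify it in your clone)
python tools/train.py --config configs/cls/cls_mv3.yaml
```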
The training result (including checkpoints, per-epoch performance, and curves) will be saved in the directory parsed by the arg `ckpt_save_dir` in the yaml config file. The default directory is `./tmp_cls`.
### Evaluation
Please set the checkpoint path in the arg `ckpt_load_path` in the `eval` section of the yaml config file, set `distribute` to be `False`, and then run:
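A hedged sketch of the evaluation command; as with training, the exact config filename is an assumption to verify in `configs/cls/mobilenetv3/`:

```shell
# Evaluate using the checkpoint configured as ckpt_load_path in the yaml file
# (the config path is an assumption; verify it in your clone)
python tools/eval.py --config configs/cls/cls_mv3.yaml
```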