[Enhancement] Discard deprecated lmdb dataset format and only support img+label now (#1681)
* [Enhance] Discard deprecated lmdb dataset format and only support img+label now
* rename
* update
* add ut
* update document
* update docs
* update test
* update test
* Update dataset.md
Co-authored-by: liukuikun <[email protected]>
File changed: docs/en/migration/dataset.md (16 additions, 28 deletions)
@@ -230,38 +230,26 @@ Specifically, we provide three dataset classes [IcdarDataset](mmocr.datasets.Icd
     parser_cfg=dict(
         type='LineJsonParser',
         keys=['filename', 'text'],
-    pipeline=[])
+    pipeline=[]))
 ```
 
-3. [RecogLMDBDataset](mmocr.datasets.RecogLMDBDataset) supports LMDB format annotations for text recognition. You just need to add a new dataset config to `configs/textrecog/_base_/datasets` and specify its dataset type as `RecogLMDBDataset`. For example, the following example shows how to configure and load the **label-only lmdb** `label.lmdb` from the toy dataset.
-
-```python
-data_root = 'tests/data/rec_toy_dataset/'
-
-lmdb_dataset = dict(
-    type='RecogLMDBDataset',
-    data_root=data_root,
-    ann_file='label.lmdb',
-    data_prefix=dict(img_path='imgs'),
-    pipeline=[])
-```
+3. [RecogLMDBDataset](mmocr.datasets.RecogLMDBDataset) supports LMDB format dataset (img+labels) for text recognition. You just need to add a new dataset config to `configs/textrecog/_base_/datasets` and specify its dataset type as `RecogLMDBDataset`. For example, the following example shows how to configure and load the **both labels and images** `imgs.lmdb` from the toy dataset.
 
-When the `lmdb` file contains **both labels and images**, in addition to setting the dataset type to `RecogLMDBDataset` as in the above example, you also need to replace the [`LoadImageFromFile`](mmocr.datasets.transforms.LoadImageFromFile) with [`LoadImageFromLMDB`](mmocr.datasets.transforms.LoadImageFromLMDB) in the data pipelines.
+- set the dataset type to `RecogLMDBDataset`
 
-```python
-# Specify the dataset type as RecogLMDBDataset
-data_root = 'tests/data/rec_toy_dataset/'
+```python
+# Specify the dataset type as RecogLMDBDataset
+data_root = 'tests/data/rec_toy_dataset/'
 
-lmdb_dataset = dict(
-    type='RecogLMDBDataset',
-    data_root=data_root,
-    ann_file='imgs.lmdb',
-    data_prefix=dict(img_path='imgs.lmdb'),  # setting the img_path as the lmdb name
-    pipeline=[])
-```
+lmdb_dataset = dict(
+    type='RecogLMDBDataset',
+    data_root=data_root,
+    ann_file='imgs.lmdb',
+    pipeline=None)
+```
 
-Also, replacing the image loading transforms in `train_pipeline` and `test_pipeline`, for example:
+- replace the [`LoadImageFromFile`](mmocr.datasets.transforms.LoadImageFromFile) with [`LoadImageFromNDArray`](mmocr.datasets.transforms.LoadImageFromNDArray) in the data pipelines in `train_pipeline` and `test_pipeline`., for example:
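
The hunk is cut off before the example that follows the last added line. As a rough sketch only, and not the committed text, the swap it describes could be applied to an existing pipeline config along these lines. The helper `use_ndarray_loader` and the sample `file_pipeline` are hypothetical names introduced for illustration; only the transform names `LoadImageFromFile` and `LoadImageFromNDArray` come from the diff. The sketch works on plain dicts, so it runs without MMOCR installed.

```python
from copy import deepcopy


def use_ndarray_loader(pipeline):
    """Return a copy of `pipeline` with the file-based image loader swapped.

    Hypothetical helper: every transform whose type is 'LoadImageFromFile'
    is renamed to 'LoadImageFromNDArray'; all other keys are left untouched.
    """
    swapped = deepcopy(pipeline)  # do not mutate the caller's config
    for transform in swapped:
        if transform.get('type') == 'LoadImageFromFile':
            transform['type'] = 'LoadImageFromNDArray'
    return swapped


# Illustrative pipeline only; the second entry is a placeholder transform,
# not something taken from this commit.
file_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='SomeOtherTransform'),
]

train_pipeline = use_ndarray_loader(file_pipeline)
test_pipeline = use_ndarray_loader(file_pipeline)
```

Copying the config before editing keeps the original file-based pipeline usable for datasets that are not stored as LMDB.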