ValueError: class `EpochBasedTrainLoop` in mmengine/runner/loops.py: class `OCRDataset` in mmocr/datasets/ocr_dataset.py: Annotation must have data_list and metainfo keys #1839

w1125 · 2023-04-05T08:44:17Z

w1125
Apr 5, 2023

I am training on the ctw1500 dataset with a model from the MMOCR 1.x branch and get the error below. What is causing it?
Traceback (most recent call last):
  File "/root/miniconda3/lib/python3.8/site-packages/mmengine/registry/build_functions.py", line 121, in build_from_cfg
    obj = obj_cls(**args)  # type: ignore
  File "/root/miniconda3/lib/python3.8/site-packages/mmengine/runner/loops.py", line 44, in __init__
    super().__init__(runner, dataloader)
  File "/root/miniconda3/lib/python3.8/site-packages/mmengine/runner/base_loop.py", line 26, in __init__
    self.dataloader = runner.build_dataloader(
  File "/root/miniconda3/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1341, in build_dataloader
    dataset = DATASETS.build(dataset_cfg)
  File "/root/miniconda3/lib/python3.8/site-packages/mmengine/registry/registry.py", line 548, in build
    return self.build_func(cfg, *args, **kwargs, registry=self)
  File "/root/miniconda3/lib/python3.8/site-packages/mmengine/registry/build_functions.py", line 135, in build_from_cfg
    raise type(e)(
ValueError: class OCRDataset in mmocr/datasets/ocr_dataset.py: Annotation must have data_list and metainfo keys

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "tools/train.py", line 111, in <module>
    main()
  File "tools/train.py", line 107, in main
    runner.train()
  File "/root/miniconda3/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1667, in train
    self._train_loop = self.build_train_loop(
  File "/root/miniconda3/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1459, in build_train_loop
    loop = LOOPS.build(
  File "/root/miniconda3/lib/python3.8/site-packages/mmengine/registry/registry.py", line 548, in build
    return self.build_func(cfg, *args, **kwargs, registry=self)
  File "/root/miniconda3/lib/python3.8/site-packages/mmengine/registry/build_functions.py", line 135, in build_from_cfg
    raise type(e)(
ValueError: class EpochBasedTrainLoop in mmengine/runner/loops.py: class OCRDataset in mmocr/datasets/ocr_dataset.py: Annotation must have data_list and metainfo keys

gaotongxiao · 2023-04-05T15:48:04Z

gaotongxiao
Apr 5, 2023
Maintainer

Please use the Dataset Preparer to prepare the dataset.

python tools/dataset_converters/prepare_dataset.py ctw1500 --task textdet
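For readers who want to confirm that an annotation file is the cause before re-running the preparer, here is a minimal sketch. It only checks for the two top-level keys named in the error message (`data_list` and `metainfo`). The function names and the example path are hypothetical and are not part of MMOCR or MMEngine.

```python
import json

# The two top-level keys named in the ValueError above.
REQUIRED_KEYS = ("metainfo", "data_list")


def missing_annotation_keys(annotation):
    """Return the required top-level keys that are absent from a loaded annotation."""
    if not isinstance(annotation, dict):
        # A list or scalar at the top level cannot carry either key.
        return list(REQUIRED_KEYS)
    return [key for key in REQUIRED_KEYS if key not in annotation]


def check_annotation_file(path):
    """Load a JSON annotation file and report which required keys are missing."""
    with open(path, encoding="utf-8") as f:
        annotation = json.load(f)
    return missing_annotation_keys(annotation)


if __name__ == "__main__":
    # Hypothetical in-memory examples; replace with check_annotation_file(<your ann_file>).
    print(missing_annotation_keys({"metainfo": {}, "data_list": []}))
    print(missing_annotation_keys({"images": [], "annotations": []}))
```

An empty result means both keys are present. Any key listed in the result suggests the file is not in the format the dataset class expects, for example because it was produced for an older MMOCR version.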

0 replies

w1125 · 2023-04-06T05:49:15Z

w1125
Apr 6, 2023
Author

There doesn't seem to be a ctw1500 config in dataset-zoo:

UserWarning: ctw1500 is not supported yet. Please check dataset zoo for supported datasets.
  warnings.warn(f'{dataset} is not supported yet. Please check '


1 reply

gaotongxiao Apr 6, 2023
Maintainer

It has been included in the latest 1.0.0 release. Please pull down the 1.x branch and try again.

w1125 · 2023-04-07T08:43:38Z

w1125
Apr 7, 2023
Author

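After switching branches, the ctw1500 download can be started, but both ctw1500 and totaltext fail with the error below and the download never succeeds:

urllib.error.URLError: <urlopen error [Errno 99] Cannot assign requested address>

So I converted the annotation files with the dataset migration command described in the documentation. Is that a valid approach?

Also, once the dataset was ready, I ran the training command. After modifying the dbnetpp model, it fails:

python tools/train.py configs/textdet/dbnetpp/dbnetpp_resnet50-oclip_fpnc_1200e_icdar2015.py --work-dir result/PANet

KeyError: 'DBNet is not in the model registry. Please check whether the value of `DBNet` is correct or it was registered as expected. More details can be found at https://mmengine.readthedocs.io/en/latest/advanced_tutorials/config.html#import-the-custom-module'

I also tried dbnet and fcenet with their original configs, and all of them fail with the same "not registered" error. The runtime environment, config, and error message are shown below. How can I solve this? Thank you for your reply.

python tools/train.py configs/textdet/dbnetpp/dbnetpp_resnet50-oclip_fpnc_1200e_icdar2015.py --work-dir result/PANet
04/07 16:08:05 - mmengine - INFO -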


------------------------------------------------------------
System environment:
    sys.platform: linux
    Python: 3.8.10 (default, Jun 4 2021, 15:09:15) [GCC 7.5.0]
    CUDA available: True
    numpy_random_seed: 1578743701
    GPU 0: NVIDIA GeForce RTX 3090
    CUDA_HOME: /usr/local/cuda
    NVCC: Cuda compilation tools, release 11.1, V11.1.105
    GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
    PyTorch: 1.8.1+cu111
    PyTorch compiling details: PyTorch built with:
      - GCC 7.3
      - C++ Version: 201402
      - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
      - Intel(R) MKL-DNN v1.7.0 (Git Hash 7aed236906b1f7a05c0917e5257a1af05e9ff683)
      - OpenMP 201511 (a.k.a. OpenMP 4.5)
      - NNPACK is enabled
      - CPU capability usage: AVX2
      - CUDA Runtime 11.1
      - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
      - CuDNN 8.0.5
      - Magma 2.5.2
      - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.8.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,
    TorchVision: 0.9.1+cu111
    OpenCV: 4.6.0
    MMEngine: 0.7.0

Runtime environment:
    cudnn_benchmark: False
    mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0}
    dist_cfg: {'backend': 'nccl'}
    seed: None
    Distributed launcher: none
    Distributed training: False
    GPU number: 1

------------------------------------------------------------
04/07 16:08:05 - mmengine - INFO - Config:
file_client_args = dict(backend='disk')
model = dict(
    type='DBNet',
    backbone=dict(type='CLIPResNet', init_cfg=dict(type='Pretrained', checkpoint='https://download.openmmlab.com/mmocr/backbone/resnet50-oclip-7ba0c533.pth')),
    neck=dict(type='PAFPN', in_channels=[256, 512, 1024, 2048], out_channels=256, num_outs=5, asf_cfg=dict(attention_type='ScaleChannelSpatial')),
    det_head=dict(type='DBHead', in_channels=256, module_loss=dict(type='DBModuleLoss'), postprocessor=dict(type='DBPostprocessor', text_repr_type='quad', epsilon_ratio=0.002)),
    data_preprocessor=dict(type='TextDetDataPreprocessor', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], bgr_to_rgb=True, pad_size_divisor=32))
train_pipeline = [
    dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
    dict(type='LoadOCRAnnotations', with_bbox=True, with_polygon=True, with_label=True),
    dict(type='TorchVisionWrapper', op='ColorJitter', brightness=0.12549019607843137, saturation=0.5),
    dict(type='ImgAugWrapper', args=[['Fliplr', 0.5], {'cls': 'Affine', 'rotate': [-10, 10]}, ['Resize', [0.5, 3.0]]]),
    dict(type='RandomCrop', min_side_ratio=0.1),
    dict(type='Resize', scale=(640, 640), keep_ratio=True),
    dict(type='Pad', size=(640, 640)),
    dict(type='PackTextDetInputs', meta_keys=('img_path', 'ori_shape', 'img_shape'))
]
test_pipeline = [
    dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
    dict(type='Resize', scale=(4068, 1024), keep_ratio=True),
    dict(type='LoadOCRAnnotations', with_polygon=True, with_bbox=True, with_label=True),
    dict(type='PackTextDetInputs', meta_keys=('img_path', 'ori_shape', 'img_shape', 'scale_factor', 'instances'))
]
default_scope = 'mmocr'
env_cfg = dict(cudnn_benchmark=False, mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0), dist_cfg=dict(backend='nccl'))
randomness = dict(seed=None)
default_hooks = dict(
    timer=dict(type='IterTimerHook'),
    logger=dict(type='LoggerHook', interval=5),
    param_scheduler=dict(type='ParamSchedulerHook'),
    checkpoint=dict(type='CheckpointHook', interval=20),
    sampler_seed=dict(type='DistSamplerSeedHook'),
    sync_buffer=dict(type='SyncBuffersHook'),
    visualization=dict(type='VisualizationHook', interval=1, enable=False, show=False, draw_gt=False, draw_pred=False))
log_level = 'INFO'
log_processor = dict(type='LogProcessor', window_size=10, by_epoch=True)
load_from = None
resume = False
val_evaluator = dict(type='HmeanIOUMetric')
test_evaluator = dict(type='HmeanIOUMetric')
vis_backends = [dict(type='LocalVisBackend')]
visualizer = dict(type='TextDetLocalVisualizer', name='visualizer', vis_backends=[dict(type='LocalVisBackend')])
icdar2015_textdet_data_root = 'data/icdar2015'
icdar2015_textdet_train = dict(
    type='OCRDataset',
    data_root='data/icdar2015',
    ann_file='textdet_train.json',
    filter_cfg=dict(filter_empty_gt=True, min_size=32),
    pipeline=[
        dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
        dict(type='LoadOCRAnnotations', with_bbox=True, with_polygon=True, with_label=True),
        dict(type='TorchVisionWrapper', op='ColorJitter', brightness=0.12549019607843137, saturation=0.5),
        dict(type='ImgAugWrapper', args=[['Fliplr', 0.5], {'cls': 'Affine', 'rotate': [-10, 10]}, ['Resize', [0.5, 3.0]]]),
        dict(type='RandomCrop', min_side_ratio=0.1),
        dict(type='Resize', scale=(640, 640), keep_ratio=True),
        dict(type='Pad', size=(640, 640)),
        dict(type='PackTextDetInputs', meta_keys=('img_path', 'ori_shape', 'img_shape'))
    ])
icdar2015_textdet_test = dict(
    type='OCRDataset',
    data_root='data/icdar2015',
    ann_file='textdet_test.json',
    test_mode=True,
    pipeline=[
        dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
        dict(type='Resize', scale=(4068, 1024), keep_ratio=True),
        dict(type='LoadOCRAnnotations', with_polygon=True, with_bbox=True, with_label=True),
        dict(type='PackTextDetInputs', meta_keys=('img_path', 'ori_shape', 'img_shape', 'scale_factor', 'instances'))
    ])
optim_wrapper = dict(type='OptimWrapper', optimizer=dict(type='SGD', lr=0.002, momentum=0.9, weight_decay=0.0001))
train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=1200, val_interval=20)
val_cfg = dict(type='ValLoop')
test_cfg = dict(type='TestLoop')
param_scheduler = [
    dict(type='LinearLR', end=200, start_factor=0.001),
    dict(type='PolyLR', power=0.9, eta_min=1e-07, begin=200, end=1200)
]
train_dataloader = dict(
    batch_size=16,
    num_workers=24,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=True),
    dataset=dict(
        type='OCRDataset',
        data_root='data/icdar2015',
        ann_file='textdet_train.json',
        filter_cfg=dict(filter_empty_gt=True, min_size=32),
        pipeline=[
            dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
            dict(type='LoadOCRAnnotations', with_bbox=True, with_polygon=True, with_label=True),
            dict(type='TorchVisionWrapper', op='ColorJitter', brightness=0.12549019607843137, saturation=0.5),
            dict(type='ImgAugWrapper', args=[['Fliplr', 0.5], {'cls': 'Affine', 'rotate': [-10, 10]}, ['Resize', [0.5, 3.0]]]),
            dict(type='RandomCrop', min_side_ratio=0.1),
            dict(type='Resize', scale=(640, 640), keep_ratio=True),
            dict(type='Pad', size=(640, 640)),
            dict(type='PackTextDetInputs', meta_keys=('img_path', 'ori_shape', 'img_shape'))
        ]))
val_dataloader = dict(
    batch_size=1,
    num_workers=4,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type='OCRDataset',
        data_root='data/icdar2015',
        ann_file='textdet_test.json',
        test_mode=True,
        pipeline=[
            dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
            dict(type='Resize', scale=(4068, 1024), keep_ratio=True),
            dict(type='LoadOCRAnnotations', with_polygon=True, with_bbox=True, with_label=True),
            dict(type='PackTextDetInputs', meta_keys=('img_path', 'ori_shape', 'img_shape', 'scale_factor', 'instances'))
        ]))
test_dataloader = dict(
    batch_size=1,
    num_workers=4,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type='OCRDataset',
        data_root='data/icdar2015',
        ann_file='textdet_test.json',
        test_mode=True,
        pipeline=[
            dict(type='LoadImageFromFile', color_type='color_ignore_orientation'),
            dict(type='Resize', scale=(4068, 1024), keep_ratio=True),
            dict(type='LoadOCRAnnotations', with_polygon=True, with_bbox=True, with_label=True),
            dict(type='PackTextDetInputs', meta_keys=('img_path', 'ori_shape', 'img_shape', 'scale_factor', 'instances'))
        ]))
auto_scale_lr = dict(base_batch_size=16)
launcher = 'none'
work_dir = 'result/PANet'

04/07 16:08:06 - mmengine - WARNING - Failed to import mmocr.models, please check the location of the registry model is correct.
Traceback (most recent call last):
  File "tools/train.py", line 114, in <module>
    main()
  File "tools/train.py", line 103, in main
    runner = Runner.from_cfg(cfg)
  File "/root/miniconda3/lib/python3.8/site-packages/mmengine/runner/runner.py", line 439, in from_cfg
    runner = cls(
  File "/root/miniconda3/lib/python3.8/site-packages/mmengine/runner/runner.py", line 406, in __init__
    self.model = self.build_model(model)
  File "/root/miniconda3/lib/python3.8/site-packages/mmengine/runner/runner.py", line 808, in build_model
    model = MODELS.build(model)
  File "/root/miniconda3/lib/python3.8/site-packages/mmengine/registry/registry.py", line 548, in build
    return self.build_func(cfg, *args, **kwargs, registry=self)
  File "/root/miniconda3/lib/python3.8/site-packages/mmengine/registry/build_functions.py", line 241, in build_model_from_cfg
    return build_from_cfg(cfg, registry, default_args)
  File "/root/miniconda3/lib/python3.8/site-packages/mmengine/registry/build_functions.py", line 100, in build_from_cfg
    raise KeyError(
KeyError: 'DBNet is not in the model registry. Please check whether the value of `DBNet` is correct or it was registered as expected. More details can be found at https://mmengine.readthedocs.io/en/latest/advanced_tutorials/config.html#import-the-custom-module'

1 reply

gaotongxiao Apr 7, 2023
Maintainer

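"The download never succeeds, so I converted the annotation files with the dataset migration command described in the documentation. Is that a valid approach?"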

Yes - if you are migrating the annotations from MMOCR 0.x to 1.x.

For the second question, it's likely that you installed MMOCR via MIM but are now trying to run it from the source code. Try adding PYTHONPATH=. before your command.

A better approach is to reinstall MMOCR from source. You just need to run this command in MMOCR's directory:

pip install -e .
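To see which copy of MMOCR Python actually picks up, a quick check like the following may help. It is a minimal sketch using only the standard library. The helper names `package_origin` and `is_loaded_from` are hypothetical, and it assumes you run it from the root of your MMOCR checkout.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def package_origin(name: str) -> Optional[Path]:
    """Return the file a top-level package would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).resolve()


def is_loaded_from(name: str, root: Path) -> bool:
    """Return True if the package resolves to a file somewhere under `root`."""
    origin = package_origin(name)
    if origin is None:
        return False
    return Path(root).resolve() in origin.parents


if __name__ == "__main__":
    # Run from the MMOCR source directory (an assumption of this sketch).
    print("mmocr resolves to:", package_origin("mmocr"))
    print("loaded from this checkout:", is_loaded_from("mmocr", Path.cwd()))
```

If the reported path points into site-packages rather than your checkout, the installed copy is being used instead of the source tree, which would match the situation described above.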

w1125 · 2023-04-09T05:21:13Z

w1125
Apr 9, 2023
Author

Thank you for your reply. I tried both methods, but neither solved the problem. Here is what I did, in case I made a mistake.

1. cd mmocr, then pip install -e . The output shows that mmocr was reinstalled, but running the train script still fails:

Installing collected packages: mmocr
Attempting uninstall: mmocr
Found existing installation: mmocr 1.0.0rc6
Uninstalling mmocr-1.0.0rc6:
Successfully uninstalled mmocr-1.0.0rc6
Running setup.py develop for mmocr
Successfully installed mmocr-1.0.0rc6

2. PYTHONPATH =./mmocr, then running the train script: the same error appears.


0 replies

w1125 · 2023-04-09T06:29:13Z

w1125
Apr 9, 2023
Author

Afterwards I created a new environment and reinstalled mmocr from source. Running the train script without any changes shows:

MMEngine==0.7.0 is used but incompatible. Please install mmengine>=0.7.1, <1.0.0.

I then ran: mim install mmengine==0.7.1

ERROR: Could not find a version that satisfies the requirement mmengine==0.7.1 (from versions: 0.0.1rc0, 0.1.0, 0.2.0, 0.3.0, 0.3.1, 0.3.2, 0.4.0)
ERROR: No matching distribution found for mmengine==0.7.1

0 replies

gaotongxiao · 2023-04-10T03:07:19Z

gaotongxiao
Apr 10, 2023
Maintainer

"Afterwards I created a new environment and reinstalled mmocr from source. Running the train script without any changes shows: MMEngine==0.7.0 is used but incompatible. Please install mmengine>=0.7.1, <1.0.0.
I then ran: mim install mmengine==0.7.1"
ERROR: Could not find a version that satisfies the requirement mmengine==0.7.1 (from versions: 0.0.1rc0, 0.1.0, 0.2.0, 0.3.0, 0.3.1, 0.3.2, 0.4.0)
ERROR: No matching distribution found for mmengine==0.7.1

I would recommend sticking with this approach: reinstall everything in a new environment.

Did you switch your PyPI source to a stale mirror? It only has MMEngine versions up to 0.4.0. Switch back to the default official source and try again. Another option is to clone MMEngine as well and install it from source.
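As a rough way to confirm whether the installed MMEngine falls inside the range from the error message (mmengine>=0.7.1, <1.0.0), the sketch below compares version numbers using only the standard library. The helper names are hypothetical, and the parsing is deliberately simple: it looks only at the leading numeric part and ignores suffixes such as rc0.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def release_tuple(version: str) -> Tuple[int, ...]:
    """Parse the leading numeric part of a version string, e.g. '0.0.1rc0' -> (0, 0, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def in_required_range(version: str,
                      minimum: Tuple[int, ...] = (0, 7, 1),
                      upper: Tuple[int, ...] = (1, 0, 0)) -> bool:
    """Check minimum <= version < upper, matching 'mmengine>=0.7.1, <1.0.0'."""
    return minimum <= release_tuple(version) < upper


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("mmengine")
    if found is None:
        print("mmengine is not installed in this environment")
    else:
        print(f"mmengine {found}: in required range = {in_required_range(found)}")
```

Applied to the versions in this thread, 0.7.0 and every version offered by the stale mirror (up to 0.4.0) fall outside the range, while 0.7.1 falls inside it.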

0 replies
