-
Notifications
You must be signed in to change notification settings - Fork 1
Description
Hi, When I reimplementation this work, i got the error like:
/data1/user/lly/work/pkg/mmpose-0.25.0/mmpose/utils/setup_env.py:32: UserWarning: Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
warnings.warn(
/data1/user/lly/work/pkg/mmpose-0.25.0/mmpose/utils/setup_env.py:42: UserWarning: Setting MKL_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
warnings.warn(
/data1/user/lly/miniconda3/envs/mmpose/lib/python3.8/site-packages/torch/cuda/init.py:104: UserWarning:
NVIDIA GeForce RTX 4090 with CUDA capability sm_89 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_61 sm_70 sm_75 compute_37.
If you want to use the NVIDIA GeForce RTX 4090 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))
2023-09-24 16:31:39,594 - mmpose - INFO - Environment info:
sys.platform: linux
Python: 3.8.18 (default, Sep 11 2023, 13:40:15) [GCC 11.2.0]
CUDA available: True
GPU 0,1,2,3: NVIDIA GeForce RTX 4090
CUDA_HOME: /usr
NVCC: Cuda compilation tools, release 10.1, V10.1.243
GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
PyTorch: 1.8.1
PyTorch compiling details: PyTorch built with:
- GCC 7.3
- C++ Version: 201402
- Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v1.7.0 (Git Hash 7aed236906b1f7a05c0917e5257a1af05e9ff683)
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- NNPACK is enabled
- CPU capability usage: AVX2
- CUDA Runtime 10.2
- NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
- CuDNN 7.6.5
- Magma 2.5.2
- Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=10.2, CUDNN_VERSION=7.6.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.8.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,
TorchVision: 0.9.1
OpenCV: 4.8.0
MMCV: 1.3.17
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 10.2
MMPose: 0.25.0+fd361ca
2023-09-24 16:31:39,594 - mmpose - INFO - Distributed training: True
2023-09-24 16:31:42,305 - mmpose - INFO - Config:
dataset_info = dict(
dataset_name='coco',
paper_info=dict(
author=
'Lin, Tsung-Yi and Maire, Michael and Belongie, Serge and Hays, James and Perona, Pietro and Ramanan, Deva and Doll{'a}r, Piotr and Zitnick, C Lawrence',
title='Microsoft coco: Common objects in context',
container='European conference on computer vision',
year='2014',
homepage='http://cocodataset.org/'),
keypoint_info=dict({
0:
dict(name='nose', id=0, color=[51, 153, 255], type='upper', swap=''),
1:
dict(
name='left_eye',
id=1,
color=[51, 153, 255],
type='upper',
swap='right_eye'),
2:
dict(
name='right_eye',
id=2,
color=[51, 153, 255],
type='upper',
swap='left_eye'),
3:
dict(
name='left_ear',
id=3,
color=[51, 153, 255],
type='upper',
swap='right_ear'),
4:
dict(
name='right_ear',
id=4,
color=[51, 153, 255],
type='upper',
swap='left_ear'),
5:
dict(
name='left_shoulder',
id=5,
color=[0, 255, 0],
type='upper',
swap='right_shoulder'),
6:
dict(
name='right_shoulder',
id=6,
color=[255, 128, 0],
type='upper',
swap='left_shoulder'),
7:
dict(
name='left_elbow',
id=7,
color=[0, 255, 0],
type='upper',
swap='right_elbow'),
8:
dict(
name='right_elbow',
id=8,
color=[255, 128, 0],
type='upper',
swap='left_elbow'),
9:
dict(
name='left_wrist',
id=9,
color=[0, 255, 0],
type='upper',
swap='right_wrist'),
10:
dict(
name='right_wrist',
id=10,
color=[255, 128, 0],
type='upper',
swap='left_wrist'),
11:
dict(
name='left_hip',
id=11,
color=[0, 255, 0],
type='lower',
swap='right_hip'),
12:
dict(
name='right_hip',
id=12,
color=[255, 128, 0],
type='lower',
swap='left_hip'),
13:
dict(
name='left_knee',
id=13,
color=[0, 255, 0],
type='lower',
swap='right_knee'),
14:
dict(
name='right_knee',
id=14,
color=[255, 128, 0],
type='lower',
swap='left_knee'),
15:
dict(
name='left_ankle',
id=15,
color=[0, 255, 0],
type='lower',
swap='right_ankle'),
16:
dict(
name='right_ankle',
id=16,
color=[255, 128, 0],
type='lower',
swap='left_ankle')
}),
skeleton_info=dict({
0:
dict(link=('left_ankle', 'left_knee'), id=0, color=[0, 255, 0]),
1:
dict(link=('left_knee', 'left_hip'), id=1, color=[0, 255, 0]),
2:
dict(link=('right_ankle', 'right_knee'), id=2, color=[255, 128, 0]),
3:
dict(link=('right_knee', 'right_hip'), id=3, color=[255, 128, 0]),
4:
dict(link=('left_hip', 'right_hip'), id=4, color=[51, 153, 255]),
5:
dict(link=('left_shoulder', 'left_hip'), id=5, color=[51, 153, 255]),
6:
dict(link=('right_shoulder', 'right_hip'), id=6, color=[51, 153, 255]),
7:
dict(
link=('left_shoulder', 'right_shoulder'),
id=7,
color=[51, 153, 255]),
8:
dict(link=('left_shoulder', 'left_elbow'), id=8, color=[0, 255, 0]),
9:
dict(
link=('right_shoulder', 'right_elbow'), id=9, color=[255, 128, 0]),
10:
dict(link=('left_elbow', 'left_wrist'), id=10, color=[0, 255, 0]),
11:
dict(link=('right_elbow', 'right_wrist'), id=11, color=[255, 128, 0]),
12:
dict(link=('left_eye', 'right_eye'), id=12, color=[51, 153, 255]),
13:
dict(link=('nose', 'left_eye'), id=13, color=[51, 153, 255]),
14:
dict(link=('nose', 'right_eye'), id=14, color=[51, 153, 255]),
15:
dict(link=('left_eye', 'left_ear'), id=15, color=[51, 153, 255]),
16:
dict(link=('right_eye', 'right_ear'), id=16, color=[51, 153, 255]),
17:
dict(link=('left_ear', 'left_shoulder'), id=17, color=[51, 153, 255]),
18:
dict(
link=('right_ear', 'right_shoulder'), id=18, color=[51, 153, 255])
}),
joint_weights=[
1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.2, 1.2, 1.5, 1.5, 1.0, 1.0, 1.2,
1.2, 1.5, 1.5
],
sigmas=[
0.026, 0.025, 0.025, 0.035, 0.035, 0.079, 0.079, 0.072, 0.072, 0.062,
0.062, 0.107, 0.107, 0.087, 0.087, 0.089, 0.089
])
log_level = 'INFO'
load_from = None
resume_from = None
dist_params = dict(backend='nccl')
workflow = [('train', 1)]
checkpoint_config = dict(interval=5, create_symlink=False)
evaluation = dict(interval=5, metric='mAP', save_best='AP')
optimizer = dict(
type='AdamW',
lr=0.001,
betas=(0.9, 0.999),
weight_decay=0.01,
paramwise_cfg=dict(
custom_keys=dict(relative_position_bias_table=dict(decay_mult=0.0))))
optimizer_config = dict(grad_clip=None)
lr_config = dict(
policy='step',
warmup='linear',
warmup_iters=11710,
warmup_ratio=0.001,
step=[120, 150])
total_epochs = 160
log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')])
channel_cfg = dict(
num_output_channels=17,
dataset_joints=17,
dataset_channel=[[
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
]],
inference_channel=[
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
])
norm_cfg = dict(type='SyncBN', requires_grad=True)
model = dict(
type='TopDown',
pretrained='./pretrained/solider_swin_base.pth',
backbone=dict(
type='SwinTransformer',
in_channels=3,
pretrain_img_size=224,
patch_size=4,
window_size=7,
embed_dims=128,
strides=(4, 2, 1, 1),
depths=(2, 2, 18, 2),
num_heads=(4, 8, 16, 32),
drop_path_rate=0.0,
drop_rate=0.0,
attn_drop_rate=0.0,
semantic_weight=0.8),
keypoint_head=dict(
type='TopdownHeatmapSimpleHead',
in_channels=1024,
in_index=3,
out_channels=17,
num_deconv_layers=1,
num_deconv_kernels=(4, ),
num_deconv_filters=(256, ),
extra=dict(final_conv_kernel=1),
loss_keypoint=dict(type='JointsMSELoss', use_target_weight=True)),
train_cfg=dict(),
test_cfg=dict(
flip_test=True,
post_process='default',
shift_heatmap=True,
modulate_kernel=11))
data_root = '/data2/DataSet/COCO'
data_cfg = dict(
image_size=[288, 384],
heatmap_size=[72, 96],
num_output_channels=17,
num_joints=17,
dataset_channel=[[
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
]],
inference_channel=[
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
],
soft_nms=False,
nms_thr=1.0,
oks_thr=0.9,
vis_thr=0.2,
use_gt_bbox=False,
det_bbox_thr=0.0,
bbox_file=
'/data2/DataSet/COCO/person_detection_results/COCO_val2017_detections_AP_H_56_person.json'
)
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='TopDownRandomFlip', flip_prob=0.5),
dict(
type='TopDownHalfBodyTransform',
num_joints_half_body=8,
prob_half_body=0.3),
dict(
type='TopDownGetRandomScaleRotation', rot_factor=40, scale_factor=0.5),
dict(type='TopDownAffine'),
dict(type='ToTensor'),
dict(
type='NormalizeTensor',
mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225]),
dict(type='TopDownGenerateTarget', sigma=2),
dict(
type='Collect',
keys=['img', 'target', 'target_weight'],
meta_keys=[
'image_file', 'joints_3d', 'joints_3d_visible', 'center', 'scale',
'rotation', 'bbox_score', 'flip_pairs'
])
]
val_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='TopDownAffine'),
dict(type='ToTensor'),
dict(
type='NormalizeTensor',
mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225]),
dict(
type='Collect',
keys=['img'],
meta_keys=[
'image_file', 'center', 'scale', 'rotation', 'bbox_score',
'flip_pairs'
])
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='TopDownAffine'),
dict(type='ToTensor'),
dict(
type='NormalizeTensor',
mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225]),
dict(
type='Collect',
keys=['img'],
meta_keys=[
'image_file', 'center', 'scale', 'rotation', 'bbox_score',
'flip_pairs'
])
]
data = dict(
samples_per_gpu=12,
workers_per_gpu=2,
val_dataloader=dict(samples_per_gpu=12),
test_dataloader=dict(samples_per_gpu=12),
train=dict(
type='TopDownCocoDataset',
ann_file=
'/data2/DataSet/COCO/annotation/person_keypoints_train2017.json',
img_prefix='/data2/DataSet/COCO/data/train2017/',
data_cfg=dict(
image_size=[288, 384],
heatmap_size=[72, 96],
num_output_channels=17,
num_joints=17,
dataset_channel=[[
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
]],
inference_channel=[
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
],
soft_nms=False,
nms_thr=1.0,
oks_thr=0.9,
vis_thr=0.2,
use_gt_bbox=False,
det_bbox_thr=0.0,
bbox_file=
'/data2/DataSet/COCO/person_detection_results/COCO_val2017_detections_AP_H_56_person.json'
),
pipeline=[
dict(type='LoadImageFromFile'),
dict(type='TopDownRandomFlip', flip_prob=0.5),
dict(
type='TopDownHalfBodyTransform',
num_joints_half_body=8,
prob_half_body=0.3),
dict(
type='TopDownGetRandomScaleRotation',
rot_factor=40,
scale_factor=0.5),
dict(type='TopDownAffine'),
dict(type='ToTensor'),
dict(
type='NormalizeTensor',
mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225]),
dict(type='TopDownGenerateTarget', sigma=2),
dict(
type='Collect',
keys=['img', 'target', 'target_weight'],
meta_keys=[
'image_file', 'joints_3d', 'joints_3d_visible', 'center',
'scale', 'rotation', 'bbox_score', 'flip_pairs'
])
]),
val=dict(
type='TopDownCocoDataset',
ann_file='/data2/DataSet/COCO/annotation/person_keypoints_val2017.json',
img_prefix='/data2/DataSet/COCO/data/val2017/',
data_cfg=dict(
image_size=[288, 384],
heatmap_size=[72, 96],
num_output_channels=17,
num_joints=17,
dataset_channel=[[
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
]],
inference_channel=[
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
],
soft_nms=False,
nms_thr=1.0,
oks_thr=0.9,
vis_thr=0.2,
use_gt_bbox=False,
det_bbox_thr=0.0,
bbox_file=
'/data2/DataSet/COCO/person_detection_results/COCO_val2017_detections_AP_H_56_person.json'
),
pipeline=[
dict(type='LoadImageFromFile'),
dict(type='TopDownAffine'),
dict(type='ToTensor'),
dict(
type='NormalizeTensor',
mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225]),
dict(
type='Collect',
keys=['img'],
meta_keys=[
'image_file', 'center', 'scale', 'rotation', 'bbox_score',
'flip_pairs'
])
]),
test=dict(
type='TopDownCocoDataset',
ann_file='/data2/DataSet/COCO/annotation/person_keypoints_val2017.json',
img_prefix='/data2/DataSet/COCO/data/val2017/',
data_cfg=dict(
image_size=[288, 384],
heatmap_size=[72, 96],
num_output_channels=17,
num_joints=17,
dataset_channel=[[
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
]],
inference_channel=[
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
],
soft_nms=False,
nms_thr=1.0,
oks_thr=0.9,
vis_thr=0.2,
use_gt_bbox=False,
det_bbox_thr=0.0,
bbox_file=
'/data2/DataSet/COCO/person_detection_results/COCO_val2017_detections_AP_H_56_person.json'
),
pipeline=[
dict(type='LoadImageFromFile'),
dict(type='TopDownAffine'),
dict(type='ToTensor'),
dict(
type='NormalizeTensor',
mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225]),
dict(
type='Collect',
keys=['img'],
meta_keys=[
'image_file', 'center', 'scale', 'rotation', 'bbox_score',
'flip_pairs'
])
]))
fp16 = dict(loss_scale='dynamic')
work_dir = './work_dirs/swin_base_coco_384x288_lly'
gpu_ids = range(0, 1)
2023-09-24 16:31:42,306 - mmpose - INFO - Set random seed to 768543290, deterministic: False
Traceback (most recent call last):
File "/data1/user/lly/miniconda3/envs/mmpose/lib/python3.8/site-packages/mmcv/utils/registry.py", line 53, in build_from_cfg
return obj_cls(**args)
File "/data1/user/lly/work/pkg/mmpose-0.25.0/mmpose/models/detectors/top_down.py", line 48, in init
self.backbone = builder.build_backbone(backbone)
File "/data1/user/lly/work/pkg/mmpose-0.25.0/mmpose/models/builder.py", line 19, in build_backbone
return BACKBONES.build(cfg)
File "/data1/user/lly/miniconda3/envs/mmpose/lib/python3.8/site-packages/mmcv/utils/registry.py", line 213, in build
return self.build_func(*args, **kwargs, registry=self)
File "/data1/user/lly/miniconda3/envs/mmpose/lib/python3.8/site-packages/mmcv/cnn/builder.py", line 27, in build_model_from_cfg
return build_from_cfg(cfg, registry, default_args)
File "/data1/user/lly/miniconda3/envs/mmpose/lib/python3.8/site-packages/mmcv/utils/registry.py", line 45, in build_from_cfg
raise KeyError(
KeyError: 'SwinTransformer is not in the models registry'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "tools/train.py", line 201, in
main()
File "tools/train.py", line 175, in main
model = build_posenet(cfg.model)
File "/data1/user/lly/work/pkg/mmpose-0.25.0/mmpose/models/builder.py", line 39, in build_posenet
return POSENETS.build(cfg)
File "/data1/user/lly/miniconda3/envs/mmpose/lib/python3.8/site-packages/mmcv/utils/registry.py", line 213, in build
return self.build_func(*args, **kwargs, registry=self)
File "/data1/user/lly/miniconda3/envs/mmpose/lib/python3.8/site-packages/mmcv/cnn/builder.py", line 27, in build_model_from_cfg
return build_from_cfg(cfg, registry, default_args)
File "/data1/user/lly/miniconda3/envs/mmpose/lib/python3.8/site-packages/mmcv/utils/registry.py", line 56, in build_from_cfg
raise type(e)(f'{obj_cls.name}: {e}')
KeyError: "TopDown: 'SwinTransformer is not in the models registry'"
Killing subprocess 2122891
Traceback (most recent call last):
File "/data1/user/lly/miniconda3/envs/mmpose/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/data1/user/lly/miniconda3/envs/mmpose/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/data1/user/lly/miniconda3/envs/mmpose/lib/python3.8/site-packages/torch/distributed/launch.py", line 340, in
main()
File "/data1/user/lly/miniconda3/envs/mmpose/lib/python3.8/site-packages/torch/distributed/launch.py", line 326, in main
sigkill_handler(signal.SIGTERM, None) # not coming back
File "/data1/user/lly/miniconda3/envs/mmpose/lib/python3.8/site-packages/torch/distributed/launch.py", line 301, in sigkill_handler
raise subprocess.CalledProcessError(returncode=last_return_code, cmd=cmd)
subprocess.CalledProcessError: Command '['/data1/user/lly/miniconda3/envs/mmpose/bin/python', '-u', 'tools/train.py', '--local_rank=0', 'configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/swin_base_coco_384x288_lly.py', '--launcher', 'pytorch']' returned non-zero exit status 1.
can you tell me the solution to reimplement the training.