Out-of-memory error when running test.py on the MSMT17 dataset #19

@emlssyj

2024-05-11 09:36:30 transreid INFO: Namespace(config_file='configs/msmt17/swin_small.yml', opts=['TEST.WEIGHT', './log/msmt17/swin_small/transformer_120.pth', 'TEST.RE_RANKING', 'True', 'MODEL.SEMANTIC_WEIGHT', '0.2'])
2024-05-11 09:36:30 transreid INFO: Loaded configuration file configs/msmt17/swin_small.yml
2024-05-11 09:36:30 transreid INFO:
MODEL:
PRETRAIN_HW_RATIO: 2
METRIC_LOSS_TYPE: 'triplet'
IF_LABELSMOOTH: 'off'
IF_WITH_CENTER: 'no'
NAME: 'transformer'
NO_MARGIN: True
DEVICE_ID: ('0')
TRANSFORMER_TYPE: 'swin_small_patch4_window7_224'
STRIDE_SIZE: [16, 16]

INPUT:
SIZE_TRAIN: [384, 128]
SIZE_TEST: [384, 128]
PROB: 0.5 # random horizontal flip
RE_PROB: 0.5 # random erasing
PADDING: 10
PIXEL_MEAN: [0.5, 0.5, 0.5]
PIXEL_STD: [0.5, 0.5, 0.5]

DATASETS:
NAMES: ('msmt17')
ROOT_DIR: ('/TransReID/data')

DATALOADER:
SAMPLER: 'softmax_triplet'
NUM_INSTANCE: 4
NUM_WORKERS: 8

SOLVER:
OPTIMIZER_NAME: 'SGD'
MAX_EPOCHS: 120
BASE_LR: 0.0008
WARMUP_EPOCHS: 20
IMS_PER_BATCH: 64
WARMUP_METHOD: 'cosine'
LARGE_FC_LR: False
CHECKPOINT_PERIOD: 120
LOG_PERIOD: 20
EVAL_PERIOD: 10
WEIGHT_DECAY: 1e-4
WEIGHT_DECAY_BIAS: 1e-4
BIAS_LR_FACTOR: 2

TEST:
EVAL: True
IMS_PER_BATCH: 256
RE_RANKING: False
WEIGHT: ''
NECK_FEAT: 'before'
FEAT_NORM: 'yes'

OUTPUT_DIR: './log/msmt17/swin_small'

2024-05-11 09:36:30 transreid INFO: Running with config:
DATALOADER:
NUM_INSTANCE: 4
NUM_WORKERS: 8
REMOVE_TAIL: 0
SAMPLER: softmax_triplet
DATASETS:
NAMES: msmt17
ROOT_DIR: /media/lab/Disk1/TransReID/data
ROOT_TRAIN_DIR: ../data
ROOT_VAL_DIR: ../data
INPUT:
PADDING: 10
PIXEL_MEAN: [0.5, 0.5, 0.5]
PIXEL_STD: [0.5, 0.5, 0.5]
PROB: 0.5
RE_PROB: 0.5
SIZE_TEST: [384, 128]
SIZE_TRAIN: [384, 128]
MODEL:
ATT_DROP_RATE: 0.0
COS_LAYER: False
DEVICE: cuda
DEVICE_ID: 0
DEVIDE_LENGTH: 4
DIST_TRAIN: False
DROPOUT_RATE: 0.0
DROP_OUT: 0.0
DROP_PATH: 0.1
FEAT_DIM: 512
GEM_POOLING: False
ID_LOSS_TYPE: softmax
ID_LOSS_WEIGHT: 1.0
IF_LABELSMOOTH: off
IF_WITH_CENTER: no
JPM: False
LAST_STRIDE: 1
METRIC_LOSS_TYPE: triplet
NAME: transformer
NECK: bnneck
NO_MARGIN: True
PRETRAIN_CHOICE: imagenet
PRETRAIN_HW_RATIO: 2
PRETRAIN_PATH:
REDUCE_FEAT_DIM: False
RE_ARRANGE: True
SEMANTIC_WEIGHT: 0.2
SHIFT_NUM: 5
SHUFFLE_GROUP: 2
SIE_CAMERA: False
SIE_COE: 3.0
SIE_VIEW: False
STEM_CONV: False
STRIDE_SIZE: [16, 16]
TRANSFORMER_TYPE: swin_small_patch4_window7_224
TRIPLET_LOSS_WEIGHT: 1.0
OUTPUT_DIR: ./log/msmt17/swin_small
SOLVER:
BASE_LR: 0.0008
BIAS_LR_FACTOR: 2
CENTER_LOSS_WEIGHT: 0.0005
CENTER_LR: 0.5
CHECKPOINT_PERIOD: 120
COSINE_MARGIN: 0.5
COSINE_SCALE: 30
EVAL_PERIOD: 10
GAMMA: 0.1
IMS_PER_BATCH: 64
LARGE_FC_LR: False
LOG_PERIOD: 20
MARGIN: 0.3
MAX_EPOCHS: 120
MOMENTUM: 0.9
OPTIMIZER_NAME: SGD
SEED: 1234
STEPS: (40, 70)
TRP_L2: False
WARMUP_EPOCHS: 20
WARMUP_FACTOR: 0.01
WARMUP_METHOD: cosine
WEIGHT_DECAY: 0.0001
WEIGHT_DECAY_BIAS: 0.0001
TEST:
DIST_MAT: dist_mat.npy
EVAL: True
FEAT_NORM: yes
IMS_PER_BATCH: 256
NECK_FEAT: before
RE_RANKING: True
WEIGHT: ./log/msmt17/swin_small/transformer_120.pth
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15} cam_container
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15} cam_container
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15} cam_container
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15} cam_container
=> MSMT17 loaded
2024-05-11 09:36:31 transreid.check INFO: Dataset statistics:
2024-05-11 09:36:31 transreid.check INFO: ----------------------------------------
2024-05-11 09:36:31 transreid.check INFO: subset | # ids | # images | # cameras
2024-05-11 09:36:31 transreid.check INFO: ----------------------------------------
2024-05-11 09:36:31 transreid.check INFO: train | 1041 | 32621 | 15
2024-05-11 09:36:31 transreid.check INFO: query | 3060 | 11659 | 15
2024-05-11 09:36:31 transreid.check INFO: gallery | 3060 | 82161 | 15
2024-05-11 09:36:31 transreid.check INFO: ----------------------------------------
using img_triplet sampler
using Transformer_type: swin_small_patch4_window7_224 as a backbone
/media/lab/Disk1/SOLIDER-REID/model/backbones/swin_transformer.py:1159: UserWarning: DeprecationWarning: pretrained is deprecated, please use "init_cfg" instead
warnings.warn('DeprecationWarning: pretrained is deprecated, '
===========building transformer===========
Loading pretrained model from ./log/msmt17/swin_small/transformer_120.pth
2024-05-11 09:36:37 transreid.test INFO: Enter inferencing
The test feature is normalized
=> Enter reranking
/media/lab/Disk1/SOLIDER-REID/utils/reranking.py:40: UserWarning: This overload of addmm_ is deprecated:
addmm_(Number beta, Number alpha, Tensor mat1, Tensor mat2)
Consider using one of the following signatures instead:
addmm_(Tensor mat1, Tensor mat2, *, Number beta, Number alpha) (Triggered internally at ../torch/csrc/utils/python_arg_parser.cpp:1630.)
distmat.addmm_(1, -2, feat, feat.t())
Killed

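There may be a memory leak that causes the process to run out of memory and get Killed. The test machine has 128 GB of RAM. Could you point out what is going wrong? Thanks!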
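A rough memory estimate may help tell a leak apart from plain matrix size. The log shows the process being killed right after the `distmat.addmm_(1, -2, feat, feat.t())` call in `utils/reranking.py`, and the dataset statistics list 11659 query and 82161 gallery images. The sketch below assumes, which the log does not confirm, that re-ranking builds one dense square distance matrix over the concatenated query and gallery features. The function name `square_matrix_gib` is hypothetical and used only for illustration.

```python
# Image counts taken from the dataset statistics in the log above.
NUM_QUERY = 11659
NUM_GALLERY = 82161


def square_matrix_gib(num_rows: int, bytes_per_element: int = 4) -> float:
    """Return the size in GiB of a dense num_rows x num_rows matrix."""
    return num_rows * num_rows * bytes_per_element / 2**30


# Assumption: query and gallery features are concatenated before the
# pairwise distance computation, giving one (Q + G) x (Q + G) matrix.
total = NUM_QUERY + NUM_GALLERY

# Size of one float32 matrix and of one int64 matrix of the same shape
# (for example, an index matrix produced by a full argsort).
float32_gib = square_matrix_gib(total, bytes_per_element=4)
int64_gib = square_matrix_gib(total, bytes_per_element=8)

print(f"{total} features")
print(f"one float32 matrix: {float32_gib:.1f} GiB")
print(f"one int64 matrix:   {int64_gib:.1f} GiB")
```

Under that assumption a single float32 matrix already takes roughly 33 GiB, so a few full-size copies or intermediates could exhaust 128 GB without any leak being involved. If that is the cause, running with `TEST.RE_RANKING` left at `False`, the value in `configs/msmt17/swin_small.yml`, should skip this step.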
