Bus error when training until iteration 4110 #9

@Nestarneal

Hi,
I am trying to reproduce your results from the Realtime_Multi-Person_Pose_Estimation repository. I followed the Training Steps in the README to download the LMDB data and this repository for training.

I have run the training three times, and every time it crashes with the following error when it reaches iteration 4110:

I1031 21:17:26.459512 32355 sgd_solver.cpp:106] Iteration 4110, lr = 2e-05
*** Aborted at 1509455849 (unix time) try "date -d @1509455849" if you are using GNU date ***
PC: @     0x7fc0ee06da5e (unknown)
*** SIGBUS (@0x7ee49acba50e) received by PID 32355 (TID 0x7fc0b33dd700) from PID 18446744072011621646; stack trace: ***
    @     0x7fc0ee009cb0 (unknown)
    @     0x7fc0ee06da5e (unknown)
    @     0x7fc0eee2b9de (unknown)
    @     0x7fc0eee2ba2b (unknown)
    @     0x7fc0eff43cea caffe::db::LMDBCursor::value()
    @     0x7fc0effc890e caffe::DataReader::Body::read_one()
    @     0x7fc0effc8ef4 caffe::DataReader::Body::InternalThreadEntry()
    @     0x7fc0eff3bea5 caffe::InternalThread::entry()
    @     0x7fc0e52e242f thread_proxy
    @     0x7fc0d6cb4184 start_thread
    @     0x7fc0ee0d0ffd (unknown)
    @                0x0 (unknown)
Bus error
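
The last named frame in the trace is caffe::db::LMDBCursor::value(), so the signal is raised while a record is being read from the LMDB. LMDB reads its data file through a memory map, and a SIGBUS on a memory-mapped file can mean that part of the file could not be read (for example a truncated download or a disk error). That is only one possible cause, not a confirmed diagnosis. As a minimal sketch for checking it, the script below reads a file from start to end with ordinary reads, which report an I/O error instead of crashing. The function name check_readable and the path in the usage comment are hypothetical placeholders, not part of the repository.

```sh
#!/usr/bin/env sh
# Sketch: read a file front to back with ordinary read() calls.
# If a region of the file is unreadable, cat fails with an I/O error
# instead of the process receiving SIGBUS.
check_readable() {
    file="$1"
    if [ ! -f "$file" ]; then
        echo "missing: $file"
        return 2
    fi
    if cat "$file" > /dev/null; then
        # wc -c pads its output on some systems, so strip the spaces.
        size=$(wc -c < "$file" | tr -d ' ')
        echo "readable: $file ($size bytes)"
        return 0
    fi
    echo "read error: $file"
    return 1
}

# Hypothetical location; replace with the real LMDB directory:
# check_readable /path/to/lmdb/data.mdb
```

If the read succeeds, comparing the reported size with the size of the file on the download server would also show whether the download was cut short.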

I created a training script with the following contents:

#!/usr/bin/env sh
/path/to/caffe.bin train --solver=pose_solver.prototxt --gpu=1 --weights=../../../model/vgg/VGG_ILSVRC_19_layers.caffemodel 2>&1 | tee output/$(date +%y%m%d_%H%M).txt 

Do you have any idea what might be causing this?

Many thanks.
