About the CPU memory leak #26

@superhero-7

Hi jeamin,

Thanks for your interesting work.
I have been working with your codebase for the REG (referring expression generation) task. I modified the fine-tuning code to fit the REG task, and it works well. Recently, however, I noticed that CPU memory usage keeps growing during training. I fine-tune for 20 epochs with 4 RTX 2080 GPUs and 32 CPUs; the job occupies about 26 GB of memory at the beginning and grows to 90 GB by epoch 15, which causes the error:
RuntimeError: DataLoader worker (pid 28449) is killed by signal: Killed.
I had not run into this problem before, but I think it already existed; CPU memory usage simply never exceeded 90 GB, so I did not notice it. Have you encountered this problem? Do you have any suggestions?
I would appreciate it very much if you could reply as soon as possible!
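
In case it helps with reproducing the growth, here is a minimal sketch of how resident memory could be logged once per epoch. It assumes Linux, since it reads `/proc/<pid>/status`. The helper names `parse_vmrss_mb`, `rss_mb`, and `log_epoch_memory` are hypothetical and not part of this codebase, and the DataLoader worker PIDs are assumed to be supplied by hand (for example, read from `top`).

```python
import os


def parse_vmrss_mb(status_text):
    """Return the VmRSS value in MB from the text of /proc/<pid>/status, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # Line format: "VmRSS:     123456 kB"
            return int(line.split()[1]) / 1024.0
    return None


def rss_mb(pid=None):
    """Resident memory of one process in MB (Linux only; None if unavailable)."""
    pid = os.getpid() if pid is None else pid
    try:
        with open(f"/proc/{pid}/status") as f:
            return parse_vmrss_mb(f.read())
    except OSError:
        return None


def log_epoch_memory(epoch, worker_pids=()):
    """Print the RSS of the main process and of each given worker PID.

    worker_pids is an assumption: the caller has to provide the PIDs.
    Returns the summed RSS in MB over all processes that could be read.
    """
    main = rss_mb()
    workers = [rss_mb(p) for p in worker_pids]
    total = sum(m for m in [main, *workers] if m is not None)
    print(f"epoch {epoch}: main={main} MB, workers={workers}, total={total:.1f} MB")
    return total
```

Calling `log_epoch_memory(epoch)` at the end of every epoch would show whether the growth sits in the main process or in the workers. Note that summing RSS counts pages shared between processes more than once, so the total is only an upper bound.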
