Description
Hi jeamin,
Thanks for your interesting work.
I have been working with your codebase for the REG (referring expression generation) task. I modified the fine-tuning code to fit the REG task, and it works well. Recently, however, I noticed that CPU memory usage keeps growing during training. I fine-tune for 20 epochs with 4 RTX 2080 GPUs and 32 CPUs; the job occupies about 26 GB of memory at the beginning and grows to 90 GB by epoch 15, which causes the following error:
RuntimeError: DataLoader worker (pid 28449) is killed by signal: Killed.
I had not encountered this problem before, but I think it already existed; CPU memory usage simply never exceeded 90 GB, so I did not notice it. Have you encountered this problem? Do you have any suggestions?
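To help narrow this down, the growth could be logged once per epoch. The following is only a minimal sketch using the Python standard library. The helper names `peak_rss_mb` and `log_memory` are hypothetical and not part of the codebase. It measures only the process that calls it (the main training process, not the DataLoader workers), so it shows whether the main process itself is growing.

```python
import resource
import sys


def peak_rss_mb() -> float:
    """Return the peak resident set size of the current process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def log_memory(epoch: int, history: list) -> float:
    """Record and print the peak memory observed up to the end of an epoch."""
    usage = peak_rss_mb()
    history.append((epoch, usage))
    print(f"epoch {epoch}: peak RSS {usage:.1f} MiB")
    return usage


if __name__ == "__main__":
    history = []
    for epoch in range(3):
        # Placeholder for the real training loop of one epoch (assumption).
        _ = [0] * 100_000
        log_memory(epoch, history)
```

Calling `log_memory(epoch, history)` at the end of each epoch would give a per-epoch trace. If the main process stays flat while total memory still grows, the worker processes are the more likely source.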
I would appreciate it very much if you could reply as soon as possible!