Hi,
Thank you for sharing this interesting work!
I have a few questions about the implementation in gpu_work.py and would appreciate your help.
At line 73 you have for step in range(num_step):
I am not sure why this additional for loop is needed. It looks like we could remove it, since we are already iterating over the epochs in the outermost for loop.
In addition,
    temp_model = copy.deepcopy(worker_list[0].model)
    for name, param in temp_model.named_parameters():
        for worker in worker_list[1:]:
            param.data += worker.model.state_dict()[name].data
        param.data /= size
    if total_step % 50 == 0:
        test_all(temp_model, temp_train_loader, temp_test_loader,
                 criterion, None, total_step, tb, device, n_swap=n_swap)
    if total_step == early_stop:
        break
Currently, you create temp_model = copy.deepcopy(worker_list[0].model) and average the workers' parameters into it at every iteration, but test_all is only called every 50 iterations. Could the copy and the averaging be moved under if total_step % 50 == 0:?
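To illustrate the suggestion, here is a minimal pure-Python sketch of the loop shape I have in mind. It is not the code from gpu_work.py: ToyWorker, average_params, run, and eval_every are hypothetical names, plain floats stand in for parameter tensors, and a list append stands in for the call to test_all. The only point is that the averaged copy is built on evaluation steps and nowhere else.

```python
import copy


class ToyWorker:
    """Hypothetical stand-in for a worker; `params` plays the role of model parameters."""

    def __init__(self, params):
        self.params = dict(params)


def average_params(worker_list):
    """Average each named parameter across all workers without mutating any worker."""
    avg = copy.deepcopy(worker_list[0].params)
    for name in avg:
        for worker in worker_list[1:]:
            avg[name] += worker.params[name]
        avg[name] /= len(worker_list)
    return avg


def run(worker_list, num_steps, eval_every=50, early_stop=None):
    """Loop skeleton: the averaged copy is created only when an evaluation is due."""
    evaluated = []  # (step, averaged params) pairs; stands in for test_all(...)
    for total_step in range(1, num_steps + 1):
        # ... local training and model swapping would happen here ...
        if total_step % eval_every == 0:
            avg = average_params(worker_list)  # copy + averaging only on eval steps
            evaluated.append((total_step, avg))
        if total_step == early_stop:
            break
    return evaluated


workers = [ToyWorker({"w": 1.0, "b": 0.0}), ToyWorker({"w": 3.0, "b": 2.0})]
results = run(workers, num_steps=120, eval_every=50)
```

With 120 steps and eval_every=50, the deep copy and averaging run twice (at steps 50 and 100) instead of 120 times. This assumes temp_model is not used anywhere else in the loop body.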
Thanks!