Description
Hi:
I am interested in your work and have read the code carefully. In the training stage, you set two config parameters, `max_epoch` and `training_size`:
https://github.com/p0werHu/articulated-objects-motion-prediction/blob/f6198bdc4041e1dd54f6367a8c54ddd016137fe1/src/config.py#L15
https://github.com/p0werHu/articulated-objects-motion-prediction/blob/f6198bdc4041e1dd54f6367a8c54ddd016137fe1/src/config.py#L16
Actually, in each iteration of the `training_size` loop you load all the data with `train_loader`, so a single iteration is effectively a full epoch (as the code below, from your file train.py, shows). That means the model is trained for `training_size * max_epoch` epochs in total, which is far too many. Moreover, you store the checkpoint file only when one outer epoch ends, i.e. after every `training_size` passes over the data; if the best model occurs at a pass that is not an integral multiple of `training_size`, you will miss it.
```python
for epoch in range(config.max_epoch):
    print("At epoch:{}".format(str(epoch + 1)))
    prog = Progbar(target=config.training_size)
    prog_valid = Progbar(target=config.validation_size)
    # Train
    # with torch.autograd.set_detect_anomaly(True):
    for it in range(config.training_size):
        for i, data in enumerate(train_loader, 0):
            encoder_inputs = data['encoder_inputs'].float().to(device)
            decoder_inputs = data['decoder_inputs'].float().to(device)
            decoder_outputs = data['decoder_outputs'].float().to(device)
            prediction = net(encoder_inputs, decoder_inputs, train=True)
            loss = Loss(prediction, decoder_outputs, bone_length, config)
            net.zero_grad()
            loss.backward()
            _ = torch.nn.utils.clip_grad_norm_(net.parameters(), 5)
            optimizer.step()
```

So I would like to know your purpose in adding the outer `training_size` loop; maybe I don't understand the advantage of this setting.
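To make the pass-count concern concrete, here is a self-contained sketch of how the nested loops multiply. The values `max_epoch = 3` and `training_size = 4` are made up for illustration, not the repo's actual config; the inner `for data in train_loader` loop is replaced by a counter, since each run of it is one full pass over the dataset:

```python
# Hypothetical config values, for illustration only.
max_epoch = 3
training_size = 4

passes = 0        # full passes over the dataset (each inner train_loader loop)
checkpoints = []  # pass index at which a checkpoint would be saved

for epoch in range(max_epoch):
    for it in range(training_size):
        # stand-in for `for i, data in enumerate(train_loader, 0): ...`
        passes += 1
    # checkpoint is saved only at the end of each outer epoch
    checkpoints.append(passes)

print(passes)       # 12 == max_epoch * training_size
print(checkpoints)  # [4, 8, 12]
```

With these toy numbers the data is traversed 12 times but checkpoints land only at passes 4, 8, and 12, so a best model reached at, say, pass 6 is never saved, which is exactly the issue described above.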