Hi,
I'm using SAM with losstype="fgan". During the training epochs the loss stays close to -1, but in the test epochs it drops to around -1e22 and sometimes even NaN (with the generator and discriminator losses being exact negatives of each other). [Ref: https://github.com//issues/61]
Do you have an idea of why this might happen?
Edit: I'm seeing that this happens randomly at any epoch, not just during the validation phase. It could be something to do with outlier samples, but that seems unlikely. I also tried losstype="gan", but that takes significantly more time per model (2 hrs vs. 5 minutes).
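
For context, here is a quick standalone check (toy numbers of my own, not CDT code) showing how fast the exp term in the fgan loss overflows once the raw discriminator outputs grow, which would explain both the -1e22-scale values and the NaNs:

import torch as th

# Toy check: the fgan generator term is -mean(exp(disc_out - 1), [0, 2]).sum().
# exp overflows float32 once disc_out exceeds ~89, so a few large
# discriminator outputs are enough to produce huge negative losses or inf,
# after which gradients turn into NaN.
for logit in [1.0, 10.0, 50.0, 60.0, 90.0]:
    disc_vars_g = th.full((100, 5, 20), logit)  # stand-in discriminator outputs
    gen_loss = -th.mean(th.exp(disc_vars_g - 1), [0, 2]).sum()
    print(logit, gen_loss.item())
# 1.0 -> -5.0, 50.0 -> ~-9.5e21, 60.0 -> ~-2.1e26, 90.0 -> -inf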
Also, can you explain the loss term for the fgan: https://github.com/FenTechSolutions/CausalDiscoveryToolbox/blob/master/cdt/causality/graph/SAM.py#L328
gen_loss = -th.mean(th.exp(disc_vars_g - 1), [0, 2]).sum()
Can you also point me to where this is taken from?
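
From what I can tell, this matches the generator objective of the f-GAN framework (Nowozin et al., 2016, "f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization") instantiated with the KL divergence, whose Fenchel conjugate is f*(t) = exp(t - 1). A minimal sketch of both sides under that assumption (names are mine, not SAM's):

import torch as th

def fgan_kl_losses(disc_real, disc_fake):
    # f-GAN objective F = E_p[T(x)] - E_q[f*(T(x))] with f*(t) = exp(t - 1)
    # for the KL divergence. The discriminator maximizes F:
    disc_loss = -(th.mean(disc_real) - th.mean(th.exp(disc_fake - 1)))
    # The generator minimizes F, and only the second term depends on it,
    # which matches the gen_loss line above up to the [0, 2] reduction:
    gen_loss = -th.mean(th.exp(disc_fake - 1))
    return disc_loss, gen_loss

If the E_p[T(x)] term stays small, disc_loss is approximately -gen_loss, which might be why the two losses look like exact negatives of each other.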
Also, it seems like the backward pass is being run for the train as well as the test epochs: https://github.com/FenTechSolutions/CausalDiscoveryToolbox/blob/master/cdt/causality/graph/SAM.py#L365
if epoch < train + test - 1:
    loss.backward()
Is this correct?
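
For comparison, the pattern I would have expected is to gate backpropagation on the training phase explicitly, something like the following (hypothetical names, not the actual SAM internals):

def run_epochs(model, optimizer, data, train, test):
    # Hypothetical loop: backpropagate during the `train` epochs only,
    # and just evaluate the loss during the `test` epochs.
    for epoch in range(train + test):
        loss = model(data)       # assume the model returns its loss
        if epoch < train:        # training phase only
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        # test epochs: the loss is recorded but never backpropagated

With the condition as written, backward() would also run during every test epoch except the last, which is what I would not expect during evaluation.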
