Hi,
I'm using SAM with losstype="fgan". During the training epochs the loss stays close to -1, but in the test epochs it drops to around -1e22 and sometimes even NaN (with the generator and discriminator losses being exact negatives of each other). [Ref: https://github.com//issues/61]
Do you have an idea of why this might happen?
Edit: I'm seeing that this happens randomly at any epoch, not just during the validation phase. It could be something to do with outlier samples, but that seems unlikely. I also tried losstype="gan", but that takes significantly more time per model (2 hrs vs. 5 minutes).
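
For context, here is a quick standalone check (toy numbers of my own, not CDT code) showing how fast the exp term in the fgan loss overflows once the raw discriminator outputs grow, which would explain both the -1e22-scale values and the NaNs:

import torch as th

# Toy check: the fgan generator term is -mean(exp(disc_out - 1), [0, 2]).sum().
# exp overflows float32 once disc_out exceeds ~89, so a few large
# discriminator outputs are enough to produce huge negative losses or inf,
# after which gradients turn into NaN.
for logit in [1.0, 10.0, 50.0, 60.0, 90.0]:
    disc_vars_g = th.full((100, 5, 20), logit)  # stand-in discriminator outputs
    gen_loss = -th.mean(th.exp(disc_vars_g - 1), [0, 2]).sum()
    print(logit, gen_loss.item())
# 1.0 -> -5.0, 50.0 -> ~-9.5e21, 60.0 -> ~-2.1e26, 90.0 -> -inf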
Also, can you explain the loss term for the fgan: https://github.com/FenTechSolutions/CausalDiscoveryToolbox/blob/master/cdt/causality/graph/SAM.py#L328
gen_loss = -th.mean(th.exp(disc_vars_g - 1), [0, 2]).sum()
Can you also point me to where this is taken from?
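
From what I can tell, this matches the generator objective of the f-GAN framework (Nowozin et al., 2016, "f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization") instantiated with the KL divergence, whose Fenchel conjugate is f*(t) = exp(t - 1). A minimal sketch of both sides under that assumption (names are mine, not SAM's):

import torch as th

def fgan_kl_losses(disc_real, disc_fake):
    # f-GAN objective F = E_p[T(x)] - E_q[f*(T(x))] with f*(t) = exp(t - 1)
    # for the KL divergence. The discriminator maximizes F:
    disc_loss = -(th.mean(disc_real) - th.mean(th.exp(disc_fake - 1)))
    # The generator minimizes F, and only the second term depends on it,
    # which matches the gen_loss line above up to the [0, 2] reduction:
    gen_loss = -th.mean(th.exp(disc_fake - 1))
    return disc_loss, gen_loss

If the E_p[T(x)] term stays small, disc_loss is approximately -gen_loss, which might be why the two losses look like exact negatives of each other.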
Also, it seems like the backward pass is being run for the train as well as the test epochs: https://github.com/FenTechSolutions/CausalDiscoveryToolbox/blob/master/cdt/causality/graph/SAM.py#L365
if epoch < train + test - 1:
    loss.backward()
Is this correct?
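
For comparison, the pattern I would have expected is to gate backpropagation on the training phase explicitly, something like the following (hypothetical names, not the actual SAM internals):

def run_epochs(model, optimizer, data, train, test):
    # Hypothetical loop: backpropagate during the `train` epochs only,
    # and just evaluate the loss during the `test` epochs.
    for epoch in range(train + test):
        loss = model(data)       # assume the model returns its loss
        if epoch < train:        # training phase only
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        # test epochs: the loss is recorded but never backpropagated

With the condition as written, backward() would also run during every test epoch except the last, which is what I would not expect during evaluation.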
