NaN Issue with SAC Code #7

action = T.tanh(actions) * T.tensor(self.max_action).to(self.device)
log_probs = probabilities.log_prob(actions)
# The next line produces NaN: 1 - action.pow(2) can be negative, so the argument of the log is negative.
log_probs -= T.log(1 - action.pow(2) + self.reparam_noise)
log_probs = log_probs.sum(1, keepdim=True)
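
A minimal reproduction of the NaN, as a sketch outside the agent class. The values of `max_action` and `reparam_noise` below are hypothetical and chosen only for illustration. Any `max_action` greater than 1 lets `action.pow(2)` exceed 1, which makes the log argument negative.

```python
import torch as T

# Hypothetical values, for illustration only.
max_action = 2.0
reparam_noise = 1e-6

# Pre-tanh samples, standing in for `actions` in the snippet above.
actions = T.tensor([[0.0, 1.5, -3.0]])

# Scaled action lies in [-max_action, max_action], so its square can exceed 1.
action = T.tanh(actions) * max_action

inside_log = 1 - action.pow(2) + reparam_noise
correction = T.log(inside_log)

print(inside_log)   # second and third entries are negative
print(correction)   # T.log of a negative number is nan
```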

How can I fix this issue? Are the following modifications correct?

action = T.tanh(actions) * T.tensor(self.max_action).to(self.device)
log_probs = probabilities.log_prob(actions)
log_probs -= T.log(1 - T.tanh(actions).pow(2) + self.reparam_noise)
log_probs = log_probs.sum(1, keepdim=True)
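
For comparison, here is one possible way to write the correction so that the log argument can never go negative. This is only a sketch under assumptions, not necessarily what the repository intends. The helper name `squashed_log_prob` is hypothetical, and `max_action` is assumed to be a positive scalar or per-dimension value. It uses the identity log(1 - tanh(u)^2) = 2 * (log 2 - u - softplus(-2u)), which avoids the need for `reparam_noise`. It also subtracts log(max_action) for the scaling step. That term is constant, so it shifts the log-probability without changing its gradient.

```python
import math
import torch as T
import torch.nn.functional as F
from torch.distributions import Normal


def squashed_log_prob(dist, u, max_action):
    """Log-prob of a = max_action * tanh(u), where u was sampled from `dist`.

    Hypothetical helper for illustration; not part of the original code.
    """
    log_prob = dist.log_prob(u)
    # Stable form of log(1 - tanh(u)^2); always finite, never a log of a negative.
    log_prob = log_prob - 2.0 * (math.log(2.0) - u - F.softplus(-2.0 * u))
    # Jacobian of the scaling by max_action (constant with respect to u).
    scale = T.as_tensor(max_action, dtype=u.dtype, device=u.device)
    log_prob = log_prob - T.log(scale)
    return log_prob.sum(1, keepdim=True)


# Usage with made-up shapes: batch of 4 states, 2 action dimensions.
dist = Normal(T.zeros(4, 2), T.ones(4, 2))
u = dist.rsample()                      # pre-tanh sample
action = T.tanh(u) * 2.0                # what is sent to the environment
log_probs = squashed_log_prob(dist, u, max_action=2.0)
```

The modification in the question (taking `T.tanh(actions).pow(2)` inside the log) matches this up to the constant log(max_action) term and the small `reparam_noise` offset.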

