I don't recommend using ReLU: the ReLU function is not differentiable at x = 0, and its derivative is discontinuous there (0 for x < 0, 1 for x > 0). Since the force is the negative gradient of the energy, this non-smoothness carries over directly into the predicted forces.
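As a minimal illustration (plain NumPy, not DeePMD-kit code; the one-parameter toy energy E(x) = relu(w·x) and the weight w = 2 are made up for the example), the sketch below shows how the jump in the ReLU derivative appears directly in the force, while a smooth activation such as tanh stays continuous:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def d_relu(x):
    # Derivative of ReLU: 0 for x < 0, 1 for x > 0, undefined exactly at x = 0.
    return (x > 0).astype(float)

def d_tanh(x):
    return 1.0 - np.tanh(x) ** 2

# Toy one-parameter "energy" E(x) = relu(w * x); the corresponding "force"
# is -dE/dx = -w * relu'(w * x).  The weight w = 2 is arbitrary.
w = 2.0
x = np.array([-1e-6, 0.0, 1e-6])

print("energy with ReLU:", relu(w * x))           # continuous through x = 0
print("force  with ReLU:", -w * d_relu(w * x))    # jumps from 0 to -2 at x = 0
print("force  with tanh:", -w * d_tanh(w * x))    # stays close to -2, smooth
```

In DeePMD-kit the activation is selected with the `activation_function` key in the descriptor and fitting-net sections of the training input, so switching from `relu` to a smooth choice such as `tanh` only requires editing the input script.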
Hello,


I am training a model with the "se_a" descriptor type, using "ReLU" as the activation function and a batch size of 1. The problem is that the loss converges slowly and there is a significant gap between the training and test losses (after 170,000 batches the training loss is 12 and the test loss is 13). The force losses are also increasing. What is your advice for better convergence?
I have attached the "lcurve.out" plot of the total and force losses. I would be thankful for any help with this issue.
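For context, the setup described above corresponds roughly to the following fragment of a DeePMD-kit training input (shown here as a Python dict and written out as JSON; the `sel`, cutoff, and network widths are placeholders I invented, and the exact placement of `batch_size` differs between DeePMD-kit versions):

```python
import json

# Rough sketch of the relevant parts of the training input described above:
# "se_a" descriptor, ReLU activation, batch size 1.  Values not given in the
# question (sel, rcut, network widths) are placeholders; other required keys
# and sections (rcut_smth, data paths, loss, learning rate, ...) are omitted.
config = {
    "model": {
        "descriptor": {
            "type": "se_a",
            "sel": [46, 92],                  # placeholder neighbor selection
            "rcut": 6.0,                      # placeholder cutoff radius
            "neuron": [25, 50, 100],          # placeholder embedding-net widths
            "activation_function": "relu",    # the activation discussed above
        },
        "fitting_net": {
            "neuron": [240, 240, 240],        # placeholder fitting-net widths
            "activation_function": "relu",
        },
    },
    "training": {
        "batch_size": 1,                      # batch size from the question
    },
}

# Dump in the JSON form DeePMD-kit reads (file name is illustrative).
with open("input.json", "w") as f:
    json.dump(config, f, indent=4)
```

The two `activation_function` entries are where the reply above suggests moving away from `relu`.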