Description
As mentioned in the paper, the optimization loss is assumed to be non-negative. However, many objectives are not necessarily non-negative, so it would be helpful to remove this restriction.
Currently, the optimizer fails without warning when the loss drops below zero.
To demonstrate this, I modified a single line in L4_MNIST.ipynb to subtract a constant from the cross-entropy objective: train_op = opt.minimize(cross_entropy - 5.) (a sketch of the full setup follows the log below).
This, of course, should not affect the optimization, yet when run, the optimizer fails completely:
Epoch 0; Current Batch Loss: 2.3025853633880615
Epoch 1; Current Batch Loss: 1.02925705909729
Epoch 2; Current Batch Loss: 8078224896.0
Epoch 3; Current Batch Loss: 10578548736.0
Epoch 4; Current Batch Loss: 210841823608832.0
Epoch 5; Current Batch Loss: nan
Epoch 6; Current Batch Loss: nan
Epoch 7; Current Batch Loss: nan
[...]
Test accuracy: 0.09799999743700027
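For completeness, this is roughly what the modified setup looks like. It is a sketch, not a verbatim copy of the notebook: the model definition, the fraction value, and the L4 import name are assumptions on my part.

import tensorflow as tf
import L4  # this repository's optimizer module (import name assumed)

# Simple softmax-regression model standing in for the notebook's network
# (the exact architecture in L4_MNIST.ipynb may differ).
x = tf.placeholder(tf.float32, [None, 784])
y_ = tf.placeholder(tf.float32, [None, 10])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
logits = tf.matmul(x, W) + b

cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(labels=y_, logits=logits))

opt = L4.L4Adam(fraction=0.15)  # fraction value assumed

# The only change relative to the notebook: shift the objective by a constant.
# The minimizer is unchanged, but the loss becomes negative once
# cross_entropy drops below 5.0, which is when training blows up.
train_op = opt.minimize(cross_entropy - 5.)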
As a temporary work-around, a transformation such as
pos_loss = tf.exp(loss - initial_loss)
may be used, where initial_loss is the loss evaluated at the beginning of the optimization process.
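A minimal sketch of how this can be wired up in a TF1-style graph; the toy objective, the fraction value, and the L4 import name are my assumptions rather than code from the notebook:

import tensorflow as tf
import L4  # this repository's optimizer module (import name assumed)

# Toy objective whose minimum is negative: (w - 3)^2 - 5.
w = tf.Variable(0.0)
loss = tf.square(w - 3.0) - 5.0

# Feed the shift through a placeholder so the graph can be built up front.
initial_loss_ph = tf.placeholder(tf.float32, shape=[])
# exp(.) keeps the surrogate strictly positive; it equals 1 at the start
# and decreases monotonically as the original loss decreases.
pos_loss = tf.exp(loss - initial_loss_ph)

opt = L4.L4Adam(fraction=0.15)  # fraction value assumed
train_op = opt.minimize(pos_loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    initial_loss = sess.run(loss)  # loss at the start of optimization
    for _ in range(200):
        _, current = sess.run([train_op, loss],
                              feed_dict={initial_loss_ph: initial_loss})
    print("final loss:", current)  # should approach -5 if this works

Feeding initial_loss through a placeholder keeps the graph static; alternatively, the first evaluated loss value could be baked in as a Python constant before building train_op.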
The L4 optimizer does function with this transformation, but empirically it does not seem to perform very well.
It would be ideal if the optimizer could work with the original (un-transformed) loss.