Better testing and documentation of the LR Finder #6

@wrongu

Description

See `src/nn_lib/optim/lr_finder.py`. The concept is good; we improve on the Lightning LR Finder by using isotonic regression to smooth out the losses. But...

  • empirically, it sometimes errs on the side of a too-small LR, which is problematic e.g. when stitching and training 'to convergence'. A too-small LR up front may simply result in the system not learning at all.
  • we haven't thoroughly tested the behavior of the LR Finder with optimizers other than SGD. In other projects we went ahead and used the LR Finder with the Adam optimizer, and it seemed to work most of the time. However, any adaptive optimizer (such as Adam) has the odd property that its behavior at larger LRs is influenced by the history of gradients/updates accumulated during the smaller-LR steps. It is unclear whether or how this affects the result. Most likely, the best solution is to have the LRFinder instantiate its own momentum-less SGD optimizer rather than have the user pass in an optimizer.
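A minimal sketch of the proposed direction, for discussion: the finder owns its own momentum-less SGD sweep (so no adaptive-optimizer state leaks from small-LR steps into large-LR steps) and smooths the raw losses with a non-increasing isotonic fit. The pool-adjacent-violators implementation, the steepest-slope selection rule, and the `model_step` callback are all illustrative assumptions here, not the actual `lr_finder.py` logic.

```python
import numpy as np


def isotonic_decreasing(losses):
    # Fit the best (least-squares) non-increasing curve to the raw losses
    # via pool-adjacent-violators on the negated sequence. Stand-in for the
    # isotonic smoothing step; the real lr_finder.py code may differ.
    blocks = []  # each block: [sum, count] of pooled negated losses
    for v in -np.asarray(losses, dtype=float):
        blocks.append([v, 1])
        # Merge while block means violate monotonicity (compare means as
        # cross-products to avoid division).
        while len(blocks) > 1 and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    out = []
    for s, c in blocks:
        out.extend([-s / c] * c)
    return np.array(out)


def suggest_lr(lrs, losses):
    # Pick the LR where the smoothed loss drops fastest in log-LR space.
    # This steepest-slope heuristic is an assumption for illustration; a
    # deliberately less conservative rule could counter the too-small-LR bias.
    smooth = isotonic_decreasing(losses)
    slopes = np.diff(smooth) / np.diff(np.log(lrs))
    return lrs[int(np.argmin(slopes)) + 1]


def find_lr(model_step, lr_lo=1e-6, lr_hi=1.0, steps=100):
    # Sketch of the proposed API: the finder would build its OWN plain
    # momentum-less SGD internally (e.g. torch.optim.SGD(params, lr, momentum=0.0))
    # instead of taking the user's possibly-adaptive optimizer. `model_step(lr)`
    # is a hypothetical callback that applies one such step at `lr` and
    # returns the resulting loss.
    lrs = np.geomspace(lr_lo, lr_hi, steps)
    losses = [model_step(lr) for lr in lrs]
    return suggest_lr(lrs, losses)
```

Owning the optimizer sidesteps the Adam history problem entirely, at the cost of tuning the LR for an optimizer the user may not actually train with; that trade-off is worth flagging in the docs.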

Metadata

Assignees

No one assigned

    Labels

    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests