See src/nn_lib/optim/lr_finder.py. The concept is good; we improve on the Lightning LR Finder by using isotonic regression to smooth out the losses (a sketch of the smoothing step follows the list below). But...
- empirically, it sometimes seems to err on the side of a too-small LR, which is problematic e.g. when stitching and training 'to convergence': a too-small LR up front may simply result in the system not learning at all.
- we haven't thoroughly tested the behavior of the LR Finder when the optimizer is something other than SGD. In other projects we went ahead and used the LR Finder with the Adam optimizer, and it seemed to work most of the time. However, any adaptive optimizer (such as Adam) has the strange property that steps taken at larger LRs are influenced by the history of gradients/updates accumulated at the smaller LRs, and it is unclear how or whether this affects the suggested LR. Most likely the best fix is to have the LRFinder instantiate its own momentum-less SGD optimizer rather than have the user pass one in (see the second sketch below).
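For reference, here is a minimal sketch of the isotonic-smoothing idea, assuming scikit-learn's `IsotonicRegression` and the usual steepest-descent LR-finder heuristic. The function name and the suggestion rule are hypothetical; the actual code in src/nn_lib/optim/lr_finder.py may differ.

```python
# Hypothetical sketch, not the exact code in src/nn_lib/optim/lr_finder.py.
# Assumes (lr, loss) pairs were recorded during an exponential LR sweep.
import numpy as np
from sklearn.isotonic import IsotonicRegression


def smooth_and_suggest(lrs, losses):
    log_lrs = np.log10(np.asarray(lrs))
    losses = np.asarray(losses)
    # Raw losses are noisy, but up to the divergence point the trend should be
    # non-increasing, so fit a decreasing isotonic regression to denoise them.
    iso = IsotonicRegression(increasing=False)
    smoothed = iso.fit_transform(log_lrs, losses)
    # Suggest the LR at the steepest descent of the smoothed curve (the
    # standard LR-finder heuristic; the repo's rule may be different).
    slopes = np.gradient(smoothed, log_lrs)
    return 10 ** log_lrs[np.argmin(slopes)]
```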
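And a sketch of the proposed fix from the second bullet: the finder builds its own plain SGD over the model's parameters, so adaptive-optimizer state accumulated at small LRs cannot leak into the large-LR steps of the sweep. All names here (`lr_sweep`, the defaults) are hypothetical, not the current LRFinder API.

```python
# Hypothetical sketch of the proposed internal momentum-less SGD sweep.
import torch


def lr_sweep(model, loss_fn, data_iter, lr_min=1e-7, lr_max=1.0, steps=100):
    # Momentum-less SGD: each step depends only on the current gradient,
    # avoiding the cross-LR history effects described above.
    opt = torch.optim.SGD(model.parameters(), lr=lr_min, momentum=0.0)
    gamma = (lr_max / lr_min) ** (1 / (steps - 1))
    lrs, losses = [], []
    for i, (x, y) in zip(range(steps), data_iter):
        lr = lr_min * gamma**i  # exponential LR schedule across the sweep
        for group in opt.param_groups:
            group["lr"] = lr
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
        lrs.append(lr)
        losses.append(loss.item())
    return lrs, losses
```

Note that the sweep mutates the model's weights; a real implementation would also snapshot and restore the model state afterwards, as the Lightning LR Finder does.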