Mean behavior is differentiable

This repository explores a behavior of neural network training described in Keller Jordan's post "Mean behavior is differentiable in the learning rate". Neural network training is an unpredictable, unstructured process, in contrast to structured engineering work: two models trained with the same hyperparameters and dataset can end up with different behaviors. The goal is to make deep learning a more structured, predictable discipline.

The finding from Keller's post that I try to reproduce is the learning rate behavior: the learning rate has a locally linear effect on a neural network's average behavior across repeated runs of training.
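To make "locally linear in the learning rate" concrete, here is a minimal toy sketch (not code from this repository): noisy SGD on a 1-D quadratic loss stands in for network training, a run's final loss stands in for its "behavior", and the mean over many seeds at three nearby learning rates is compared against the linear midpoint prediction.

```python
import numpy as np

def train(lr, seed, steps=200):
    """Noisy SGD on the toy loss f(w) = 0.5 * w**2 (a stand-in for a network)."""
    rng = np.random.default_rng(seed)
    w = 1.0
    for _ in range(steps):
        grad = w + 0.1 * rng.standard_normal()  # stochastic gradient
        w -= lr * grad
    return 0.5 * w**2  # this run's "behavior": the final loss

def mean_behavior(lr, n_runs=500):
    """Average behavior over repeated training runs with different seeds."""
    return float(np.mean([train(lr, seed) for seed in range(n_runs)]))

# If the effect of lr on mean behavior is locally linear, the value at the
# middle learning rate sits near the midpoint of the two outer values.
lrs = [0.09, 0.10, 0.11]
means = [mean_behavior(lr) for lr in lrs]
midpoint = 0.5 * (means[0] + means[2])
print(means, abs(means[1] - midpoint))
```

Each individual run is noisy, but the mean over seeds varies smoothly with the learning rate, which is the phenomenon the repository's experiments probe at full scale.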

The paper "Edge of Stochastic Stability: Revisiting the Edge of Stability for SGD" by Arseniy Andreyev and Pierfrancesco Beneventano attempts to explain this phenomenon through measurements of batch sharpness and λ_max (the largest eigenvalue of the loss Hessian).
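For networks, λ_max is typically estimated with power iteration on Hessian-vector products rather than by forming the Hessian explicitly. A minimal illustration (not the paper's or this repository's code), verified here on a small explicit matrix standing in for a loss Hessian:

```python
import numpy as np

def lambda_max(hessian_vec_prod, dim, iters=100, seed=0):
    """Estimate the largest Hessian eigenvalue by power iteration,
    using only Hessian-vector products."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)
    for _ in range(iters):
        hv = hessian_vec_prod(v)
        v = hv / np.linalg.norm(hv)
    # Rayleigh quotient of the converged direction
    return float(v @ hessian_vec_prod(v))

# Toy check: a symmetric matrix with known top eigenvalue 3.0.
H = np.diag([3.0, 1.0, 0.5])
est = lambda_max(lambda v: H @ v, dim=3)
print(est)
```

In a deep learning setting the same loop would use autodiff Hessian-vector products of the training loss instead of an explicit matrix.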

Quick Start

```
pip install -r requirements.txt
sbatch run_experiment.sbatch
```
