pytorch-optimizer v3.3.0
Change Log
Feature
- Support `PaLM` variant for `ScheduleFreeAdamW` optimizer. (#286, #288)
  - You can use this feature by setting `use_palm` to `True`.
- Implement `ADOPT` optimizer. (#289, #290)
- Implement `FTRL` optimizer. (#291)
- Implement `Cautious optimizer` feature. (#294)
  - Improving Training with One Line of Code
  - You can use it by setting `cautious=True` for `Lion`, `AdaFactor` and `AdEMAMix` optimizers.
- Improve the stability of `ADOPT` optimizer. (#294)
- Support a new projection type `random` for `GaLoreProjector`. (#294)
- Implement `DeMo` optimizer. (#300, #301)
- Implement `Muon` optimizer. (#302)
- Implement `ScheduleFreeRAdam` optimizer. (#304)
- Implement `LaProp` optimizer. (#304)
- Support `Cautious` variant for `LaProp`, `AdamP`, `Adopt` optimizers. (#304)
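
The two flags named in the list above (`use_palm` and `cautious`) can be sketched as follows. This is a minimal illustration, not a recommended configuration: the learning rates are arbitrary placeholder values, and `build_optimizers` is a hypothetical helper introduced only for this example. It assumes `Lion` and `ScheduleFreeAdamW` are still importable from the top-level package, which the refactor notes do not contradict.

```python
# Keyword arguments taken from the change log; the lr values are placeholders.
LION_KWARGS = {"lr": 1e-4, "cautious": True}
SCHEDULE_FREE_KWARGS = {"lr": 2.5e-3, "use_palm": True}


def build_optimizers(model):
    """Hypothetical helper: build both optimizers for a torch.nn.Module."""
    # Imported lazily so this snippet can be read and loaded without the library.
    from pytorch_optimizer import Lion, ScheduleFreeAdamW

    # Lion with the cautious update enabled.
    lion = Lion(model.parameters(), **LION_KWARGS)
    # ScheduleFreeAdamW using the PaLM variant.
    schedule_free = ScheduleFreeAdamW(model.parameters(), **SCHEDULE_FREE_KWARGS)
    return lion, schedule_free
```

In practice only one optimizer would be attached to a model; both are built here just to show where each flag goes.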
Refactor
- Big refactoring, removing direct import from `pytorch_optimizer.*`.
  - I removed some methods from the top-level `pytorch_optimizer.*` namespace because they are probably not used frequently and are not optimizers themselves, only utilities used by specific optimizers.
  - `pytorch_optimizer.[Shampoo stuff]` -> `pytorch_optimizer.optimizers.shampoo_utils.[Shampoo stuff]`.
    - `shampoo_utils` like `Graft`, `BlockPartitioner`, `PreConditioner`, etc. You can check the details here.
  - `pytorch_optimizer.GaLoreProjector` -> `pytorch_optimizer.optimizers.galore.GaLoreProjector`.
  - `pytorch_optimizer.gradfilter_ema` -> `pytorch_optimizer.optimizers.grokfast.gradfilter_ema`.
  - `pytorch_optimizer.gradfilter_ma` -> `pytorch_optimizer.optimizers.grokfast.gradfilter_ma`.
  - `pytorch_optimizer.l2_projection` -> `pytorch_optimizer.optimizers.alig.l2_projection`.
  - `pytorch_optimizer.flatten_grad` -> `pytorch_optimizer.optimizers.pcgrad.flatten_grad`.
  - `pytorch_optimizer.un_flatten_grad` -> `pytorch_optimizer.optimizers.pcgrad.un_flatten_grad`.
  - `pytorch_optimizer.reduce_max_except_dim` -> `pytorch_optimizer.optimizers.sm3.reduce_max_except_dim`.
  - `pytorch_optimizer.neuron_norm` -> `pytorch_optimizer.optimizers.nero.neuron_norm`.
  - `pytorch_optimizer.neuron_mean` -> `pytorch_optimizer.optimizers.nero.neuron_mean`.
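
For code that imported these helpers from the top-level package, the moves listed above can be captured in a small lookup. This is only a sketch for migrating import lines: `MOVED` and `migrated_import` are hypothetical names introduced here, and the table contains just the names the change log spells out (other Shampoo utilities are covered by the `[Shampoo stuff]` rule but are not enumerated).

```python
# Helper name -> new module, exactly as listed in the change log.
MOVED = {
    "Graft": "pytorch_optimizer.optimizers.shampoo_utils",
    "BlockPartitioner": "pytorch_optimizer.optimizers.shampoo_utils",
    "PreConditioner": "pytorch_optimizer.optimizers.shampoo_utils",
    "GaLoreProjector": "pytorch_optimizer.optimizers.galore",
    "gradfilter_ema": "pytorch_optimizer.optimizers.grokfast",
    "gradfilter_ma": "pytorch_optimizer.optimizers.grokfast",
    "l2_projection": "pytorch_optimizer.optimizers.alig",
    "flatten_grad": "pytorch_optimizer.optimizers.pcgrad",
    "un_flatten_grad": "pytorch_optimizer.optimizers.pcgrad",
    "reduce_max_except_dim": "pytorch_optimizer.optimizers.sm3",
    "neuron_norm": "pytorch_optimizer.optimizers.nero",
    "neuron_mean": "pytorch_optimizer.optimizers.nero",
}


def migrated_import(name: str) -> str:
    """Return the new import statement for a helper that used to be top-level."""
    if name not in MOVED:
        # Unknown names are not guessed; only the listed moves are handled.
        raise KeyError(f"{name!r} is not in the list of moved helpers")
    return f"from {MOVED[name]} import {name}"


if __name__ == "__main__":
    # Print the replacement import line for every listed helper.
    for helper in MOVED:
        print(migrated_import(helper))
```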
Docs
- Add more visualizations. (#297)
Bug
- Add optimizer parameter to `PolyScheduler` constructor. (#295)
Contributions
thanks to @tanganke