2 files changed: +17 -1 lines changed
1+ # SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
2+ # SPDX-License-Identifier: Apache-2.0
3+ #
4+ # Licensed under the Apache License, Version 2.0 (the "License");
5+ # you may not use this file except in compliance with the License.
6+ # You may obtain a copy of the License at
7+ #
8+ # http://www.apache.org/licenses/LICENSE-2.0
9+ #
10+ # Unless required by applicable law or agreed to in writing, software
11+ # distributed under the License is distributed on an "AS IS" BASIS,
12+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13+ # See the License for the specific language governing permissions and
14+ # limitations under the License.
15+ from emerging_optimizers.psgd.psgd import *
@@ -35,7 +35,8 @@ class PSGDPro(torch.optim.Optimizer):
35 35 Preconditioned Stochastic Gradient Descent (PSGD) (https://arxiv.org/abs/1512.04202) is a preconditioned optimization algorithm
36 36 that fits amplitudes of perturbations of preconditioned stochastic gradient to match that of the perturbations of parameters.
37 37 PSGD with Kronecker-factored Preconditioner (PSGD-Kron-Whiten) is a variant of PSGD that reduces memory and computational complexity.
38- Procrustes step is an algorithm to update the preconditioner which respects a particular geometry.
38+ Procrustes step is an algorithm to update the preconditioner which respects a particular geometry: Q^0.5 * E * Q^1.5, see Stochastic Hessian
39+ Fittings with Lie Groups (https://arxiv.org/abs/2402.11858) for more details.
39 40
40 41 Args:
41 42 params: Iterable of parameters to optimize or dicts defining parameter groups
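
Note on the docstring's claim that a Kronecker-factored preconditioner reduces memory and computational complexity: the sketch below is illustrative only and is not code from `emerging_optimizers`. It assumes the preconditioner for an m x n gradient `G` is held as two small factors, `A` (m x m) and `B` (n x n), and applied as `A G B^T`. The helper names (`matmul`, `kron`, `vec`) and the toy values are hypothetical. It checks the standard identity (B ⊗ A) vec(G) = vec(A G B^T) and compares how many entries each form stores.

```python
# Illustrative sketch only: not how PSGDPro is implemented.
# Assumption: the factors A (m x m) and B (n x n) act on G as A @ G @ B^T.

def matmul(X, Y):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def kron(B, A):
    """Dense Kronecker product B (x) A as a nested list."""
    return [[b * a for b in brow for a in arow] for brow in B for arow in A]

def vec(X):
    """Stack the columns of X into one flat list (column-major order)."""
    return [x for col in zip(*X) for x in col]

A = [[2.0, 1.0], [0.0, 3.0]]                               # m x m, m = 2
B = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0], [1.0, 0.0, 1.0]]    # n x n, n = 3
G = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]                     # m x n gradient

# Factored application: two small matrix products.
factored = vec(matmul(matmul(A, G), transpose(B)))

# Dense application: one (m*n) x (m*n) matrix times vec(G).
K = kron(B, A)
g = vec(G)
dense = [sum(k * x for k, x in zip(row, g)) for row in K]

dense_entries = len(K) * len(K[0])                           # (m*n)^2
factored_entries = len(A) * len(A[0]) + len(B) * len(B[0])   # m^2 + n^2
```

Both paths give the same vector, but the dense form stores (mn)^2 entries while the factored form stores m^2 + n^2, and that gap widens quickly as layer dimensions grow.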