
Commit fb1941f

Merge pull request #122 from PerformanceEstimation/features/FISTA
Features/fista
2 parents ccc28ff + 1999ea0 commit fb1941f

16 files changed, +498 −152 lines

PEPit/examples/composite_convex_minimization/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -1,5 +1,6 @@
 from .accelerated_douglas_rachford_splitting import wc_accelerated_douglas_rachford_splitting
 from .accelerated_proximal_gradient import wc_accelerated_proximal_gradient
+from .accelerated_proximal_gradient_simplified import wc_accelerated_proximal_gradient_simplified
 from .bregman_proximal_point import wc_bregman_proximal_point
 from .douglas_rachford_splitting import wc_douglas_rachford_splitting
 from .douglas_rachford_splitting_contraction import wc_douglas_rachford_splitting_contraction
@@ -13,6 +14,7 @@
 
 __all__ = ['accelerated_douglas_rachford_splitting', 'wc_accelerated_douglas_rachford_splitting',
            'accelerated_proximal_gradient', 'wc_accelerated_proximal_gradient',
+           'accelerated_proximal_gradient_simplified', 'wc_accelerated_proximal_gradient_simplified',
            'bregman_proximal_point', 'wc_bregman_proximal_point',
            'douglas_rachford_splitting', 'wc_douglas_rachford_splitting',
            'douglas_rachford_splitting_contraction', 'wc_douglas_rachford_splitting_contraction',

PEPit/examples/composite_convex_minimization/accelerated_proximal_gradient.py

Lines changed: 32 additions & 33 deletions
@@ -1,3 +1,5 @@
+from math import sqrt
+
 from PEPit import PEP
 from PEPit.functions import SmoothStronglyConvexFunction
 from PEPit.functions import ConvexFunction
@@ -14,7 +16,7 @@ def wc_accelerated_proximal_gradient(mu, L, n, wrapper="cvxpy", solver=None, ver
     and where :math:`h` is closed convex and proper.
 
     This code computes a worst-case guarantee for the **accelerated proximal gradient** method,
-    also known as **fast proximal gradient (FPGM)** method.
+    also known as **fast proximal gradient (FPGM)** method or FISTA [1].
     That is, it computes the smallest possible :math:`\\tau(n, L, \\mu)` such that the guarantee
 
     .. math :: F(x_n) - F(x_\\star) \\leqslant \\tau(n, L, \\mu) \\|x_0 - x_\\star\\|^2,
@@ -26,31 +28,26 @@ def wc_accelerated_proximal_gradient(mu, L, n, wrapper="cvxpy", solver=None, ver
     :math:`\\tau(n, L, \\mu)` is computed as the worst-case value of
     :math:`F(x_n) - F(x_\\star)` when :math:`\\|x_0 - x_\\star\\|^2 \\leqslant 1`.
 
-    **Algorithm**: Accelerated proximal gradient is described as follows, for :math:`t \in \\{ 0, \\dots, n-1\\}`,
+    **Algorithm**: Initialize :math:`\\lambda_1=1`, :math:`y_1=x_0`. One iteration of FISTA is described by
 
     .. math::
-        :nowrap:
 
         \\begin{eqnarray}
-            x_{t+1} & = & \\arg\\min_x \\left\\{h(x)+\\frac{L}{2}\|x-\\left(y_{t} - \\frac{1}{L} \\nabla f(y_t)\\right)\\|^2 \\right\\}, \\\\
-            y_{t+1} & = & x_{t+1} + \\frac{i}{i+3} (x_{t+1} - x_{t}),
+            \\text{Set: }\\lambda_{t+1} & = & \\frac{1 + \\sqrt{4\\lambda_t^2 + 1}}{2}\\\\
+            x_t & = & \\arg\\min_x \\left\\{h(x)+\\frac{L}{2}\|x-\\left(y_t - \\frac{1}{L} \\nabla f(y_t)\\right)\\|^2 \\right\\}\\\\
+            y_{t+1} & = & x_t + \\frac{\\lambda_t-1}{\\lambda_{t+1}} (x_t-x_{t-1}).
         \\end{eqnarray}
 
-    where :math:`y_{0} = x_0`.
-
-    **Theoretical guarantee**: A **tight** (empirical) worst-case guarantee for FPGM is obtained in
-    [1, method FPGM1 in Sec. 4.2.1, Table 1 in sec 4.2.2], for :math:`\\mu=0`:
-
-    .. math:: F(x_n) - F_\\star \\leqslant \\frac{2 L}{n^2+5n+2} \\|x_0 - x_\\star\\|^2,
+    **Theoretical guarantee**: The following worst-case guarantee can be found in e.g., [1, Theorem 4.4]:
 
-    which is attained on simple one-dimensional constrained linear optimization problems.
+    .. math:: f(x_n)-f_\\star \\leqslant \\frac{L}{2}\\frac{\\|x_0-x_\\star\\|^2}{\\lambda_n^2}.
 
     **References**:
-
-    `[1] A. Taylor, J. Hendrickx, F. Glineur (2017).
-    Exact worst-case performance of first-order methods for composite convex optimization.
-    SIAM Journal on Optimization, 27(3):1283–1313.
-    <https://arxiv.org/pdf/1512.07516.pdf>`_
+
+    `[1] A. Beck, M. Teboulle (2009).
+    A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems.
+    SIAM journal on imaging sciences, 2009, vol. 2, no 1, p. 183-202.
+    <https://www.ceremade.dauphine.fr/~carlier/FISTA>`_
 
 
     Args:
@@ -84,19 +81,19 @@ def wc_accelerated_proximal_gradient(mu, L, n, wrapper="cvxpy", solver=None, ver
         (PEPit) Setting up the problem: additional constraints for 0 function(s)
         (PEPit) Compiling SDP
         (PEPit) Calling SDP solver
-        (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.05263158422835028
+        (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.05167329605152958
         (PEPit) Primal feasibility check:
-                The solver found a Gram matrix that is positive semi-definite up to an error of 5.991982341524508e-09
-                All the primal scalar constraints are verified up to an error of 1.4780313955381486e-08
+                The solver found a Gram matrix that is positive semi-definite up to an error of 6.64684463996332e-09
+                All the primal scalar constraints are verified up to an error of 1.6451693951591295e-08
         (PEPit) Dual feasibility check:
                 The solver found a residual matrix that is positive semi-definite
                 All the dual scalar values associated with inequality constraints are nonnegative
-        (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 7.783914601477293e-08
-        (PEPit) Final upper bound (dual): 0.052631589673196755 and lower bound (primal example): 0.05263158422835028
-        (PEPit) Duality gap: absolute: 5.444846476465592e-09 and relative: 1.034520726726044e-07
+        (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 8.587603813802402e-08
+        (PEPit) Final upper bound (dual): 0.051673302055698395 and lower bound (primal example): 0.05167329605152958
+        (PEPit) Duality gap: absolute: 6.004168814910393e-09 and relative: 1.1619480996379491e-07
         *** Example file: worst-case performance of the Accelerated Proximal Gradient Method in function values***
-        PEPit guarantee:         f(x_n)-f_* <= 0.0526316 ||x0 - xs||^2
-        Theoretical guarantee:   f(x_n)-f_* <= 0.0526316 ||x0 - xs||^2
+        PEPit guarantee:         f(x_n)-f_* <= 0.0516733 ||x0 - xs||^2
+        Theoretical guarantee:   f(x_n)-f_* <= 0.0661257 ||x0 - xs||^2
 
     """
@@ -118,26 +115,28 @@ def wc_accelerated_proximal_gradient(mu, L, n, wrapper="cvxpy", solver=None, ver
     # Set the initial constraint that is the distance between x0 and x^*
     problem.set_initial_condition((x0 - xs) ** 2 <= 1)
 
-    # Compute n steps of the accelerated proximal gradient method starting from x0
+    # Compute n steps of the accelerated proximal gradient method starting from x0
     x_new = x0
     y = x0
+    lam = 1
     for i in range(n):
+        lam_old = lam
+        lam = (1 + sqrt(4 * lam_old ** 2 + 1)) / 2
         x_old = x_new
         x_new, _, hx_new = proximal_step(y - 1 / L * f.gradient(y), h, 1 / L)
-        y = x_new + i / (i + 3) * (x_new - x_old)
+        y = x_new + (lam_old - 1) / lam * (x_new - x_old)
 
-    # Set the performance metric to the function value accuracy
+    # Set the performance metric to the function value accuracy
     problem.set_performance_metric((f(x_new) + hx_new) - Fs)
 
     # Solve the PEP
     pepit_verbose = max(verbose, 0)
     pepit_tau = problem.solve(wrapper=wrapper, solver=solver, verbose=pepit_verbose)
 
-    # Compute theoretical guarantee (for comparison)
-    if mu == 0:
-        theoretical_tau = 2 * L / (n ** 2 + 5 * n + 2)  # tight, see [2], Table 1 (column 1, line 1)
-    else:
-        theoretical_tau = 2 * L / (n ** 2 + 5 * n + 2)  # not tight (bound for smooth convex functions)
+    # Theoretical guarantee (for comparison)
+    theoretical_tau = L / (2 * lam_old ** 2)
+
+    if mu != 0:
        print('Warning: momentum is tuned for non-strongly convex functions.')
 
     # Print conclusion if required
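The FISTA update introduced in this diff can also be sketched outside of PEPit. The following standalone NumPy implementation mirrors the lambda-momentum loop above, applied to a toy lasso instance (f(x) = ½‖Ax − b‖², h(x) = τ‖x‖₁, whose proximal operator is soft-thresholding). All problem data, the `fista` helper, and its signature are illustrative choices for this sketch, not PEPit code.

```python
import numpy as np

def fista(grad_f, prox_h, L, x0, n):
    """FISTA (accelerated proximal gradient) with the lambda-based momentum
    of Beck & Teboulle; illustrative sketch, not part of PEPit."""
    x_new, y, lam = x0, x0, 1.0
    for _ in range(n):
        lam_old = lam
        lam = (1 + np.sqrt(4 * lam_old ** 2 + 1)) / 2
        x_old = x_new
        # proximal gradient step on f + h with step size 1/L
        x_new = prox_h(y - grad_f(y) / L, 1 / L)
        # momentum extrapolation with coefficient (lam_old - 1) / lam
        y = x_new + (lam_old - 1) / lam * (x_new - x_old)
    return x_new

# Toy lasso instance: f(x) = 0.5 * ||A x - b||^2, h(x) = tau * ||x||_1.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))
b = rng.standard_normal(20)
tau = 0.1
L = np.linalg.norm(A, 2) ** 2                # smoothness constant of f (spectral norm squared)
grad_f = lambda x: A.T @ (A @ x - b)
# prox of t * tau * ||.||_1 is soft-thresholding at level t * tau
prox_h = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t * tau, 0.0)

F = lambda x: 0.5 * np.linalg.norm(A @ x - b) ** 2 + tau * np.linalg.norm(x, 1)
x = fista(grad_f, prox_h, L, np.zeros(10), n=200)
print(F(np.zeros(10)), F(x))                 # objective decreases from its value at x0
```

Note that the step size 1/L here uses the exact Lipschitz constant of the gradient of f; Beck and Teboulle also give a backtracking variant when L is unknown.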
PEPit/examples/composite_convex_minimization/accelerated_proximal_gradient_simplified.py

Lines changed: 155 additions & 0 deletions
@@ -0,0 +1,155 @@
+from PEPit import PEP
+from PEPit.functions import SmoothStronglyConvexFunction
+from PEPit.functions import ConvexFunction
+from PEPit.primitive_steps import proximal_step
+
+
+def wc_accelerated_proximal_gradient_simplified(mu, L, n, wrapper="cvxpy", solver=None, verbose=1):
+    """
+    Consider the composite convex minimization problem
+
+    .. math:: F_\\star \\triangleq \\min_x \\{F(x) \equiv f(x) + h(x)\\},
+
+    where :math:`f` is :math:`L`-smooth and :math:`\\mu`-strongly convex,
+    and where :math:`h` is closed convex and proper.
+
+    This code computes a worst-case guarantee for the **accelerated proximal gradient** method,
+    also known as **fast proximal gradient (FPGM)** method.
+    That is, it computes the smallest possible :math:`\\tau(n, L, \\mu)` such that the guarantee
+
+    .. math :: F(x_n) - F(x_\\star) \\leqslant \\tau(n, L, \\mu) \\|x_0 - x_\\star\\|^2,
+
+    is valid, where :math:`x_n` is the output of the **accelerated proximal gradient** method,
+    and where :math:`x_\\star` is a minimizer of :math:`F`.
+
+    In short, for given values of :math:`n`, :math:`L` and :math:`\\mu`,
+    :math:`\\tau(n, L, \\mu)` is computed as the worst-case value of
+    :math:`F(x_n) - F(x_\\star)` when :math:`\\|x_0 - x_\\star\\|^2 \\leqslant 1`.
+
+    **Algorithm**: Accelerated proximal gradient is described as follows, for :math:`t \in \\{ 0, \\dots, n-1\\}`,
+
+    .. math::
+        :nowrap:
+
+        \\begin{eqnarray}
+            x_{t+1} & = & \\arg\\min_x \\left\\{h(x)+\\frac{L}{2}\|x-\\left(y_{t} - \\frac{1}{L} \\nabla f(y_t)\\right)\\|^2 \\right\\}, \\\\
+            y_{t+1} & = & x_{t+1} + \\frac{i}{i+3} (x_{t+1} - x_{t}),
+        \\end{eqnarray}
+
+    where :math:`y_{0} = x_0`.
+
+    **Theoretical guarantee**: A **tight** (empirical) worst-case guarantee for FPGM is obtained in
+    [1, method FPGM1 in Sec. 4.2.1, Table 1 in sec 4.2.2], for :math:`\\mu=0`:
+
+    .. math:: F(x_n) - F_\\star \\leqslant \\frac{2 L}{n^2+5n+2} \\|x_0 - x_\\star\\|^2,
+
+    which is attained on simple one-dimensional constrained linear optimization problems.
+
+    **References**:
+
+    `[1] A. Taylor, J. Hendrickx, F. Glineur (2017).
+    Exact worst-case performance of first-order methods for composite convex optimization.
+    SIAM Journal on Optimization, 27(3):1283–1313.
+    <https://arxiv.org/pdf/1512.07516.pdf>`_
+
+
+    Args:
+        L (float): the smoothness parameter.
+        mu (float): the strong convexity parameter.
+        n (int): number of iterations.
+        wrapper (str): the name of the wrapper to be used.
+        solver (str): the name of the solver the wrapper should use.
+        verbose (int): level of information details to print.
+
+                        - -1: No verbose at all.
+                        - 0: This example's output.
+                        - 1: This example's output + PEPit information.
+                        - 2: This example's output + PEPit information + solver details.
+
+    Returns:
+        pepit_tau (float): worst-case value.
+        theoretical_tau (float): theoretical value.
+
+    Example:
+        >>> pepit_tau, theoretical_tau = wc_accelerated_proximal_gradient_simplified(L=1, mu=0, n=4, wrapper="cvxpy", solver=None, verbose=1)
+        (PEPit) Setting up the problem: size of the Gram matrix: 12x12
+        (PEPit) Setting up the problem: performance measure is the minimum of 1 element(s)
+        (PEPit) Setting up the problem: Adding initial conditions and general constraints ...
+        (PEPit) Setting up the problem: initial conditions and general constraints (1 constraint(s) added)
+        (PEPit) Setting up the problem: interpolation conditions for 2 function(s)
+                Function 1 : Adding 30 scalar constraint(s) ...
+                Function 1 : 30 scalar constraint(s) added
+                Function 2 : Adding 20 scalar constraint(s) ...
+                Function 2 : 20 scalar constraint(s) added
+        (PEPit) Setting up the problem: additional constraints for 0 function(s)
+        (PEPit) Compiling SDP
+        (PEPit) Calling SDP solver
+        (PEPit) Solver status: optimal (wrapper:cvxpy, solver: MOSEK); optimal value: 0.05263158422835028
+        (PEPit) Primal feasibility check:
+                The solver found a Gram matrix that is positive semi-definite up to an error of 5.991982341524508e-09
+                All the primal scalar constraints are verified up to an error of 1.4780313955381486e-08
+        (PEPit) Dual feasibility check:
+                The solver found a residual matrix that is positive semi-definite
+                All the dual scalar values associated with inequality constraints are nonnegative
+        (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 7.783914601477293e-08
+        (PEPit) Final upper bound (dual): 0.052631589673196755 and lower bound (primal example): 0.05263158422835028
+        (PEPit) Duality gap: absolute: 5.444846476465592e-09 and relative: 1.034520726726044e-07
+        *** Example file: worst-case performance of the Accelerated Proximal Gradient Method in function values***
+        PEPit guarantee:         f(x_n)-f_* <= 0.0526316 ||x0 - xs||^2
+        Theoretical guarantee:   f(x_n)-f_* <= 0.0526316 ||x0 - xs||^2
+
+    """
+
+    # Instantiate PEP
+    problem = PEP()
+
+    # Declare a strongly convex smooth function and a convex function
+    f = problem.declare_function(SmoothStronglyConvexFunction, mu=mu, L=L)
+    h = problem.declare_function(ConvexFunction)
+    F = f + h
+
+    # Start by defining its unique optimal point xs = x_* and its function value Fs = F(x_*)
+    xs = F.stationary_point()
+    Fs = F(xs)
+
+    # Then define the starting point x0
+    x0 = problem.set_initial_point()
+
+    # Set the initial constraint that is the distance between x0 and x^*
+    problem.set_initial_condition((x0 - xs) ** 2 <= 1)
+
+    # Compute n steps of the accelerated proximal gradient method starting from x0
+    x_new = x0
+    y = x0
+    for i in range(n):
+        x_old = x_new
+        x_new, _, hx_new = proximal_step(y - 1 / L * f.gradient(y), h, 1 / L)
+        y = x_new + i / (i + 3) * (x_new - x_old)
+
+    # Set the performance metric to the function value accuracy
+    problem.set_performance_metric((f(x_new) + hx_new) - Fs)
+
+    # Solve the PEP
+    pepit_verbose = max(verbose, 0)
+    pepit_tau = problem.solve(wrapper=wrapper, solver=solver, verbose=pepit_verbose)
+
+    # Compute theoretical guarantee (for comparison)
+    theoretical_tau = 2 * L / (n ** 2 + 5 * n + 2)  # tight if mu == 0, see [1], Table 1 (column 1, line 1)
+    if mu != 0:
+        print('Warning: momentum is tuned for non-strongly convex functions.')
+
+    # Print conclusion if required
+    if verbose != -1:
+        print('*** Example file:'
+              ' worst-case performance of the Accelerated Proximal Gradient Method in function values***')
+        print('\tPEPit guarantee:\t f(x_n)-f_* <= {:.6} ||x0 - xs||^2'.format(pepit_tau))
+        print('\tTheoretical guarantee:\t f(x_n)-f_* <= {:.6} ||x0 - xs||^2'.format(theoretical_tau))
+
+    # Return the worst-case guarantee of the evaluated method ( and the reference theoretical value)
+    return pepit_tau, theoretical_tau
+
+
+if __name__ == "__main__":
+    pepit_tau, theoretical_tau = wc_accelerated_proximal_gradient_simplified(L=1, mu=0, n=4,
+                                                                             wrapper="cvxpy", solver=None,
+                                                                             verbose=1)
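The two theoretical bounds quoted in these docstrings are easy to sanity-check numerically. The sketch below evaluates the FISTA guarantee L/(2·λ_n²) via the λ-recurrence and the tight FPGM bound 2L/(n²+5n+2) at the documented example setting (L=1, mu=0, n=4); the helper names `fista_bound` and `fpgm_bound` are ours, not PEPit's.

```python
from math import sqrt

def fista_bound(L, n):
    # lambda_1 = 1, lambda_{t+1} = (1 + sqrt(4 * lambda_t^2 + 1)) / 2;
    # guarantee of [1, Theorem 4.4]: f(x_n) - f_* <= L * ||x0 - xs||^2 / (2 * lambda_n^2)
    lam = 1.0
    for _ in range(n - 1):          # n - 1 updates take lambda_1 to lambda_n
        lam = (1 + sqrt(4 * lam ** 2 + 1)) / 2
    return L / (2 * lam ** 2)

def fpgm_bound(L, n):
    # tight bound for the i/(i+3) momentum variant when mu = 0
    return 2 * L / (n ** 2 + 5 * n + 2)

print(fista_bound(L=1, n=4))  # ~0.0661257, matching the FISTA docstring example
print(fpgm_bound(L=1, n=4))   # ~0.0526316, matching the simplified variant's example
```

This also makes the relationship between the two files concrete: for mu = 0 and small n, the i/(i+3) variant's bound is the smaller of the two, which is consistent with the PEPit worst-case values printed in the respective examples.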

PEPit/examples/composite_convex_minimization/three_operator_splitting.py

Lines changed: 1 addition & 1 deletion
@@ -97,7 +97,7 @@ def wc_three_operator_splitting(mu1, L1, L3, alpha, theta, n, wrapper="cvxpy", s
         (PEPit) Final upper bound (dual): 0.4754523347677999 and lower bound (primal example): 0.4754523346392658
         (PEPit) Duality gap: absolute: 1.285341277856844e-10 and relative: 2.703407227628939e-10
         *** Example file: worst-case performance of the Three Operator Splitting in distance ***
-        PEPit guarantee:         ||w^2_n - w^1_n||^2 <= 0.475452 ||x0 - ws||^2
+        PEPit guarantee:         ||w^1_n - w^0_n||^2 <= 0.475452 ||w^1_0 - w^0_0||^2
 
     """
 

PEPit/examples/nonconvex_optimization/gradient_descent_quadratic_lojasiewicz_expensive.py

Lines changed: 4 additions & 4 deletions
@@ -106,13 +106,13 @@ def wc_gradient_descent_quadratic_lojasiewicz_expensive(L, mu, gamma, n, wrapper
                 All the dual matrices to lmi are positive semi-definite
                 All the dual scalar values associated with inequality constraints are nonnegative up to an error of 5.671954340368105e-10
         (PEPit) The worst-case guarantee proof is perfectly reconstituted up to an error of 2.0306640495891322e-08
-        (PEPit) Final upper bound (dual): 0.6832669563172779 and lower bound (primal example): 0.6832669556328734
+        (PEPit) Final upper bound (dual): 0.6832669563172779 and lower bound (primal example): 0.6832669556328734
         (PEPit) Duality gap: absolute: 6.844044220244427e-10 and relative: 1.0016647466735981e-09
         *** Example file: worst-case performance of gradient descent with fixed step-size ***
-        *** (smooth problem satisfying a Lojasiewicz inequality; expert version) ***
+        *** (smooth problem satisfying a Lojasiewicz inequality; expensive version) ***
         PEPit guarantee:         f(x_1) - f(x_*) <= 0.683267 (f(x_0)-f_*)
         Theoretical guarantee:   f(x_1) - f(x_*) <= 0.727273 (f(x_0)-f_*)
-
+
     """
     # Instantiate PEP
     problem = PEP()
@@ -160,7 +160,7 @@ def wc_gradient_descent_quadratic_lojasiewicz_expensive(L, mu, gamma, n, wrapper
     # Print conclusion if required
     if verbose != -1:
         print('*** Example file: worst-case performance of gradient descent with fixed step-size ***')
-        print('*** \t (smooth problem satisfying a Lojasiewicz inequality; expert version) ***')
+        print('*** \t (smooth problem satisfying a Lojasiewicz inequality; expensive version) ***')
         print('\tPEPit guarantee:\t f(x_1) - f(x_*) <= {:.6} (f(x_0)-f_*)'.format(pepit_tau))
         print('\tTheoretical guarantee:\t f(x_1) - f(x_*) <= {:.6} (f(x_0)-f_*)'.format(theoretical_tau))
 
