examples/blind_deconv.py (72 additions, 2 deletions)
@@ -38,16 +38,78 @@ def _():
    return BiconvexProblem, cp, mo, np, plt


@app.cell(hide_code=True)
def _(mo):
    mo.md(r"""
    ## Introduction

    Blind deconvolution is a technique used to recover a sharp signal or image from a blurred observation when the blur itself is unknown.
    It jointly estimates both the original signal and the blur kernel, using prior knowledge about their structures.

    Suppose we are given a data vector $d \in \mathbf{R}^{m + n - 1}$, which is the convolution of an unknown sparse signal $x \in \mathbf{R}^n$ and an unknown smooth vector $y \in \mathbf{R}^m$ with bounded $\ell_\infty$-norm (i.e., bounded largest entry).
    Additionally, we have the prior knowledge that both vectors $x$ and $y$ are nonnegative.
    The corresponding blind deconvolution problem can be formulated as the following biconvex optimization problem:

    \[
    \begin{array}{ll}
    \text{minimize} & \|x * y - d\|_2^2 + \alpha_{\rm sp} \|x\|_1 + \alpha_{\rm sm} \|D y\|_2^2\\
    \text{subject to} & x \succeq 0,\quad y \succeq 0\\
    & \|y\|_\infty \leq \beta
    \end{array}
    \]

    with variables $x$ and $y$, where $\alpha_{\rm sp}, \alpha_{\rm sm} > 0$ are the regularization parameters for the sparsity of $x$ and the smoothness of $y$, respectively, and $\beta > 0$ is the bound on the $\ell_\infty$-norm of the vector $y$.
    The matrix $D \in \mathbf{R}^{(m - 1) \times m}$ is the first-order difference operator, given by

    \[
    D = \begin{bmatrix}
    -1 & 1 & & & \\
    & -1 & 1 & & \\
    & & \ddots & \ddots & \\
    & & & -1 & 1
    \end{bmatrix}.
    \]
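The diff does not show how the notebook hands this formulation to `BiconvexProblem`, so the following is only a minimal, self-contained sketch of attacking the same problem by alternating convex minimization in plain CVXPY; the sizes `m` and `n`, the parameter values, the synthetic data, and the helper `conv_matrix` are all illustrative assumptions, not the repository's implementation.

```python
import cvxpy as cp
import numpy as np


def conv_matrix(v, n):
    """Return C such that C @ u equals np.convolve(v, u) for u of length n."""
    m = len(v)
    C = np.zeros((m + n - 1, n))
    for j in range(n):
        C[j:j + m, j] = v
    return C


m, n = 30, 50                                   # assumed sizes
alpha_sp, alpha_sm, beta = 1.0, 10.0, 1.0       # assumed parameters
rng = np.random.default_rng(0)
x_true = rng.random(n) * (rng.random(n) < 0.1)  # sparse, nonnegative signal
y_true = np.sin(np.linspace(0, np.pi, m))       # smooth, nonnegative kernel
d = np.convolve(x_true, y_true)

D = np.diff(np.eye(m), axis=0)                  # first-order difference operator
x_val, y_val = np.ones(n), np.ones(m)           # naive initialization
for _ in range(20):
    # x-step: with y fixed, the problem is convex in x.
    x = cp.Variable(n, nonneg=True)
    fit = cp.sum_squares(conv_matrix(y_val, n) @ x - d)
    cp.Problem(cp.Minimize(fit + alpha_sp * cp.norm1(x))).solve()
    x_val = x.value
    # y-step: with x fixed, the problem is convex in y.
    y = cp.Variable(m, nonneg=True)
    fit = cp.sum_squares(conv_matrix(x_val, m) @ y - d)
    smooth = alpha_sm * cp.sum_squares(D @ y)
    cp.Problem(cp.Minimize(fit + smooth), [cp.norm(y, "inf") <= beta]).solve()
    y_val = y.value
```

Each half-step is convex because fixing one of the two factors turns the convolution into an ordinary matrix-vector product.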
examples/dict_learning.py (50 additions, 1 deletion)
@@ -38,14 +38,55 @@ def _():
    return BiconvexProblem, cp, mo, np, plt


@app.cell(hide_code=True)
def _(mo):
    mo.md(r"""
    ## Introduction

    We consider the sparse dictionary learning problem, which aims to find a dictionary matrix $D \in \mathbf{R}^{m \times k}$ and a sparse code matrix $X \in \mathbf{R}^{k \times n}$, such that the data matrix $Y \in \mathbf{R}^{m \times n}$ is well approximated by their product $DX$, while the matrix $X$ is sparse and the matrix $D$ has bounded Frobenius norm.
    The dictionary learning problem can be formulated as the following biconvex optimization problem:

    \[
    \begin{array}{ll}
    \text{minimize} & \|D X - Y\|_F^2 + \alpha \sum_{i, j} |X_{ij}|\\
    \text{subject to} & \|D\|_F \leq \beta
    \end{array}
    \]

    with variables $D$ and $X$, where $\alpha > 0$ is the sparsity regularization parameter, and $\beta > 0$ is the bound on the Frobenius norm of the dictionary matrix.
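As with the previous example, the cell that actually builds the `BiconvexProblem` is not shown in this diff. The sketch below illustrates the same biconvex structure with a plain CVXPY alternating scheme; the dimensions, the values of `alpha` and `beta`, and the random data `Y` are assumptions made only for illustration.

```python
import cvxpy as cp
import numpy as np

# Assumed, illustrative sizes and parameters.
m, n, k = 20, 100, 10
alpha, beta = 0.1, 10.0
rng = np.random.default_rng(0)
Y = rng.standard_normal((m, n))

D_val = rng.standard_normal((m, k))
D_val *= beta / np.linalg.norm(D_val, "fro")    # feasible starting dictionary
for _ in range(15):
    # X-step: with D fixed, a convex lasso-like problem in the codes X.
    X = cp.Variable((k, n))
    cp.Problem(cp.Minimize(cp.sum_squares(D_val @ X - Y)
                           + alpha * cp.sum(cp.abs(X)))).solve()
    X_val = X.value
    # D-step: with X fixed, convex in D under the Frobenius-norm bound.
    D = cp.Variable((m, k))
    cp.Problem(cp.Minimize(cp.sum_squares(D @ X_val - Y)),
               [cp.norm(D, "fro") <= beta]).solve()
    D_val = D.value
```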
    We consider the fitting problem of a logistic input-output hidden Markov model (IO-HMM) to some dataset.
    Suppose we are given a dataset $(x(t), y(t))$, $t = 1, \ldots, m$, where each sample consists of an input feature vector $x(t) \in \mathbf{R}^n$ and an output label $y(t) \in \{0, 1\}$, generated from a $K$-state IO-HMM, according to the following procedure:
    Let $\hat{z}(t) \in \{1, \ldots, K\}$, $t = 1, \ldots, m$, be the state label of the IO-HMM with initial state distribution $p_{\rm init} \in \mathbf{R}^K$ with $\mathbf{1}^T p_{\rm init} = 1$ and transition matrix $P_{\rm tr} \in \mathbf{R}^{K \times K}$ with $P_{\rm tr} \mathbf{1} = \mathbf{1}$.
    At time step $t$, the state label $\hat{z}(t)$ is sampled according to

    \[
    \hat{z}(t) \sim \left\{
    \begin{array}{ll}
    {\rm Cat}(p_{\rm init}) & t = 0\\
    {\rm Cat}(p_{\hat{z}(t - 1)}) & t > 0,
    \end{array}\right.
    \]

    where the vector $p_{\hat{z}(t-1)} \in \mathbf{R}^K$ denotes the $\hat{z}(t-1)$th row of the matrix $P_{\rm tr}$, and ${\rm Cat}(p)$ denotes the categorical distribution with $p$ being the vector of category probabilities.
    Then, given the feature vector $x(t) \in \mathbf{R}^n$, the output $y(t) \in \{0, 1\}$ of this IO-HMM at time step $t$ is generated from a logistic model, i.e.,

    \[
    \mathop{\bf Prob}(y(t) = 1) = \frac{1}{1 + \exp(-\theta_{\hat{z}(t)}^T x(t))},
    \]

    where $\theta_{\hat{z}(t)} \in \{\theta_1, \ldots, \theta_K\} \subseteq \mathbf{R}^n$ is the coefficient.
    We are interested in recovering the transition matrix $P_{\rm tr}$, the model parameters $\theta_1, \ldots, \theta_K$, and the unobserved state labels $\hat{z}(1), \ldots, \hat{z}(m)$, given the dataset $(x(t), y(t))$, $t = 1, \ldots, m$.
    Noticing that the transition matrix $P_{\rm tr}$ can be easily estimated from the state labels $\hat{z}(t)$, $t = 1, \ldots, m$, we consider the following biconvex optimization problem for fitting the IO-HMM:

    \[
    \begin{array}{ll}
    \text{minimize} & -\sum_{t=1}^m \sum_{k=1}^K z_k(t) \log p\bigl(y(t) \mid x(t), \theta_k\bigr) + \alpha_\theta \sum_{k=1}^K \|\theta_k\|_2^2 + \alpha_z \sum_{t=2}^m D_{\rm kl}\bigl(z(t), z(t-1)\bigr)\\
    \text{subject to} & z(t) \succeq 0, \quad \mathbf{1}^T z(t) = 1, \quad t = 1, \ldots, m\\
    & \theta_k \in {\cal C}_k, \quad k = 1, \ldots, K
    \end{array}
    \]

    where the optimization variables are $\theta_k \in \mathbf{R}^n$, $k = 1, \ldots, K$, and $z(t) \in \mathbf{R}^K$, $t = 1, \ldots, m$.
    Note that the variable $z(t)$ is a soft assignment vector for the hidden state label $\hat{z}(t)$, where the $k$th entry of $z(t)$ indicates the probability of the state being $k$ at time step $t$, and $\hat{z}(t)$ can be estimated as the index of the largest entry of $z(t)$ after solving the problem above.

    Each component of this problem can be interpreted as follows:
    The first term in the objective function is the negative log-likelihood of the observed data under the IO-HMM model, given the state assignment probabilities $z(t)$, $t = 1, \ldots, m$, and the model parameters $\theta_k$, $k = 1, \ldots, K$.
    The second term is a Tikhonov regularization on the model parameters $\theta_k$, with regularization parameter $\alpha_\theta > 0$.
    The third term is a temporal smoothness regularization on the state assignment probabilities, where $D_{\rm kl}(p, q)$ denotes the Kullback-Leibler divergence between two probability distributions $p$ and $q$, and $\alpha_z > 0$ is the corresponding regularization parameter.
    The constraints on the variables $z(t)$, $t = 1, \ldots, m$, ensure that they are valid probability distributions.
    The sets ${\cal C}_k \subseteq \mathbf{R}^n$, $k = 1, \ldots, K$, are nonempty closed convex sets that encode potential prior knowledge about the model parameters $\theta_k$.
    """)
    return

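Since the exact objective used by this example is not visible in the diff, the following is only one plausible concrete reading of the components described above (weighted logistic negative log-likelihood, Tikhonov regularization on the coefficients, and a KL temporal-smoothness penalty on the assignments), solved by alternating over the two convex blocks in plain CVXPY. Every size, parameter value, and data array in it is an assumption.

```python
import cvxpy as cp
import numpy as np

# Assumed problem sizes, data, and parameters (for illustration only).
K, n, m = 2, 2, 200
alpha_theta, alpha_z = 0.1, 1.0
rng = np.random.default_rng(0)
xs = np.column_stack([rng.uniform(-5, 5, m), np.ones(m)])   # features with bias
ys = rng.integers(0, 2, m).astype(float)                    # binary labels
Ymat = np.repeat(ys[:, None], K, axis=1)                    # labels, one column per state

Z_val = np.full((m, K), 1.0 / K)        # soft state assignments, start uniform
for _ in range(10):
    # theta-step: with z fixed, a weighted logistic regression for each state.
    theta = cp.Variable((n, K))
    logits = xs @ theta                                      # (m, K)
    nll = cp.sum(cp.multiply(Z_val,
                             cp.logistic(logits) - cp.multiply(Ymat, logits)))
    cp.Problem(cp.Minimize(nll + alpha_theta * cp.sum_squares(theta))).solve()
    theta_val = theta.value
    # z-step: with theta fixed, a convex problem over the probability simplex,
    # with a KL penalty between consecutive assignments for temporal smoothness.
    logits = xs @ theta_val
    losses = np.logaddexp(0.0, logits) - Ymat * logits       # per-state logistic NLL
    Z = cp.Variable((m, K), nonneg=True)
    smooth = alpha_z * cp.sum(cp.kl_div(Z[1:], Z[:-1]))
    cp.Problem(cp.Minimize(cp.sum(cp.multiply(Z, losses)) + smooth),
               [cp.sum(Z, axis=1) == 1]).solve()
    Z_val = np.clip(Z.value, 0.0, None)                      # clip solver noise
```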
@app.cell(hide_code=True)
def _(mo):
    mo.md(r"""
    ## Generate problem data

    We consider the case of $n = 2$, and the feature vector for each sample is generated according to

    \[
    x(t) \sim ({\cal U}(-5, 5),\ 1),
    \]

    where ${\cal U}(a, b)$ denotes a uniform distribution over the interval $[a, b]$, and the second entry of $x(t)$ is always $1$ to account for the bias term.
    """)
    return

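The body of the data-generation cell is largely elided in this diff, so the snippet below is only an assumed sketch of how a dataset could be drawn from the generative model described above; the state count, transition matrix, and coefficients are invented for illustration (only `m = 1800` and `n = 2` match what the diff shows).

```python
import numpy as np

# Assumed IO-HMM parameters; only m = 1800 and n = 2 come from the diff.
rng = np.random.default_rng(0)
m, n, K = 1800, 2, 2
p_init = np.array([0.5, 0.5])                 # initial state distribution
p_tr = np.array([[0.95, 0.05],
                 [0.05, 0.95]])               # row-stochastic transition matrix
coefs = np.array([[1.0, -2.0],                # theta_1 = (slope, bias)
                  [-1.0, 2.0]])               # theta_2

xs = np.column_stack([rng.uniform(-5, 5, m), np.ones(m)])   # x(t) ~ (U(-5, 5), 1)
labels = np.empty(m, dtype=int)               # hidden state labels
ys = np.empty(m, dtype=int)                   # observed outputs y(t)
state = rng.choice(K, p=p_init)               # Cat(p_init) at the first step
for t in range(m):
    if t > 0:
        state = rng.choice(K, p=p_tr[state])  # Cat(p_{state(t-1)})
    labels[t] = state
    prob = 1.0 / (1.0 + np.exp(-xs[t] @ coefs[state]))      # logistic output model
    ys[t] = rng.binomial(1, prob)
```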
@app.cell
def _(np):
    m=1800
@@ -58,6 +128,22 @@ def _(np):
    return K, coefs, labels, m, n, p_tr, xs, ys

@app.cell(hide_code=True)
def _(mo):
    mo.md(r"""
    ## Specify and solve the problem

    To fully specify the biconvex problem, it is assumed that we are given the following prior knowledge about the coefficients:
    Suppose we are given a set of data points $x_i \in \mathbf{R}^n$, $i = 1, \ldots, m$, and we would like to cluster them into $k$ groups, using the $k$-means clustering method.
    This corresponds to the following biconvex optimization problem:

    \[
    \begin{array}{ll}
    \text{minimize} & \sum_{i=1}^m \sum_{j=1}^k (z_i)_j \|x_i - \bar{x}_j\|_2^2\\
    \text{subject to} & 0 \preceq z_i \preceq \mathbf{1},\quad \mathbf{1}^T z_i = 1,\quad i = 1, \ldots, m
    \end{array}
    \]

    with variables $\bar{x}_i \in \mathbf{R}^n$, $i = 1, \ldots, k$, and $z_i \in \mathbf{R}^k$, $i = 1, \ldots, m$.

    We can interpret the problem formulation as follows:
    The variables $\bar{x}_1, \ldots, \bar{x}_k$ represent the cluster centroids, and each variable $z_i$ is a soft assignment vector for data point $x_i$, where the $j$th entry of $z_i$ indicates the probability of the sample $x_i$ belonging to cluster $j$.
    Then, the objective function represents the total within-cluster variance, which we would like to minimize.
    """)
    return

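For reference, here is a minimal sketch of solving this soft $k$-means formulation by alternating convex minimization with plain CVXPY rather than the notebook's `BiconvexProblem` interface; the data come from `make_blobs` as in the cells below, but the sizes, seed, and iteration count are assumptions.

```python
import cvxpy as cp
import numpy as np
from sklearn.datasets import make_blobs

# Assumed data sizes and seed, for illustration only.
n, k, m = 2, 3, 150
xs, _ = make_blobs(n_samples=m, n_features=n, centers=k, random_state=0)

rng = np.random.default_rng(0)
xbars_val = xs[rng.choice(m, size=k, replace=False)]   # random initial centroids
for _ in range(10):
    # z-step: with centroids fixed the objective is linear in the assignments,
    # so this convex subproblem puts each point's mass on its nearest centroid.
    dists = ((xs[:, None, :] - xbars_val[None, :, :]) ** 2).sum(axis=2)  # (m, k)
    Z = cp.Variable((m, k), nonneg=True)
    cp.Problem(cp.Minimize(cp.sum(cp.multiply(Z, dists))),
               [cp.sum(Z, axis=1) == 1]).solve()
    Z_val = Z.value
    # xbar-step: with assignments fixed the convex centroid subproblem is a
    # weighted least-squares problem whose minimizer is the weighted mean.
    weights = np.maximum(Z_val.sum(axis=0), 1e-9)
    xbars_val = (Z_val.T @ xs) / weights[:, None]
```

With the centroids fixed the assignment step is a linear program over the simplex, and with the assignments fixed the centroid step has the weighted-mean closed form used above.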
@app.cell(hide_code=True)
def _(mo):
    mo.md(r"""
    ## Generate problem data
    """)
    return

@app.cell
def _(make_blobs):
    n=2
@@ -49,6 +81,14 @@ def _(make_blobs):
    return k, m, n, xs

@app.cell(hide_code=True)
def _(mo):
    mo.md(r"""
    ## Specify and solve the problem
    """)
    return

@app.cell
def _(BiconvexProblem, cp, k, m, n, xs):
    xbars=cp.Variable((k, n))
@@ -62,6 +102,14 @@ def _(BiconvexProblem, cp, k, m, n, xs):