-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathmain.tex
More file actions
112 lines (101 loc) · 7.98 KB
/
main.tex
File metadata and controls
112 lines (101 loc) · 7.98 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{physics}
\usepackage{amsfonts, amsmath, amssymb}
\usepackage{indentfirst}
\usepackage{todonotes}
\title{Singular Value Decomposition and Applications}
\author{}
\date{August 2022}
\begin{document}
\maketitle
\section{Introduction}
\section{Matrices, vectors and products}
We begin by introducing some notation and interpretations of the multiplication operations among matrices with matrices and matrices with and vectors. These will be essential for interpreting Singular Value Decomposition intuitively.
Throughout this text, vectors $\vb{a}\in\mathbb{R}^m$ are represented by $m\times 1$ column matrices $\mqty[a^1 & a^2 & \dots & a^m]^{\intercal}\in\mathbb{R}^{m \times 1} $. Vectors (and matrices) are bold-typed while its components (entries) are not.
First we highlight two complementary ways to see matrices and their products. We can interpret a matrix $\vb{A}\in \mathbb{R}^{m\times n}$ as an array of $n$ column vectors $\vb{a}_i\in\mathbb{R}^{m\times 1}$, with $i=1,\dots,n$:
\begin{equation}
\vb{A}=\left[\begin{array}{cccc}
\mid & \mid & & \mid \\
\vb{a}_{1} & \vb{a}_{2} & \cdots & \vb{a}_{n} \\
\mid & \mid & & \mid
\end{array}\right],
\end{equation}
or as a column array of $m$ transpose vectors $\boldsymbol{\alpha}_i^\intercal=\mqty[\alpha_i^1, & \dots & \alpha_i^n]\in\mathbb{R}^{1\times n}$, $i=1,\dots, m$
\begin{equation}
\vb{A} = \mqty[\boldsymbol{\alpha}_1^\intercal \\ \boldsymbol{\alpha}_2^\intercal \\ \vdots \\ \boldsymbol{\alpha}_m^\intercal].
\end{equation}
\todo[inline]{find out how to insert horizontal bars to represent the complete rows}
The transpose vectors $\boldsymbol{\alpha}_i^\intercal$ are the $i$th row of the matrix $\vb{A}$.
\todo[inline]{add a simple ecxample displaying the $\vb{a}$ and $\boldsymbol{\alpha}$ vectors}
The product of a matrix with a vector, say $\vb{Ax}$, can be seen as the $m\times 1$ matrix whose $i$th row is the scalar product of the $\boldsymbol{\alpha_i}$ vector with $\vb{x}$ vector, \textit{i.e.}
\begin{equation}
\vb{Ax} = \mqty[\boldsymbol{\alpha}_1^\intercal \\ \boldsymbol{\alpha}_2^\intercal \\ \vdots \\ \boldsymbol{\alpha}_m^\intercal] \vb{ x} = \mqty[\boldsymbol{\alpha}_1^\intercal \vb{x}\\ \boldsymbol{\alpha}_2^\intercal \vb{x}\\ \vdots \\ \boldsymbol{\alpha}_m^\intercal\vb{x}].
\end{equation}
Alternatively, we can think of this product as a linear combination of the column vectors of $\vb{A}$, with the coefficients being the components $x^i$ of the vector $\vb{x}$:
\begin{equation}
\vb{A}=\left[\begin{array}{cccc}
\mid & \mid & & \mid \\
\vb{a}_{1} & \vb{a}_{2} & \cdots & \vb{a}_{n} \\
\mid & \mid & & \mid
\end{array}\right]\mqty[x^1 \\ x^2 \\ \vdots \\ x^n] = \sum_i \vb{a}_i x^i
\end{equation}
\todo[inline]{add a simple example}
As for the products of matrices $\vb{A}\in\mathbb{R}^{m\times n}$ and $\vb{B}\in\mathbb{R}^{n\times k}$, we can see it as
\begin{equation}
\begin{aligned}
\vb{AB} & = \mqty[\boldsymbol{\alpha}_1^\intercal \\ \boldsymbol{\alpha}_2^\intercal \\ \vdots \\ \boldsymbol{\alpha}_m^\intercal] \mqty[\vb{b}_1 & \vb{b}_2 & \dots & \vb{b}_k] \\
& = \mqty[\boldsymbol{\alpha}_1^\intercal\vb{b}_1 & \boldsymbol{\alpha}_1^\intercal\vb{b}_2 & \dots & \boldsymbol{\alpha}_1^\intercal\vb{b}_k\\
\boldsymbol{\alpha}_2^\intercal\vb{b}_1 & \boldsymbol{\alpha}_2^\intercal\vb{b}_2 & \dots & \boldsymbol{\alpha}_2^\intercal\vb{b}_k\\
\vdots & \vdots & \ddots & \vdots\\
\boldsymbol{\alpha}_m^\intercal\vb{b}_m & \boldsymbol{\alpha}_1^\intercal\vb{b}_2 & \dots & \boldsymbol{\alpha}_m^\intercal\vb{b}_k]
\end{aligned}\
\end{equation}
In this form it's clear why $\vb{A}^\intercal \vb{A}$, which equals
\begin{equation}
\begin{aligned}
\vb{A}^\intercal\vb{A} & = \mqty[\vb{a}_1^\intercal\vb{a}_1 & \vb{a}_1^\intercal\vb{a}_2 & \dots & \vb{a}_1^\intercal\vb{a}_k\\
\vb{a}_2^\intercal\vb{a}_1 & \vb{a}_2^\intercal\vb{a}_2 & \dots & \vb{a}_2^\intercal\vb{a}_k\\
\vdots & \vdots & \ddots & \vdots\\
\vb{a}_m^\intercal\vb{a}_m & \vb{a}_1^\intercal\vb{a}_2 & \dots & \vb{a}_m^\intercal\vb{a}_k],
\end{aligned}\
\end{equation}
is called the \textit{self-covariance matrix}. The diagonal entries are $\norm{\vb{a}_i}^2$, while the off-diagonal ones are the projections of the $i$th and $j$th column vectors of $\vb{A}$. In a matrix where the columns are highly correlated, $\vb{A}^\intercal\vb{A}$ is a dense matrix, whereas diagonal terms dominate if the columns are less correlated..
We can also see the product of matrices $\vb{AB}$ in the list of columns $\times$ column rows format.
\todo[inline]{add a simple example first}
\begin{equation}
\vb{AB} = \mqty[\vb{a}_1 & \vb{a}_2 & \dots & \vb{a}_m ] \mqty[\boldsymbol{\beta}_1^\intercal\\ \boldsymbol{\beta}_2^\intercal\\ \vdots \\ \boldsymbol{\beta}_n^\intercal\\] = \vb{a}_1\boldsymbol{\beta}_1^\intercal + \vb{a}_2\boldsymbol{\beta}_2^\intercal+\dots+ \vb{a}_n\boldsymbol{\beta}_n^\intercal
\end{equation}
where we see that the product $\vb{AB}$ is the sum of $n$ matrices of rank 1.
\section{Unitary Matrices}
Besides the interpretations presented above, we also need to understand unitary (or othogonal) matrices to interpret SVD. Unitary matrices are special matrices $\vb{U}=\mqty[\vb{u}_1 & \dots & \vb{u}_n]\in\mathbb{R}^{n^2}$ in which the column vectors $\vb{u}_i\in\mathbb{R}^n$ are all orthonormal, \textit{i.e.} $\vb{u}_i^\intercal \vb{u}_j = \delta_{ij}$. This allows us to employ them as basis vectors, and expand any $\vb{x}\in\mathbb{R}^n$ as $$\vb{x}= \sum_{i=1} x^i \vb{u}_i$$
Unitary matrices are quite easy to invert. Their inverse $\vb{U}^{-1}$ is simply their transpose $\vb{U}^{\intercal}$. We can prove this claim by checking $U^\intercal U = 1$ in the column array of row vectors $\times$ array of column vectors interpretation
\begin{equation}
\vb{U}^{\intercal}\vb{U} = \mqty[\vb{u}_1^\intercal \\ \vb{u}_2^\intercal \\ \vdots \\\vb{u}_n^\intercal]\mqty[\vb{u}_1 & \vb{u}_1 & \dots & \vb{u}_n] =
\mqty[\vb{u}_1^\intercal\vb{u}_1 & \vb{u}_1^\intercal\vb{u}_2 & \dots & \vb{u}_1^\intercal\vb{u}_n\\
\vb{u}_2^\intercal\vb{u}_1 & \vb{u}_2^\intercal\vb{u}_2 & \dots & \vb{u}_2^\intercal\vb{u}_n\\
\vdots & \vdots & \ddots & \vdots\\
\vb{u}_n^\intercal\vb{u}_1 & \vb{u}_1^\intercal\vb{u}_2 & \dots & \vb{u}_n^\intercal\vb{u}_n],
\end{equation}
or, a little more subtly, we can prove $\vb{U}\vb{U}^\intercal=1$ in the array of columns $\times$ column array of rows interpretation
\begin{equation}
\vb{U}\vb{U}^\intercal = \vb{u}_1\vb{u}_1^\intercal + \vb{u}_2\vb{u}_2^\intercal + \dots + \vb{u}_n\vb{u}_n^\intercal
\end{equation}
Recalling that $\vb{x}$ can be expanded in terms of the columns of $\vb{U}$, we have
\begin{equation}
\begin{aligned}
\vb{U}\vb{U}^\intercal \vb{x} &= \vb{u}_1\vb{u}_1^\intercal\vb{x} + \vb{u}_2\vb{u}_2^\intercal\vb{x} + \dots + \vb{u}_n\vb{u}_n^\intercal\vb{x}\\
& = \vb{u}_1 x^1 + \vb{u}_2 x^2 + \dots + \vb{u}_m x^n\\
& = \vb{x}
\end{aligned}
\end{equation}
since $\vb{x}$ is arbitrary, we conclude $\vb{U}\vb{U}^\intercal=1$.
Unitary matrices have this name because they are analogous to unitary complex numbers in some sense. Unitary complex numbers have modulus $1$ and thus preserve the modulus of any complex number upon multiplication. Similarly, unitary matrices preserve the norm, or more generally, the inner product, of any vector. To see this, let $\vb{y}_1 = \vb{U}^\intercal \vb{x}_1$ and $\vb{y}_2 = \vb{U}^\intercal \vb{x}_1$. It's easy to see that $\vb{y}_2^\intercal \vb{y}_1 = \vb{x}_2^\intercal \vb{U} \vb{U}^\intercal \vb{x}_1 = \vb{x}_2^\intercal\vb{x}_1$.
\section{Singular Value Decomposition}[
Now we turn to our main goal: to make sens of the Singular Values Decomposition. Simply stated, the decomposition consists of expressing \textbf{any} matrix as the product of insightful matrices
\begin{equation}
\vb{A}=\vb{U}\boldsymbol{\Sigma}\vb{V}^\intercal
\end{equation}
\section{Applications}
\subsection{Solving linear systems}
\end{document}