Commit 62e67f2

Update exp-svd.md
1 parent b971a95 commit 62e67f2

1 file changed

+2
-2
lines changed

morea/nlp/exp-svd.md

Lines changed: 2 additions & 2 deletions
@@ -17,9 +17,9 @@ morea_labels:

The singular value decomposition is a neat way to understand what is happening with matrix operations, and one good direction to interpret LLMs under the hood. We begin with a motivating example: the movie ratings data. The below is a snippet of preferences (from 1-5 where 5 is best) of 26 movies rated by 61 students (anonymized):

-<img src="./movieprefs.png" alt="Movie ratings" width="500px"/>
+<img src="./movieprefs.png" alt="Movie ratings" width="900px"/>

-Let \\( R\\) be the \\(n\times p\\) matrix of these movie ratings, and let \\(X\\) be the centered version of \\(R\\) (namely, the mean of all the rows of \\(R\\) is subtracted out from each row of \\(R\\)). In this specific example, \\(n=61\\) and \\(p=26\\). From what we learned about PCA, the directions of maximum variance of the rows are captured by the eigenvectors of \\(X^TX\\). Let \\(V\\) be the matrix of eigenvectors of \\(X^TX\\), i.e. \\(V\\) is \\(p\times p\\) whose columns are the eigenvectors of \\(X^TX\\). Following convention, let the eigenvectors be arranged as the columns of \\(V\\), \\(\v_1\upto \v_n\\), with associated eigenvalues \\(\lambda_1\upto \lambda_p\\) satisfying \\(\lambda_1\ge \lambda_2\ge\cdots\ge \lambda_p\)). Therefore, we have for \\(i=1\upto p\\) that
+Let \\( R\\) be the \\(n\times p\\) matrix of these movie ratings, and let \\(X\\) be the centered version of \\(R\\) (namely, the mean of all the rows of \\(R\\) is subtracted out from each row of \\(R\\)). In this specific example, \\(n=61\\) and \\(p=26\\). From what we learned about PCA, the directions of maximum variance of the rows are captured by the eigenvectors of \\(X^TX\\). Let \\(V\\) be the matrix of eigenvectors of \\(X^TX\\), i.e. \\(V\\) is \\(p\times p\\) whose columns are the eigenvectors of \\(X^TX\\). Following convention, let the eigenvectors be arranged as the columns of \\(V\\), \\(\v_1\upto \v_n\\), with associated eigenvalues \\(\lambda_1\upto \lambda_p\\) satisfying \\(\lambda_1\ge \lambda_2\ge\cdots\ge \lambda_p\\). Therefore, we have for \\(i=1\upto p\\) that

$$ X^TX \v_i = \lambda_i \v_i. \tag*{(1)}$$
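
As a rough illustration of the centering step and of equation (1), here is a minimal sketch in plain Python. The 4-by-3 ratings matrix `R` is hypothetical toy data (not the 61-by-26 course data), and the helper names `matvec` and `top_eigenpair` are made up for this example. Power iteration is used only to keep the sketch dependency-free; it recovers just the leading pair \\((\lambda_1, \v_1)\\), and in practice a routine such as `numpy.linalg.eigh` would return all \\(p\\) pairs at once.

```python
import math

# Hypothetical toy ratings: 4 students (rows) x 3 movies (columns).
R = [
    [5.0, 4.0, 1.0],
    [4.0, 5.0, 2.0],
    [1.0, 2.0, 5.0],
    [2.0, 1.0, 4.0],
]
n, p = len(R), len(R[0])

# Center R: subtract the mean row (the per-movie average) from every row.
mean_row = [sum(R[i][j] for i in range(n)) / n for j in range(p)]
X = [[R[i][j] - mean_row[j] for j in range(p)] for i in range(n)]

# Form the p x p matrix X^T X.
XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
       for a in range(p)]


def matvec(A, v):
    """Multiply the square matrix A (list of rows) by the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def top_eigenpair(A, iters=500):
    """Power iteration: approximate the largest eigenvalue and its unit eigenvector."""
    v = [1.0] + [0.0] * (len(A) - 1)
    for _ in range(iters):
        w = matvec(A, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(x * y for x, y in zip(v, matvec(A, v)))  # Rayleigh quotient
    return lam, v


lam1, v1 = top_eigenpair(XtX)

# Check equation (1) for i = 1: the entries of X^T X v_1 - lambda_1 v_1 should be ~0.
residual = max(abs(lhs - lam1 * rhs) for lhs, rhs in zip(matvec(XtX, v1), v1))
print("lambda_1 =", lam1)
print("max |X^T X v_1 - lambda_1 v_1| =", residual)
```

The residual printed at the end is a direct numerical check of equation (1) for the leading eigenvector; it should be close to zero up to floating-point error.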
