Singular Value Decomposition
Last updated
Last updated
Using our movie recommendation as an example, we are given a rating matrix and we wish to perform a singular value decomposition on such matrix, such that
The sigma matrix is said to be our diagonal singular matrix, with singular values filling up its diagonal sorted in decreasing order. The top left corner singular value has the highest value and it descendes as we move toward the bottom right. The U matrix and M matrix represent the latent information for each of our users and movies. However, before we dive too deep into the details, let's do a refresher on singular value decomposition.
Using conventional definition, given a matrix A, we want to decompose it into three different matrices, , and . We need to first construct a symmetric matrix of using .
Then we find the eigenvalues and eigenvectors of this symmetric matrix of
We will use the square root of its eigenvalues to construct a singular matrix
We can now use the eigenvectors as columns of our V matrix. In this case, numpy has already done it for us.
SVD is now complete, we can easily verify it by performing the following:
There are several properties of SVD we should know about.
Going back to the movie example, imagine that we have 4 movies
Toy Story
Finding Nemo
Braveheart
Last Samurai
And we have 4 users
Alice
Bob
Calvin
Debby
We have the following rating matrix from the submitted ratings by each user. And notice that half of the users likes Pixar animated films a lot while the other half tends to have strong preference toward historical films.
Notice that the user matrix, we can clearly see that user 1 and user 4 are similar to each other in interest and user 2 and 3 are also similar to each other. This is telling us that Alice and Debby have similar taste in movies and Calvin and Bob share interest in historical drama.
What we did up there is effectively a low rank apprixomation. The original singular matrix had a rank 4 but we learned that most of the singular values are not really important. We dropped the extra 2 ranks and we can still produce a matrix that similar to the original one.
For example
Look! It looks just like the original!!!
Finally we can solve for U using , note that is the inverse of .
It is always possible to decompose a real valued matrix into
The matrices , , and are unique
The matrices and are column orthogonal i.e. and
The entries are positive and sorted in descending order