A mathematically grounded exploration of latent factor modeling for personalized movie recommendations.
Recommender systems play a crucial role in mitigating information overload by inferring user preferences from historical interactions.
This project develops a Movie Recommendation Engine using Matrix Factorization, specifically Truncated Singular Value Decomposition (SVD), to uncover hidden structures in the user–item rating matrix.
By decomposing the rating matrix into orthogonal user and item feature spaces, we extract latent dimensions representing taste patterns such as genre affinity and stylistic preferences.
The system balances expressiveness and generalization by retaining only the top-k singular values, yielding a low-rank approximation that preserves the majority of spectral energy.
We minimize Root Mean Squared Error (RMSE) across observed ratings and compare:
- 📉 Closed-form SVD
- ⚙️ Iterative Optimization (Gradient Descent, Alternating Least Squares)
✨ This work emphasizes both mathematical rigor (orthogonality, rank reduction, spectral energy) and practical considerations in evaluation and deployment.
Given a sparse user–movie rating matrix, the goal is to predict unobserved ratings to enable personalized recommendations.
| Challenge | Description |
|---|---|
| Sparsity | Most users rate only a few movies. |
| Scalability | The dataset can involve thousands of users and items. |
| Overfitting | High-rank reconstructions memorize noise instead of learning meaningful patterns. |
Learn latent representations capturing intrinsic taste dimensions through linear algebraic factorization.
Let the user–item rating matrix be:

$$ R \in \mathbb{R}^{m \times n} $$

where m is the number of users, n is the number of movies, and missing entries are unobserved.
The full SVD factorizes the rating matrix as:

$$ R = U \Sigma V^{\top} $$

where
- U ∈ ℝᵐˣᵐ: Orthonormal user singular vectors
- Σ ∈ ℝᵐˣⁿ: Diagonal matrix of singular values σ₁ ≥ σ₂ ≥ … ≥ σᵣ
- V ∈ ℝⁿˣⁿ: Orthonormal item singular vectors
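As a concrete sketch of these definitions, the full SVD of a small dense rating matrix can be computed with NumPy; the matrix values below are illustrative only:

```python
import numpy as np

# Toy dense rating matrix: 4 users x 3 movies (illustrative values)
R = np.array([
    [5.0, 3.0, 1.0],
    [4.0, 3.0, 1.0],
    [1.0, 1.0, 5.0],
    [1.0, 2.0, 4.0],
])

# Full SVD: R = U @ Sigma @ Vt, singular values returned in descending order
U, s, Vt = np.linalg.svd(R, full_matrices=True)

# Orthogonality: U^T U = I and V^T V = I
assert np.allclose(U.T @ U, np.eye(U.shape[1]))
assert np.allclose(Vt @ Vt.T, np.eye(Vt.shape[0]))

# Rebuild the m x n diagonal Sigma and verify the factorization
Sigma = np.zeros_like(R)
np.fill_diagonal(Sigma, s)
assert np.allclose(U @ Sigma @ Vt, R)
```

Note that `np.linalg.svd` returns Vᵀ (here `Vt`), not V, and the singular values as a 1-D array rather than a diagonal matrix.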
To reduce dimensionality, we keep only the top k singular values:

$$ R_k = U_k \Sigma_k V_k^{\top} $$

By the Eckart–Young theorem, this rank-k approximation minimizes the Frobenius norm error among all matrices of rank at most k:

$$ \| R - R_k \|_F = \min_{\operatorname{rank}(B) \le k} \| R - B \|_F $$
Energy retained after truncation is given by:

$$ \mathrm{Energy}(k) = \frac{\sum_{i=1}^{k} \sigma_i^2}{\sum_{i=1}^{r} \sigma_i^2} $$

This quantifies how much of the total variance (spectral energy) is preserved in the top-k components.
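A minimal sketch of truncation and energy retention, assuming NumPy; the rating values and the choice k = 2 are illustrative:

```python
import numpy as np

# Toy 4x3 rating matrix (illustrative values)
R = np.array([
    [5.0, 3.0, 1.0],
    [4.0, 3.0, 1.0],
    [1.0, 1.0, 5.0],
    [1.0, 2.0, 4.0],
])

U, s, Vt = np.linalg.svd(R, full_matrices=False)

k = 2  # number of retained singular values (illustrative choice)
R_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Fraction of spectral energy preserved by the top-k components
energy = np.sum(s[:k] ** 2) / np.sum(s ** 2)

# Frobenius error of the rank-k approximation equals the discarded energy:
# ||R - R_k||_F^2 = sum of the squared dropped singular values
frob_err = np.linalg.norm(R - R_k, ord="fro")
assert np.isclose(frob_err ** 2, np.sum(s[k:] ** 2))
```

Plotting `energy` as a function of k is a common way to pick the truncation rank.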
Latent user and item features are constructed by splitting the singular values between the two factors:

$$ P = U_k \Sigma_k^{1/2}, \qquad Q = V_k \Sigma_k^{1/2} $$

The predicted rating for user *u* and item *i* is:

$$ \hat{r}_{u,i} = p_u \cdot q_i $$
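A sketch of building the latent factor matrices by splitting Σₖ symmetrically and predicting a single rating; the data and the helper `predict` are illustrative, not a fixed API:

```python
import numpy as np

# Toy 4x3 rating matrix (illustrative values)
R = np.array([
    [5.0, 3.0, 1.0],
    [4.0, 3.0, 1.0],
    [1.0, 1.0, 5.0],
    [1.0, 2.0, 4.0],
])

U, s, Vt = np.linalg.svd(R, full_matrices=False)
k = 2

# Symmetric split: P = U_k sqrt(Sigma_k), Q = V_k sqrt(Sigma_k)
P = U[:, :k] * np.sqrt(s[:k])     # user factors, shape (m, k)
Q = Vt[:k, :].T * np.sqrt(s[:k])  # item factors, shape (n, k)

def predict(u, i):
    """Predicted rating: r_hat_{u,i} = p_u . q_i."""
    return float(P[u] @ Q[i])

# P @ Q.T reproduces the rank-k approximation U_k Sigma_k V_k^T
assert np.allclose(P @ Q.T, U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :])
```

Other splits (e.g. P = UₖΣₖ, Q = Vₖ) give the same predictions, since only the product PQᵀ matters.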
Instead of direct SVD, latent vectors can be learned by minimizing the reconstruction loss:
$$ L = \sum_{(u,i)\in\Omega} (R_{u,i} - p_u \cdot q_i)^2 + \lambda \left( \|p_u\|^2 + \|q_i\|^2 \right) $$
where:
- Ω = set of observed user–item pairs
- λ = regularization parameter
with error term:

$$ e_{u,i} = R_{u,i} - p_u \cdot q_i $$

and stochastic gradient descent updates (η is the learning rate):

$$ p_u \leftarrow p_u + \eta \, (e_{u,i} \, q_i - \lambda \, p_u), \qquad q_i \leftarrow q_i + \eta \, (e_{u,i} \, p_u - \lambda \, q_i) $$
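The gradient descent variant can be sketched as a plain SGD loop over the observed pairs; the toy triples, hyperparameters (η = 0.05, λ = 0.01), and epoch count are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

m, n, k = 4, 3, 2
# Observed (user, item, rating) triples -- illustrative data
observed = [(0, 0, 5.0), (0, 2, 1.0), (1, 0, 4.0), (2, 2, 5.0), (3, 1, 2.0)]

P = 0.1 * rng.standard_normal((m, k))  # user factors, small random init
Q = 0.1 * rng.standard_normal((n, k))  # item factors

eta, lam = 0.05, 0.01                  # learning rate, regularization
for epoch in range(200):
    for u, i, r in observed:
        e = r - P[u] @ Q[i]            # error term e_{u,i}
        P[u] += eta * (e * Q[i] - lam * P[u])
        Q[i] += eta * (e * P[u] - lam * Q[i])

# Squared reconstruction loss on the observed entries
loss = sum((r - P[u] @ Q[i]) ** 2 for u, i, r in observed)
```

Only entries in Ω contribute updates, so the unobserved cells never pull the factors toward zero ratings.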
For a fixed item matrix Q, optimize the user factors P via ridge regression, and vice versa:
- Fix Q, solve for P
- Fix P, solve for Q
Repeat until convergence.
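The alternating steps above can be sketched as per-row ridge solves; the toy matrix, the 0-means-unobserved mask convention, λ = 0.1, and the fixed iteration count are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k, lam = 4, 3, 2, 0.1

# Toy ratings; 0.0 marks an unobserved entry (illustrative convention)
R = np.array([
    [5.0, 0.0, 1.0],
    [4.0, 3.0, 0.0],
    [0.0, 1.0, 5.0],
    [1.0, 0.0, 4.0],
])
mask = R > 0

P = rng.standard_normal((m, k))  # user factors
Q = rng.standard_normal((n, k))  # item factors

for _ in range(20):
    # Fix Q: one ridge regression per user over that user's observed items
    for u in range(m):
        idx = mask[u]
        A = Q[idx].T @ Q[idx] + lam * np.eye(k)
        b = Q[idx].T @ R[u, idx]
        P[u] = np.linalg.solve(A, b)
    # Fix P: one ridge regression per item over that item's observed users
    for i in range(n):
        idx = mask[:, i]
        A = P[idx].T @ P[idx] + lam * np.eye(k)
        b = P[idx].T @ R[idx, i]
        Q[i] = np.linalg.solve(A, b)

# RMSE over the observed entries after alternating updates
rmse_train = float(np.sqrt(np.mean((R - P @ Q.T)[mask] ** 2)))
```

Each inner solve is a k×k linear system, so ALS parallelizes naturally across users and across items.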
Model performance is measured using Root Mean Squared Error (RMSE) over held-out observed ratings:

$$ \mathrm{RMSE} = \sqrt{ \frac{1}{|\Omega_{\text{test}}|} \sum_{(u,i)\in\Omega_{\text{test}}} \left( R_{u,i} - \hat{r}_{u,i} \right)^2 } $$
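A small helper for this metric; the function name and the example ratings are illustrative:

```python
import numpy as np

def rmse(r_true, r_pred):
    """Root Mean Squared Error over a set of observed ratings."""
    r_true = np.asarray(r_true, dtype=float)
    r_pred = np.asarray(r_pred, dtype=float)
    return float(np.sqrt(np.mean((r_true - r_pred) ** 2)))

# Example: true vs. predicted ratings on three held-out pairs
print(rmse([4.0, 3.0, 5.0], [3.5, 3.0, 4.0]))  # ~0.6455
```

Because RMSE squares each residual, it penalizes a few large rating errors more heavily than many small ones.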
| 💡 Concept | 🧮 Description |
|---|---|
| Orthogonality | Ensures that latent features (columns of U and V) are orthogonal, meaning UᵀU = I and VᵀV = I. This guarantees that each latent dimension captures unique, uncorrelated information. |
| Rank Reduction | By using a truncated rank-k approximation Rₖ = UₖΣₖVₖᵀ, the model retains only the most significant singular values—reducing noise and preventing overfitting. |
| Spectral Energy Concentration | Measures how much of the total variance (energy) is captured by the top-k components: $$ Energy(k) = \frac{\sum_{i=1}^{k} \sigma_i^2}{\sum_{i=1}^{r} \sigma_i^2} $$ Higher energy retention indicates stronger representation of dominant behavioral patterns. |