## Recommender system

Based on a toy example from this [blog post](https://hackernoon.com/introduction-to-recommender-system-part-1-collaborative-filtering-singular-value-decomposition-44c9659c5e75).

We have a database of movies and user ratings, but since most users watch and rate only a small subset of all possible movies, most entries are missing. Our job is to predict which other movies a user might like, based on the movies that the user has already rated.

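Recall that the truncated SVD gives the optimal (in terms of the Frobenius norm) low-rank reconstruction of a matrix. This also holds for sparse matrices, with one caveat: entries that are not stored are treated as zeros, so the reconstruction is optimal for the zero-filled matrix. We use the reconstructed values at the missing positions as predictions of user movie preferences.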
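
As a quick check of the optimality claim, here is a minimal sketch on an arbitrary dense matrix (the random matrix `A` and the choice `k = 2` are illustrative assumptions, not part of the exercise). For the rank-$k$ truncation, the Frobenius error should equal the square root of the sum of the squared discarded singular values.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(6, 5))   # arbitrary dense example matrix (assumption)

# Thin SVD; NumPy returns the singular values in descending order
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # rank-k truncation

err = np.linalg.norm(A - A_k, ord="fro")      # reconstruction error
tail = np.sqrt(np.sum(s[k:] ** 2))            # energy in the discarded singular values
print(err, tail)
```

The two printed numbers should agree up to floating-point error.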

Note: Real-world recommender systems based on SVD compute an approximate SVD using iterative methods for computational efficiency, but the idea is the same: we assume that the data can be modeled by $k$ latent factors, then reconstruct the rank-$k$ matrix. In a real use case you would also normalize the data first.
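
One common form of normalization is to subtract each user's mean rating before factoring and add it back afterwards, so that missing entries sit at the user's average rather than at zero. The sketch below assumes this per-user mean-centering and uses a dense SVD with `k = 2`; both are illustrative choices, not necessarily what a production system would do.

```python
import numpy as np

# Same toy ratings as in this notebook, with NaN marking missing entries
R = np.array([
    [2, np.nan, 2, 4, 5, np.nan],
    [5, np.nan, 4, np.nan, np.nan, 1],
    [np.nan, np.nan, 5, np.nan, 2, np.nan],
    [np.nan, 1, np.nan, 5, np.nan, 4],
    [np.nan, np.nan, 4, np.nan, np.nan, 2],
    [4, 5, np.nan, 1, np.nan, np.nan],
])

observed = ~np.isnan(R)
user_mean = np.nanmean(R, axis=1, keepdims=True)    # mean over observed ratings only

# Center observed ratings; missing entries become 0, i.e. "average" for that user
centered = np.where(observed, R - user_mean, 0.0)

U, s, Vt = np.linalg.svd(centered, full_matrices=False)
k = 2                                               # number of latent factors (assumption)
pred = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :] + user_mean   # add the means back
print(np.round(pred, 2))
```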

In [None]:
from collections import OrderedDict
import pandas as pd
import numpy as np

In [None]:
ratings = pd.DataFrame([
    [2,None,2,4,5,None],
    [5,None,4,None,None,1],
    [None,None,5,None,2,None],
    [None,1,None,5,None,4],
    [None,None,4,None,None,2],
    [4,5,None,1,None,None]],
    index=list('ABCDEF'),
    columns=['The Avengers', 'Sherlock', 'Transformers',
             'Matrix', 'Titanic', 'Me Before You']
)

# Store the ratings as sparse columns, with NaN as the fill value for missing ratings
ratings = ratings.astype(pd.SparseDtype("float", np.nan))
ratings

**Implement and explain the following steps**

In [None]:
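from scipy.sparse.linalg import svds

# Convert to a SciPy COO matrix; only the observed ratings are stored,
# so the missing ones are implicitly treated as zeros
X = ratings.sparse.to_coo()
print(X)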

In [None]:
# svds requires k < min(X.shape), so this asks for as many singular triplets as allowed
U, s, Vt = svds(X, k=min(ratings.shape)-1)
s

In [None]:
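# svds typically returns the singular values in ascending order;
# sort explicitly into descending order rather than relying on that
perm = np.argsort(s)[::-1]
U = U[:, perm]
s = s[perm]
Vt = Vt[perm, :]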

In [None]:
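# Keep the k largest singular triplets and rebuild the rank-k approximation;
# Y holds a predicted rating for every (user, movie) pair
k = 3
Y = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
Y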

In [None]:
# Compare the observed ratings of one user with the rank-k predictions
user = 'E'
pd.DataFrame(dict(
    Observed = ratings.loc[user].sparse.to_dense(),
    Predicted = Y[ratings.index.get_loc(user)]))
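
To turn the predictions into recommendations, one option is to rank the movies a user has not rated by predicted score. The sketch below is self-contained and rebuilds the rank-3 reconstruction with NumPy's dense SVD on the zero-filled matrix as a stand-in for the `svds` steps above; the helper `recommend` and its parameter `n` are hypothetical names introduced for illustration.

```python
import numpy as np
import pandas as pd

ratings = pd.DataFrame([
    [2, np.nan, 2, 4, 5, np.nan],
    [5, np.nan, 4, np.nan, np.nan, 1],
    [np.nan, np.nan, 5, np.nan, 2, np.nan],
    [np.nan, 1, np.nan, 5, np.nan, 4],
    [np.nan, np.nan, 4, np.nan, np.nan, 2],
    [4, 5, np.nan, 1, np.nan, np.nan]],
    index=list('ABCDEF'),
    columns=['The Avengers', 'Sherlock', 'Transformers',
             'Matrix', 'Titanic', 'Me Before You']
)

# Fill missing ratings with 0, matching how a sparse matrix treats unstored entries
U, s, Vt = np.linalg.svd(ratings.fillna(0).to_numpy(), full_matrices=False)
k = 3
Y = pd.DataFrame(U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :],
                 index=ratings.index, columns=ratings.columns)

def recommend(user, n=2):
    """Return the n unrated movies with the highest predicted rating (hypothetical helper)."""
    unrated = ratings.loc[user].isna()          # boolean mask over movies
    return Y.loc[user][unrated].sort_values(ascending=False).head(n)

top = recommend('E')
print(top)
```

Movies the user has already rated are excluded, so the result contains only new suggestions.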