Let us first create the matrix \(M\). We will create a matrix of low rank and then sample from it to create \(M\).
import numpy as np
import numpy.linalg as ln
import math
m = 100 # users
n = 1000 # items
d = 2
X = np.random.randn(m,d) - 0.5
Y = np.random.randn(d,n) - 0.5
M_full = np.dot(X,Y)
print ln.matrix_rank(M_full)
We will sample values from \(M_{full}\) to create a dummy utility matrix \(M\). Let us assume that each user has rated a maximum of 20 items.
M = np.zeros(M_full.shape)
deg = 20
for i in range(m):
# randomly choose how many items this user has rated
q = np.random.randint(1,deg);
# now generate the samples
s = np.random.randint(0,n,q)
M[i,s] = M_full[i,s]
#print M
sum(M[10,:]**2)
Now, we can proceed with the UV decomposition. Lets do the standard gradient descent first.
The update equation for \(u_{rs}\) was,
\[ x=\frac{\sum_j v_{sj}\left( m_{rj} - \sum_{k\neq s} u_{rk}v_{kj}\right) }{\sum_j v_{sj}^2} \]
and for \(v_{rs}\)
\[ y=\frac{\sum_i u_{ir}\left( m_{is} - \sum_{k\neq r} u_{ik}v_{ks}\right) }{\sum_i v_{ir}^2} \]
Here is the pseudocode,
repeat until converged
U = np.zeros(X.shape) + 0.5 V = np.zeros(Y.shape) + 0.5
I,J = np.nonzero(M)
for q in range(10): for s in range(0,d): # loop over d # loop over U for r in range(0,m): # x = U[r,s] den = sum(V[s,:]2) num = 0.0 for j in range(0,n): num += V[s,j](M[r,j] - np.inner(U[r,:], V[:,j]) + U[r,s]V[s,j]) if (den > 0.0001): U[r,s] = num/den # loop over V for r in range(0,n): # y = V[s,r] den = sum(U[:,s]2) num = 0.0 for i in range(0,m): num += U[i,s](M[i,r] - np.inner(U[i,:], V[:,r]) + U[i,s]V[s,r]) if (den > 0.0001): V[s,r] = num/den P = np.dot(U, V) err = 0.0; for i,j in zip(I,J): err += (M[i,j] - P[i,j])**2 print math.sqrt(err)
print U, V.transpose()
For the stochastic gradient descent case, we are looping over each data point and updating the corresponding row/column of \(U/V\).
U = np.zeros(X.shape) + 0.5
V = np.zeros(Y.shape) + 0.5
step = 0.05 # lambda N actually ...
I,J = np.nonzero(M)
# should have multiple passes until convergence.
for q in range(100):
# loop over data m_(i,j)
for i,j in zip(I,J):
g = M[i,j] - np.inner(U[i,:], V[:,j])
dcdu = -2*g*V[:,j]
dcdv = -2*g*U[i,:]
U[i,:] = U[i,:] - step * dcdu
V[:,j] = V[:,j] - step * dcdv
P = np.dot(U, V)
err = 0.0;
for i,j in zip(I,J):
err += (M[i,j] - P[i,j])**2
print math.sqrt(err)
print U, V.transpose()
Q = np.random.randn(6,6)
Q[1,3] = 0
Q[4,5] = 0
Q[1,2] = 0
I,J = np.nonzero(Q)
print len(I)