Hari Sundar


Let us first create the utility matrix \(M\). We will build a full matrix of low rank and then sample entries from it to form \(M\).

import numpy as np
import numpy.linalg as ln
import math

m = 100     # users
n = 1000    # items
d = 2       # rank of the underlying matrix

X = np.random.randn(m, d) - 0.5
Y = np.random.randn(d, n) - 0.5

M_full = np.dot(X, Y)
print(ln.matrix_rank(M_full))

We will sample values from \(M_{\mathrm{full}}\) to create a dummy utility matrix \(M\). Let us assume that each user has rated at most 20 items.

M = np.zeros(M_full.shape)
deg = 20
for i in range(m):
    # randomly choose how many items this user has rated (1 to deg)
    q = np.random.randint(1, deg + 1)
    # now pick which items were rated and copy those entries
    s = np.random.randint(0, n, q)
    M[i, s] = M_full[i, s]

# sanity check: a given user's row should have some nonzero ratings
print(np.sum(M[10, :]**2))

Now we can proceed with the UV decomposition. Let's do the standard gradient descent first.

Gradient Descent

The update equation for \(u_{rs}\) was,

\[ x=\frac{\sum_j v_{sj}\left( m_{rj} - \sum_{k\neq s} u_{rk}v_{kj}\right) }{\sum_j v_{sj}^2} \]

and for \(v_{rs}\)

\[ y=\frac{\sum_i u_{ir}\left( m_{is} - \sum_{k\neq r} u_{ik}v_{ks}\right) }{\sum_i u_{ir}^2} \]
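
These updates come from minimizing the squared error with respect to one entry at a time, holding everything else fixed. As a quick check of the first formula, write \(x\) for the entry \(u_{rs}\) being optimized and set the derivative of the squared error to zero:

\[ \frac{\partial}{\partial x}\sum_j \left( m_{rj} - \sum_{k\neq s} u_{rk}v_{kj} - x\,v_{sj}\right)^2 = -2\sum_j v_{sj}\left( m_{rj} - \sum_{k\neq s} u_{rk}v_{kj} - x\,v_{sj}\right) = 0 \]

Solving for \(x\) gives the expression above; the update for \(v_{rs}\) follows by symmetry.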

Here is the pseudocode,

  1. Initialize U, V.
  2. Pick one entry of U or V at a time to optimize:
     1. compute the optimal value for this entry using the update equations above.
  3. Repeat until converged.

U = np.zeros(X.shape) + 0.5
V = np.zeros(Y.shape) + 0.5

I, J = np.nonzero(M)

# should have multiple passes until convergence.
for q in range(10):
    for s in range(0, d):        # loop over the d latent dimensions
        # loop over U
        for r in range(0, m):
            # x = U[r,s]
            den = sum(V[s, :]**2)
            num = 0.0
            for j in range(0, n):
                num += V[s, j] * (M[r, j] - np.inner(U[r, :], V[:, j]) + U[r, s] * V[s, j])
            if den > 0.0001:
                U[r, s] = num / den
        # loop over V
        for r in range(0, n):
            # y = V[s,r]
            den = sum(U[:, s]**2)
            num = 0.0
            for i in range(0, m):
                num += U[i, s] * (M[i, r] - np.inner(U[i, :], V[:, r]) + U[i, s] * V[s, r])
            if den > 0.0001:
                V[s, r] = num / den
    # RMSE over the observed entries only
    P = np.dot(U, V)
    err = 0.0
    for i, j in zip(I, J):
        err += (M[i, j] - P[i, j])**2
    print(math.sqrt(err))

print(U, V.transpose())
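
The innermost loops over \(j\) and \(i\) can be vectorized with numpy. Here is a sketch of the same updates (it keeps the simplification above of summing over all columns, zeros included, rather than only the observed entries):

# Vectorized version of the element-wise updates (a sketch, equivalent
# to the loops above).
for q in range(10):
    for s in range(d):
        # update column s of U; V[s,:] is fixed during this sweep
        den = np.dot(V[s, :], V[s, :])
        if den > 0.0001:
            for r in range(m):
                # residual with the k = s term added back in
                resid = M[r, :] - np.dot(U[r, :], V) + U[r, s] * V[s, :]
                U[r, s] = np.dot(V[s, :], resid) / den
        # update row s of V; U[:,s] is fixed during this sweep
        den = np.dot(U[:, s], U[:, s])
        if den > 0.0001:
            for r in range(n):
                resid = M[:, r] - np.dot(U, V[:, r]) + U[:, s] * V[s, r]
                V[s, r] = np.dot(U[:, s], resid) / den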

Stochastic Gradient Descent

For the stochastic gradient descent case, we loop over the observed entries \(m_{ij}\) one at a time and update the corresponding row of \(U\) and column of \(V\).
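
For a single observed entry, the squared-error cost and its gradients (which the `dcdu` and `dcdv` lines below compute) are

\[ c_{ij} = \left( m_{ij} - \sum_k u_{ik}v_{kj} \right)^2, \qquad \frac{\partial c_{ij}}{\partial u_{ik}} = -2\,g\,v_{kj}, \qquad \frac{\partial c_{ij}}{\partial v_{kj}} = -2\,g\,u_{ik}, \]

where \(g = m_{ij} - \sum_k u_{ik}v_{kj}\) is the residual for that entry.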

U = np.zeros(X.shape) + 0.5
V = np.zeros(Y.shape) + 0.5

step = 0.05   # learning rate (step size)

I, J = np.nonzero(M)

# should have multiple passes until convergence.
for q in range(100):
    # loop over the observed data m_(i,j)
    for i, j in zip(I, J):
        g = M[i, j] - np.inner(U[i, :], V[:, j])
        dcdu = -2 * g * V[:, j]
        dcdv = -2 * g * U[i, :]
        U[i, :] = U[i, :] - step * dcdu
        V[:, j] = V[:, j] - step * dcdv
    # RMSE over the observed entries only
    P = np.dot(U, V)
    err = 0.0
    for i, j in zip(I, J):
        err += (M[i, j] - P[i, j])**2
    print(math.sqrt(err))

print(U, V.transpose())
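
In practice the factorization is usually regularized so that it does not overfit the few observed ratings. A minimal sketch, assuming an L2 penalty \(\lambda\left(\|U\|^2 + \|V\|^2\right)\) added to the cost; the weight `lam` here is an illustrative value, not from the original notes:

lam = 0.1   # hypothetical regularization weight; tune for your data

U = np.zeros(X.shape) + 0.5
V = np.zeros(Y.shape) + 0.5

for q in range(100):
    for i, j in zip(I, J):
        g = M[i, j] - np.inner(U[i, :], V[:, j])
        # gradients now include the L2 penalty terms
        dcdu = -2 * g * V[:, j] + 2 * lam * U[i, :]
        dcdv = -2 * g * U[i, :] + 2 * lam * V[:, j]
        U[i, :] = U[i, :] - step * dcdu
        V[:, j] = V[:, j] - step * dcdv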


As a quick sanity check on indexing the observed entries, `np.nonzero` returns the row and column indices of every nonzero entry, so `len(I)` counts them:

Q = np.random.randn(6, 6)
Q[1, 3] = 0
Q[4, 5] = 0
Q[1, 2] = 0
I, J = np.nonzero(Q)

print(len(I))   # 36 - 3 = 33 nonzero entries