Collaborative Filtering

Collaborative Filtering Section Introduction

Collaborative Filtering
Generalize what we studied so far
Idea: Rank each item by a score
e.g. PageRank, Reddit, Hacker News, bandits…
-Find a score s(j) for each item
-j is the item, s() is the score
-Not specific to any particular user (no matter who you are, you see the same score for each item)
Notations
s(j): Average rating for each item
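$$ s(j) = \frac{\sum_{i \in \Omega_j} r_{ij}}{|\Omega_j|} $$
(notation assumed here: \Omega_j is the set of users who rated item j, r_{ij} is the rating user i gave item j)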
Sum of all ratings for item j divided by the number of ratings for item j
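A minimal sketch of the average-rating score, assuming the ratings are stored in a dictionary keyed by (user, item). The names `ratings` and `average_rating` and the toy data are illustrative only, not from the course.

```python
# Hypothetical toy data: (user, item) -> rating. Missing pairs are simply absent.
ratings = {
    ("u1", "item1"): 5.0,
    ("u2", "item1"): 4.0,
    ("u3", "item1"): 3.0,
    ("u1", "item2"): 2.0,
}


def average_rating(ratings, j):
    """s(j): sum of all ratings for item j divided by the number of ratings for j."""
    values = [r for (_, item), r in ratings.items() if item == j]
    if not values:
        return None  # no ratings yet, so the average is undefined
    return sum(values) / len(values)


print(average_rating(ratings, "item1"))  # 4.0
```

The score takes no user argument, which is exactly why every user sees the same value.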
Personalize the score
s(i,j): Can depend both on the user i and item j
$$ s(i, j) = \frac{\sum_{i' \in \Omega_j} r_{i'j}}{|\Omega_j|} $$
(\Omega_j: the set of users i' who rated item j)
Right hand side still doesn’t depend on i
->Every user i still sees the same score for each item
-Useful to introduce new conventions and symbols and modify this formula later on
-i = 1…N, N: number of users, j = 1…M, M: Number of items in our dataset
Ratings Matrix
Central object for the next few sections of the course
| User \ Item | Item 1 | Item 2 | … | Item m |
|-------------|--------|--------|---|--------|
| User 1 | 5 | 3 | … | 1 |
| User 2 | 4 |   | … | 2 |
| … | … | … | … | … |
| User n | 2 | 1 | … | 5 |
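One possible way to hold such a matrix in code, sketched with plain Python lists. The triples, the 0-based indices, and the use of `None` for a missing rating are assumptions for illustration.

```python
# Hypothetical (user index, item index, rating) triples; indices are 0-based here.
triples = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (2, 1, 1)]
N, M = 3, 2  # N users, M items

# None marks a missing rating; it is not the same as a rating of 0.
R = [[None] * M for _ in range(N)]
for i, j, r in triples:
    R[i][j] = r

for row in R:
    print(row)
```

A real dataset would usually keep only the triples (or a dictionary), since storing every empty cell is wasteful.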

Relationship to NLP
The user-item matrix is reminiscent of the term-document matrix
(a term is a word, so this is a word-document matrix)
X(t, d): # of times term t appears in document d
-In recommendation-system terms, can think of X(t, d) as “how much does term t like document d?”
-e.g. “gravity” appears a lot in a document written by Albert Einstein
=> “gravity” likes this document Albert Einstein wrote
-The data have the same structure -> techniques for analyzing these matrices cross over between the two fields
-Matrix Factorization: Deep NLP
-SVD: Unsupervised Deep Learning + NLP
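To make X(t, d) concrete, here is a small sketch that counts terms per document with `collections.Counter`. The two toy documents and their names are made up for illustration.

```python
from collections import Counter

# Hypothetical toy corpus: document name -> text.
docs = {
    "einstein": "gravity bends light and gravity curves space",
    "cookbook": "salt and pepper and butter",
}

# X[(t, d)] = number of times term t appears in document d.
X = Counter()
for d, text in docs.items():
    for t in text.split():
        X[(t, d)] += 1

print(X[("gravity", "einstein")])  # 2
print(X[("gravity", "cookbook")])  # 0 (a true zero, not a missing value)
```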
Sparsity
The characteristic of the user-item matrix that is unique to recommendation systems is the way it is sparse
(sparse here does not necessarily mean the entries are 0)
-Term-document matrix is sparse:
Most entries are 0
-User-item matrix is sparse:
Most entries are empty
(Most entries of the user-item matrix are undefined or don’t exist)
Why?
Suppose Netflix has 100 million users
Ex1.
Suppose you are an average user. How many movies have you watched?
Among those watched movies, how many have you rated?
Ex2.
According to Google, 500,000 movies exist
The number of movies any one user has rated is just a fraction of a percent of all the movies in the world
-> The full user-item matrix is expected to be extremely sparse
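A rough back-of-the-envelope check of that claim. The user and movie counts come from the examples above; the figure of 100 ratings per user is purely an assumption for illustration.

```python
n_users = 100_000_000   # from the Netflix example above
n_items = 500_000       # from the movie-count example above
ratings_per_user = 100  # assumption for illustration only

total_cells = n_users * n_items
filled_cells = n_users * ratings_per_user
density = filled_cells / total_cells

print(f"{density:.4%} of the matrix is filled")  # 0.0200% of the matrix is filled
```

Even with a generous guess for ratings per user, the filled share stays far below one percent.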
Goal of Collaborative Filtering
-Want to make recommendations
-In the ratings matrix r(i, j), most of the values don’t exist. This is good
-Ex. Suppose the matrix is dense, with every value filled in
-> Good mathematically, but business-wise, it sucks (= every user has already seen every movie)
Nothing left to recommend
=> The matrix must be sparse so that you have something to recommend to the user
-Guess what the missing ratings in the ratings matrix might be
Want to guess what you would rate a movie you haven’t seen yet
Regression problem: predict a real-valued variable
$\hat{r}(i, j)$ (r hat): prediction of what user i will rate item j
Can recommend items sorted by their predicted scores
E.g. if we predict you will rate a movie 5, definitely recommend that movie
=> Want to find s(i, j), the user-item recommendation score, which is the predicted rating for items you haven’t seen
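Sorting by predicted score can be sketched as follows. The function name `recommend`, the parameter `k`, and the scores themselves are hypothetical.

```python
def recommend(predicted, seen, k=2):
    """Return the k unseen items with the highest predicted score, best first."""
    unseen = {j: s for j, s in predicted.items() if j not in seen}
    return sorted(unseen, key=unseen.get, reverse=True)[:k]


# Hypothetical predicted scores s(i, j) for a single user i.
predicted = {"A": 4.8, "B": 2.1, "C": 3.9, "D": 5.0}
seen = {"D"}  # already rated, so not worth recommending again

print(recommend(predicted, seen))  # ['A', 'C']
```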
Regression
Want to predict a real number
Objective is MSE(mean squared error)
Outline
User-user Collaborative Filtering
Item-item Collaborative Filtering
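Process (computing the MSE)
(1) Take our model’s predicted ratings
(2) Compare them to the actual ratings
(3) Square the differences
(4) Take the average of those squared differences
$$ \text{MSE} = \frac{1}{|\Omega|} \sum_{(i, j) \in \Omega} \left( r_{ij} - \hat{r}_{ij} \right)^2 $$
(\Omega: the set of (user, item) pairs that have an actual rating)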
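The four steps map directly onto a few lines of code. This is a sketch with made-up ratings, assuming both actual and predicted ratings are dictionaries keyed by (user, item).

```python
def mse(actual, predicted):
    """Mean squared error over the (user, item) pairs that have an actual rating."""
    squared = [(predicted[key] - r) ** 2 for key, r in actual.items()]
    return sum(squared) / len(squared)


# Hypothetical known ratings and model predictions, keyed by (user, item).
actual = {(0, 0): 5.0, (0, 1): 3.0, (1, 0): 4.0}
predicted = {(0, 0): 4.0, (0, 1): 3.0, (1, 0): 2.0}

print(mse(actual, predicted))  # (1 + 0 + 4) / 3
```

Only pairs with an actual rating enter the average, since there is nothing to compare against for the empty cells.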

User-user Collaborative Filtering

User-user Collaborative Filtering
Ex. Slice of the user-movie matrix
-Alice really seems to like action movies (Batman, X-Men, Star Wars) and dislike romance movies (The Notebook, Bridget Jones’s Diary)
-Bob’s tastes are similar to Alice’s
Bob likes the action movies he has rated (Batman, X-Men) as well and really dislikes romance movies (The Notebook, Bridget Jones’s Diary)
-Carol has the opposite taste to Alice and Bob
Carol likes romance movies (The Notebook, Bridget Jones’s Diary), but doesn’t care for the action movies (Batman, X-Men, Star Wars)
-Bob hasn’t seen Star Wars. Would Star Wars be a good recommendation for Bob?
Intuitive Perspective
Probably a good recommendation. Bob’s ratings are very similar to Alice’s
-> We can assume Bob will feel the same way about Star Wars as Alice does
Mathematically, Bob’s and Alice’s ratings are highly correlated, while Bob’s ratings disagree with Carol’s almost completely (negative correlation)
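The correlation claim can be checked on the ratings from this example. The sketch below assumes Pearson correlation computed only over the movies both users have rated; the helper `pearson` is written here for illustration.

```python
from math import sqrt

# Ratings in the order: Batman, X-Men, Star Wars, The Notebook, Bridget Jones's Diary.
# None marks a movie the user has not rated.
alice = [5, 4.5, 5, 2, 1]
bob = [4.5, 4, None, 2, 2]
carol = [2, 3, 1, 5, 5]


def pearson(u, v):
    """Pearson correlation over the items both users have rated."""
    pairs = [(a, b) for a, b in zip(u, v) if a is not None and b is not None]
    mean_u = sum(a for a, _ in pairs) / len(pairs)
    mean_v = sum(b for _, b in pairs) / len(pairs)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in pairs)
    var_u = sum((a - mean_u) ** 2 for a, _ in pairs)
    var_v = sum((b - mean_v) ** 2 for _, b in pairs)
    return cov / sqrt(var_u * var_v)


print(round(pearson(bob, alice), 2))  # 0.98
print(round(pearson(bob, carol), 2))  # -0.99
```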
Average Rating
-Limited
Not personalized: doesn’t take the user i into account at all. Treats everyone’s rating of the movie equally
-Bob’s s(i, j) depends equally on Alice’s rating and Carol’s rating, even though he doesn’t agree with Carol
| | Batman | X-Men | Star Wars | The Notebook | Bridget Jones’s Diary |
|-------|--------|-------|-----------|--------------|-----------------------|
| Alice | 5 | 4.5 | 5 | 2 | 1 |
| Bob | 4.5 | 4 |   | 2 | 2 |
| Carol | 2 | 3 | 1 | 5 | 5 |
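Applying the plain average to Star Wars shows the limitation. A short sketch using the two Star Wars ratings from the table:

```python
# Star Wars ratings from the table; Bob has not rated it yet.
star_wars = {"Alice": 5, "Carol": 1}

# Plain average: every rating counts equally, whoever the target user is.
s_star_wars = sum(star_wars.values()) / len(star_wars)
print(s_star_wars)  # 3.0
```

Bob would be shown a lukewarm 3.0, even though the one user he agrees with gave the movie a 5.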

Weighting Ratings
One way to fix the average rating is to put a weight on each rating
-Want the weight to be small for users I don’t agree with, large for users I do agree with
Intuitively, want Alice’s ratings to matter more and Carol’s ratings to matter less
-Have to divide by the sum of the weights in the denominator (so the final score stays on the same scale as the ratings)
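$$ s(i, j) = \frac{\sum_{i' \in \Omega_j} w_{ii'} \, r_{i'j}}{\sum_{i' \in \Omega_j} w_{ii'}} $$
(\Omega_j: the set of users who rated item j)
r_{i'j}: Rating that user i’ gives to item j
w_{ii'}: Weight between user i and user i’
-Want the weight to be large if user i and user i’ are in agreement and small if they are not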
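A sketch of the weighted score for Bob and Star Wars. The weights 0.9 and 0.1 are invented for illustration (how to compute real weights is not covered at this point), and `weighted_score` is a hypothetical helper name.

```python
def weighted_score(weights, item_ratings):
    """s(i, j): weighted average of other users' ratings for item j.

    weights: user i' -> w(i, i'); item_ratings: user i' -> r(i', j).
    """
    users = [u for u in item_ratings if u in weights]
    numerator = sum(weights[u] * item_ratings[u] for u in users)
    denominator = sum(weights[u] for u in users)
    return numerator / denominator


# Hypothetical weights for Bob: high for Alice, low for Carol.
bob_weights = {"Alice": 0.9, "Carol": 0.1}
star_wars = {"Alice": 5, "Carol": 1}

print(round(weighted_score(bob_weights, star_wars), 2))  # 4.6
```

With equal weights the function falls back to the plain average of 3.0, so the weights are the only thing that personalizes the score.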
Another issue with average rating
-Your interpretation of a rating is different from mine
-Users can be biased to be optimistic or pessimistic
Optimistic: Rates most movies 5, and a bad movie 3
Pessimistic: Rates most movies 1 or 2, and a good movie 4
Recommendation_System29
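One common way to see this bias is to compare each rating with the user's own average rating. The sketch below is an illustration under that assumption, with made-up ratings for an optimistic and a pessimistic user.

```python
# Hypothetical raw ratings for two users on the same three movies.
optimist = {"m1": 5, "m2": 5, "m3": 3}
pessimist = {"m1": 2, "m2": 1, "m3": 4}


def deviations(user_ratings):
    """Subtract the user's own average so ratings are relative to their habit."""
    mean = sum(user_ratings.values()) / len(user_ratings)
    return {j: r - mean for j, r in user_ratings.items()}


print(deviations(optimist)["m3"])   # negative: a 3 is below this user's average
print(deviations(pessimist)["m3"])  # positive: a 4 is above this user's average
```

The raw numbers 3 and 4 look close, but relative to each user's habits they point in opposite directions.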