sin-ha/recommender-systems

recommender-systems

Building recommender systems that recommend movies.

The dataset used in this project is the rating.csv file from the Kaggle dataset linked below:

https://www.kaggle.com/grouplens/movielens-20m-dataset

part 1

user-user collaborative filtering.py

Here we use the Pearson correlation, which measures how closely two users agree on the movies they have both rated. Weighting other users by this similarity improves our predictions over a plain average rating.

The algorithm is not optimised and takes O(N*N + M) time to run, so it is highly compute intensive. We therefore work on a smaller subset of the data, reducing the sparsity of the rating matrix by keeping only the users who rate most often and the movies that are rated most often: the 10k most frequent users and the 2.5k most rated movies.

In each iteration the similarity between two users is calculated with the Pearson correlation, and only the top 5 neighbours are stored.
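As a rough illustration of the similarity step, here is a minimal sketch in plain Python. It is not the code from the repository's script: the names `pearson`, `top_neighbours` and `min_common`, the toy `ratings` dictionary, and the choice to compute each user's mean over the commonly rated movies only are all assumptions made for this example.

```python
import heapq
import math


def pearson(ratings_a, ratings_b, min_common=2):
    """Pearson correlation of two users over the movies both have rated.

    ratings_a and ratings_b map movie id -> rating. Returns 0.0 when the
    users share fewer than min_common movies or when either user gave the
    same rating to every shared movie (zero variance).
    """
    common = sorted(ratings_a.keys() & ratings_b.keys())
    if len(common) < min_common:
        return 0.0
    # Assumption: means are taken over the shared movies only.
    mean_a = sum(ratings_a[m] for m in common) / len(common)
    mean_b = sum(ratings_b[m] for m in common) / len(common)
    dev_a = [ratings_a[m] - mean_a for m in common]
    dev_b = [ratings_b[m] - mean_b for m in common]
    numerator = sum(x * y for x, y in zip(dev_a, dev_b))
    denominator = math.sqrt(sum(x * x for x in dev_a)) * math.sqrt(
        sum(y * y for y in dev_b)
    )
    return numerator / denominator if denominator > 0 else 0.0


def top_neighbours(user, ratings, k=5):
    """Return the k most similar users as (similarity, user id) pairs."""
    scored = (
        (pearson(ratings[user], other_ratings), other)
        for other, other_ratings in ratings.items()
        if other != user
    )
    return heapq.nlargest(k, scored)


# Hypothetical toy data: user id -> {movie id -> rating}.
ratings = {
    "u1": {"m1": 5.0, "m2": 3.0, "m3": 4.0},
    "u2": {"m1": 4.0, "m2": 2.0, "m3": 3.0},  # same taste as u1, shifted down
    "u3": {"m1": 1.0, "m2": 5.0, "m3": 2.0},  # roughly opposite taste
}

print(top_neighbours("u1", ratings, k=2))
```

Here `heapq.nlargest` keeps only the best `k` scores per user, which matches the idea of storing only the top neighbours instead of a full user-by-user similarity matrix.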

improvement

Since this is a deterministic mathematical model, we can improve it by considering more neighbours, which gives us a better estimate.
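To show where the number of neighbours enters the prediction, here is a small sketch of one common scoring rule for user-user collaborative filtering: the target user's mean rating plus a similarity-weighted average of the neighbours' deviations from their own means. The function name `predict_rating`, its arguments and the sample numbers are hypothetical, and the repository's script may score differently.

```python
def predict_rating(user_mean, neighbours, movie, k=5):
    """Predict a rating from the k most similar neighbours.

    neighbours is a list of (similarity, neighbour_mean, neighbour_ratings)
    tuples sorted by similarity, highest first. Neighbours that have not
    rated the movie are skipped. Falls back to the user's own mean when no
    neighbour contributes.
    """
    numerator = 0.0
    denominator = 0.0
    for similarity, neighbour_mean, neighbour_ratings in neighbours[:k]:
        if movie in neighbour_ratings:
            numerator += similarity * (neighbour_ratings[movie] - neighbour_mean)
            denominator += abs(similarity)
    if denominator == 0.0:
        return user_mean
    return user_mean + numerator / denominator


# Hypothetical neighbours of one user, already sorted by similarity.
neighbours = [
    (0.9, 3.0, {"m1": 4.0}),
    (0.5, 4.0, {"m1": 4.5}),
    (0.2, 2.5, {"m2": 3.0}),
]

for k in (1, 2, 3):
    print(k, predict_rating(3.5, neighbours, "m1", k=k))
```

Raising `k` lets more users contribute to the weighted average. In this toy data the third neighbour has not rated `m1`, so `k=2` and `k=3` give the same prediction.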

part 2

matrix factorization.py

Based on SVD, we try to express our rating matrix as the product of two latent-space matrices, U and W. The predicted value is given by r^ = W^T.U + b + c + mu, where b and c are bias terms and mu is the global average rating. The cost function and its gradients follow from this expression.
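The prediction formula above can be sketched as a small training loop. This is only an illustration under stated assumptions, not the repository's implementation: it assumes W holds user factors and U holds movie factors, b is a per-user bias and c a per-movie bias, the cost is regularised squared error, and training uses stochastic gradient descent. The function name `train_mf`, its hyperparameters and the toy data are hypothetical.

```python
import random


def train_mf(ratings, n_users, n_movies, k=2, lr=0.02, reg=0.02,
             epochs=300, seed=0):
    """Fit r^ = W[i].U[j] + b[i] + c[j] + mu by stochastic gradient descent.

    ratings is a list of (user index, movie index, rating) triples.
    Returns a predict(i, j) function and the mean squared error per epoch.
    """
    rng = random.Random(seed)
    W = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    U = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_movies)]
    b = [0.0] * n_users   # assumed: user bias
    c = [0.0] * n_movies  # assumed: movie bias
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean rating

    def predict(i, j):
        return sum(w * u for w, u in zip(W[i], U[j])) + b[i] + c[j] + mu

    data = list(ratings)  # copy so the caller's list is not shuffled
    losses = []
    for _ in range(epochs):
        rng.shuffle(data)
        for i, j, r in data:
            err = r - predict(i, j)
            # Gradient steps on 0.5 * (err^2 + reg * squared parameter norms).
            b[i] += lr * (err - reg * b[i])
            c[j] += lr * (err - reg * c[j])
            for f in range(k):
                w, u = W[i][f], U[j][f]
                W[i][f] += lr * (err * u - reg * w)
                U[j][f] += lr * (err * w - reg * u)
        losses.append(
            sum((r - predict(i, j)) ** 2 for i, j, r in data) / len(data)
        )
    return predict, losses


# Hypothetical toy data: (user index, movie index, rating).
toy_ratings = [
    (0, 0, 5.0), (0, 1, 3.0),
    (1, 0, 4.0), (1, 2, 2.0),
    (2, 1, 4.5), (2, 2, 1.0),
]

predict, losses = train_mf(toy_ratings, n_users=3, n_movies=3)
print("first epoch MSE:", losses[0])
print("last epoch MSE:", losses[-1])
print("predicted rating for user 0, movie 2:", predict(0, 2))
```

Other optimisers, such as alternating least squares, minimise the same cost. Gradient descent is used here only because it keeps the sketch short.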
