CSE472 Assignment 3: Matrix Factorization (ALS)

Assignment 3: Matrix Factorization for Recommender System

Description of the Assignment

In this assignment, you’ll build a recommendation system that predicts ratings from reviews of Electronics products on Amazon. You’ll be given data comprising Users, Items, and Ratings. We’ll focus on the User-Item utility matrix and build an alternating least squares (ALS) based recommendation system.

 

To begin, download the files for this assignment from:

http://jmcauley.ucsd.edu/cse255/data/assignment2.tar.gz

train.json.gz: 1,000,000 reviews to be used for training. It is not necessary to use all reviews for training, for example if doing so proves too computationally intensive. These files contain one JSON object per line. You will need the following three fields:

itemID: The ID of the item.  This is a hashed product identifier from Amazon.

reviewerID: The ID of the reviewer.  This is a hashed user identifier from Amazon.

rating: Rating given by the reviewer. The range is 0-5.

Rating.txt: Pairs (userIDs and itemIDs) on which you are to predict ratings (see the tasks below).
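Reading the training file can be sketched as follows. This is a minimal sketch, assuming every non-blank line is valid JSON and carries the three fields listed above. The names `read_ratings` and `limit` are hypothetical, and `limit` is only there to make it easy to work with a subset of the reviews. If the lines turn out not to be strict JSON, a different parser would be needed.

```python
import gzip
import json


def read_ratings(path, limit=None):
    """Yield (reviewerID, itemID, rating) triples from a gzipped one-JSON-per-line file."""
    yielded = 0
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            if limit is not None and yielded >= limit:
                break  # stop early when only a subset is wanted
            if not line.strip():
                continue  # skip blank lines
            record = json.loads(line)
            yield record["reviewerID"], record["itemID"], float(record["rating"])
            yielded += 1
```

For example, `list(read_ratings("train.json.gz", limit=10000))` would load the first 10,000 reviews as triples.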

The following snapshot shows the relevant fields from the training file.

 

 

Use the following guidelines.

  1. Using the ratings from the given data set, build an ALS model with a small number of latent factors (between 10 and 50).
  2. We strongly recommend that you first try your code on a smaller dataset.
  3. Split the data set into 60-20-20 train-validate-test partitions. That is, the first 60% of the data is the training set. The next 20% is for validation and the remaining 20% is for test. You’ll use the training set to learn your ALS model and use the validation set to choose the regularization parameter and number of latent factors. Your splits cannot have any overlapping users but can have overlapping products.
  4. You’ll evaluate these systems using RMSE (root mean square error) on the validation and test sets. Make sure you try different regularization parameters and several latent factor dimensions, and select the model that gives you the best RMSE on the validation set.
  5. Once you have finished choosing your model using the validation set, you’ll test it on the test set and report that error as your final error metric.
  6. Finally, write a simple recommendation engine that takes the ALS model and a ratings file containing a few ratings from one user, and returns product recommendations for that user.
  7. You can use any data structure library for sparse matrix representation and any linear algebra library for matrix inversion.
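The guidelines above can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions, not a reference solution. It assumes users and items have already been mapped to 0-based integer indices, and it keeps ratings in plain Python lists rather than a sparse matrix library. All function names (`split_users`, `solve_side`, `als_fit`, `rmse`, `recommend`) are hypothetical. Splitting by the order in which users first appear is only one reading of the 60-20-20 rule, and the "fold-in" step in `recommend` (solving for a new user's vector with item factors held fixed) is one common approach rather than something the assignment prescribes.

```python
import numpy as np


def split_users(triples, train_pct=60, valid_pct=20):
    """Partition (user, item, rating) triples so no user spans two partitions."""
    users = list(dict.fromkeys(u for u, _, _ in triples))  # first-appearance order
    cut1 = len(users) * train_pct // 100
    cut2 = len(users) * (train_pct + valid_pct) // 100
    part_of = {u: 0 if n < cut1 else 1 if n < cut2 else 2 for n, u in enumerate(users)}
    parts = ([], [], [])
    for triple in triples:
        parts[part_of[triple[0]]].append(triple)
    return parts  # (train, validation, test)


def solve_side(fixed, observed_by_row, lam):
    """Regularized least-squares update for one side, holding `fixed` constant."""
    k = fixed.shape[1]
    out = np.zeros((len(observed_by_row), k))
    for row, observed in enumerate(observed_by_row):
        if not observed:
            continue  # rows without ratings keep a zero vector
        cols = [c for c, _ in observed]
        vals = np.array([r for _, r in observed], dtype=float)
        F = fixed[cols]
        out[row] = np.linalg.solve(F.T @ F + lam * np.eye(k), F.T @ vals)
    return out


def als_fit(triples, n_users, n_items, k=10, lam=0.1, n_iters=10, seed=0):
    """Alternate user and item updates; indices are assumed to be 0-based ints."""
    by_user = [[] for _ in range(n_users)]
    by_item = [[] for _ in range(n_items)]
    for u, i, r in triples:
        by_user[u].append((i, r))
        by_item[i].append((u, r))
    V = np.random.default_rng(seed).normal(scale=0.1, size=(n_items, k))
    for _ in range(n_iters):
        U = solve_side(V, by_user, lam)  # fix items, solve for users
        V = solve_side(U, by_item, lam)  # fix users, solve for items
    return U, V


def rmse(U, V, triples):
    """Root mean square error of predicted ratings U[u] . V[i] on the given triples."""
    errors = np.array([r - U[u] @ V[i] for u, i, r in triples])
    return float(np.sqrt(np.mean(errors ** 2)))


def recommend(V, observed, lam, top_n=5):
    """Fold in a user from a few (item, rating) pairs, then rank unseen items."""
    user_vec = solve_side(V, [observed], lam)[0]
    scores = V @ user_vec
    seen = {i for i, _ in observed}
    ranked = [int(i) for i in np.argsort(-scores) if int(i) not in seen]
    return ranked[:top_n]
```

Model selection would then loop over candidate values of `k` and `lam`, fit on the training partition, and keep the pair with the lowest validation RMSE. Because the partitions share no users, validation and test users have no learned vector after training. One option, again an assumption, is to fold each such user in from part of their ratings and measure RMSE on the rest.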

 

Special Instructions

  1. Although you are allowed to consult Dr. Google, don’t copy anything! If you copy from the internet, from another person, or from any other source, it will be obvious and you will be severely punished. More than that, we expect fairness and honesty from you. Don’t disappoint us!
  2. Upload the code to Moodle by 10:00 P.M. on 29th June, 2018 (Friday). This is a strict system-imposed deadline for both sections A and B.
  3. For Python and Matlab, supporting software may not be available in the lab. If you program in these languages, bring your own computer to the sessional.
  4. You are allowed to show the assignment on your own laptop during the final submission. In that case, ensure you have an internet connection, as you will have to download your code from Moodle on the spot and show it.
  5. Contact Sharif sir for any typos or discrepancies in this document.