Using Deep Learning to Build a Recommendation Engine
The rise of advanced computational techniques based on deep learning has increased the popularity of recommendation engines. Recommendation engines are perhaps best known for their use on e-commerce platforms, where they identify relevant products for online shoppers. In this blog post, we’ll discuss how one deep learning architecture, the autoencoder, can be trained to serve as a recommendation engine.
To explain this technique, we’ll be using PyTorch: a library for deep learning applications that, at the time of writing, is available for Linux and macOS. As described on the PyTorch website, it has two high-level features:
• Tensor computation with strong GPU acceleration.
• Deep neural networks built on a tape-based autograd system.
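As a quick illustration of these two features, here is a minimal sketch. The tensor values are made up for the example, and it runs on the CPU; moving tensors to a GPU is left out.

```python
import torch

# Tensor computation: a small matrix product on the CPU.
a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
b = torch.ones(2, 2)
product = a @ b

# Autograd: operations on tensors created with requires_grad=True are
# recorded, so gradients can be computed by calling backward().
x = torch.tensor([2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()   # y = x0^2 + x1^2
y.backward()         # dy/dx = 2 * x
grad = x.grad

print(product)
print(grad)
```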
In this recommendation engine illustration, we’ll use PyTorch to build our neural network model. The dataset is publicly available on the MovieLens website and contains reviews of 27,000 movies by 138,000 users. To keep the computation manageable, we will use a subset of 1 million ratings from 6,000 users on 4,000 movies.
How Autoencoders Work
The picture below gives a simple illustration of how encoding and decoding work in an autoencoder. The first few layers encode the input into a lower-dimensional representation, and the last few layers decode that representation to recreate the original input.
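For readers who prefer notation, the encode and decode steps can be written roughly as follows for the simplest case of a single hidden layer. The symbols are assumptions introduced here, not taken from the implementation: x is an input vector of n values (for example, one user's ratings), h is the k-dimensional code with k < n, f is a nonlinearity, and W_e, b_e, W_d, b_d are the learned encoder and decoder parameters.

```latex
\begin{aligned}
h &= f(W_e x + b_e), & h &\in \mathbb{R}^{k},\ k < n \\
\hat{x} &= W_d h + b_d, & \hat{x} &\in \mathbb{R}^{n} \\
\mathcal{L}(x, \hat{x}) &= \frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{x}_i)^2
\end{aligned}
```

Training adjusts the parameters so that the reconstruction \hat{x} stays close to x, which forces the small code h to capture the main patterns in the input.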
Building the Neural Network
To decide whether a new user who has rated a few movies will like a movie they have not watched yet, we treat a rating above 3 as a like and a rating below 3 as not a like. To train the neural network, we first define the loss function, which is mean squared error (MSE) in our case. The optimizer is RMSprop, which updates the parameters of the network to fit the dataset. The number of epochs can be tuned by monitoring the loss on the training, validation and test sets.
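The loss and optimizer setup could look roughly like the sketch below. The stand-in model, the learning rate, the epoch count and the toy ratings are all assumptions chosen for illustration. The post does not say how a rating of exactly 3 is handled, so the helper here treats it as not a like, which is also an assumption.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in model: 5 movie ratings in, 5 predicted ratings out.
model = nn.Linear(5, 5)

criterion = nn.MSELoss()
# lr=0.01 is an illustrative value, not taken from the post.
optimizer = torch.optim.RMSprop(model.parameters(), lr=0.01)

# One user's toy ratings for 5 movies.
ratings = torch.tensor([[5.0, 3.0, 0.0, 1.0, 4.0]])

losses = []
for epoch in range(20):
    optimizer.zero_grad()
    loss = criterion(model(ratings), ratings)  # reconstruct the input
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

def is_like(rating, threshold=3.0):
    """A rating strictly above the threshold counts as a like."""
    return rating > threshold

print(losses[0], losses[-1])
```

Tracking the loss per epoch, as in the `losses` list, is one simple way to decide when to stop training.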
The network we built has three hidden layers of 20, 10 and 20 nodes, followed by an output layer with as many nodes as there are movies. First, the user ratings and the movie database were combined into a matrix in which each row holds one user’s ratings and each column represents a movie. Finally, these datasets were converted to torch tensors for computational efficiency.
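A minimal sketch of both steps follows. The layer sizes come from the description above; everything else is an assumption made for the example: the helper name `build_rating_matrix`, zero-based user and movie indices, 0.0 as the marker for "not rated", and sigmoid activations.

```python
import torch
import torch.nn as nn

def build_rating_matrix(triples, n_users, n_movies):
    """Rows are users, columns are movies; 0.0 marks 'not rated' (assumption)."""
    matrix = torch.zeros(n_users, n_movies)
    for user, movie, rating in triples:
        matrix[user, movie] = rating
    return matrix

class AutoEncoder(nn.Module):
    """Hidden layers of 20, 10 and 20 nodes, then one output node per movie."""

    def __init__(self, n_movies):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_movies, 20), nn.Sigmoid(),
            nn.Linear(20, 10), nn.Sigmoid(),
        )
        self.decoder = nn.Sequential(
            nn.Linear(10, 20), nn.Sigmoid(),
            nn.Linear(20, n_movies),  # no activation: outputs are raw ratings
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Toy usage: 2 users, 3 movies, given as (user, movie, rating) triples.
ratings = build_rating_matrix([(0, 1, 4.0), (1, 2, 5.0)], n_users=2, n_movies=3)
model = AutoEncoder(n_movies=3)
reconstructed = model(ratings)
print(reconstructed.shape)
```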
After training, the model performs well on the test set. To compute the mean squared error, we excluded the movies a user had not rated and used only the ratings for which we had data. The test loss is quite similar to the training loss, which suggests the model is not overfitting. The model can be fine-tuned by adjusting the number of layers and by training it on the largest dataset available. The trained model can then be used to predict the ratings of a new user, provided that user has rated at least a few movies in the dataset.
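One way to restrict the error to rated movies, and then pick unseen movies to recommend, is sketched below. The function names and toy tensors are hypothetical, and the sketch assumes that 0 marks a missing rating.

```python
import torch

def masked_mse(predicted, target):
    """Mean squared error over entries that actually have a rating (> 0)."""
    mask = target > 0
    return ((predicted[mask] - target[mask]) ** 2).mean()

def recommend_unseen(predicted_row, rated_row, threshold=3.0):
    """Indices of unrated movies whose predicted rating is above the threshold."""
    unseen = rated_row == 0
    picks = unseen & (predicted_row > threshold)
    return torch.nonzero(picks).flatten().tolist()

# Toy data: 2 users, 3 movies; 0.0 means the user has not rated the movie.
predicted = torch.tensor([[4.0, 2.5, 3.0], [1.0, 5.0, 2.0]])
target = torch.tensor([[5.0, 0.0, 3.0], [0.0, 4.0, 0.0]])
test_loss = masked_mse(predicted, target)

# A new user who has rated only the first movie.
new_user_rated = torch.tensor([5.0, 0.0, 0.0])
new_user_predicted = torch.tensor([4.0, 3.5, 1.0])
recommended = recommend_unseen(new_user_predicted, new_user_rated)

print(test_loss.item(), recommended)
```

Masking this way keeps the many unrated entries from dominating the loss, since a user typically rates only a small fraction of the movies.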
The simple implementation can be found on my GitHub here: https://github.com/deepakmahapatra/AutoEncoders