Optimization Techniques for Latent Collaborative Filtering

Rabin Poudyal
4 min read · Jun 29, 2018

In my previous article, I wrote about latent collaborative filtering and why it is one of the most widely used collaborative filtering algorithms.

In summary,

  1. We have a history of which products each user rated or purchased.
  2. We find hidden factors that influence a user’s probability of buying or rating a product.
  3. We use those hidden factors to predict how much a user would like or dislike a product.
  4. We start by taking the user-item rating matrix R and decomposing it into a user factor matrix (P), where each user is mapped to the hidden factors, and a product factor matrix (Q), where each product is mapped to the hidden factors.
  5. This decomposition can be framed as an optimization problem, and this optimization is what we are going to solve.
  6. We need to find a factor vector p_u for each user and a factor vector q_i for each item, such that the complete set of factor vectors minimizes the error between the actual ratings in the training set and the predicted ratings. The predicted rating for any user-product pair is the dot product of their factor vectors.
  7. Taking the sum of squares of all the errors over the training set gives us the error function, and minimizing this error function is the goal of our optimization problem.
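The setup above can be sketched in a few lines of NumPy. The rating matrix, factor count, and index names here are made-up illustrations, not values from any real dataset:

```python
import numpy as np

np.random.seed(0)

# Toy user-item rating matrix R (0 = not rated); values are made up
R = np.array([
    [5, 3, 0, 1, 0],
    [4, 0, 0, 1, 0],
    [1, 1, 0, 5, 0],
    [0, 1, 5, 4, 0],
], dtype=float)

n_users, n_items = R.shape
n_factors = 2   # number of hidden factors

# P maps each user to the hidden factors, Q maps each product to them
P = np.random.rand(n_users, n_factors)
Q = np.random.rand(n_items, n_factors)

# The predicted rating for user u and item i is the dot product p_u . q_i
u, i = 0, 0
predicted = P[u].dot(Q[i])

# The error function: sum of squared errors over the observed ratings only
mask = R > 0
error = np.sum((R - P.dot(Q.T))[mask] ** 2)
print(predicted, error)
```

With random P and Q the error is large; the two optimization techniques below are ways of shrinking it.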

Stochastic Gradient Descent

Stochastic gradient descent takes each rating in the training set and performs an update based on it. Let r_ui be the rating given by user u to item i. What we are looking for are two vectors p_u and q_i whose dot product gives a rating close to the actual rating in the dataset.

Algorithm:

  1. Initialize p_u and q_i with some random values.
  2. Calculate the predicted rating as p_u · q_i and the error as the difference between the actual and the predicted rating.
  3. From the initial position, calculate the slope of the error function and move slowly downwards (the goal is to reach the global minimum from the initial value).
  4. Calculate the slope at every point until you reach the minimum (the point beyond which the error increases).
  5. The error is given by
E = sum over all rated (u, i) pairs of ( r_ui − p_u · q_i )²   (equation 1, the error function)

6. Now you want to find the slope at this point and move slightly downwards. For each observed rating, both factor vectors are updated in the direction opposite to the partial derivative of the error:

e_ui = r_ui − p_u · q_i
p_u ← p_u + γ · e_ui · q_i
q_i ← q_i + γ · e_ui · p_u

Here, the partial derivative term is the slope at that point, and gamma (γ) is the learning rate, which defines how large a step we take downwards.
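The whole SGD loop above can be sketched as follows; the toy rating matrix, learning rate, and epoch count are illustrative choices, not prescribed values:

```python
import numpy as np

np.random.seed(0)

# Toy user-item rating matrix (0 = not rated); values are made up
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
], dtype=float)

n_users, n_items = R.shape
n_factors = 2
gamma = 0.01      # learning rate
n_epochs = 200

P = np.random.rand(n_users, n_factors)   # user factor vectors p_u
Q = np.random.rand(n_items, n_factors)   # item factor vectors q_i

# SGD visits the observed (user, item, rating) triples one at a time
samples = [(u, i, R[u, i])
           for u in range(n_users) for i in range(n_items) if R[u, i] > 0]

initial_error = sum((r - P[u].dot(Q[i])) ** 2 for u, i, r in samples)

for _ in range(n_epochs):
    for u, i, r_ui in samples:
        e_ui = r_ui - P[u].dot(Q[i])    # error for this single rating
        p_u_old = P[u].copy()           # use the same p_u in both updates
        P[u] += gamma * e_ui * Q[i]     # step down the slope for p_u
        Q[i] += gamma * e_ui * p_u_old  # step down the slope for q_i

final_error = sum((r - P[u].dot(Q[i])) ** 2 for u, i, r in samples)
print(initial_error, final_error)
```

After a few hundred passes the summed squared error over the observed ratings is far smaller than at the random starting point.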

Alternating Least Squares

In stochastic gradient descent, we have to loop through the entire training set to find the optimal values of p_u and q_i, and this process is serial. Using alternating least squares, we can perform the calculation in parallel. Look again at equation 1 (the error function) above: it has two sets of unknowns, p_u and q_i. If we hold all the p_u constant, the equation becomes quadratic in q_i and we can solve for the q_i exactly. In the next iteration we hold the q_i constant and solve for the p_u. We keep alternating and solving until both p_u and q_i converge.
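A toy version of this alternating scheme is below. The rating matrix and factor count are made up, and I add a small regularization term lambda (a standard addition, not mentioned above) so each least-squares solve stays well-conditioned:

```python
import numpy as np

np.random.seed(0)

# Toy rating matrix (0 = not rated); every user and item has at least one rating
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 2, 1],
    [1, 1, 0, 5],
], dtype=float)
mask = R > 0

n_users, n_items = R.shape
k = 2          # number of hidden factors
lam = 0.1      # small regularizer, keeps each solve well-conditioned
n_iters = 20

P = np.random.rand(n_users, k)
Q = np.random.rand(n_items, k)

error0 = np.sum((R - P @ Q.T)[mask] ** 2)    # error before optimizing

for _ in range(n_iters):
    # Hold Q fixed: each p_u is now an ordinary least-squares problem,
    # and every user's solve is independent (hence parallelizable)
    for u in range(n_users):
        obs = mask[u]                             # items user u rated
        A = Q[obs].T @ Q[obs] + lam * np.eye(k)
        P[u] = np.linalg.solve(A, Q[obs].T @ R[u, obs])
    # Hold P fixed: solve for each q_i the same way
    for i in range(n_items):
        obs = mask[:, i]                          # users who rated item i
        A = P[obs].T @ P[obs] + lam * np.eye(k)
        Q[i] = np.linalg.solve(A, P[obs].T @ R[obs, i])

error = np.sum((R - P @ Q.T)[mask] ** 2)
print(error0, error)
```

Because every per-user and per-item solve within a half-step is independent of the others, each inner loop can be distributed across workers, which is the main practical advantage over the serial SGD pass.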

The use of matrix factorization in recommendation systems was a big improvement. It rose to prominence during the Netflix Prize competition, a million-dollar contest that spurred a lot of improvements in recommendation systems. You can read more about the competition here. Besides matrix factorization, the other things that made a good impact on recommendation systems were:

  1. Normalizing for user biases (some people tend to rate all products high, while others rate products low even when they like them, so we need some kind of offsets to adjust the ratings) and for temporal effects (a user’s interests and tastes change over time; for example, I used to listen to rock songs a lot as a kid, but now I generally don’t).
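The bias part can be sketched quickly. The standard way to model it is a baseline prediction mu + b_u + b_i (global mean plus user and item offsets), so the factor vectors only have to explain what is left over; the ratings below are made up:

```python
import numpy as np

# Toy ratings (0 = not rated); values are made up
R = np.array([
    [5, 4, 0, 5],   # an enthusiastic rater
    [2, 1, 1, 0],   # a harsh rater
    [3, 0, 2, 4],
], dtype=float)
mask = R > 0

mu = R[mask].mean()   # global average rating

# User bias: how far each user's average sits above or below the global mean
b_u = np.array([R[u, mask[u]].mean() - mu for u in range(R.shape[0])])

# Item bias: how far each item's average sits above or below the global mean
b_i = np.array([R[mask[:, i], i].mean() - mu for i in range(R.shape[1])])

# Baseline prediction for user 1 on item 3, before any factor vectors
baseline = mu + b_u[1] + b_i[3]
print(mu, b_u, b_i, baseline)
```

Note that the harsh rater gets a negative b_u, pulling all of their baseline predictions down, which is exactly the normalization described above.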

I will discuss recommendation systems in more detail in my next articles. If you like this post, don’t forget to clap and to follow me on Medium and Twitter.


Written by Rabin Poudyal

Software Engineer, Data Science Practitioner. Say "Hi!" via email: rabinpoudyal1995@gmail.com or visit my website https://rabinpoudyal.com.np
