Content-based filtering is one of the simpler approaches to recommending products or content to a user. The idea is that if a user signals interest in a product by clicking on it, rating it highly, searching for it, or browsing it, then there is a high chance they would buy a similar product. The approach is therefore to recommend products whose attributes or descriptive characteristics, such as brand, color, or size, are similar to those of the products the user already likes. Content-based filtering is commonly used for text documents, articles, and similar content.
The steps for recommending products or content to a user with content-based filtering are as follows:
- Identify the factors that describe and differentiate the products and that might be influential in whether a user would buy a product or not,
- Represent all the products in terms of those factors (also called descriptors or attributes),
- Create a tuple or numeric vector for each product that represents the strength of each factor for that product,
- Look at each user's history and create a user profile from it. The profile has the same factors as the products, and the strength of each factor indicates how strongly the user is drawn to it,
- Recommend to the user those products that are nearest to their profile in terms of those factors.
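The steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a production recommender: the catalogue, the three factors, their values, and the function names are all hypothetical, and the user profile is assumed to be a plain average of the vectors of the products the user interacted with. Euclidean distance is used here as one possible notion of "nearest".

```python
import math

# Hypothetical catalogue: each product is a vector of factor strengths
# in the order (premium_brand, colorful, large_size), each in [0, 1].
products = {
    "shirt_a": (0.9, 0.2, 0.5),
    "shirt_b": (0.1, 0.8, 0.4),
    "shirt_c": (0.8, 0.3, 0.6),
    "shirt_d": (0.2, 0.9, 0.1),
}


def build_user_profile(liked_ids, catalogue):
    """Average the factor vectors of the products the user interacted with."""
    vectors = [catalogue[pid] for pid in liked_ids]
    count = len(vectors)
    return tuple(sum(values) / count for values in zip(*vectors))


def euclidean_distance(u, v):
    """Straight-line distance between two points in the factor space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def recommend(profile, catalogue, exclude=(), k=2):
    """Return the k products closest to the profile, skipping excluded ids."""
    candidates = [pid for pid in catalogue if pid not in exclude]
    candidates.sort(key=lambda pid: euclidean_distance(profile, catalogue[pid]))
    return candidates[:k]


history = ["shirt_a"]  # assumed interaction history for one user
profile = build_user_profile(history, products)
recommendations = recommend(profile, products, exclude=history, k=2)
print(recommendations)
```

With this made-up data, the product whose factor vector is closest to the one the user liked is ranked first. Other distance or similarity measures, such as cosine similarity, could be swapped in without changing the overall structure.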
Now let’s take an example of recommending movies with content-based filtering. First we identify factors that might be relevant when recommending a movie. Two such factors could be Drama/Comedy and Commercial/Arty. We then create a scale for each factor and map every movie onto those scales.
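For example, let’s say Rabin is a user of our recommendation system. Rabin has been shown 10 movies, and he clicks on 7 that are commercial and 9 that are dramas (a single movie can count toward both). Each factor is scored on a scale from -1 to 1. On the Drama/Comedy scale, an extremely dramatic movie scores -1 and an extremely comedic one scores 1. On the Commercial/Arty scale, an extremely arty movie scores -1 and an extremely commercial one scores 1. Taking the share of his clicks as a rough measure of strength, Rabin’s position in this space is about (-0.9, 0.7): strongly toward drama on the first scale and toward commercial on the second.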
Now suppose movie A and movie B are also dramatic and commercial, which means they sit close to Rabin’s position in this space. It therefore makes sense to recommend movies A and B to Rabin. This is the idea behind content-based filtering.
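As a rough sketch of this example, the snippet below places Rabin at (-0.9, 0.7) and ranks a handful of movies by their distance to him. The coordinates given to movies A and B, and the extra movies C and D, are assumptions made up for illustration; the text only says that A and B are dramatic and commercial.

```python
import math

# Axes: (drama/comedy, commercial/arty).
# First axis:  -1 = extremely dramatic, 1 = extremely comedic.
# Second axis: -1 = extremely arty,     1 = extremely commercial.
rabin = (-0.9, 0.7)

# Hypothetical movie positions; none of these numbers come from real data.
movies = {
    "A": (-0.8, 0.6),   # dramatic and commercial
    "B": (-0.7, 0.9),   # dramatic and commercial
    "C": (0.8, -0.5),   # comedic and arty
    "D": (0.6, 0.7),    # comedic and commercial
}

# Rank movies from nearest to farthest from Rabin's position.
ranked = sorted(movies, key=lambda title: math.dist(rabin, movies[title]))
top_two = ranked[:2]
print(top_two)
```

Under these assumed positions, A and B come out nearest to Rabin, which matches the intuition in the example: movies that share his preferred mix of drama and commercial appeal are recommended first.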
Weak points of content-based filtering:
- Mapping products and users onto the factor space is a manual process, so producing results can be very time consuming,
- Factors are picked manually rather than learned by an algorithm, so the chosen factors may not represent the products’ character well.
Pure content-based filtering is not very popular on its own, but it is combined with other forms of recommendation to build hybrid, more powerful recommendation systems. It is also useful for new users: when a user first signs up to the recommendation app, we can ask about their interests. Their answers tell us roughly where they fall in the factor space, so we can make reasonable recommendations before any history exists.
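One way to picture the sign-up idea is to turn a short onboarding questionnaire into a starting position on the same two movie scales used earlier. The questions, answer options, and scores below are hypothetical, and treating a skipped question as neutral (0.0) is an assumption of this sketch.

```python
# Hypothetical mapping from onboarding answers to positions on each scale.
# "genre": -1 = drama, 1 = comedy; "style": -1 = arty, 1 = commercial.
ANSWER_SCORES = {
    "genre": {"drama": -1.0, "no preference": 0.0, "comedy": 1.0},
    "style": {"arty": -1.0, "no preference": 0.0, "commercial": 1.0},
}

QUESTION_ORDER = ("genre", "style")


def initial_profile(answers):
    """Build a starting user profile from sign-up answers.

    Skipped or unrecognised answers are treated as neutral (0.0).
    """
    profile = []
    for question in QUESTION_ORDER:
        answer = answers.get(question, "no preference")
        profile.append(ANSWER_SCORES[question].get(answer, 0.0))
    return tuple(profile)


new_user = initial_profile({"genre": "drama", "style": "commercial"})
partial_user = initial_profile({"genre": "comedy"})  # skipped the style question
print(new_user)
print(partial_user)
```

A profile built this way can then be compared against product vectors exactly like a profile built from click history, and it can be replaced or refined once real interactions accumulate.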