Saturday, July 27, 2013

Is there such a thing as a "best" Recommender System algorithm?

I have received emails from readers asking which recommender system algorithm they should use. People usually start by looking for articles on which approach performs better, and once they find something convincing they start implementing it.

I believe that the best recommender system depends on the data and the problem you have to deal with.

With that in mind, I decided to publish here some pros and cons for each recommender type (collaborative, content-based and hybrid), so people can decide for themselves which algorithms best suit their needs.

I've already presented these approaches on this blog, so if you are new to recommender systems, you can read that introduction first.

Collaborative Filtering

Pros

  • It recommends diverse items, so users can discover things they would not have found on their own;
  • It gives good results in practice (read Amazon's article);
  • It is widely used, and you can find several open source implementations of it (Apache Mahout);
  • It only needs the ratings that users give to items;
  • It can deal with video and audio data, since it does not look at the content of the items;

Cons

  • It suffers from data sparsity: if you don't have many ratings, for example, you might end up with bad results;
  • When the number of ratings grows, scalability becomes an issue: it might be hard to calculate the similarity between all users;
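
To make the idea concrete, here is a minimal sketch of user-based collaborative filtering in plain Python. The ratings dictionary, the user and item names, and the `recommend` helper are all made up for illustration, and cosine similarity over co-rated items is just one possible similarity measure (a mean-centered one such as Pearson correlation is often preferred). Note that the sketch compares the target user with every other user, which is exactly where the scalability issue above comes from.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating on a 1-5 scale}
ratings = {
    "ana":   {"matrix": 5, "inception": 4, "titanic": 1},
    "bruno": {"matrix": 4, "inception": 5, "alien": 5},
    "carla": {"titanic": 5, "notebook": 4, "matrix": 1},
}

def cosine_similarity(a, b):
    """Cosine similarity between two users, using only the items both rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0  # no overlap: with sparse data this happens a lot
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)

def recommend(user, ratings, top_n=3):
    """Rank items the user has not rated by similarity-weighted average rating."""
    totals, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        if sim <= 0:
            continue
        for item, rating in other_ratings.items():
            if item in ratings[user]:
                continue  # skip items the user already rated
            totals[item] = totals.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    scores = {item: totals[item] / weights[item] for item in totals}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

print(recommend("ana", ratings))
```

With this toy data, "alien" ranks first for "ana" because "bruno", the most similar user, rated it highly. No item description was needed, only ratings.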


Content Based Filtering

Pros

  • It works better than Collaborative Filtering when only a small amount of information is available;
  • It uses item descriptions, so it works well with tagged items, and it usually matches the user's preference profile well;

Cons

  • It doesn't work so well for video or audio data with no text tags;
  • It frequently recommends repetitive items, sticking to things similar to what the user has already seen;
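
As a rough illustration, a content-based recommender over tagged items can be sketched as follows. The catalogue, the tags, and the helper names are hypothetical, and Jaccard similarity between tag sets is only one simple way to compare a user profile with an item (TF-IDF vectors with cosine similarity are a common alternative for longer text descriptions).

```python
# Hypothetical catalogue: item -> set of descriptive tags
item_tags = {
    "matrix":    {"sci-fi", "action", "dystopia"},
    "inception": {"sci-fi", "thriller", "heist"},
    "alien":     {"sci-fi", "horror", "space"},
    "notebook":  {"romance", "drama"},
}

def build_profile(liked_items, item_tags):
    """User profile: the union of the tags of every item the user liked."""
    profile = set()
    for item in liked_items:
        profile |= item_tags[item]
    return profile

def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend_by_content(liked_items, item_tags, top_n=3):
    """Rank unseen items by how much their tags overlap with the user profile."""
    profile = build_profile(liked_items, item_tags)
    scores = {
        item: jaccard(profile, tags)
        for item, tags in item_tags.items()
        if item not in liked_items
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

print(recommend_by_content(["matrix"], item_tags))
```

A user who liked only "matrix" gets the other sci-fi titles ranked above "notebook", which shares no tags with the profile. This shows both sides of the approach: it works from a single liked item, but it keeps the user inside what they already know.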

Hybrid Systems

Pros

  • It is usually the most effective approach (more accurate results);
  • It overcomes the limitations of each single approach;

Cons

  • It is hard to find the right balance when combining the two approaches;
  • Challenging to implement;
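
One simple way to combine the two approaches is a weighted blend of their scores. The sketch below is only one of many possible hybrid designs; the input scores, the `alpha` weight, and the function names are assumptions made for illustration. Choosing `alpha` is the balancing problem mentioned above, and in practice it would have to be tuned on your own data.

```python
def min_max_normalize(scores):
    """Rescale scores to [0, 1] so both recommenders are on a comparable scale."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {item: 1.0 for item in scores}
    return {item: (s - lo) / (hi - lo) for item, s in scores.items()}

def hybrid_scores(collab, content, alpha=0.5):
    """Weighted blend: alpha * collaborative + (1 - alpha) * content-based.

    Items missing from one recommender contribute 0 for that part.
    """
    collab_n = min_max_normalize(collab)
    content_n = min_max_normalize(content)
    items = set(collab_n) | set(content_n)
    return {
        item: alpha * collab_n.get(item, 0.0) + (1 - alpha) * content_n.get(item, 0.0)
        for item in items
    }

# Hypothetical scores produced by two separate recommenders for the same user
collab = {"alien": 5.0, "notebook": 4.0}                     # predicted ratings
content = {"inception": 0.2, "alien": 0.2, "notebook": 0.0}  # tag similarity

blended = hybrid_scores(collab, content, alpha=0.7)
ranking = sorted(blended, key=blended.get, reverse=True)
print(ranking)
```

Here "inception" still gets a score even though the collaborative recommender had no ratings for it, which is how a hybrid can cover for the weak spots of each single approach.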

