Monday, October 18, 2010

Recommender Systems

"Suggest new items that fit the user’s preference."
 

Introduction

The growing amount of information on the web has driven advances in recommender systems research.
These systems help users by offering useful suggestions. Their aim is to provide personalized recommendations, and they play a fundamental role in e-commerce (they are widely used by companies such as Amazon, Netflix and Google).
They highlight items that users have not yet seen and may appreciate. Such items include books, restaurants, web pages or even lifestyles. A suggestion is usually based on the user's historical preferences.
These preferences may be collected implicitly or explicitly. A user who buys an item or visits a web page, for example, is giving implicit feedback. A user who rates an article is giving explicit feedback.
A substantial challenge in this area is the volume of information available. A recommendation algorithm may have to deal with enormous data sets, so scalability and effectiveness have to be taken into account.
These systems are usually classified by how recommendations are made. The most common categories are Collaborative Filtering, Content-Based Filtering and Hybrid Filtering.

Recommender Systems Classification 

  • Content-Based Filtering
"If a user has liked the movie "Titanic", a typical recommendation would be "Ghost", because both items have the feature "Romance category"."

This approach relates the user's past preferences to content information. It identifies objects of interest to the user based on the features associated with items.
The content-based technique builds a user profile from the user's preferences on items. This profile is then compared with information about other items. The most similar items, namely those whose descriptions are closest to the profile, are then recommended to the user.
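
The profile-and-compare idea can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the catalogue, the feature tags and the function names are invented for the example, and cosine similarity is one common choice of similarity measure, not the only one.

```python
import math
from collections import Counter

# Hypothetical catalogue: each item is described by a set of feature tags.
ITEMS = {
    "Titanic": {"romance", "drama"},
    "Ghost": {"romance", "fantasy"},
    "Alien": {"sci-fi", "horror"},
}

def build_profile(liked_items, items=ITEMS):
    """Sum the features of the liked items into a weighted user profile."""
    profile = Counter()
    for name in liked_items:
        profile.update(items[name])
    return profile

def cosine(profile, features):
    """Cosine similarity between a weighted profile and a binary feature set."""
    dot = sum(profile[f] for f in features)
    norm_profile = math.sqrt(sum(w * w for w in profile.values()))
    norm_item = math.sqrt(len(features))
    if norm_profile == 0 or norm_item == 0:
        return 0.0
    return dot / (norm_profile * norm_item)

def recommend_content_based(liked_items, items=ITEMS, top_n=1):
    """Rank unseen items by similarity to the profile; drop zero-similarity items."""
    profile = build_profile(liked_items, items)
    candidates = [
        (cosine(profile, feats), name)
        for name, feats in items.items()
        if name not in liked_items
    ]
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in candidates[:top_n] if score > 0]

print(recommend_content_based(["Titanic"]))
```

With this toy catalogue, a user who liked "Titanic" shares the "romance" feature with "Ghost" and nothing with "Alien", so "Ghost" is ranked first.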

  • Collaborative Filtering
"If a user A has liked the movies "Matrix" and "The Lord of the Rings", and many other users who liked these two movies also liked "Memento", then "Memento" is likely to be recommended to user A."

The second technique recommends an item by comparing a user with a neighborhood of users. In collaborative filtering, we start with the set of ratings a user has given to items. The method then identifies a group of other users and compares the first user's ratings with those of the group. Items that the group liked and the user has not yet seen are then recommended.
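
A user-based neighborhood approach can be sketched as follows. The ratings, the user names, the neighborhood size `k` and the function names are all assumptions made for the example; cosine similarity and a similarity-weighted average are common choices here, but real systems differ in how they pick neighbors and predict ratings.

```python
import math

# Hypothetical ratings on a 1-5 scale: user -> {item: rating}.
RATINGS = {
    "A": {"Matrix": 5, "The Lord of the Rings": 4},
    "B": {"Matrix": 5, "The Lord of the Rings": 5, "Memento": 5},
    "C": {"Matrix": 4, "The Lord of the Rings": 4, "Memento": 4},
    "D": {"Matrix": 1, "Titanic": 5},
}

def similarity(ratings_a, ratings_b):
    """Cosine similarity between two users, treating unrated items as zeros."""
    shared = set(ratings_a) & set(ratings_b)
    if not shared:
        return 0.0
    dot = sum(ratings_a[i] * ratings_b[i] for i in shared)
    norm_a = math.sqrt(sum(r * r for r in ratings_a.values()))
    norm_b = math.sqrt(sum(r * r for r in ratings_b.values()))
    return dot / (norm_a * norm_b)

def recommend_collaborative(user, ratings=RATINGS, k=2, top_n=1):
    """Predict ratings for unseen items from the k most similar users."""
    target = ratings[user]
    # Keep the k users whose rating vectors are closest to the target's.
    neighbours = sorted(
        ((similarity(target, other), name)
         for name, other in ratings.items() if name != user),
        reverse=True,
    )[:k]
    scores, weights = {}, {}
    for sim, name in neighbours:
        if sim <= 0:
            continue
        for item, rating in ratings[name].items():
            if item not in target:
                scores[item] = scores.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    # Similarity-weighted average of the neighbours' ratings.
    predicted = {item: scores[item] / weights[item] for item in scores}
    ranked = sorted(predicted, key=lambda item: (-predicted[item], item))
    return ranked[:top_n]

print(recommend_collaborative("A"))
```

In this toy data, users B and C rate the same movies as A in a similar way, so they form A's neighborhood and their shared liking for "Memento" drives the recommendation. Note that no information about the movies themselves is used, only the ratings.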

  • Hybrid Filtering
These two major recommender techniques each have their strengths and weaknesses. Collaborative filtering doesn't need much information about the items. However, it runs into problems when a new item is added to the system, since no one has rated it yet. Although the content-based method does not have this new-item problem, its recommendations can lack originality. The recommended items are very similar to the ones the user has already seen, leading to uncreative suggestions. In addition, this method works well for items with good textual descriptions, but it can fail with other types of multimedia items.
A hybrid approach is then proposed to overcome these problems. Hybrid Filtering combines the two methods above to avoid the weaknesses of each. In "Hybrid Recommender Systems: Survey and Experiments", Robin Burke presents many options for combining the two approaches into a hybrid method.
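
One simple way to combine the two approaches is a weighted blend of their scores. The sketch below is only an illustration of that single option: the function names, the `alpha` weight and the sample scores are assumptions, and both score sets are assumed to be already normalized to the range 0 to 1.

```python
def hybrid_scores(content_scores, collaborative_scores, alpha=0.5):
    """Blend two score dictionaries; alpha is the weight of the content-based score."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    items = set(content_scores) | set(collaborative_scores)
    # An item missing from one source contributes 0 from that source.
    return {
        item: alpha * content_scores.get(item, 0.0)
        + (1.0 - alpha) * collaborative_scores.get(item, 0.0)
        for item in items
    }

def rank(scores):
    """Order items from highest to lowest score, breaking ties by name."""
    return sorted(scores, key=lambda item: (-scores[item], item))

# Hypothetical scores: "New Release" has no ratings yet, so only the
# content-based side knows about it.
content = {"Ghost": 0.6, "New Release": 0.8}
collaborative = {"Memento": 0.9, "Ghost": 0.6}

blended = hybrid_scores(content, collaborative, alpha=0.5)
print(rank(blended))
```

Here the unrated "New Release" still receives a score from its content description, while "Memento" still surfaces through other users' ratings even though its description is not similar to the user's profile.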



Where can I get a Recommender System?

If you want your own recommender system, you have two options: you can either build one from scratch or download a package from the internet and adapt it to your needs.
If you are planning to build your own, there are many good articles available on the subject. I suggest that you first understand your own dataset, find out which approach works better for you, and then develop it. As test data, you can use the MovieLens package (from GroupLens Research), a set of ratings that users have given to movies.
If, however, you can't spend that much time and effort on the subject, you can download an open source project from the internet. I recommend the Apache Mahout project, built in Java.
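
For the build-your-own route, a first step is usually loading the ratings into memory. The sketch below assumes a tab-separated layout of user id, item id, rating and timestamp per line, as used by the MovieLens 100K `u.data` file; other MovieLens releases use different separators and file names, so check the README of the package you download. The function name and the sample rows are made up for illustration.

```python
from collections import defaultdict

def parse_ratings(lines, sep="\t"):
    """Turn 'user, item, rating, timestamp' lines into {user: {item: rating}}."""
    ratings = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        user_id, item_id, rating, _timestamp = line.split(sep)
        ratings[user_id][item_id] = float(rating)
    return dict(ratings)

# Illustrative rows in the assumed layout, not a real extract of the dataset.
SAMPLE = [
    "1\t10\t4\t880000000",
    "1\t20\t5\t880000100",
    "2\t10\t3\t880000200",
]

ratings = parse_ratings(SAMPLE)
print(ratings)
```

Since the function accepts any iterable of lines, an open file object can be passed in place of `SAMPLE` once the dataset has been downloaded.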


Some useful links and articles

Articles:
Greg Linden, Brent Smith, and Jeremy York. Amazon.com recommendations: Item-to-item collaborative filtering.

Robin Burke. Hybrid recommender systems: Survey and experiments. User Modeling and User-Adapted Interaction.

Badrul M. Sarwar, George Karypis, Joseph A. Konstan, and John Riedl. Item-based collaborative filtering recommendation algorithms.

Links:
    http://mahout.apache.org/

    http://www.grouplens.org/

    http://ict.ewi.tudelft.nl/~jun/CollaborativeFiltering.html

    http://glaros.dtc.umn.edu/gkhome/

    1 comment:

    1. Very interesting, I would love if you worked with me in a project. A recommender system would help to bust the application.

      King Regards!
