Wednesday, January 25, 2012

Articles about Recommender Systems, Mahout and Hadoop Framework

Seeing that Recommender Systems have drawn a lot of attention over the past year, I would like to recommend some further reading for those who want to learn more about the subject.
Here are some articles that have helped me study the matter:

This article, written by Adomavicius, introduces Recommender Systems very well. It explains the three main types of these systems (Content-Based, Collaborative Filtering and Hybrid Recommendation). It also gives a formal mathematical definition of a Recommender System, which some readers will find very useful. I highly recommend any other article by Adomavicius that you may find.

This paper also provides a great overview of Recommender Systems and a very interesting comparison between the Collaborative Filtering and Content-Based approaches.

This article explains the Recommender System approach developed by Amazon. The paper describes how Amazon developed a Collaborative Filtering method that performs better in practice than other Collaborative Filtering methods. It uses the Item-Item approach: instead of comparing the user vectors (rows) of the rating matrix, it compares the item vectors (columns). This approach is explained in more detail in another post on this blog.
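To make the Item-Item idea concrete, here is a minimal sketch in Python. It is not Amazon's implementation: the rating matrix is made-up toy data, the function names are my own, and cosine similarity is assumed as the measure for comparing item vectors (it is one common choice).

```python
from math import sqrt

# Hypothetical toy rating matrix: one row per user, one column per item.
# A 0 means "not rated". The values are invented for illustration only.
ratings = [
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
]


def item_vector(matrix, item):
    """Return the column holding every user's rating for one item."""
    return [row[item] for row in matrix]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_items(matrix, item):
    """Rank all other items by similarity to `item`, most similar first."""
    target = item_vector(matrix, item)
    scores = [
        (other, cosine(target, item_vector(matrix, other)))
        for other in range(len(matrix[0]))
        if other != item
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for other, score in most_similar_items(ratings, 0):
        print(f"item {other}: similarity {score:.3f}")
```

Note that the comparison runs over columns (items), not rows (users). A user-based method would apply the same similarity function to the rows instead.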

If you want to learn more about Apache Hadoop, this is a good place to start. The official Hadoop website is, of course, also a great reference.

I can't help but mention my own articles on the matter. The first one compares several Open Source Recommender Systems, and the second explains how I tried to build a distributed Recommender System using Hadoop.

This book remains one of the best references on Artificial Intelligence in general. It does not discuss Recommender Systems, but it is still worth reading. It starts by defining Artificial Intelligence as "Systems that act Rationally" and goes over the history of AI.
It covers most of the main AI algorithms, including the famous Hill-Climbing, Simulated Annealing, BFS and DFS. It also covers Machine Learning topics, such as Support Vector Machines and K-means.
I definitely recommend it.