
Showing posts with the label Apache Mahout

BigData White Papers

I don't know about you, but I always like to read the white papers that gave rise to open source projects (when available, of course :) ). I have been working with BigData quite a lot lately, and this area is mostly dominated by Apache open source projects. So, naturally (given the nerd that I am), I tried to investigate their history. I put together a list of the papers and organizations behind most BigData Apache projects. Here it is! Hope you guys find it interesting too. :)

Apache Hadoop
Based on: Google MapReduce and GFS
Papers:
https://static.googleusercontent.com/media/research.google.com/en//archive/mapreduce-osdi04.pdf
https://static.googleusercontent.com/media/research.google.com/en//archive/gfs-sosp2003.pdf

Apache Spark
Created by: University of California, Berkeley
Papers:
http://people.csail.mit.edu/matei/papers/2012/nsdi_spark.pdf
http://people.csail.mit.edu/matei/papers/2010/hotcloud_spark.pdf
http://peo...

Dummy Mahout Recommender System Example

I already talked about the open source Apache Mahout here, and now I'll show a first dummy example of how to use its recommender system. It is a basic Java example that I used to try out Mahout. Hope it helps people who are starting to work with it.

package myexample;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.model.XmlFile;
import org.apache.mahout.cf.taste.impl.recommender.CachingRecommender;
import org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.LogLikelihoodSimilarity;
import org.apache.mahout.cf.taste.impl.recommender.slopeone.SlopeOneRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.similarity.ItemSimilarity;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;
import org.apache.mahout.cf.taste....

Open Source Recommendation Systems Survey

Here follows a survey I did back in 2010 when I was studying Recommender Systems. Hope it is useful. The growth of web content and the expansion of e-commerce have greatly increased interest in recommender systems. This has led to the development of several open source projects in the area. Among the recommender system implementations available on the web, the following stand out: Duine, Apache Mahout, OpenSlopeOne, Cofi, SUGGEST and Vogoo. All of these projects offer collaborative filtering implementations, in different programming languages. The Duine Framework also supplies a hybrid implementation. It is Java software that combines content-based and collaborative filtering in a switching engine: it dynamically chooses between the two prediction techniques based on the current state of the data. For example, if there aren't many ratings available, it uses the content-based approach, and it switches to collaborative filtering when the scenario changes. ...
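To make the switching idea a bit more concrete, here is a minimal Java sketch. It is not Duine's code or API: the Predictor interface, the MIN_RATINGS_FOR_CF threshold and the predict method are all hypothetical names I made up for illustration, and a real engine would probably look at more than a simple rating count.

```java
public class SwitchingHybridSketch {

    // Hypothetical threshold (an assumption, not a Duine setting):
    // with fewer ratings than this, collaborative filtering is considered unreliable.
    static final int MIN_RATINGS_FOR_CF = 5;

    // Hypothetical abstraction over any rating predictor.
    interface Predictor {
        double predict(long userId, long itemId);
    }

    // Picks the content-based predictor while ratings are scarce,
    // and the collaborative one once enough ratings are available.
    static double predict(int ratingCount,
                          Predictor contentBased,
                          Predictor collaborative,
                          long userId,
                          long itemId) {
        Predictor chosen = ratingCount < MIN_RATINGS_FOR_CF ? contentBased : collaborative;
        return chosen.predict(userId, itemId);
    }

    public static void main(String[] args) {
        // Stub predictors returning fixed values, just to show which one gets picked.
        Predictor contentBased = (userId, itemId) -> 3.0;
        Predictor collaborative = (userId, itemId) -> 4.5;

        System.out.println(predict(2, contentBased, collaborative, 1L, 10L));
        System.out.println(predict(20, contentBased, collaborative, 1L, 10L));
    }
}
```

With only 2 ratings the sketch falls back to the content-based stub, and with 20 ratings it uses the collaborative one.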

Articles about Recommender Systems, Mahout and Hadoop Framework

Seeing that Recommender Systems have drawn a lot of attention in this past year, I would like to recommend further reading to those who want to learn more about the subject. Here are some articles that helped me study the matter:

G. Adomavicius and A. Tuzhilin. Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions. 2005
This article introduces Recommender Systems very well. It explains the three main types of these systems (Content-Based, Collaborative Filtering and Hybrid Recommendation). It also gives a formal mathematical definition of a Recommender System, which some readers will find very helpful. I strongly recommend any other article by Adomavicius that you may find.

Laurent Candillier, Frank Meyer, Kris Jack, Françoise Fessant. State-of-the-Art Recommender Systems.
This paper also provides a great overview of Recommender Systems and a very interesting comparison between...

Apache Mahout

“Scalable machine learning library”
Mahout is a solid Java framework in the Artificial Intelligence area. It is a machine learning project by the Apache Software Foundation that aims to build intelligent algorithms that learn from input data. What is special about Mahout is that it is a scalable library, prepared to deal with huge datasets. Its algorithms are built on top of the Apache Hadoop project, so they work with distributed computing. Mahout offers algorithms in three major areas: Clustering, Categorization and Recommender Systems. This last area was incorporated on April 4th, 2008, from the previous Taste Recommender System project. Mahout currently implements a collaborative filtering engine that supports user-based, item-based and Slope One recommenders. Other algorithms available in the package include k-Means, fuzzy k-Means, Canopy, Dirichlet and Mean-Shift clustering. It also offers Naive Bayes, Complement...
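For readers who have never seen Slope One, here is a small plain-Java sketch of the weighted variant. It is only an illustration and not Mahout's implementation: the class, the predict method and the sample ratings are all made up for this example, and everything runs in memory on a tiny dataset.

```java
import java.util.HashMap;
import java.util.Map;

public class SlopeOneSketch {

    // Predicts the rating of `user` for `target` with weighted Slope One.
    // ratings maps user -> (item -> rating). Returns NaN when no prediction is possible.
    static double predict(Map<String, Map<String, Double>> ratings, String user, String target) {
        Map<String, Double> userRatings = ratings.get(user);
        if (userRatings == null) {
            return Double.NaN;
        }
        double weightedSum = 0.0;
        int totalCount = 0;
        for (Map.Entry<String, Double> entry : userRatings.entrySet()) {
            String item = entry.getKey();
            if (item.equals(target)) {
                continue;
            }
            // Average difference (target - item) over users who rated both items.
            double diffSum = 0.0;
            int count = 0;
            for (Map<String, Double> other : ratings.values()) {
                Double targetRating = other.get(target);
                Double itemRating = other.get(item);
                if (targetRating != null && itemRating != null) {
                    diffSum += targetRating - itemRating;
                    count++;
                }
            }
            if (count == 0) {
                continue;
            }
            // Weight each per-item estimate by the number of co-rating users.
            weightedSum += (diffSum / count + entry.getValue()) * count;
            totalCount += count;
        }
        return totalCount == 0 ? Double.NaN : weightedSum / totalCount;
    }

    // Made-up sample data, used only for illustration.
    static Map<String, Map<String, Double>> sampleRatings() {
        Map<String, Map<String, Double>> ratings = new HashMap<>();
        Map<String, Double> john = new HashMap<>();
        john.put("A", 5.0);
        john.put("B", 3.0);
        john.put("C", 2.0);
        Map<String, Double> mark = new HashMap<>();
        mark.put("A", 3.0);
        mark.put("B", 4.0);
        Map<String, Double> lucy = new HashMap<>();
        lucy.put("B", 2.0);
        lucy.put("C", 5.0);
        ratings.put("John", john);
        ratings.put("Mark", mark);
        ratings.put("Lucy", lucy);
        return ratings;
    }

    public static void main(String[] args) {
        // Lucy has not rated item A, so we estimate it from her ratings of B and C.
        System.out.println(predict(sampleRatings(), "Lucy", "A"));
    }
}
```

In the sample data, item A is on average 0.5 above B (two co-rating users) and 3 above C (one co-rating user), so Lucy's estimate for A is (2 * 2.5 + 1 * 8) / 3, roughly 4.33.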