BigData White Papers

I don't know about you, but I always like to read the white papers that open source projects grew out of (when available, of course :) ).

I have been working with BigData quite a lot lately, and this area is mostly dominated by Apache open source projects.

So, naturally (given the nerd that I am), I tried to investigate their history. I put together a list of the papers and companies behind most of the BigData Apache projects.

Here it is! Hope you guys find it interesting too. :)



Apache Hadoop 

Based on: Google MapReduce and GFS 
Papers:


Apache Spark 

Created by: University of California, Berkeley 
Papers: 



Apache Hive 

Created by: Facebook
Papers: 




Apache Impala 

Based on: Google F1
Papers:


Apache HBase

Based on: Google BigTable
Papers:


Apache Drill 

Based on: Google Dremel
Papers: 


Apache Pig 

Created by: Yahoo!
Papers: 


Apache Oozie 

Created by: Yahoo!
Papers: 


Apache Sqoop 

Started as a module for Apache Hadoop, contributed by Aaron Kimball in issue https://issues.apache.org/jira/browse/HADOOP-5815.
Links:


Apache Flume

Links:
