The development of Hadoop and its ecosystem was greatly influenced by Google. Data engineers at Google recognized and took on the challenge of taming the explosion of big data as early as the late 1990s and early 2000s. As a company on the path to becoming synonymous with global search, they were not only trying to tame this tsunami of big data, they were also aiming for an even bigger goal: doing it on COTS (commercial off-the-shelf) machines to keep the cost as low as practically possible. We owe thanks to the Google engineers for showing us the path for the future of information technology. The table below shows how each Google publication influenced a corresponding Hadoop project.
Google Publication | Hadoop | Characteristics |
GFS & MapReduce (2004) | HDFS & MapReduce (2006) | Batch Programs |
Sawzall (2005) | Pig & Hive (2008) | Batch Queries |
BigTable (2006) | HBase (2008) | Online key/value |
Dremel (2010) | Impala (2012) | Online Queries |
Spanner (2012) | ???? | Transactions, Etc. |