Google Gives Us A Big Data Map

Google greatly influenced the progress of Hadoop and the Hadoop ecosystem. Data engineers at Google recognized and accepted the challenge of taming the eruption of big data as early as the late nineties and early 2000s. As a company on the path to becoming synonymous with global search, they were not only trying to tame this tsunami of big data,

Read More…

Apache Hadoop Installation and Cluster Setup: Part-3

Let's start with HadoopNameNode (the master), then repeat the same steps for the SNN (secondary NameNode) and the 2 slaves. Connect to HadoopNameNode through PuTTY and run the commands below.

Update the packages and dependencies

$ sudo apt-get update
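Since the same update has to be repeated on the SNN and the 2 slaves, a small loop can save some typing. The following is only a sketch under assumptions: the host aliases other than HadoopNameNode and the `ubuntu` login user are hypothetical placeholders, and it presumes SSH access to every node is already configured. By default it only prints the commands; set `RUN=1` to actually execute them.

```shell
#!/usr/bin/env bash
# Hypothetical host aliases: replace them with your own instance names or addresses.
NODES="HadoopNameNode HadoopSecondaryNameNode HadoopSlave1 HadoopSlave2"

# Print the update command for each node (dry run), or run it over SSH when RUN=1.
# The "ubuntu" user is an assumption; use whatever login your instances expect.
update_all_nodes() {
  for node in $NODES; do
    if [ "${RUN:-0}" = "1" ]; then
      ssh "ubuntu@${node}" "sudo apt-get update"
    else
      echo "ssh ubuntu@${node} 'sudo apt-get update'"
    fi
  done
}

update_all_nodes
```

Running it once as a dry run first makes it easy to check the host list before touching any machine.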

Install Oracle Java

Install the latest Oracle Java (JDK) 7 on Ubuntu

Read More…

Setting up Hadoop Cluster on Amazon Cloud

I wanted to get familiar with the big data world, so I decided to test Hadoop on Amazon Cloud. It was an interesting and informative experience. This blog shares my experience, thoughts, and observations on both practical and non-practical uses of Apache Hadoop.

Overview

A typical Hadoop multi-node cluster

Read More…