The progress of Hadoop (and the Hadoop ecosystem) was greatly influenced by Google. The data engineers at Google recognized and accepted the challenge of taming this eruption of big data as early as the late nineties and early 2000s. As a company on the path to becoming synonymous with global search, they were not only trying to tame this tsunami of big data,
The Hadoop Ecosystem Table
A list of the major projects and tools surrounding Hadoop, grouped by category, that together make up an Enterprise Data Platform. The ecosystem is growing at a rapid pace, keeping in mind the three Vs of Big Data: Volume (Big), Velocity (Fast) and Variety (Smart). Find the table below:
Installing Cloudera Manager and CDH on Amazon EC2: Part-1
Log into the AWS console.
Go to EC2. Please refer to my previous post “Setting up infrastructure with Amazon EC2: Part-1” for more info.
I have chosen CentOS 6.4 as the underlying operating system.
Apache Hadoop Installation and Cluster Setup: Part-3
Let’s start with HadoopNameNode (the master), then repeat the same steps for the SNN (secondary NameNode) and the 2 slaves. Connect to HadoopNameNode through PuTTY and run the commands below.
Update the packages and dependencies
$ sudo apt-get update
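Since the same update has to be repeated on all four nodes, it can help to list the per-node commands up front. The following is only a minimal sketch and not part of the original walkthrough: it uses the OpenSSH `ssh` client rather than PuTTY, and the host names, the `ubuntu` login user and the key file name are all assumptions chosen for illustration. It runs as a dry run and only prints the commands.

```shell
#!/bin/sh
# Hypothetical host names for the four nodes; replace them with the
# public DNS names of your own instances.
NODES="HadoopNameNode HadoopSecondaryNameNode HadoopSlave1 HadoopSlave2"

# Assumed login user and private key file (neither is given in the post).
SSH_USER="ubuntu"
KEY_FILE="hadoop-ec2-key.pem"

# Build the ssh command line that refreshes the package index on one node.
update_cmd() {
    printf 'ssh -i %s %s@%s sudo apt-get update\n' "$KEY_FILE" "$SSH_USER" "$1"
}

# Dry run: print one command per node instead of executing anything.
for node in $NODES; do
    update_cmd "$node"
done
```

After reviewing the printed lines, each one can be run by hand, which keeps the sketch from touching any machine on its own.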
Install Oracle Java
Install the latest Oracle Java (JDK) 7 on Ubuntu
Setting up Client Access to Amazon EC2 Instances: Part-2
To prepare to connect to a Linux instance from Windows using PuTTY
1. Download and install PuTTY and PuTTYgen from here.
2. Start PuTTYgen (from the Start menu, click All Programs > PuTTY > PuTTYgen)
Setting up infrastructure with Amazon EC2: Part-1
If you’ve already signed up for Amazon Web Services (AWS), you can start using Amazon EC2 immediately. You can open the Amazon EC2 console, click Launch Instance, and follow the steps in the launch wizard to launch your first instance.
Get Amazon **FREE** AWS Account
If you do not already have an account, please create a new free one. Amazon EC2 comes with eligible free-tier instances. The free-tier usage limits are listed below for your reference. For more information, see AWS Free Tier.
Setting up Hadoop Cluster on Amazon Cloud
I wanted to get familiar with the big data world, and decided to test Hadoop on Amazon Cloud. It was a really interesting and informative experience. The aim of this blog is to share my experience, thoughts and observations related to both practical and non-practical use of Apache Hadoop.
Overview
A typical Hadoop multi-node cluster