Data Warehouse: Classic Use Cases for Hadoop in DW

Enterprise Data Warehousing (EDW) has been a mainstay of many major corporations for the last 20 years. However, with data volumes growing tremendously (doubling every two years), enterprise data warehouses are running out of capacity too quickly. Load processing windows are similarly being maxed out, which degrades service and threatens the delivery of critical business insights. As a result, it becomes very expensive for organisations to process and maintain large datasets.

Read More…

CAP Theorem: How can a distributed system provide C + A without P?

The CAP theorem, also known as Brewer’s theorem, states that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:

  • Consistency – all nodes always give the same result.
  • Availability – a guarantee that nodes always answer queries and accept updates.
  • Partition tolerance – the system continues working even if one or more nodes become silent or unresponsive.
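
The trade-off between these guarantees can be sketched with a toy two-replica store. This is only an illustration under stated assumptions: the `Replica` and `TwoNodeStore` classes, the `"CP"`/`"AP"` mode labels, and the `partitioned` flag are hypothetical names invented for this example, not part of any real system.

```python
class Replica:
    """One node holding its own copy of the data."""
    def __init__(self):
        self.data = {}


class TwoNodeStore:
    """Toy two-replica store; mode is "CP" or "AP" (hypothetical labels)."""
    def __init__(self, mode):
        self.mode = mode
        self.a = Replica()
        self.b = Replica()
        self.partitioned = False  # True when the replicas cannot talk

    def write(self, replica, key, value):
        if self.partitioned:
            if self.mode == "CP":
                # Refuse the update: stay consistent, give up availability.
                raise RuntimeError("unavailable during partition")
            # AP: accept the update locally; the other replica is unreachable.
            replica.data[key] = value
            return
        # No partition: the update reaches both replicas.
        self.a.data[key] = value
        self.b.data[key] = value

    def read(self, replica, key):
        return replica.data.get(key)


# CP store: during a partition the write is rejected, replicas still agree.
cp = TwoNodeStore("CP")
cp.write(cp.a, "x", 1)
cp.partitioned = True
try:
    cp.write(cp.a, "x", 2)
    cp_rejected = False
except RuntimeError:
    cp_rejected = True

# AP store: during a partition the write is accepted, replicas diverge.
ap = TwoNodeStore("AP")
ap.write(ap.a, "x", 1)
ap.partitioned = True
ap.write(ap.a, "x", 2)
```

In this sketch the "CP" store keeps both replicas identical by turning the update away, while the "AP" store keeps answering but lets the two replicas return different values for the same key until the partition heals.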

Read More…

Hadoop: MapReduce Vs Spark

Sometimes I come across the question “Is Apache Spark going to replace Hadoop MapReduce?”. The answer depends on your use case. Here I have tried to explain the features of Apache Spark and Hadoop MapReduce as data processing engines. I hope this blog post helps answer some of the questions that might have been on your mind lately.

Read More…

Data Warehouse: Teradata Vs Hadoop

Teradata is a fully horizontally scalable relational database management system (RDBMS). In other words, it is a Massively Parallel Processing (MPP) database system built on a cluster of commodity hardware (computers) called “shared-nothing” nodes (each node has its own CPU, memory, and disks to process data locally), connected through a high-speed interconnect. Relational tables are partitioned horizontally across the nodes, and SQL queries are executed in parallel.
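
The idea of horizontal partitioning with parallel execution can be sketched in a few lines. This is a minimal illustration, not Teradata's actual distribution algorithm: the node count, the `customer_id` distribution key, the CRC32 hash, and all function names are assumptions made for the example, and threads stand in for separate machines.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
import zlib

NUM_NODES = 4  # hypothetical cluster size


def node_for(key, num_nodes=NUM_NODES):
    """Pick the node that owns a row by hashing its distribution key."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def distribute(rows, num_nodes=NUM_NODES):
    """Split rows horizontally: each node stores only its own slice."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[node_for(row["customer_id"], num_nodes)].append(row)
    return nodes


def local_sum(node_rows):
    """Work done by one node, using only its local rows."""
    totals = defaultdict(float)
    for row in node_rows:
        totals[row["customer_id"]] += row["amount"]
    return dict(totals)


def parallel_total_by_customer(nodes):
    """Run the per-node step in parallel, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(local_sum, nodes))
    merged = {}
    for partial in partials:
        # Safe to merge directly: each customer_id lives on exactly one node.
        merged.update(partial)
    return merged


rows = [
    {"customer_id": "c1", "amount": 10.0},
    {"customer_id": "c2", "amount": 5.0},
    {"customer_id": "c1", "amount": 2.5},
    {"customer_id": "c3", "amount": 7.0},
]
nodes = distribute(rows)
totals = parallel_total_by_customer(nodes)
```

Because every row for a given key hashes to the same node, a query grouped by that key (roughly `SELECT customer_id, SUM(amount) ... GROUP BY customer_id`) needs no data exchange between nodes before the final merge.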

Read More…

Google Gives Us A Big Data Map

The progress of Hadoop (the Hadoop ecosystem) was greatly influenced by Google. The challenge of taming this eruption of big data was recognized and accepted by the data engineers at Google as early as the late nineties and early 2000s. As a company on the path to becoming synonymous with global search, they were not only trying to tame this tsunami of big data,

Read More…