Introduction To Marketing Technology Course

How to query Hadoop for getting counts

$100,000. One hundred thousand dollars. Just sitting there in plain sight, waiting to be picked up. It is yours if you have the courage and wisdom to pick it up. No one is stopping you! There is no resistance. “What is the catch?” you may ask. There is none, really! It is an… Continue reading How to query Hadoop for getting counts

Friday "Term of the week" Series

Term of the week – Apache Storm

Apache Storm is a system for processing streaming data in real time. It can run on a Hadoop cluster managed by YARN, and it is well suited to real-time analytics, machine learning and continuous monitoring of operations. Some of the important characteristics of Apache Storm: it is a distributed real-time computation system for processing large volumes of high-velocity data. It is… Continue reading Term of the week – Apache Storm

Tuesday Big Data Series

INFOGRAPHIC – HDFS and its features that make it so awesome!

Read more here.

Monday Technology Series

INFOGRAPHIC – SQL vs. NoSQL Simplified

Read more here.

Tuesday Big Data Series

INFOGRAPHIC – Apache Pig

To know more, read the complete post here.

Friday "Term of the week" Series

Term of the Week : NoSQL

A NoSQL database is a solution for handling large volumes of structured, semi-structured or unstructured data. NoSQL was developed to address the shortcomings of relational databases in big data scenarios. NoSQL databases are popular because of their simple design, horizontal scaling, ability to handle large volumes of data in various formats, and higher data availability.… Continue reading Term of the Week : NoSQL

Tuesday Big Data Series

An introduction to MongoDB

MongoDB is one of the leading NoSQL databases and belongs to a new generation of databases. In the past, web applications used relational databases to store data. But as data grows, scalability and availability become major concerns for any web application (or organization). MongoDB was designed with web applications in mind.… Continue reading An introduction to MongoDB

Friday "Term of the week" Series

Term of the Week : YARN

Apache Hadoop YARN (Yet Another Resource Negotiator) is a cluster management technology. It decouples MapReduce’s resource management and scheduling capabilities from the data processing component, enabling Hadoop to support more varied processing approaches and a broader array of applications. YARN (sometimes called MR2) is an extended and improved version of MR1. It was… Continue reading Term of the Week : YARN

Monday Technology Series

Understanding YARN and its components

Apache Hadoop YARN (Yet Another Resource Negotiator) is a cluster management technology. It decouples MapReduce’s resource management and scheduling capabilities from the data processing component, enabling Hadoop to support more varied processing approaches and a broader array of applications. YARN (sometimes called MR2) is an extended and improved version of MR1. It was… Continue reading Understanding YARN and its components

Tuesday Big Data Series

What is a Capacity Scheduler in Hadoop?

Capacity Scheduler is a pluggable scheduler for Hadoop that allows multiple tenants to securely share a large cluster, so that their applications are allocated resources in a timely manner within the limits of their allocated capacities. Before the onset of Big Data and Hadoop, every organization had its own set of resources that had sufficient capacity to meet the… Continue reading What is a Capacity Scheduler in Hadoop?
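
The idea of tenants sharing one cluster under allocated capacities can be sketched in a few lines of Python. This is only a toy model and not how Hadoop implements it: the queue names, the `allocate` function and the numbers are all hypothetical, and the sketch ignores features of the real scheduler such as letting a busy queue borrow idle capacity from another queue.

```python
def allocate(total_containers, queue_capacity_pct, demand):
    """Toy model: give each queue up to its configured share of the cluster.

    total_containers   -- containers available in the (hypothetical) cluster
    queue_capacity_pct -- queue name -> guaranteed share, in percent
    demand             -- queue name -> containers the queue is asking for
    """
    if sum(queue_capacity_pct.values()) != 100:
        raise ValueError("queue capacities must sum to 100 percent")

    allocation = {}
    for queue, pct in queue_capacity_pct.items():
        guaranteed = total_containers * pct // 100
        # A queue never receives more than it asks for or more than its share.
        allocation[queue] = min(guaranteed, demand.get(queue, 0))
    return allocation


# Hypothetical tenants sharing a 100-container cluster.
capacities = {"marketing": 50, "analytics": 30, "adhoc": 20}
demand = {"marketing": 80, "analytics": 10, "adhoc": 40}

result = allocate(100, capacities, demand)
print(result)
```

Here "marketing" and "adhoc" are capped at their shares even though they ask for more, while "analytics" receives only what it requests.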