Tuesday Big Data Series

INFOGRAPHIC – HDFS and its features that make it so awesome!

Read more here.

Tuesday Big Data Series

How does Apache Storm work?

In the previous post, we saw an introduction to Apache Storm, its characteristics, and its different use cases. Now, let us take a look at the Apache Storm architecture. A Storm cluster is similar to a Hadoop cluster. In a Hadoop cluster, we run what are called MapReduce jobs, but in Storm we… Continue reading How does Apache Storm work?

Tuesday Big Data Series

What is Apache Storm?

The ability to extract, transform, and process real-time data is critical today. In the beginning, there wasn’t any support for real-time data processing. But with growing demand and constant technological progress, quite a few technologies now support it. Apache Storm is one of them. Apache Storm is a system for… Continue reading What is Apache Storm?

Tuesday Big Data Series

INFOGRAPHIC – Apache Pig

To know more, read the complete post here.

Tuesday Big Data Series

An introduction to MongoDB

MongoDB is one of the leading NoSQL databases. It belongs to a new generation of databases. In the past, web applications used relational databases to store data. But as data grows, scalability and availability become some of the main concerns of any web application (or organization). MongoDB was designed with web applications in mind.… Continue reading An introduction to MongoDB

Tuesday Big Data Series

What is NoSQL?

So, what has changed in the past few years? The volume of data has increased tremendously. The kind of data we are dealing with has changed. We no longer have only the plain old text format to deal with. We have audio, video, images, and other complex formats of data that need to be dealt with. Number of users… Continue reading What is NoSQL?

Tuesday Big Data Series

What is a Fair Scheduler in Hadoop?

Fair Scheduler is a pluggable scheduler for Hadoop that allows YARN applications to share resources in large clusters fairly. As the name suggests, it allocates resources such that all applications get an equal share. By default, this is done on the basis of memory. But it can be configured to schedule with both memory… Continue reading What is a Fair Scheduler in Hadoop?
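
As a rough illustration of the equal-share idea in the Fair Scheduler excerpt, here is a minimal Python sketch that splits a cluster's memory evenly among running applications. It is not the Fair Scheduler's actual code or API: the function name, the application names, and the memory figure are all hypothetical, and the real scheduler takes more into account than this.

```python
def equal_memory_shares(total_memory_mb, app_names):
    """Split cluster memory evenly among the running applications.

    Hypothetical helper for illustration only; it models just the
    "equal share of memory" idea, nothing else.
    """
    if not app_names:
        return {}
    share = total_memory_mb / len(app_names)
    return {name: share for name in app_names}


# Hypothetical cluster with 12288 MB of memory and three running applications.
shares = equal_memory_shares(12288, ["app_a", "app_b", "app_c"])
print(shares)
```

With three applications, each one is assigned a third of the memory; when a fourth application starts, calling the function again gives each a quarter.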