Tuesday Big Data Series

What is Apache Storm?

The ability to extract, transform and process data in real time is critical today. Early big data tools were built around batch processing and offered little support for real-time workloads. As demand grew and the technology matured, several systems emerged that process data as it arrives. Apache Storm is one of them.

Apache Storm is a system for processing streaming data in real time. It can run as a standalone cluster or be deployed on YARN alongside Hadoop workloads, and it is well suited to real-time analytics, machine learning and continuous monitoring of operations.

Some of the important characteristics of Apache Storm are –

  • It is a distributed real-time computation system for processing large volumes of high-velocity data.
  • It is extremely fast: it has been benchmarked at one million 100-byte messages per second per node, which is roughly 100 MB/s per node.
  • It is highly scalable.
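  • Like the systems in the Hadoop ecosystem, it is fault-tolerant and reliable: when a worker dies, Storm restarts it, and when a node dies, its work is restarted on another node.
  • It guarantees that each unit of data (a tuple) will be processed at least once. Exactly-once semantics are available through the higher-level Trident API.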
  • It is easy to operate.
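The difference between the at-least-once and exactly-once guarantees in the list above is easier to see in a small simulation. The sketch below is plain Python, not Storm's API: the function and class names (process_at_least_once, NaiveCounter, IdempotentCounter) are hypothetical, and the lost acknowledgement is faked with a random number. It only illustrates the general idea that an un-acknowledged message is replayed, so a handler that is not idempotent may count it more than once, while a handler that remembers message ids gets exactly-once results.

```python
import random


def process_at_least_once(messages, handler, fail_rate=0.3, seed=0):
    """Deliver every message until it is acknowledged; a lost ack triggers a replay."""
    rng = random.Random(seed)          # fixed seed keeps the simulation repeatable
    pending = list(messages)           # (msg_id, payload) pairs still awaiting an ack
    deliveries = 0
    while pending:
        msg_id, payload = pending.pop(0)
        deliveries += 1
        handler(msg_id, payload)       # the side effect happens first...
        if rng.random() < fail_rate:   # ...then the ack may be lost (simulated)
            pending.append((msg_id, payload))  # un-acked message goes back in the queue
    return deliveries


class NaiveCounter:
    """Counts every delivery, so replays inflate the totals."""

    def __init__(self):
        self.counts = {}

    def __call__(self, msg_id, word):
        self.counts[word] = self.counts.get(word, 0) + 1


class IdempotentCounter(NaiveCounter):
    """Remembers which message ids were already applied, so replays change nothing."""

    def __init__(self):
        super().__init__()
        self.seen = set()

    def __call__(self, msg_id, word):
        if msg_id in self.seen:
            return                     # duplicate delivery: skip the side effect
        self.seen.add(msg_id)
        super().__call__(msg_id, word)


# Hypothetical input: five messages, each tagged with a unique id.
messages = list(enumerate(["storm", "hadoop", "storm", "yarn", "storm"]))

naive, idempotent = NaiveCounter(), IdempotentCounter()
naive_deliveries = process_at_least_once(messages, naive)
idem_deliveries = process_at_least_once(messages, idempotent)
```

After running this, idempotent.counts matches the true word counts no matter how many replays occurred, while naive.counts equals the true counts plus one extra for every replayed delivery. Real systems get the same effect by tracking an identifier per message or per batch instead of keeping every id in memory.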

Here are some of the use cases in which Storm can be used –

[Image: Apache Storm use cases]

 
