Apache Storm is a system for processing streaming data in real time. It can run on YARN, and it is well suited to real-time analytics, machine learning and continuous monitoring of operations. Some of the important characteristics of Apache Storm are: It is a distributed real-time computation system for processing large volumes of high-velocity data. It is… Continue reading Term of the week – Apache Storm
In the previous post, we saw an introduction to Apache Storm, its characteristics and different use cases. Now, let us take a look at the Apache Storm architecture. A Storm cluster is similar to a Hadoop cluster. In a Hadoop cluster, we run what are called MapReduce jobs, but in Storm we… Continue reading How does Apache Storm work?
A NoSQL database is a solution for handling large volumes of structured, semi-structured or unstructured data. It was developed to address the shortcomings of relational databases in big data scenarios. NoSQL databases are popular because of their simple design, horizontal scaling, ability to handle large volumes of data in various formats, and higher data availability. … Continue reading Term of the Week : NoSQL
Apache Spark is another Apache project that offers parallel data processing and can work with Hadoop to build Big Data applications. It is a fast and general engine for large-scale data processing. Let us look at some of the features of Apache Spark one by one. Real-Time Processing: Unlike MapReduce, Spark can handle… Continue reading 5 awesome features of Apache Spark