Monday Technology Series

INFOGRAPHIC – SQL vs. NoSQL Simplified

Read more here.

Monday Technology Series

INFOGRAPHIC – 7 reasons why your email may have bounced

To know more, read the complete post here.

Monday Technology Series

INFOGRAPHIC – Characteristics of a good call-to-action (CTA)

To know more, read the complete post here.

Wednesday Marketing Series

INFOGRAPHIC – Do’s and Don’ts of Direct Email Marketing

To know more, read the complete post here.

Monday Technology Series

SQL vs. NoSQL Simplified

Definition: SQL Databases – These are relational databases. NoSQL Databases – These are non-relational or distributed databases.

Data storage types: SQL Databases – Single way to store data. NoSQL Databases – Multiple ways to store data: key-value stores, document stores, wide-column stores, graph stores.

Data storage models: SQL Databases – Data is stored in a table where… Continue reading SQL vs. NoSQL Simplified
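As a rough illustration of the storage contrast in the excerpt above, the sketch below stores the same record first in a relational table and then as a document in a key-value store. It uses Python's built-in sqlite3 module for the SQL side; the plain dict standing in for a NoSQL store, the table name, and the sample values are assumptions made for this example only.

```python
import json
import sqlite3

# Relational (SQL) style: rows live in a table with a fixed set of columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute(
    "INSERT INTO users (id, name, city) VALUES (?, ?, ?)",
    (1, "tom", "california"),
)
sql_row = conn.execute(
    "SELECT name, city FROM users WHERE id = ?", (1,)
).fetchone()
conn.close()

# Key-value / document style: a plain dict stands in for a NoSQL store
# (an assumption for illustration, not a real database client).
kv_store = {}
kv_store["user:1"] = json.dumps(
    {"name": "tom", "city": "california", "tags": ["admin"]}
)
doc = json.loads(kv_store["user:1"])

print(sql_row)
print(doc)
```

The document carries an extra `tags` field without any schema change, whereas the table would need a new column or a second table to hold it.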

Monday Technology Series

Understanding YARN and its components

Apache Hadoop YARN (Yet Another Resource Negotiator) is a cluster management technology. It decouples MapReduce’s resource management and scheduling capabilities from the data processing component, enabling Hadoop to support more varied processing approaches and a broader array of applications. YARN (sometimes called MR2) is an extended and improved version of MR1. It was… Continue reading Understanding YARN and its components

Monday Technology Series

What is Apache Tez?

The Apache Tez project is an extensible framework built on top of Apache Hadoop YARN. It lets data processing that earlier took multiple MR jobs run as a single Tez job, which uses a Directed Acyclic Graph (DAG) for data processing. It is used for building high-performance batch and interactive data processing applications. It drastically improves… Continue reading What is Apache Tez?
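To make the DAG idea in the excerpt above concrete, here is a minimal sketch that describes a processing pipeline as a directed acyclic graph and derives a valid execution order. It uses Python's standard graphlib module (Python 3.9+) and is not the Tez API; the stage names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical processing stages. Each key maps to the set of stages
# that must finish before it can start.
dag = {
    "read_logs": set(),
    "read_users": set(),
    "join": {"read_logs", "read_users"},
    "aggregate": {"join"},
    "write_output": {"aggregate"},
}

# A topological order runs every stage after all of its dependencies,
# so the whole pipeline can be planned as one job.
order = list(TopologicalSorter(dag).static_order())
print(order)
```

The two read stages have no dependency on each other, so a scheduler is free to run them in either order or in parallel.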

Monday Technology Series

What is Apache Hive?

Apache Hive is a data warehouse infrastructure built on top of Hadoop that allows querying and managing large datasets residing in distributed storage. It provides an SQL-like language called HiveQL with schema on read, and transparently converts queries to MapReduce, Tez, or Spark jobs. All these execution engines run on Hadoop YARN. The HiveQL language also… Continue reading What is Apache Hive?
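The "schema on read" idea mentioned above can be sketched in a few lines: raw data is stored untouched, and column names and types are applied only when the data is read. This is plain Python rather than HiveQL; the sample data, the schema, and the helper `read_with_schema` are hypothetical and exist only for this illustration.

```python
import csv
import io

# Raw text is stored as-is; nothing validates it at write time.
raw = "1,tom,california\n2,ann,texas\n"

# Hypothetical schema, applied only when the data is read.
schema = [("id", int), ("name", str), ("state", str)]

def read_with_schema(text, schema):
    """Parse comma-separated text and cast each field per the schema."""
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        row = {name: cast(value) for (name, cast), value in zip(schema, fields)}
        rows.append(row)
    return rows

rows = read_with_schema(raw, schema)
print(rows)
```

Reading the same raw text with a different schema would need no change to the stored data, which is the main appeal of the approach.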

Tuesday Big Data Series

Understanding Pig Data Model

Pig has a simple yet rich data model which consists of the following four types. Atom: An atom consists of a single atomic value, which can be a string or a number. Examples – ‘tom’ or 2. Tuple: A tuple is a sequence of fields, each of which can be of any datatype. Examples – (‘tom’, ‘california’) or… Continue reading Understanding Pig Data Model
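As a loose analogy for the two types shown in the excerpt above, the sketch below models an atom and a tuple with Python values. This is Python, not Pig Latin, and the variable names are assumptions for illustration; the remaining types are not covered because the excerpt cuts off before describing them.

```python
# Atoms: single atomic values, such as a string or a number.
atom_name = "tom"
atom_count = 2

# Tuple: an ordered sequence of fields, each of which can be of any type.
record = ("tom", "california")
mixed = (atom_name, atom_count)

# Fields are addressed by position.
name, state = record
print(name, state)
print(mixed[1])
```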

Monday Technology Series

Apache Pig and its awesome features!

Pig is an open source scripting platform by Apache used for analyzing and processing large data sets. It allows users to write complex MapReduce programs using a simple scripting language called Pig Latin. Pig translates the Pig Latin script into MapReduce so that it can be executed within YARN for access to a single dataset stored… Continue reading Apache Pig and its awesome features!