Apache Pig is an open-source scripting platform from the Apache Software Foundation for analyzing and processing large data sets. It lets users express complex MapReduce programs in a simple scripting language called Pig Latin. Pig translates a Pig Latin script into MapReduce jobs that run under YARN against a dataset stored in the Hadoop Distributed File System (HDFS).
Pig includes a compiler that turns a script into a sequence of MapReduce programs. Because large-scale parallel implementations of MapReduce already exist, the compiled programs can run in parallel without extra work from the script author. Pig Latin itself is textual: it provides built-in operators and functions that express data operations at a higher level than the underlying MapReduce implementation.
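To make the compilation idea concrete, here is a minimal sketch in plain Python of the map, shuffle, and reduce steps that a word-count script might break down into. It is an illustration only, not Pig's actual generated code: the function names are hypothetical, the input path in the comment is made up, and the tokenizer splits on whitespace only, which is simpler than Pig's TOKENIZE.

```python
from collections import defaultdict

# Pig Latin being sketched (the common word-count idiom; 'input.txt' is a
# hypothetical path used only for illustration):
#   lines   = LOAD 'input.txt' AS (line:chararray);
#   words   = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
#   grouped = GROUP words BY word;
#   counts  = FOREACH grouped GENERATE group, COUNT(words);


def map_phase(lines):
    """Emit a (word, 1) pair for every whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield word, 1


def shuffle_phase(pairs):
    """Collect all values that share a key, as GROUP ... BY does."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values for each key, as COUNT does per group."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Chain the three phases in the order a MapReduce job would run them."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    sample = ["pig latin", "pig on hadoop"]
    print(word_count(sample))
```

The four Pig Latin statements in the comment stand in for all three Python functions, which is the saving the platform is meant to provide: the script author describes the data flow, and the compiler decides how to split it into map and reduce stages.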