Apache Tez is an extensible framework built on top of Apache Hadoop YARN for building high-performance batch and interactive data processing applications. It models a computation as a directed acyclic graph (DAG) of tasks, so work that previously required several chained MapReduce (MR) jobs can run as a single Tez job. Because the stages run inside one job, intermediate results need not be written to HDFS between them, which can improve speed substantially over MapReduce while retaining its ability to scale to petabytes of data. Apache Hive and Apache Pig both use Apache Tez as an execution engine.
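To make the DAG idea concrete, here is a minimal sketch in plain Python. It does not use the Tez API. The vertex names, the sample input lines, and the `run_dag` helper are all hypothetical, chosen only to illustrate how several dependent stages can be described as one graph and run in dependency order, with each stage handing its output directly to the next.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical stages of a word-count-then-sort pipeline. With plain MapReduce
# this would typically be two chained jobs; here it is one graph of vertices.
def tokenize(_inputs):
    lines = ["tez runs on yarn", "hive runs on tez"]  # made-up sample data
    return [word for line in lines for word in line.split()]

def count(inputs):
    counts = {}
    for word in inputs["tokenize"]:
        counts[word] = counts.get(word, 0) + 1
    return counts

def sort_by_count(inputs):
    # Highest count first, ties broken alphabetically.
    return sorted(inputs["count"].items(), key=lambda kv: (-kv[1], kv[0]))

# vertex name -> (function, names of upstream vertices)
DAG = {
    "tokenize": (tokenize, []),
    "count": (count, ["tokenize"]),
    "sort": (sort_by_count, ["count"]),
}

def run_dag(dag):
    """Run every vertex once, after all of its upstream vertices."""
    graph = {name: deps for name, (_, deps) in dag.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        func, deps = dag[name]
        # Upstream outputs are passed in memory, not via intermediate files.
        results[name] = func({dep: results[dep] for dep in deps})
    return results

results = run_dag(DAG)
print(results["sort"])
```

In a real Tez application the vertices would be distributed tasks scheduled on YARN and the edges would carry data between them; the sketch only mirrors the shape of the graph.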