“Don’t teach a pig to sing. It wastes your time and annoys the pig.” – Robert Heinlein. Pig scripts query big data and generate output files. Once. If you want them to do the same thing every day, you need to schedule them. That is where Oozie comes in. Oozie is a Pig scheduler. Now, obviously,… Continue reading Pig Scripts and Oozie Workflow
Oozie is an open-source project distributed under the Apache License 2.0 and implemented as a Java web application that runs in a Java servlet container. It is a workflow scheduling system that manages Hadoop jobs. Oozie combines multiple jobs sequentially into one logical unit of work. It is integrated with the Hadoop stack, with YARN… Continue reading Term of the Week : Oozie
Why do we need Oozie? The Hadoop stack consists of a variety of tools such as Pig, MapReduce, Hive, HBase and Sqoop. At times, when dealing with large data sets, we might have to use a combination of these technologies along with plain old Java, Python, Perl or shell scripts to get work done… Continue reading What is Oozie?