Tuesday Big Data Series

Understanding HDFS quotas

Every Hadoop system has a Hadoop administrator and Hadoop users/developers. The administrator is responsible for deploying and maintaining the entire infrastructure: cluster availability, file system management, security, installing the latest updates, and everything else needed to keep the system up and running. The administrator is also responsible for… Continue reading Understanding HDFS quotas

Monday Technology Series

What is Oozie?

Why do we need Oozie? The Hadoop stack consists of a variety of tools such as Pig, MapReduce, Hive, HBase, and Sqoop. At times, when dealing with large data sets, we might have to use a combination of these technologies along with plain old Java, Python, Perl, or shell scripts to get work done… Continue reading What is Oozie?