Job Scheduling in Hadoop – A 7-Year Perspective

April 25, 2014 (updated July 28, 2017)

In a recent presentation at Flipkart’s 2014 SlashN conference, I summarized seven years of progress in Hadoop and Big Data. In its early days, Hadoop’s job scheduling had several weaknesses. Users sharing a Hadoop cluster could find it slowed by a single badly behaved job, and one user’s job could take down the entire cluster. These and other problems led developers to adopt push-based scheduling with Corona. To learn more about the past and future of Hadoop scheduling, see my full presentation below.