Job Scheduling in Hadoop – A 7 Year Perspective
In a recent presentation at Flipkart’s 2014 SlashN conference, I summarized seven years of progress in Hadoop and Big Data. In its early stages, Hadoop had several weaknesses in job scheduling. Users who shared a Hadoop cluster could find the whole cluster slowed by a single bad job, or one user's job might take down the entire cluster. Because of these and other problems, developers turned to push scheduling with Corona. To learn more about the past and future of Hadoop scheduling, see my full presentation below.