
Case Study: Pinterest’s Journey to Qubole

July 29, 2014 (updated December 5th, 2017)

With 20 terabytes of new data logged each day, managing big data is not optional at Pinterest. To provide an optimal user experience with the most relevant and recent content, Pinterest turned to Hadoop to help process the data.

Unfortunately, Hadoop in its raw form doesn’t act as a self-serve platform because it is built only for technical users and lacks elasticity.

In order to overcome these limitations, Pinterest at first turned to Amazon Elastic MapReduce to run its Hadoop jobs. However, as the workload scaled to a few hundred nodes, Amazon EMR became less stable, and the engineering team started to run into limitations with EMR’s versions of Hive.

As a result, Pinterest decided to migrate its Hadoop jobs to Qubole. In a post on Pinterest’s engineering blog, Mohammad Shahangian outlined why the company chose Qubole over other big data platforms:

  • Ability to scale horizontally to 1,000s of nodes on a single cluster
  • 24/7 engineering support for data infrastructure
  • A user interface for non-technical users
  • Hive integration
  • Multi-cluster support and a simplified executor abstraction layer
  • Baked AMI customization
  • Support for spot instances
  • S3 eventual consistency protection
  • Graceful autoscaling clusters

Currently Pinterest has more than 100 regular users of MapReduce who run more than 2,000 jobs each day. To learn more about how Pinterest built its self-serve platform, check out their blog post.
