Right Tool for the Job: Running Apache Spark at Scale in the Cloud

July 20, 2020

Apache Spark is a powerful open-source engine for processing complex, memory-intensive workloads. However, running Apache Spark in the cloud can be complex and challenging. Qubole has re-engineered Apache Spark, optimising its performance and efficiency while reducing administrative overhead. Today, Qubole runs some of the world’s largest Apache Spark clusters in the cloud. In this webinar, we’ll take a deeper look at the use cases for Apache Spark, including ETL and machine learning, and compare Apache Spark on Qubole with open-source Apache Spark.

We’ll cover:
- Why Apache Spark is essential for big data processing
- How to deploy Spark at scale in the cloud and enable all data users
- The enhancements made to Qubole Spark
- A live demo and real-world examples of Apache Spark on Qubole

Previous Video
The Open Data Lake Talks Optimizing Costs in A Changing World

As organizations grapple with the sudden economic turmoil created by the pandemic, there is a critical need...

Next Video
Right Tool for the Job: Using Qubole Presto for Interactive and Ad-Hoc Queries

Presto is the go-to query engine of Qubole customers for interactive and reporting use cases due to its exc...