Running Apache Spark jobs cheaper while maximizing performance - Brad Caffey, Expedia Group

October 28, 2020

Presented by Brad Caffey, Staff Big Data Engineer, Expedia Group

In a COVID-19 world, companies are looking for ways to reduce cloud spending as much as possible. While many Apache Spark tuning guides discuss how to get the best performance out of Spark, none of them ever discuss what that performance costs. In this session, we'll cover a proven tuning technique for Apache Spark that lowers job costs on AWS while maximizing performance. Topics include:

* the principle for making Apache Spark jobs cost-efficient (see the sketch after this list)
* how to determine the AWS costs of your Apache Spark job
* how to determine the most cost-efficient executor configuration for your cluster
* how to migrate your existing jobs to the cost-efficient executor configuration
* how to improve performance with your cost-efficient executor configuration
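As a rough illustration of the cost principle the session covers, the sketch below estimates the EC2 cost attributable to a single Spark job from the instance price, the number of executors each node hosts, and the job's runtime. This is a hedged sketch of the general idea, not the presenter's exact method; all rates, instance sizes, and function names are illustrative assumptions.

```python
# Hedged sketch: estimate the EC2 cost attributable to one Spark job.
# All rates and sizes below are illustrative assumptions, not figures
# from the talk. The principle: job cost is roughly
# (cost per executor-hour) x (executors) x (runtime hours),
# so cheaper jobs come from wasting fewer executor-hours, not just
# from faster wall-clock time.

def executor_hourly_cost(instance_hourly_rate: float,
                         executors_per_instance: int) -> float:
    """Cost of running one executor for one hour on a given node type."""
    return instance_hourly_rate / executors_per_instance


def job_cost(instance_hourly_rate: float,
             executors_per_instance: int,
             num_executors: int,
             runtime_hours: float) -> float:
    """Approximate EC2 cost of a single Spark job."""
    per_executor = executor_hourly_cost(instance_hourly_rate,
                                        executors_per_instance)
    return per_executor * num_executors * runtime_hours


if __name__ == "__main__":
    # Assumed example: a node billed at $1.008/hr hosting 4 executors,
    # and a job that uses 20 executors for 1.5 hours.
    cost = job_cost(instance_hourly_rate=1.008,
                    executors_per_instance=4,
                    num_executors=20,
                    runtime_hours=1.5)
    print(f"Estimated job cost: ${cost:.2f}")  # ~$7.56
```

With a per-job cost in hand, comparing executor configurations becomes a matter of rerunning the same job under each candidate configuration and comparing cost x runtime, which is the trade-off the session explores in detail.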
