| Cluster Type | Total # of Clusters | Configured with Heterogeneous |
|---|---|---|
| Hadoop1 | 12 | 0 (Not Supported) |
Oracle Data Cloud (ODC) is a business unit of Oracle Corporation that serves 430,000 customers in 175 countries. ODC provides the world’s top 100 advertising agencies and largest digital publishers with consumer data about who their customers are, what they do, where they go, and what they buy.
At Oracle Data Cloud (ODC), big data is our business. We tend to run large clusters that process data for sustained periods of time. Over the past couple of years, we’ve seen demand for certain instance families increase, putting the health of our clusters at risk. Because our EC2 bills were already climbing at an alarming rate, we couldn’t justify moving to more on-demand nodes, so we needed to keep using spot nodes in the majority of cases while keeping clusters stable.
“Because we run almost 100% spot nodes, we were suffering catastrophic spot losses given the size of our clusters and the scale at Oracle,” explains Justin Wainwright, System Analyst, Oracle Data Cloud. “This resulted in missed SLAs, job failures and, most importantly, wasted time on the business pipeline. Heterogeneous was the feature we were looking for.”
The Oracle Data Cloud team became the first beta customer for the heterogeneous cluster feature when Qubole launched it in August 2016. They were excited about the potential this feature could bring to their daily operations.
Justin and his team started by testing the waters with smaller, non-critical operational clusters. After seeing positive results, they expanded the configuration to the entire cluster fleet that Oracle Data Cloud owns. Qubole has been running cost comparisons and analysis alongside Oracle throughout this transformation. Based on the statistics, we’ve seen up to 90% cost savings compared to on-demand node costs and typically 20-50% cost savings compared to homogeneous spot configurations.
The Oracle Data Cloud team also helped Qubole build a better product along the way and shared their experience configuring heterogeneous clusters with other Qubole users. For example:
“As our EC2 costs kept climbing and the spot market became more volatile, heterogeneous was the only option that made sense. Even as our usage has grown over the past 6 months, since switching to heterogeneous, our costs have either gone down or, at least, stayed the same.” - Justin Wainwright, System Analyst, Oracle Data Cloud
With the heterogeneous configuration, Oracle Data Cloud was able to reduce spot loss and significantly lower its total operating cost.
At peak pre-heterogeneous usage (Fall 2016), almost 40% of their EC2 costs came from Qubole jobs running on on-demand nodes. As of May 2017, on-demand costs account for no more than 20%, and nearly half of the clusters have been configured as heterogeneous.
QDS gives the Oracle Data Cloud operations team the ability to scale out fast while keeping costs low and customer satisfaction high.