In a previous post, we outlined the case for selecting cloud infrastructure over an on-premises deployment for managing big data workloads. One benefit of the cloud is the ability to use Spot instances to realize substantial cost savings. Spot instances are an AWS feature: spare EC2 instances offered at a discount. The price of an instance changes in real time based on demand. AWS users place a bid indicating the most they are willing to pay for the instance. If the Spot price is less than the bid, the user receives that instance.
Big Data and Cost Benefits of Using Spot Instances
Big data workloads can be bursty, with data teams needing to scale jobs at a moment’s notice. By incorporating Spot instances, data teams can better manage the cost of rapidly growing workloads.
Because Spot nodes are spare capacity priced by demand rather than at a fixed rate, using them can lead to significant cost savings. Compared to On-Demand instances of the same instance type, Spot nodes can cost up to 80% less.
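To make the comparison concrete, here is a minimal sketch of the savings arithmetic. The hourly prices, node count, and duration are hypothetical placeholders chosen for illustration, not actual AWS rates, and the function names are our own.

```python
def spot_savings(on_demand_hourly: float, spot_hourly: float) -> float:
    """Return the fractional saving of Spot versus On-Demand for one instance-hour."""
    if on_demand_hourly <= 0:
        raise ValueError("on_demand_hourly must be positive")
    return 1.0 - spot_hourly / on_demand_hourly


def cluster_cost(nodes: int, hours: float, hourly_price: float) -> float:
    """Total cost of running `nodes` instances for `hours` at a flat hourly price."""
    return nodes * hours * hourly_price


# Hypothetical prices (assumptions for illustration only, not real AWS rates).
ON_DEMAND_PRICE = 0.50
SPOT_PRICE = 0.10

saving = spot_savings(ON_DEMAND_PRICE, SPOT_PRICE)
print(f"Saving per instance-hour: {saving:.0%}")
print(f"On-Demand, 20 nodes x 10 h: ${cluster_cost(20, 10, ON_DEMAND_PRICE):.2f}")
print(f"Spot, 20 nodes x 10 h: ${cluster_cost(20, 10, SPOT_PRICE):.2f}")
```

In practice the Spot price moves over time, so a real estimate would average over the price history for the instance type rather than use a single flat rate.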
Challenges with Managing Spot Instances
While Spot nodes can save on cost, they require extensive time and resources to manage due to the following factors:
- Spot nodes can be reclaimed by AWS at any time, meaning job loss is possible.
- If demand for a Spot node grows and the Spot price rises above the bid price, AWS terminates the Spot instance. To improve their chances of getting and keeping a Spot node, users need to bid higher.
- While high bids can be a smart strategy, AWS can still terminate the instance if there are no spare instances of that type.
- There’s also a chance that the Spot instance request will still be denied. This happens in cases where the Spot price is higher than the bid or spare capacity is insufficient.
- Spot requests can also take a while to be fulfilled, at times several minutes.
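The failure modes listed above can be summarized in a small sketch. This is a simplified model of the bidding behavior described in this post, not AWS's actual allocation logic; the names `SpotOutcome`, `evaluate_spot_request`, and `is_reclaimed` are hypothetical, and treating a bid equal to the Spot price as sufficient is an assumption.

```python
from enum import Enum


class SpotOutcome(Enum):
    FULFILLED = "fulfilled"
    PRICE_TOO_HIGH = "spot price above bid"
    NO_CAPACITY = "no spare capacity"


def evaluate_spot_request(bid: float, spot_price: float, spare_capacity: int) -> SpotOutcome:
    """Simplified model: decide whether a Spot request can be satisfied."""
    # Capacity is checked first: even a very high bid cannot help
    # when there are no spare instances of the requested type.
    if spare_capacity <= 0:
        return SpotOutcome.NO_CAPACITY
    # Assumption: a bid equal to the current Spot price is enough.
    if spot_price > bid:
        return SpotOutcome.PRICE_TOO_HIGH
    return SpotOutcome.FULFILLED


def is_reclaimed(bid: float, spot_price: float, spare_capacity: int) -> bool:
    """A running Spot node is lost as soon as the same conditions stop holding."""
    return evaluate_spot_request(bid, spot_price, spare_capacity) is not SpotOutcome.FULFILLED


print(evaluate_spot_request(bid=0.20, spot_price=0.12, spare_capacity=5).value)
print(evaluate_spot_request(bid=0.20, spot_price=0.25, spare_capacity=5).value)
print(evaluate_spot_request(bid=0.90, spot_price=0.12, spare_capacity=0).value)
```

The third call illustrates why a high bid alone is not a guarantee: the request fails on capacity before price is even considered.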
Automated Spot Management With Qubole
Qubole Data Service (QDS) provides a policy-based way to automate the Spot bidding process, allowing data teams to take full advantage of Spot instances without devoting resources to managing them. Qubole can use AWS Spot nodes when dynamically adding cluster nodes or as part of the core minimum nodes for a cluster (the latter is not recommended, for stability reasons).
QDS users select the maximum bid they are willing to pay for a Spot instance, and the system then places bids for them automatically. Qubole Hadoop clusters begin with On-Demand nodes and can be rebalanced automatically by swapping On-Demand instances for Spot nodes when Spot availability is higher. Rebalancing works by identifying On-Demand instances that aren’t busy performing tasks, provisioning Spot nodes from AWS, and then terminating the previously identified On-Demand instances.
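The rebalancing steps described above can be sketched roughly as follows. This is an illustration of the idea only, not Qubole's implementation: the `Node` class, the `rebalance` function, and the `request_spot` callback are all hypothetical names, and swapping one node at a time is our own simplification.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Node:
    node_id: str
    kind: str          # "on_demand" or "spot"
    busy: bool = False


def rebalance(nodes: List[Node], request_spot: Callable[[], Optional[Node]]) -> List[Node]:
    """Swap idle On-Demand nodes for Spot nodes, one at a time.

    `request_spot` is a hypothetical callback that returns a new Spot node,
    or None when the request is not fulfilled.
    """
    result = list(nodes)
    # Step 1: identify On-Demand nodes that are not running any tasks.
    idle_on_demand = [n for n in nodes if n.kind == "on_demand" and not n.busy]
    for old in idle_on_demand:
        # Step 2: provision a Spot node.
        replacement = request_spot()
        if replacement is None:
            break  # no Spot node available right now; keep the remaining nodes
        # Step 3: add the Spot node before removing the On-Demand node,
        # so cluster capacity never dips during the swap.
        result.append(replacement)
        result.remove(old)
    return result


cluster = [
    Node("od-1", "on_demand"),
    Node("od-2", "on_demand", busy=True),
    Node("od-3", "on_demand"),
]
spot_pool = iter([Node("spot-1", "spot")])  # only one Spot node is available
rebalanced = rebalance(cluster, lambda: next(spot_pool, None))
print([n.node_id for n in rebalanced])
```

Busy nodes are never touched, which is what lets the swap happen without interrupting running tasks.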
With this ease of use, Qubole clusters can be used for advanced provisioning strategies. Those strategies come in three categories:
- On-Demand Only: Auto-scaled nodes that are added will only be On-Demand instances.
- Spot Instances Only: Auto-scaled nodes that are added will only be Spot nodes.
- Hybrid: Auto-scaled nodes are a mix of On-Demand and Spot nodes. Users can set the maximum percentage of Spot nodes.
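As a rough sketch, the three strategies above could be modeled like this. The policy names, the `plan_autoscale` function, and the choice to apply the Spot percentage to each batch of added nodes (rounding down) are assumptions made for illustration; they are not QDS configuration keys or Qubole's actual logic.

```python
import math
from typing import Dict


def plan_autoscale(nodes_to_add: int, policy: str, max_spot_pct: float = 0.0) -> Dict[str, int]:
    """Split a batch of auto-scaled nodes between Spot and On-Demand."""
    if nodes_to_add < 0:
        raise ValueError("nodes_to_add must be non-negative")

    if policy == "on_demand_only":
        spot = 0
    elif policy == "spot_only":
        spot = nodes_to_add
    elif policy == "hybrid":
        if not 0 <= max_spot_pct <= 100:
            raise ValueError("max_spot_pct must be between 0 and 100")
        # Round down so the Spot share never exceeds the configured maximum.
        spot = math.floor(nodes_to_add * max_spot_pct / 100)
    else:
        raise ValueError(f"unknown policy: {policy}")

    return {"spot": spot, "on_demand": nodes_to_add - spot}


print(plan_autoscale(10, "on_demand_only"))
print(plan_autoscale(10, "spot_only"))
print(plan_autoscale(7, "hybrid", max_spot_pct=50))
```

Rounding down is the conservative choice here: with an odd batch size, the extra node goes to On-Demand rather than pushing the Spot share past the limit.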
Additional built-in intelligence for using Spot nodes with QDS includes:
- Qubole Placement Policy: Qubole has multiple pricing options for Stable Spot nodes (conservative pricing) and Volatile Spot nodes (aggressive pricing). Via the placement policy, Qubole spreads out HDFS storage across Stable and Volatile nodes, thereby minimizing the risk of job loss due to loss of a Spot instance.
- Fallback to On-Demand instances after a configurable timeout: There is no guarantee of getting Spot nodes. Qubole can automatically fall back to requesting On-Demand nodes if Spot nodes cannot be provisioned within a configurable timeout period.
- Intelligent AZ Selection: Spot pricing can vary by AZ (availability zone), sometimes by up to 15-20%. Qubole can automatically select an optimal AZ based on Spot pricing for the cluster instance type chosen. Currently AZ selection is only supported for non-VPC clusters.
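Two of the behaviors above, AZ selection and fallback after a timeout, can be sketched in a few lines. This is a simplified illustration under our own assumptions, not Qubole's code: `cheapest_az`, `provision_with_fallback`, and the two request callbacks are hypothetical names, the prices are made up, and polling at a fixed interval is one of several reasonable designs.

```python
import time
from typing import Callable, Dict, Optional


def cheapest_az(spot_prices: Dict[str, float]) -> str:
    """Pick the availability zone with the lowest current Spot price."""
    if not spot_prices:
        raise ValueError("spot_prices must not be empty")
    return min(spot_prices, key=spot_prices.get)


def provision_with_fallback(
    request_spot: Callable[[], Optional[str]],
    request_on_demand: Callable[[], str],
    timeout_s: float,
    poll_interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try for a Spot node until the timeout expires, then request On-Demand.

    `request_spot` returns a node id, or None if the request is still unfulfilled.
    """
    deadline = clock() + timeout_s
    while True:
        node = request_spot()
        if node is not None:
            return node
        if clock() >= deadline:
            # Spot could not be provisioned in time: fall back to On-Demand.
            return request_on_demand()
        sleep(poll_interval_s)


# Hypothetical per-AZ Spot prices for one instance type (illustrative values).
prices = {"us-east-1a": 0.12, "us-east-1b": 0.10, "us-east-1c": 0.11}
print(cheapest_az(prices))

# With no Spot capacity and a zero timeout, the fallback is used immediately.
print(provision_with_fallback(lambda: None, lambda: "on-demand-node", timeout_s=0))
```

Passing `clock` and `sleep` as parameters keeps the function easy to test without real waiting.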
You can read more about integration with AWS Spot nodes and Qubole’s intelligent Spot management in our documentation. This feature is available for Hadoop, Spark, and Presto clusters.
Customer Data on Usage and Savings
A majority (~82%) of Qubole customers’ clusters use automated Spot management. Qubole customers run nearly half of all their workloads on Spot instances, which yields cost savings of up to 80% compared to On-Demand pricing.
One particular customer, BloomReach, which offers a data-driven marketing solution, was able to utilize Spot instances in 85% of its workloads.
Jorge Rodriguez, Tech Lead on BloomReach’s data platform team, explained: “The nice thing about Qubole is that even when Spot instances are reclaimed, your job doesn’t necessarily have to fail … because it will just spawn a new Spot instance, and your job will continue running.”
To learn more about how BloomReach integrated automated Spot management into its big data ETL environment, click here.
This is part of a series exploring the benefits of cloud architecture. See the first post of the series here, and come back for more on the separation of compute and storage and the economics of provisioning to peak.