When creating a new Presto cluster, it can be difficult to determine how many worker nodes you will require and also predict future requirements. In an ideal world, the number of worker nodes should be sized in a manner that ensures efficiency while also avoiding excessive costs. Using autoscaling can assist in achieving this balance by automatically adjusting the number of worker nodes as workloads change over time, thus reducing the need for manual intervention.
Presto clusters are capable of scaling down to a minimum number of worker nodes if the cluster is idle for a specified period of time. Autoscaling helps you manage Presto clusters by automatically adjusting the number of worker nodes.
Here are some of the advantages of this:
Never run out of memory with autoscaling
One major challenge of running a Presto cluster is to make the right decision in terms of the number of worker nodes required to run your queries. Since all queries are not equal, it’s not always possible to predict the number of worker nodes that will be needed to run a task. Based on the CPU utilization, the number of worker nodes increases in autoscaling to make sure that the queries being run are executed without the fear of running out of memory.
Thanks to autoscaling in Presto, you don’t have to worry about whether your deployment can support your requirements.
Save cost with Idle time
In the absence of queries being sent to a Presto cluster, it will be economical to reduce the number of worker nodes. In the event that the idle time feature is enabled, the system will monitor the query activity, and, in the event that there has been no activity for a predefined period of time, such as ten minutes, the number of worker nodes will be reduced to the minimum required.
Here’s how idle time cost savings are transformation workloads and ad hoc querying.
- For the transformation workload, the query can potentially run for several hours. This makes it impractical to decide when to manually stop the cluster or reduce the number of running nodes. Idle time cost savings wait for a certain time frame of inactivity and then reduce the worker node to a minimum automatically until the next query hits the cluster again.
- For ad hoc querying, the querying is not continuous. Therefore, scaling into a minimum worker node count between each query is effective in saving costs.
How Does Autoscaling Work in Presto?
Getting started with autoscaling a Presto cluster is easy. Here’s how autoscaling works on a Presto cluster:
- The Presto Server keeps track of the launch time of each node.
- At regular intervals, the Presto Server takes a snapshot of the state of running queries. It then compares this snapshot with the previous snapshots and estimates the time required to finish the queries. If this time exceeds a threshold value (set to one minute by default and configurable through acsm.bds.target-latency), the Presto Server adds more nodes to the cluster.
- If QDS gauges that the cluster is running more nodes than required to complete the running queries within the threshold value, it starts decommissioning the excess nodes.
After new nodes are added, you may notice that they sometimes are not being used by the queries that are already in progress since new nodes are used by queries in progress only for certain operations such as TableScans and Partial Aggregations.
If no query in progress requires any of these operations, the new nodes remain unused initially. However, all new queries that are started after the nodes are added can use the new nodes.
Qubole recommends configuring instances of multiple families. QDS skips requesting instances of the families for which spot losses were witnessed at the cluster level within a specified time frame. If spot losses were seen for all configured instance families, QDS tries to provision instances synchronously and finally falls back to On-Demand if configured in case of unavailability of spot nodes.
Whenever a spot loss notification is received in a Presto cluster, Qubole’s autoscaling immediately starts adding replacement nodes to the cluster without waiting for the spot node to be interrupted by the Cloud Provider. This ensures that disruption to workloads due to the number of workers in the cluster reducing is minimal. However, this may result in the cluster’s size going above its configured maximum size.
Presto Query Optimization
Qubole’s autoscaling has been modified to detect queries waiting to be scheduled for execution due to the inability to meet a minimum workers’ requirement and can upscale the cluster to meet this demand. While nodes are being requested from the cloud provider and added to the cluster, queries will be queued on the Presto coordinator node, which can be viewed as queries in the “Waiting for Resources” state in the Presto UI.
This feature enables users to reduce the minimum size of a cluster to one worker node. “query-manager.required-workers” can then be set to the older value of the minimum size that was kept to ensure that the queries did not fail. This results in cost savings with minimal impact on performance: the cluster will scale down to one node during idle periods and bring up the required nodes before executing the first batch of queries post an idle period.
Now, isn’t that easy?
In the cloud, computational resources can be provisioned and de-provisioned according to computational demand, by enabling the cluster to expand and contract quickly and automatically. In proportion to the workload, auto-scaling results in full utilization of all the time. Additionally, auto-scaling enables customers to realize significant cost savings.
Qubole’s auto-scaling service polls Presto to determine which queries are still running and obtains a progress report based on Presto’s own internal statistics regarding query execution. The auto-scaling service runs a sequence of these reports to derive the optimal number of nodes from it.
The required workers feature enables significant cost savings by providing a way to reduce the minimum size of autoscaling Presto clusters to one worker node and increase Spot node utilization in the cluster with limited impact on query performance.