Meet the First Autonomous Data Platform
The question is how, when the hurdles—complexity, scalability, speed, cost,
The answer is the world’s first Autonomous Data Platform
Have you been dreaming of use cases limited only by your imagination? You’ve come to the right place. Our Autonomous Data Platform self-manages and self-optimizes by sending Alerts, Insights and Recommendations (AIR) based on Cloud Agents connected to your data team’s specific data policies and preferences.
Cloud Agents perform actions the data team determines. These typically include:
Executing automated tasks, based on a policy or configuration
Bundling specific low-level features
Learning based on individual, company and system-wide behavior
Cloud Agents are valuable to a data team because they:
Workload Aware Auto-Scaling Agent:
The Workload Aware Auto-scaling Agent augments the basic auto-scaling feature available in the Enterprise Edition with storage-based scaling and aggressive down-scaling.
The Workload Aware Auto-scaling Agent can reduce compute spend by as much as 33% over basic auto-scaling solutions available in the market today.
Workload Aware Auto-scaling offers the following capabilities:
QDS continuously monitors the cluster’s HDFS storage to ensure it can support current jobs, and launches more nodes if necessary.
When a cluster has sufficient compute resources but requires additional storage, the agent can dynamically add storage using EBS to avoid provisioning a new compute node.
Aggressive downscaling is triggered when you reduce the maximum size of a cluster while it’s running. To save costs, QDS terminates nodes that are closest to completing their tasks and closest to their billing boundary.
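One way to picture this selection is as a ranking over the running nodes. The sketch below is purely illustrative and is not QDS code: the `Node` fields, the one-hour billing period, and the ordering (least remaining work first, then nearest billing boundary) are all assumptions made for the example.

```python
from dataclasses import dataclass

BILLING_PERIOD_S = 3600  # assumption: nodes are billed in one-hour increments


@dataclass
class Node:
    name: str
    uptime_s: int          # seconds since the node was launched
    remaining_work_s: int  # estimated seconds until its running tasks finish


def seconds_to_billing_boundary(node: Node) -> int:
    """Seconds left before the node rolls into its next billing period."""
    return (BILLING_PERIOD_S - node.uptime_s % BILLING_PERIOD_S) % BILLING_PERIOD_S


def pick_nodes_to_terminate(nodes: list[Node], new_max_size: int) -> list[Node]:
    """Return the nodes to remove so the cluster fits within new_max_size."""
    excess = max(0, len(nodes) - new_max_size)
    # Prefer nodes that are nearly done with their tasks; break ties by
    # choosing the node closest to its billing boundary.
    ranked = sorted(
        nodes,
        key=lambda n: (n.remaining_work_s, seconds_to_billing_boundary(n)),
    )
    return ranked[:excess]
```

With four nodes and a new maximum size of two, the function returns the two nodes that are cheapest to give up under these assumptions; with a maximum at or above the current size it returns an empty list.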
Your mappers may sit idle while they wait for reducers to finish their jobs. Offloading conserves compute resources by saving mapper data to HDFS or object storage.
Comparing Qubole performance and cost against two fixed-cluster scenarios under typical fluctuating load conditions.
13% faster than QDS, but 32% more expensive
Automatically optimizes performance and cost in response to elastic demand
10% cheaper than QDS, but 90% slower
Learn how much Workload Aware Auto-scaling can save you from our benchmarking analysis.
The Spot Shopper Agent ‘shops’ for the best combination of price and performance, based on the policy you provide. It does this by shopping across different instance types, dynamically rebalancing Spot and on-demand nodes, and considering different Availability Zones.
The Spot Shopper Agent can reduce compute spend by as much as 50% compared with solutions that rely exclusively on on-demand instances.
Spot Shopper offers the following capabilities:
With Heterogeneous Clusters, slave nodes comprising the cluster may be of different instance types. Heterogeneity in Spot nodes is highly beneficial because Spot prices can change rapidly, and Spot Shopper can make the lowest-cost purchasing decision in real time.
Unless you specify a particular AZ when you configure the cluster, Qubole can automatically select the AZ with the lowest Spot prices for the region and instance type you’ve specified.
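As a rough illustration of that choice, the sketch below picks the cheapest zone from a price table unless a zone has been pinned. It is a minimal example under stated assumptions: the function name, the `spot_prices` mapping, and the sample prices are hypothetical and do not reflect Qubole's implementation or real market data.

```python
from typing import Optional


def choose_availability_zone(
    spot_prices: dict[str, float],
    pinned_az: Optional[str] = None,
) -> str:
    """Pick an AZ for the cluster.

    spot_prices maps AZ name -> current Spot price for the region and
    instance type already chosen (hypothetical input for this sketch).
    """
    if pinned_az is not None:
        # The user specified an AZ when configuring the cluster: respect it.
        return pinned_az
    if not spot_prices:
        raise ValueError("no Spot prices available to compare")
    # Otherwise select the AZ with the lowest current Spot price.
    return min(spot_prices, key=spot_prices.get)


# Example with made-up prices (USD per instance-hour).
prices = {"us-east-1a": 0.12, "us-east-1b": 0.09, "us-east-1c": 0.15}
cheapest = choose_availability_zone(prices)
```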
Fluctuations in the market may mean that QDS cannot always obtain as many Spot instances as your cluster specification calls for. In these circumstances, the Spot Shopper Agent automatically rebalances the cluster once prices drop, swapping out on-demand nodes for Spot nodes so that you continue to get the lowest possible prices.
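The rebalancing step can be sketched as a small state update. This is a simplified model, not the agent's actual logic: `ClusterState`, `target_spot_nodes`, and the `max_bid` price threshold are names invented for the example, and a real system would also have to drain each node before replacing it.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    on_demand_nodes: int
    spot_nodes: int


def rebalance(
    state: ClusterState,
    target_spot_nodes: int,
    spot_price: float,
    max_bid: float,
) -> ClusterState:
    """Swap on-demand nodes for Spot nodes when Spot is affordable again."""
    if spot_price > max_bid:
        # Spot is still too expensive: leave the cluster unchanged.
        return state
    # How many Spot nodes the specification calls for but the cluster lacks.
    shortfall = max(0, target_spot_nodes - state.spot_nodes)
    # Never swap out more on-demand nodes than actually exist.
    swaps = min(shortfall, state.on_demand_nodes)
    return ClusterState(
        on_demand_nodes=state.on_demand_nodes - swaps,
        spot_nodes=state.spot_nodes + swaps,
    )
```

The total node count stays the same in this model; only the mix of on-demand and Spot nodes changes.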
The Placement Policy option enables QDS to make a best effort to store one replica of each HDFS block on a stable node. This prevents job failures that could occur if all replicas were lost as a result of AWS reclaiming many Spot instances at once.
The Data Caching Agent automates the movement of data for performance optimization.
Data Caching automatically determines the right set of data to cache in the cluster so that interactive, ad-hoc queries run faster and don’t need to retrieve data for each query.
Data Caching makes optimal use of ORC, Parquet and Avro data formats by minimizing the amount of data that’s read when selecting only specific columns.