QDS supports Presto as a Service running on the AWS Cloud. Presto is an ANSI SQL-based real-time query engine developed by Facebook. On QDS, analysts using Presto can query data on HDFS or stored in S3. Presto is best suited to workloads that need a faster query engine to offer interactive speeds for data exploration, or a wide variety of connectors to query multiple data sources. This level of performance can be achieved at any scale without the extensive costs required for data warehousing implementations.
Interactive, near real-time performance for SQL queries over petabyte-scale data.
Read and write optimization for cloud storage dramatically enhances query performance and the user experience and reduces processing costs. Plus, with advanced Autoscaling, you’ll pay only for resources actually used.
Like all big data solutions supported by QDS, clusters scale up and down automatically during query execution. And you don’t have to worry about starting or stopping Presto clusters. QDS does all the heavy lifting for you.
QDS supports a wide variety of Amazon EC2 instance types for your Presto cluster, giving you the freedom to optimize instance selection for your workload requirements and AWS pricing options.
QDS lets you automatically incorporate Amazon spot instances that can cost up to 90% less than on-demand instances.
With QDS’ pay-per-use pricing model, you’ll only pay for what you actually use by compute hour.
QDS supports Amazon VPC, a service that extends your private network into the cloud. Amazon VPC provides fine-grained access control both to and from Amazon EC2 instances in your virtual network. Plus, you can launch dedicated instances within a VPC on single-tenant hardware.
QDS makes it easy to debug both active and historical Presto queries. Results and logs remain available even when no cluster is running.
Analysts can create custom user-defined functions in addition to standard open source functions to easily migrate existing Presto work to QDS.
QDS supports caching within the cluster to improve performance for queries that frequently make use of the same data set.
Presto-backed visualization tools on QDS enable up-to-the-minute pivot summaries and dashboards for petabytes of data. BI and visualization tools connect to QDS-managed Presto clusters through the ODBC driver.
QDS for Presto works best when users are SQL-proficient and need to query data quickly, but do not want to invest in and move to a data warehouse solution.
Use Presto to query your petabyte-scale data and get results quickly. With Qubole, data is persistent but compute clusters are elastic. You only pay for compute when you actually run queries. Leveraging QDS for Presto enables the speed and scale of an always-on solution such as Redshift without paying for or managing always-on clusters.
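To make the pay-per-use comparison concrete, here is a rough back-of-the-envelope sketch in Python. The hourly rate, node count, and daily usage hours are hypothetical figures chosen for illustration, not QDS or AWS prices, and the calculation ignores storage, service fees, spot discounts, and autoscaling.

```python
def monthly_compute_cost(hourly_rate: float, nodes: int,
                         hours_per_day: float, days: int = 30) -> float:
    """Compute cost for a cluster billed per node-hour (illustrative only)."""
    return hourly_rate * nodes * hours_per_day * days


# Hypothetical inputs: 10 nodes at $0.50 per node-hour.
always_on = monthly_compute_cost(0.50, 10, hours_per_day=24)  # runs around the clock
elastic = monthly_compute_cost(0.50, 10, hours_per_day=6)     # runs only while queries execute

savings = 1 - elastic / always_on

print(f"always-on: ${always_on:.2f}")  # always-on: $3600.00
print(f"elastic:   ${elastic:.2f}")    # elastic:   $900.00
print(f"savings:   {savings:.0%}")     # savings:   75%
```

Under these assumed numbers, a cluster that is only up for six hours of querying per day costs a quarter of an always-on one; real savings depend on your actual usage pattern.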
Presto users can write ANSI SQL queries that unify data across sources, including object stores such as S3, relational databases such as MySQL, and real-time streams such as Amazon Kinesis. With QDS providing a central metastore for defining the structure of your data, you can join multiple data sources together to get a complete analytical view of your organization.
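As a minimal sketch of such a cross-source query, the Python snippet below builds a Presto SQL statement that joins a table in a Hive catalog (for example, data stored in S3) with a table in a MySQL catalog. Presto addresses tables as catalog.schema.table; the catalog, schema, table, and column names used here are hypothetical, and how you submit the resulting query to QDS is left out.

```python
from datetime import date


def build_federated_query(start_date: str) -> str:
    """Build a Presto query joining a Hive-catalog table with a MySQL-catalog table.

    All catalog, schema, table, and column names are hypothetical.
    """
    # Validate the date before embedding it, so arbitrary text cannot reach the SQL.
    start = date.fromisoformat(start_date)
    return f"""
SELECT c.region, count(*) AS views
FROM hive.web.page_views AS v
JOIN mysql.crm.customers AS c
  ON v.customer_id = c.id
WHERE v.view_date >= DATE '{start.isoformat()}'
GROUP BY c.region
ORDER BY views DESC
""".strip()


query = build_federated_query("2024-01-01")
print(query)
```

Because both tables are visible to the same Presto cluster, the join runs as a single SQL statement without first copying the MySQL data into S3.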
We continue to see the fast pace that Presto achieves, and as our data quickly scales over time, I would not be surprised to see Presto’s query time inversely match that rate of expansion.
QDS gives you the freedom to work with Presto along with Spark, Hadoop MapReduce, and Hive as part of one unified interface with unified metadata. Choose the right solution for the right workload rather than being locked into any single technology. Presto is designed for interactive, ad-hoc querying over large data sets. For batch or ETL workloads where reliability is paramount, Qubole offers Hive and MapReduce as options that maximize both performance and scale. For machine learning and iterative algorithm design, Qubole offers Spark with a Notebook interface.
Qubole is a significantly more polished product than EMR. Data scientists can explore their data in S3, create tables and query those tables all via an easy-to-use web UI.
Qubole’s fantastic support has been key in our successful deployment. They continue to deliver new features and revisit the ones that we ask for.
Our goal at MediaMath was to take our existing industry-leading infrastructure to the next level, handling new complex analytics tasks. Qubole has helped us achieve this goal with minimal risk.
Instead of worrying about provisioning clusters of machines or job flows or whatever, Qubole lets you focus on your data and your queries … The Qubole guys have been extremely helpful!
The service spins up users’ clusters only when a job is started, then automatically scales or contracts them based on the workload, and spins the servers down once the job is done.
Qubole’s Hadoop and Hive interfaces are vastly superior to the default CLIs, which scare business analysts and hinder meaningful analyses of the gaming logs that we collect. With Qubole, business analysts are self-sufficient in using a Big Data platform to meet their advanced analytic needs.
Online Gaming Company
top-performing technologies in the data industry are definitely taking aim at democratizing data tools and bringing the power of data to smaller businesses. This is a major change in the data industry, and Qubole Data Service is a great example
I’m very happy to be using Qubole in production. Qubole has saved me a lot of time, effort, and trouble in getting my data processing pipelines up and running. My data pipelines process Appnexus data in Amazon S3 which is then stored in Vertica. The engineering team understands the complexities and provided awesome support!
Real-time Ads Retargeting Startup
There’s a whole world of web companies, SMBs and other non-Facebooks or Yahoos that will want to use Hadoop but not want to run it in-house…offering a cloud service makes it easier for these users to get started with the platform and for Qubole to keep improving.
Qubole offers a big data ETL and exploration service through auto-scaling Hadoop clusters with a web user interface for data exploration and integration with various data sources. The service can do (nearly) everything EMR can do, and it goes further
Big Data Republic
Simba knows Big Data access. Qubole knows Big Data. Qubole’s founders authored Apache Hive, built key parts of the Hadoop eco-system and brought Apache HBase to Facebook
“The integration of Tableau and Qubole makes it faster and easier for our customers to operationalize Big Data…lowers the resource barriers to deriving the benefits of Big Data because customers can deploy our joint solution seamlessly and cost effectively.”