PRESTO

What is Presto? 

Presto is a high-performance, distributed SQL query engine for big data. It was originally designed and developed at Facebook so that its data analysts could run interactive queries on the company’s large data warehouse in Apache Hadoop.

Presto’s architecture allows users to query a variety of data sources such as Hadoop, AWS S3, Alluxio, MySQL, Cassandra, Kafka, and MongoDB. One can even query data from multiple data sources within a single query.
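
As a rough illustration of querying multiple sources in one statement, the sketch below builds a join across two catalogs using Presto's catalog.schema.table naming. The catalog names (hive, mysql), the schemas, tables, and columns, and both helper functions are hypothetical; the catalogs actually available depend on how a given cluster is configured.

```python
def qualified_name(catalog: str, schema: str, table: str) -> str:
    """Return a fully qualified Presto table name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


def build_federated_query(orders_table: str, customers_table: str) -> str:
    """Build a SQL join over two tables that may live in different catalogs."""
    return (
        "SELECT c.name, SUM(o.total) AS lifetime_value\n"
        f"FROM {orders_table} AS o\n"
        f"JOIN {customers_table} AS c ON o.customer_id = c.id\n"
        "GROUP BY c.name"
    )


# Hypothetical catalogs: "hive" (data in Hadoop or S3) and "mysql" (a MySQL database).
orders = qualified_name("hive", "sales", "orders")
customers = qualified_name("mysql", "crm", "customers")
query = build_federated_query(orders, customers)
print(query)
```

The resulting string could be submitted through any Presto client; the point is only that both tables are addressed in the same statement by their catalog prefix.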

PRESTO QUERY ENGINE ON QUBOLE

Qubole has offered a managed Presto service since 2014. We offer our customers multiple Presto versions and maintain a regular upgrade process. Qubole’s managed Presto offering is tailored to our customers’ needs: it blends the latest features from the open source community with Qubole’s proprietary solutions that boost performance, lower cost, improve the user experience, and simplify administration of Presto clusters.



KEY BENEFITS OF PRESTO ON QUBOLE

Performance Boost

  • Dynamic Filtering
  • Fast Caching with RubiX
  • Smart Query Retry

Lower Cloud Operation Cost

  • Intelligent Flexible Node Management
  • Workload-aware Autoscaling
  • Heterogeneous Cluster Support

Ease of Use

  • Simplified Cluster Configuration
  • Zero-downtime upgrades
  • Comprehensive Administration Experience

Enterprise-Ready

  • Enterprise-grade security
  • Apache Ranger Support
  • JDBC/ODBC connectors
  • Integration with third-party tools



Presto on Qubole

Presto Cost Optimization

Qubole | Open Source
Graceful Low-cost Compute Shutdown *
Spot (AWS) Rebalancing
Spot Block (AWS) Support
Workload-Aware Autoscaling
User-Based Autoscaling
Aggressive Downscaling with graceful decommissioning
Heterogeneous Clusters
Per-second billing
Smart Query Retry
Cost Explorer & Analysis
Strict Mode (prevents runaway queries)

 

* AWS Spot, Azure Low-cost VMs, Google Preemptible VMs

Presto Performance

Qubole | Open Source
Compute Optimization for joins and filters
Required Worker Node
S3 Direct writes optimization
S3 listing optimization
RubiX (distributed caching)

Presto Workspaces

Qubole | Open Source
Versioning
Scheduling
Dashboarding (Presto Notebook)
Collaboration and sharing

Debugging and Profiling

Qubole | Open Source
Monitoring (Ganglia, DataDog, etc.)
Intelligent Log Access

Presto Security

Qubole | Open Source
Access control for notebooks, clusters, jobs, structured data
Audit end-user activity logs
Apache Ranger Integration
SSO with SAML 2.0 support
Data encryption
HIPAA, SOC 2 Type 2, and ISO 27001 compliant environments

Presto Integrations

Qubole | Open Source
Custom Connector with BI tools (Tableau, Looker, etc.)
REST API
AWS Glue Support
Data Source Connectors (Redshift, Postgres, Kinesis*, etc.)

 

* Kinesis is being contributed back to OSS

Service & Support

Qubole | Open Source
24/7 support from our Presto experts
Support for multiple versions of Presto