Today marks an important day for the whole Qubole community (customers, partners, and employees) as we update our environments with the latest major product release, “R54.” In this release, we continue to strengthen our industry leadership with many new features, including patent-pending Financial Governance and cluster management capabilities that help our customers reduce CAPEX and OPEX by 30% to 50% annually compared with alternatives.
Qubole’s rich set of financial governance capabilities provides complete visibility and control of spend, and its cluster management optimizations intelligently scale clusters based on workloads. The platform assigns more capacity or releases resources based on data jobs’ requirements, achieving best-in-class TCO and administrator-to-user ratios of over 1:200.
In addition, R54 brings many new features, enhancements, and fixes under 16 top-level projects across multiple clouds, for Artificial Intelligence (AI), Machine Learning (ML), and advanced analytics. For example, administrators can fine-tune controls and limits on usage, including per-second billing for Presto workloads, and monitor cost savings of Spot nodes or Spot Blocks on AWS.
Security and Compliance
As with prior releases, a major focus of R54 has been enterprise security, with enhancements for defining and enforcing role-based access controls on Qubole artifacts that contain data and metadata, such as commands, data store connections, data previews, and results. R54 also provides greater data privacy and protection through Apache Ranger, which enables granular data access controls at the row and column levels.
Supporting Open Source Software
Qubole’s support for Open Source Software (OSS) spans delivering the latest versions, releasing software to open source communities, and developing complementary capabilities for OSS. For example:
- RubiX: a technology we open-sourced in 2016 that brings up to 3.7x performance improvements to Apache Spark jobs through advanced data caching. RubiX is Qubole’s next-generation distributed file cache, optimized for cloud storage and any file format.
- Apache Spark: in addition to the improvements through RubiX, dynamic filtering for joins provides up to 2x performance gains. Furthermore, by extending the platform’s patent-pending metadata caching system, Qubole can deliver greater incremental performance gains to Spark jobs.
- Apache Spark Structured Streaming: comprehensive support for the Kinesis connector (which we open-sourced in 2018), along with enhancements to SparkLens (an open-source profiling tool also developed by Qubole for understanding the scaling limits of Apache Spark applications).
- Apache Airflow: a Python 3.5 environment and new package management features, plus easier ways to edit and synchronize Directed Acyclic Graphs (DAGs) and other files.
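For readers unfamiliar with the dynamic filtering for joins mentioned in the Apache Spark item above, the general idea can be sketched in a few lines of plain Python. This is only an illustration of the technique under simplifying assumptions (in-memory rows, a single equality join key); it is not Qubole's or Apache Spark's implementation, and the function name `dynamic_filter_join` and the sample tables are hypothetical.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def dynamic_filter_join(
    build_rows: List[Row], probe_rows: List[Row], key: str
) -> Tuple[List[Row], int]:
    """Join two row sets on `key`, pruning the larger side with a runtime filter.

    Returns the joined rows and the number of probe rows that survived the filter.
    """
    # Step 1: index the smaller ("build") side by join key.
    build_index: Dict[object, List[Row]] = {}
    for row in build_rows:
        build_index.setdefault(row[key], []).append(row)

    # Step 2: the set of keys seen on the build side is the dynamic filter.
    key_filter = set(build_index)

    # Step 3: apply the filter while scanning the larger ("probe") side,
    # so rows that cannot match never reach the join.
    kept = [row for row in probe_rows if row[key] in key_filter]

    # Step 4: join only the surviving probe rows against the build index.
    joined = [{**b, **p} for p in kept for b in build_index[p[key]]]
    return joined, len(kept)


# Hypothetical sample data: a small dimension table and a larger fact table.
stores = [
    {"store_id": 1, "region": "west"},
    {"store_id": 2, "region": "west"},
]
sales = [
    {"store_id": 1, "amount": 10},
    {"store_id": 3, "amount": 7},
    {"store_id": 2, "amount": 5},
    {"store_id": 4, "amount": 1},
]

joined, kept = dynamic_filter_join(stores, sales, "store_id")
print(f"probe rows kept: {kept} of {len(sales)}")
print(joined)
```

In a real engine the filter is typically pushed down into the scan of the larger table, so the savings come from reading and shuffling less data rather than from the in-memory comparison shown here.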
This release also delivers enhancements for querying big data sources using SQL, with new versions of Presto and Hive, as well as an enhanced Notebook experience with Presto through a native interpreter implementation and concurrency support.
What’s New on Microsoft Azure
In addition to some of the cross-platform features listed above, R54 brings many capabilities specific to Microsoft Azure. Highlights include:
- A 20% reduction in cluster start times using Azure Resource Manager (ARM) templates.
- Significant improvements in response times for ad hoc and interactive queries in Presto, using RubiX advanced caching.
- Fastpath optimizations, which reduce end-to-end latency of Presto commands submitted from Microsoft Power BI and other BI tools.
- Improved ease of use and access to clusters through public or private static IPs.
- Improved support for larger datasets and scalability of cluster storage with Block Storage Upscaling.
R54 is the result of thousands of hours of work by many teams, as well as the invaluable testing and feedback from some of our customers and partners. Big kudos and thank you to all involved!
To learn more about R54, see the What’s New section in our product documentation, and let us know what you think via the “Send Feedback” button on the top right of the Qubole interface.