Blogs

  • Addressing Regulatory GDPR and CCPA frameworks with Qubole ACID and Apache Ranger

    Data lakes are at the heart of digital transformation in the enterprise. As more organizations run analytics, machine learning, and ETL workloads on the data…

  • Practical Guide to Financial Governance of Data Lake Initiatives

    Enterprises today are becoming more data-driven, as their data fuels the innovation engine they use to build new products, outmaneuver the competition and…

  • Introducing Qubole Release 57

    Release 57 (R57) brings many new capabilities and enhancements that help simplify and improve the efficiency and performance of your data processing projects.

  • Calculating 30 billion speed estimates a week with Apache Spark on Qubole

    This post is a guest publication written by Saba El-Hilo, a Senior Data Engineer at Mapbox. A version of this post first appeared as a…

  • Hive on Qubole runs 4x faster than Hive on Alternative Platforms

    ETL workloads form a major component of big data processing at any data-driven organization, from SMBs to enterprises, and ETL data pipelines at…

  • Scaling Tez Application using Application Timeline Server v1.5

    In an earlier blog post, we presented a secure, multi-tenant, reliable, and scalable service that provides access to logs and history for MRv2 applications.…

  • Qubole Open-Sources Multi-Engine Support for Updates and Deletes in Data Lakes

    Qubole now supports efficient updates and deletes for data stored in cloud data lakes. Users can make inserts, updates, and deletes on transactional Hive tables, defined…

  • Announcing General Availability of Qubole on Google Cloud

    We at Qubole are excited to announce the General Availability of the Qubole data platform on Google Cloud – a self-service, collaborative, enterprise platform for data…

  • Introducing Hive 3.1.1 in Qubole

    Qubole is the first and only vendor to deliver Hive 3.1.1 in the cloud

  • Building a Data Lake the Right Way

    Key considerations for building a scalable transactional data lake. Data-driven companies are driving rapid business transformation with cloud data lakes. Cloud data lakes are enabling…

  • Announcing Presto Summit India on September 05, 2019

    We are excited to announce the first-ever Presto Summit in India on September 05, 2019 with Presto Co-Founders – Martin, David, and Dain!…

  • Data Governance for SparkSQL

    Introducing the new Apache Spark Data Access Control Framework on the Qubole platform

  • Data Wrangling and Predictive Analytics for the 2019 Cricket World Cup Using PySpark & Python

    How to acquire, transform, and analyze semi-structured data and apply predictive analytics to forecast future performance

  • Presto Optimizations for Aggregations Over Distinct Values

    How Qubole Presto optimizations speed up the execution of queries with the DISTINCT operation

  • How to Leverage AWS Spot Instances While Mitigating the Risk of Loss

    Advancements in Qubole that reduce the odds of Spot instance losses in Qubole-managed clusters

  • Improve Apache Spark Performance by 2.9x with Amazon S3 Select Integration

    Automatically use the S3 Select service whenever applicable to speed up queries

  • Introducing a New Presto Scheduler to Improve Cache Reads by Up to 9x in RubiX

    Technical deep dive into the new scheduling algorithm in Presto on Qubole

  • How Data Science Teams Can Succeed at Machine Learning at Enterprise Scale

    Strategies to address common challenges data science teams face as they scale up ML operations

  • A Technical Overview of Quantum by Qubole: An Interactive Serverless SQL Engine

    Quantum offers direct standards-compliant SQL access to your object stores or data lakes

  • Introducing Quantum, a New Serverless Engine, on Qubole Data Platform

    Use the Quantum serverless engine to realize value from your data faster and pay only for the queries you run
