Corporate Blog

Cloud-native Big Data Activation Platform

  • Integration of RStudio & Qubole Platform come together at your Fingertips

    The integration of both platforms accelerates data science and scientific research with single-click access to large datasets within the RStudio integrated development environment…
  • Snowflake Pricing – say no to Snowflake tax

    The history of the software industry is littered with examples of data management companies that lock customer data into their proprietary systems, and then go…
  • How to Optimize Costs in a Changing World

    Last week, we welcomed our customers Justin Wainwright, Systems Analyst at Oracle Data Cloud, and Rajit Saha, Director of Data Platform at LendingClub, to discuss…
  • What is an Open Data Lake?

    A data lake is a system or repository that stores data in its raw format as well as transformed trusted datasets and provides both programmatic…
  • Introducing Qubole Cost Explorer

    “If you are a cloud adopter rapidly adopting cloud services, but not developing the finance governance muscle, you will certainly be visiting the cloud optimization…
  • Cloud Data Lakes – Four Must-have TCO Optimization Capabilities

    Enterprises leverage cloud providers’ compute and storage services for their ad-hoc data analytics, streaming analytics and ML use cases as cloud data lakes provide significant…
  • Data Warehouses vs. Data Lakes

    All data-driven organizations use data in three ways: to report on the past, to understand the present, and to predict the future. Data warehouses and Business…
  • A Message to Our Customers & Partners from Qubole CEO Ashish Thusoo

    To our valued customers and partners, I hope all of you, your colleagues, families and friends are safe and healthy and practicing social distancing in…
  • Data Lake Essentials, Part 3 – Data Catalog and Data Mining

    Data Lake Essentials, Part 3 – Data Lake Data Catalog, Metadata and Search. In this multi-part series we will take you through the architecture of…
  • Cloud Data Lakes – Best Practices

    This is an abridged version of the article that appears on NewStack. BI tools have been the go-to for data analysts who help business track…
  • Apache Airflow Tutorial – ETL/ELT Workflow Orchestration Made Easy

    Apache Airflow is one of the most powerful platforms used by Data Engineers for orchestrating workflows. Airflow was already gaining momentum in 2018, and at…
  • Data Lake Essentials, Part 2 – File Formats, Compression and Security

    In this multi-part series we will take you through the architecture of a Data…
  • Data Lake Essentials – Part 1 – Storage and Data Processing

    In this multi-part series we will take you through the architecture of a Data Lake.…
  • Apache Spark Benchmark for Autoscaling: Qubole versus competition

    This blog covers new benchmark tests to better understand the autoscaling behaviour of concurrent Apache Spark applications. We believe that this will help in advancing research…
  • Streamlining Operations of Machine Learning Models

    Guest authors: Jerry Xu, Co-founder and CEO, Datatron; Lekhni Randive, Product Manager, Datatron. Qubole author: Jorge Villamariona, Sr. Product Marketing Manager, Qubole. In today’s world,…
  • Apache Sqoop 1.4.7 – 9 reasons why you need it

    The sixth release of Apache Sqoop, 1.4.7, is out! This is one of the most significant updates to the Sqoop platform. We give you…
  • Analytics and ML simplified with Jupyter Notebooks and Apache Spark

    Data scientists use Notebooks for data exploration, interactive data analytics, machine learning, and collaboration. Once set up, a Notebook provides a convenient way to save,…
  • Per-Bucket Configuration Support in Presto

    Presto can access S3 buckets using one of the following options: IAM roles provided in the configuration; Access-key/Secret-key provided in the configuration; credentials fetched…
  • Optimized Upscaling for Managing Workloads in Cloud

    Qubole provides powerful automation that optimizes underlying cloud compute management for data lakes. Qubole cluster management continuously optimizes both performance and cost by...
  • Qubole: The Super Powers of Support

    Introducing Qubole Support. Qubole processes over 250 petabytes of data a month, and the diversity of data we process, cloud platforms we run on,…