Data Lake Storage
What is Data Lake Architecture? In this multi-part series, we will take you through the architecture of a Data Lake. We can explore data lake…
This blog covers new benchmark tests to better understand the Autoscaling behavior of concurrent Apache Spark applications. We believe that this will help in advancing…
Guest authors: Jerry Xu, Co-founder and CEO, Datatron; Lekhni Randive, Product Manager, Datatron. Qubole author: Jorge Villamariona, Sr. Product Marketing Manager, Qubole. In today’s world,…
The sixth release of Apache Sqoop, version 1.4.7, is out! This is one of the most significant updates to the Sqoop platform. We give you…
Data scientists use Notebooks for data exploration, interactive data analytics, machine learning, and collaboration. Once set up, a Notebook provides a convenient way to save,…
Introduction: Presto can access S3 buckets using one of the following options: IAM roles provided in the configuration; access key/secret key provided in the configuration; credentials fetched…
Introduction Qubole provides powerful automation that optimizes underlying cloud compute management for data lakes. Qubole cluster management continuously optimizes both performance and cost by lowering…
Introducing Qubole Support Qubole processes over 250 Petabytes of data in a month, and the diversity of data we process, cloud platforms we run on,…
Introduction Enterprises are today becoming more data-driven as their data is the fuel to their innovation engine to build new products, outmaneuver the competition and…
Each month, about an exabyte of data is processed using Qubole’s data platform on Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure,…
This post is a guest publication written by Saba El-Hilo, a Senior Data Engineer at Mapbox. A version of this post first appeared as a…