Apache Spark

  • Running Apache Spark at Scale in the Cloud

    Deep dive into the use cases for Apache Spark on Qubole, including ETL and machine learning

    Watch Webinar
  • Accelerating Time to Value of Big Data of Apache Spark

    This ebook takes a deep dive into Apache Spark optimizations that improve performance, reduce costs, and deliver unmatched scale

    Read eBook
  • Accelerate The Time To Value Of Apache Spark Applications With Qubole

    Qubole improves the performance of Spark workloads with enhancements such as fast storage, distributed caching, advanced indexing, metadata caching, and job isolation on multi-tenant clusters.

    Read Article
  • Ensighten: Building a world-class digital advertising analytics platform using Qubole

    Ensighten was able to decouple its compute from storage and handle user-level management and permissions across Spark, Hadoop, and Presto with Qubole

    Read Article
  • AgilOne: Machine Learning at Enterprise Scale

    AgilOne runs a variety of workloads for querying data, running ML models, orchestrating ML workflows, and more on Qubole

    Read Article
  • Nauto Improves its Data Scientist Productivity, Accelerates Product Development

    Nauto Improves its Data Scientist Productivity, Accelerates Product Development

    Read Article
  • TrafficGuard Halts Digital Ad Fraud with Qubole

    TrafficGuard relies on big data processing to detect and prevent ad fraud, which requires a robust infrastructure.

    Read Article
  • Apache Spark Getting Started Guide

    Self-paced guide to the Apache Spark analytics engine using Qubole

    Start Training
  • Big Data Activation Report

    The data on big data: which engines are used most, for what, and which are the rising stars.

    Read Report
  • Improve Apache Spark Performance by 2.9x with Amazon S3 Select Integration

    Automatically use the S3 Select service whenever applicable to speed up queries

    Read Blog
  • Using Qubole Notebooks to Predict Future Sales with PySpark

    Build and use a time-series analysis model to forecast future sales from historical sales data

    Read Blog
  • Improving Recover Partitions Performance with Spark on Qubole

    Significantly improve the overall performance of running Hadoop-based engines on the cloud object store

    Read Blog
  • Sentiment Analysis with H2O, PySpark and Word2Vec on Qubole (9:37)

    Analyze Amazon product reviews with Word2Vec, PySpark, and H2O Sparkling Water, developed and productionized on Qubole Notebooks.

    Watch Video
  • Qubole Enhances Spark Performance with Dynamic Filtering, a SQL Join Optimization

    How Dynamic Filtering in Spark dramatically improves the performance of Join Queries

    Read Blog
  • Increase Apache Spark Performance by Up to 4x with RubiX Distributed Cache

    How RubiX differs from Spark’s internal cache, and its performance improvement for Spark workloads

    Read Blog
  • Using Direct Writes to Significantly Increase the Performance of Spark Workloads

    Direct Writes delivers performance improvements of up to 40x for write-heavy Spark workloads

    Read Blog
  • Sparklens Report: A Free Community Service from Qubole

    Introducing sparklens.qubole.com, a reporting service built on top of Sparklens to lower the pain of sharing Sparklens output

    Read Blog
  • How to Increase Your Big Data Value with Apache Spark on Qubole

    Run large Apache Spark clusters on the cloud without fear of job loss or out-of-control cloud costs

    Read Blog