Data Engineering

  • Big Data Engineering for Machine Learning

    How big data engines are used for exploring and preparing data, building pipelines, and delivering data sets to ML applications

    Read White Paper
  • The Key to Building Data Pipelines for Machine Learning: Support for Multiple Engines

    Which engines are most effective for each stage of the data engineering cycle

    Read Blog
  • Workload-Aware Auto-Scaling: A New Paradigm for Big Data Workloads

    Learn more about Workload-Aware Auto-Scaling, an alternative architectural approach to auto-scaling that is better suited to the cloud and to applications like Hadoop, Spark, and Presto.

    Read Flipbook
  • UPCOMING: Mastering Data Governance on Cloud Data Lakes with Multiple Engines

    Qubole data privacy and integrity experts cover how to maintain the integrity and privacy of data residing in data lakes using various open-source engines.

    Read Article
  • Data Engineering Pitfalls and How to Avoid Them

    Simple, practical solutions for common challenges faced by data engineering teams

    Watch Webinar
  • How To Build Scalable Data Pipelines for Machine Learning

    Common challenges faced by data engineers when building pipelines for ML and how to address them

    Watch Webinar
  • What’s New with Airflow on Qubole? DAG Explorer and More

    Use Apache Airflow to author workflows as directed acyclic graphs (DAGs) of tasks

    Read Blog
  • ETL Processes with AWS Data Pipeline And Qubole

    How to facilitate event-based processing of long-running ETL processes with AWS Data Pipeline and Qubole

    Read Blog
  • Airflow on Anaconda: A Match Made in Heaven, Perfected by Qubole

    How Airflow on Anaconda makes running machine learning pipelines and data science tasks seamless

    Read Blog
  • Big Data Activation Report

    The data on big data: which engines are used most, for what, and which are the rising stars.

    Read Report
  • G2 Crowd Grid Report for Big Data Processing and Distribution | Fall 2019

    Which vendors rank highest in customer satisfaction for big data processing

    Read Report
  • Leveraging Streaming and Batch Data Sets for ML Applications

    Learn how to use Qubole to acquire and transform data sets for data science and analytics, make data sets available to different users, and fully leverage your data lake.

    Watch Webinar
  • Nauto Improves its Data Scientist Productivity, Accelerates Product Development

    Nauto Improves its Data Scientist Productivity, Accelerates Product Development

    Read Article
  • Ibotta Builds a Self-Service Data Lake to Enable Business Growth

    Ibotta cut costs thanks to Qubole’s autoscaling and downscaling capabilities, and the ability to isolate workloads to separate clusters.

    Read Article
  • Poshmark Experiences Hyper Growth and Uses Qubole to Create Value for the Poshmark Community

    Qubole saved Poshmark up to a year in getting started with turning big data into value for its community

    Read Article