IBM and Qubole Take Data Science and Apache Spark to the Public Cloud

October 24, 2016 (updated January 8th, 2024)


This morning, IBM and Qubole announced a partnership that will provide the growing number of data scientists with a comprehensive environment based on public cloud infrastructure.

IBM is well known for its long-standing leadership in data science and its Watson Data Platform. It’s also a major committer to Apache Spark. The recently announced IBM Watson Data Platform allows users to find, share, and collaborate on data through a set of flexible and composable services, accessed via a self-service “data access and browsing” capability. It is a platform that integrates all data types for AI-powered decision making, regardless of where they reside, using Spark as the underlying technology.

Qubole is one of the largest production and enterprise-scale Spark-as-a-Service providers. Our customers use the Qubole Data Service (QDS) to launch over 1,000 Spark clusters per month. Some of these are very large, including a 500-node cluster using 120 TB of memory. While many of these clusters support data science applications, our customers use Spark for ETL workloads as well.

With today’s announcement, Watson Data Platform users can connect to QDS to leverage public cloud infrastructure and Apache Spark. Using Qubole as part of Watson Data Platform, users will be able to access, process, and analyze data residing in any public cloud, close to the source, and without incurring data movement costs. Combining the power of analytic solutions like Watson Analytics with the auto-scaling and automated big data management in public clouds provided by Qubole creates an ideal solution for demanding data science workloads.

We’re excited by IBM’s commitment to data science and cognitive experiences. Watson Data Platform Experiences are not just self-service tools for data professionals. They include content and a community infrastructure that enable data professionals to learn from each other and work collaboratively. QDS will be added to the inaugural version of IBM’s Data Science User Experience, which was released in June 2016.

With this partnership, customers can put the data they currently have in public clouds to work and accelerate their path toward becoming an insights-driven enterprise.
