Qubole Supercharges Capabilities for Data Science and Exploration via RStudio Integration

RStudio Server Pro provides data scientists with data processing, enterprise collaboration, data management and security benefits from within the Qubole data lake platform 

SANTA CLARA, Calif., Aug. 6, 2020 -- Qubole, the open data lake company, today announced that customers can now enable, access, and work with their enterprise-grade RStudio Integrated Development Environment (IDE) directly within the Qubole Open Data Lake Platform. By seamlessly integrating RStudio Server Pro with Qubole, customers will have access to out-of-the-box features and unique managed services that supercharge data science and data exploration workflows for R users while optimizing costs for R-based projects.

Data scientists depend on RStudio as one of the top tools of choice for machine learning, deep data exploration, interactive data analytics, and collaboration. With massive amounts of data now traversing the enterprise and becoming more accessible, data scientists and analysts need the power of computational frameworks that work with the R programming language, such as Apache Spark, to quickly make sense of this data and derive actionable insights for their businesses.

“We are excited to see Qubole standardize on RStudio Server Pro,” said Tareef Kawaf, President of RStudio PBC. “Through this strategic integration, organizations will now be able to easily analyze and access large datasets on a secure and highly scalable platform using any of the major cloud environments (AWS, Azure, or Google) of their choice.”

Key benefits of the integration for R users include:

  • Higher Productivity: Continue using familiar tools and languages to run R jobs with the power of Qubole, skipping the learning curve
  • Faster Results: Single-click access to very large datasets for R-based AI/ML projects, improving model accuracy and simplifying reporting with Qubole
  • Increased ROI: Granular visibility into workload, job, and cluster spend, as well as TCO optimization for data science projects with Qubole

Through this integration, users gain a persistent workspace managed on their own cloud storage and easy access to Qubole-optimized Spark clusters from RStudio environments. The integration also includes various pre-packaged libraries needed to start the big data journey with R and a convenient method in sparklyr to start Spark sessions on Qubole clusters.

“Users, and especially data scientists, like to work in their environment of choice. Through the RStudio integration, we are empowering data science teams with robust but easy-to-use toolsets to meet their diverse data exploration needs. Qubole is building on its commitment to help businesses scale by addressing an expansive range of use cases on data lakes,” said Ashish Thusoo, CEO and co-founder, Qubole. “Our native integration with RStudio provides out-of-the-box statistical and graphical analysis functionality to further simplify the end-to-end machine learning workflow.”

To learn more about how to enable RStudio Server Pro’s capabilities in your Qubole platform, contact our support team or visit here.

For more information about how Qubole simplifies machine learning, streaming analytics, and data exploration, visit Qubole.com.


Qubole is the open data lake platform for analytics and machine learning that large enterprises depend on to quickly harness the power of data and gain valuable business insights. Only Qubole provides a truly open platform that works with all major cloud providers and data processing engines. The company’s unified environment includes optimized versions of Spark, Presto, Hive, and Airflow, with intelligent automation that scales usage up or down to meet service-level needs and minimize cloud costs. Based in Santa Clara, Calif., Qubole has offices in New York City, San Francisco, London, Singapore, and Bangalore. For more information, visit us online.