Shefali Aggarwal
Memory Cost Model in Qubole Presto
In this post, we describe the design and implementation of a memory cost model in Qubole Presto, which, when provided with a query and relevant…
Success with Predictive Analytics—Getting the Basics Right
Data has been deemed the oil that will fuel the next industrial revolution. Enterprises have the opportunity to leverage data at their disposal, be it…
Introducing Presto 0.193 in QDS
At Qubole, it has been our constant endeavor to ensure that our customers benefit from the latest improvements and features in open source releases while…
Qubole is GDPR Ready
Qubole has been preparing for GDPR—Europe’s General Data Protection Regulation. We were founded by real-world operators who understand that security, confidentiality, and data privacy are…
Evolution of Hadoop
Over the course of the next month, we will be going deeper into some of the trends uncovered in our 2018 Big Data Activation Report.…
The Data On Big Data
Introducing the Qubole 2018 Big Data Activation Report. In our first-ever Big Data Activation Report, we analyze anonymous data from over 200 customers to provide…
Under The Hood: Building AIR at Qubole
In one of our previous blog posts on AIR Infrastructure, we discussed the various data sources for AIR and the architecture for collecting these data…
Machine Learning: Using the H2O Framework with Apache Spark Clusters on Qubole
H2O’s Sparkling Water allows users to utilize H2O machine learning algorithms on Qubole Spark. With Sparkling Water, users can drive computation from Scala/R/Python and utilize…
The Importance of Data Due Diligence
Earlier this month, a study was published indicating that the widely used “Reddit dataset” (released in 2015 by Jason Baumgartner) had significant, previously unidentified gaps.…
Snowflake Machine Learning
This is Part 1 of 3. Snowflake Big Data: Snowflake and Qubole have partnered to…