Using Qubole Data Service, we have reduced our monthly cluster processing costs by 50%. And, we’re getting more for our money. Advanced features such as auto-scaling and S3 I/O optimization provide more flexibility and faster turnaround time to support the needs of our business users.
Shailesh Garg, Head of Analytics
Komli Media’s success in the digital advertising business depends on recognizing and reaching targeted audiences efficiently and at scale. The company uses Big Data capabilities built on Hadoop to understand ad campaign performance. Komli has amassed more than 100 TB of data in Hadoop, arming its business users with information such as ads served, clicks, reach, impressions, data events, and consumer behaviors. Users rely on insights derived from this data to optimize campaigns, improve real-time bidding algorithms, and identify new opportunities.
Komli encountered a number of issues with its initial implementation of Hadoop, centering on the ad-hoc nature of its Big Data processing requirements, explains Shailesh Garg, Engineering Manager at Komli Media. “Suddenly, there’s huge demand from our users. I would just have to say no because I knew that we didn’t have access to the computational resources we would need to process the requests. We were too vulnerable to the nature of loads on our platform and didn’t have a lot of flexibility.”
Another challenge was the company’s sheer volume of data. Komli collected around 700 GB of raw data each day, or about 21 terabytes each month. Fixed clusters made it impossible for Komli to handle requests to process a month’s worth of data with an acceptable turnaround time. Even smaller queries could take as long as 15 hours to process. Because of the long wait, users couldn’t rely on getting the data they needed to get their jobs done.
Ad-hoc processing of Big Data was also costing Komli too much money. Cluster processing costs had escalated to $15,000 per month. With no ability to add or remove compute resources based on actual usage, the company had to over-provision hardware to support its variable workloads.
Komli realized that it needed to switch to a managed cloud service to support its ad-hoc Big Data processing. It looked at options such as Amazon’s EMR, but decided that Qubole Data Service (QDS) offered some unique features that would best meet its needs. “The overwhelming reason we selected Qubole was because it is the only vendor that offers what I consider to be true auto-scaling,” comments Shailesh Garg. “By true auto-scaling, I mean that auto-scaling is self-service – if the load on our cluster is high, the cluster automatically expands. Conversely, nodes are automatically removed when the load is low. This is different from manual auto-scaling where you need to pre-define auto-scaling capacity.”
Performance was another area where QDS offered a higher value proposition than other options. Running Komli’s Hadoop clusters on AWS using EMR would have compromised transfer speeds between S3 and the cluster. Jobs would therefore have run more slowly on EMR than on QDS, which has extensive I/O optimization for S3. This was important since Komli needed faster query execution.
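The load-driven scaling that Garg describes can be illustrated with a minimal sketch. This is not Qubole's algorithm: the function names, the tasks-per-node figure, and the node limits are all hypothetical, chosen only to show a cluster growing when load is high and shrinking when it is low.

```python
import math


def target_node_count(pending_tasks, tasks_per_node=4, min_nodes=2, max_nodes=50):
    """Return the number of nodes needed for the current load.

    All parameters are illustrative assumptions, not QDS settings.
    """
    # Nodes required to run every pending task at once.
    needed = math.ceil(pending_tasks / tasks_per_node)
    # Never shrink below the floor or grow past the ceiling.
    return max(min_nodes, min(max_nodes, needed))


def simulate(load_trace, **limits):
    """Map a sequence of pending-task counts to cluster sizes."""
    return [target_node_count(pending, **limits) for pending in load_trace]


if __name__ == "__main__":
    # Hypothetical load: idle, light, heavy, burst, light, idle.
    trace = [0, 10, 120, 400, 30, 0]
    print(simulate(trace))
```

With a fixed cluster, the node count would stay constant across the whole trace, so it would have to be sized for the burst and would sit mostly idle the rest of the time.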
QDS delivered the improvements in Big Data processing and total cost of ownership that Komli was looking for: faster performance and unlimited scale at a lower cost.
“Our business has a lot more flexibility now thanks to QDS,” says Shailesh Garg. “Nowadays users get their data processed in one or two hours max instead of the 10 to 15 hours it took before QDS. I’m no longer afraid when my business users ask for more data. In fact, I’m happy that they are able to depend on the availability of Big Data to make better decisions and I encourage them to experiment with even more business questions.” Komli also reports that Qubole offers outstanding technical support. It is pleased with the support team’s responsiveness and Big Data expertise throughout its migration to QDS.
Following its initial success with QDS, Komli is in the process of building a Big Data wish list that includes all the Big Data needs it has previously been unable to support. It is also looking at potentially adding Apache Hive data warehouse software to make it easier for its business analysts to query its data.
Qubole is a significantly more polished product than EMR. Data scientists can explore their data in S3, create tables, and query those tables, all via an easy-to-use web UI.
Qubole’s fantastic support has been key in our successful deployment. They continue to deliver new features and revisit the ones that we ask for.
Our goal at MediaMath was to take our existing industry-leading infrastructure to the next level, handling new complex analytics tasks. Qubole has helped us achieve this goal with minimal risk.
Instead of worrying about provisioning clusters of machines or job flows or whatever, Qubole lets you focus on your data and your queries … The Qubole guys have been extremely helpful!
The service spins up users’ clusters only when a job is started, then automatically scales or contracts them based on the workload, and spins the servers down once the job is done.
Qubole’s Hadoop and Hive interfaces are vastly superior to the default CLIs, which scare business analysts and hinder meaningful analyses of the gaming logs that we collect. With Qubole, business analysts are self-sufficient in using a Big Data platform to meet their advanced analytic needs.
Online Gaming Company
Top-performing technologies in the data industry are definitely taking aim at democratizing data tools and bringing the power of data to smaller businesses. This is a major change in the data industry, and Qubole Data Service is a great example.
I’m very happy to be using Qubole in production. Qubole has saved me a lot of time, effort, and trouble in getting my data processing pipelines up and running. My data pipelines process AppNexus data in Amazon S3, which is then stored in Vertica. The engineering team understands the complexities and provided awesome support!
Real-time Ads Retargeting Startup
There’s a whole world of web companies, SMBs and other non-Facebooks or Yahoos that will want to use Hadoop but not want to run it in-house…offering a cloud service makes it easier for these users to get started with the platform and for Qubole to keep improving.
Qubole offers a big data ETL and exploration service through auto-scaling Hadoop clusters with a web user interface for data exploration and integration with various data sources. The service can do (nearly) everything EMR can do, and it goes further.
Big Data Republic
Simba knows Big Data access. Qubole knows Big Data. Qubole’s founders authored Apache Hive, built key parts of the Hadoop ecosystem, and brought Apache HBase to Facebook.
“The integration of Tableau and Qubole makes it faster and easier for our customers to operationalize Big Data…lowers the resource barriers to deriving the benefits of Big Data because customers can deploy our joint solution seamlessly and cost effectively.”