Merkle’s Data Science Group Increases Capabilities While Reducing Model Runtimes and Cloud Costs with Qubole

Business Problem Overview

Merkle applies an industry-focused methodology to the solutions it provides to its clients, from customer strategy consulting, to audience and experience planning, to experience design and creation, to performance media and site execution. At the heart of these solutions is Merkle’s deep heritage in the data, analytics, and technology capabilities that enable them.

One key service that Merkle provides its clients is audience creation—identifying people who are likely to engage with a particular marketing campaign in a meaningful way. With increased competition, this type of service has become increasingly commoditized, intensifying pressure on marketing agencies (like Merkle) to advance their machine learning techniques to gain richer insights for their clients and increase their competitive advantage.

Machine learning involves gathering and combining data sets consisting of millions of records to build predictive models. Processing these enormous data sets requires considerable computing resources, especially with the ever-increasing urgency to obtain these predictions at a moment’s notice. Merkle’s prior on-premises environment could not efficiently scale to provide them. “Models would take about a day to run,” recalls Luke Berszakiewicz, Senior Manager of Data Science at Merkle. This meant that end-to-end modeling projects could take up to three weeks from start to finish, far longer than today’s near real-time world demands.

In 2018, members of Merkle’s data science team began scouting for a cloud-based vendor to modernize its machine learning processes, improve scalability and runtimes, and reduce data-processing costs. The team considered several suppliers but swiftly selected Qubole.

About Merkle

For more than 30 years, Merkle, a global data-driven marketing agency, has partnered with hundreds of organizations to build and maximize their customer portfolios. Merkle’s client roster includes a variety of top consumer brands and leading not-for-profits. The company, which employs more than 9,600 people in more than 50 offices worldwide and generated $1.1 billion in net revenue in 2019, uses data, technology, and analytics to pinpoint “areas of opportunity for our clients,” says Merkle president Craig Dempster.1

1 Business Wire

Superior Scalability and Runtime

Partnering with Qubole netted near-immediate results for Merkle in terms of greater scalability and reduced processing times. “Queries run faster vis-à-vis traditional server environments because they can more easily scale up computing resources on demand,” says Berszakiewicz. This has cut model runtimes in half: fully automated use cases can take “as little as half a day to complete.”

Qubole does more than allow Merkle to run models more quickly. It also enables the team to build and test more models in a collaborative manner. Before partnering with Qubole, says Berszakiewicz, “We could only test a certain number of scenarios at any given moment.” In contrast, Qubole “allows us to test a wide range of models and model parameters in a short amount of time.”

Qubole allows us to test a wide range of models and model parameters in a short amount of time to come up with the best fit for a particular business problem.

Luke Berszakiewicz, Senior Manager of Data Science, Merkle
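As an illustration only, the kind of parameter sweep Berszakiewicz describes can be sketched in a few lines of Python. This is a minimal sketch, not Merkle’s actual workflow: the `sweep` helper, the `toy_score` function, and the parameter names are all hypothetical, and the scoring function stands in for real model training and validation.

```python
from itertools import product


def sweep(param_grid, score_fn):
    """Try every combination in param_grid and return the best-scoring one.

    param_grid maps a parameter name to the list of values to try.
    score_fn takes a dict of parameters and returns a number (higher is better).
    """
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    # Hypothetical stand-in for "train a model, then measure validation quality".
    # It peaks at depth=4 and learning_rate=0.1.
    return -(params["depth"] - 4) ** 2 - 10 * abs(params["learning_rate"] - 0.1)


# Hypothetical grid: 3 x 3 = 9 scenarios to evaluate.
grid = {"depth": [2, 4, 8], "learning_rate": [0.01, 0.1, 0.5]}
best_params, best_score = sweep(grid, toy_score)
print(best_params, best_score)
```

Each combination is independent of the others, which is why this sort of search benefits from compute that can be scaled up on demand: more scenarios can be evaluated in parallel rather than one or two at a time.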

That’s not all. Using Qubole has also freed up bandwidth for other use cases, such as more efficient automated impression-level reporting with “drastically reduced human manual intervention, that wasn’t really possible before,” says Senior Associate in Data Science Peter Durham. One such use case entails matching ad impression data with conversion data for reporting purposes, a task that involves data volumes of up to 400 million records. Another entails serving up ad hoc query and reporting capabilities to clients for unplanned analyses. With Qubole’s cloud-based infrastructure, says Durham, “we can respond faster to our clients’ requests and with greater levels of insight.”
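To make the impression-to-conversion matching concrete, here is a minimal sketch in plain Python. It is an assumption-laden illustration, not Merkle’s reporting pipeline: the field names (`user_id`, `ad_id`, `order_id`, `ts`) and the matching rule (a conversion is credited to every earlier impression served to the same user) are hypothetical. At hundreds of millions of records the same logic would typically be expressed as a join in an engine such as Hive, Presto, or Spark.

```python
from collections import defaultdict


def match_impressions_to_conversions(impressions, conversions):
    """Pair each conversion with the same user's earlier ad impressions."""
    # Index impressions by user so each conversion needs only one lookup.
    by_user = defaultdict(list)
    for imp in impressions:
        by_user[imp["user_id"]].append(imp)

    matched = []
    for conv in conversions:
        for imp in by_user.get(conv["user_id"], []):
            # Only impressions served at or before the conversion time count.
            if imp["ts"] <= conv["ts"]:
                matched.append({
                    "user_id": conv["user_id"],
                    "ad_id": imp["ad_id"],
                    "order_id": conv["order_id"],
                })
    return matched


# Hypothetical sample records; timestamps are simple integers for brevity.
impressions = [
    {"user_id": "u1", "ad_id": "a1", "ts": 1},
    {"user_id": "u1", "ad_id": "a2", "ts": 5},
    {"user_id": "u2", "ad_id": "a1", "ts": 2},
]
conversions = [{"user_id": "u1", "order_id": "o1", "ts": 3}]

report_rows = match_impressions_to_conversions(impressions, conversions)
print(report_rows)
```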

Easy Adoption, Collaboration, and Integration

Before adopting Qubole, the Merkle data science team experimented with a different cloud-based tool. “The amount of time it took for someone who was green to operate in a cloud environment on that tech stack was about three months,” recalls Berszakiewicz. “That’s just a long onboarding process.” In contrast, Berszakiewicz observes, Qubole is essentially plug-and-play. “If a user is familiar with SQL, they can start using Hive or Presto on Qubole quickly. If they’re familiar with Python or R, they can start writing scripts in Qubole notebooks.”

Not having to manage infrastructure removes a big headache from our side. End-users don’t need to spend much time considering the hardware or the software running on top of it.

Peter Durham, Senior Associate in Data Science, Merkle

Low Compute Costs

Qubole scales automatically. When customers require compute power, Qubole automatically provisions the necessary compute resources to scale up and meet demand, then automatically scales back down when the demand passes. The result: “Autoscaling delivers significant cost avoidance compared to on-premises servers for running identical workloads,” says Berszakiewicz.

And unlike other analytics tools, which require significant manual computations, Qubole’s Cost Explorer facilitates the measurement of these cost savings, as well as of speed and ease of use.

Luke Berszakiewicz, Senior Manager of Data Science, Merkle
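The scale-up and scale-down behavior described in this section can be pictured with a simple sizing rule. The sketch below is a generic illustration and is not Qubole’s autoscaling algorithm: the `desired_nodes` function, its parameters, and the numbers used are all hypothetical.

```python
import math


def desired_nodes(pending_tasks, tasks_per_node, min_nodes, max_nodes):
    """Return how many nodes a cluster should run for the current workload.

    The count grows with demand and shrinks when demand passes, but always
    stays within the configured [min_nodes, max_nodes] range.
    """
    if tasks_per_node <= 0:
        raise ValueError("tasks_per_node must be positive")
    needed = math.ceil(pending_tasks / tasks_per_node)
    return max(min_nodes, min(max_nodes, needed))


# Hypothetical cluster limits: between 2 and 20 nodes, 10 tasks per node.
idle = desired_nodes(0, 10, 2, 20)       # no work queued: fall back to the minimum
busy = desired_nodes(95, 10, 2, 20)      # 95 tasks need ceil(95 / 10) nodes
burst = desired_nodes(1000, 10, 2, 20)   # demand beyond the cap: stay at the maximum
print(idle, busy, burst)
```

The cost argument follows from the first case: an on-premises server sized for the busiest hour is paid for around the clock, whereas an autoscaled cluster drops back toward its minimum when nothing is queued.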

Looking Ahead

Looking ahead, Merkle’s data science team plans to continue its ongoing effort to enhance and improve its portfolio of machine learning models. The team also hopes to incorporate streaming data into its models. Both of these efforts will require the team to lean even more heavily on Qubole’s data processing technology.

Business Value

  • Reduced model runtimes from a day to five or six hours
  • Shortened project durations from three weeks to four days
  • Increased capability to test a wide range of scenarios in a short amount of time (versus only being able to test one or two scenarios)
  • Improved efficiency to fulfill other use cases such as reporting and ad hoc analysis
  • Near-zero onboarding time for new members of the Data and Analytics team
  • Increased collaboration and efficiencies for analytics/reporting and ML
  • Reduced compute costs by multiple factors for the same workloads compared to the previous on-premises environment

In addition, Merkle clients who use the company’s models have seen an average 25 percent lift in sales, 26 percent increase in revenue, and 25 percent decrease in the cost of analysis, as well as increases in margin, conversion, and search traffic.2


2 Merkle


About Qubole

Qubole is the cloud-native data management platform for analytics and machine learning that allows enterprises to quickly harness the power of data to gain valuable business insights. Only Qubole provides a unified environment for all major cloud providers and data processing engines. The company’s unified environment includes optimized versions of Spark, Presto, Hive and Airflow, with intelligent automation technology that scales usage up or down to meet service-level needs and minimize cloud costs. Based in Santa Clara, Calif., Qubole has offices in New York City, San Francisco, London, Singapore and Bangalore. For more information, visit us online.