Dave Wilby (Return Path) showcases innovative AI and Machine Learning platform powered by Qubole


Dave Wilby: Good afternoon, everyone. Can you hear me? Thumbs up. Yes? Yay. This is so bizarre. It’s like, “All right.” Good afternoon. My name is Dave Wilby. I’m the chief operating officer for a company called Return Path. “Return who?” you may be asking. Return Path, we are a company that helps commercial senders reach the inbox through aiding them to improve their deliverability. How many of you here have a spam folder? Yes, all right. How many of you are Gmail users? Okay. How many use other mailbox providers? Good to know.

One of the other things we do is that we work in the community with mailbox providers to determine what is good email and what is bad email and to try and keep the bad email out of your inbox. Now, you may think that we do a really poor job, but equally, I would dare you to go into your spam folder and just see how much interesting email we do put away for you. We work with the mailbox providers to keep the bad guys out and let the good guys in. On the commercial side, we help the deliverability of commercial emails that you have probably signed up for, those coupons that you want to hear, offers you want to see, receipts you want to get, and travel information, and that’s where we sit.

We sit in the middle of the ecosystem. We have three big data sources where we pull from. Today, I want to give you an example of how we’re using big data on a platform with our cloud partners at Qubole to solve real-world problems. Those three big data sources are, we have a panel of over two million subscribers who use our tools every day in their inbox, that gives us first-party information. For those of you who are on the security and privacy side, we’re fully GDPR compliant, we have permission, it’s all aboveboard, we’re not selling the data to anyone.

The second piece is a collaboration between ourselves and over 80 mailbox providers around the globe where we share information about how well the email ecosystem is doing. By that, I mean we produce a blacklist of bad senders. These are the guys you should definitely not even bother to put in the spam folder, that you should just reject when the email arrives at the inbox. In return, they give us information around how much volume is arriving at that particular inbox. That gives us insights into close to 2.6 billion inboxes across the globe.

Finally, we have over 30 million parked domains. In other words, domains on the internet that should never be receiving email. They may well be domains that have once received email and now, have closed down or they are typos. For example, your bank with a misspelling. That gives us a good triangulation across three different vectors. What customers actually see in their inbox, what mailbox providers are actually delivering to the inbox, and from an infrastructure point of view, what shouldn’t be delivered? Between those three data points, we now have an unprecedented view about the health of the email ecosystem.

I don’t know how many of you were chemists at school, I failed miserably, but one of the things I do remember is the periodic table. This is our data periodic table. It’s nothing to do any more than there will be a quiz at the end of the presentation, and if you can remember more than two boxes, you win a prize. No, seriously. It’s about showing you how many disparate sorts of data you can have in your organization. If you think about your environment, think about all the different ways that you can collect data, think of all the different aspects that go to make up a picture of how your clients are doing, how your partners are doing, and without doubt, it becomes more and more complex.

Big data is only getting bigger and more complex. The infrastructure picture behind me is to basically just demonstrate that we have all those millions and millions of data points coming into a scalable platform, and in partnership with Qubole, we have built an infrastructure that allows us to do a couple of things. One is consolidate all that data together, but more importantly, it gives me, as the COO, confidence that I am presenting to my developers, analysts, and my end customers a safe environment in which to explore that data and improve their business without getting swamped.

How many of you here have data scientists on call or on consultancy? How many times do you have them going, “As you can see from the data,” and they send you this great, big Excel spreadsheet of numbers and you’re like, “Actually, I can’t see.” This is a great environment in which you can now build a picture of how well your business is doing. This isn’t really a technical talk. This talk is really about showing you how we use data to solve some problems and how we’re combining those different datasets together. How many of you send commercial email? How many of you send it out just to your clients, not just generating business? This is a fair smattering across the group.

All right, motherhood and apple pie moment, if you send more email, you’ll generate more revenue, right? Well, sort of. The problem with it is yes, you will, the more people you send to, but those same people that you send to will start to get disgruntled, if you over mail them, if you send too frequently, if you send the wrong message, they’re going to unsubscribe. Then you’re in this vicious cycle. You’re in this vicious cycle of paying more to acquire more people to send to, because your boss is standing over you going, “Send more email.” You end up in this terrible vicious cycle, which eventually will kill you. Your returns on investment will no longer work.

How do we get over that problem? One of the challenges we faced was a lot of our big commercial senders were saying, “We want to send more, but it’s hurting us. How can we get to the right client at the right time?” Getting to the inbox at the right time with the right message to the right client is the absolute target you want to hit. Too much email, too many unsubscribes, too much churn in the business. Problem is it’s a multi-variant problem, meaning it’s not just how much you can send and how frequently, it’s the engagement with the brand, it could be the life cycle of the customer at the time they’re engaged with your brand.

Personally, I moved house in June. For June, July, and August, I was desperately keen to get any offer from any home improvement store anywhere. It was like, “Yes, send me those vouchers.” Now, we’ve been in seven months, eight months, I don’t want those vouchers. Periodically, I want an update, yes, but I don’t want it. How do you take that all into account? Traditional methodologies say, “All right. I’m going to take my list of subscribers and I’m going to segment them into, typically, 30, 60, and 90-day lists.” That can work, but you’re going to miss a lot of opportunity, because what you’re missing is the fine-grained detail within each one of those categories.

What we’ve built is a new solution that is a machine learning AI-driven buzzword bingo fully enabled with a badge and a t-shirt solution. Realistically, what it’s doing is it’s looking for patterns, and it’s marrying together the engagement that this particular customer and their clients have with the brand, and it’s saying, “How does that change over the life cycle of their engagement with you? How quickly can we identify those patterns? Then, can we actually segment, not just on how often we send, but how we send and what is the content of those sends?” We’re bringing together the data from the client, and we’re bringing together that data from their ESP platform, from their CRM platform.

We’re bringing it together with a look-alike model that we have from our panel of two and a half million inboxes. We’re basically saying, “I’m going to segment but in a slightly different way.” I’m going to segment and say, “These customers are highly engaged with the inbox and highly engaged with the brand at this time, you can send more frequently to them. This middle set are highly engaged in the inbox but not necessarily with your brand but maybe with your competitors, you need to send to them at this frequency. Then this final set is probably in the bucket of you just need to keep them warm, because they may have a life event that’s going to change which will drive them to engage with your brand.”
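
The three-way segmentation described here could be sketched roughly as follows. This is a minimal illustration, not Return Path's actual model: the `Subscriber` fields, the 0 to 1 score scale, the 0.5 threshold, and the segment names are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Subscriber:
    email: str
    inbox_engagement: float  # hypothetical 0..1 score: activity in the inbox overall
    brand_engagement: float  # hypothetical 0..1 score: activity with this brand


def segment(sub: Subscriber, threshold: float = 0.5) -> str:
    """Place a subscriber in one of the three buckets described in the talk."""
    inbox_active = sub.inbox_engagement >= threshold
    brand_active = sub.brand_engagement >= threshold
    if inbox_active and brand_active:
        return "send_more_frequently"  # engaged with inbox and with the brand
    if inbox_active:
        return "send_moderately"       # engaged with inbox, not (yet) with the brand
    return "keep_warm"                 # low engagement, waiting for a life event


subscribers = [
    Subscriber("a@example.com", 0.9, 0.8),
    Subscriber("b@example.com", 0.7, 0.2),
    Subscriber("c@example.com", 0.1, 0.1),
]
segments = {s.email: segment(s) for s in subscribers}
```

In practice the scores would come from the combined client, ESP, CRM, and panel data, and a single fixed threshold is only a stand-in for whatever the look-alike model produces.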

Okay? We realized that if we just do a one-off scoring model, it’s not going to work. You have to continuously come back and revisit that model. First of all, the distinction we have between machine learning and AI is that machine learning gives a recommendation, and then the human or pinkware, as we like to call it, makes the final decision. The AI version brings it into a fully blown automated solution which then just self-learns. Now, one of the tricky things we found is that the model worked really well except when it starts testing itself.

You’ll start, you’ll see great performance, and then it will throw a curve ball to test its own assumptions. If you’re working with a client, that could be a really scary moment, because it’s great improvement, great improvement, great improvement, nothing, and then you’re back up again. It’s like, “This is a learning experience I want to share with you.” It’s like, “It’s not all roses. You want to be careful and position that they are testing and working together.”

This particular use case is a bricks and mortar retailer in the US, who I will not name, but basically had multiple millions of subscribers, and they were sending to everybody once a day. The entire list, every day. Their churn rates were close to about 30%, 40%, and they were pouring money into acquiring more and more users. For everybody who came into the retail store, when you got your receipt, they were saying, “Can I email it to you?” Any possible way, so it was a massive opportunity. What we did was we started working with them for a four-week period, and we also split the list. We did a control and we did a test sample.
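
A control/test split of a subscriber list can be sketched as below. The talk does not say how the split was done, so the random assignment, the 50/50 fraction, and the function name `split_control_test` are assumptions for illustration only.

```python
import random


def split_control_test(addresses, test_fraction=0.5, seed=0):
    """Randomly split a list into (control, test) groups.

    A fixed seed keeps the assignment reproducible between runs.
    """
    shuffled = list(addresses)              # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)   # seeded, in-place shuffle of the copy
    cut = int(len(shuffled) * test_fraction)
    test = shuffled[:cut]                   # receives the model-driven sends
    control = shuffled[cut:]                # keeps the existing send pattern
    return control, test


all_addresses = [f"user{i}@example.com" for i in range(10)]
control, test = split_control_test(all_addresses)
```

Comparing churn and revenue between the two groups over the same period is what lets you attribute a difference to the new sending pattern rather than to seasonality.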

What we managed to discover on the list growth side was significant improvements, a 20% lift in your list growth. It’s not because we did anything magic, what we did was we stopped people churning and unsubscribing, which means your acquisition dollars go to actually grow the list rather than keeping it at a steady state, make sense? That’s real dollars you can take back up the chain or reinvest in your own infrastructure.

The second one, and this is the money shot, is did we generate more revenue for them? Yes, we did, significantly. The challenge here though is when we first started out, the model started sending less, and that’s a real hard place to go. If you can imagine going to your boss and going, “I think I can send more and make more money, but I’m going to start off by sending less,” and see how that discussion goes in your weekly one-to-one, so it’s a bit of a challenge there, but what we have seen with the model is a break-even point of four to five months and then it starts accelerating. Does that make sense? Okay. Rip-roaring success, this is going to be a product from you, Return Path. Awesome, no.

In the world of agile development, this is a failure. Why? Because I cherry-picked the one example that worked, out of 10, 9 failed. Why did nine fail? Because the customer was unwilling to implement the [unintelligible 00:12:24] mile. The recommendations from the model were coming back, they were coming back saying, “You need to send less,” or for the chronic under mailer who wasn’t sending enough email, they didn’t have enough creative, so they were like, “I’ll send the same thing again.” Like, “No. No, no, no,” so not everything is milk and honey when it comes to big data, but big data is very interesting especially when you combine it and look at that multi-variant problem to solve real customer problems.

We haven’t completely thrown it away, we are actually now bringing this to market as a professional services offering. We wanted it as a turnkey, press the big red button and it would work, now, we’re bringing it to market as a professional services offering. Word of warning, not everything in the big data space is good. The second problem I want to talk to you today about is how many of you send to Gmail and have a full understanding of exactly what’s going on with Gmail? No, good, because nobody does. Gmail, although we have a relationship with over 80 mailbox providers around the world– Anybody here from Gmail, Google, before I– No? Good, all right. I could be a little bit more candid then.

They don’t share anything. They do not share data with anybody, so it is very difficult as an email provider and an email sender to understand what’s going on. You could have this weird dichotomy where you’ve sent an email program, week one, you send exactly the same one or a very similar one, week two, you get a very different result. This is something our customers have been telling us about, “If you can solve this problem for us, Dave, it’d be awesome.” Well, I think we might have, which is great, and it’s a product that’s going to come to market in the June timeframe in North America and worldwide in October.

We know from observation that Gmail uses incredibly sophisticated engagement level monitoring to determine whether you’re going to get placed in the inbox. From the moment you press the button saying, “I’m going to send a campaign,” till the end of that campaign, Gmail will react so quickly that if the first, say, hundred users don’t like it, if they either say, “This is spam,” or, “I’m not interested,” or they’re not engaging with it, they will then move your campaign to a folder, and if you still don’t get a response, they will then move it to the spam folder, and eventually, they will put it in the junk folder or just deny it at the gateway. We see this, we have data for it over and over again.

It’s a highly reactive very quick turnaround on engagement. What can we do? The solution that we’ve built is, again, building from those two million panelists to say, “Can we effectively score your list based on engagement? Can we reorder the list, so that you are sending to your most engaged users first, then your medium engaged, and then your low engaged, which means you’ve got more chance of the whole campaign going through?” Yes, we can. We build look-alike models.
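
The reordering idea, sending to the most engaged recipients first, can be sketched as a simple sort. The engagement scores here are made-up placeholders; in the talk they come from look-alike models built on panel and customer data, which this example does not attempt to reproduce.

```python
def order_for_send(scores: dict[str, float]) -> list[str]:
    """Return addresses ordered from most to least engaged.

    `scores` maps an address to a hypothetical engagement score
    (higher means more engaged). Ties keep their original order,
    because Python's sort is stable.
    """
    return sorted(scores, key=lambda addr: scores[addr], reverse=True)


engagement = {
    "low@example.com": 0.2,
    "high@example.com": 0.9,
    "medium@example.com": 0.5,
}
send_order = order_for_send(engagement)
```

Because the earliest recipients of a campaign shape how the mailbox provider treats the rest of it, putting the likely openers at the front is the whole point of the ordering.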

Again, building on the Qubole platform that we have, we’re pulling in the panel data, we’re looking at the customer data, we’re marrying the two together, creating look-alike models, and then extrapolating that out and sending that automagically back into the ESP platform, so the delivery cycle actually works. Proof of the pudding, in the eating. A beautiful chart put together by one of my data scientists, who I actually managed to convince that a graph would be better than numbers. It’s still not a thing of beauty so bear with me. Point one on the graph, so the line chart is deliverability, the lower bar chart is their list size.

We’ve categorized it into either highly engaged with the brand or highly engaged with their inbox. For those of you who are color blind, my apologies, I’ll explain it later, but if you’re okay with it, I’ll plow on. You can see at this point, they bought something or they did an acquisition strategy. Huge list increase here, the next thing is they start sending to that list, you see a huge bump. The good news is Gmail has a model that is innocent until proven guilty. They’ll deliver email until people start complaining about it or they ignore it. As soon as they reach that point, you start to see this curve coming down. They’re either not engaging with their brand or they’re saying this is spam.

If you ever want to– I shouldn’t probably say this, but if you ever want to hurt a competitor, receive their email and say, “This is spam,” in Gmail. It’s massive. Oh, I know. I shouldn’t say that. Anyway, you’re beginning to see that this is a cyclical pattern, that they would do a great campaign, then it would dive off, they’d acquire more users. They do an activity that generates more to their list, and they start going in this cycle. We then engaged with them at this point, and you can see, it’s just going to continue, but during this period between this line and this line, we started analyzing the list.

We started reordering the list and started saying, “All right, you need to start pulling this back in and then send in this order rather than send in the random order that you have today.” What happened? Two things happened. First of all, your highly engaged users stopped oscillating, they became a lot better at deliverability. You stop being penalized for that lack of engagement. The second thing that happened is your low engaged users, again, you stopped that big oscillation, and note that the oscillation stopped while the list kept growing. In the past, as soon as you started losing people, your oscillation started to kick in again.

This gives you a much better performance against your email campaign in Gmail. Now, one of the questions I get around this is, “If I’ve reordered the list once, I’m done, aren’t I, Dave?” The good news for me is that you’re not, because that engagement level with your brand will change over time. Much like we had with the send frequency model, it’s going to be exactly the same. It has a half-life of about four weeks, so after about four weeks, you need to revisit that list. Okay? All make sense? All right.
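
One way to read the "half-life of about four weeks" is as exponential decay in how much a score can still be trusted. The sketch below takes that literal reading, which is an assumption: the speaker may mean it only as a rule of thumb, and the function names and the rescoring rule are hypothetical.

```python
def decayed_weight(weeks_elapsed: float, half_life_weeks: float = 4.0) -> float:
    """Fraction of a score's original weight left after `weeks_elapsed` weeks.

    Exponential decay: the weight halves every `half_life_weeks`.
    """
    return 0.5 ** (weeks_elapsed / half_life_weeks)


def needs_rescoring(weeks_since_scored: float, half_life_weeks: float = 4.0) -> bool:
    """Flag a list for rescoring once one half-life has passed."""
    return weeks_since_scored >= half_life_weeks


weight_after_one_month = decayed_weight(4)   # half the original weight
weight_after_two_months = decayed_weight(8)  # a quarter of the original weight
```

Under this reading, an ordering that is eight weeks old carries only a quarter of its original weight, which is why a one-off reordering does not hold.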

Two practical examples, hopefully, of how we have used big data to create new solutions. Combining data from multiple sources is key, it can also be bloody confusing, frustrating as hell, so I would recommend that you work with and build on a partner platform. Find something that can scale with your business, that gives you the ability to have predictable, reproducible, and scalable results. The other thing that we discovered is that we decoupled our data science from our engineers. In the past, the data scientists would come up with a solution and then they would be annoyed and frustrated that they had to wait behind other more important things that engineers were working on.

Now, with Qubole as a platform, what we have now is a platform that allows us to rapidly prototype. A data scientist can come up with a brand new idea and a brand new model, they can deploy it into what’s known as a “notebook”, I can then push that out to my clients. My clients being my own professional services teams or my customers and they can test rapidly and give us feedback rather than waiting for a gap in the engineering calendar, then pushing the results out in code, then harvesting the results, finding it didn’t work and coming back again. We’ve come down to a less than two-week iteration process, where we can rapidly prototype. Now, one of the side effects is also that, now, my data scientists believe that they’re all engineers, that they can build anything, so just a word of warning there, but rapid prototyping is the benefit with the right platform. The other thing is that it has driven complete cultural change, and now, my data scientists are the rock stars of our organization, because they can get in front of customers, they get that immediate feedback loop, and they’re producing real-world examples.

Thanks for your attention.