Predicting Customer Churn with XGBoost & Apache Spark in AWS

October 9, 2018

In this video we focus on XGBoost (a distributed gradient boosting library) and the Telco Customer Churn dataset, training a model to predict customer churn with automated Apache Spark ML pipelines managed by Qubole and its notebooks. We then explore productionizing the trained XGBoost ML pipeline behind a customer web portal to score customers in real time and present tailored offers that preempt churn. Along the way we also cover two machine learning portability formats for model export: Predictive Model Markup Language (PMML) and Portable Format for Analytics (PFA).
