Predicting Customer Churn with XGBoost & Apache Spark in AWS

In this video we will focus solely on XGBoost (a distributed machine learning algorithm) and the Telco Customer Churn Dataset to train and predict Customer Churn using automated Apache Spark ML pipelines manage by Qubole and their Notebooks. We will then explore productionizing the trained XGBoost ML pipeline behind a Customer Web Portal to perform real-time scoring of a customer and present tailored offers to preempt customer churn. Through this journey we will also cover the machine learning portability formats Predictive Model Markup Language (PMML) and Portable Format for Analytics (PFA) for model export.