Big news! I won a second Kaggle competition (apparently lightning does strike twice). The win came with a $2,500 prize and an invitation to be a guest speaker at the WSDM 2018 Conference.
Over the Thanksgiving and Christmas breaks I decided to compete in another Kaggle competition. This time the challenge was to build a subscription churn model for the Asian music subscription company KKBOX. I quickly rose to the top 5 and held on until the end to place 1st out of 575 teams.
My solution consisted of Microsoft T-SQL for ETL/munging and some feature creation (the datasets were quite large, ~30GB) and Python/pandas/scikit-learn for modelling, with XGBoost and LightGBM being the primary learning algorithms used. Feature engineering dominated this competition, and creativity paid off!
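For readers who want a feel for the Python side of a pipeline like this, here is a minimal sketch, not the actual competition code. It assumes a hypothetical toy transaction log: the column names, the four aggregate features, and the data are all made up for illustration. scikit-learn's GradientBoostingClassifier stands in for XGBoost/LightGBM so that the example runs on a handful of rows.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier

# Hypothetical transaction log (illustrative columns, not the real schema).
transactions = pd.DataFrame({
    "user_id":       [1, 1, 2, 2, 3, 3, 4, 4],
    "amount_paid":   [149, 149, 99, 0, 149, 149, 99, 0],
    "is_auto_renew": [1, 1, 0, 0, 1, 1, 0, 0],
    "is_cancel":     [0, 0, 0, 1, 0, 0, 0, 1],
})

# Hypothetical churn labels, one row per user.
labels = pd.DataFrame({"user_id": [1, 2, 3, 4], "is_churn": [0, 1, 0, 1]})


def build_features(tx: pd.DataFrame) -> pd.DataFrame:
    """Collapse the per-transaction log into one feature row per user."""
    return (
        tx.groupby("user_id")
        .agg(
            n_transactions=("amount_paid", "count"),
            mean_paid=("amount_paid", "mean"),
            auto_renew_rate=("is_auto_renew", "mean"),
            n_cancels=("is_cancel", "sum"),
        )
        .reset_index()
    )


features = build_features(transactions)
train = features.merge(labels, on="user_id")

X = train.drop(columns=["user_id", "is_churn"])
y = train["is_churn"]

# Stand-in gradient boosting model; the hyperparameters here are arbitrary.
model = GradientBoostingClassifier(n_estimators=20, max_depth=2, random_state=0)
model.fit(X, y)

# Predicted probability of churn for each user (scored on the training rows,
# which is only acceptable in a toy example like this one).
churn_probability = model.predict_proba(X)[:, 1]
```

In a real setting the aggregation step would run over far more data (plausibly in SQL, as described above), and the model would be evaluated on held-out users rather than on the rows it was trained on.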
———Tools Used———
- Microsoft SQL Server 2016, Linux mode (Azure VM)
- LightGBM Python Library – 2.0.11
- XGBoost Python Library – 0.6
- scikit-learn – 0.19.1
- Pandas – 0.22.0
- NumPy – 1.14.0 rc1
———Links———
- Recorded Presentation (video): https://www.youtube.com/watch?v=OEDUzVH1aDI
- My Medium post detailing my solution: https://medium.com/@bryan.gregory1/predicting-customer-churn-extreme-gradient-boosting-with-temporal-data-332c0d9f32bf
- WSDM 2018 Cup Challenge: https://wsdm-cup-2018.kkbox.events/
- Kaggle Competition Overview: https://www.kaggle.com/c/kkbox-churn-prediction-challenge
- Final standings: https://www.kaggle.com/c/kkbox-churn-prediction-challenge/leaderboard