Predictive Analytics · Retention Strategy

Customer Churn Prediction & Retention Strategy

An end-to-end churn prediction workflow built for a subscription business — identifying customers most at risk of leaving, understanding the key drivers, and translating model outputs into targeted retention actions by segment.

Python · pandas · scikit-learn · XGBoost · SciPy · Matplotlib

Client work is confidential. This case study uses the publicly available IBM Telco Customer Churn dataset to demonstrate the analytical approach.

The Challenge

The business problem

Subscription businesses lose revenue when customers leave — and the cost of acquiring a new customer far exceeds the cost of retaining an existing one. The challenge is identifying churn risk early enough to act, and understanding which customers to prioritise.

  • Which customers are most likely to churn?
  • What attributes are most strongly linked to leaving?
  • Which segments should be prioritised for retention?
  • How can model outputs be turned into targeted business action?
Approach

How the work was structured

01
Data Cleaning
Converted TotalCharges to numeric, removed 11 incomplete records, validated the resulting 7,032-row dataset.
02
Exploratory Analysis
Examined churn rates across contract type, tenure, monthly charges, internet service, payment method, and demographics.
03
Feature Engineering
Created tenure bands, service count, contract risk flag, support flag, and high-charge indicator to aid model interpretability.
04
Predictive Modelling
Trained and compared Logistic Regression, Random Forest, and XGBoost using class-balancing to handle the 74/26 churn imbalance.
05
Retention Strategy
Mapped model outputs and EDA findings to segment-level retention actions with an illustrative revenue impact estimate.
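
As an illustration of steps 01 and 03, the cleaning and feature engineering might look like the sketch below. It runs on a few toy rows rather than the real CSV. The column names follow the public Telco dataset, but the tenure band edges, the two-column service list, and the helper names `clean` and `add_features` are assumptions made for this example, not the exact definitions used in the project.

```python
import pandas as pd

# Toy rows standing in for the Telco CSV (values are made up for illustration).
raw = pd.DataFrame({
    "tenure": [0, 5, 30, 70],
    "Contract": ["Two year", "Month-to-month", "One year", "Two year"],
    "MonthlyCharges": [20.0, 95.5, 60.0, 110.0],
    "TotalCharges": [" ", "480.1", "1800.0", "7700.5"],  # blank string = incomplete record
    "TechSupport": ["No", "No", "Yes", "Yes"],
    "OnlineSecurity": ["No", "No", "No", "Yes"],
    "Churn": ["No", "Yes", "No", "No"],
})


def clean(df):
    """Convert TotalCharges to numeric and drop rows where it is missing."""
    out = df.copy()
    # errors="coerce" turns unparseable strings (such as " ") into NaN.
    out["TotalCharges"] = pd.to_numeric(out["TotalCharges"], errors="coerce")
    return out.dropna(subset=["TotalCharges"]).reset_index(drop=True)


def add_features(df):
    """Add interpretable flags; all thresholds here are assumptions."""
    out = df.copy()
    # Hypothetical tenure bands in months.
    out["tenure_band"] = pd.cut(
        out["tenure"],
        bins=[0, 12, 24, 48, 72],
        labels=["0-12", "13-24", "25-48", "49-72"],
        include_lowest=True,
    )
    # Only two service columns are counted in this toy example.
    service_cols = ["TechSupport", "OnlineSecurity"]
    out["service_count"] = (out[service_cols] == "Yes").sum(axis=1)
    out["contract_risk"] = (out["Contract"] == "Month-to-month").astype(int)
    out["has_support"] = (out["service_count"] > 0).astype(int)
    # "High charge" is taken to mean above the median monthly charge.
    out["high_charge"] = (
        out["MonthlyCharges"] > out["MonthlyCharges"].median()
    ).astype(int)
    return out


features = add_features(clean(raw))
print(features[["tenure_band", "service_count", "contract_risk", "high_charge"]])
```

On the real data, the same `clean` step would be the point at which the 11 incomplete records drop out.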
Exploratory Analysis

What the data revealed

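
The churn-rate comparisons behind this section can be computed with a grouped mean. The sketch below uses a handful of toy rows, and `churn_rate_by` is a hypothetical helper name. It is meant to show the shape of the calculation, not to reproduce the project's figures.

```python
import pandas as pd

# Toy rows for illustration only; the real analysis uses the 7,032-row dataset.
toy = pd.DataFrame({
    "Contract": ["Month-to-month", "Month-to-month", "One year",
                 "Two year", "Month-to-month", "Two year"],
    "Churn": ["Yes", "No", "No", "No", "Yes", "No"],
})


def churn_rate_by(df, column):
    """Return the share of churned customers per category, highest first."""
    churned = df["Churn"].eq("Yes")
    return churned.groupby(df[column]).mean().sort_values(ascending=False)


rates = churn_rate_by(toy, "Contract")
print(rates)
```

The same helper could be pointed at other categorical columns such as payment method or internet service.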
Predictive Modelling

Model performance

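Three models were compared on an 80/20 stratified train/test split, with class balancing to handle the 74/26 churn imbalance. For churn prediction, recall and ROC-AUC are the primary metrics, because identifying at-risk customers matters more than raw accuracy. On those two metrics Logistic Regression performs best (recall 0.794, ROC-AUC 0.834), while Random Forest has the highest accuracy (0.784) but the lowest recall (0.476).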

Model                Accuracy  Precision  Recall  F1     ROC-AUC
Logistic Regression  0.726     0.491      0.794   0.607  0.834
Random Forest        0.784     0.622      0.476   0.539  0.816
XGBoost              0.737     0.504      0.607   0.551  0.794
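
A comparison of this kind can be sketched with scikit-learn as shown below. The example uses synthetic data from `make_classification` with a 74/26 class split, so its scores will not match the table. Using `class_weight="balanced"` is one possible reading of "class-balancing" and is an assumption here. XGBoost is left out to keep the sketch dependency-light; its `XGBClassifier` exposes a `scale_pos_weight` parameter for the same purpose.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the churn features, with roughly 26% positives.
X, y = make_classification(n_samples=2000, n_features=10,
                           weights=[0.74, 0.26], random_state=0)

# 80/20 split, stratified so both parts keep the same churn share.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

models = {
    "Logistic Regression": LogisticRegression(class_weight="balanced",
                                              max_iter=1000),
    "Random Forest": RandomForestClassifier(class_weight="balanced",
                                            n_estimators=200, random_state=0),
}


def evaluate(model):
    """Fit on the training split and score on the held-out split."""
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    proba = model.predict_proba(X_test)[:, 1]  # probability of the churn class
    return {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred),
        "recall": recall_score(y_test, pred),
        "f1": f1_score(y_test, pred),
        "roc_auc": roc_auc_score(y_test, proba),
    }


results = {name: evaluate(model) for name, model in models.items()}
for name, scores in results.items():
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

ROC-AUC is computed from predicted probabilities rather than hard labels, which is why `predict_proba` is used alongside `predict`.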
Business Insights

What this means in plain English

Contract type is the strongest churn predictor
Month-to-month customers churn at significantly higher rates than those on annual or two-year contracts. Flexible contracts reduce commitment and dramatically increase exit risk.
The first 12 months are the highest-risk window
Short-tenure customers show the highest churn rates. Customers who reach 12 months are substantially more likely to stay — early experience and onboarding are critical.
Higher charges increase risk among newer customers
Customers paying above-median monthly charges — particularly those with shorter tenure — show elevated churn. They may not yet perceive enough value to justify the cost.
Absence of support services increases churn risk
Customers without tech support or online security are more likely to leave, especially fibre subscribers. These customers encounter friction with no safety net.
Electronic check payers churn more
This payment method is associated with higher churn than automatic payment options, potentially reflecting lower service engagement or commitment.
Senior citizens show elevated churn risk
Senior citizens churn at a higher rate than non-senior customers, suggesting potential accessibility gaps or unmet needs in this segment.
Retention Strategy

Turning predictions into action

Segment                            Risk Signal                               Recommended Action
New customers (0–6 months)         Low tenure → high exit risk               Structured onboarding, setup support, 30/60/90-day check-ins
Month-to-month contract holders    Flexible contract → low commitment        Offer annual contract incentive or loyalty discount at renewal
High monthly charge, short tenure  Price sensitivity before perceived value  Proactive loyalty offer before the 6-month mark
No tech support / online security  Lower service stickiness                  Bundle tech support or security into existing plan
Electronic check payers            Low payment automation                    Incentivise switch to auto-pay with a small monthly discount
Senior citizens                    Accessibility / engagement gaps           Dedicated support channel and simplified service experience
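
One way to apply the table programmatically is a small rule function that tags each customer with every action whose segment they fall into. The sketch below is an assumption about how that mapping could be coded. `recommend_actions` is a hypothetical helper, the 6-month cut-off for "short tenure" and the median charge passed in are placeholders, and the field names follow the public Telco dataset.

```python
def recommend_actions(customer, median_monthly_charge):
    """Return the retention actions whose segment rules match this customer."""
    actions = []
    is_new = customer["tenure"] <= 6  # assumed cut-off, in months
    if is_new:
        actions.append("Structured onboarding and 30/60/90-day check-ins")
    if customer["Contract"] == "Month-to-month":
        actions.append("Annual contract incentive or loyalty discount")
    if is_new and customer["MonthlyCharges"] > median_monthly_charge:
        actions.append("Proactive loyalty offer")
    # "No" means the customer has internet service but not this add-on.
    if customer["TechSupport"] == "No" or customer["OnlineSecurity"] == "No":
        actions.append("Bundle tech support or security")
    if customer["PaymentMethod"] == "Electronic check":
        actions.append("Auto-pay discount")
    if customer["SeniorCitizen"] == 1:
        actions.append("Dedicated support channel")
    return actions


# Hypothetical customer record for illustration.
example = {
    "tenure": 3,
    "Contract": "Month-to-month",
    "MonthlyCharges": 90.0,
    "TechSupport": "No",
    "OnlineSecurity": "Yes",
    "PaymentMethod": "Electronic check",
    "SeniorCitizen": 0,
}
actions = recommend_actions(example, median_monthly_charge=70.0)
print(actions)
```

A customer can match several segments at once, so in practice the list would likely be prioritised or capped before any outreach.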
Customers targeted: 600 (high-risk customers identified by the model)
Campaign effectiveness: 15% (assumed retention rate from outreach)
Customers retained: 90 (600 × 15%)
Revenue retained: $70,200 (90 customers at $65/month × 12 months)
Illustrative estimate; actual results depend on campaign cost and targeting precision.
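
The arithmetic behind the estimate is simple enough to check in a few lines. The function name `retained_revenue` is hypothetical, and the inputs are the illustrative assumptions stated above, not measured campaign results.

```python
def retained_revenue(customers_targeted, retention_rate, monthly_charge, months=12):
    """Return (customers retained, revenue retained) for a simple campaign model."""
    retained = round(customers_targeted * retention_rate)
    revenue = retained * monthly_charge * months
    return retained, revenue


retained, revenue = retained_revenue(600, 0.15, 65)
print(f"{retained} customers retained, ${revenue:,} revenue retained")
# prints "90 customers retained, $70,200 revenue retained"
```

Campaign cost is not subtracted here, so the figure is gross retained revenue rather than net benefit.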