This repository contains an end-to-end deep learning pipeline that predicts customer churn using PyTorch. By analyzing customer demographics, account information, and service usage, the model identifies high-risk customers so that businesses can target retention strategies through a live interactive dashboard.
- Objective: Predict binary churn (Yes/No) with high precision.
- Model: Multilayer Perceptron (MLP) with Batch Normalization and Dropout.
- Performance: Peak accuracy of 79.74% on the validation set.
- Interactive Interface: Live Streamlit web application for real-time risk scoring.
- Step 1: We defined churn as the target variable (binary classification) to help the sales team prioritize retention efforts.
- Step 2 (Data): Cleaned the Kaggle Telco dataset, handling "empty string" errors in `TotalCharges`.
  - Encoded categorical features using `LabelEncoder`.
  - Applied `StandardScaler` so that numerical features (Tenure, Charges) have a mean of 0 and a variance of 1.
- Step 3 (Architecture): Designed a 3-layer MLP architecture (`64 -> 32 -> 1`).
  - Implemented `BatchNorm1d` for training stability and `Dropout` to prevent overfitting.
- Step 4 (Training):
  - Used `BCEWithLogitsLoss` for numerical stability.
  - Trained for 30 epochs, reaching a final training loss of 0.4060.
- Step 5: Monitored accuracy across epochs, starting at 78.54% and peaking at 79.74%.
- Step 6: Developed an `inference.py` script for individual customer scoring.
  - Deployed a Streamlit dashboard for non-technical stakeholders to perform "What-If" analysis.
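
The data step (Step 2) can be sketched roughly as follows. This is a minimal illustration, not the contents of `dataset.py`: the toy rows are made up, only a few of the dataset's columns are shown, and filling the blank `TotalCharges` values with 0 is an assumed strategy (the repository may drop or impute those rows differently).

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Toy rows standing in for the Telco CSV; the values are illustrative only.
df = pd.DataFrame({
    "Contract": ["Month-to-month", "Two year", "One year", "Month-to-month"],
    "tenure": [1, 60, 24, 0],
    "MonthlyCharges": [70.0, 90.0, 55.0, 20.0],
    "TotalCharges": ["70.0", "5400.0", "1320.0", " "],  # blank string in the last row
    "Churn": ["Yes", "No", "No", "No"],
})

# Blank strings cannot be parsed as floats: coerce them to NaN, then fill.
# Filling with 0.0 is an assumption made for this sketch.
df["TotalCharges"] = pd.to_numeric(df["TotalCharges"], errors="coerce").fillna(0.0)

# Encode each categorical column as integer codes (classes are sorted alphabetically).
encoders = {}
for col in ["Contract", "Churn"]:
    encoders[col] = LabelEncoder()
    df[col] = encoders[col].fit_transform(df[col])

# Standardize the numerical columns to zero mean and unit variance.
numeric_cols = ["tenure", "MonthlyCharges", "TotalCharges"]
scaler = StandardScaler()
df[numeric_cols] = scaler.fit_transform(df[numeric_cols])

print(df)
```

In a full pipeline the encoders and the scaler would normally be fit on the training split only and then reused at inference time, so that validation data does not leak into the statistics.
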
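
The architecture and training steps (Steps 3 and 4) might look something like the sketch below. Only the layer widths, `BatchNorm1d`, `Dropout`, and `BCEWithLogitsLoss` come from the description above; the class name `ChurnMLP`, the ReLU activations, the dropout rate, the Adam optimizer, the learning rate, and the feature count are all assumptions made for illustration, and the batch is random stand-in data.

```python
import torch
from torch import nn


class ChurnMLP(nn.Module):
    """Hypothetical 64 -> 32 -> 1 MLP; activation and dropout rate are assumed."""

    def __init__(self, n_features: int, dropout: float = 0.3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64),
            nn.BatchNorm1d(64),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(64, 32),
            nn.BatchNorm1d(32),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(32, 1),  # outputs a raw logit; no sigmoid inside the model
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).squeeze(1)


torch.manual_seed(0)
n_features = 19  # assumed number of input features
model = ChurnMLP(n_features)
criterion = nn.BCEWithLogitsLoss()  # applies the sigmoid internally, in a stable form
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # optimizer is an assumption

# Random stand-in batch: 16 customers with binary churn labels.
x = torch.randn(16, n_features)
y = torch.randint(0, 2, (16,)).float()

# One training step.
model.train()
optimizer.zero_grad()
logits = model(x)
loss = criterion(logits, y)
loss.backward()
optimizer.step()

# At evaluation time, apply the sigmoid explicitly to obtain probabilities.
model.eval()
with torch.no_grad():
    probs = torch.sigmoid(model(x))

print(f"loss={loss.item():.4f}")
```

Keeping the sigmoid out of the model is what allows `BCEWithLogitsLoss` to be used: the loss works on raw logits, and probabilities are only computed when scoring.
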
The following visualizations provide context for why the model makes specific predictions:
- Observation: Customers on Month-to-Month contracts exhibit drastically higher churn rates compared to One-year or Two-year contracts.
- Insight: Contract flexibility is the primary driver of attrition; gender plays a minimal role.
- Observation: Churn is highest in the 0-1 Year group and decreases significantly as tenure increases.
- Insight: The first 12 months are the "critical period" requiring focused onboarding.
- Observation: Fiber Optic users without Online Security are high-risk.
- Insight: Bundling security services with high-speed internet is a strategic retention necessity.
- Observation: Electronic Check is the most frequent payment method but is historically linked to higher churn.
- Insight: Incentivizing a move toward Automatic Credit Card or Bank Transfer could reduce involuntary churn.
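
Observations like these come down to comparing churn rates across categories. A minimal sketch of that calculation is shown below, assuming the `Contract` and `Churn` column names from the Kaggle CSV; the helper `churn_rate_by` is hypothetical and the rows are toy data, so the printed rates are not results from the real dataset.

```python
import pandas as pd

# Toy rows for illustration only; real rates come from the Telco CSV.
df = pd.DataFrame({
    "Contract": [
        "Month-to-month", "Month-to-month", "Month-to-month",
        "One year", "Two year", "Two year",
    ],
    "Churn": ["Yes", "Yes", "No", "No", "No", "No"],
})


def churn_rate_by(frame: pd.DataFrame, column: str) -> pd.Series:
    """Return the share of churned customers per category, highest first."""
    churned = frame["Churn"].eq("Yes")
    return churned.groupby(frame[column]).mean().sort_values(ascending=False)


rates = churn_rate_by(df, "Contract")
print(rates)
```

The same helper could be pointed at other categorical columns, such as the payment method, to reproduce the remaining comparisons.
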
- `data/`: Contains the `WA_Fn-UseC_-Telco-Customer-Churn.csv`.
- `dataset.py`: Preprocessing and PyTorch `Dataset` class.
- `model.py`: Neural Network architecture.
- `train.py`: Logic for training and validation loops.
- `main.py`: Entry point for the full training pipeline.
- `inference.py`: Script to generate a risk score (0-100) for a single customer.
- `app.py`: Streamlit web interface for interactive predictions.
- `requirements.txt`: Environment dependencies.
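
Because the model is trained with `BCEWithLogitsLoss`, it outputs a raw logit, so a 0-100 risk score needs a conversion step. One plausible way to do that is sketched below; the function `risk_score` is hypothetical, and scaling the sigmoid probability by 100 is an assumption about how `inference.py` derives its score.

```python
import math


def risk_score(logit: float) -> float:
    """Map a raw model logit to a 0-100 churn risk score via the sigmoid."""
    # Branch on the sign so that math.exp never receives a large positive argument.
    if logit >= 0:
        prob = 1.0 / (1.0 + math.exp(-logit))
    else:
        e = math.exp(logit)
        prob = e / (1.0 + e)
    return round(100.0 * prob, 1)


for z in (-2.0, 0.0, 2.0):
    print(z, risk_score(z))
```

A logit of 0 corresponds to a 50% churn probability, so scores above 50 mark customers the model considers more likely than not to churn.
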
- Clone the repo and navigate to the directory: `git clone https://github.com/devopspower/telco-customer-churn-predictor.git`
- Install dependencies: `pip install -r requirements.txt`
- Train the model: `python main.py`
- Launch the Interactive Dashboard: `streamlit run app.py`
