Beryl Analytics Blog

How to Build a Churn Prediction Model That Sales and Success Actually Use

By Beryl Analytics • 21 January 2026 • 9 min read

Plenty of companies build a churn model. Far fewer build one that anyone uses. The model gets trained, hits a respectable accuracy number, lands in a slide deck, and then sits there while customers keep leaving. The difference between a churn model that gathers dust and one that saves real revenue is rarely the algorithm. It is the decisions made around it: how you define churn, which features you trust, where you set the threshold, and how the score reaches the people who can act. This guide walks through the full build, end to end, with the operational details that make or break adoption.

Step one: define and label churn precisely

Before any modeling, you have to answer a deceptively hard question: what counts as churn? For a subscription business it might be a cancellation. But what about a customer who downgrades to a free tier, or one who stops logging in but keeps paying, or one whose annual contract lapses without renewal? Each is a different outcome, and a model that mixes them learns mush.

Pin down a single, measurable definition. For example, churn is a paying account that cancels or fails to renew within a given billing cycle. Then label your history accordingly. Every past customer gets a clear yes or no, with a date attached. This labeling step is tedious and unglamorous, and it is also where most churn projects quietly go wrong.

Set the prediction window

Decide how far ahead you want to predict. A model that says someone will churn the day they cancel is useless. A model that flags risk 30 to 60 days out gives your team time to intervene. The window shapes everything downstream, so choose it based on how long a meaningful retention play actually takes.
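To make the definition and the window concrete, here is a minimal labeling sketch. It assumes each account has a known churn date (or none), that you score accounts as of a snapshot date, and that a 60-day horizon suits your retention play. The function name, parameters, and the 60-day default are illustrative assumptions, not a prescribed schema.

```python
from datetime import date, timedelta
from typing import Optional


def label_churn(snapshot: date, churn_date: Optional[date],
                horizon_days: int = 60) -> Optional[int]:
    """Label one account as of `snapshot` (hypothetical helper).

    Returns 1 if the account churns within `horizon_days` after the snapshot,
    0 if it does not, and None if it had already churned by the snapshot
    and should be left out of training.
    """
    if churn_date is not None and churn_date <= snapshot:
        return None  # already gone: nothing left to predict
    window_end = snapshot + timedelta(days=horizon_days)
    if churn_date is not None and churn_date <= window_end:
        return 1
    return 0


# Example: an account that cancels 45 days after the snapshot is a positive.
print(label_churn(date(2025, 1, 1), date(2025, 2, 15)))
```

One caveat with this approach: only use snapshots that sit at least one full horizon before the end of your data, otherwise recent accounts get labeled 0 simply because their window has not finished yet.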

Step two: choose features that carry signal

Features are the inputs the model learns from. Good churn features usually fall into a few families.

Engagement. Login frequency, active users per account, feature adoption, and the trend in all of these. A declining usage curve is one of the strongest churn signals there is.
Support and friction. Ticket volume, unresolved issues, time to resolution, and angry sentiment. A spike in support pain often precedes a cancellation.
Commercial signals. Plan type, contract value, time since last upgrade, payment failures, and discount status.
Relationship signals. Time since the last meaningful contact, whether a champion left, and account tenure.
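The trend in an engagement metric can be turned into a single feature. The sketch below is one way to do it, assuming you have weekly login counts per account in chronological order. It fits a least-squares line and returns the slope; the function name and the weekly granularity are assumptions for illustration.

```python
from typing import Sequence


def usage_trend(weekly_logins: Sequence[float]) -> float:
    """Least-squares slope of logins per week; negative means declining usage."""
    n = len(weekly_logins)
    if n < 2:
        return 0.0  # not enough history to estimate a trend
    mean_x = (n - 1) / 2  # mean of week indices 0..n-1
    mean_y = sum(weekly_logins) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(weekly_logins))
    var = sum((x - mean_x) ** 2 for x in range(n))
    return cov / var


# An account dropping two logins per week has a slope of about -2.
print(usage_trend([10, 8, 6, 4]))
```

A slope is easier for a model to use than the raw weekly series, and it also translates directly into a reason a rep can read.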

A crucial trap to avoid is leakage: using a feature that exists only because the customer has already churned. If a cancellation reason field is populated only after someone cancels, including it teaches the model nothing useful and inflates your accuracy in testing. Every feature must be something you would genuinely know at prediction time.
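One simple guard against leakage is to build features only from values recorded on or before the snapshot date. The sketch below assumes every field value carries the date it was recorded. The record layout and the function name are hypothetical.

```python
from datetime import date
from typing import Dict, List, Tuple

# (field name, value, date the value was recorded): a hypothetical schema
Record = Tuple[str, object, date]


def features_known_at(records: List[Record], snapshot: date) -> Dict[str, object]:
    """Keep only values recorded on or before the snapshot date."""
    known: Dict[str, object] = {}
    for name, value, recorded_on in sorted(records, key=lambda r: r[2]):
        if recorded_on <= snapshot:
            known[name] = value  # later records overwrite earlier ones
    return known


records = [
    ("plan", "pro", date(2025, 1, 5)),
    ("cancellation_reason", "price", date(2025, 3, 10)),  # recorded after cancelling
    ("plan", "basic", date(2024, 6, 1)),
]
# The cancellation reason is excluded because it did not exist on 1 February.
print(features_known_at(records, date(2025, 2, 1)))
```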

Step three: pick a model that fits your reality

The instinct to reach for the most complex model is usually wrong. For churn, a gradient boosting model such as XGBoost or LightGBM is the workhorse. It handles mixed data types, captures nonlinear patterns, and produces strong results without enormous datasets. Logistic regression is a fine starting point too, and its coefficients are easy to explain, which helps with trust.
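To show why logistic regression coefficients are easy to explain, here is a toy baseline written from scratch with plain gradient descent. In practice you would use an established library rather than this code. The two features (usage trend and open tickets), the six data points, and the learning rate are made-up assumptions for illustration only.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def churn_probability(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Predicted churn probability for one account."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_logistic(X: List[List[float]], y: List[int],
                 lr: float = 0.1, epochs: int = 2000) -> Tuple[List[float], float]:
    """Fit weights and bias by batch gradient descent on the log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = churn_probability(w, b, xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


# Toy data: [usage trend, open tickets]; 1 = churned, 0 = retained.
X = [[-2.0, 3.0], [-1.0, 2.0], [-1.5, 4.0], [0.5, 0.0], [1.0, 1.0], [0.0, 0.0]]
y = [1, 1, 1, 0, 0, 0]
w, b = fit_logistic(X, y)
print("weights:", w, "bias:", b)
```

On data like this, the weight on usage trend comes out negative: as usage climbs, predicted churn risk falls. That is the kind of statement a stakeholder can check against their own intuition.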

Deep learning is rarely necessary for tabular churn data and often performs no better than boosting while being far harder to deploy and explain. The right model is the simplest one that hits your accuracy bar and that your stakeholders will believe.

Measure the right thing

Raw accuracy is misleading for churn because churn is usually rare. If only 5 percent of customers churn, a model that predicts nobody churns is 95 percent accurate and completely worthless. Use precision and recall instead. Precision asks: of the accounts we flagged, how many actually churned? Recall asks: of the accounts that churned, how many did we catch? You almost always have to trade one against the other.
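The arithmetic is worth seeing once. The sketch below computes accuracy, precision, and recall by hand for the 5 percent example above; the function names are illustrative.

```python
from typing import Sequence, Tuple


def accuracy(actual: Sequence[int], flagged: Sequence[int]) -> float:
    return sum(1 for a, f in zip(actual, flagged) if a == f) / len(actual)


def precision_recall(actual: Sequence[int], flagged: Sequence[int]) -> Tuple[float, float]:
    """Precision and recall for binary labels (1 = churned / flagged)."""
    tp = sum(1 for a, f in zip(actual, flagged) if a == 1 and f == 1)
    fp = sum(1 for a, f in zip(actual, flagged) if a == 0 and f == 1)
    fn = sum(1 for a, f in zip(actual, flagged) if a == 1 and f == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# 100 customers, 5 of whom churn.
actual = [1] * 5 + [0] * 95

# A "model" that flags nobody: 95 percent accurate, zero recall.
nobody = [0] * 100
print(accuracy(actual, nobody), precision_recall(actual, nobody))

# A model that flags 6 accounts and catches 3 of the 5 churners.
some = [1, 1, 1, 0, 0] + [1] * 3 + [0] * 92
print(precision_recall(actual, some))
```

The second model is right about half of its flags (precision 0.5) and catches three of five churners (recall 0.6), which is far more useful than the 95 percent accurate model that catches nobody.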

Step four: choose a threshold that matches your capacity

A model outputs a probability between 0 and 1. You decide the cutoff above which an account is flagged. This is a business decision, not a math decision. If your success team can only work 50 at-risk accounts a week, set the threshold so it surfaces roughly the 50 highest-risk accounts, not 500. Tie the threshold to the capacity of the team that will act on the scores. A model that floods a team with alerts gets ignored within a month.
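Tying the threshold to capacity can be as simple as ranking accounts and taking the top N. A minimal sketch, assuming scores are held in a dictionary keyed by account ID (the names and numbers are illustrative):

```python
from typing import Dict, List, Tuple


def top_risk_accounts(scores: Dict[str, float], capacity: int) -> List[Tuple[str, float]]:
    """Return the `capacity` highest-scoring accounts, highest risk first."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:max(capacity, 0)]


scores = {"a": 0.81, "b": 0.35, "c": 0.92, "d": 0.10, "e": 0.67}
flagged = top_risk_accounts(scores, capacity=2)
implied_threshold = flagged[-1][1]  # lowest score that still made the cut
print(flagged, implied_threshold)
```

Working from capacity backwards also tells you the implied threshold, which is worth tracking over time: if it drifts sharply, either the risk in your customer base or the model itself has changed.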

Step five: operationalize the score into retention plays

This is the step that separates models that work from models that get abandoned. A probability sitting in a database changes nothing. It has to become an action in the tools your teams already live in.

Route by segment. High-value at-risk accounts go to a human success call. Lower-value ones might get an automated email sequence or an in-app offer.
Push scores into the CRM. When a rep opens an account and sees a clear risk flag with the top reasons, the model becomes part of their daily workflow.
Give reasons, not just scores. A flag that says "high risk: usage down 40 percent, two open tickets" is actionable. A bare 0.81 is not.
Close the loop. Track what happened to flagged accounts so you can prove the program saved revenue and keep improving it.
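The routing and reasons ideas above can be sketched in a few lines. The threshold of 0.7, the 50,000 value cutoff, the play names, and the record fields are all assumptions for illustration. A real version would write to whatever CRM fields your team already uses.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AccountRisk:
    account_id: str
    score: float         # model probability of churn
    annual_value: float  # contract value, in your own currency
    reasons: List[str]   # top drivers behind the score, in plain language


def retention_play(risk: AccountRisk, threshold: float = 0.7,
                   high_value: float = 50_000.0) -> str:
    """Pick a play by segment: a human call for high-value accounts, automation otherwise."""
    if risk.score < threshold:
        return "no_action"
    return "success_call" if risk.annual_value >= high_value else "email_sequence"


def crm_flag(risk: AccountRisk) -> str:
    """Text a rep would see on the account: the score plus the reasons behind it."""
    return f"High risk ({risk.score:.2f}): " + ", ".join(risk.reasons)


acct = AccountRisk("acme", 0.81, 120_000.0,
                   ["usage down 40 percent", "two open tickets"])
print(retention_play(acct))
print(crm_flag(acct))
```

Logging the chosen play alongside the eventual outcome for each flagged account gives you the data to close the loop later.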

Getting scores into live workflows is its own discipline. We cover the deployment side in our guidance on putting models into production, because a model that never ships saves nobody.

Frequently asked questions

How much data do I need to build a churn model?

You want at least several hundred churn events to learn from, ideally over a year or more so seasonal patterns are visible. If churn is rare and your customer base is small, you may need to collect more history before modeling pays off.

How often should the model be retrained?

Customer behavior drifts as your product and market change. Retraining monthly or quarterly is common, and you should monitor for performance decay so you retrain before the model goes stale.
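A simple decay check compares recent performance against the level measured at launch. The sketch below assumes you evaluate recall on each period's completed outcomes. The 0.05 tolerance and the two-period rule are illustrative choices, not recommendations.

```python
from typing import Sequence


def needs_retraining(baseline_recall: float, recent_recalls: Sequence[float],
                     tolerance: float = 0.05, periods: int = 2) -> bool:
    """True if recall sat below baseline minus tolerance for the last `periods` evaluations."""
    recent = list(recent_recalls)[-periods:]
    if len(recent) < periods:
        return False  # not enough evaluations yet to call it decay
    return all(r < baseline_recall - tolerance for r in recent)


# Two consecutive weak periods trigger a retrain; a single dip does not.
print(needs_retraining(0.60, [0.61, 0.52, 0.50]))
print(needs_retraining(0.60, [0.50, 0.58]))
```

Requiring more than one weak period avoids retraining on noise from a single unusual month.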

Why does our model have high accuracy but no business impact?

Almost always because the scores are not reaching anyone who acts, or because the threshold surfaces too many or too few accounts, or because there is leakage inflating test accuracy. The fix is operational, not algorithmic.

The takeaway

A churn model earns its keep when a real person changes a real decision because of it. Define churn precisely, choose features you would genuinely know in advance, keep the model simple enough to trust, set the threshold to your team's capacity, and deliver scores with reasons into the tools people already use. Do that, and you will reduce churn. Skip the operational half, and you will have a very accurate report of customers who are already gone. To talk through a churn engagement, reach out through our contact page.

