Appendix: Embedding Domain Knowledge for Estimating Customer Lifetime Value

How we designed an interpretable neural network to predict Customer Lifetime Value (Appendix)

This is an appendix to the blog post Embedding Domain Knowledge for Estimating Customer Lifetime Value. It describes some alternatives we considered for the problem posed there but did not end up implementing.

First, let’s assume we have a pre-trained model for estimating the probability of the targets $yAlive_N$ and $yTaker$.

Estimating Lifetime Value using an optimization function

With a model that estimates the client’s propensity to accept the offer (yTaker), we can estimate CLTV with a simple calculation:

Business Rules only approach

     \begin{eqnarray*} \arg\max_{X \in \text{Offer}} & \big( & Propensity(User, X) \times PriceDest(X) \times 24 \\ & & + \, (1-Propensity(User, X)) \times PriceOrigin(User, X) \times FP \; \big) \end{eqnarray*}

The first term of the equation is the expected revenue if the client accepts the offer: the fidelization period (FP) is then renewed to 24 months, billed at the destination price. The second term is the expected revenue if the client does not accept the offer. Assuming no new offer is made in the remaining months, the client stays for “FP” months at the origin price.
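The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the project: the `Offer` class, the function names, and the example numbers are all hypothetical, and `PriceOrigin(User, X)` is simplified to a single monthly price per client.

```python
from dataclasses import dataclass
from typing import Dict, List

# Fidelization period after accepting an offer (24 months, as in the equation).
RENEWED_FP_MONTHS = 24


@dataclass(frozen=True)
class Offer:
    name: str
    price_dest: float  # monthly price of the destination offer, PriceDest(X)


def expected_revenue(propensity: float, price_dest: float,
                     price_origin: float, fp_months: float) -> float:
    """Expected revenue of proposing one offer to one client."""
    # Client accepts: pays the destination price for the renewed 24 months.
    accepted = propensity * price_dest * RENEWED_FP_MONTHS
    # Client declines: keeps paying the origin price for FP months.
    declined = (1.0 - propensity) * price_origin * fp_months
    return accepted + declined


def best_offer(offers: List[Offer], propensity: Dict[str, float],
               price_origin: float, fp_months: float) -> Offer:
    """Argmax over the candidate offers (assumes a non-empty list)."""
    return max(
        offers,
        key=lambda offer: expected_revenue(
            propensity[offer.name], offer.price_dest, price_origin, fp_months
        ),
    )


# Hypothetical client: pays 25 per month, FP = 6 months.
offers = [Offer("A", 30.0), Offer("B", 50.0)]
propensity = {"A": 0.5, "B": 0.2}  # made-up outputs of the propensity model
chosen = best_offer(offers, propensity, price_origin=25.0, fp_months=6)
print(chosen.name)
```

With these made-up numbers, offer A scores 435 and offer B scores 360, so the cheaper offer with the higher acceptance probability is selected.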

Business Rules + Propensity + Churn Model approach

Let’s now assume we have two models:

  • Propensity Model: we can calculate the probability of y_taker_N (i.e., of the client accepting the offer)
  • Churn Model: we can predict the number of months remaining until the client churns

And that we also have some business rules embedded:

  • Survival Buyers: we can calculate global survival curves over the complete customer base (Buyers) for clients who accept any new offer. These give us the average number of months until a client leaves the company if they accept an offer.

We can then create a slightly more complex optimization function.

     \begin{eqnarray*} \arg\max_{X \in \text{Offer}} & \big( & PriceDest(X) \times Buyers(FP) \times Propensity(User, X) \\ & & + \, (1-Propensity(User, X)) \times PriceOrigin(User, X) \times Churn(User) \; \big) \end{eqnarray*}
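As a rough sketch of this second function, the snippet below replaces the fixed horizons with the two estimates. The names and numbers are hypothetical. It assumes `Buyers(FP)` and `Churn(User)` have already been evaluated and are passed in as plain numbers of months, and it again simplifies `PriceOrigin(User, X)` to one price per client.

```python
from typing import Dict, Tuple


def expected_revenue_with_churn(propensity: float, price_dest: float,
                                buyers_months: float, price_origin: float,
                                churn_months: float) -> float:
    """Expected revenue of one offer using survival and churn estimates."""
    # Client accepts: destination price over the average lifetime of buyers.
    accepted = price_dest * buyers_months * propensity
    # Client declines: origin price until the predicted churn date.
    declined = (1.0 - propensity) * price_origin * churn_months
    return accepted + declined


def best_offer_with_churn(offers: Dict[str, Tuple[float, float]],
                          buyers_months: float, price_origin: float,
                          churn_months: float) -> str:
    """Argmax over offers given as name -> (PriceDest, Propensity)."""
    return max(
        offers,
        key=lambda name: expected_revenue_with_churn(
            offers[name][1], offers[name][0],
            buyers_months, price_origin, churn_months,
        ),
    )


# Hypothetical inputs: buyers stay 30 months on average, and the churn
# model predicts 10 remaining months for this client at 25 per month.
offers = {"A": (30.0, 0.5), "B": (50.0, 0.25)}
chosen = best_offer_with_churn(offers, buyers_months=30.0,
                               price_origin=25.0, churn_months=10.0)
print(chosen)
```

With these made-up inputs, offer A scores 575 and offer B scores 562.5, so A is selected. Note how the per-client churn estimate changes the weight of the "decline" branch compared with the business-rules-only version.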

Single-Task Machine Learning 

This solution can be computed quickly when pre-trained models are available for the churn and taker tasks, which makes it useful for quick proofs of concept and as a performance baseline. However, it uses little of the knowledge that can be extracted from customer interactions.

One possible approach is to include the probabilities of accepting the offer and of churning as features of a CLTV model, as follows:

CLTV :: Propensity x OriginOffer x DestinationOffer x ChurnProbability

However, this would require maintaining three models in production and constantly assessing their quality: a regression model for estimating customer lifetime value, a propensity model, and a churn model. Also, a multiple-output approach would require as many pre-trained models as there are outputs.
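To make the dependency between the three models concrete, here is a minimal sketch of how the feature vector for the CLTV regressor could be assembled. Everything in it is hypothetical: the upstream models are stand-in functions returning fixed values, offers are plain dictionaries with a `price` key, and the regression step itself is left out.

```python
from typing import Callable, Dict, List

Offer = Dict[str, float]


def build_cltv_features(
    user_id: str,
    origin_offer: Offer,
    destination_offer: Offer,
    propensity_model: Callable[[str, Offer], float],
    churn_model: Callable[[str], float],
) -> List[float]:
    """Feature row: Propensity, OriginOffer, DestinationOffer, ChurnProbability."""
    return [
        propensity_model(user_id, destination_offer),  # upstream model 1
        origin_offer["price"],
        destination_offer["price"],
        churn_model(user_id),  # upstream model 2
    ]


# Stand-ins for the two pre-trained models (fixed, made-up outputs).
def fake_propensity_model(user_id: str, offer: Offer) -> float:
    return 0.4


def fake_churn_model(user_id: str) -> float:
    return 0.1


features = build_cltv_features(
    "user-1",
    origin_offer={"price": 25.0},
    destination_offer={"price": 30.0},
    propensity_model=fake_propensity_model,
    churn_model=fake_churn_model,
)
print(features)
```

Each row passed to the CLTV regressor needs a call to both upstream models, which is why all three have to be served and monitored together.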
