Introducing BG/NBD Models

In my last post we examined the nuts and bolts of the Pareto/NBD model. If you haven’t had a chance to read it, you can find a link to the article here. The Beta Geometric/Negative Binomial Distribution model, or BG/NBD for short, is another probability model that practitioners frequently use to predict customer lifetime value in continuous, non-contractual settings. Its underpinnings are very similar to those of the Pareto/NBD. This post covers the differences between the two models and closes with a quick demonstration of the model in action.

BG/NBD Modeling Assumptions

The modeling assumptions of BG/NBD are similar to those of the Pareto/NBD. The table below provides a side-by-side comparison of the assumptions behind both models:

| # | Pareto/NBD | BG/NBD |
|---|---|---|
| 1 | A customer can be "alive" for an unobserved period of time, and then die. | A customer can be "alive" for an unobserved period of time, and then die. |
| 2 | While "alive", the number of transactions made by a customer can be described using a Poisson distribution. | While "alive", the number of transactions made by a customer can be described using a Poisson distribution. |
| 3 | Heterogeneity in the transaction rate across customers follows a gamma distribution. | After each transaction, a customer can drop out with probability p. |
| 4 | Each customer's unobserved lifetime is distributed exponentially. | Each customer's dropout probability is distributed across transactions according to a geometric distribution. |
| 5 | Heterogeneity in dropout rates across customers follows a gamma distribution. | Heterogeneity in dropout probabilities follows a beta distribution. |
| 6 | Both the transaction rates and the dropout rates vary independently across customers. | Both the transaction rates and the dropout probabilities vary independently across customers. |

Modeling assumptions

As you can see, the first two assumptions of BG/NBD are the same as those of the Pareto/NBD. The key differences lie in the third assumption onwards. Let's take a closer look at each of these assumptions and break down what it means.

Assumption #3: After each transaction, a customer can drop out with probability p.

BG/NBD takes a different approach to modeling the lifetime of each customer than Pareto/NBD. The model assumes that, after each transaction, a customer may choose to discontinue their patronage. If we were looking at a grocery store, for example, this would be akin to a customer shopping at the store one week and then choosing not to return the following week, perhaps because the customer found another store that sells groceries at a cheaper price. The likelihood that a customer will drop out is represented by a probability p.

Assumption #4: Each customer's dropout probability is distributed across transactions according to a geometric distribution.

Geometric distributions are used to model the number of independent trials up to and including the first success. In our case, a "trial" is a transaction and a "success" is the customer dropping out, so we are modeling the number of transactions a customer will make before dropping out. We can figure out the likelihood that a customer will drop out after making some number of transactions using this formula:

$$P(X = x) = (1 - p)^{x - 1}\, p, \quad x = 1, 2, 3, \dots$$

Geometric distribution

The x in the equation is the number of transactions; p is the probability of dropping out after each transaction, as discussed in the previous assumption.

If we were to plot the geometric distribution using some value for p and a range of values for x, we would see the probability decrease as x increases: to drop out right after transaction x, a customer must first have stayed on after every earlier transaction. The smaller p is, the more slowly the curve falls, so customers with a low dropout probability are likely to stick around much longer, and make many more repeat purchases, than those with a high one.
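To make the shape of this distribution concrete, here is a minimal sketch in plain Python. It assumes the standard form of the geometric probability mass function, P(X = x) = (1 - p)^(x - 1) * p, and the function name `dropout_after` and the value p = 0.2 are purely illustrative choices, not something taken from the lifetimes package.

```python
def dropout_after(x: int, p: float) -> float:
    """Probability that a customer drops out right after transaction x.

    Assumes the standard geometric PMF: (1 - p) ** (x - 1) * p.
    """
    if x < 1:
        raise ValueError("x must be a positive integer")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    # The customer stays on after the first x - 1 transactions,
    # then drops out after transaction x.
    return (1.0 - p) ** (x - 1) * p


p = 0.2  # illustrative dropout probability, not an estimate from data
probs = [dropout_after(x, p) for x in range(1, 11)]

for x, prob in enumerate(probs, start=1):
    print(f"x={x:2d}  P(drop out after transaction x) = {prob:.4f}")
```

Rerunning the loop with a smaller p, say 0.05, gives a much flatter curve, which is the "sticks around longer" behaviour in numbers.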

Assumption #5: Heterogeneity in dropout probabilities follows a beta distribution.

In other words, not every customer has an equal likelihood of dropping out after each transaction. Some customers may stick around much longer than others. Therefore we need a way to represent how the dropout probability varies from customer to customer. The beta distribution gives us that.

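$$f(p \mid a, b) = \frac{p^{a - 1} (1 - p)^{b - 1}}{B(a, b)}, \quad 0 < p < 1$$

Beta distribution (B(a, b) is the beta function)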

Beta distributions are distributions over probabilities. They can be used to model conversion rates, how likely someone is to click on a Facebook ad, the survival rate of patients with a disease, and other similar probabilities. The parameters a and b can be thought of as the number of successes and failures we expect. In our case a "success" is a dropout, so the average dropout probability across customers is a / (a + b).
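As a rough illustration of what heterogeneity in dropout probabilities looks like, the sketch below draws a dropout probability for each of 10,000 hypothetical customers using the standard library's `random.betavariate`. The values a = 2 and b = 8 are made up for the example and are not fitted to any dataset.

```python
import random


def beta_mean(a: float, b: float) -> float:
    """Mean of a beta distribution with parameters a and b."""
    return a / (a + b)


random.seed(42)  # fixed seed so the sketch is reproducible

a, b = 2.0, 8.0  # illustrative values only
# One dropout probability per hypothetical customer.
dropout_probs = [random.betavariate(a, b) for _ in range(10_000)]

sample_mean = sum(dropout_probs) / len(dropout_probs)
print(f"theoretical mean dropout probability: {beta_mean(a, b):.3f}")
print(f"sample mean dropout probability:      {sample_mean:.3f}")
print(f"lowest / highest draw: {min(dropout_probs):.3f} / {max(dropout_probs):.3f}")
```

The spread between the lowest and highest draws is the point: under this assumption some customers are very unlikely to leave after any given purchase, while others are close to leaving every time.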

The Model

The assumptions described above allow us to predict the probability that a customer will be alive after a certain time period and the number of purchases we can expect a customer to make in the future. This resource provides an excellent derivation of the equations used to calculate both.
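For a feel of what the first of these calculations looks like, here is a minimal sketch of the probability that a customer is alive, based on the closed-form expression commonly given for the BG/NBD model. The names are assumptions introduced for this example: x is the number of repeat transactions, t_x the time of the last one, T the length of the observation period, r and alpha the parameters of the gamma distribution over transaction rates, and a and b the beta parameters. The parameter values are invented, so it is worth checking the expression against the derivation linked above before relying on it.

```python
def probability_alive(x: int, t_x: float, T: float,
                      r: float, alpha: float, a: float, b: float) -> float:
    """P(customer is still alive at time T) under the BG/NBD model.

    x   : number of repeat transactions observed
    t_x : time of the most recent transaction
    T   : length of the observation period
    """
    if x == 0:
        # In BG/NBD a customer can only drop out after a repeat
        # transaction, so one with none is treated as alive.
        return 1.0
    odds_dead = (a / (b + x - 1)) * ((alpha + T) / (alpha + t_x)) ** (r + x)
    return 1.0 / (1.0 + odds_dead)


# Illustrative parameter values, not fitted to any data.
params = dict(r=0.25, alpha=4.0, a=0.8, b=2.5)

# Two hypothetical customers with 3 repeat purchases over 30 weeks.
recent = probability_alive(x=3, t_x=28.0, T=30.0, **params)
lapsed = probability_alive(x=3, t_x=10.0, T=30.0, **params)

print(f"last purchase in week 28: P(alive) = {recent:.3f}")
print(f"last purchase in week 10: P(alive) = {lapsed:.3f}")
```

The customer who bought recently gets the higher probability of being alive, which matches the intuition that a long silence after regular purchasing suggests a dropout. The expected number of future purchases has a longer closed form and is left to the library.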

BG/NBD In Action

An implementation of BG/NBD is provided by the lifetimes Python package. The setup and use of the model are the same as for the Pareto/NBD. The only difference, of course, is the instantiation of the model:

import lifetimes
mdl = lifetimes.BetaGeoFitter()

My previous post provides a run-through of the Pareto/NBD model on a sample dataset. Just replace the instantiation of the model with the code snippet above and you're all set!

That's all folks!

Next time I will show you another model that can be used to predict the monetary value of customer transactions, giving us a complete picture of customer lifetime value. Talk to you then!