Gamma Gamma Bills Y’all!
If you've been following my series of posts on customer lifetime value prediction, you should be well aware of Pareto/NBD and BG/NBD. Both models can be used to estimate the lifetime of any given customer and predict the number of transactions the customer will make while alive. It would be nice if we could also get an estimate of the average spend per transaction of each of our customers. The next model I will be going over will do just that. It is called Gamma Gamma.
Gamma Gamma Modeling Assumptions
Much like Pareto/NBD and BG/NBD, Gamma Gamma models come with their own set of assumptions.
- The first assumption is that the monetary value of any given customer transaction varies randomly around the mean transaction value for the customer. We'll go into more detail about this point later.
- The second assumption is that the average spend varies across customers, but remains consistent over time for any individual customer.
- The third assumption is that the distribution of average spend across customers is independent of the transaction process.
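Going back to the first assumption, the value of each transaction is assumed to be distributed according to a gamma distribution. As a reminder, the gamma density is:
$$f(x; \alpha, \beta) = \frac{\beta^{\alpha} x^{\alpha - 1} e^{-\beta x}}{\Gamma(\alpha)}, \quad x > 0$$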
Where α and β describe the shape and rate of the distribution. In this model the rate parameter is not fixed: it is itself gamma distributed across customers. The Gamma Gamma model gets its name from this use of two gamma distributions.
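To make the two layers concrete, here is a minimal simulation sketch using only the standard library. The symbols p, q and gamma follow the original paper's notation, and the values are hypothetical, picked only for illustration: each customer draws their own rate from one gamma distribution, and each of their transactions is then drawn from a second gamma distribution with that rate.

```python
import random

def simulate_spend(n_customers, n_transactions, p, q, gamma, seed=0):
    """Draw transaction values under the Gamma Gamma assumptions.

    p, q and gamma are hypothetical model parameters, not fitted values.
    """
    rng = random.Random(seed)
    customers = []
    for _ in range(n_customers):
        # Each customer gets their own rate nu ~ Gamma(shape=q, rate=gamma).
        # random.gammavariate takes (shape, scale), so scale = 1 / rate.
        nu = rng.gammavariate(q, 1.0 / gamma)
        # Every transaction of that customer is Gamma(shape=p, rate=nu).
        values = [rng.gammavariate(p, 1.0 / nu) for _ in range(n_transactions)]
        customers.append(values)
    return customers

spend = simulate_spend(n_customers=20000, n_transactions=5, p=6.0, q=4.0, gamma=15.0)
all_values = [value for customer in spend for value in customer]
overall_mean = sum(all_values) / len(all_values)
# In theory the population mean is p * gamma / (q - 1), which is 30 here,
# so the simulated mean should land close to that.
print(round(overall_mean, 1))
```

Within one customer the values scatter around that customer's own mean (p / nu), while the means themselves differ from customer to customer, which is what the first two assumptions describe.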
Given the assumptions we can calculate the average monetary value for any customer using the following equation:
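Written in the notation of the original paper, which is an assumption on my part as far as symbols go (x is the customer's number of transactions, m_x their observed average transaction value, and p, q, γ the model parameters), the expectation takes this form:

```latex
E(M \mid p, q, \gamma, m_x, x)
  = \frac{p\,(\gamma + m_x x)}{p x + q - 1}
  = \frac{q - 1}{p x + q - 1} \cdot \frac{p \gamma}{q - 1}
    + \frac{p x}{p x + q - 1}\, m_x
```

The second form shows it as a weighted average of the population mean, pγ / (q − 1), and the customer's own observed mean m_x, with more weight on m_x as x grows.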

You can read the original paper on the Gamma Gamma model for details on how the above equation was derived. Now that we've discussed what the model is, let's see it in action.
Gamma Gamma Model In Action
To demonstrate the Gamma Gamma model, I will return to the Online Retail dataset used in previous posts.
I'll start by once again reading in the transaction dataset:
import pandas as pd
transactions_df = pd.read_csv('data.csv', encoding='ISO-8859-1')
The Gamma Gamma model that I will be using comes from the lifetimes package. The model requires two pieces of information in order to provide predictions:
- The frequency of the customer transactions, and
- The monetary value: the average amount spent per transaction
The required information will need to be calculated from the dataset. For the sake of brevity I will not be showing the calculation of the frequency and monetary values. You can refer to my RFM blog post for the details.
At this point, if you've read my previous post, we will have two pandas dataframes containing the frequency and monetary values. We will now combine them into one dataframe to make the data a little easier to manage. We also want to make sure that our dataframe does not contain any monetary values less than or equal to 0; the method that we will be using to calculate expected average profit will not work if such data is present.
combined_df = pd.merge(frequency_df, monetary_df, on='CustomerID')
combined_df = combined_df[combined_df['monetary'] > 0]
Now before proceeding with the model fitting we want to check for a correlation between the frequency and monetary values. This is to ensure that the third assumption of the Gamma Gamma model is valid. We won't be able to use the model if there's a strong correlation between the two fields.
combined_df[['frequency','monetary']].corr()

There appears to be a moderate correlation between frequency and monetary, but definitely not strong enough to stop us from using the model. We will go ahead and proceed with the fitting.
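As a quick aside before the fit: the off-diagonal entry that corr() reports is, by default, the Pearson correlation coefficient. Here is a minimal pure-Python sketch of that calculation; the frequency and monetary numbers are made up for illustration and are not taken from the dataset.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-movement of the two series around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical customers: purchase counts and average spend per purchase.
frequency = [1, 2, 3, 4, 5, 6]
monetary = [20.0, 35.0, 18.0, 40.0, 22.0, 30.0]
r = pearson(frequency, monetary)
print(round(r, 3))
```

A value near 0 is what the third assumption hopes for; values close to 1 or -1 would suggest that spend and purchase frequency are not independent.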
import lifetimes
gg_mdl = lifetimes.GammaGammaFitter()
gg_mdl.fit(combined_df['frequency'], combined_df['monetary'])
The expected average spend per transaction for each customer can be computed using the conditional_expected_average_profit method.
combined_df['average_profit'] = gg_mdl.conditional_expected_average_profit(combined_df['frequency'], combined_df['monetary'])
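To give a feel for what that call works out, here is a rough sketch of the calculation as I understand it, not the library's own code. The parameter names p, q and v mirror what I believe the fitted model exposes, and the values are hypothetical rather than fitted to this dataset.

```python
def expected_average_spend(frequency, observed_mean, p, q, v):
    """Blend the population mean with one customer's observed mean spend."""
    # Weight on the customer's own history grows with the number of transactions.
    individual_weight = p * frequency / (p * frequency + q - 1)
    # Mean spend across the whole customer base implied by the parameters.
    population_mean = v * p / (q - 1)
    return (1 - individual_weight) * population_mean + individual_weight * observed_mean

# Hypothetical parameters, chosen only for illustration.
p, q, v = 6.0, 4.0, 15.0
new_customer = expected_average_spend(0, 0.0, p, q, v)
regular_customer = expected_average_spend(10, 50.0, p, q, v)
print(new_customer, round(regular_customer, 2))  # 30.0 49.05
```

A customer with no history is assigned the population mean, while a customer with ten transactions averaging 50 ends up very close to their own average.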

Estimating Customer Lifetime Value
Gamma Gamma can be combined with either the Pareto/NBD or the BG/NBD model to calculate a lifetime value figure for each customer. This is done with the customer_lifetime_value method. To use it, we'll first need to build a transaction model. I will demonstrate the method with a Pareto/NBD model. Read my Pareto/NBD post for details on the construction of the model.
Some of the other parameters used by the method that you'd want to know about are:
- Frequency: The frequency of the customer transactions
- Recency: The customer's age at their most recent transaction, that is, the time between their first and latest purchase (this is how the lifetimes package defines recency)
- Age: The amount of time since the customer's initial purchase
- Monetary: The average monetary value of the customer's transactions
- Time: The number of months into the future to estimate the lifetime value
- Discount Rate: The monthly adjusted discount rate
- Freq: The time unit in which the age is measured
Here, we will be estimating the 3-month customer lifetime value for each customer.
combined_df['cltv'] = gg_mdl.customer_lifetime_value(pareto_mbd_mdl,
                                                     combined_df['frequency'],
                                                     combined_df['recency'],
                                                     combined_df['age'],
                                                     combined_df['monetary'],
                                                     time=3,
                                                     freq='D')
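Conceptually, the figure is a discounted sum: for each future month, the expected number of purchases (from the transaction model) is multiplied by the expected average spend (from Gamma Gamma) and discounted back to today. The sketch below illustrates that idea under my own simplifying assumptions; the function name and all input values are hypothetical, and it is not the library's implementation.

```python
def discounted_clv(expected_purchases_by_month, expected_avg_spend, monthly_discount_rate):
    """Sum discounted expected revenue over the forecast months."""
    clv = 0.0
    for month, purchases in enumerate(expected_purchases_by_month, start=1):
        # Expected revenue in this month, before discounting.
        revenue = purchases * expected_avg_spend
        # Discount back to the present at the monthly rate.
        clv += revenue / (1.0 + monthly_discount_rate) ** month
    return clv

# Hypothetical customer: 2 expected purchases per month for 3 months,
# an expected average spend of 50, and a 1% monthly discount rate.
clv_3m = discounted_clv([2.0, 2.0, 2.0], expected_avg_spend=50.0, monthly_discount_rate=0.01)
print(round(clv_3m, 2))
```

With a discount rate of 0 this collapses to expected purchases times expected spend, which is a handy sanity check on any CLV number.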

That's all Folks!
You can find the code for this post here.