How to Calculate the Sample Size for an A/B Test, Including Calculator & Derivation
Introduction
Suppose that we have two treatments, A and B, representing two different marketing strategies. Treatment A, the “control”, is the marketing strategy that is already in place, and Treatment B, the “variation”, is a newly developed marketing strategy. We compare the two strategies through the conversion rate associated with each one. The conversion rate is the number of purchases divided by the number of visiting customers. For example, for an e-commerce website, the conversion rate for treatment A is the proportion of visitors who view the website designed according to marketing strategy A and go on to purchase the product. A higher conversion rate indicates a more effective marketing strategy, whereas a lower conversion rate indicates a less effective one. In A/B testing, we decide whether there is a statistically significant difference between the two strategies and, if there is, we determine which strategy is superior and quantify the difference. The main statistical tool for A/B testing is hypothesis testing; when conversion rates are involved, the relevant test is the test of proportions.
In this article we first provide a calculator that gives the sample size required, before the A/B test is conducted, to ensure a proper comparison between the two treatments. We then state the formula for the required sample size, followed by a full mathematical description and derivation of the A/B test and the associated sample size equation.
Sample Size Calculator for A/B Testing
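The calculator takes four inputs: the control conversion rate $p_A$ (in %), the minimum detectable effect $\delta$ (in %), the confidence level $1-\alpha$ and the power $1-\beta$. It returns the sample size per variation.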
- The control conversion rate $p_A$ is the conversion rate associated with the marketing strategy already in place.
- The confidence level $1-\alpha$ is the probability of correctly stating that the variation is not a better alternative to the marketing strategy already in place, when that is indeed the case.
- The power $1-\beta$ is the probability of correctly stating that the variation is a better alternative to the marketing strategy already in place, when that is indeed the case.
- The minimum detectable effect $\delta$ is the value such that the test can identify a relative increase of $\delta$ or more from the conversion rate of A to the conversion rate of B, with a confidence level of at least $1-\alpha$ and a power of at least $1-\beta$.
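The confidence level and the power enter the calculation through quantiles of the standard normal distribution. The following is a minimal sketch of that mapping for a one-tailed test, using Python's standard library; the function name `critical_values` is hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def critical_values(confidence_level, power):
    """Return (z_alpha, z_beta) for a one-tailed test.

    z_alpha is the standard normal quantile that leaves an upper-tail
    area of alpha = 1 - confidence_level; z_beta does the same for
    beta = 1 - power.
    """
    if not (0.0 < confidence_level < 1.0 and 0.0 < power < 1.0):
        raise ValueError("confidence_level and power must lie strictly between 0 and 1")
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z_alpha = std_normal.inv_cdf(confidence_level)
    z_beta = std_normal.inv_cdf(power)
    return z_alpha, z_beta


z_alpha, z_beta = critical_values(0.95, 0.80)
print(round(z_alpha, 3), round(z_beta, 3))  # 1.645 0.842
```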
For example, suppose that the conversion rate associated with the marketing strategy already in place is 8%, that is, $p_A$ is 8%. Suppose that the minimum detectable effect is 25%, the confidence level is 95% and the power is 80%. The required sample size for the variant B is 2513. Thus, if we have a sample of 2513 customers for variant B, we can perform the A/B test and either correctly state that variant B offers no better alternative to the control A with probability 95%, or correctly state that variant B offers a better alternative to A with probability at least 80% when the percentage change from the conversion rate of A to the conversion rate of B is at least 25% (that is, from 8% to at least 10%).
Required Sample Size Formula for A/B Testing
The minimum sample size $n_B$ for the variation B, given a control conversion rate $p_A$ and a minimum detectable effect $\delta$, that ensures a confidence level of at least $1-\alpha$ and a power of at least $1-\beta$ is:
Thus, if the percentage change from $p_A$ to $p_B$ is $\delta$ or more, and we have such a sample size for variation B, we can perform the A/B test with a probability of a Type 1 error of at most $\alpha$ and a probability of a Type 2 error of at most $\beta$. The following section gives the mathematical derivation of the sample size equation.
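For readers who want to experiment, here is a minimal sketch of a sample size computation of this kind. It assumes one common textbook form for a one-tailed test of two proportions with equal group sizes: the variance is pooled under the null hypothesis and unpooled under the alternative. That choice is an assumption, and the function name `sample_size_per_variation` is hypothetical. With the inputs of the example above it gives a value in the same range as the quoted 2513, though it is not guaranteed to match the calculator exactly.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variation(p_a, mde, confidence_level=0.95, power=0.80):
    """Approximate sample size per variation for a one-tailed test.

    p_a  : control conversion rate, as a fraction (0.08 for 8%)
    mde  : minimum detectable effect, relative (0.25 for 25%)
    Assumes equal group sizes and the normal approximation.
    """
    p_b = p_a * (1.0 + mde)          # conversion rate we want to detect
    if not (0.0 < p_a < p_b < 1.0):
        raise ValueError("need 0 < p_a < p_a * (1 + mde) < 1")
    p_bar = (p_a + p_b) / 2.0        # pooled proportion for equal sizes

    z_alpha = NormalDist().inv_cdf(confidence_level)
    z_beta = NormalDist().inv_cdf(power)

    # Per-observation standard deviations of the difference in proportions
    sd_null = sqrt(2.0 * p_bar * (1.0 - p_bar))               # under H0 (pooled)
    sd_alt = sqrt(p_a * (1.0 - p_a) + p_b * (1.0 - p_b))      # under H1 (unpooled)

    n = ((z_alpha * sd_null + z_beta * sd_alt) / (p_b - p_a)) ** 2
    return ceil(n)


print(sample_size_per_variation(0.08, 0.25, confidence_level=0.95, power=0.80))
```

Smaller minimum detectable effects, higher confidence levels and higher power all increase the returned sample size.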
Mathematical Derivation of the Sample Size
Let $p_A$ be the population proportion (i.e. the conversion rate) for the control A and let $p_B$ be the population proportion (i.e. the conversion rate) for treatment B. We will use a one-tailed set of hypotheses. Thus the hypotheses for the test of proportions are:

$$H_0: p_B \le p_A \qquad \text{versus} \qquad H_1: p_B > p_A$$

The control A is the marketing strategy that is currently in place. If the marketing strategy B is better, that is $p_B > p_A$, then the alternative hypothesis ($H_1$) will be true. If the marketing strategy B is worse than or as effective as marketing strategy A, then B is disregarded.
In practice the true values of $p_A$ and $p_B$ are unknown, and we use the sample proportions $\hat{p}_A$ and $\hat{p}_B$ to draw conclusions about the two population parameters and thus choose one of the two hypotheses. In hypothesis testing we can make two types of errors. The first is to accept $H_1$ when in reality $H_1$ is not true; this is known as a Type 1 error. The second is to reject $H_1$ when in reality $H_1$ is true; this is known as a Type 2 error. The probability of making a Type 1 error is $\alpha$ and the probability of making a Type 2 error is $\beta$. We are after a sample size that ensures that both $\alpha$ and $\beta$ are held to pre-defined values. A common value for $\alpha$ is 0.05 (5%), whereas a common value for $\beta$ is 0.20 (20%). In such a case, we look for a sample size that ensures that the probability of a Type 1 error is 5% and the probability of a Type 2 error is no more than 20%.
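The two error rates can be made concrete with a small simulation. The sketch below assumes a one-sided two-proportion z-test with a pooled standard error and equal sample sizes. The choice of test, the sample size of 2500 per group, the number of trials and all function names are assumptions made for illustration. It estimates how often $H_0$ is rejected when the two rates are equal (Type 1 error rate) and when B is truly better (power, i.e. one minus the Type 2 error rate).

```python
import random
from math import sqrt
from statistics import NormalDist


def rejects_h0(conv_a, conv_b, n, alpha=0.05):
    """One-sided pooled z-test with n visitors in each group."""
    p_hat_a = conv_a / n
    p_hat_b = conv_b / n
    pooled = (conv_a + conv_b) / (2 * n)
    std_err = sqrt(2 * pooled * (1 - pooled) / n)
    if std_err == 0:
        return False  # no conversions (or all conversions): nothing to test
    z = (p_hat_b - p_hat_a) / std_err
    return z > NormalDist().inv_cdf(1 - alpha)


def simulate_rejection_rate(p_a, p_b, n, trials, rng):
    """Fraction of simulated experiments in which H0 is rejected."""
    rejections = 0
    for _ in range(trials):
        conv_a = sum(rng.random() < p_a for _ in range(n))
        conv_b = sum(rng.random() < p_b for _ in range(n))
        rejections += rejects_h0(conv_a, conv_b, n)
    return rejections / trials


rng = random.Random(0)  # fixed seed so the run is repeatable
type1_rate = simulate_rejection_rate(0.08, 0.08, n=2500, trials=300, rng=rng)
power = simulate_rejection_rate(0.08, 0.10, n=2500, trials=300, rng=rng)
print(f"estimated Type 1 error rate: {type1_rate:.3f}")
print(f"estimated power:             {power:.3f}")
```

With only 300 trials the estimates are noisy; increasing `trials` tightens them at the cost of run time.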
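Let $n_A$ and $n_B$ be the sample sizes for control A and variation B respectively. Recall from probability theory that, for large samples, the sampling distribution of the difference $\hat{p}_B - \hat{p}_A$ is approximately given by:

$$\hat{p}_B - \hat{p}_A \sim N\left(p_B - p_A,\; \frac{p_A(1-p_A)}{n_A} + \frac{p_B(1-p_B)}{n_B}\right)$$

Therefore,

$$Z = \frac{(\hat{p}_B - \hat{p}_A) - (p_B - p_A)}{\sqrt{\dfrac{p_A(1-p_A)}{n_A} + \dfrac{p_B(1-p_B)}{n_B}}} \sim N(0, 1)$$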
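The normal approximation for the difference of two sample proportions can be sketched in a few lines. The mean is the difference of the true rates and the variance is the sum of the two binomial variances divided by their sample sizes. The function name `difference_distribution` and the example inputs are assumptions for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def difference_distribution(p_a, p_b, n_a, n_b):
    """Approximate sampling distribution of (p_hat_b - p_hat_a)."""
    mean = p_b - p_a
    std_dev = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return NormalDist(mean, std_dev)


dist = difference_distribution(p_a=0.08, p_b=0.10, n_a=2500, n_b=2500)
print("mean of the difference:", round(dist.mean, 4))
print("standard error:        ", round(dist.stdev, 5))
# Chance that the observed difference is zero or negative even though B is better
print("P(difference <= 0):    ", round(dist.cdf(0.0), 4))
```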
We are going to fix the probability of a Type 1 error to be equal to $\alpha$. Hence:
If we assume that $n_A = n_B$, the pooled sample proportion reduces to $(\hat{p}_A + \hat{p}_B)/2$ and the confidence interval reduces to:
Now let us consider the probability of a Type 2 error associated with the alternative hypothesis $H_1$, in particular at a specific value of $p_B - p_A$ allowed by $H_1$. We want the probability of a Type 2 error to be at most $\beta$. Therefore:
For such an inequality to hold, we have:
Now, since the control A is the marketing strategy already in place, we assume that a value for $p_A$ is in hand. In fact $p_A$ is the control conversion rate, which is one of the inputs of the sample size calculator. We express $p_B$ in terms of $p_A$ and $\delta$ as follows:

$$p_B = p_A(1+\delta), \qquad \text{so that} \qquad p_B - p_A = \delta\, p_A,$$

where $\delta$ stands for the minimum detectable effect, that is, the minimum relative (percentage) increase from $p_A$ to $p_B$ that results in a test having confidence level at least $1-\alpha$ and power at least $1-\beta$.
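Because the minimum detectable effect is relative rather than absolute, it helps to see the conversion explicitly. This is a small sketch assuming the effect is defined as a relative increase over the control rate; the function name `target_conversion_rate` is hypothetical.

```python
def target_conversion_rate(p_a, mde):
    """Conversion rate of B implied by a relative increase `mde` over p_a."""
    if not 0.0 < p_a < 1.0:
        raise ValueError("p_a must lie strictly between 0 and 1")
    if mde <= 0.0:
        raise ValueError("mde must be positive")
    p_b = p_a * (1.0 + mde)
    if p_b >= 1.0:
        raise ValueError("implied conversion rate for B must be below 1")
    return p_b


p_a = 0.08
p_b = target_conversion_rate(p_a, mde=0.25)
print("target rate for B:  ", round(p_b, 4))
print("absolute difference:", round(p_b - p_a, 4))
```

A 25% relative effect on an 8% control rate is therefore an absolute difference of two percentage points, which is the quantity the test has to detect.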
Therefore,
Thus the minimum sample size $n_B$ for the variation B, given a control conversion rate $p_A$, a minimum detectable effect $\delta$, confidence level $1-\alpha$ and power at least $1-\beta$, is:
Worked Example
Suppose that the conversion rate associated with the marketing strategy already in place is 8%, that is, $p_A$ is 8%. Suppose that we want to be able to perform an A/B test with confidence level of at least 95% and power of at least 80% whenever the percentage change from the conversion rate of strategy A to the conversion rate of strategy B is 25% or more. Thus $\alpha = 0.05$, $\beta = 0.20$ and $\delta$ is 25%. The required minimum sample size is:
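As a sanity check on figures such as the 2513 quoted earlier for these inputs, one can go in the opposite direction and estimate the power that a given sample size attains. The sketch below assumes a one-tailed test with equal group sizes, a pooled variance under $H_0$ and an unpooled variance under $H_1$. This is a common textbook approximation and an assumption here, so it may differ slightly from the calculator's own equation. The function name `approximate_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approximate_power(p_a, mde, n, confidence_level=0.95):
    """Approximate power of a one-tailed test with n visitors per group."""
    p_b = p_a * (1.0 + mde)
    p_bar = (p_a + p_b) / 2.0
    z_alpha = NormalDist().inv_cdf(confidence_level)

    # Smallest observed difference that leads to rejecting H0
    threshold = z_alpha * sqrt(2.0 * p_bar * (1.0 - p_bar) / n)

    # Distribution of the observed difference when B really has rate p_b
    std_err_alt = sqrt((p_a * (1.0 - p_a) + p_b * (1.0 - p_b)) / n)
    observed_diff = NormalDist(p_b - p_a, std_err_alt)

    return 1.0 - observed_diff.cdf(threshold)


for n in (1000, 2513, 5000):
    print(n, round(approximate_power(0.08, 0.25, n), 3))
```

Under these assumptions a sample of about 2500 per variation gives power close to the 80% target, while 1000 per variation falls well short of it.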