A/B testing, also known as “split testing”, can be one of the most useful and powerful tools available for conversion rate optimization (CRO) when used correctly. Without careful planning and analysis, however, the potential benefits of an A/B test may be outweighed by the combined impact of errors, noise and false assumptions. For these reasons, we created The Crazy Egg A/B Test Planning Guide. Our user-friendly guide provides a roadmap through the A/B test planning process. It also serves as a convenient way to record and store your testing history for future review. What is an A/B Test? If…

A type of hypothesis testing where multiple variables are tested simultaneously to determine how the variables and combinations of variables influence the output. If several different variables (factors) are believed to influence the results or output of a test, multivariate testing can be used to test all of these factors at once. Using an organized matrix which includes each potential combination of factors, the effects of each factor on the result can be determined. Also known as Design of Experiments (DOE), the first multivariate testing was performed in 1747 by Scottish physician James Lind as a means of identifying (or…
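As a rough illustration of the "organized matrix" of factor combinations described above, the sketch below enumerates a full-factorial test matrix. The factor names and levels (`headline`, `button_color`, `layout`) are hypothetical and exist only for this example.

```python
from itertools import product

def build_test_matrix(factors):
    """Return every combination of factor levels as a list of dicts.

    `factors` maps a factor name to the list of levels to test.
    """
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]

# Hypothetical factors for a landing-page test (assumed, not from the article).
factors = {
    "headline": ["short", "long"],
    "button_color": ["green", "orange"],
    "layout": ["one-column", "two-column", "hero-image"],
}

matrix = build_test_matrix(factors)
for row in matrix:
    print(row)
```

The number of rows is the product of the level counts (here 2 × 2 × 3), which is why adding factors quickly increases the traffic a multivariate test needs.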

A term used to describe test methods or algorithms that continuously shift traffic in reaction to the real-time performance of the test. Also known as “multi-armed bandit testing”, the name is derived from the behavior of casino slot machine players who often play several machines at once in order to optimize their payout. Rather than stay with a single machine, the gambler will often play some percentage of the time on several other nearby machines. In this way, the new “hot” machine can be identified without leaving the original machine behind. When used in website testing, bandit testing represents a…
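One simple way to shift traffic in response to real-time performance is an epsilon-greedy rule, sketched below. This is only one of several bandit strategies and is not necessarily the one the article has in mind; the function names, the `epsilon` value and the simulated conversion rates are assumptions made for illustration.

```python
import random

def choose_variant(successes, trials, epsilon, rng):
    """Explore a random variant with probability epsilon, else exploit the best one."""
    if rng.random() < epsilon:
        return rng.randrange(len(trials))
    # Observed conversion rate per variant; untried variants count as 0.0.
    rates = [s / t if t else 0.0 for s, t in zip(successes, trials)]
    return max(range(len(rates)), key=rates.__getitem__)

def simulate(true_rates, visitors, epsilon=0.1, seed=0):
    """Simulate visitors against variants with the given (assumed) true rates."""
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    trials = [0] * len(true_rates)
    for _ in range(visitors):
        i = choose_variant(successes, trials, epsilon, rng)
        trials[i] += 1
        if rng.random() < true_rates[i]:
            successes[i] += 1
    return successes, trials

successes, trials = simulate([0.05, 0.5], visitors=5000)
print("traffic per variant:", trials)
```

Most traffic drifts toward the better-performing variant, while the small exploration share keeps checking the others, much like the gambler who keeps sampling nearby machines.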

Confidence Interval: A range of values calculated such that there is a known probability that the true value of a parameter lies within it. The science of statistics is all about predicting results by sampling a portion of a population. Since you can never be 100% certain of that prediction, the result is often expressed as a possible range of values. This range is also known as the confidence interval. For example, you might estimate average body weight based on a random sample of 500 men and 500 women. Your sample results will vary, so you need to add a…

Confidence Level: The percentage of time that a statistical result would be correct if you took numerous random samples. Confidence is often associated with assuredness, and the statistical meaning is closely related to this common usage of the term. To state a percentage value for confidence in something is essentially stating a level of how “sure” you are that it will happen. In statistical terms, it is the expected percentage of time that your range of values will be correct if you were to repeat the same experiment over and over again. Unfortunately, there is no such thing as a…

Margin of Error: An expression for the maximum expected difference between the true population parameter and a sample estimate of that parameter. When you are analyzing a statistical experiment or study and progress from discussing the test sample results to discussing the whole population that the sample represents, there will always be a margin of error attached to any estimated values. The margin of error will be stated with a “plus or minus” (+/-) in front of it, meaning the true value is just as likely to lie above your estimate as below it, by up to the same amount. Despite the word “error”…
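To make the relationship between confidence level, margin of error and confidence interval concrete, here is a minimal sketch for a conversion rate. It assumes the normal approximation to the binomial, which is reasonable only for fairly large samples; the function names and the example figures (50 conversions from 1,000 visitors) are hypothetical.

```python
import math
from statistics import NormalDist

def margin_of_error(conversions, visitors, confidence=0.95):
    """Margin of error for a conversion rate (normal approximation)."""
    p = conversions / visitors
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * math.sqrt(p * (1 - p) / visitors)

def confidence_interval(conversions, visitors, confidence=0.95):
    """Return (low, high): the estimate plus or minus the margin of error."""
    p = conversions / visitors
    moe = margin_of_error(conversions, visitors, confidence)
    return p - moe, p + moe

low, high = confidence_interval(50, 1000)
print(f"conversion rate: 5.0% (95% CI {low:.1%} to {high:.1%})")
```

Raising the confidence level widens the interval, and collecting more visitors narrows it.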

Sample Size: The number (n) of observations taken from a population through which statistical inferences for the whole population are made. The concept of sampling from a larger population to determine how that population behaves, or is likely to behave, is one of the basic premises behind the science of applied statistics. For example, if you have a population of 10 million adults, and you sample 10 of them to find out their favorite television show, intuitively you will realize the sample size you have chosen is not large enough to draw valid conclusions. Knowing just how large “n”…
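As a hedged illustration of how large “n” might need to be, the sketch below uses the standard formula for estimating a proportion, n = z² · p(1 − p) / E², where E is the desired margin of error. It assumes simple random sampling from a large population and uses p = 0.5 as the most conservative guess; the function name and default values are assumptions for this example, and sizing an A/B test comparison would need a different (power-based) calculation.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(margin, confidence=0.95, p=0.5):
    """Smallest n so a proportion estimate has at most the given margin of error."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

n_5pct = sample_size_for_proportion(0.05)
n_3pct = sample_size_for_proportion(0.03)
print(n_5pct, n_3pct)
```

Under these assumptions, a sample of 10 falls far short of the few hundred respondents needed for even a ±5% margin, which matches the intuition in the television example.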

The debating, campaigning and speculating will soon be over. In just a few short weeks, our new President will be elected. Somehow, through this long and arduous process, the parallels between website A/B testing and the election became apparent to me as the campaign continued to grind on. This shouldn’t come as much of a surprise. After all, an election, particularly a Presidential election, has essentially become the ultimate marketing campaign. Of course, one of the objectives of this particular marketing campaign is to make it appear as if it is NOT a marketing campaign…anyone remember “New Coke”? With a…

The concept of perfection is an interesting one. Like infinity, it represents something that can never truly exist, yet most of us talk about it as if it does. How often have you heard a friend or coworker say they don’t want to put their name on anything that is not “perfect”? 40 years ago, Nadia Comaneci became the first gymnast to score a perfect “10” in an Olympic gymnastics event, breaking the paradigm that perfection was unattainable. Is it possible to create perfection in our website testing? For the athlete, achieving perfection is all about maximizing the output of…

They say you can never have too much of a good thing. When it comes to time, money and friendship, I completely agree. For many of us, website testing for CRO has become another “good thing” we have harnessed to put data to work for maximizing profitability. But could there be a time when we have tested too much? Under certain conditions: Yes. While A/B and other types of testing for CRO can utilize data in valuable and meaningful ways, we are ultimately responsible for the decisions we make about our testing strategy. Unfortunately, this can leave us open to…