How Not To Do An A/B Test

A Homemade Mess

There are a large number of ways to make a hot mess out of an A/B test. Here are five:

1. Don’t Measure Conversions
“We don’t have time to set up conversion tracking. Let’s just decide based on click rate.”
This is a terrible idea. Clicking and converting are two very different things, and click rates are often not correlated with conversion rates. For example, I click on pictures of Ferraris, because I like to look at Ferraris, but you can ask all my friends – I have never bought a Ferrari. I have bought a Mazda, a Toyota, a Datsun and a Ford Maverick. You can show me a Ferrari if all you want is clicks, but show me something I might actually buy if you want conversions.

2. Don’t Do Any Test Size Calculations
Ten minutes of work could tell you that you won’t have enough data to read your test even if you ran it for two years. Are you sure you can’t afford some time to do a Google search for “A/B Test Calculator” and plug some numbers into a form?

I’ll save you even more time; use this one: ABBA
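
If you’d rather see the arithmetic than trust a form, here is a minimal sketch of the standard two-proportion sample size formula in Python. The baseline rate, the hoped-for lift, the confidence level and the power below are made-up assumptions; plug in your own numbers.

    import math
    from scipy.stats import norm

    def sample_size_per_arm(p_base, p_variant, alpha=0.05, power=0.80):
        """Approximate visitors needed in EACH arm of a two-sided A/B test."""
        z_alpha = norm.ppf(1 - alpha / 2)  # critical value for the confidence level
        z_beta = norm.ppf(power)           # critical value for the desired power
        variance = p_base * (1 - p_base) + p_variant * (1 - p_variant)
        return math.ceil((z_alpha + z_beta) ** 2 * variance / (p_base - p_variant) ** 2)

    # Hypothetical numbers: a 2% baseline conversion rate and a hoped-for lift to 2.5%
    print(sample_size_per_arm(0.02, 0.025))  # roughly 13,800 visitors per arm

If that number is bigger than two years of your traffic, you have your answer before the test even starts.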

3. Stick With Your Test Size Calculations No Matter What Happens
The test size calculations you did were based on some assumptions: the confidence level, the magnitude of the difference you wanted to be able to detect, and the expected performance of the baseline or control. After you’ve run the test for a while, you can see where reality and your assumptions have parted ways. What should you do? Most people do repeated significance calculations and quit when they are satisfied with the significance. If you do this, you’ve spent too much time on your test and paid an unnecessary opportunity cost. Had you known about Anscombe’s Stopping Rule, which takes a regret-minimization approach, you could have quit sooner and would actually have ended up with more conversions.

Check it out: A Bayesian Approach to A/B Testing
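
To get a feel for the Bayesian framing, here is a minimal sketch that puts Beta posteriors on the two conversion rates, asks how likely B is to beat A, and computes the expected loss (regret) of shipping B anyway. It illustrates the general idea, not Anscombe’s rule itself, and the conversion counts are invented.

    import numpy as np

    rng = np.random.default_rng(0)

    # Observed so far: (conversions, visitors) for each variant -- hypothetical numbers
    a_conv, a_n = 120, 5000
    b_conv, b_n = 145, 5000

    # Beta(1, 1) prior updated with the observed successes and failures
    a_samples = rng.beta(1 + a_conv, 1 + a_n - a_conv, size=100_000)
    b_samples = rng.beta(1 + b_conv, 1 + b_n - b_conv, size=100_000)

    prob_b_beats_a = (b_samples > a_samples).mean()
    expected_loss = np.maximum(a_samples - b_samples, 0).mean()  # regret if we ship B

    print(f"P(B > A) = {prob_b_beats_a:.3f}")
    print(f"Expected loss from choosing B = {expected_loss:.5f}")

The point of a regret-minimizing stopping rule is to quit once that expected loss drops below a threshold you can live with, rather than waiting around for an arbitrary significance level.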

4. Don’t Think About Gating
What is gating?
Let’s say you have two different versions of a page: Version A and Version B. Let’s say your plan is to rotate them randomly. Let’s say your site and your content are such that most people come to the site repeatedly, say two to six times per week. If you are rotating Version A and Version B completely at random, then most of your users are going to see a blended treatment. This will dilute the effect you are trying to measure. To fix this, you want to make the version a person sees “sticky” so one group sees only Version A during the test and the other group sees only Version B. That way each group sees a consistent treatment and you will see more of an effect (assuming the differences between A and B are substantial enough).
This is called “gating” and is done by randomly assigning new visitors (people with no gating in their cookie) to Version A or B, and then storing that in their cookie so that the next time they will see the same version.
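In code, gating can be as simple as a one-time random draw that you then persist. The sketch below uses a plain dict to stand in for the visitor’s cookie; in a real site you would read and write the cookie through your web framework.

    import random

    def assigned_variant(cookies):
        """Return this visitor's sticky variant, assigning one on the first visit."""
        if "ab_variant" not in cookies:
            cookies["ab_variant"] = random.choice(["A", "B"])  # one-time random gate
        return cookies["ab_variant"]

    # Simulated repeat visitor: the first visit assigns a variant, later visits reuse it
    visitor_cookies = {}
    first_visit = assigned_variant(visitor_cookies)
    later_visit = assigned_variant(visitor_cookies)
    assert first_visit == later_visit  # consistent treatment across visits

Hashing a stable user ID works too and survives cleared cookies, but the cookie version is the one described above.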

5. Conclude That Your A/B Testing Result is Actually Optimal
An A/B test picks one “best” version for everyone. But isn’t it possible that there are some people in the audience who’d respond best to Version A and others who’d respond best to Version B? For that, you’d need to be able to collect lots of data about what kinds of users respond to the different options, and then you’d need a way to target the two versions at the audiences they work best with. Fortunately such tools exist.
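
As a toy illustration of the targeting idea (not how any particular tool works), suppose you have already logged conversion rates by variant and by a single user attribute, say new versus returning visitors. The numbers below are invented.

    # Toy per-segment targeting with invented numbers:
    # serve whichever variant has converted better for that kind of visitor so far.
    observed_rates = {
        # segment: {variant: observed conversion rate}
        "new_visitor": {"A": 0.018, "B": 0.026},
        "returning_visitor": {"A": 0.031, "B": 0.022},
    }

    def targeted_variant(segment):
        """Pick the historically better variant for this segment."""
        rates = observed_rates[segment]
        return max(rates, key=rates.get)

    print(targeted_variant("new_visitor"))        # B
    print(targeted_variant("returning_visitor"))  # A

A plain A/B test would force one of those choices on everybody; targeting lets each group get the version that works for it.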

Check out the toolset I work with every day at [X+1]: [X+1] Home Page
