A/B Testing
A little while ago I was talking to our interaction designer Pekka about a link we had in the masthead that was under-performing. We suspected it was the wording that wasn't working, and after thinking up a few possible options Pekka said it would be great to be able to "A-B test" some of them. This led to some blank gawping from me, and so in turn to an explanation from him.
It turns out A-B testing is a way to show different options for part of the UI to two (or more) groups of users, and then monitor what they do as a result. This can be done in normal usability testing of course, but more interestingly it can also be done on a live site with large numbers of users - getting real-world results on a statistically valid scale. In our very simple case we could show the different wording options to different groups and see which one generated the most clickthroughs.
This sounded like a great idea, but not something I could fit onto the project backlog at the time. Luckily, some teams here set aside a little time within their project planning specifically to let developers 'scratch that itch'. No, not like that. I mean work on something that's bothering them rather than whatever's needed next in the project. Our team builds this into our normal project planning process, giving us up to a day a fortnight for it.
Thus I decided to spend a day writing this mechanism for doing A-B testing on bbc.co.uk.
The code
The JavaScript is split into two parts. The first is used for new visitors and, aside from checking that the function's configuration is valid, simply uses the probability value to decide whether the user is going to be part of the test, and cookies them accordingly.
The second is used for those who already have a cookie, and have been randomly selected for testing. It cycles through all the links on the page, checking for any that have a class starting 'abtest_'. When it finds one it adds an onclick which, when activated, makes a call to a tracked image so that the click is counted.
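As a sketch of that first part, the probability check and option assignment might look something like this (the function and variable names here are illustrative, not the actual bbc.co.uk code):

```javascript
// Hypothetical sketch of the new-visitor step: decide whether this
// visitor joins the test and, if so, which option they get.
function assignVariant(probability, numOfOptions) {
  // probability is out of 100, e.g. 5 means 5 in 100 visitors join
  if (Math.random() * 100 >= probability) {
    return null; // not selected: the cookie would mark them as out of the test
  }
  // map a random option index to a letter: 0 -> 'a', 1 -> 'b', ...
  var index = Math.floor(Math.random() * numOfOptions);
  return String.fromCharCode(97 + index);
}
```

The returned letter would then be written into the test cookie for the configured duration, so the user keeps seeing the same option on every visit.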
This tracking is done using the BBC's stats system, known internally as 'go-tracking', which allows us to easily get reports from the server logs showing how many times particular URLs are visited.
To make these calls I used img.src="..." to make an asynchronous call from JavaScript. Note, however, that if the link in question goes to a new page, about 1/3 of browsers won't complete the request before the page unloads. If this were an issue for a particular test, the links' hrefs could be changed so they go via a tracked URL instead.
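The beacon itself can be very small. A minimal sketch (the `fireBeacon` name and the non-browser fallback are my own, added so the snippet also runs outside a browser):

```javascript
// Minimal image-beacon sketch: assigning src triggers an HTTP GET,
// and the 1x1 image is never added to the page.
function fireBeacon(url) {
  // Illustrative fallback: use a plain object where Image doesn't exist
  var img = (typeof Image !== 'undefined') ? new Image(1, 1) : {};
  img.src = url; // the request fires as soon as src is assigned
  return img;    // keep a reference so the request isn't dropped early
}
```

In an onclick handler this fires the tracking request without blocking the user's navigation, which is exactly why the page-unload race mentioned above can occur.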
Finally, in the page itself we place some small snippets of SSI (though we could easily use more JavaScript or PHP) that check for the cookie, and change what the user gets shown according to its value.
Example use
To run a two-option test on 5 in 100 people for 24 hours, the function is initialised like this:
abtest.init('mydemo', {
    probability: 5,
    numOfOptions: 2,
    path: '/includes',
    duration: 24
});
Within the markup, content is changed according to the cookie value:
<!--#if expr="$HTTP_COOKIE=/abtest_mydemo=a/" -->
Text a
<!--#elif expr="$HTTP_COOKIE=/abtest_mydemo=b/" -->
Text b
<!--#else -->
Default text
<!--#endif -->
And classes like this are added to links that are to be counted:
class="abtest_mydemo"
Any clicks on that link will result in an asynchronous call to an image with a path that includes "mydemo" from the class and the option value (e.g. "b") from the cookie:
/go/abtest/int/abtest_mydemo_b/
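Assembling that path is simple string concatenation. A hypothetical helper (the name is mine, not from the real code) showing how the class name and cookie option combine:

```javascript
// Hypothetical helper: build the go-tracking path from the link's
// class name and the option letter read from the cookie.
function trackedPath(className, option) {
  return '/go/abtest/int/' + className + '_' + option + '/';
}
```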
Thus after the test we can count the totals for each option, and use those figures to help decide which solution to go with.
And the winner is...
Well, whaddya know: the design changed, removing the button in question, so I'm still waiting for my first excuse to use it. However, my friend Manjit over in World Service has just run a test of the wording for the main link to their syndication widget, and immediately proved a small tweak would be three times as effective at getting users' attention. Not a bad first outing.
As a method it's not without its dangers of course. The options you present must be realistic, not just chasing clickthroughs, for example - but with an experienced information architect at the helm I think this little snippet may come in quite handy.
What do you think? Have any of you used this kind of testing?
Comment number 1.
At 12th Jan 2010, Dominykas Blyze wrote:There is some sort of a bug somewhere, but my Google Reader showed this article at an incorrect URL: /blogs/webdeveloper/2009/10/ab-testing.shtml
Side note (as I don't know where to post a bug report): why am I not allowed to use my real name, which contains non-Latin characters?
Comment number 2.
At 12th Jan 2010, Mat wrote:Hi, yes, sorry about that: I originally started the article last year, then hit publish without realising that doing that wouldn't automatically update the publish date. Doh. :-(
(& I'll let Identity get back to you about your username.)
Comment number 3.
At 12th Jan 2010, Simon Cross wrote:Hey @Dominykas Blyze,
We're working on internationalising BBC iD which means you'll soon be able to have UTF-8 characters in your Display Name. We'll have a blog post here or on the Internet Blog when that happens.
Si, Product Manager, BBC iD.
Comment number 4.
At 12th Jan 2010, Dominykas Blyze wrote:Thanks a lot for the responses!
And back on topic - it's really nice to have the numbers help make a decision. Care to share more details about the insights from the testing? :)
Comment number 5.
At 14th Jan 2010, mrmsr wrote:We found that just by tweaking the link text to our widget page on the Worldservice homepage from "Widget" to "Install Widget" we counted three times more clicks.
This was just a trial and proof of concept - with some tweaking help from Mat. We are now looking to roll this out in key areas across the BBCWorldservice.com site and some of the language sites too including BBCBrasil.com.
(BTW, I took the JS route for reading the cookie and manipulating the page with Glow rather than any SSI logic within the HTML.)
Comment number 6.
At 19th Feb 2010, steven lau wrote:A/B or multivariate testing is very common. I used it lots in my last company and the results warranted full-time resources dedicated to this kind of testing. Options included: changing text, changing colours of buttons, changing page layout and removing items from the page. All variants were tracked. When implementing such a system, please include a method to force a particular variant to be displayed when required (cookie value, CGI value or similar), otherwise testing variants can be a nightmare.