Small numbers in user-centered design

The misconception – or the mistaken intuition – that a small sample accurately reflects underlying probabilities is so widespread that Kahneman and Tversky gave it a name: the law of small numbers. The law of small numbers is not really a law. It is a sarcastic name describing the misguided attempt to apply the law of large numbers when the numbers aren’t large. – The Drunkard’s Walk

User-centered designers, who accept the qualitative premise that only a handful of participants is needed to get useful usability data, must be careful not to fall into the trap of the law of small numbers.

As a basic example, if 4 out of 5 participants fail to recognise the purpose of a particular button on a page, a designer should not assign that failure any greater statistical significance than the experience of the 1 participant who had no problem with it. Rather, he or she must continue to look at each participant’s feedback in isolation from the others, and address it using the qualitative data at his or her disposal.
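To see why a 4-out-of-5 result deserves caution, it helps to look at the raw probabilities. The sketch below (a simple binomial calculation, not anything from the original text) shows that even if only half of all users would actually struggle with the button, a 5-person test still produces 4 or more failures a meaningful fraction of the time:

```python
from math import comb

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance of seeing
    k or more 'failures' among n participants when each fails
    independently with probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# If the true failure rate were only 50%, a 5-person test would
# still show 4 or 5 failures almost 19% of the time.
print(prob_at_least(4, 5, 0.5))  # → 0.1875
```

In other words, nearly one test in five would look this lopsided purely by chance, which is exactly the kind of random deviation the law of small numbers tempts us to over-interpret.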

Such information could include contextual or cultural knowledge about each participant and why the issue arose for each of them specifically. From this, one could make changes that address the issues raised without disrupting the experience of the 1 participant who didn’t have a problem in the first place. How one does this is ultimately what makes design hard, and why qualitative user testing is never a silver bullet. It’s a helpful, but ultimately thin, validation point before proceeding further with a design. The real proof will come from larger numbers – numbers you can only realistically get once a design has been deployed.

It is perhaps this difficulty of translating isolated, qualitative feedback into a single all-inclusive design solution that results in statements like, “based on testing, most people would have an issue with that button”, even though nothing statistically significant has been shown. It reads as an attempt to beef up the reliability of an inconclusive starting point. That doesn’t make the finding unhelpful; it just isn’t conclusive. A sample of 5 participants falls far short of what is actually required to rule out random deviations from the norm. Yet findings are framed this way all the time, and audiences regularly eat them up.
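The width of the uncertainty here can be made concrete. A minimal sketch, using the standard Wilson score interval for a binomial proportion (my choice of method, not one named in the original), shows how little a 4-of-5 result pins down the true failure rate:

```python
from math import sqrt

def wilson_interval(successes, n, z=1.96):
    """95% Wilson score confidence interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    margin = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - margin, centre + margin

# 4 of 5 participants struggled: the plausible "true" failure rate
# spans roughly 38% to 96% -- far too wide to justify "most people".
lo, hi = wilson_interval(4, 5)
print(round(lo, 2), round(hi, 2))
```

The interval is consistent with anything from a sizeable minority to nearly everyone having the issue, which is why “most people would have an issue with that button” overstates what 5 participants can tell us.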

The law of small numbers is a very easy trap to fall into, especially when designers are being pressed by those around them to produce harder, more actionable findings.

Nonetheless, real long-term design success requires resisting the pressure, and the innate urge, to draw erroneous statistical conclusions from qualitative methods. That is only possible if we accept that a good amount of user testing will not provide us with actionable information. We need to be able to accept this, throw away our testing scripts, recordings, and notes, fall back on our heuristic experience, and be ready to change things quickly when more conclusive data becomes available.