On the NYC-CHI mailing list today, someone from a big web shop wrote this:

My company is dismantling its focus group and usability testing lab, due to a chronic lack of space and clients’ growing reluctance to allow our agency to do testing on our own work.

Behavior’s thinking has always been that it takes an awful lot of chutzpah to try to sell in-house usability testing services as part of a web design and development process. We build test prototypes, write test plans, advise on the creation of test screeners and suggest types of participants for recruiting, and of course we observe the testing and take notes — but the actual recruitment of subjects, the proctoring/facilitating of the sessions, the recording of the sessions, and the synthesis and reporting of the test results is done by a third party, always.

How do other consulting firms “get away” with testing their own designs? If your company does offer design and in-house usability testing services, have you heard clients express distrust of the model? If so, how do you get over it?

UPDATE: I should distinguish between the different scales of user testing here. When a design team conducts quick and informal usability testing (i.e., non-lab-based, such as with colleagues and friends), well, somehow that’s a lot easier for me to swallow than when a large-scale formal lab study is done by that design team. It’s a healthy part of a design process to build in informal testing, and the benefits of stepping back and reviewing a site in this way far outweigh the risks of bias or glossing over problems.

It’s funny how wildly different the two ends of this spectrum seem, at least to me: Low-fi informal testing done by the design consultant seems, to me, healthy and honest and worth paying extra for… while major formal lab testing done by the same design consultant seems highly vulnerable to bias. Maybe it’s because the stakes seem so much higher in the formal testing.


4 responses to “Usability Foxes Guarding the Henhouse”

  1. Most of our clients express distrust at the whole concept of usability testing. I’m not sure if it’s the immaturity of the Australian web market, or the size of the businesses we’re working for, but it’s a rare thing for the idea of testing to go unquestioned at proposal stage, and even rarer for the client to be prepared to pay for it. It’s incredibly frustrating, and the end result is that we do most of it in-house, with friends and beer. Sad.

  2. Ideally, there shouldn’t be a problem with in-house formal testing, as the methodology and data should be presented merely as transparent data, not as a conclusion. You are correct that it is all too easy to cherry-pick and bias the testing, but if you’re fully transparent, any biases are immediately found.

    This brings up two points: First, most clients don’t want to pay for formal user testing. It’s not that they don’t believe it will yield valuable data; rather, they’re cheap bastards. Even afterwards, when their shiny new application is collecting dust from disuse, they rarely understand that some simple testing could have made the application viable.

    Secondly, what often happens is that the stakeholders, who aren’t the end users of the product, dictate what they think the application should do. The application then often becomes a failure. Look at the navigational structure of most corporate websites: the vast majority of them are not user-centric at all, but rather reflect the corporate structure of the client. Navigational structures aren’t org charts!

    Look at Dell.com, for example. I have no idea whether I am a consumer, home office, small office, large office, or enterprise customer when all I want is a 23″ wide-aspect LCD monitor, and yet I must choose one before moving forward.

  3. CMH: The key figure in this equation is the test moderator. If the moderator is the information architect who devised a given design, or even a close colleague of that person, the moderator may be inclined to give the test subjects subtle hints about what to click, or to avoid questions that might reveal deep fundamental flaws in the design’s basic concepts. A truly impartial moderator might ask “Did you find this confusing?”, whereas someone with even a tiny bit of bias might find even a simple but probing question like that hard to muster.

  4. The key figure in this equation is the test moderator

    I totally agree, although in a way you may not expect. Why does the moderator have to be in the picture during the study at all? I’ve done studies that prep a subject by asking them to complete a task based on the UI presented and to say aloud everything they’re thinking. No interruption is allowed unless they’re totally stuck, at which point the task ends. Even a question like “Do you find this confusing?” is leading. If the subject can’t figure out where to click after three minutes, there is clearly something “up”. This is where I am coming from with regard to the Tufte mantra of “Present data”. Everything else is interpretation.