Benchmark Testing

Running longitudinal usability tests

At Zapier, we used A/B testing as we shipped and iterated on features. While helpful, A/B tests focus attention on single features and a few backstop metrics.

Benchmark usability testing helps you monitor macro-level changes to the user experience of a product.


Benchmark testing demands repeatability. You want to be able to compare results from one interval to the next to see how things are changing. For us, this involved:

I. Tasks

We created a set of tasks that any user could complete.

II. Rubric

To limit researcher bias, we developed a standard tagging taxonomy with strict definitions.

III. Tooling

We built tools to catalog observations.

We used Airtable to code and tally the responses.
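To make the coding-and-tallying step concrete, here is a minimal sketch in Python. It is not the Airtable setup we used: the tag names, definitions, round labels, and observations are all hypothetical placeholders. It only illustrates the idea of rejecting any tag outside a fixed taxonomy and then comparing tag counts between two rounds.

```python
from collections import Counter

# Hypothetical taxonomy: tag names and definitions are illustrative only.
TAXONOMY = {
    "task_failed": "Participant did not reach the task goal.",
    "needed_help": "Participant asked the moderator for a hint.",
    "confusion": "Participant voiced uncertainty about what to do next.",
}

# Hypothetical coded observations as (round, task, tag) tuples.
OBSERVATIONS = [
    ("round_1", "task_1", "confusion"),
    ("round_1", "task_1", "needed_help"),
    ("round_1", "task_2", "task_failed"),
    ("round_1", "task_2", "confusion"),
    ("round_2", "task_1", "confusion"),
    ("round_2", "task_2", "needed_help"),
]


def tally_by_round(observations, taxonomy):
    """Count tags per round, rejecting any tag that is not in the taxonomy."""
    counts = {}
    for round_id, _task, tag in observations:
        if tag not in taxonomy:
            raise ValueError(f"Unknown tag {tag!r}; add it to the taxonomy first")
        counts.setdefault(round_id, Counter())[tag] += 1
    return counts


def compare_rounds(counts, earlier, later, taxonomy):
    """Return {tag: later_count - earlier_count} for every tag in the taxonomy."""
    before = counts.get(earlier, Counter())
    after = counts.get(later, Counter())
    # Counter returns 0 for tags that never occurred in a round.
    return {tag: after[tag] - before[tag] for tag in taxonomy}


if __name__ == "__main__":
    counts = tally_by_round(OBSERVATIONS, TAXONOMY)
    for tag, delta in compare_rounds(counts, "round_1", "round_2", TAXONOMY).items():
        print(f"{tag}: {delta:+d}")
```

Keeping the taxonomy in one place and failing on unknown tags is one way to enforce the strict definitions: a researcher cannot quietly invent a new tag mid-round, so counts stay comparable across intervals.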


The Airtable summaries made it easy to create a one-page scorecard.

I followed this up with a report describing the qualitative changes we saw. The report was also a great opportunity to highlight recent changes that had alleviated prior issues.