What counts, and what gets counted

Books | The quest to quantify the performance of our most important institutions can backfire, but what other choice do we have?

Carmela Chivers 4 April 2018

Teaching to the test? Keith Morris/Alamy

The Tyranny of Metrics
By Jerry Z. Muller | Princeton University Press | $54.99 | 240 pages

We are obsessed with metrics. From measuring our children’s learning with standardised tests to holding employees to their key performance indicators, we are awash with data on human attainment. But while metrics undoubtedly can be useful, might we be putting too much faith in the numbers?

Jerry Z. Muller thinks so. In his new book, he argues that the desire to hold organisations and their employees to account has morphed into the mistaken belief that standardised measurements are the key to improving our most important institutions. He calls this “metric-fixation,” and says its consequences are profound for both government and business.

Muller’s book is full of examples of how an over-reliance on metrics can backfire. Using case studies from universities, schools, hospitals, business and government, he shows that misused metrics can impede progress towards the very goals they are supposed to promote.

Problems arise especially when the people using metrics forget that they are a means to an end – a tool. Used carelessly, they can encourage employees to game the stats, ignore unmeasured (but still important) goals, or turn down difficult work.

Muller points to the TV series The Wire as a perfect illustration of how this plays out. In the show, police commanders in Baltimore are so focused on “hitting the numbers” – drug arrests, cases solved, reductions in crime rates – that they have little patience for detectives who try to solve complex cases. It’s better to bring in easy catches like low-level drug offenders, even though doing so does nothing to solve the underlying problem. And worse, because the success of the police department is measured by the number and seriousness of crimes reported, officers have an incentive to misrepresent how bad things are on the ground.

Reality is more complex than a TV crime drama, of course. But Muller argues that The Wire hit on a problem that modern policing really does face: too much emphasis on performance indicators creates warped incentives for officers and police departments. It’s tempting to downgrade reported offences, for instance, if crime rates are taken as the measure of a police department’s success. And when things go very wrong, police may create the illusion of falling crime rates by putting pressure on victims to withdraw complaints, as happened in Queensland last year.

Metric-fixation is not only a problem in policing. In universities, a “publish or perish” culture undervalues complex work, downgrades teaching and discourages the publication of certain kinds of research findings (such as negative but potentially useful results). In schools, judging teachers by their students’ results encourages “teaching to the test.” In business, quotas and performance targets can lead to unethical practices, especially when the goals are set too high. In British hospitals, performance statistics have resulted in some surgeons turning down difficult or risky operations.

Muller does a great job of showing how these problems are linked to dodgy metrics. The case studies are detailed and well-reasoned, though they become a bit repetitive as Muller’s rollcall of metric-fixations proceeds (which may be his point). For anyone who finds comfort in numbers, this is a timely warning not to be complacent. Metrics are only useful when they are well designed, a good proxy for what they are supposed to measure, and understood in a broader context. Muller argues, persuasively, that standardised results on human performance should complement experience and judgement, rather than replace them.

But an important point is missing. Muller never properly explains how we should judge human performance without using admittedly imperfect numbers. The book left me wondering: what other choice do we have?

Muller’s suggestion is to put more emphasis on the experience and judgement of experts working in a field – but it’s an idea that he fails to flesh out. He simply doesn’t acknowledge that the drawbacks of using metrics may be outweighed by the benefits.

There’s a reason why metrics are so popular. They enable policy-makers and practitioners to compare performance across different domains. By putting a number on what’s going on inside our institutions, they enable us to measure progress and test interventions. They showcase best practice and highlight problems.

An astounding example of metrics working well, cited only briefly by Muller, is the use of performance measures to reduce the incidence of central-line infections in American hospitals. Before the intervention, 32,000 people died each year from these preventable hospital-acquired infections. Doctors developed a simple checklist to follow when inserting a central line, and monitored whether the procedure was being followed correctly. The results were dramatic: the rate of infection dropped 66 per cent in hospitals that adopted the new procedure.

Muller says the intervention worked because metrics were used to support medical staff to work towards common professional goals. But what he doesn’t mention is that there was initial opposition to the checklist. Many clinicians felt it undermined their professional autonomy, and it required significant cultural change among staff. Had hospitals relied on the experience and judgement of individual medical professionals, high rates of central-line infections might still have been seen as “inevitable,” as was the common view before the intervention.

Metrics, even imperfectly used, can also increase accountability. They compel employees and their superiors to take stock of progress and provide evidence that they are working towards certain goals. When shared with the public, metrics give citizens information about how well institutions are working; they inform decisions made at the ballot box.

It is difficult to see how non-empirical measures could do a better job. Perhaps the key is to find a way to better combine professional experience and judgement with relevant, targeted data collection. By identifying the problem, Muller’s The Tyranny of Metrics gets us part of the way there. From here, the trick will be to use qualitative and quantitative data to create a more meaningful picture of how institutions and their employees are performing. Perhaps, having alerted us to the risks of metrics, Muller should call his next book The Value of Metrics. ●