Metrics, Part 2: Are we measuring the right things?

(If you haven’t already read it, you might like to take a look at Part 1: The Webstats Legacy.)

Anand Giridharadas asks in the New York Times, Are metrics blinding our perception? He begins by talking about the Trixie Telemetry company, which takes data about a baby’s naps, nappy changes and feed times and turns it into charts, graphs and analyses to “help parents make data-based decisions”. He then goes on to say:

Self-quantification of the Trixie Telemetry kind is everywhere now. quantifies your sexual encounters. quantifies your progress toward goals like losing weight. Withings, a French firm, makes a Wi-Fi-enabled weighing scale that sends readings to your computer to be graphed. There are tools to measure and analyze the steps you take in a day; the abundance and ideological orientation of your friends; the influence of your Twitter utterances; what you eat; the words you most use; your happiness; your success in spurning cigarettes.

Welcome to the Age of Metrics — or to the End of Instinct. Metrics are everywhere. It is increasingly with them that we decide what to read, what stocks to buy, which poor people to feed, which athletes to recruit, which films and restaurants to try. World Metrics Day was declared for the first time this year.

But measure the wrong thing and you end up doing the wrong thing:

Will metrics encourage charities to work toward the metric (acres reforested), not the underlying goal (sustainability)? […] Trees are killed because the sales from paper are countable, while a forest’s worth is not.

The same is true in social media. Count the wrong thing and you’ll do the wrong thing. As Stephanie Booth says, in the second video in this post:

As soon as you start converting behaviours into numbers then people adapt their behaviour to have good numbers.

She goes on to say that some of her clients believe that the number of comments they have on a blog post is a measure of success, but because of this they become obsessed with getting people to comment:

So you’re going to write posts which make people react or you’re going to encourage people to have chatty conversations in your comments. That’s really great, you get lots of comments, but does it mean that what you’re providing is really more valuable? […] I don’t believe that more is always better, that more conversation is always better. It’s “Is it relevant?” And that’s something that we do not know how to measure in numbers.

If the key metric for assessing success is a simplistic one like ‘page views’ or ‘unique users’ or ‘comments’, the emphasis in your web 2.0 strategy will be on creating something populist instead of something that meets a business need.

Let’s say you’re in eCommerce and you sell pet supplies. Your business goal is not ‘get more people onto our website’, it is ‘get more people buying pet supplies from our website’. The two are very different indeed. A company that believes it just needs to get lots and lots of people through the virtual door will focus on anything that might bring more attention and traffic. A company that understands it needs to attract the right people will focus on communicating with passionate pet lovers who arrive at the site primed to buy.

This is why niche blogs can command higher advertising rates than general news sites. Advertisers can see that more of the people who click their ads will actually buy their products and are willing to pay more for these higher quality visitors.
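To make the difference between traffic and buyers concrete, here is a minimal sketch in Python. The campaign names and every number in it are invented purely for illustration; the point is only that a smaller, better-targeted audience can convert at a far higher rate than a large, indifferent one.

```python
def conversion_rate(visitors: int, buyers: int) -> float:
    """Return the fraction of visitors who went on to buy something."""
    if visitors == 0:
        return 0.0
    return buyers / visitors


# Hypothetical figures, made up for this example only.
broad_campaign = {"visitors": 100_000, "buyers": 500}
niche_campaign = {"visitors": 5_000, "buyers": 400}

broad_rate = conversion_rate(**broad_campaign)
niche_rate = conversion_rate(**niche_campaign)

print(f"Broad traffic: {broad_rate:.1%} of visitors bought")  # 0.5%
print(f"Niche traffic: {niche_rate:.1%} of visitors bought")  # 8.0%
```

In this made-up case the broad campaign brings in twenty times the visitors but only slightly more buyers, which is why judging it on visitor numbers alone would be misleading.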

Equally, let’s say you want to ‘improve collaboration’ internally and to that end you start a wiki. You start measuring activity on the wiki and focus on ‘edits per user’ as a key metric. You encourage people to edit more, but the quality and amount of collaboration doesn’t increase as you expected. Why? Because people learnt that fixing a single typo boosts their ‘edits per user’ count and takes a lot less effort than creating a new page, engaging with a co-worker or making a substantive change. Focusing on the wrong numbers changes the wrong behaviour.

In order to think about metrics, you need to know exactly what you’re using social media for. Figure that out and you’re halfway there.

Metrics, Part 1: The webstats legacy

Probably the hardest part of any social media project, whether it’s internal or external, is figuring out whether or not the project has been a success. In the early days of social media, I worked with a lot of clients who were more interested in experimenting than in quantifying the results of their projects. That’s incredibly freeing in one sense, but we are (or should be) moving beyond the ‘flinging mud at the walls to see what sticks’ stage into the ‘knowing how much sticks’ stage.

Social media metrics, though, are a bit of a disaster zone. Anyone can come up with a set of statistics, create impressive-sounding jargon for them and pull a meaningless analysis out of their arse to ‘explain’ the numbers. Particularly in marketing, there’s a lot of hogwash spoken about ‘social media metrics’.

This is the legacy of the dot-com era in a couple of ways. Firstly, the boom days of that era attracted a lot of snakeoil salesmen. After the crash, businesses, now sceptical about the internet, demanded proof that a site really was doing well. They wanted cold, hard numbers.

Sysadmins were able to pull together statistics direct from the webserver and the age of ‘hits’ was born. For a time, back there in the bubble, people talked about getting millions of hits on their website as if it was something impressive. Those of us who paid attention to how these stats were gathered knew that ‘hits’ meant ‘files downloaded by the browser’, and that stuffing your website full of transparent gifs would artificially bump up your hits. Any fool could get a million hits – you just needed a web page with a million transparent gifs on it and one page load.
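The gap between ‘hits’ and page loads is easy to show with a small sketch. The list of requested files below is hypothetical, and treating anything ending in .html, .htm or a slash as a ‘page’ is a simplifying assumption, not how any particular stats package works.

```python
# Hypothetical requests generated by ONE visitor loading ONE page:
# the HTML itself plus every file the browser fetched to render it.
requests = [
    "/index.html",
    "/style.css",
    "/logo.png",
    "/spacer1.gif",
    "/spacer2.gif",
    "/spacer3.gif",
]

# Assumption for this sketch: these suffixes mark an actual page.
PAGE_SUFFIXES = (".html", ".htm", "/")


def count_hits(paths):
    """A 'hit' is any file downloaded by the browser."""
    return len(paths)


def count_page_views(paths):
    """Count only the requests that look like pages."""
    return sum(1 for path in paths if path.endswith(PAGE_SUFFIXES))


print(count_hits(requests))        # 6
print(count_page_views(requests))  # 1
```

Add more transparent gifs to the page and the first number climbs while the second stays at one.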

This led to the second legacy: an obsession with really big numbers. You see it everywhere, from news sites talking about how many ‘unique users’ they get in comparison to their competitors to internal projects measuring success by how many people visit their wiki or blogs. It’s understandable, this cultural obsession with telephone-number-length stats, but it’s often pointless. You may have tens of thousands of people coming to your product blog, but if they all think it’s crap you haven’t actually made any progress. You may have 60% of your staff visiting your internal wiki, but if they’re not participating they aren’t going to benefit from it.

Web stats have become more sophisticated since the 90s, but not by much. Google Analytics now provides bounce rates and absolute unique visitors and all sorts of stats for the numerically obsessed. Deep down, we all know these are the same sorts of stats that we were looking at ten years ago but with prettier graphs.

And just like then, different statistics packages give you different numbers. Server logs, for example, have always provided numbers that were orders of magnitude higher than a service like StatCounter, which relies on you pasting some JavaScript code into your web pages or blog. Even amongst external analytics services there can be wild variation. A comparison of StatCounter and Google Analytics shows that numbers for the same site can be radically different.

Who, exactly, is right? Is Google undercounting? StatCounter overcounting? Your web server overcounting by a factor of 10? Do you even know what they are counting? Most people do not know how their statistics are gathered. JavaScript counters, for example, can undercount because they rely on the visitor enabling JavaScript in their browser. Many mobile browsers will not show up at all because they are not able to run JavaScript. (I note that the iPhone, iTouch and Android do show up, but I doubt that they represent the majority of mobile browsers.)

Equally, server logs tend to overcount, not just because they’ll count every damn thing, whether it’s a bot, a spider or a hit from a browser, but also because they’ll count everything on the server, not just the pages with JavaScript code on. To some extent, different sorts of traffic will be distinguished by the analytics software that is processing the logs, but there’s no way round the fact that you’re getting stats for every page, not just the ones you’re interested in. Comparing my server stats to my StatCounter shows the former is 7 times the latter. (In the past, I’ve had sites where it’s been more than a factor of ten.)
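As a rough illustration of why the two sources disagree, here is a sketch that counts the same set of log entries two ways. The log entries, the user-agent strings, the list of bot markers and the set of ‘tracked’ pages are all hypothetical, and real analytics software uses far more elaborate rules than a substring check.

```python
# Assumption for this sketch: a user agent containing any of these
# words belongs to an automated visitor.
BOT_MARKERS = ("bot", "spider", "crawler")

# Hypothetical: the only pages carrying the JavaScript counter.
TRACKED_PAGES = {"/blog/post-1", "/blog/post-2"}

# Hypothetical (path, user agent) pairs pulled from a server log.
log_entries = [
    ("/blog/post-1", "Mozilla/5.0 (Windows) Firefox/3.6"),
    ("/blog/post-1", "Googlebot/2.1"),
    ("/images/header.png", "Mozilla/5.0 (Windows) Firefox/3.6"),
    ("/feed.xml", "ExampleSpider/1.0"),
    ("/blog/post-2", "Mozilla/5.0 (Macintosh) Safari/531"),
    ("/admin/login", "Mozilla/5.0 (Macintosh) Safari/531"),
]


def is_bot(user_agent):
    """Crude check: does the user agent contain a known bot marker?"""
    lowered = user_agent.lower()
    return any(marker in lowered for marker in BOT_MARKERS)


# What the raw server log reports: every request, from anything.
raw_count = len(log_entries)

# Roughly what a page-level counter sees: humans on tracked pages.
filtered = [
    (path, agent)
    for path, agent in log_entries
    if path in TRACKED_PAGES and not is_bot(agent)
]

print(raw_count)      # 6
print(len(filtered))  # 2
```

Neither number is wrong; they are answers to different questions, which is exactly the problem if you don’t know which question your stats package is asking.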

So, you have lots of big numbers and pretty graphs but no idea what is being counted and no real clue what the numbers mean. How on earth, then, can you judge a project a success if all you have to go on are numbers? Just because you could dial a phone with your total visitor count for the month and reach an obscure island in the Pacific doesn’t mean that you have hit the jackpot. It could equally mean that lots of people swung past to point and laugh at your awful site.

And that’s just web stats. Social media stats are even worse, riddled with the very snakeoil that web stats were meant to guard against. But more on that another day.