When I was at the BBC, a very smart producer, Gill Parker, approached me about pulling together a massive amount of data and information she was collecting with Frank Gardner trying to unravel the events that lead to the 11 September 2001 attacks in the US. Not only had Gill worked on the BBC’s flagship current affairs programme Newsnight and on ABC’s Nightline in the US, she also had worked in the technology industry. They were interviewing law enforcement and security sources all around the world and collecting masses of information which they all had in Microsoft Word files. She knew that they needed something else to help them connect the dots, and speaking with me in Washington where I was working as BBCNews.com’s Washington correspondent at the time, she asked if help her get some database help.
I thought it was a great idea. My view was that by helping her organise all of the information that they were collecting, the News website could use the resulting database to develop info-graphics and other interactives that would help our audience better understand the complex story. We could help show relationships between all of the main actors in al Qaeda as well as walk people through an interactive timeline of events. I had a vision of displaying the information on a globe. People could move through time and see various events with key actors in the story. This was a bit beyond the technology of the time. Google Earth was still a few years away, and it would have required significant development for some of the visualisations. However, on a story like this, I thought we could justify the effort, and frankly, we didn’t need to go that far. Bottom line: Organising the data would have huge benefits for BBC journalists and also for our audiences.
?Unfortunately, it was the beginning of several years of cuts at the BBC, and the News website was coming under pressure. It was beyond the scope of what I had time to do or could do in my position, and we didn’t have database developers at the website who could be spared, I was told.
A few years later as Google Earth developed, Declan Butler at Nature used data of the spread of the H5N1virus globally to achieve something like the vision I had in terms of showing events over time and distance.
It is great to see my friend and former Guardian colleague Simon Rogers move forward with this thinking of data as a resource both internally to help journalists and also externally to help explain a complex story in his work on the Wikileaks War Logs story. Simon wrote about it on the Guardian Datablog:
we needed to make the data easier to use for our team of investigative reporters: David Leigh, Nick Davies, Declan Walsh, Simon Tisdall, Richard Norton-Taylor. We also wanted to make it simpler to access key information for you, out there in the real world – as clear and open as we could make it.
As the digital research editor at The Guardian, data was key to many of my ideas (before I left this March to pursue my own projects). I even thought that data could become a source of revenue for The Guardian. Data and analysis is something that people are willing to pay for. Ben Ayers, the Head of social media and community at ITV.com, (speaking for himself not ITV) said to me on Twitter:
Brilliant. I’d pay for that stuff. Surely the kind of value that could be, er, charged for. Just sayin’ … just an example of where, if people expect great interpretation of data as part of the package, the Guardian could charge subs
As I replied to Ben, I wouldn’t advocate charging for data for the War Logs, but I would suggest that charging for data about media, business and sports. That could become an important source of income to help subsidise the cost of investigations like the War Logs. Data wrangling can be time intensive. I know from my experience in developing the media job cuts series that I wrote at the end of 2009 for The Guardian. However, the data can be a great resource for journalists writing stories as well as developing interactive graphics like the media job cuts map or the IED attack map for the War Logs story. Data drives traffic, as the Texas Tribune in the US has found, and I believe that certain datasets could be developed into new commercial products for news organisations.