Reuters’ Connected China: How to win the argument for big data projects at news organisations

Connected China project from Thomson-Reuters

Data has long been a part of journalism, but I think we’re moving into a new and exciting phase in which data is helping to drive innovations in storytelling. Aron Pilhofer and the interactive team at the New York Times, Simon Rogers and the data team at the Guardian and Reg Chua, the data and innovation editor at Reuters, are all exploring new ways to bring together data and novel storytelling techniques that help reveal context and connections while also engaging audiences with rich narratives. I am very fortunate to call all three friends and, in the case of Simon, a former colleague.

At the recent NICAR conference in the US, Reg unveiled the latest example of this new wave of data-driven projects, Connected China. What is Connected China? Reg explains on his blog:

It’s a little hard to sum up simply; at one level, it’s a microsite that focuses on looking at power in China, explaining how it flows, the key players and institutions, and their relationships, featuring stories and rich multimedia (including fantastic archival footage.) But it’s much more than that: It’s also a series of innovative data visualizations that pull from a rich, underlying database of people, institutions and relationships to illustrate the connections, careers and positions of key officials in China. And more than that: It’s a great example of how the combination of data, visualizations, stories and multimedia can be much more than the sum of their parts.

To me the micro-site is only part of the story. It is one application built from a database.

It’s an amazing database: Tens of thousands of entities, 30,000 relationships, and a million and a half words (not to mention the array of news stories, photos and videos also featured in the app.) The team structured tons of information – connections, the importance of job roles, etc – with an editorial sensibility. In other words, they applied news judgment – but rather than use it just in stories, they used it to structure data.

One of the really powerful features of this database is the relationships. This is an incredibly rich store of context. In the visualisations created on the site, suddenly, a web of power invisible to all but the most knowledgeable experts on China becomes visible.
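To make the idea of a relationship database concrete, here is a minimal sketch of how people, institutions and the links between them might be stored and queried. It is an illustration only, not how Connected China is built: the class names, the relation labels and the placeholder entities ("Official A", "Institution X") are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A person or institution in the database (hypothetical schema)."""
    entity_id: str
    name: str
    kind: str  # e.g. "person" or "institution"


class RelationshipStore:
    """Keeps entities plus labelled links between them."""

    def __init__(self):
        self.entities = {}
        # entity_id -> list of (relation label, other entity_id)
        self.edges = defaultdict(list)

    def add_entity(self, entity):
        self.entities[entity.entity_id] = entity

    def relate(self, source_id, relation, target_id):
        # Store the link in both directions so it can be found from either side.
        self.edges[source_id].append((relation, target_id))
        self.edges[target_id].append((relation, source_id))

    def connections(self, entity_id):
        """Return (relation, name) pairs for everything linked to an entity."""
        return [
            (relation, self.entities[other_id].name)
            for relation, other_id in self.edges.get(entity_id, [])
        ]


# Placeholder data, invented purely for illustration.
store = RelationshipStore()
store.add_entity(Entity("p1", "Official A", "person"))
store.add_entity(Entity("p2", "Official B", "person"))
store.add_entity(Entity("org1", "Institution X", "institution"))
store.relate("p1", "works at", "org1")
store.relate("p1", "former colleague of", "p2")

for relation, name in store.connections("p1"):
    print(f"Official A {relation} {name}")
```

Even in a toy version like this, the relationships are what carry the context: asking for everything connected to one person is a single lookup rather than a trawl through old stories.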

The power of connections and context

For years, I’ve dreamed of creating projects like this that unveil relationships between events, people and companies. Years ago, I was inspired by a project called the Shakespeare Explorer developed for the Kennedy Center in Washington DC. It is a wonderful multimedia project that brings together pictures, places, plays and historical events. The timeline highlights connections between the plays and events in history, putting Shakespeare’s plays in a broader context.

You can see something similar with this PBS Frontline interactive showing the connections within Al Qaeda’s network and how much the US intelligence services knew about the network at various points in time between 1993 and 2001. There were a few features like this around the time of the 9/11 attacks.

These are powerful features, and Connected China shows how a decade of development has moved these concepts forward. The question becomes why we haven’t seen more of them.

How to justify the effort

Projects like Connected China take a lot of time and resources. When I was at the BBC, my colleague Gill Parker worked with a database startup in addition to her work in journalism. Gill was working on the team with BBC security correspondent Frank Gardner looking into the global investigation into the 9/11 attacks. They had huge amounts of material, mainly Microsoft Word documents, but they didn’t have an efficient way to organise them. Gill knew there was a better way, so she reached out to me to see if I could connect her with someone at BBC News Online who could provide database expertise. Unfortunately, we didn’t have the spare resources. Well, we probably did, somewhere, but at the time the BBC wasn’t very good at pooling resources across the organisation.

I’ve thought a lot about strategies for making the case, both then and over the years. These are my recommendations:

Plan for reuse, not artisanal single-use apps – If there is one lesson I learned early on in my digital journalism career, it is that it is almost impossible to justify artisanal web interactives. In the mid and late 1990s, we built a lot of things by hand online. To be honest, the massive effort was almost never justified by the response from the audience. Very quickly, we learned that we needed re-usable elements that we could build easily over time.

For data, think of a database as raw material for other projects and apps. For instance (and knowing Reg, I’m sure that he has thought about this) with Connected China, you’ve got a huge database of structured information. Using something like Thomson-Reuters’ Calais, it would be relatively easy to link people in China stories to elements of Connected China.
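As a rough illustration of that kind of reuse, the sketch below links names mentioned in a story to entries in a database. It is a deliberately simple stand-in that uses exact name matching; it does not use Calais or any other entity-extraction service, and the function name, the index and the sample story are all hypothetical.

```python
import re


def link_entities(story_text, name_index):
    """Find known names in a story and return (name, entity_id) pairs
    in the order they first appear in the text.

    name_index maps a display name to a database ID (hypothetical IDs).
    """
    found = []
    for name, entity_id in name_index.items():
        # Match the whole name only, not fragments inside longer words.
        match = re.search(r"\b" + re.escape(name) + r"\b", story_text)
        if match:
            found.append((match.start(), name, entity_id))
    found.sort()  # order by position of first mention
    return [(name, entity_id) for _, name, entity_id in found]


# Placeholder index and story text, invented for illustration.
name_index = {
    "Official A": "p1",
    "Official B": "p2",
    "Institution X": "org1",
}
story = "Institution X said on Monday that Official A would take up a new post."

links = link_entities(story, name_index)
for name, entity_id in links:
    print(f"{name} -> database entry {entity_id}")
```

A real newsroom would need something more robust than exact matching, since names are abbreviated, transliterated and shared between people, but the principle is the same: once the database exists, every new story can point back into it.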

Building databases takes effort, and knowing the kind of databases that will generate the most applications might help you decide which ones to develop and which ones aren’t worth the effort.

Think of potential revenue streams – The days when journalism could afford to be completely divorced from business realities are over, and if you’ve got to make the case why your data project should get scarce resources over another project, you’ll need to think of possible revenue sources that might make it more attractive to the powers that be. I’ve worked with a number of news organisations on data journalism, including Reed Business Information, CNN and Czech TV. Some of the techniques that I advocate are about bringing down the cost of charts and graphs, but I also speak to teams about how they can develop revenue streams through data projects.

Always ask:

• Does the data have commercial value?
• Are there obvious sponsors for the data?
• Could you build an app with the data that might be a premium product?

Think of internal and external applications – One of the strategic justifications that I tried to use for the BBC 9/11 project was that it would lead to an important internal resource as well as an external resource. Ultimately, for that project, the argument didn’t win the day, but it can help you get important buy-in if you can make the case that the data resource can help your staff as well as being a compelling feature for your audience.

Think small before taking on big data – Connected China is a massive project beyond the scope of most organisations. However, there are still important concepts about context and data that can be used on much smaller projects. Think of how structured data can be used to add context to local stories and how you can build up databases over time rather than thinking that you have to build big right away.

This really is an exciting time to be a journalist, and it’s great to see news organisations invest in projects like this. No matter the size of your organisation, there are important ways to use data to add value, in all senses of the word, to your journalism.

Al Jazeera Unplugged: Kaiser Kuo on China

This is a live blog. It may contain grammatical errors, but I tried to be as true to the essence of the comments as possible.

Google’s announcement in January that it would shut down rather than continue to submit to censorship in China created a lot of column inches about foreign businesses operating in China and also about cybersecurity.

Kaiser believes that focusing on censorship and The Great Firewall in China is actually crippling our ability to deal with China. It’s too convenient a narrative. He used the image of Sergey Brin standing in front of the tanks in Tiananmen Square. The Chinese internet is very robust and interesting and deserves attention in its own right. Quoting a Chinese scholar, he says that The Great Firewall is being seen as the Iron Curtain 2.0. The US government is sending very clear messages by referring to The Great Firewall as another Iron Curtain.

We have this image of Chinese netizens as a group of skinny patriotic hackers or cosmopolitan aspiring democrats. He says that the reality is often somewhere in between. Chinese users rarely go outside of China to see content. They very rarely bump into The Great Firewall, although Twitter, YouTube and other western sites are blocked. He finds that regrettable. They often bump into self-discipline censorship. Any site whatsoever will receive provisos on content from any number of ministries. They will redact words or ask you to close accounts. If companies don’t comply, they can face penalties all the way up to being shut down.

However, the focus on censorship obscures the development of technology and the internet in China. There are 404m internet users in China, more users than people in the US. There are 800m mobile handset subscribers in China. There are companies such as the instant messaging service QQ, which has 80% of all internet users. The number of accounts, because of multiple accounts by individuals, dwarfs the number of internet users in China.

The internet in China can be described more as an entertainment super-highway than an information super-highway. In the last two or three years, internet censorship has become more draconian in China. More sites have been blocked, and the restrictions on domestic sites have become more onerous. At the same time, in recent years, the internet has emerged as a fully fledged public sphere in Chinese life, something that has never existed in China.

There is discussion about issues that are assumed to be off limits, but there is a great level of creativity to conduct these discussions. Officials at all levels of government are constantly taking the temperature of online opinion. You see policy decisions changing in response to online public opinion. A picture was taken and posted online of an official wearing a watch and smoking a cigarette “clearly out of his pay grade”. The official was jailed.

A woman was accosted by a couple of men and one was a party official. She stabbed the men and killed them, but there was such an outcry online that she wasn’t prosecuted. We are seeing a real development of a public sphere in China. When we focus solely on censorship, then we miss this phenomenon.

Everyone here wants to advance internet freedom in China, and Kaiser is quick to say that he supports it. But when the US government announced that it was dedicating millions of dollars to support internet censorship circumvention technologies, many people changed their minds about the official party line. Some liberal Chinese users came to accept the view that the internet was being used for imperialism. Planting the American flag on this operation might have backfired.

The development of the Chinese internet will eventually overwhelm censors. These freedoms should be taken from within. They cannot be granted from without.

He applauds private organisations and companies working to help create that change, but to paraphrase Kaiser, government involvement brings baggage.