News organisations need to focus on customer data as mobile payments take off

With a number of new and updated products announced, Tim Cook looked to make Apple his own just shy of three years after Steve Jobs' death, and while much of the focus has been on the Apple Watch, to me the most interesting part of the event was mobile payments. I instantly started thinking about how mobile payments would affect the business of journalism.

Alan Mutter updated a post he wrote on how mobile payments could revolutionise commerce, including the commercial world of journalism. For me, these four paragraphs are key:

Although the outlook (for mobile payments) is unclear, there can be no question that mobile payments will revolutionise marketing by creating an ocean of real-time, granular and precise consumer data.

This matters to publishers and broadcasters, because it means that marketers in the future probably will vector ever more of their advertising dollars into direct connections with consumers, instead of mass media. …

Because rich data – not mass audiences – will be the name of the game in the future, every local media company should be gathering as much data as possible about every household and individual in the community it serves.

The most immediate opportunities to do this are through newsletter programs, contests, site registration and smart mobile apps. Obviously, all of these tactics require close attention to government and corporate privacy policies.

We live in a world of data. Data really is the new oil, and while the challenges for news organisations are myriad, data – and not just in terms of storytelling – is increasingly important. The organisations that master data will be the masters of their own destiny, and for news organisations, this might be one of the last, best opportunities to retake the initiative.

Prioritisation: How do news organisations decide what they must do?

Since leaving The Guardian last April and striking out on my own, I joke that I’ve become an occupational therapist. I’ve had the chance to speak to journalists, editors and media executives around the world and hear the issues and challenges facing them. Most of them are instantly familiar, but one issue that I first heard about in Norway in 2009 and heard with increasing frequency in 2010 was prioritisation. In digital, there are a myriad of things we could do, but in this era of transition and scarce resources, the real question is what we must do.

In the comments on my last post covering the opportunities for news organisations in location services and technologies, Reg Chua, the Editor-in-Chief of the South China Morning Post, had this insight:

Kevin’s point on prioritization is a critical one. We can’t do everything well; in fact, we can’t do everything, period. I’d argue that we should think about the product we want to come out with first – and then figure out what data is needed to make it work. I realize that leaves some value on the table, but I suspect we all need to specialize more if we’re to really create products that have real value.

For the news organisations that I’m working with now in developing their digital strategies, one of the things that I look at is where they have the most opportunity. I agree with Reg that there are a number of opportunities in terms of creating products using data, and I also think that data is important in determining which products to develop. News organisations need to get serious about looking at what their audiences find valuable, digging into their own metrics. Right now, we’ve got a lot of faith-based decision making in media. It’s critical that we begin to look at the data to help determine what new products we should deliver and how we can improve our existing offering.
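
To make that a bit more concrete, here is a rough Python sketch of the kind of digging I mean: aggregating analytics rows by section to see where readers actually spend their time. The field names and figures are invented for illustration, not taken from any real analytics system.

    # Rough sketch: aggregate hypothetical analytics rows by section
    # to compare engaged time per visit. All names and numbers are invented.
    from collections import defaultdict

    rows = [
        {"section": "news", "visits": 120_000, "engaged_seconds": 4_800_000},
        {"section": "data", "visits": 15_000, "engaged_seconds": 1_200_000},
        {"section": "sport", "visits": 90_000, "engaged_seconds": 2_700_000},
    ]

    totals = defaultdict(lambda: {"visits": 0, "engaged_seconds": 0})
    for row in rows:
        totals[row["section"]]["visits"] += row["visits"]
        totals[row["section"]]["engaged_seconds"] += row["engaged_seconds"]

    # Rank sections by average engaged time per visit
    for section, t in sorted(totals.items(),
                             key=lambda kv: kv[1]["engaged_seconds"] / kv[1]["visits"],
                             reverse=True):
        print(section, round(t["engaged_seconds"] / t["visits"], 1), "seconds per visit")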

For a good start on this kind of thinking, Jonathan Stray wrote an excellent post last year, Designing journalism to be used. He wrote:

Digital news product design has so far mostly been about emulation of previous media. Newspaper web sites and apps look like newspapers. “Multimedia” journalism has mostly been about clicking somewhere to get slideshows and videos. This is a little like the dawn of TV news, when anchors read wire copy on air. Digital media gives us an explosion of product design possibilities, but the envisioned interaction modes have so far stayed mostly the same.

This is not to say that the stories themselves don’t need to change. In fact, I think they do. But the question can’t be “how can we make better stories?” It must be “who are our users, what would we like to help them to do, and how can we build a system that helps them with that?”

Josh Benton of the Nieman Lab says that we have an opportunity to rethink the grammar of journalism during this period of transition in the business. Jonathan and Reg are definitely rethinking that grammar and the products that we create. Digital journalism is different, and the real opportunity is in thinking about how it's different and how that creates new opportunities both to present journalism and to support it financially.

Opportunities from the data deluge

There are huge opportunities for journalism and data. However, to take advantage of these opportunities, it will take not only a major rethinking of the editorial and commercial strategies that underpin current journalism organisations but also a major retooling. Apart from a few business news organisations such as Dow Jones, The Economist and Thomson Reuters, there really aren't that many general interest news organisations that have this competency. Most smaller organisations won't be able to afford it on an individual level, but that leaves room for a number of companies to provide services for this space.

Neil Perkin outlines the challenge and the opportunity in a wonderful column that he’s cross-posted from Marketing Week. (Tip of the blogging hat to Adam Tinworth, who flagged this up on Twitter and on his blog.) In our advanced information economies, we’re generating exabytes of data. While we’re just getting used to terabyte disk drives, this is an exabyte:

1 EB = 1,000,000,000,000,000,000 B = 10^18 bytes = 1 billion gigabytes = 1 million terabytes
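
If you want a feel for those conversions, a few lines of Python do the back-of-the-envelope arithmetic:

    # Back-of-the-envelope check of the exabyte conversions (decimal, SI units)
    EB = 10**18   # bytes in one exabyte
    TB = 10**12   # bytes in one terabyte
    GB = 10**9    # bytes in one gigabyte

    print(EB // TB)      # 1,000,000 -> a million terabytes
    print(EB // GB)      # 1,000,000,000 -> a billion gigabytes

    # Roy Williams' example: all words ever spoken fit in roughly 5 exabytes.
    # That is about five million of the 1 TB drives we're just getting used to.
    print(5 * EB // TB)  # 5,000,000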

To put this in perspective, I’ll use an oft-quoted practical example from Caltech researcher Roy Williams. All the words ever spoken by human beings could be stored in about 5 exabytes. Neil quotes Google CEO Eric Schmidt to show the challenge (and opportunity) that the data deluge is creating:

Between the dawn of civilisation and 2003, five exabytes of information were created. In the last two days, five exabytes of information have been created, and that rate is accelerating.

Whether it is all the words spoken since the dawn of language fitting in five exabytes, or that same amount of information now being created every two days, the comparison illustrates how quickly the creation of information is accelerating. Those mind-melting numbers wash over most people, especially in our arithmophobic societies. However, there is a huge opportunity here, which Neil puts like this:

The upside of the data explosion is that the more of it there is, the better digital based services can get at delivering personal value.

And journalists can and definitely should play a role in helping make sense of this. However, we're going to have to overcome not only the tyranny of chronology but also the tyranny of narrative, especially narratives that privilege anecdote over data. Too often, to sell stories, we focus on outliers because they shock, not because they are in any way representative of reality.

From a process point of view, journalists are going to need to start getting smarter about data. I think data crunching services will be one way that journalism organisations can subsidise the public service mission that they fulfil, but as I have said, it’s a capacity that will need to be built up.

Helping journalists ‘scale up what they do’

It's not just raw data-crunching that needs to improve; we're also starting to see a lot of early semantic tools that will help more traditional narrative-driven journalists do their jobs. In talking about how he wanted to help journalists at AOL overcome their technophobia, CEO Tim Armstrong explained why these tools were necessary. Journalists have not been included in corporate technology upgrades (and often are not included in the creation of tools for their work). Armstrong said at a conference in June:

Journalists I met were often the only people in the room who never had access to a lot of info, except what they already knew.

It's not technology for technology's sake but tools to open up more information and help journalists make sense of it. Other industries have implemented data tools to help people do their jobs, but such tools are rare in journalism (outside of computer-assisted reporting or database journalism circles). Armstrong said:

You can pretty much go to any professional industry, and there’s some piece of data system that helps people scale what they do.

Journalists are being asked to do more with less as cuts go deep in newsrooms, and we're going to have to work smarter, because I know that there are some journalists now working at the breaking point.

There have been times in the last few years when I tested the limits of my endurance. Last summer, filling in behind my colleague Jemima Kiss, I was working from 7am until 11pm five days a week and then usually five or six hours on the weekends. I could do it for a while because it was a limited 10-week assignment, but even for 10 weeks, it limited the amount of time I had with my wife and negatively affected my health.

I’m doing a lot of thinking about services that can help journalists deal with masses of information and also help audiences more easily put stories into context. We’re going to need new tools and techniques for this new period in the age of information. The opportunities are there. Linked data and tools to analyse, sort and contextualise will lead to a new revolution in news and information services. Several companies are already in this space, but we’re just at the beginning of this revolution. We live in exciting times.

Journalists! Go check out the projects from Rewired State

I had Rewired State in my calendar for months because it was happening in the Guardian’s new offices, but a rather full schedule in 2009 and over-subscription of the event itself prevented me from making it. What was Rewired State?

Government isn’t very good at computers.
They spend millions to produce mediocre websites, hide away really useful public information and generally get it wrong. Which is a shame.

Calling all people who make things. We’re going to show them how it’s done.

My good friend and former colleague at the BBC, Chris Vallance, came to the tail end of the event, and he said that the projects sparked a lot of ideas, many of which would make great journalism.

Voxpomp was one that caught my eye immediately. The idea is simple: “Statements made by MPs during Parliamentary debate cross-referenced with news stories of the time.” You can search by subject and member of parliament in a very simple interface. There is another project that allows people to log when and where they have been stopped under Section 44 of the Terrorism Act 2000. This is code in progress, but it's definitely an interesting idea. Foafcorp is an SVG visualisation that shows links between companies and their directors using UK Companies House data. Here is an explanation from the developer.
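
To make the cross-referencing idea concrete, here is a rough sketch of how MP statements might be joined to news stories by shared subject and date proximity. The data structures and field names are my own invention for illustration; this is not how Voxpomp is actually built.

    # Rough sketch of cross-referencing MP statements with news stories
    # by subject and date proximity. Fields and data are invented examples,
    # not Voxpomp's actual implementation.
    from datetime import date, timedelta

    statements = [
        {"mp": "A. Member", "subject": "terrorism act", "date": date(2009, 3, 2),
         "text": "The powers in Section 44 are proportionate..."},
    ]
    stories = [
        {"headline": "Photographers challenge Section 44 stops",
         "subject": "terrorism act", "date": date(2009, 3, 4)},
    ]

    def cross_reference(statements, stories, window_days=7):
        """Pair each statement with stories on the same subject within a date window."""
        window = timedelta(days=window_days)
        for s in statements:
            matches = [st for st in stories
                       if st["subject"] == s["subject"]
                       and abs(st["date"] - s["date"]) <= window]
            yield s, matches

    for statement, matches in cross_reference(statements, stories):
        print(statement["mp"], "->", [m["headline"] for m in matches])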

The full list of projects is now available online.

That's the good. However, how many of you had heard about the event? I wish that the organisers had done better outreach or publicity before the event. It was an obvious success: organisers told me that they had 300 applications and only space for 100 people, so they had to ration the invites. However, the media and technology journalists at the Guardian didn't even know about this event, even though it was happening in our building. Charles Arthur, our editor of Technology Guardian and driving force behind the Guardian's Free Our Data campaign, hadn't heard about it. The only reason that I knew about it is because I work closely with our development teams, who were involved with it. I only received a very brief press release (frankly, a one-page email) from the organisers on the Friday before the event. If Guardian journalists didn't know about it, how many other journalists only heard about it after the fact?

I popped my head in right near the end because I was meeting Chris. Suw and I saw a number of familiar faces from the Open Rights Group, MySociety and the government and technology circles we know.

I know that this was a hackday, that the purpose was to create new applications with public data rather than to make a big splash in traditional media, and I'm definitely not trying to imply that you needed journalists there to validate the project. But I think this was an important event, and I'm concerned that, apart from the participants, their followers on Twitter and a few folks who happened to find out about it, very few people outside of those circles knew about it. I'm not even finding many blog posts about it.

Guys, you did something really good. It's OK to let a few more people know about it. I know that organising an event takes a lot of work, and publicity might be the last thing on your to-do list. But there were some great projects that a much wider audience could easily understand. Underselling your work will make it difficult to convince the government that open data in better formats is an important agenda item when there are so many other pressing issues at the moment.

Leveraging a print poster on the web

FlowingData highlighted this data project from WallStats showing how US tax money was spent. The US government, being the sprawling beast that it is, has an incredibly complex budget, and this visualisation not only makes it accessible but pulls the reader into exploring it.

It has to be good. It even had the American queen of home decorating and entertaining, Martha Stewart, talking about it. What I also love is that by using Zoomorama, they have leveraged a printed poster online, simply but quite effectively.

BeebCamp: Eric Ulken: Building the data desk at the LATimes

A fun example of structured data from the LATimes, which showed the popularity of dog names in LA County by postcode.

This is from one of the sessions at BeebCamp2, a BarCamp-like event for BBC staff with some external folks like Suw, me, Charlie Beckett and others. Charlie has a great post on a discussion he led about user-generated content and what it adds to news, video games, and also Twitter and Radio 4.

Eric Ulken was the editor of interactive technology at the LATimes. He was one of the bridges between technology and the editorial side of the newsroom.

News organisations:

  • We collect a lot of data but don't use it. (We always thought that was a shame. We had a computer-assisted reporting team at the LATimes; wouldn't it be nice if we used it?)
  • What online readers want from us is bigger than ‘news’ in the traditional sense.
  • We need to be an information source.

They did a homicide map, which plotted all of the murders in LA County in a year and illustrated a blog that reported on each of those murders.

The project was well received, and they decided to develop a data desk, bringing together the computer-assisted reporting unit, investigative reporters, the interactive technology team and the graphics team. They all sat together in the newsroom, and a lot of synergies were created. The Times had 10 to 15 investigative reporters on different desks from different disciplines.

Ten bits of advice:

  1. Find the believers.
  2. Get buy-in from above.
  3. Set some priorities.
  4. Go off the reservation. (We had a real problem with our IT department. They had their priorities and we had ours. We invested in a server system using Django.)
  5. Templatize. Never do anything once. Do things you can reuse.
  6. Do breaking news. There is data in breaking news. They did a database of the victims of a train crash, adding information to the database as it became available. The database was up within 24 hours of the crash, because they had built most of the pieces for previous applications. (There was a question about accuracy. Eric said the information was being gathered, but it wasn't structured. The information was edited by a line manager.)
  7. Develop new skills. They sent people out to workshops. They had hired a Django developer who was also a journalist, and he taught Django to others in the office.
  8. Cohabitate (marriage is optional). The investigative reporters and computer-assisted reporters still reported to the pre-existing managers, but by being together, they saw possibilities for collaboration without reworking the organisation.
  9. Integrate.
  10. Give back. They worked to give back to the newspaper.

They used JavaScript to add this to other parts of the site. They created these two datasets from the train crash and the homicides, but they have also used publicly available data in their projects. He showed their California schools guide. Apart from the standard data analysis available from state and national educational agencies, they also created a diversity rank that showed the relative diversity of the schools. They did do some reporting on the data: in analysing the schools data, they found discrepancies in reporting about the performance of the schools.
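
The talk didn't go into how the Times calculated its diversity rank. One plausible approach is a Simpson-style index over each school's enrolment shares – the probability that two randomly chosen students come from different groups – which this rough sketch illustrates with invented figures. It is an assumption about the method, not the Times' actual formula.

    # Hedged sketch of a school diversity score using a Simpson-style index:
    # the probability that two randomly chosen students are from different groups.
    # This is an assumption about the method, not the LATimes' actual formula.

    def diversity_index(enrolment_by_group):
        total = sum(enrolment_by_group.values())
        shares = [n / total for n in enrolment_by_group.values()]
        return 1 - sum(p * p for p in shares)

    # Invented enrolment figures for two hypothetical schools
    schools = {
        "School A": {"latino": 600, "white": 150, "black": 150, "asian": 100},
        "School B": {"latino": 950, "white": 20, "black": 20, "asian": 10},
    }

    # Rank schools from most to least diverse (higher index = more diverse)
    for name, groups in sorted(schools.items(),
                               key=lambda kv: diversity_index(kv[1]),
                               reverse=True):
        print(name, round(diversity_index(groups), 3))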

In a slightly more humorous example, he showed dog names and breeds by postcode.

UPDATE: Eric has added some more details in the comments below, and you can follow his work and his thoughts on his site.

DEN: Eric Ulken: Beyond the story-centric model of the universe

After appearing virtually at a few Digital Editors Network events at the University of Central Lancashire in Preston, I finally made the trip to appear in person. I really enjoyed Alison Gow talking about live blogging the credit crunch for several Trinity Mirror sites using CoverItLive.

Eric Ulken, formerly the LATimes.com editor of interactive technology, spoke about an issue dear to my heart: moving beyond the story as the centre of the journalism universe. One of the reasons I chose to be a digital journalist is that I think digital journalism brings together the strengths of print, audio and video while also adding some new story-telling methods such as data and visualisations. Eric talked about the projects he worked on at the Times to explore new ways of telling stories.

Eric started off by talking about the history of news articles.

The story article so far

  • born 17th Century
  • served us well for about 400 years
  • lots of words (800-1000 words on average)
  • unstructured, grey and often boring.

“What else is there in the toolbox?” he asked.

Some examples: (Eric suffered the dreaded no-internet, links-in-the-presentation problem, so I am a little link-light on this. You can see examples that Eric has worked on in his portfolio.)

  • text tricks – lists, tables, timelines. (Eric mentioned Dipity as one way to easily create a timeline, but said it was “not quite there”. He also mentioned MIT's Simile project, which has ‘graduated’ and is now hosted on Google Code. Licensed under the BSD licence, it is easily something for more news organisations to use.) Other text formats include the Q&A and what he called the Q&no, e.g. the New York Times tech blog. They put up questions for Steve Jobs before Macworld. His Steve-ness never answers them, but it lays out the agenda.
  • blogs are the new articles
  • photo galleries as lists, timelines
  • stand-alone UGC
  • video: short-form, packages
  • mapping, charts, data visualisation
  • database applications.

I think this is really important for journalists to understand now. They have to think about telling stories in formats other than just the story. Journalist-programmer ninja Adrian Holovaty has outlined a number of ways that stories can be re-imagined and enhanced with structured data. News has to move on from the point where the smallest divisible element of news is the article. News organisations are adding semantic information such as tags, as we have at the Guardian.

But beyond that, we have to think of other ways to present information and tell stories. As more journalists shift from being focused solely on the print platform to multi-platform journalism, one of the most pressing needs is to raise awareness of these alternative story-telling elements. Journalists outside of the development departments and computer-assisted reporting units need to gather the data around a story; it needs to become an integral part of newsgathering. If a department inside your organisation is responsible for gathering this data, your data library needs to be made accessible and easily searchable by journalists. If that sounds daunting, especially for small shops, then use Google Docs as an interim solution. This is also an area ripe with opportunities for cooperation between universities and news organisations.
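
As a rough sketch of what an accessible, searchable data library might look like for a small shop, here is a minimal keyword-and-tag search over a list of dataset records. The field names and entries are invented for illustration; a spreadsheet or Google Docs export would serve the same purpose.

    # Minimal sketch of a searchable in-house data library: a list of dataset
    # records that journalists can query by keyword or tag. Field names and
    # entries are invented for illustration.

    datasets = [
        {"title": "LA County homicides 2007", "tags": ["crime", "homicide", "coroner"],
         "location": "data/homicides_2007.csv", "owner": "data desk"},
        {"title": "California schools test scores", "tags": ["education", "schools"],
         "location": "data/ca_schools.csv", "owner": "CAR team"},
    ]

    def search(query, records=datasets):
        """Return records whose title or tags contain the query string."""
        q = query.lower()
        return [r for r in records
                if q in r["title"].lower() or any(q in t for t in r["tags"])]

    for hit in search("schools"):
        print(hit["title"], "->", hit["location"])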

Eric gave one example of this non-story-centric model for news. “We did a three-way mashup”, he said. They brought together the computer-assisted reporting team, the graphics team and Eric’s team.

They worked with a reporter on the City desk who wanted to chronicle every homicide in LA County; in 2007, there were 800 murders. She did the reporting in a blog format. It might not have been the best format, but it was easy to set up, and she started building up a repository of information. Eric said he was begging people for the tech resources to build a database, and they eventually built one on top of the blog. They took data from the County Coroner – gender, race and age – and put it in a database that was cross-linked to the blog, and they added a map that could be filtered by age or race. The result was two things: a way to look at the data in aggregate, and a way to drill down through the interface to the individual record. They took public data, original reporting and contributions from users.
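
Here is a rough sketch of the two views Eric describes – the aggregate view and the drill-down to an individual record – over a handful of invented records. The field names and data are illustrative assumptions, not the Times' actual schema.

    # Sketch of the aggregate view and the drill-down view over structured
    # homicide records. Records and field names are invented for illustration.
    from collections import Counter

    records = [
        {"id": 1, "name": "John Doe", "age": 24, "gender": "male",
         "race": "latino", "neighborhood": "Westlake", "blog_post": "/homicide/1"},
        {"id": 2, "name": "Jane Doe", "age": 31, "gender": "female",
         "race": "black", "neighborhood": "Watts", "blog_post": "/homicide/2"},
    ]

    # Aggregate view: counts by a chosen field, e.g. race or neighbourhood
    def aggregate(field):
        return Counter(r[field] for r in records)

    # Filtered map-style view: records matching an age range and/or race
    def filter_records(min_age=0, max_age=120, race=None):
        return [r for r in records
                if min_age <= r["age"] <= max_age
                and (race is None or r["race"] == race)]

    print(aggregate("race"))
    for r in filter_records(min_age=20, max_age=30):
        print(r["name"], r["neighborhood"], r["blog_post"])  # drill down via the blog link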

“One of the things that is challenging is getting the IT side to understand what it is actually that you do,” he said. There are probably more tech people who are interested in journalism than there are journalists who are able and willing to learn the intricacies of programming.

When the floor was opened to questions, I wasn’t surprised that this one came up.

Question: Could the LATimes get rid of the print edition and remain profitable?
Answer: No. Revenue from online roughly covers the cost of newsroom salaries, but not the benefits and not the ad staff. I don't think he was saying that the LATimes had figured it out. He had been saying that for some time before he said it publicly; it was for morale. He was saying that it is not inconceivable for the website to pay its way in the future.
“There is a point where this cycle ends of cutting staff and cutting newshole,” he said.

UPDATE: And you can see the presentation on SlideShare: