Sacrificing web history on the altar of instant

As I said in my last post about Twitter’s lack of a business model, I’ve been doing some research lately for a think tank. My research has basically consisted of three things:

  • Looking back on the media coverage of an event that happened in early 2010
  • Looking back at the way bloggers reacted to said event
  • And having a quick look at Twitter for reactions there too

Pretty simple stuff, I think you’ll agree. My assumption was that I would be able to tap into Google News; Google Blog, Icerocket and maybe Technorati; and Twitter’s archives. Then I’d be able to scrape the data using something like Outwit Hub, chuck it in Excel and Bob’s your uncle.
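
For what it's worth, the plan was no more sophisticated than this minimal Python sketch, standing in for Outwit Hub. The search URL and the HTML classes here are entirely hypothetical placeholders – every search engine marks up its results differently:

    import csv

    import requests
    from bs4 import BeautifulSoup

    # Hypothetical search endpoint and markup -- every engine differs.
    SEARCH_URL = "http://news.example.com/search"

    def scrape_results(query, page=1):
        """Fetch one page of results and pull out title, date and source."""
        html = requests.get(SEARCH_URL, params={"q": query, "page": page}).text
        soup = BeautifulSoup(html, "html.parser")
        for item in soup.select(".result"):  # hypothetical CSS class
            yield {
                "title": item.select_one(".title").get_text(strip=True),
                "date": item.select_one(".date").get_text(strip=True),
                "source": item.select_one(".source").get_text(strip=True),
            }

    # Chuck it all into a spreadsheet-friendly CSV.
    with open("eyjafjallajokull.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["title", "date", "source"])
        writer.writeheader()
        for page in range(1, 11):
            writer.writerows(scrape_results("Eyjafjallajökull", page))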

Oh, how sadly misguided, how spectacularly wrong.

Now, before you talk about how ephemeral the web is and how no one should rely on it for anything: that's only partly true. A lot of stuff on the web stays on the web, and given how much of our digital selves we are putting online, we do need to think about archiving and how we preserve our stuff for the future. But this post is not about archiving; it's about accessing what's already out there.

The first thing I did when I started my research was try to go back in time in Google News to early 2010 and search for news articles about the particular story – the eruption of Eyjafjallajökull – I was interested in.

But Google’s News search results are fuzzy. I wanted to search for the news on particular days, e.g. all the news about the Eyjafjallajökull eruption on 16 April 2010. Do that search, and you’ll be presented with lots of results, many of them not from 16 April 2010 at all, but from 17 April, or even 18 or 15 April.
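
The only workaround I found was to over-collect and then post-filter, throwing away anything not actually dated 16 April. A tiny sketch of the idea, with made-up sample rows standing in for scraped results:

    from datetime import date, datetime

    TARGET = date(2010, 4, 16)

    # Made-up sample rows standing in for scraped results.
    rows = [
        {"title": "Ash cloud grounds European flights", "date": "16 April 2010"},
        {"title": "Airlines count cost of ash cloud", "date": "17 April 2010"},
    ]

    def on_target_date(row, fmt="%d %B %Y"):
        """True only if the row's publication date is exactly the target day."""
        try:
            return datetime.strptime(row["date"], fmt).date() == TARGET
        except ValueError:
            return False  # discard unparseable dates rather than guess

    april_16_only = [row for row in rows if on_target_date(row)]
    print(april_16_only)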

I wanted to refine the search by location, so I restricted it to ‘Pages from the UK’. Fascinatingly, this included Der Spiegel, Business Daily Africa, Manila Bulletin, FOX News, Le Post and a whole bunch of media sources that, when I last looked, weren’t based in the UK.

So I now have search results limited to neither the date nor the place I want. But even worse, results are clustered by story, which might seem like a good idea but in practice falls short. These clusters of supposedly similar stories often aren’t similar at all: they share a few keywords but are actually about slightly different things. I can see the sense in clustering stories to cut down on duplication for the reader but equally, sometimes I just want a damn list.

Whilst doing my research, I also found that Google News is not, as I had thought, a subset of Google Web results. If you do the same searches on Google Web you get a slightly different set of data, obviously including non-news sites, but actually also including some news sites that aren’t in Google News, and not including many that are.

So far, so annoying, but Google isn’t the only search engine in the world… Except, Google pwned search years ago and innovation in search appears to be almost entirely absent. Bing does news, but a search on Eyjafjallajökull tosses up just three pages of results, and you can’t sort by date. Yahoo News finds nothing. A friend suggested that my local library might have a searchable news archive, but the one I looked at was unworkable for what I wanted.

I’m sure there are paid archives of digital news, but that wasn’t within my budget and, to be honest, given how much news is out there in the wild, there should be a good way to search it. I even tried the Google News API, but that has exactly the same unwanted behaviours as the website.
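
For the record, that API was the old AJAX-flavoured Google News Search API (since deprecated), and a minimal query looked roughly like the sketch below – the response field names are from memory, so treat them as assumptions:

    import json
    import urllib.parse
    import urllib.request

    # Google's AJAX-era News Search API (since deprecated).
    API = "https://ajax.googleapis.com/ajax/services/search/news"

    params = urllib.parse.urlencode({
        "v": "1.0",
        "q": "Eyjafjallajökull",
        "rsz": "large",  # biggest page size the API offered, as I recall
    })
    with urllib.request.urlopen(API + "?" + params) as resp:
        data = json.load(resp)

    # Response field names from memory -- treat as assumptions.
    for result in data["responseData"]["results"]:
        print(result["publishedDate"], result["titleNoFormatting"])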

But hey, things will be better in the blog search, right?

Years ago, in the golden era of blogging, Technorati worked. Their site used to be really great, and I loved it so much I did some work with them. These days, I’m not quite sure what it’s for. It’s certainly not for search, given that it finds nothing for Eyjafjallajökull. Icerocket is a better search engine, and you can refine by date, but it finds nothing on our target date, which is surprising as it’s a day or so after Eyjaf popped her top and the flight ban was well underway and, well, you’d think someone on the internet might have had something to say about it.

So, we’re back to Google Blogs. It lets me restrict by date! And specify UK-only! And it coughs up one page of results. Really? We have 4.57m bloggers in the UK and only 35 of them wrote something? I’ve always had my suspicions that the Google Blog index was poorly formed, but Google Blogs is the only choice I have, so I just have to put up with it. At least the results are in a neat list and all on the target date, even if some of them are clearly not from the UK, or even actually blogs, for that matter.

Now then, Twitter. We all know that Twitter’s archives have been on the endangered list for some time, but although they aren’t deleting old Tweets, accessing them is very difficult. Despite providing you with dates going back to 2008 in their advanced search page, you get an error if you try to search for April 2010: “since date or since_id is too old”.
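
For reference, this was against the old search.twitter.com API, roughly as below; the parameter names are as I remember them from the docs of the time, so treat the details as assumptions. It’s the since/until window that triggers the error:

    import json
    import urllib.parse
    import urllib.request

    # Twitter's v1-era Search API; parameter names as I remember the docs.
    API = "http://search.twitter.com/search.json"

    params = urllib.parse.urlencode({
        "q": "Eyjafjallajökull",
        "since": "2010-04-16",  # this window is what triggers the
        "until": "2010-04-17",  # "since date or since_id is too old" error
        "rpp": 100,             # results per page
    })
    with urllib.request.urlopen(API + "?" + params) as resp:
        print(json.load(resp))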

SocialMention is a new search site that I’ve started to find really useful. They search across the majority of the social web and allow you to break the results down by type. So I can search for ‘Eyjafjallajökull’ in ‘microblogs’ and get realtime results, but I can’t go back in time further than ‘last month’.

So, we’re back to Google again, this time Google Realtime. It only goes back to early 2010, so lucky for me that my target date is within that period. But the only way I can access that date is by a really clunky timeline interface – I can’t specify a date as I can in Google’s other searches.

Furthermore, there’s no pagination. I can’t hit ‘next page’ at the bottom and fish through a bunch of search pages to find something interesting – my navigation through the results is entirely dependent on the timeline interface. Such an interface will and does entirely outwit Outwit, which can normally follow ‘next’ links to scrape data from an entire search. I doubt it knows how to deal with the stupid timeline interface.

After all this searching and frustration, I’m left with this question:

What has happened to our web history?
The web is mutable, yes, but there’s an awful lot of fire-and-forget content that, generally speaking, hangs around for years. Individual blogs may come and go, but overall there’s a huge pool of blog content out there. Same for news. Twitter is a slightly weird case because it’s a single service with a huge archive of historically interesting data which it isn’t letting just anyone get at. Not even scholars. Twitter may have given its archive to the Library of Congress, but even that’s going to be limited access if their blog post is anything to go by:

In addition to looking at preservation issues, the Library will be working with academic research communities to explore issues related to researcher access.  The Twitter collection will serve as a helpful case study as we develop policies for research use of our digital archives. Tools and processes for researcher access will be developed from interaction with researchers as well as from the Library’s ongoing experience with serving collections and protecting privacy and rights.

The Library is not Twitter and will not try to reproduce its functionality.  We are interested in offering collections of tweets that are complementary to some of the Library’s digital collections: for example, the National Elections Web Archive or the Supreme Court Nominations Web Archive. We will make an announcement when the collection is available for research use.

I’m not an academic researcher, so whether I’d even get access to the archive for research is up in the air. (I can’t find any updates as to the availability of Twitter’s archive via the Library of Congress, so if anyone has info, please leave a comment.)

I think we have two problems here, one already briefly mentioned above.

1. Google has pwned search
For years, Google has been the dominant search engine, and in some ways they’ve paid a price for this as publishers of all stripes have climbed on the Google haterz bandwagon. My suspicion is that Google’s fuzzy news results are a sop to the news industry: Google is surely capable of producing a rich and valuable search tool that lets users see whatever data they want, in whatever layout they want. Maybe, after all the stupid shit the news publishers have thrown their way, Google thinks that building failure into their news search product will insulate them from criticism from the industry.

But I don’t think that this absolves Google of responsibility for the lack of finesse in historical search. After all, which bloggers are gathering together to demand Google not index them? And Twitter users who don’t want to be indexed by Google can go private with their account.

But Google’s dominance does seem to have caused other search engines to wither on the vine. It’s almost as if no one is bothering to innovate in search anymore. Bloglines used to be a pretty good blog search engine, but it has gone the way of the dodo. Technorati is now useless as a search engine. Bing is a starting point, but needs an awful lot of work if it’s going to compete. News search is completely underserved, and Twitter… really, Twitter archival search is non-existent.

Are Google really so far ahead that they can’t be touched? Are they really so great that no one is going to bother challenging them? The answer to the first question is clearly no: they aren’t so brilliant that their work can’t be improved upon, not just in terms of the search algorithm, which has come in for a lot of criticism lately, but also in terms of their interface and the granularity of advanced searches. And I’d be deeply disturbed if people thought that the answer to the second question was yes. Google are the incumbent, but that makes them vulnerable to smaller, more nimble, more innovative competitors.

2. Historic search has been sidelined to serve instant
What’s going on right now? That’s the question that most search engines seem to be asking these days. Most have limited or zero capacity to look back on our web history, focusing instead on instant search. The immediacy of tools like Twitter and Facebook is alluring, especially for brands and companies who want to know what’s being said about them so that they can respond in a timely fashion.

But focusing on now and abandoning deeper, more nuanced historic searches is a disturbing trend. Searching the web’s past for research purposes might be a minority sport, but can we as a society really afford to disenfranchise our own past? Can we afford to alienate the researchers and ethnographers and anthropologists who want to learn about how our digital world has changed? About our reactions to events, as they happened rather than remembered years later? There is value in archives, but not if they are locked up, and the key thrown away by the search engines.

We cannot afford to sacrifice our history on the altar of instant. We can’t just say goodbye to the idea of being able to find out about our past, because it’s ok, we can see just how pretty the present looks. The obsession with instant risks not just our past, it also risks our future.

Twitter: Building a business-critical tool, then breaking it

I remember four years ago, when Twitter was still a blossoming new service, the outages that they used to suffer. Within just a few weeks of joining, I realised what a great tool it was and how important it was to me. Like many others who endured ongoing disruptions to Twitter’s service, I publicly stated I was willing to pay. Indeed, people were begging Twitter to let us give them money, to have some sort of way of paying for a service we had so quickly learnt to love. Twitter, inexplicably, pooh-poohed the idea, much to our frustration.

Over the last four years I have watched Twitter grow and fail to find any sort of business model that you could hang a hat on. Unlike Flickr, or Viddler, or Dipity, or WordPress, or LinkedIn, or countless other services, I can’t choose to give Twitter my money through subscribing to a premium account. For four years, Twitter have failed to earn any money from me, despite my willingness to pay.

I still have no idea what Twitter’s real business model is. Promoted Tweets and trends seem like a weak and risky ad-based model. I had thought they were going to let the ecosystem grow some awesome apps, add-ons and clients, buy the best, and then use sales of those as one income stream, but instead they bought Tweetie and then killed it dead.

For a while I thought they’d spot the popularity of services like TwitPic or TwitVid and build a premium offering around media sharing. Or maybe they were going to archive the world’s Tweets and use them in some clever way to research market demographics or something. But no. Not only have they trashed their archive, their search is useless.

And now they are slowly killing off the very developer community that made them great – indeed, made them at all usable – in the first place. Many have already written about Twitter’s Ryan Sarver telling developers not to bother working on new clients, so I shan’t dig further here. But it’s another nail in the coffin for all the goodwill and love that had built up around Twitter.

Whether deliberately or by accident, Twitter has created a service that people, businesses, NGOs, governments and grassroots groups now need in order to communicate with their constituencies. And many organisations, and many different types of organisation, rely on Twitter in one way or another. They also rely on a number of 3rd party services to help them understand how well they are doing, and what they are doing, on Twitter.

But Twitter’s rate limits – which stop 3rd party apps from abusing their API – are now starting to affect the use of Twitter as a serious communications tool. I am in the process of writing a research report, and yesterday I wanted to do a bit of research into the usage of Twitter by some well-known organisations. One of the tools I was using was Twitalyzer, and I came across this error:

Twitter is refusing to process any more of our requests right now! Twitter imposes certain rate limits on all partner applications and we have hit one of those limits.

Now, when I check, I get this message too:

The Twitter Search API appears to be largely OFFLINE today. On Friday, March 18th we started seeing a dramatically elevated rate of search-related errors during our processing. We are talking with Twitter but have not found resolution yet. All accounts and account processing appears to be affected. We are doing everything we can to resolve this issue but for now it appears out of our hands.

Now, it’s one thing to get errors like that if you’re doing an ego search, but if you’re using Twitter for professional purposes you need it to be up and running all the time, and that means up and running for 3rd party tools as well as for Twitter.com. That means having an API that is reliable, that people can depend on not to die or get rate-limited into oblivion.
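
For developers on the receiving end, about the best you could do was poll Twitter’s rate limit status endpoint and back off before hitting the wall. A rough sketch, with the response field names recalled from the v1 docs rather than verified, so treat them as assumptions:

    import json
    import time
    import urllib.request

    # Twitter API v1's rate limit status endpoint, as it stood at the time.
    STATUS_URL = "http://api.twitter.com/1/account/rate_limit_status.json"

    def wait_if_limited(min_remaining=5):
        """Sleep until the rate limit window resets if we're nearly out of calls."""
        with urllib.request.urlopen(STATUS_URL) as resp:
            status = json.load(resp)
        # Field names recalled from the v1 docs -- treat as assumptions.
        if status["remaining_hits"] < min_remaining:
            time.sleep(max(0, status["reset_time_in_seconds"] - time.time()))

    # A polite client would call wait_if_limited() before each batch of requests.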

It’s not just Twitalyzer that’s having problems. Tweetstats has had problems with a massive queue (which I guess is their way of getting round the rate limiting), and Kevin has reported problems with Nambu, a desktop client. Slowly, it seems, Twitter is killing off its ecosystem with API changes, rate limiting, poorly developed clients/apps, annoying new features (the ‘dickbar’) and restrictive API policies. Whether this is deliberate or not, I wouldn’t like to say. But it is shortsighted.

There is still an opportunity for a premium Twitter account, one that I and, I am sure, many businesses would happily fork out decent cash for. A Twitter account that guaranteed me reliable service, some way of ensuring that I don’t get rate-limited when using 3rd party apps, and perhaps even some additional premium services that I haven’t dreamt up yet. That, yes, I would still pay for.

Now, you might say that Twitter is caught between a rock and a hard place: they don’t have the scale of Facebook, and they don’t have the income streams either. With no credible business model, further scaling could be difficult, and investors may be agitating for some sort of return by any means possible. This could be pushing them into shortsighted strategies to scrape some dosh in any way they can. That’s speculation, of course, but I’m still struggling to make any sense of what they actually are doing, or to see how it translates into a decent business model.

Certainly, I have to think that Twitter are playing a dangerous game with their users’ goodwill. People keep saying that they have lock-in because there’s no credible competition, but where is Friendster now? Or MySpace? Yeah, they still exist, but as shadows of their former selves.

Twitter can’t take their users for granted, because they can and will go elsewhere. Twitter have made a phenomenal tool, one that both businesses and individuals find useful enough to want to pay for, yet bit by bit they are breaking it. Not big. Not clever. And not a strategy for long term business sustainability.

Digital journalists: You’ve got a choice

As I said in my previous post, my good friend Adam Tinworth has highlighted a comment on the Fleet Street Blues blog. I’ll highlight a slightly different part of the comment:

It turns out, however, that the new skills are a piece of piss (particularly with current web technology), and promoting a yarn via Google, Facebook, Twitter etc is, in reality, an administrative task rather than a journalistic one. If you want to employ a proper journalist rather than a cheap web monkey, the SEO stuff really is secondary.

To which all I can say is: bitter much? The commenter goes on to say that bosses hire web monkeys because they’re cheaper than real journalists. I know that this commenter is in the UK, but I’d point out that AOL is paying journalists at its local news sites, Patch.com, a very good starting salary, much more than they would probably be making at a local newspaper.

As I’ve said quite a few times before, too many journalists waste their time dividing the world into us and them, into people they consider journalists and people they don’t. I can understand the anger and upset of journalists tired of cuts and anxious about their future. For most of my career, I’ve worked for organisations engaged in sometimes dramatic cuts. I was at the BBC for eight years, and for half that time there were cuts. I was at The Guardian for three and a half years, and for half that time there were cuts, quite deep ones. However, comments like this are counterproductive and self-defeating on so many levels.

It’s just another misguided attempt to trivialise the work of digital journalists. Sigh. It’s sad that in 2011, we’re still having to justify our journalism to those who would call us cheap web monkeys.

Digital Journalists: Go where you’re appreciated

To fellow digital journalists, I’d say this: if you hear this repeatedly in your organisation, you have a choice. Frankly, a year ago, I could feel some bitterness overtaking me: cuts, integration battles, the pressure of having to make up for a greatly reduced staff, and the odd pot shot thrown in my general direction for being a “cheap web monkey”. I realised that I was never going to win over folks like the commenter on Fleet Street Blues, and I finally stopped trying. I finally stopped quoting my CV to justify my journalistic credentials. I stopped correcting people who assumed that my degree was in computer science instead of print journalism.

Now, the news organisations that Suw and I are working with, international and national news organisations around the world, don’t question whether I’m a journalist. I don’t have to quote my journalistic achievements, and you shouldn’t have to either, just because you work in digital.

If you’re in a poisonous work environment like this, constantly having to defend your work, just leave. It’s a judgement call, and every place has its politics, but if you’re sidelined, marginalised and disrespected, you owe it to yourself and to journalism to take your skills where they’ll be put to good use. I spend quite a bit of my time now training people to do what you already can. You’re in demand and, fortunately, there are now places where you can just get on with being a journalist: organisations that have got past these cultural issues, or that are aware of them and working hard to overcome them.

Worried about the economy? So was I last year. I held off taking a buyout at The Guardian until the last minute. I was scared, and I’d be lying if I said there haven’t been a few ‘oh shit’ moments in the last year. However, I can honestly say I haven’t been this happy professionally since I worked as the Washington correspondent for BBCNews.com. The work is fascinating and rewarding. The news organisations I’m working with, like Al Jazeera, are pushing the boundaries, like we did when I joined the BBC News website in 1998, less than a year after its launch. Financially, Suw and I are more secure. Oh, and did I mention already that we’re happier? If you’re beat down and don’t want to take it anymore, just remember: you don’t have to.

Social media is part of journalism

Adam Tinworth has highlighted a comment on Fleet Street Blues that sees social media as “an administrative task” rather than a journalistic one and says that editors want to hire “web monkeys” because they are cheaper than real journalists.

This commenter wouldn’t be the first person to mistake social media journalism for nothing more than a promotional function best left to “cheap web monkeys”. I’m sure that if the commenter works for an organisation large enough to have its own press office, its staff would just love being called cheap web monkeys too. However, smart journalists long ago realised how valuable interacting on social media, rather than merely promoting one’s work or broadcasting on Twitter, was to their journalism.

Megan Garber of Harvard’s Nieman Lab wrote of NPR’s Andy Carvin:

Carvin’s work cultivating sources and sharing their updates has turned curation into an art form, and it’s provided a hint of what news can look like in an increasingly networked media environment.

I’m back in Doha working again with Al Jazeera. I spent five weeks in November and December conducting social media training with more than 100 Al Jazeera English staff. Social media has been key to their coverage of the uprisings in Tunisia and Egypt, and it has been part of the story itself. As Egypt cracked down on their television staff, Al Jazeera sent its web journalists to Egypt to help tell this historic story.

We’re not “web monkeys”. You can just call us journalists from now on.

Does journalism need another open-source CMS?

I have to say that I’m a bit baffled by a $975,000 grant from the Knight Foundation to the Bay Citizen and the Texas Tribune, two very well funded non-profit news organisations in the US. The goal is to create a nimble open-source content management system. I guess WordPress or Drupal, just to name two open-source content management systems, didn’t fit the bill. PaidContent reported that news start-ups expressed this need during meetings last year at SXSW Interactive, saying the new CMS would need to:

  • Manage an integrated library of text, video and audio files;
  • Maximize search engine optimization by improving the way articles are linked, aggregated and tagged;
  • Integrate sites with social networks like Facebook and Twitter as well as bloggers;
  • Provide membership tools and integration with ad networks to help with new revenue streams.

I wonder if those news start-ups have heard of OpenPublish. The platform is a distribution of Drupal with Thomson Reuters’ Calais semantic technology added to help deliver better related content to users. It’s got some nice monetisation features. The Nation and PBS Newshour use it. That’s just one open-source option. How does this not tick the boxes above?

I also know from my own work with news organisations that it’s highly likely these non-profits will create a platform optimised for their own needs but not generally applicable. This is a larger problem with news organisations. All but the largest news orgs could use open-source CMSes and get 90% of what they need with little modification. However, a lot of news editors are obsessed with differentiating on aspects of the CMS that deliver little efficiency to their journalists and little or no benefit to their audiences. IT managers are more than happy to deliver these vanity features because it can justify a bit of empire building.

I do worry that this money will go into reinventing the wheel and deliver little marginal benefit to these start-ups and to the larger news eco-system. Wouldn’t this money be better spent supporting existing open-source projects and adapting them to journalism rather than creating another platform?

Journalism: Winning the battle for attention

Last week, I had the honour of returning to Sydney, Australia for Digital Directions 11, a digital media conference sponsored by Fairfax Media and organised by the ever-wonderful XMediaLab team. I focused on the theme of the attention economy. It’s not a new idea. Umair Haque was talking about it in 2005, but if anything, the issue is more acute now than it was six years ago. Most media business models are based on scarcity. Across the English-speaking world, all but the largest cities are served by only one newspaper. Until cable and satellite, we had the choice of only a few television channels, and in those businesses, high capital costs usually led to monopolies. Digital media of all kinds has ended scarcity, and as Clay Shirky says:

Abundance breaks more things than scarcity does

One of the troubling things has been that news organisations have responded by creating ever more content. The thinking in digital media has been that more content will attract a larger audience and provide more inventory to put ads against. It hasn’t led to increased revenue. If anything, the excess inventory actually depressed digital returns during the recession.

The Associated Press also found in a study (A New Model for News PDF) that young audiences were turning off to news because they were overwhelmed with incremental updates:

The subjects were overloaded with facts and updates and were having trouble moving more deeply into the background and resolution of news stories.

Yet the response by news organisations has been to produce more content even as they have had to reduce staffing due to their economic problems. It’s like trying to save a drowning man by giving him a glass of water.

I argued that relationship and relevance are key to news organisations winning the battle for attention. Engaging audiences directly through social media journalism is one way that news organisations can increase loyalty. I also think that helping audiences discover content that is relevant and interesting to them is key to the future success of news organisations, and I think that they can do this both with semantic and location-based technologies. Success will come with smart, sharp content and real engagement by journalists.

Ada Lovelace Day: 7 October 2011

Cross posted from FindingAda.

As announced on the front page of the Ada Lovelace Day site a few weeks ago, the date of this year’s Ada Lovelace Day has moved to Friday 7 October 2011. Please put it in your diary!

I didn’t take the decision to change the date lightly. We’ve had two years of ALD being in March, and it was starting to become a bit of a tradition, so the idea of moving it to later in the year has worried me a bit, as I don’t want to lose momentum. But by early January it had become clear that things just weren’t going to be ready in time.

Although I have had some fabulous help from some wonderful people, the responsibility for getting things moving still lies with me, and the last six months has seen me incredibly busy with work. We’re in the middle of a recession, so I feel grateful for having such a full diary, but the knock-on effect has been that I’ve not been able to give this year’s Ada Lovelace Day the love it deserves.

It turns out that March is a supremely bad time of year to have a recurring event. Despite trying to get things moving towards the end of last summer, I didn’t make much progress and before you know it, it’s Christmas and everyone’s really busy, and then New Year has come round and suddenly things aren’t ready and it’s all getting a bit tight. Add a trip to India in February to the mix and deadlines throughout March and it became clear to me that something had to change.

The March date was always arbitrary, picked because I was too impatient to wait any longer! The October date has been picked because it’s far enough away that it gives us a chance to get our ducks in a row, but also because (hopefully!) it doesn’t clash with school and university calendars. I’d very much like to do a bit more outreach this year, and would like to have more resources for teachers, pupils, university lecturers and students. A date that’s in term-time, but not too near Easter or in exam season is a more important consideration now than it was two years ago.

There are other changes afoot too: I’ve also shifted the mailing list from Yahoo to Mailchimp, to give us more flexibility. Please do join up – there’s a form in the sidebar of FindingAda.com. I’ll be sending out monthly updates once we have a few people subscribed, with more updates closer to the big day. You’ll be able to manage your subscription and unsubscribe at any time you like, so take the plunge and subscribe today!

Finally, I do need your help to spread the word about the new date, so please do blog about it and tell your friends and your Twitter and Facebook followers! Ada Lovelace Day: 7 October 2011.

Digital Directions 11: Josh Hatch of Sunlight Foundation

Josh Hatch, until recently an interactive director at USAToday.com and now with the Sunlight Foundation, talked about how the transparency organisation loves data, using it to show context and relationships. He highlighted how much money Google gave to candidates: Sunlight’s Influence Explorer showed that contributions from the company’s employees, their family members, and its political action committee went overwhelmingly to Barack Obama.

[Image: Sunlight Foundation’s Influence Explorer showing Google’s political contributions]

The Influence Explorer is also part of another Sunlight tool, Poligraft, which is astoundingly interesting in terms of surfacing information about political contributions in the US. You can enter the URL of any political story, or just the text, and Poligraft will analyse the story and show you the donors for every member of Congress mentioned in it. It highlights details about the donors, donations from organisations, and US government agencies mentioned in the story. It’s an amazingly powerful application, and I think it points the way to adding context to stories easily. It does rely on the gigabytes of data that the US government publishes, but it’s a great argument for government data publishing and a demonstration of how to use that data. Poligraft is powerful and it scales well.
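
Poligraft could also be driven programmatically at the time, if memory serves. Here’s a sketch of how a client might look, though I should stress that the endpoint and parameter names below are my assumptions rather than anything taken from Sunlight’s documentation:

    import json
    import urllib.parse
    import urllib.request

    # Hypothetical sketch: the endpoint and parameter names are assumptions,
    # not taken from Poligraft's actual documentation.
    API = "http://poligraft.com/poligraft"

    story = "http://example.com/some-political-story"  # placeholder URL
    params = urllib.parse.urlencode({"url": story, "json": 1})
    with urllib.request.urlopen(API + "?" + params) as resp:
        analysis = json.load(resp)

    # One would expect entities (politicians, organisations, agencies) with
    # contribution data attached; the exact field names are unknown to me.
    print(json.dumps(analysis, indent=2))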

Josh showed a huge range of data visualisations, and he’ll post the presentation online. I’ll link to it once he’s done.

Digital Directions 11: Fairfax’s digital business

I’m in Sydney to speak at Digital Directions 11. I’ll post my talk to SlideShare in a bit. The conference is hosted by Fairfax, and yesterday, we got a look at their digital business. There are a lot of news and media organisations that have built credible digital offerings over the last decade without building sustainable digital businesses. Fairfax is one of the exceptions. Fairfax CEO Greg Hywood said that digital is its third largest division by revenue and soon to take over the number two spot. Yesterday, we were told that transactions were 60% of digital revenue. Transactions? They have businesses such as a dating service called RSVP and a holiday home rental service, Stayz.com. They are seeing phenomenal growth in that business. Many of these businesses have been acquisitions, not businesses built in house.

Moreover, the revenue being generated by digital is now driving the ascendancy of digital in the organisation. Recently, Jack Matthews, who had been the CEO of digital, was made the CEO of their metro division overseeing both print and digital. He will drive integration. They are going to integrate their print and digital editorial operations, but the current thinking (and this might change) is that while journalism resources will become a common pool, digital will retain its own editorial independence, Matthews told me. Fairfax has found that print and digital offerings don’t share the same audience. Market research has found that their digital audience is slightly more upmarket than their print audience, and they have decided that they need to maintain digital and print independence to best serve those audiences.

They were ready to admit that integration has often meant print divisions taking over digital. If digital didn’t have such a strong revenue position, I doubt – actually, I’m almost certain – that digital would be driving integration at Fairfax. That is not to say that it hasn’t been a battle at Fairfax. I know there has been a battle, and Matthews admitted as much. Nor is the battle over. However, when digital brings in revenue, instead of being pushed aside during print and digital integration, it can actually be in the driver’s seat.

Live blogging evolved: Context and curation not just collection

When I was the blogs editor at The Guardian, I was a big booster of live blogging. We now think of Twitter and Facebook as “real-time”, but in terms of news, we’ve been living in a real-time environment for decades. With the advent of radio, we stopped waiting for the news to arrive with the newspaper. Sometimes rolling news can descend into self-parody, but after working for the BBC for eight years, I know that live and continuous news can be done well. For me, live blogging and real-time curation allow newspapers, via their websites and mobile, to compete against broadcasters in rolling news.

Seeing, via Martin Belam, this post by John at The Louse & the Flea* suggesting that the Guardian Newsblog might somehow be “the Death of Journalism” gives me a chuckle, but it also raises some valid points. Martin responds to some of those points, and I know that he and others at The Guardian are actively working to address some of the deficiencies in the format. Martin says:

Nevertheless, John does identify some of the issues that concern me from an information structure point of view of the way we do live blogging now – notably it is very difficult within our templates to display a summary prominently enough, and the strict reverse chronology of entries whilst a live blog is “active” can lead to the more important chunks of the content getting buried. We could also probably do an improved job of permanently sign-posting packages of more conventionally formatted stories from within the live blog itself.

As anyone who works in online news knows, some of this is just down to the limits of technology, as Martin admits, but they are limits that can be addressed both technically and editorially. Live blogging began with sports coverage at The Guardian and moved on to media and tech conference coverage and also live blogging TV shows. The length of a post was limited by the event – 90 minutes of football or an hour-long episode of Big Brother – but I’m not sure that this format is really well suited to events that carry on all day for several days.

Drowning in a ‘River of News’

However, John does raise some issues that I think are worth addressing. John says that The Guardian’s live blog is:

a mish-mash of baffling tweets, irrelevant musings from the Guardian’s comments, contact details for those who want to find out about loved ones or make donations (including one from the New Zealand Red Cross, who actually says it doesn’t want donations just yet, and another from the Auckland University Students’ Union, the relevance of which escapes me), musings from a boffin at that world renowned centre of earthquake research, Bristol University, and speculation on how the tragedy might affect the Rugby World Cup, due to kick-off in NZ in seven months’ time. Scattered meagrely throughout, like sixpences in a Christmas pudding, are bits of what you and I might call “hard news”.

I really do worry that some of the aggregation that we’re doing is really difficult to navigate unless you’re a news junkie. We have to make sure that a stream of news aggregation doesn’t feel like a maddening stream of consciousness. I have the utmost respect for Guardian live blog masters (and friends) Matthew Weaver and Andrew Sparrow, but I can’t help but think there has to be a better way to package their prodigious and highly professional output. Andy said that some days during the election last year he was producing 40,000 words a day. (CORRECTION: I originally said 14,000, but Andy corrected me on Twitter. I thought 40k words in a day was ambitious, but he is prolific!) How does the average reader easily navigate this? The Guardian did a lot of work during the election to improve the format. They added better formatting for different elements such as blockquotes and contributions from other members of Guardian staff, but the reader still has to rely too much on searching within a page.

I know that Martin and co will continue to evolve the format, but I still can’t help wondering if simply breaking up the posts at major inflection points might be a good interim solution. I agree with Martin that there is a lot that can be done with better packaging.

Martin flags up the prodigious output of The Guardian yesterday, much of it in more traditional formats. However, looking at the headlines, I have to admit that I’m overwhelmed. With some of the headlines, I’m not entirely clear how the stories are different. In saying that, I don’t want to pick on The Guardian. Frankly, I really think that over-commissioning is part of the problem that newspapers are suffering from right now. They can publish continuously, but I know there is a better way to mix slow and flow news coverage.

Curation and context not just collection

I also think that John has hit on an issue that has become a real problem in real-time news coverage in the last couple of years. I’m a journalist. I’m a news junkie. I keep tabs on a wide range of stories in some considerable depth, but even with the background knowledge that I have, I’m sometimes lost. And if I’m lost and overwhelmed in stories that I know really well, I know that our audiences don’t even know where to start.

Whether it’s live blogging or new tools like Storify, I worry that sometimes we’re training a fire hose of news on our audiences. We’re not curating. We’re not making editorial choices and adding context. Instead I do fear that we’re causing information overload rather than helping people make sense of the world. Storify and live blogging are great tools and techniques, but they work when a journalist makes editorial choices and adds value through context. Who is this person on Twitter? What is their role in the story?

On Twitter, I occasionally hear the claim that to edit out information is some form of censorship. If people want the fire hose, they can use Google Realtime. People have a choice to swim in the waterfall, and our editorial choices don’t preclude people from digging deeper and in different ways than we have. Journalists report and choose what they think are the most important bits of information. That’s one of the services that we provide, and in the deluge of real-time news, that service is actually more important than before.

I guess to do that, to be trusted guides, we have to win rather than assume trust. That’s another change in terms of people’s relationship to journalism, but we can do that because we don’t have the walls that separate us from audiences.

Real-time in motion

Some of these super-long live blogs are also a terrible experience on mobile, even with lightweight mobile templates. The downloads are huge, and they don’t work well on small screens. As we increasingly move to mobile consumption, we’re going to have to rethink this format or, more likely than not, think of a new format entirely.

I hope that my friends and former colleagues at The Guardian don’t think I’m picking on them. These are general observations rather than criticisms of The Guardian’s live blogging in particular, and I know that Martin and the great live bloggers on staff there don’t rest on their much-deserved laurels. It’s a big challenge. We can relay so much more information than in the past, when we had a few sources and the wires, but that means we have to find new ways to help audiences make sense of that information.

*Note to John
This isn’t a snarky comment but honest advice. If you’re an unemployed journalist, I’d really suggest adding some links on your blog to your past work and an up-to-date CV. Suw and I use LinkedIn and have a link to our work histories there. Hopefully your blogging won’t just keep you sane while you look for another job but actually help you land it. It’s working really well for me, and I hope it helps you find a job soon.