Whuffie and the snowball

Doc Searls blogs about how these days he prefers to roll snowballs downhill instead of pushing rocks uphill:

Tell ya what. I’m fifty-seven years old, and I’ve been pushing large rocks for short distances up a lot of hills, for a long time. Now, with blogging, I get to roll snowballs down hills. Some don’t go very far. But some get pretty big once they start rolling.

See, each snowball grows as others link to the original idea, and add their own thoughts and ideas. By the time the snowball gets big enough to have some impact, it really isn’t my idea any more.

Anyway, at this point in my life I’d rather roll snowballs than push rocks.

He then quotes Steve Gillmor and Jay Rosen about getting ideas moving, and concludes:

I think Big Challenges start with conclusion, with finished opinions. That’s what makes them Sisyphean. They are bodies at rest that are hard to put into motion, especially in an uphill direction.

But if you start with an idea, whether partly formed or whole, whether yours or somebody else’s, and push it in the downhill direction that all blogging (thanks to links and RSS) essentially goes, it’s bound to have some impact once it grows large enough. And as long as it keeps going.

The problem I have with Doc’s post is this – in order to get ideas rolling downhill, you need to already be uphill, you need whuffie. Firstly you need people to be reading your ideas, secondly they need to want to do something with your ideas (there’s always extra kudos and therefore motivation in doing something with the ideas of someone who’s got a bit of whuffie), and thirdly they need to tell people that they got their idea off you (so that your whuffie builds).

People like Searls, Gillmor and Rosen have whuffie in spades, and this is why they can start snowballs rolling downhill and why those snowballs grow as they go. If you have no whuffie, your snowball will just melt – no whuffie means few readers, no one gaining kudos off developing your idea, no whuffie coming back to you for having had it. The idea goes nowhere.

It’d be nice to think that it’s the quality of the idea that gets the snowball moving, but more often than not, that has nothing to do with it. Hugh Macleod, for example, has so much whuffie that all he has to do is fart and the trackbacks start rolling in.

I saw exactly the same thing when I worked as a music hack – it’s not the bands with the best music or the journalists with the best writing skills that make it, it’s the ones with the whuffie. Same in the film industry – doesn’t matter how good your script is, if you have no whuffie you aren’t going anywhere.

It’s no surprise that it’s the same in the blogosphere. After all, we all know that a small minority of bloggers have all the whuffie. They stand at the top of the mountain, from where it’s easy to start an avalanche. Those of us in the foothills can throw snowballs all we like, but it’s not going to have the same effect.

The trouble is that there are a couple of whuffie Catch-22s going on. Firstly, those who have whuffie get more whuffie and those with none find it hard to build any up. Secondly, my own personal whuffie Catch-22, and one common to everyone in the whuffie-dependent industries, is that I need to blog to gain whuffie so that I can get more work, but the things that pay the bills take me away from blogging, thus preventing me from gaining the whuffie I need to obtain more work. That’s the basis of the feast-or-famine self-employed life.

The answer? Find a ski-lift.

Observer launches new blog

The Observer launched their first blog today (or rather, yesterday, by the time MT deigned to post this), and I have to admit that I really like it. It’s a mix of links to stories from the Observer, glimpses into the production process and thoughts about the news:

Ok, that’s the first edition gone and now there’s the big haul through the next few hours to improve the paper as we go through the night. Last edition is at 2am and we’ll keep tweaking until then.

I’ve been here since 8 this morning, probably ballsing a few things up, changing my mind and basically shouting at people who are doing a very good job. Good staff always save the news editor. At one point the production editor, Bob Poulton, patted me on the shoulder and said: ‘We kept you out of the loop on that one’ as I started arguing about another headline. Quite right.

Unlike most blogs out of the mainstream media, the Observer blog has a nice personal tone to it. Although it obviously does link to its own articles, it’s not just hawking them, it’s talking about how they decide what to cover, how they put the paper together, what their thoughts are. Some of the commenters accuse the blog of being banal, but I like the personal observations. There are so many blogs that act as content filters, giving me links and sending me off into the great wide web, and that’s nice, but for me an insight into what it’s like to put a major UK broadsheet together is far more interesting.

I’ve said before that I think the Guardian ‘gets’ blogs in a way that most of the rest of the media don’t and Neil McIntosh, along with the Observer team and Ben Hammersley, has done a great job on this new blog.

In his email announcing the blog, Neil said:

The blog experiment has been fascinating thus far, especially the unprecedented quality and quantity of discourse between readers and journalists we’ve been enjoying. We’ll be continuing the experimentation as we continue to launch and develop our weblogs in the months ahead, aided by blog guru and journalist Ben Hammersley, who’s come on board to help with the technical aspects of our blog setup.

The Observer is also possibly the first mainstream paper to start podcasting, as Ben explains:

Meanwhile, the first Podcast (another newspaper first, I think) went up today with this post by John Naughton. Apart from being a fine chap, and one of the webloggers, John wrote the definitive history of the internet, and is the Professor of the Public Understanding of Technology for the Open University. So, you know, this isn’t some old guff recorded in the car by a fat bloke with a mullet.

It’s a coincidence that the Observer’s blog launch comes the day before my discussion panel at the LSE on blogging and journalism, but it’s a great example of the good stuff that can be done with a little thought and a solid understanding of how blogs work and what their strengths are.

Fired blogger gets hired

Joe Gordon, who was fired by Waterstone’s for blogging a few frustrated comments about his employers, has now been hired by Forbidden Planet, thus making at least one of my predictions come true – a better job with better pay.

Joe is still waiting for news of his appeal against Waterstone’s, and to see whether an industrial tribunal will be required. Whilst it would be tedious for Joe to have to go to these lengths to gain recompense for Waterstone’s idiotic behaviour, and I wouldn’t wish it on anyone, I am sure I am not the only person curious to see how lawyers and trade unions view blogs in relation to such concepts as freedom of speech, public domain and bringing a company into disrepute.

I’m also curious to know what internal changes have occurred, or will occur, at Waterstone’s now that the person who fired Joe for allegedly bringing the company into disrepute has managed to actually and measurably bring the company into disrepute. But I guess that’s something I’ll never find out.

Reading through Joe’s archive, though, it seems that he was not actually the first person in the UK to get fired for blogging – JGram and Dykenee Crossroads (whom I can’t find online) have also been fired for their blogs. Back in November, JGram blogged the official letter and reason for his firing. However, the original posts have been removed, so it’s impossible to say whether the company he worked for were overreacting or not.

My belief remains that frequently blogs are an excuse – whether used by the company to get rid of someone that they just don’t like, or the blogger to cover up some other misdemeanour. Blogs remain a minority occupation misunderstood (or not understood at all) by many. People frequently fear that which they do not understand and fear can breed illogical over-reactions at worst and a pretence of non-existence at best.

Illustrative of this was the conversation I had with my Lloyds TSB business bank manager on Thursday as I attempted to update him as to what I am doing with myself. The subtext – and it was not a particularly well hidden subtext – from him was ‘I do not understand what you are doing, I do not believe that what you are doing is important, and I do not believe that you will be successful’. Not only did he actually tell me that he didn’t understand what I was doing or how I could make a living from it, he actually implied that no amount of effort on my part would ever result in my being successful and that no amount of explaining would ever make him understand.

Now, this doesn’t have a particularly big impact on me, but it will do on Lloyds TSB when I move two business accounts and at least one personal account away from them to another bank because I’m fed up of having to deal with an ignorant, rude and ineffectual idiot of a bank manager. Ah, the power of the customer.

But all this is just another illustration that it doesn’t matter which side of the fence you are on, employee/employer or business/customer, you’ve got to keep your eye on that blog-shaped ball.

How metafeeds will lead the way to RSS nirvana. Maybe.

I have blogged before about RSS overload, the problem of simply having too many feeds in your aggregator to be able to read them all. Now Bill Burnham gives it a name, Feed Overload Syndrome, and discusses how “RSS threatens to sow the seeds of its own failure by creating such a wealth of data sources that it becomes increasingly difficult for users to sift through all the “noise” to find the information that they actually need.”

He then describes the problem in detail and discusses possible solutions. Syndicating the results of keyword searches instead of actual blogs, he says, is not an ideal approach for three reasons: many RSS feeds carry excerpts rather than full posts, which prevents comprehensive indexing; keyword searches become less effective the more data you index; and keywords can have multiple meanings, which produces noise in the results.

The new Technorati tag system is also ‘fundamentally flawed’ in his view:

The problem at the core of tagging is the same problem that has bedeviled almost all efforts at collective categorization: semantics. In order to assign a tag to a post, one must make some inherently subjective determinations including: 1) what’s the subject matter of the post and 2) what topics or keywords best represent that subject matter. In the information retrieval world, this process is known as categorization. The problem with tagging is that there is no assurance that two people will assign the same tag to the same content. This is especially true in the diverse “blogsphere” where one person’s “futbol” is undoubtedly another’s “football” or another’s “soccer”.

I agree that this is a big problem with tagging, if what you are aiming to achieve is a flawless, cross-referenced database of blog posts. In an ideal world, that would be nice, but this is not an ideal world and people are used to the internet not working quite right. Users learn how to rephrase their search terms to improve results, and once Technorati allows for more complex tag searches or starts to produce clustered search results, the semantic issue will become less important. (Although I doubt it will ever become irrelevant, regardless of what is done.)
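One way to soften the semantic problem is to fold variant tags onto a single canonical form before searching. Here is a minimal sketch in Python of what that might look like. The synonym table and the function names are made up for illustration, and nothing here is a feature Technorati actually offers.

```python
# Hypothetical synonym table: each known variant maps to one canonical tag.
SYNONYMS = {
    "futbol": "football",
    "soccer": "football",
}


def canonical_tag(tag):
    """Trim and lower-case a tag, then fold known variants together."""
    cleaned = tag.strip().lower()
    return SYNONYMS.get(cleaned, cleaned)


def group_by_canonical_tag(posts):
    """Group (title, tag) pairs under their canonical tag."""
    groups = {}
    for title, tag in posts:
        groups.setdefault(canonical_tag(tag), []).append(title)
    return groups


posts = [
    ("Match report", "Futbol"),
    ("Cup final", "soccer"),
    ("Transfer news", "football"),
]
print(group_by_canonical_tag(posts))
```

Of course, somebody still has to maintain the synonym table, which is the same subjective categorisation job moved to a different place.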

Instead, Bill Burnham believes that the way to RSS nirvana is through the use of metafeeds – “RSS feeds comprised solely of metadata about other feeds”.

Combining meta-feeds with the original source feeds enables RSS readers to display consistently categorized posts within rich and logically consistent taxonomies. The process of creating a meta-data feed looks a lot like that needed to create a search index. First, crawlers must scour RSS feeds for new posts. Once they have located new posts, the posts are categorized and placed into a taxonomy using advanced statistical processes such as Bayesian analysis and natural language processing. This metadata is then appended to the URL of the original post and put into its own RSS meta-feed. In addition to the categorization data, the meta-feed can also contain taxonomy information, as well as information about such things as exact/near duplicates and related posts.

RSS readers can then request both the original raw feeds and the meta-feeds. They then use the meta-feed to appropriately and consistently categorize and relate each raw post.
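To make the reader's side of that concrete, here is a rough sketch in Python of how an aggregator might join raw posts to a meta-feed. It is only an illustration under my own assumptions: the field names (`category`, `duplicate_of`), the example URLs and the idea of keying the metadata by post URL are invented here, not taken from Bill's proposal.

```python
from collections import defaultdict

# Raw feed entries, as an aggregator might hold them after fetching.
raw_posts = [
    {"url": "http://example.com/a", "title": "New PowerBook rumour"},
    {"url": "http://example.com/b", "title": "PowerBook rumour (repost)"},
    {"url": "http://example.com/c", "title": "Cup final report"},
]

# Hypothetical meta-feed entries: metadata only, keyed by the post's URL.
meta_feed = {
    "http://example.com/a": {"category": "Technology/Apple", "duplicate_of": None},
    "http://example.com/b": {
        "category": "Technology/Apple",
        "duplicate_of": "http://example.com/a",
    },
    "http://example.com/c": {"category": "Sport/Football", "duplicate_of": None},
}


def categorise(posts, meta):
    """Group raw posts by meta-feed category, skipping flagged duplicates."""
    by_category = defaultdict(list)
    for post in posts:
        info = meta.get(post["url"])
        if info is None:
            # No metadata for this post, so keep it in a catch-all group.
            by_category["Uncategorised"].append(post["title"])
        elif info["duplicate_of"] is None:
            by_category[info["category"]].append(post["title"])
    return dict(by_category)


print(categorise(raw_posts, meta_feed))
```

The hard part, the crawling and statistical categorisation that produces the meta-feed in the first place, is not shown at all.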

The benefits of using metafeeds as outlined by Bill look great. You would be able to find related documents, eliminate duplicates, create custom taxonomies, combine metafeeds and have your information “consistently sorted and grouped into meaningful categories”.

I have to admit, that sounds great. It would be wonderful to be able to create complex search strings and to get a feed back from the web that would contain only relevant posts and no duplicates. It would indeed be a form of RSS bliss.

It won’t, however, solve the problem of RSS overload – it is likely that it will just make it worse. Bill’s fix is a technical solution to a non-technical problem, and as such it is only half a fix.

We have always lived in a world where there was more information available than any one person could comprehend, but before email, the internet, blogs and RSS feeds, the limiting factor was not the existence of the information but gaining access to it. The form of the information limited the speed with which it could be accessed: having to go to a library, find the right book or journal, turn the pages and read them one by one; gaining an introduction to an expert, persuading them to sit down with you and discuss the matter at hand; or doing empirical studies in order to reveal the information sought. It all took time.

Now the data we seek is easily accessible and the problem has shifted – it’s not finding information that’s the issue, it’s finding the right amount of the right information. The limiting factor is no longer access but discrimination. There is so much information available that it’s hard to know which bits to trust.

Anyone who paid attention at university learnt that the way you do library research is to cross reference your sources – you can’t trust one single source to be telling the truth so you learn to triangulate. The more sources that tell you that zebras are black and white, the more you believe it. Then you learn to weight your sources by credibility and reputation. If Learnéd Academic Journal tells you that zebras are black and white, then you feel confident that all other sources are going to agree with that, and it’s easier then to discount the Tabloid Freakshow Magazine article that claims to have discovered a purple zebra.

That’s basic research methodology. Cross reference. Consider the source. Keep a bibliography. And it’s a hard, hard habit to break, even for people who didn’t know that they were doing it.

RSS overload is partly to do with trying to triangulate the ‘truth’ from too many sources. There are many blogs devoted to Macs, for example, and the urge is to read them all to see what each one is saying, to compare the information in order to draw some conclusion as to what is most likely to be true. In blogging, there really aren’t any Learnéd Academic Journal-type sources with the sort of standing that allows you to immediately trust them. There are many reliable blogs written by many well-informed people, but it is difficult to tell which they are until you have completed your triangulation, reached your own conclusion and found that it syncs with what your now trusted blog tells you.

Of course, this is not necessarily a bad thing, as many previously trusted data sources are being shown to be less than trustworthy, but we do have to recognise that this whole process of building up a list of trusted blogs takes time and effort. Although to some degree trust can be passed on to other readers through word of mouth recommendations, we are still doing more work to locate trusted sources than we used to.

Another problem not solved by Bill’s metafeeds is that of completism. If you’ve ever met a rabid collector of stuff then you have probably met a completist, someone who just can’t bear not to have every last Star Wars toy, or every last scrap of Elliott Smith memorabilia. That’s what makes collectors collectors.

Many bloggers are completists too – information completists. To go back to the Mac example, you may rapidly decide which feeds are most reliable and which are mainly talking rubbish, but that doesn’t mean you are going to delete the rubbish feeds from your aggregator because there is the possibility, however slim, that they might just break the rumour of the G5 PowerBook that you’ve been desperately waiting for all these months.

Then there are the long link trails left for us to follow when we are researching our next post. You come across an interesting post, it contains links, which you follow, and then that contains more links which seem relevant so you follow those too… and then you check Technorati and read the posts you find there, and they lead to more and more posts and before you know it you’ve spent a day researching a blog post that is only two paragraphs long.

Information completism is dangerous – it leads to chronic information overload and can turn into a form of ‘legitimate procrastination’. Because link trails are convoluted and potentially exceedingly long, it’s easy to over-research instead of actually getting on with the post.

The only cure is to accept that we are human and flawed and we cannot possibly know everything about everything. We can’t even know everything about one thing, because there is too much to know, too many perspectives to take on board, too many angles to look at it from. We cannot and should not attempt to read every post and comprehend everyone’s point of view on a subject.

Instead we should refine our lists of sources down to a few trusted writers, and let the rest go. Is the Mac idiot whose blog makes you fume really going to break news about a new G5 PowerBook? No. Ditch it. Is reading every post about RSS really going to make your post about RSS overload any better? No. Read what you need then get on with the writing.

If anything, Bill’s metafeeds could well add to the problem of RSS overload by adding more sources to the mix. Instead of cutting down the number of feeds people try to read, it will add to them by providing alternative concretions of data which supplement existing sources rather than supplant them. This is because of the third flaw in his plan – blogs are social, and his fix is technological.

Most of the blog feeds I read on a daily basis I read for social reasons rather than informational reasons. I have 56 feeds in my ‘friends/dailies’ group in NetNewsWire, another ten under ‘acquaintances’. None of these feeds have anything to do with information per se. They could not be replaced by any sort of keyword search and metafeeds would be simply irrelevant in this context. I read them because I want to know what these people are up to – they are friends or people I wish were friends.

But even here, where you would think that the territory is fairly well defined, there is a problem of bloat. Social networking is great, it allows you to meet a whole bunch of interesting people you would never otherwise have met, but widening your social circle also means you have more friends and acquaintances to keep up to date with. Whilst individuals may not expect you to read their blog (indeed, I remain in a state of permanent surprise that anyone reads any of my blogs at all), there remains a nebulous feeling that one really ought to. I’m now connected to a ludicrous number of people, and in all honesty there is no way I can read everyone’s blog.

The problem of RSS overload is not completely technological and a technological fix will not work. Instead it is partly technological, partly cultural, partly social, and partly down to our own personality quirks and habits. Metafeeds may help us find more relevant information more easily, but they won’t cure the information overload problem. Only we can do that, by cutting down on the number of feeds we read, the number of tabs we leave open in Firefox, and the number of people whose blogs we follow.


More on Technorati tags

Over on Burningbird, Shelley has written a great summary/analysis of the current thinking on Technorati’s tags. It is beautifully written, sports some wonderful photographs, and is well worth reading. I’m not even going to attempt to summarise it here, because to do so would be like reinventing the wheel in a triangular shape – pointless and nowhere near as good as the original.

The thoughts that follow are an elaboration of the comment I left on Shelley’s post, so if you read that then some of this may seem eerily familiar.

As I said in my previous post about Technorati tags, I can’t help feeling that we’re really only at the very beginning of the creation of meaningful tagsonomies and tagsonomical tools. Technorati’s implementation of tags is one step on a long road, but until we can sort by what Technorati calls ‘authority’ (but which is really a sort of popularity), pull the search results into our aggregators by RSS, search using Boolean operators on multiple tags and do all sorts of complicated bespoke filtering, tags will remain a bit of a kludge.
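By Boolean searching on multiple tags I mean something like the following. This is a minimal sketch in Python, assuming a hypothetical index that maps each tag to the set of posts carrying it; the tag names and post ids are invented, and this is not how Technorati stores anything as far as I know.

```python
# Hypothetical index: tag -> set of post ids carrying that tag.
tag_index = {
    "rss": {"post-1", "post-2", "post-3"},
    "overload": {"post-2", "post-3"},
    "podcasting": {"post-3", "post-4"},
}


def posts_tagged(tag):
    """Return the set of posts carrying a tag (empty if the tag is unknown)."""
    return tag_index.get(tag.lower(), set())


# AND: posts tagged with both 'rss' and 'overload'.
both = posts_tagged("rss") & posts_tagged("overload")

# OR: posts tagged with either 'rss' or 'podcasting'.
either = posts_tagged("rss") | posts_tagged("podcasting")

# AND NOT: posts tagged 'rss' but not 'podcasting'.
without = posts_tagged("rss") - posts_tagged("podcasting")

print(sorted(both), sorted(either), sorted(without))
```

Set operations like these are cheap once the index exists, so the missing piece is an interface for asking the question rather than anything exotic.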

Tags are, at the moment, at the ‘sledgehammer to crack a walnut’ stage, and there’s a lot of work to be done before we get it refined down to the toffee hammer stage.

A big issue is obviously implementation. People are lazy – I certainly am and I am sure I am not alone. Until we have a way to automatically tag or create tag suggestions that can be approved or disapproved by the user, we are going to have to rely on people bothering to tag their posts, and we’re going to have to put up with the way that the variable quality of their metadata affects this metadata-reliant system.

Of course, we have movement in that direction in terms of the various tagging tools which have sprung up with impressive rapidity. Ecto supports tags using the Custom Tag facility – just create a custom HTML tag with the code below and it will automatically create a tag from the selected text.

<a href="http://technorati.com/tag/%*" rel="tag"></a>
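If your blogging tool has no such facility, the same kind of link is easy to generate yourself. Here is a small sketch in Python with a hypothetical helper, `technorati_tag_link`. Two things in it are my own assumptions: it percent-encodes the tag in the URL, and it uses the tag text as the visible link text.

```python
from html import escape
from urllib.parse import quote


def technorati_tag_link(text):
    """Build a rel="tag" link to the Technorati page for the given tag text."""
    # Assumption: the tag is percent-encoded in the URL path.
    href = "http://technorati.com/tag/" + quote(text)
    # Assumption: the tag text doubles as the visible link text.
    return '<a href="{}" rel="tag">{}</a>'.format(
        escape(href, quote=True), escape(text)
    )


print(technorati_tag_link("rss overload"))
```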

Stephanie Booth has created a plug-in for WordPress, and there is of course the Oddiophile bookmarklet I have mentioned previously. All good starts, but they still require the blogger to bother using them and think clearly about which tags are relevant. As Shelley and others have noted, people are not necessarily very good at creating accurate tags – even people knowledgeable in the area of taxonomy and metadata don’t always create good tags for their own work.

That said, I think there are a few uses for which tags, even as they stand, beat every other system hands down, and one of those is classifying posts by language. At the moment, there really isn’t a consistent way to mark blogs or blog posts by language and that makes it very difficult if one is interested in finding blogs in a given tongue.

If I wanted to find blogs written in Welsh, then I have a bit of a challenge ahead of me. I can search in Google for ‘blog cymraeg’ but all that gives me are blog posts which use the word ‘cymraeg’, so if the post is in Welsh but doesn’t mention the word ‘cymraeg’ it’s not going to show up. For more popular languages, I can choose which language Google should search in, but that still means I need to pick some keywords to search on.

There is a similar problem even with specialised blog search engines, including the keyword search on Technorati – they all search content. I’m no metadata expert, but I see a clear difference between metadata that describes the contents of a post, i.e. what it is about, and metadata that describes the format of the post, such as what language it is in.

By allowing people to add format metadata, tags give bloggers the power to describe aspects of their posts that would not be accurately reflected by keywords selected from the content. Tagging all Welsh posts with ‘Cymraeg’, for example, allows anyone interested in Welsh blogging to locate the most recent posts in that language, regardless of what those posts might be about.
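The difference between searching content and searching format metadata can be shown with a toy example. This is a sketch in Python using three invented posts and two hypothetical functions; it makes no claim about how Google or Technorati actually index anything.

```python
# Three invented posts: two written in Welsh, one in English about Welsh.
posts = [
    {"title": "Bore da", "body": "Mae hi'n braf heddiw.", "tags": {"cymraeg"}},
    {
        "title": "Learning Welsh",
        "body": "I started a cymraeg course.",
        "tags": {"languages"},
    },
    {"title": "Newyddion", "body": "Dim byd newydd.", "tags": {"cymraeg", "newyddion"}},
]


def keyword_search(posts, word):
    """Content search: match posts whose body mentions the word."""
    return [p["title"] for p in posts if word.lower() in p["body"].lower()]


def tag_search(posts, tag):
    """Metadata search: match posts the author has tagged, whatever they say."""
    return [p["title"] for p in posts if tag.lower() in p["tags"]]


print(keyword_search(posts, "cymraeg"))
print(tag_search(posts, "Cymraeg"))
```

The keyword search finds only the English post that happens to mention the word, while the tag search finds both Welsh posts, neither of which uses it.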

Using tags to make up for this shortfall in existing blog metadata, we can then use Technorati as an engine for discovery (as opposed to search) within a set of given criteria. At the moment there is just no other way to do this.

Tags may be a bit kludgy at the moment, but because they are capable of filling a gap in the way we locate blog posts that may be of interest, I think they are going to be with us for the long haul.
