The Twelve (or so) Step Program for Conference Speakers and Organisers

There’s been a lot of talk the last few days about Office 2.0, a conference that brought gender inequality in technology to a new low. Fifty three speakers and one woman was the original unpleasant statistic, and a few people got very cross about it. Rightly so.

But who is at fault? The organisers? The women? No one? Everyone? Someone else?

If the women think that it’s the organisers’ fault for not looking for more women, then we risk becoming passive, quietly waiting our turn. If the organisers think it’s women’s fault for not putting themselves forward, then they risk being lazy, and waiting for women to turn up on their doorstep. It becomes a tragedy of the commons, everyone blaming everyone else and no one doing a damn thing about it.

So, what do we do? I personally believe that the answer lies with all of us. We are ultimately responsible for our own lives, and our own experiences. As a woman, I am responsible for my own attendance at conferences, for submitting papers, and for assessing the invitations that I get. No one put me on some secret Speakers List – indeed if you look at all the lists of women speakers that have been drawn up these last few days, you’ll see I’m not on any of them. Instead, I went through a process of figuring out how to get to speak at conferences, and although I’m still learning, I think it might be helpful to share some of that knowledge here.

I also want to give organisers a heads up, but I’ll do that below. You people are also responsible for your own experience but you also, at the conference at least, help shape ours. You have a responsibility to pull your fingers out of your collective ass and start trying harder.

So… on with the program.

How To Become A Conference Speaker
1. Identify your interests. What subjects are you interested in? What are you passionate about? What do you do at work? What do you want to do at work? Where does your experience lie? If you don’t know the answer to these questions, you are going to find it hard to crack the conference problem. You need to be focused – there are a lot of conferences out there, and you need to pick the ones that you will benefit from most.

2. Identify the conferences. This is easier said than done. I’ve yet to see a comprehensive conference calendar, and I’ve missed plenty of good conference action because I missed an announcement. So search Upcoming and the blogs and any other event-based site you can find. List out your conferences, look at when they are, how much they cost, who’s speaking, what the topics are, and then make a shortlist of ones that interest you most. Note: This is an ongoing process, because new conferences get announced all the time.

3. Pick which conferences you really want to go to. If any are work related, ask your boss if she or he will send you. You may think ‘They’ll never go for this’ but you won’t know unless you ask, and you might get a nice surprise.

4. If you can’t persuade your boss to send you, book off some holiday and go yourself. This is career development, and the investment will be worth it. You will learn new stuff, meet new people and have new ideas. How can that not be worth it?

5. Identify and become a part of the conference communities. Conferences don’t happen in a vacuum, and you will do well to join the mailing lists associated with the subjects you are interested in. You should also engage with bloggers writing about your subjects, and any wikis, forums, etc. that are relevant. Some conferences will even have social networking tools associated with the event, so use those too. Also join wikis like The Speaker’s Wiki. Get yourself out there.

6. If you blog, then write about the conferences you are going to, tell the world you are open to meeting up with people, and then follow through on any invitations you get. If you don’t have a blog, get one and do the above.

7. At the conference, participate. Talk to other attendees, the organiser, the speakers, everyone you can. You’re not there to observe, you’re there to take part and you can guarantee that no organiser is going to notice you if you just sit in the corner and watch. Make sure you mingle with everyone – don’t just hang out with your friends or other women. Go talk to strangers!

8. Ask questions. Speaking is a skill you may have to work hard to acquire. For some it comes naturally, for most it does not. But almost everyone is terrified by the thought of potential public humiliation and I know more people whose stomachs get churned up before speaking than who don’t. One way to ease yourself through this pain barrier is to make yourself ask questions from the floor – personally I find it harder to ask questions than to be a speaker, but maybe that’s just me.

9. When you’ve been to a few conferences and are familiar with the way that they work, start looking for ones that you want to speak at. If they have a call for papers then submit one. If not, then contact the organiser and say you’d like to take part as a speaker. Make sure you are clear about what you bring to the conference: What experience do you have? What projects have you been working on? What are your unique successes? Where does your wisdom lie? Don’t give them a huge long biography or CV, just a succinct summary of your experience and some ideas of how you could fit into the conference schedule. The idea is not to drown them in information but to show them how you make their conference better, and make them want to get in touch with you to find out more.

10. Be prepared to be turned down. It happens to everyone all the time. It may bruise your ego but it’s going to happen and you may as well get used to it. Don’t let it stop you from continuing to push yourself forward as a speaker, and don’t get a chip on your shoulder about it.

11. Improve your public speaking skills. If you’re not a natural (and you may not find that out until it’s too late if you do no prep), then you are going to have to work hard to become a good speaker. Most people in the tech industry – male and female – do not do this. They make no effort to learn how to present, and consequently they bore the pants off their audience. Yes, some of them keep getting invited back because they did something that everyone’s interested in, but if you didn’t just float your start-up or invent AI, then you’re going to have to make sure you are damn compelling when you get up on that stage. So be prepared!

12a. Knock their socks off, and keep knocking them off. Be interesting. If there’s one thing that will keep getting you invited back, it’s being interesting.

12b. GOTO 1.

Note for Conference Organisers
This doesn’t let you off the hook. If you aren’t more inclusive, you can expect to get the kind of flak that Ismael is getting over Office 2.0. If you don’t want to get hassled, then I suggest that you too follow a few tips.

1. Organise your conferences in advance. Don’t try and throw something together at the last minute, because people have lives and the best speakers aren’t necessarily going to drop everything just for you.

2. Look at the other conferences in your field. Who’s speaking? How many males? How many females? How many people from out of town? Or abroad?

3. Look at who’s blogging about your subject. Use Technorati or Icerocket, and spend significant time finding out who’s saying what to whom.

4. Look at your list of potential speakers. Are they all friends? If so, then you might want to hold a private party instead. Are they all men? If so, then you might want to put a bit more effort into finding some women, unless you want your balls handed to you on a plate. Does the gender balance reflect that of the industry? If so, well done.

5. Ask around. Dig a little. Find people who are new to you. Start to compile a list of subjects and possible speakers, and see how well you can balance new, familiar, male, female.

6. Talk to the community. They know people, y’know. Announce a call for papers, but be specific about what you want. I can promise you that ‘email me if you want to speak’ is going to result in a whole world of pain for you – far better to have a formalised submission process asking for things like abstracts to make sure that you collect the necessary data.

7. If you have some names of speakers that you just don’t know, try having a conference call with them to try and get a feel for how they’ll be onstage. It’s very easy to cross people off your list just because you’ve never heard of them, but try to actually investigate first. After all, you don’t know everybody.

8. For panels, consider mixing up some established speakers and some first-time speakers. Panel discussions are really good places for first-time speakers to cut their teeth, but make sure you have an experienced moderator to make sure everyone gets a say.

9. If you have a newbie who has some really good business experience to share, but no speaking experience, try setting up an onstage interview instead of giving them a keynote. But make sure you find a presenter who is good at interviewing (maybe a journalist?), as the only thing worse than one bad speaker on stage is two bad speakers on stage.

10. Stand up to your sponsors. Yes, we all know big names draw crowds. But not everyone on your schedule has to be famous and if your sponsors are pushing for more big names, you should push back. Some of the people on the conference circuit give new talks every time… some just trot out the same old same old every time. Ditch them, no matter how famous they are.

11. Have an expenses fund. Not all good speakers work for big companies willing to cover their costs. Be prepared to help out those who are self-funded, even if you only pay travel and a cheap hotel.

12a. Never stop putting the effort in. Your job is to put on a good conference with varied voices, and if you stop trying to find new speakers, and stop trying to ensure a healthy gender balance, then you’re failing. There is such a thing as ‘bad publicity’ after all – it’s when people say ‘sod you, I’m not coming to your crappy sexist conference’.

12b. GOTO 1.

Right… those are my thoughts off the top of my head. Any more tips for speakers and organisers?

UPDATE: For the record, I did get an invitation to Office 2.0 from Ross Mayfield (after this, but probably unconnected as I’ve worked with Ross in the past). I can’t go, because I have a prior engagement.

FooCamp: What I Did On My Holidays

Wow.

So, FooCamp. It’s a bit like being at a conference where only the speakers have turned up, with no formal schedule and more foyer space than seating for sessions. In other words, it’s just exactly what you want it to be: a chance for a damn good conversation. Or several.

And I did have several damn good conversations. Michael Sparks ran a session, which was far too sparsely attended in my opinion, on how to use scifi to do brainstorming. He explained the basic principle, which is that you name a bunch of authors, ask what thing they have invented in their fiction, and then assume that it’s actually real (so long as it doesn’t require breaking the laws of physics). You then ask how things would be different if you had this thing, and what aspects of it you could actually make within a year.

We ended up taking Terry Pratchett‘s Luggage (which has legs and follows its owner round), and working out how to make one… basically you take a Roomba, add a suitcase to it and include additional sensors to follow a beacon implanted in your shoes. You could also add GPS, and Google Maps so that it can find out where you are (you must also transmit your geolocation to it), and then figure out an optimal route to get to you. You could also add a webcam so you could see where your luggage was, and with a bit of AI you pretty much have the Soomba. Or is that a Luggoomba?
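Just for fun, the follow-the-owner loop we sketched can be written down. This is a toy, of course: the function name, the coordinates, and the idea of reading the owner’s exact position are all hypothetical stand-ins for what a real beacon in the shoes would actually give you (a bearing and a signal strength).

```python
import math

# Hypothetical "Luggoomba" control loop: steer a Roomba-with-suitcase
# toward a beacon in the owner's shoes. Positions are simple (x, y)
# coordinates purely for illustration.
def follow_owner(luggage, owner, speed=1.0, arrive_within=0.5):
    """Step the luggage toward the owner until it is close enough."""
    lx, ly = luggage
    ox, oy = owner
    while math.hypot(ox - lx, oy - ly) > arrive_within:
        bearing = math.atan2(oy - ly, ox - lx)  # direction to the beacon
        lx += speed * math.cos(bearing)         # take one step that way
        ly += speed * math.sin(bearing)
    return lx, ly

# The luggage trundles across the airport hall to its owner.
print(follow_owner((0.0, 0.0), (10.0, 5.0)))
```

Add GPS, Google Maps and a webcam on top of that loop and you’re most of the way to the Soomba.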

How does this change things? Well, in the context of TV programmes (Michael works at the Beeb), you could have Holiday, shown from the luggage’s point of view. Or LOST, where the luggage lands on the other side of the island and has to fight its way through the jungle until it’s reunited with its owner. Or Airport, all about how the luggage coped with being routed via Minsk. The opportunities are endless!

And also very funny. In fact, we thought it was so funny that we got told off by the participants of the session next door who said we were being too loud. Oops.

Another set of great conversations were with the guys from Second Life, Cory Ondrejka and Philip Rosedale whom I spoke to a few times about what they were up to in their virtual world. I went to a couple of Second Life-related sessions, including the one Philip ran, and was really fascinated by the way SL is developing.

It seems to me that it’s going to be increasingly important for me as a social software consultant to be in SL, and to come to fully understand its ecosystem and its economics. Organisations and businesses are already using SL for mixed reality events and other commercial purposes, and I have already had conversations with various clients regarding how they could interact with the people within SL. Of course, it being a community-owned world, any business wanting to enter into it has to do so carefully, and has to understand the community before it tries anything, lest it screw up.

Additionally, I can see SL becoming a really useful tool for running virtual meetings in a way richer and more real than IM or voice chat. Or, in fact, even videochat in some ways. It’s hard to stand up and pace about in videochat right now, and sometimes avatars are a more real representation of ourselves than a photo or video is. (But that’s a whole ’nother post.)

So I foresee a lot more work with SL in my immediate future, not to mention hopefully a lot of ongoing discussions with Philip and Cory.

I had a lot more fascinating conversations with fascinating people, but it’s impossible to record every one of them here. I also had a great time playing Werewolf (link doesn’t include the role Healer, which we were using, but it’s close enough that you’ll get the gist). I only got to play one round, and I did fairly well at not getting lynched (I was a villager), or eaten by the werewolves, but the Seer didn’t figure out I was a villager so that immediately threw suspicion onto me. However, it was a real laugh and I can see why everyone is addicted to it. Really am looking forward to another game some time.

Overall, I have to say that Foo was a fantastic experience. I know it’s not cool in some quarters to rave about how bloody great Foo is, because it’s invitation only and therefore there’s a risk of cliques, but as someone who doesn’t really feel that she’s all that well known or doing the sort of groundbreaking cool shit that a lot of people there were, I must say I felt very much accepted by everyone. There was a great gender balance too. In fact, Quinn tells me it was 17% female, which is far higher than your average tech conference, so props to the O’Reilly crew for that. And there was a lot of diversity in the type of people there: it wasn’t just cool dudes with robots (although there were some cool dudes with robots in attendance).

I did, of course, try to pitch a book idea, and fluffed it really badly. I’m no good at this pitching thing. Witness my attempt to pitch my talk via a single sentence on the schedule boards. Just one person, the very nice Nic Werner, turned up and I have to say that we had a great conversation about social software. One day I will actually write up the talk I was going to give, but maybe next year, if they invite me back, I’ll have something more Fooey to talk about.

Fooooooo!

FooCamp, for those of you who don’t know, is a small invitation-only camping ubergeek* event at O’Reilly Media‘s campus at Sebastopol in California. The whole thing was set up purely so that the O’Reilly lot could then set up a free bar, called the FooBar. It’s a pun, you see, and one worth gathering a few hundred people together to realise.

I was really surprised to get an invitation this year, and really very chuffed, so I’m really excited to be here. There are a lot of people here – most of whom I don’t know, but lots of whom I do. It’s a nice mix of catching up with old friends such as David Weinberger and making new ones like Philip Rosedale from Second Life.

Only got here yesterday afternoon, which was spent putting up the tent, helping David put his up, and generally chatting with people. During the evening, there was a general introduction session where we all (all 326** of us) had to stand up and say who we are, our affiliation and three words to describe ourselves. Mine was ‘Suw Charman, social software consultant. Scaring businesses. Kittens.’ You have to nominate yourself for a talk, so I’ve done that. ‘Social Software: Happy Stories from the Real World’, which will be about how people are really using social software in business… or it will be if anyone actually turns up.

Meantime, Google Earth are sending a plane over to photograph the campus to a resolution of two inches, so up the road people are building a crashed Cylon raider, and Tom Coates, Cal Henderson, Simon Willison, Michael Sparks and I have built a giant space invaders game on some land just behind O’Reilly. Should be fun to see how it comes out in a few weeks when they’ve uploaded the images.

The only hiccup is that I really didn’t do enough research as to the weather here. I was expecting it to be warm, and during the day yesterday, in the sunshine, it was. But by evening, it was freezing cold, and I froze my ass off in my tent overnight despite a nice down sleeping bag. Today, I shall be demonstrating how to wear all the clothes at once.

Having built the space invaders, it’s now time for the first of the sessions that I’m here to see. No notes, because this is more about listening and taking part than it is about taking notes, although I may summarise later if I don’t get caught up in a long, late-night game of Werewolf.

Later…

So the plane came over at just about the same time that the sun came out and the wind picked up. There was much running about and picking up of white cardboard pixels that had blown about, putting them back in the right place and weighing them down with windfall apples and stones. We can only hope that the Google plane managed to get a shot of the space invaders when they were looking their best.

Also went to my first session, which was Michael Sparks’s talk about a cool brainstorming technique using scifi as its starting point. Was hysterically funny, to the point where we got told off by one of the other sessions for being too noisy. Oops. Sorry. Currently in a session about Second Life, which I came late to because I was having a great conversation with Rael Dornfest, whom I last met via a webcam carried about by Kevin Marks at Etech (I was on the webcam, Rael and Kevin were having lunch).

I have to say, though, that FooCamp is possibly one of a very small number of places where one could find oneself staring up at a flock of birds saying ‘Are they birds… or something Philip Torrone built?’.

* For a very flexible definition of ‘ubergeek’, I think.
** Roughly

IPPR/Reuters – The Long Tail: Opportunities in a New Marketplace

The IPPR and Reuters held a seminar on Tuesday 4 July about the ‘long tail’ and niche marketing, and how it relates to IP. Speakers were Shaun Woodward MP; Chris Anderson, Wired; Azeem Azhar, Reuters. As usual, I took copious notes, a habit which will become redundant if all organisers provide the level of recording that the IPPR has for this seminar. You can read the official summary, and you can listen to Part 1: Shaun Woodward MP, Chris Anderson, Azeem Azhar (21.1MB), and Part 2: Questions from the floor and responses (16.8MB).

Meantime… my notes. E&OE.

Shaun Woodward MP

One of my most cherished possessions is a cutting I have from the Sunday Times best seller list, from 1984, because for a few weeks Esther Rantzen, with a co-author, was top of the best seller list with a book on Ben Hardwick, a small child who needed a liver transplant. The story changed paediatric liver transplants in the UK, and the profits of the book went to supporting parents.

That was 20 years ago, and now he has a chance to look back on the media and consider how it has changed. The idea of a programme pulling 20 million viewers on a Sunday night, like That’s Life did, seems remarkable now. Even when it was axed it still had 5 or 6 million viewers, and in the 90s that seemed like a good audience. But in the future, how will the media create a programme with that kind of market share?

Media couldn’t see the future then, couldn’t see how it would develop. In the 80s there were just four channels and no one predicted that there would be things like Sky. If you wanted mass entertainment, there were only two places to go, BBC or ITV. It was supply led, and you had two choices of channel, or nothing.

You could put together a schedule that would grab a third of the population. A winning evening’s schedule would clean up, and the challenge for the BBC was just to keep That’s Life ahead of Coronation Street.

Now it’s multichannel, 24 hour broadcasting, and more choice than ever. Revolution in content and form, because of digital. Prospect of convergence between content and context. Trying to see the future is like trying to see round corners. You can only speculate.

The Long Tail is part of the informed speculation you can do, as opposed to the wild speculation. Need to find a grammar and a lexicon to describe what is happening across the creative industries.

Chris Anderson’s book puts on to the table some very important issues that everyone in the creative industries needs to take on board. He says that the emergent market is going to be radically different, which is right. He says that the market was about hits, but is now about misses. But what is the nature of those misses?

Advertising on TV and in newspapers is down, and they need to find somewhere else to go. This isn’t just about lower audience share and declining sales, it’s about the consequences of choice; but it’s also about new and emergent markets and services. In the old linear economy, it was controlled by the supplier and retailer. You have to sell a certain number of books/cds/cinema seats to be economic, and the key thing is space. You need enough space to break even.

In the 80s, the BBC didn’t even broadcast for 24 hours. Look at music: digital downloads are now 80% of sales; the cost of a digital film print is much less (1/5th?) than traditional prints.

Need to be prepared for niche markets. 31 million hours of original video programming are already being produced every year and as it’s cheaper to make there will be more. Internet is incomparably cheaper than satellite or terrestrial, so will be central to this explosion. Even major film studios are considering the net when thinking about how they release film products. How you watch, where you watch, will change, but that mass market product will be up against huge competition.

This new market is about putting the power in the hands of the consumer instead of the producer, and the producer has to adapt to this. All we can be certain of is that demand will continue to change. New markets are not driven by scarcity.

This presents a new set of problems. How do you navigate through 31 million hours of video programming? Search engines are going to be very important. Whether it’s collaborative, recommendation or affinity filters. We need to recognise the enormity of the challenge that’s upon us.

There’ll be local sports clubs, theatres, schools, streaming their content. These opportunities need to be taken in the UK. We’ll see more and more people creating their own product. Put that together with falling technology prices, wider access to more material, unlimited storage and unlimited bandwidth. This is all happening now. But we don’t know how it will unfold. The net has changed everything. And it’s going to change our broadcast services.

Mass markets of a million have changed to markets of one. But there is still a mass market, and there’s still a place for the BBC, but there will be very important discussions about content, censorship, regulation and legislation. The EU wants to introduce a directive called Television Without Frontiers, which wants to police content put on the web. There’s a huge growth in opportunities and in jobs, and there is a risk that a directive born of a desire to protect an on-demand video market in Luxembourg has all the hallmarks of the French wanting to protect their farmers by introducing the CAP. This directive is well intentioned, but the consequences are vast.

Have to look at the UK’s position in this revolution. 75% penetration of digital TV, and by 2012 we will switch off analogue altogether. Broadband three years ago was £27 pcm; now it’s £24 but 48x faster.

We have a problem with the digital divide: people with access and people without. But the truth is bigger than that. And we will look back and think that switching off analogue showed huge foresight.

The big question is: are we ready? There are challenges to content, copyright and intellectual property. The work of Andrew Gowers is important, but whatever we produce now or early next year will itself need revision in a few years.

$831 billion in the world’s creative economies. That’s 1.3 trillion now, a 50% increase in five years; by 2010, 2 trillion dollars. The UK employs millions in these industries.

Chris Anderson, The Long Tail: Who needs megahits?

We grew up in the era of the blockbuster. We see the world through hits, but it wasn’t always this way. In the 18th and 19th centuries, culture was fragmented by geography. It moved at the pace of people.

The high speed printing press, then photography, changed that. But radio and TV then changed the whole nature of culture. We were suddenly watching the same thing at the same time. You could come in to the office on Monday and talk about what was on TV Sunday night; we were linked by a common culture.

This defined the era I and most of us grew up in. It peaked on March 21, 2000, the first day of spring of the new century, shortly after the dotcom crash, with the launch of the NSYNC album No Strings Attached – it sold 1 million copies on the first day, 2.4 million in the first week. This record will never be broken.

Chart of hit albums shows a peak around 2001. Number of hit albums has fallen by 50%, despite music sales being steady if you include digital. More music than ever, more artists, but fewer hits.

For TV, number of broadcast channels increasing, but network share for top four networks fell.

Ratings of top TV shows show a peak in the 50s for I Love Lucy, then decrease steadily as choice increases. Today’s number 1 show wouldn’t have made the top 15 fifty years ago.

The shape of 21st century culture is a power law – a big peak where a few things are very popular and a lot of things are not. There’s a bottleneck – bookshop shelf space, spectrum for broadcast, etc. When you have limits you have to be selective, and when you are selective you pick the most popular things.

Net has no limit. Infinite shelf space. So can provide for everyone.

All markets show a straight line when you plot sales vs products on a log scale; the same goes for earthquakes, and for city sizes.

It should be a straight line… but not for the American box office. Around the 900th film, the cinemas run out of screens. Megaplexes can show 250 films per year, so as soon as you run out of screens, films that aren’t shown start grossing much, much less. Cinema’s distribution channel can only show a limited number of films, and ergo they are the popular ones. Thus there is latent demand for product suppressed by the distribution model.
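The straight-line claim is easy to demonstrate with a toy model. This sketch assumes a Zipf-like fall-off (sales proportional to 1/rank) purely for illustration – the exponent and the top-seller figure are mine, not Anderson’s:

```python
import math

# Toy power-law ("long tail") sales curve: if sales fall off as a power
# of rank, then log(sales) against log(rank) is a straight line, which is
# the signature described above. The numbers are made up for illustration.
def powerlaw_sales(n_products, top_sales=1_000_000, exponent=-1.0):
    return [top_sales * rank ** exponent for rank in range(1, n_products + 1)]

def loglog_slope(sales):
    """Least-squares slope of log(sales) vs log(rank)."""
    xs = [math.log(rank) for rank in range(1, len(sales) + 1)]
    ys = [math.log(s) for s in sales]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

sales = powerlaw_sales(10_000)
print(round(loglog_slope(sales), 3))  # -1.0: a straight line on log-log axes
```

A market with a hard cutoff – like the 900-screen bottleneck – would show that line snapping sharply downwards at the cutoff rank instead of continuing smoothly.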

The long tail is not a concept I invented – it’s been around for a long time. All I’ve done is give it a name.

Music data for Rhapsody, 2005. Walmart is the largest music retailer in the US. If you remember a real music shop, Walmart is a soul destroying experience. Last year 65k new albums were released, but only 700 made it to Walmart. It only sells a tiny sliver of what’s out there, because of inefficient distribution and limited shelf space.

The long tail is, therefore, huge. He compares what’s available online and offline:

Rhapsody vs Walmart: 1.2 million tracks sold on Rhapsody; 55,000 sold by Walmart. 40% of total sales are in offline retail stores.

Netflix vs Blockbuster: 55,000 DVDs on Netflix; 3,000 held by Blockbuster. 21% of total sales are in offline retail stores.

Amazon vs Barnes & Noble: 3.7 million books on Amazon; 100,000 books in Barnes & Noble. 25% of total sales are in offline retail stores.

These numbers for online are growing dramatically.

What’s driving it?

Democratise the tools of production – blogs democratise publishing. Result: more stuff.

Democratise distribution – the net gives everyone access. Result: more sales.

Connect supply and demand. Result: business driven from hits to niches.

Google provides long tail advertising. Hyperfocused ads on hyperfocused blogs. Google scaled their model down to the small people who were neglected by the old advertising models.

Ebay is the same. Allows small vendors to have global reach.

CapitalOne: the long tail of credit cards. People either had great credit ratings or really bad ones, with no middle ground. Now we have the ability to fine-slice the market and offer the right credit card by offering different rates depending on individual credit rating. It led to a huge pile of debt, but…

Open source: the long tail of talent. Some programmers come from odd places, like Madagascar.

Long Tail libation: “Tail Ale”. There used to be no choice in beers – we had just four national beers – but stock management software means shops can now provide more variety on the same shelf.

Small is the new big.

Many is the new few.

10 Fallacies of ‘hitism’:

1. Everyone wants to be a star.

2. Everyone’s in it for the money. The average book sells 500 copies. The average expectation is not that they will make money – there are lots of reasons to write books.

3. The only success is mass success. Can be an artist with small number of fans, but be true to yourself.

4. “Direct to video” = bad. But allows you to get the right audience the right way.

5. “Self published” = bad. But it allows you to reach your audience more easily.

6. Amateur = amateurish. It really means people do it for love. Knowledge, experience, wisdom, is much more widely distributed than our professional ranks would say.

7. Low-selling = low-quality. Sometimes the most refined, most perfect items are the ones that are aimed at a niche audience. The best researched book will not be a best seller. There are gems, there are diamonds in the rough. The thing is not to give up because the signal to noise ratio is bad, but to look for the diamonds.

8. If it were good, it would be popular. Instead, it’s just not for everyone.

9. The economics of the head apply to the whole market. They don’t.

10. You can focus on strong signals and ignore the weak signals. Rise of the bottom-up hits, where bands that didn’t go the traditional route are burbling up from below; now the top down marketed hits aren’t doing so well. Have to listen to the weak signals because that’s where the innovation is coming from.

Lessons:

1. Don’t confuse limited distribution with shared taste

2. Everyone deviates from the mainstream somewhere.

3. One size no longer fits all

4. The best stuff isn’t necessarily at the top

5. The mass market is becoming a mass niche market.

He then shows the ‘Day of the Long Tail’ video.

Q&A

Azeem: Can government help or hinder giving access to the long tail content?

Chris: The biggest barrier is rights. That’s the elephant in the room. Those rights were cleared for one form of distribution – broadcast – but we need to clear the rights for redistribution. Given up on Congress in the US, because Disney controls that issue. Is there a way to do batch processing to clear the rights?

Shaun: This is at the heart of the whole deal, because the consumer has a right to access stuff, but the producer has a right too. It’s interesting to see how the BBC is changing its contracts. When you were a reporter, you signed over everything, but that’s changed. I think that the BBC is doing some interesting things – I say this because I actually think that the BBC is really looking at the 21st C in a way that is responsible and innovative and about opening up markets to enable competition. In the management of rights and repeats etc., I think they’re going the right way.

What Chris has referred to is coming up in our program, and he’s on to something when he says copyright law in America is written on behalf of Disney. How do people with ideas for games, for example, develop those ideas when MS or Sony own the means of distribution? It should be ownership for everybody, because it’s access for everybody.

Larry Kay, Solicitor: Would ask Chris to turn his idea about copyright on its head. His idea is about minority interest things. In the era of the long tail the blogger is as powerful, or more powerful, than big industry, and there’s a great danger that we risk throwing out the whole framework, because it’s built on rights. If we have a problem, it’s how do we efficiently identify, say, orphaned works. Need solutions at a practical level.

Chris: For books, there are 32 million titles in US libraries. 20% are explicitly out of copyright. The vast majority, 75–80%, are in a grey area. They are probably in copyright, but the copyright holders may not be interested in upholding those rights. So if we don’t know, we default to not taking the chance. And as a result the vast majority are unavailable. This is the Google Books problem.

The thing is, different rights holders have different views. At the head, you want to control the rights and exploit them. The majority want readers, so they are happy to neglect the copyright if it means more readers. And then there are people who renounce copyright and use a CC licence. There’s a spectrum of views, but there’s no mechanism for clearing rights.

Paul Sanders, PlayLouder: We do have these two uses of copyright, two traditions – one protects the creator and the other protects the distributor. I wonder whether that can be sustained. Whether we should be using copyright rather than other rights to protect the distributor.

Chris: Right to point that out. The distributor was the necessary path to market, so the rights of the creator got subsumed by those of the distributor. Creators can express their own rights now.

Shaun: It’s a fascinating question. It’s equally true in the film industry. We’re going to need to have some debates about what we value. There are things that are going to be lost. Interesting to put this in the context of broadcasting and film. Good tradition of making public service broadcast, some of which is very expensive. Do we value that? Do we want it? If you value ‘stars’ in films, you’re going to have to pay for them. Now we can lose stars, or we can have them, so then you’re into the business of who’s going to fund that? So we need to find the right questions and get busy asking them.

Simon Walker, EMI: Chris, you seem to be setting up an ‘either/or’: either you are big or you are long tail. We have a long tail model – we put out lots of stuff, but we know not all will be hits; plus we have a back catalogue. Can see new business models emerging. Do we need to adapt the model?

Chris: It’s not that you have hit retailers and niche retailers; online you have retailers who have hits and niches. In the 90s you had niche-only retailers, but there was no way to start, just a big undifferentiated mess. Rather than an either/or, it’s an ‘and’ business.

How do you deal with this? You scale down. The old model involved a large advance and an asymmetrical royalty model, but in the new model people go straight to consumers on their own. So it might be advantageous for you to be on our label, but we’ll do something smaller: just sell digitally, not use our sales force to get you into WalMart, and give you a bigger cut.

Azeem: Is there a consolidation of the market?

Chris: Are seeing more and more independent labels.

Shaun: Creative industries are also about risk, and some of these products take a lot of money to develop, and one of the things we have to balance here is what are we trying to protect? Have to get it right because, if you get it wrong, it’s hard to recover.

Anon: Yes, there are many more small labels, but consolidation in things like eBay, PayPal, etc.

Chris: Many have noted that there is a short head of aggregators, like Amazon or iTunes. We’re at the first day of this. One size doesn’t fit all, but iTunes is a terrible way to get music – it’s all oriented around pop. In search, Google has a dominant share, but you have lots of different ways to search, and they realise you want different forms of aggregation for different markets. But we’re so early in this market we haven’t seen the diversity.

[My notes end here, although the debate did go on.]

Xtech 2006: Wrapup

Well, Xtech ended on Friday afternoon, and I’ve had the weekend to recover and to think about it all. Actually, I’ll need a lot longer than a weekend to process all the stuff that I took in, but it’ll be fun cogitating on everything I heard. I think Edd Dumbill and Matt Biddulph lined up some fantastic speakers, and having produced a conference myself in the past, I know just how much work it is.

The talks that stood out for me were:

– Matt Biddulph, talking about putting the BBC’s programme catalogue online.

– Paul Hammond, on open data and why there isn’t more of it about.

– Tom Coates, doing his web of data talk, which is always good.

Had some fun conversations too, with a whole host of people, including but not limited to: Teh Ryan King, Brian Suda, Thomas Vander Wal, Jeremy Keith, Simon Willison, Matt Patterson, Jeffrey McManus, and Jay Gooby. I’m sure I’ve missed someone off: sorry if that’s you!

The next Xtech will be in Paris in 2007. Can’t wait.

Xtech 2006: Jeff Barr – Building Software With Human Intelligence

Amazon have released a number of APIs, but going to focus on the Mechanical Turk.

[He has ‘ground rules’ and says ‘please feel free to write about my talk’. Er… command and control anyone? Give me a break.]

[Lots of guff about Amazon.]

Artificial artificial intelligence.

The original Mechanical Turk was a chess playing robot. No one could figure out how it worked. Turns out there was a ‘chessmaster contortionist’ inside.

So the MTurk is powered by people. Machines can’t tell the difference between a table and a chair, whereas a person can do it immediately.

HIT – Human Intelligence Task.

Can check people have the appropriate skills, e.g. be able to speak Chinese or tell a chair from a table.

So you make a request, say what skills etc. are required, figure out the fee you’re willing to pay. Workers go to MTurk.com. 45 types of work, can filter by price or skills required etc. Transcriptions are very popular. E.g. CastingWords.com do transcriptions of podcasts. Question and answer kinds of things.

You decide if your workers get paid, but there are ratings on both sides, so you can rate the employers as well as the workers.
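[To make the flow concrete – my own toy sketch, not the real MTurk API – requesters post HITs with a fee and required qualifications, and workers only see work they qualify for:]

```python
# Toy model of the HIT flow described above: a requester posts a task with
# required qualifications and a fee, and a worker only sees HITs whose
# qualifications they hold. Illustration only -- not the Amazon API.

class HIT:
    def __init__(self, title, fee, qualifications):
        self.title = title
        self.fee = fee                          # what the requester will pay
        self.qualifications = set(qualifications)

class Worker:
    def __init__(self, name, qualifications):
        self.name = name
        self.qualifications = set(qualifications)

    def available_hits(self, hits):
        """Filter to HITs whose required qualifications this worker holds."""
        return [h for h in hits if h.qualifications <= self.qualifications]

hits = [
    HIT("Transcribe podcast", 2.50, {"english"}),
    HIT("Translate abstract", 5.00, {"english", "chinese"}),
]
alice = Worker("alice", {"english"})
print([h.title for h in alice.available_hits(hits)])  # ['Transcribe podcast']
```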

Software Developers

– can use APIs with this to include humans in their applications

Businesses

– can get stuff done that humans need to do

Anyone

– can make money

– new businesses feasible

Public use – massive scale image clean-up, i.e. which picture best represents the thing it’s of? Got Slashdotted to death. People did Greasemonkey scripts to help them do it. Had so many people doing this that they ran out of images. Had more workers than work for a while.

HIT Builder, helps you create your HIT (task thing).

Could use it for market research or surveys. E.g. wanted a survey for developers, so added some qualifications to weed out the non-developers by making people answer questions like ‘which of these four aren’t programming languages’.

Translations services.

Translate written transcripts to audio.

Image Den, photo editing and retouching, e.g. removing red-eye, cutting things up etc.

CastingWords, podcast transcription service.

Need an Amazon account to work, which requires a credit card, and that’s their way of trying to ensure no child labour.

[I think this could possibly have been an interesting talk, but it wasn’t. I like the idea of having APIs for something like MTurk, but this guy was really dull. I guess I could cut and paste from the back channel to spice things up a bit, but that might be mean.]

Xtech 2006: David Beckett – Semantics Through the Tag

Common way to think of tags is as a list of resources. You tag something, then you get a list of stuff you’ve tagged, etc.

Another way is tag clouds. Size of the tag represents how popular it is.
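[As an aside – my illustration, not the speaker’s – cloud sizing is usually just log scaling of usage counts into a font-size range, so one runaway tag doesn’t dwarf everything else. A minimal Python sketch with made-up counts:]

```python
# Minimal sketch of tag-cloud sizing: font size scales with the log of a
# tag's usage count, clamped between a minimum and maximum pixel size.
import math

def cloud_sizes(counts, min_px=10, max_px=36):
    """Map tag -> font size in pixels, log-scaled between min_px and max_px."""
    lo = math.log(min(counts.values()))
    hi = math.log(max(counts.values()))
    span = (hi - lo) or 1.0  # avoid divide-by-zero when all counts are equal
    return {tag: round(min_px + (math.log(n) - lo) / span * (max_px - min_px))
            for tag, n in counts.items()}

sizes = cloud_sizes({"xtech": 120, "tagging": 45, "georss": 8})
print(sizes)  # most popular tag gets 36px, least popular 10px
```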

Suggested tags. Discovery process for other tags that people are using that are similar to your tags.

Flickr – clustering of photos with tags that are related, also interestingness which is partly tagged-based but also takes interactions with the site from people into account.

Mash-ups – assume tag is a primary key. Works well on events such as Xtech: Technorati, Planet Xtech. Photo sites are more place/time centric; del.icio.us ones are more pic related. So the further away you get from place/time centric, and the more generic the tag gets, the weaker its usefulness for mash-ups. Generic tags don’t work so well – they don’t work as a connection across tag space, and won’t tell you anything.

Emergent tag structures

There’s no documentation because it’s so lightweight so people use it however they like. Pave the paths that people follow.

– Geo tagging, lat/long, places

– Cell tagging, from mobile phone cell towers, associated with cameraphone pictures

– Blue tagging, bluetooth devices that were in context at the time

Hierarchy

Not about creating a taxonomy, but looking at emergent hierarchy, e.g.

– programming

– programming:html or programming/html

These appeared on their own; no one is thinking of consistency. The tag system may not understand the hierarchical system, but it helps people find them.
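[A sketch of how software might surface this emergent hierarchy – mine, not the speaker’s – just split on the separators people actually use, without imposing a taxonomy:]

```python
# Recover the emergent hierarchy in tags like "programming:html" or
# "programming/html" by splitting on the separators people actually use.

def tag_path(tag):
    """'programming:html' or 'programming/html' -> ['programming', 'html']."""
    for sep in (":", "/"):
        if sep in tag:
            return tag.split(sep)
    return [tag]  # a plain tag is just a one-element path

print(tag_path("programming:html"))  # ['programming', 'html']
print(tag_path("programming"))       # ['programming']
```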

Grouping

Bundles in delicious. Similar to Flickr sets.

So whose stuff is tagged?

– Yours on the tagging site

– Other people’s on the tagging site

– Anybody’s

Flickr is a closed system, and you can only tag certain things like your photos and your contacts; whereas Del.icio.us is more open.

So whose tag is it anyway?

If it were a domain name it would be clear, but really, who cares? This is lightweight; you don’t need to think about who owns it, just use it.

Tags are vocabularies per service – tagonomies. Each service uses different words as tags, which may or may not be the same across services. So they can use terms differently.

What’s the point of tagging semantics?

1. for people to understand what some use of a tag means: there’s no way of finding out what a tag means without looking at it in context and figuring it out.

2. for computers to gather information about a tag, supporting #1.

What does a tag mean to someone?

– ask them? not scalable

– look it up in a canon – dictionary or encyclopaedia – but it isn’t distributed and it’s too much like hard work. Need a mechanism that you can just use. Don’t want anything heavyweight.

Good things

– low barrier to use, just start typing

– few restrictions on syntax

– unrestricted word space, if you were looking at it from a librarian’s point of view, Dewey Decimal is restricted to what’s defined by the system

– social description, folksonomy; can see what friends are doing, look at groups and sets, and make up your own tags

– if you have lots of tags, over time as the no. of tags increases, the descriptions merge towards an average, because any one individual’s version of a description becomes less important, so over time meanings converge.

– easy to experiment, because there is no authority that says it’s not allowed.

Problems

– formalism problems: mixing types of things, names of things, genres, made up things, ambiguity, synonyms

– meaning is implicit

– power curves: nobody explains the long tail tags, so individuals’ meanings get lost and subdued by the mass of people tagging (this is a plus – see above – and a minus)

– naive tag mashups mix up meanings

– syntax problems – stemming, plurals. some services try to join things by ignoring spaces, plurals, caps, lower case, etc. by using natural language.

– tricky to make a short, unique tag. computer wants something unique, humans want something short and easy.

All these are the usual human-entered metadata problems.
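[My illustration of just how naive the stemming/plural/case joining gets – and why it misfires, which is exactly why it’s listed under problems:]

```python
# Naive tag normalisation of the kind some services use to join variants:
# lowercase, strip spaces, trim a trailing "s". Crude -- "Glass" becomes
# "glas" -- which is why this is a problem as much as a solution.

def normalise(tag):
    t = tag.lower().replace(" ", "")
    if t.endswith("s") and len(t) > 3:
        t = t[:-1]  # crude plural stripping
    return t

print(normalise("Web Standards"))  # webstandard
print(normalise("webstandard"))    # webstandard -- the variants now join
print(normalise("Glass"))          # glas -- oops
```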

Possible solutions

– microformats: no good hook for software, and are read only

– web APIs: read/write but are for programmers only, not much use to 99% of tag users

– RSS: but it’s read-only, so more about me giving you stuff than getting stuff back.

Separate from service

Need to then understand the words out of context, with no service behind it.

Want

– a description

– a community

– a historical record

Answer: a wiki

– a description page

– a community of people to discuss and/or edit

– a historical record

Example, raptor tag

Raptor is a bird of prey, a hard drive, a plane, dinosaurs. So what does it mean if you tag something raptor?

So there is ambiguity. Wikipedia uses disambiguation pages to help clean up the meanings of words. People can read this, but so can machines. This stuff is recorded semantically, so it can tell that a term is ambiguous. Can also look it up in Wiktionary, and can then leap across languages too.

Wikitag

– Easy to create

– record the ambiguity, and synonyms/preferred names

– microformat compatible: metadata is wiki markup, so is visible; reuse of existing format.

http://en.wikitag.org/wiki/Raptor

Defined the meaning of a tag.

– can discuss the term

– and can add disambiguation if need be

This isn’t perfect.

– discussion, needs easy-to-use threaded discussion

– Wikipedia rules, e.g. NPOV and encyclopedic style, are not appropriate for something as lightweight as this. Needs fewer rules, maybe just ‘keep it legal’.

– Centralised. 🙁 don’t want a ‘one true way’ of doing this.

Can also add in semantic wiki mark-up.

Tagging is a social process with a gap: the place for a community to build the meaning. A wiki can fill the gap.

Xtech 2006: Mikel Maron – GeoRSS

90% of information has a spatial component. Needed to agree a format.

Definitive history of GeoRSS

– Syndicus Geographum – ancient Greek treaty for sharing of maps between city states

– blogmapper/rdfmapper – 2002, specifying locations in weblog posts, little map with red dots

– w3c rdfig geo vocabulary – 2003, came up with simple vocab on irc and published a doc, and this is the basis of geocoding RDF and RSS

– geowanking – May 2003, on this discussion list GeoRSS first uttered

– World as a Blog/WorldKit – realtime weblog geo info nabbing tools, World as a Blog looks at geotags in real time then plots them on a map so you can see who’s up too late.

– USGS 2004, started their Earthquake Alerts Feed.

– Yahoo! maps supports georss 2005

Lots has happened. Google released GoogleMaps, and shook everyone up with an amazing resource of map data, and released an API. Lots of map-based mash-ups.

OSGeo Foundation, Where 2.0, OpenStreetMap.

The format then only specified points, not lines or polygons. GeoRSS.org. Alternatives were: KML, used by Google Earth – rich, similar to GML, but too complicated and tied to Google Earth, so some stuff is more for 3D; GPX, an XML interchange format for GPS data – extensible, but tied to its application so not so useful; and GML, from the Open Geospatial Consortium – useful for defining spatial geometries, so an XML version of a shape file, but quite a complicated spec at over 500 pages, and a bit of confusion on how you use it because it’s not a schema, it’s similar to RDF, providing geometric objects for your own schema.

OGC got involved in GeoRSS because they wanted to help promote GML. So some of GeoRSS is drawn from GML. There are two types of GeoRSS: Simple and GML. Simple is a compressed version of GML. Neutral regarding feed type, e.g. RSS 1.0/RDF, RSS 2.0, Atom.
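[For reference – my example, not from the talk – a GeoRSS Simple point is just “lat lon” as text in a georss:point element. Pulling one out of an Atom entry with the Python standard library:]

```python
# Parse a GeoRSS Simple point out of an Atom entry. The <georss:point>
# element carries "latitude longitude" as plain text.
import xml.etree.ElementTree as ET

entry = """<entry xmlns="http://www.w3.org/2005/Atom"
                  xmlns:georss="http://www.georss.org/georss">
  <title>M 5.3 Earthquake</title>
  <georss:point>45.256 -71.92</georss:point>
</entry>"""

root = ET.fromstring(entry)
point = root.find("{http://www.georss.org/georss}point").text
lat, lon = (float(v) for v in point.split())
print(lat, lon)  # 45.256 -71.92
```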

Looking for potential to create a Microformat.

[Now goes into some detail re the spec which I’m not going to try to reproduce].

EC JRC Tsunami Simulator. Subscribed to the USGS earthquake feed, ran a tsunami model, and depending on the outcome, they would send out an alert. Also had an RSS feed. Produces maps of possible tsunamis.

Supported by, or about to be supported by:

– Platial

– Tagzania

– Ning

– Wayfaring, Plazes

Commercial support

– MSFT announced intention

– Yahoo! (Upcoming, Weather, Traffic, Flickr may potentially use it, and the Maps API)

– Ning

– CadCorp

Google

– OGC member

– MGeoRSS

– Acme GeoRSS

– GeoRSS2KML

And other stuff

– Feed validator

– WordPress Plugin in the works

– Weblog

– A press release

– Feed icon

Aggregation

http://placedb.org

http://www.jeffpalm.com/geo/

http://fofredux.sourceforge.net/

http://mapufacture.com/georss/

Mapufacture, create and position a map, select georss feeds and put them together in a map. Then can do keyword searches and location searches. Being able to aggregate them together is very useful. Rails app. E.g. several weather feeds, added to a map and then when you click on the pointer on the map, the content shows up.

Social maps, e.g. places tagged as restaurants in Platial and Tagzania on one simple map.

Can search, and navigate the map to show the area you’re interested in, then it searches the feeds and grabs everything in that location. All search results produce a GeoRSS feed which you can then reuse.

Odds and ends

– mobile device potential, sharing info about where you are

– sensors, could be used for publishing sensor data

– GIS Time Navigation, where you navigate through space and see things happening in time, e.g. a feed of events in Amsterdam which provided you with a calendar and location.

– RSS to GeoRSS converter, taking RSS, geocode place names and produce GeoRSS

Xtech 2006: Ben Lund – Social Bookmarking For Scientists

Connotea, social bookmarking for scientists.

Why for scientists? Obviously, scientists and clinicians are a core market. This doesn’t exclude others, but by concentrating on users with a common interest they could increase the discovery benefits. Hooks into academic publishing technologies.

Connotea is an open tool, is social so connects to other users, and has tags. But what it does is identify articles solely from the bookmark URLs. So it can pull up the citation from the URL – title, author, journal, issue no., page, publication date. This is important for scientists.

Way it does it is by ‘URL scanning’. So user is on a page, e.g. PubMed which is a huge database of abstracts from biomed publications. When the user clicks ‘Add to Connotea’, this opens a window, it recognises that this is a scholarly article, and imports the data.

Uses ‘citation source plug-ins’ – Perl modules for each API. It asks each plug-in whether it recognises the URL, and when one does it goes and gets the information, which is then associated with the bookmark in the database.

[Now runs through some programming stuff.]

Bookmarks on a lot of these scientific resources are far from clean or permanent and have a lot of session data in. So this needs cleaning off.
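[My own sketch of the kind of clean-up this implies – strip session parameters and fragments before treating the URL as an identifier. The parameter names here are illustrative guesses, not Connotea’s actual list:]

```python
# Clean session junk off a bookmark URL so the same article always yields
# the same identifier. SESSION_PARAMS is an illustrative, not exhaustive, list.
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

SESSION_PARAMS = {"jsessionid", "sid", "session", "phpsessid"}

def clean_url(url):
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query)
             if k.lower() not in SESSION_PARAMS]
    # Rebuild without session parameters; the fragment is dropped too.
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(query), ""))

print(clean_url("http://example.org/article?id=123&sid=abc#top"))
# http://example.org/article?id=123
```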

So what’s important? Retrieval and discovery. Already has tagging for navigation. Also has search in case there are some articles that haven’t been accurately tagged.

Provides extra link options for bookmarks. Main title links to the article, say in PubMed; but there are links to other sources for this article, e.g. to the original Nature article; plus other databases, and cross-referencing services.

System also produces a long open URL with all the bibliographic information in it.

Now … the hate.

First hate:

– poorly documented and poorly implemented data formats. A variety of different XML schemas. Liberal interpretations of standards.

Second hate:

– have to do lots of unnecessary hoop-jumping to get this data. Lots of pinging different URLs to get cookies, POSTs, etc.

Third hate:

– have to do everything on a case-by-case basis. Have to reverse engineer each publisher’s site. Have to write ad hoc rules and custom procedures for each case.

A wish

Nature released a proposal called OTMI, Open Text Mining Interface – it wants to make Nature’s text open for data mining, but not the articles themselves. Researchers are looking for raw XML for doing data mining research, but every time someone asks they have to make ad hoc arrangements. So OTMI does some pre-processing to make the data more usable.

Publishers could choose to be supported by Connotea and remove the need for their sites to be reverse engineered. The publisher just puts in a link through to an Atom doc with the relevant data in it, so that the citation can be easily retrieved.

Blogs already do autodiscovery of ATOM feeds, so can test idea using a citation source plug-in for a blog. It works, so can treat any source as a citation, but only whilst the post is still in the RSS feed.

Another wish

Citation microformat. Connotea would work really well with a citation microformat, so is going to look into that.

Summary

How to do URL to metadata

– manual entry

– scraping the page

– recognise and extract some ID, Connotea does that, but it doesn’t scale to the whole web.

– follow a metadata link from page, this is the blog plug-in

– parse the page directly, not possible yet.

Useful not just for Nature as publishers of data, but also anyone else who wants to be discoverable and bookmarkable.

Nature blog about this, Nascent.

Xtech 2006: Tom Coates – Native to a Web of Data: Designing a part of the Aggregate Web

This is a developed version of the talk Tom gave first at the Carson Summit on the Future of Web Apps. So now you can compare and contrast, and maybe draw the conclusion that either I am typing more slowly these days, or he just talked faster.

Was working at the BBC, now at Yahoo! Only been there a few months so what he’s saying is not corporate policy. [Does everyone who leaves the BBC go to work for Yahoo?]

Paul’s presentation was a little bit ‘sad puppy’ but mine is going to be more chichi. Go to bingo.scrumjax.com for buzzword bingo.

I’m going to be talking about design of W2.0. When people think about design they think rounded corners, gradient fills – like Rollyo, Chatsum. Now you have rounded corners and aqua effects, like FeedRinse. All started with Blogger and the Adaptive Path group.

Could talk for hours about this, the new tools at our disposal, about how the Mac or OmniGraffle change the way people design. But going to talk about products and how they fit into the web. The web is gestalt.

What is the web changing into?

What can you or should you build on top of it?

Architectural stuff

Web of data, W2.0 buzzwords – lots of different things going on at the design, interface and server levels, and in social dynamics. Too much going on underneath it to stand as a term, but W2.0 is condensing as a term. These buzzwords are an attempt to make sense of things; there are a lot of changes and innovations, and I’m going to concentrate on one element: the move into a web of data, reuse, etc.

The web is becoming an aggregate web of connected data sources and services. “A web of data sources, services for exploring and manipulating data, and ways that users can connect them together.”

Mashups are pilot fish for the web. By themselves, not that interesting. But they are a step on the way to what’s coming.

E.g. Astronewsology. Take Yahoo! News and star signs, so you can see what news happens to Capricorns. Then compare to predictions. Fact check the news with the deep, important spiritual nature of the universe.

Makes two sets of data explorable by each other, put together by an axis of time.

Network effect of services.

– every new service can build on top of every other existing service. the web becomes a true platform.

– every service and piece of data that’s added to the web makes every other service potentially more powerful.

These things hook together and work together so powerfully that it all just accelerates.

Consequences

– massive creative possibilities

– accelerating innovation

– increasingly competitive services

– increasing specialisation

API-ish thing is a hippy dream… but there is money to be made. Why would a company do this?

– Use APIs to drive people to your stuff. Amazon, eBay. Make it easier for people to find and discover your stuff.

– Save yourself money, make service more attractive and useful with less central dev’t

– Use syndicated content as a platform, e.g. stick ads on maps, or target banner ads more precisely

– turn your API into a pay-for service

Allows the hippies and the money men to work together, and the presence of the money is good. The fact that they are part of this ecosystem is good.

If you are part of this ecosystem, you benefit from this acceleration. If you’re not, you’re part of a backwater.

What can I build that will make the whole web better? (A web of data, not of pages.) How can I add value to the aggregate web?

Data sources should be pretty much self-explanatory. Should be able to commercialise it, open it out, make money, benefit from the ecosystem around you. How can you help people use it?

If you’re in social software, how can you help people create, collect or annotate data?

There is a land grab going on for certain types of data sources. People want to be the definitive source. In some areas, there is opportunity to be the single source. In others, it’s about user aggregation, reaching critical mass, and turn that aggregated data into a service.

Services for exploring/manipulating data. You don’t need to own the data source to add value; you can provide people with tools to manipulate it.

Users, whether developers or whomever. Feedburner good at this. Slicing information together.

Now will look at the ways to build these things. Architectural principles.

Much of this stuff from Matt Biddulph’s Application of Weblike Design to Data: Designing Data for Reuse, which Tom worked on with Matt.

The web of data comprises these components.

– Data sources

– Standard ways of representing data

– Identifiers and URLs

– Mechanisms for distributing data

– Ways to interact with/enhance data

– Rights frameworks and financial

These are the core components that we have to get right for this web of data to emerge properly.

Want people to interrogate this a bit more, and think about what’s missing.

Ten principles.

1. Look to add value to the aggregate web of data.

2. Build for normal users, developers and machines. Users need something beautiful. Developers need something useful that they can build upon – show them the hooks, like consistent URLs. Machines need predictability. How can you automate stuff? E.g. tag spaces on Flickr can be queried automatically to get the photos thus tagged.

3. Start with explorable data, not pages. How are you going to represent that data? Designers think you need to start with user needs, but most user-needs stuff is based on knowing what the data is for to start with. Need to work out the best way to explore the data.

4. Identify your first order objects and make them addressable. What are the core concepts you are dealing with? First order objects are things like people, addresses, events, TV shows, whatever.

5. Correlate with external identifier schemes (or coin a new standard).

6. Use readable, reliable and hackable URLs.

– Should have a 1-1 correlation with the concept.

– Be a permanent reference to resources; use directories to represent hierarchy

– not reflect the underlying tech.

– reflect the structure of the data – e.g. tv schedules don’t reflect the tv show but the broadcast, so if you use the time/date when a show is broadcast, that doesn’t correlate to the show itself, it’s too breakable.

– be predictable, guessable, hackable.

– be as human readable as possible, but no more.

– be – or expose – identifiers, e.g. if you have an identifier for every item (e.g. IMDb film identifiers), it could be used by other services to relate to that film.

Good URLs are beautiful and a mark of design quality.
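[To make this concrete – my sketch, not Tom’s code, with made-up names – a first-order object like a TV show gets a readable, hierarchical, hackable path, while a broadcast-time CGI URL shows most of what to avoid:]

```python
# Readable, predictable, hackable URL for a first-order object (a TV show).
# The path identifies the show itself, not any particular broadcast, and
# doesn't leak the underlying tech. All names are invented for illustration.

def show_url(base, series_slug, episode_slug):
    """/programmes/<series>/<episode> -- directories represent the hierarchy."""
    return f"{base}/programmes/{series_slug}/{episode_slug}"

good = show_url("http://example.org", "the-archers", "episode-14601")
# Counter-example: keyed on broadcast time, exposing cgi-bin -- breakable.
bad = "http://example.org/cgi-bin/show.cgi?bid=20060522-1902"
print(good)  # hackable: trim the last segment to get the series page
```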

7. Build list views and batch manipulation interfaces

Three core types of page

– destination

– list-view

– manipulation interface, data handled in pages that are still addressable and linkable.

8. Create parallel data service using understood standards

9. Make your data as explorable as possible

10. Give everything an appropriate licence

– so people know how they can and can’t use it.

Are you talking about the semantic web? Yes and no. But it’s a web of dirty semantics – getting data marked up, describable by any means possible. The nice semantic stuff is cool, but use any way you can to get it done.