What’s the difference between user generated content and user generated rubbish? Comments please…

Some user generated content (UGC) is genuine, honest, credible, reputable, trustworthy, valuable, quality information. But some is rubbish (let’s call that UGR), including deliberately misleading propaganda, biased blog comments, bogus product reviews, spam, veiled advertising, and bad poetry (or is it just my blog that attracts poetry bots?)

Google’s PageRank algorithm does a good job of measuring the quality of a simple web page, based on the number of incoming links to that page, recursively weighted by the quality of the linking pages. However, Web 2.0 has given us blogs, wikis, forums, media sharing, customer product reviews and ratings, social bookmarking, and more recently aggregation of all of the above. The result is web pages that contain an increasingly complex array of UGC and UGR, making it increasingly difficult for algorithms, site visitors and site owners to filter the signal from the noise, the UGC from the UGR.
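For readers who like to see the recursion spelled out, here is a minimal sketch of the textbook PageRank iteration. It is not Google's production algorithm; the damping factor of 0.85, the fixed iteration count, the function name and the four-page link graph are all assumptions made up for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by repeated redistribution of rank.

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative defaults, not tuned values.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the small "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split evenly across its links,
                # so a link from a high-ranked page is worth more.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over everyone.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graph: a, b and d all link to c; c links back to a.
links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
```

In this toy graph, page c ends up on top because it has the most incoming links, and page a comes second purely because its single incoming link is from the highly ranked c. That second effect is the recursive weighting.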

So I wanted to write a post about some of the emerging technology innovations attempting to solve this problem. Readers are kindly asked to add a comment at the bottom of the post. All comments will be shown, even bad poetry, for purposes of research and experimentation.

Measuring quality is relatively easy for eBay. Its Feedback Ratings provide an excellent indicator of trustworthiness, because online auctions involve measurable user actions such as ‘Was the product description accurate?’ and ‘Did the buyer pay up?’ Such actions speak louder than the mere words of a blog comment or product review.

Amazon now owns a valuable database of customer product reviews to help people through their purchasing decisions. Innovation by Amazon in this area has included the ability to provide feedback on the usefulness of other users’ comments, and a Reviewer Rank algorithm which provides a measure of reviewer quality (interestingly, this algorithm was recently improved to include some PageRank-like recursiveness).

In a past life I had the pleasure of working for Lonely Planet, a travel publisher whose credibility and quality have been built upon the independence of its authors and their unbiased travel reviews. Lonely Planet and its peers have long struggled with the opportunity to harvest UGC from loyal and passionate travelers, because it is just so difficult to measure the independence and quality of contributing users.

TripAdvisor was allowed to emerge as a disruptive force in the market for travel advice, allowing anybody to review any hotel or restaurant. That created a lot of quality content for a while, but ever since hotel owners found out about TripAdvisor and began to review their own hotels, it’s been difficult to tell the UGC and UGR apart. TripAdvisor still desperately needs a reliable measure of user generated quality to restore its credibility.

Perhaps social networking can help TripAdvisor; being able to filter your travel advice to that written only by your friends would eliminate biased reviews (unless you are friends with a bunch of hotel owners, in which case you’re probably going to stay in their hotel anyway). But until the internet settles on a standard for social data portability, not many of us will have enough online friends who have traveled enough and generated enough online travel content for such a social filter to work reliably, even allowing for recursive algorithms.

If it’s just travel advice and inspiration you’re looking for, you could wait for Lonely Planet’s upcoming blog syndication feature, which promises a novel solution to the problem.

But more generally, I think we all need a universal reputation system, one which aggregates lots of measures of quality from lots of different sites. Imagine if you could easily see a summary of my quality metrics from eBay and Amazon and Yahoo Answers and LinkedIn Answers and GetSatisfaction, perhaps even my Bugzilla and Basecamp metrics too; would that be enough for you to trust my travel advice and any other content that I generate?
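To make the idea a little more concrete, here is one way such an aggregation might look: rescale each site's score to a 0-to-1 range and take a weighted average. This is only a sketch under stated assumptions. None of these sites exposes its data in this form, and the function name, the scales, the weights and the example numbers are all hypothetical.

```python
def universal_reputation(scores, weights=None):
    """Combine per-site scores into one number between 0 and 1.

    scores maps a site name to (value, scale_max), for example (4.0, 5.0)
    for four stars out of five. weights optionally maps a site name to how
    much that site should count; sites without a weight count as 1.0.
    """
    if not scores:
        raise ValueError("at least one site score is needed")
    weights = weights or {}

    weighted_sum = 0.0
    total_weight = 0.0
    for site, (value, scale_max) in scores.items():
        weight = weights.get(site, 1.0)
        # Rescale so a percentage and a star rating become comparable.
        weighted_sum += weight * (value / scale_max)
        total_weight += weight

    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return weighted_sum / total_weight


# Invented example data, not real metrics from any site.
my_scores = {
    "ebay": (99.0, 100.0),         # percent positive feedback
    "amazon": (4.0, 5.0),          # average helpfulness, out of 5
    "yahoo_answers": (3.0, 10.0),  # some 10-point answer score
}
equal_mix = universal_reputation(my_scores)
trust_actions_more = universal_reputation(my_scores, weights={"ebay": 2.0})
```

The weights are where the real argument lies: counting eBay double, as in the second call, encodes the view that measurable actions deserve more trust than mere words.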

Site visitors would benefit from increased visibility of users who generate content. Genuine contributors would be encouraged by being able to build a universal reputation for quality UGC, and discouraged from creating UGR by the risk to that reputation. And site owners would benefit from data to filter the UGC from the UGR.

A universal reputation system could also help to eliminate online vote rigging, astro-turfing (all those reviews of iPhone apps posted by the developers themselves), and space-faking (setting up false identities on social networking sites).

Who are the players?

  • SezWho: SezWho provides a plugin for blog commentary which presents a useful summary of UGC history for each contributor, and allows customizable 5-point rating scales for site owners.
  • Intense Debate: Intense Debate has a great interface design. It has recently been acquired by Automattic, the owners of the WordPress blogging platform, which will provide some valuable distribution, perhaps critical mass. But will the other blogging platforms want to adopt or integrate with a standard controlled by a competitor?
  • Google Friend Connect: Google Friend Connect allows any site to embed a comments or ratings gadget on any page. The universal view of previous UGC is not there yet; however, this will become powerful when fully integrated with Google’s other stuff: Blogger, SearchWiki, the Social Graph API and YouTube (arguably the site most in need of a UGR filter!).
  • Disqus: Disqus is getting lots of press for its prompt Facebook Connect integration, which takes the hassle out of commenting. Video comments can be posted, powered by Seesmic. Readers can nudge comments up and down the list by voting on them. Try it out below.

If you have a view on who will win the race to become the universal reputation system, please comment below. Are there any other players that I have missed? (Yes, I know that is exposing me to some comments on the quality of this post!)

Also, here are some further questions to inspire some commentary:

  • Should we settle on a word for what is being measured here? Quality, importance, value, trust, reputation, credibility, honesty, transparency? Or will the winner of the race provide a Web 2.0 brand name to describe this concept of a universal measure of user generated content?
  • Is it even possible to determine an objective universal score? The success of PageRank would suggest yes. Or is quality in the eye of the beholder? Is one person’s signal another person’s noise?
  • Would a universal metric destroy the democratic level playing field that is UGC / UGR?
  • What are the consequences of such a universal reputation system being gamed?
  • How likely are eBay and Amazon to open up their reputation data? What are the privacy implications?

Thoughts please. Don’t be shy!
