Friday, 21 August 2009

URL Shortening

I’ve been watching the success of Twitter for a number of years now: its viral growth as a communication tool is amazing, and it’s great fun if you follow good citizens of the Twittersphere.

However, we’ve all watched the problem that emerges when sharing a URL. Twitter’s SMS support, designed into the conversation from the start, means a message can be transmitted over the SMS network, but it is limited to the 160 characters we have come to know and ‘love’. Twitter keeps 20 characters back for the Twitter username, leaving 140 for the message itself.

And, as all Twitter users know, this limit means sharing a link to a page often results in the link being passed through a shortening service.
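To make the character budget concrete, here is a rough sketch in Python; the long URL and the shortened one are made-up examples, not real links.

```python
# Rough illustration of the character budget that pushes people to shorteners.
# The URLs below are invented for the example.
SMS_LIMIT = 160
USERNAME_RESERVE = 20                          # kept back by Twitter for the username
MESSAGE_LIMIT = SMS_LIMIT - USERNAME_RESERVE   # 140 characters for the message

long_url = "http://example.com/2009/08/21/some-really-long-article-title-goes-here"
short_url = "http://tinyurl.com/xxxxxxx"       # typical length of a shortened link

print(MESSAGE_LIMIT - len(long_url))    # roughly 70 characters left for the actual message
print(MESSAGE_LIMIT - len(short_url))   # roughly 114 characters left
```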

Many people have blogged about the longer-term effect this has on the intrinsic hypertext linkage between pages. I posted a few comments on FriendFeed about this, I read a post from Dave Winer, “Enough with shortened URL’s”, and Jeff Atwood also discussed it in “URL Shorteners: Destroying the web since 2002”. Dave’s and Jeff’s posts are both good.

Each of these posts ends with a broadly similar summary of the problems:

  • What happens when the URL shortening service dies?
  • How do we know we can trust the URL shortening service? What if it’s hacked and we are taken to lovely virus-ridden pages (wait for this to reach the news, it’s coming someday soon)?
  • How does short linking affect Google PageRank in the longer term?

And each comes to a similar conclusion: Twitter should have built this feature into the platform itself, to solve the problem the platform introduced.

I’d agree here: Twitter introduced this problem, so Twitter should solve it. But the trend is actually expanding; Facebook is available via an SMS gateway too.

In his post, Jeff suggests that the big search engines should provide this service:

Personally, I'd prefer to see the big, objective search engines who naturally sit at the centre of the web offer their own URL shortening services. Who better to generate short hashes of every possible URL than the companies who already have cached copies of every URL on the internet, anyway?

I think this is an entirely plausible approach and one which would seem to sit well within their domain.

We can likely all agree there is a trend toward more mobile users carrying more powerful mobile devices. These devices are typically mobile phones, with the SMS gateway built right in to receive the shortened URL. And with the browser experience on phones improving through Safari on the iPhone and the Opera mobile browser, this trend is likely to continue, with more and more usage of SMS.

I have seen several proposals to solve parts of the problem.

Each of these approaches needs both a change on the host pages and a service to resolve the address. I don’t think we can get away from that.

Personally, I think the problem can be solved in another, slightly more universal way, without bias toward any company’s profit or profile. And I think it can be done while maintaining a way to pull in all the legacy shortened URLs and preserve the historical page linkage over time.

I use OpenDNS to resolve DNS in place of my own ISP’s service, because OpenDNS is faster, more customisable, and slightly more robust in protecting me and my family from bad URLs.

OpenDNS has a feature where a user can enter their own shortcut and tell OpenDNS which URL to resolve it to.

[Image: the OpenDNS shortcuts feature]

Now this strikes me as a great little feature. Using this service, I can set up my own little shortcuts, point my desktops and servers at home to resolve using the OpenDNS service (in fact just via my single router at home), and bingo: all the users at home can type ‘mail’ into the browser address bar and have OpenDNS resolve it to http://mail.google.com.
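To make the idea concrete, here is a tiny conceptual sketch in Python of what such a shortcut table boils down to; it is not how OpenDNS actually implements the feature, and the ‘news’ entry is an invented example.

```python
# Conceptual sketch only -- not how OpenDNS actually implements shortcuts.
# The idea: a per-network table maps a bare name typed into the address bar
# to a full URL, and the resolver answers with that mapping.
shortcuts = {
    "mail": "http://mail.google.com",   # the example from above
    "news": "http://news.bbc.co.uk",    # an invented extra entry
}

def resolve_shortcut(name):
    """Return the full URL for a shortcut, or None if it isn't defined."""
    return shortcuts.get(name.lower())

print(resolve_shortcut("mail"))   # http://mail.google.com
```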

Brilliant.

Now my proposal is simply an extension of this.

Why can’t we have a service similar to OpenDNS, where the shortened URL comes from OpenDNS storing the resolution and providing the shortcut? OpenDNS could replicate this shortened URL if you decided to ‘share’ it, and public voting on the safety of the URL could help ensure the public URLs are valid and safe. Enough bad reports and the link gets blacklisted on the service.
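As a rough illustration of how such a service might mint short names, here is a sketch in Python; the domain name, code length, and hashing scheme are all assumptions for the example, not anything OpenDNS offers today.

```python
# A sketch of one way a DNS-backed shortener could mint codes: hash the
# target URL and keep a few characters, so the same URL always maps to the
# same shortcut. The domain and code length are illustrative assumptions.
import hashlib

SHORTENER_DOMAIN = "sho.rt"   # hypothetical domain operated by the service

def short_code(url, length=7):
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()
    return digest[:length]

def short_hostname(url):
    # e.g. "3f2a9c1.sho.rt" -- the service would store the mapping back to
    # the full URL and answer lookups for this name.
    return "%s.%s" % (short_code(url), SHORTENER_DOMAIN)

print(short_hostname("http://example.com/a/very/long/article-title"))
```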

DNS supports replication of such resolution data, so why can’t the service be extended to support this feature?
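On the lookup side, a client could, for example, fetch the target from a DNS TXT record. The sketch below uses the third-party dnspython package; the record name and layout are assumptions, since no such service exists yet.

```python
# Sketch of how a client might resolve a short link if the service published
# the target URL as a DNS TXT record. Uses the third-party dnspython package;
# the record layout here is an assumption, not an existing service.
import dns.resolver

def lookup_long_url(short_host):
    """Fetch the TXT record for a short hostname and treat it as the target URL."""
    answers = dns.resolver.resolve(short_host, "TXT")
    for rdata in answers:
        # TXT data comes back as one or more byte strings; join them up.
        return b"".join(rdata.strings).decode("utf-8")
    return None

# Hypothetical usage -- this name does not really exist:
# print(lookup_long_url("3f2a9c1.sho.rt"))
```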

The problems I see with this, which I am fairly sure are solvable, are:

  • Can DNS cope with the increase in traffic? (Hint: I think yes, because DNS is cacheable and distributed.)
  • Can blacklisting work? (Again, I think yes. Look at how successful public voting is on StackOverflow.com against posts, or on PhishTank.com. We could have public voting on a URL’s safety: enough upvotes and the URL is set in stone. There’s a small sketch of the idea after this list.)
  • Will the URLs decay over time? I think DNS should be able to remove these short URLs when the target domain disappears, so yes, they should decay over time.
  • How do we handle incorrectly entered URLs? I don’t think we have to bother; just generate the correct one again.
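Here is the small sketch of the voting idea mentioned in the list above; the thresholds and the in-memory tally are illustrative assumptions only.

```python
# A toy sketch of the voting idea: tally safety reports per short code and
# blacklist once bad reports cross a threshold. Thresholds are assumptions.
from collections import defaultdict

BAD_REPORT_THRESHOLD = 25     # assumed cut-off for blacklisting
SAFE_VOTE_THRESHOLD = 100     # assumed cut-off for "set in stone"

votes = defaultdict(lambda: {"safe": 0, "bad": 0})

def report(short_code, is_safe):
    votes[short_code]["safe" if is_safe else "bad"] += 1

def status(short_code):
    v = votes[short_code]
    if v["bad"] >= BAD_REPORT_THRESHOLD:
        return "blacklisted"
    if v["safe"] >= SAFE_VOTE_THRESHOLD:
        return "pinned"        # trusted enough to keep permanently
    return "pending"

report("3f2a9c1", is_safe=True)
print(status("3f2a9c1"))       # pending
```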

I don’t have the network of contacts to make this happen, or the technical background in the guts of DNS, but I certainly think this requires less work and is more Internet-friendly than relying on a search provider to maintain this type of service.

If Google takes up this task, we end up in a place where Google holds all the keys to the web. And that’s not good for the long-term survival of the Internet.

The entire Internet is essentially one click away from a Google search result. The effects of Google failing at this stage would be very, very bad.

Imagine how much worse it would be if Google failed and took with it all of the links on all of the pages on the Internet…

At least with this approach we are left with a distributed, open network of linkage that we can work with, and one that should outlast any single commercial company.

So, over to the smart folk who can make this stuff work.

What do we need to do to make this happen?

(you should follow me here on Twitter)