Wednesday, April 26, 2006

Contextual Carrier-Waves in Communication - where AdSense fails

In Gmail just now, I was replying to an email inviting me to a college reunion for the tenth anniversary of our graduation. It was a chain of six fairly chunky emails, and the word "anniversary" was mentioned just twice in the whole conversation.

However, all the AdSense-driven adverts down the side were completely inappropriate - it seemed that AdSense had pounced on the word "anniversary" and decided it was the most important theme of the content.

To give a couple of examples, the contextual ads included such gems as:
  • "Inscribed Stone Sundials - An inscribed stone sundial makes a perfect gift for a special occasion"
  • "Give him a framed share of Playboy stock + others for your Anniversary"
  • "Original and Genuine Newspapers Ideal Birthday or Anniversary Gift!"

This is a shining example of how all the complicated semantic processing algorithms in the world are no substitute for actual context.

AdSense is usually pretty good at serving relevant ads on blog posts and webpages in general, but it seems a good deal less accurate in Gmail. I would have expected far more ads related to college and graduation, or even parties or Durham - as these terms were much more prevalent in the actual text.
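To make the mismatch concrete, here is a minimal sketch in Python of two ways a naive keyword matcher might rank topics. Everything in it is an assumption for illustration: the toy thread text, the made-up per-keyword advertiser values and the function names are all invented, and nothing here reflects how AdSense actually works. It only shows that ranking by how often a term appears can give a very different answer from ranking by how much a term might be worth to advertisers.

```python
import re
from collections import Counter

# Toy stand-in for the email thread; the wording is invented for illustration.
THREAD = (
    "College reunion in Durham for everyone from our college year. "
    "Ten years since graduation, so a graduation party in Durham. "
    "It is the tenth anniversary of graduation; an anniversary party at college."
)

# Hypothetical advertiser "value" per keyword (made-up numbers, not real bids).
KEYWORD_VALUE = {
    "anniversary": 10.0,
    "college": 1.0,
    "graduation": 1.0,
    "party": 1.0,
    "durham": 0.5,
}


def term_counts(text):
    """Count how often each known keyword appears in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w in KEYWORD_VALUE)


def rank_by_frequency(text):
    """Rank keywords by raw frequency (ties broken alphabetically)."""
    counts = term_counts(text)
    return sorted(counts, key=lambda w: (-counts[w], w))


def rank_by_weighted_value(text):
    """Rank keywords by frequency multiplied by the hypothetical value."""
    counts = term_counts(text)
    return sorted(counts, key=lambda w: (-counts[w] * KEYWORD_VALUE[w], w))


if __name__ == "__main__":
    print("By frequency:     ", rank_by_frequency(THREAD))
    print("By weighted value:", rank_by_weighted_value(THREAD))
```

In this toy thread "anniversary" appears only twice, yet it comes out on top as soon as a high enough value is attached to it, while plain frequency puts "college" first.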

Thinking about it a bit more, I realised that there's a deeper issue, which is probably inherent in the nature of the communication mechanism.

Blog posts and web content in general are usually designed to be "broadcast" communications - one-to-many. Each page, or "message", is - or at least, should be - designed to stand on its own, and provide all the context required to interpret the communication in and of itself. When you are broadcasting to a potentially huge number of people, you must be clear, concise, and comprehensive.

Email, on the other hand, is generally one-to-one communication, or at most, one-to-a-small-group - certainly under the famous Dunbar Number. In this case, there is a lot of implied historical / social context in the communication:

  • You usually have some sort of existing relationship with the person you are emailing.
  • If you do not have a historical context (e.g. as with a job application) then the communication is usually much more formal, as it must provide all relevant context in itself.
  • If you are, or have been, part of the same social circle as the recipient, then that social context itself implies both a mode of communication and a certain terminology, plus a whole frame of reference in terms of shared experiences, etc.


In other words, the "signal" (message) rides on this contextual carrier-wave, and this enables the message itself to be much shorter than would otherwise be the case.

By contrast, a stand-alone message such as a blog post or whitepaper must carry itself - the context must be set by the message itself, to enable the random sampling of information that is the power of the internet as a medium.

So when the large amount of contextual information inherent in one-to-one conversations such as email is missing from the text itself, a contextual semantic detection system such as AdSense is always going to be limited in its effectiveness.

...Unless, of course, it takes the next step of deriving the context from the history of messages to and from that particular contact. It would be interesting to see how well such a system would fare, though, given the Orwellian fears it would be bound to stir up. Looking ahead to Web 2.1 or even Web 3.0, I feel that one of the key battlegrounds is going to be the balancing of functionality with privacy.

After all, Google are already starting to skirt the borders of what's considered acceptable, and whoever manages to get it "right" stands to do very very nicely indeed...
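The idea of deriving context from the history of messages with a particular contact could be sketched roughly as follows. This is only a toy under stated assumptions: the `ContactContext` class, the tiny stopword list and the `history_weight` parameter are all hypothetical names invented here, not anything Google has described. It simply adds term counts from earlier messages with the same contact to the counts from the current message before picking the top terms.

```python
import re
from collections import Counter, defaultdict

# Tiny illustrative stopword list (an assumption; a real system would need far more).
STOPWORDS = {"the", "a", "an", "of", "to", "for", "in", "is", "it",
             "our", "and", "at", "you", "are"}


def tokens(text):
    """Lowercase the text and return its words, minus stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


class ContactContext:
    """Hypothetical store of term counts per contact, built from past messages."""

    def __init__(self):
        self._history = defaultdict(Counter)

    def record(self, contact, message):
        """Add a past message's terms to the history for this contact."""
        self._history[contact].update(tokens(message))

    def top_terms(self, contact, message, n=3, history_weight=1.0):
        """Score terms from the current message plus weighted history terms."""
        scores = Counter(tokens(message))
        for word, count in self._history.get(contact, Counter()).items():
            scores[word] += history_weight * count
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        return ranked[:n]


if __name__ == "__main__":
    ctx = ContactContext()
    ctx.record("phil", "Are you coming to the college reunion in Durham?")
    ctx.record("phil", "The college bar in Durham is booked for the reunion.")
    print(ctx.top_terms("phil", "See you at the anniversary"))
    print(ctx.top_terms("stranger", "See you at the anniversary"))
```

With the shared history in place, a short message that only mentions the anniversary is still ranked as being about college, Durham and the reunion; for a contact with no history, the message has to carry itself. It also shows why such a system would stir up privacy worries, since it only works by keeping a record of what you have said to whom.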

3 comments:

Phil said...

Don't blame adsense just because you don't know it's called a "reunion!" :-P

Alistair Davidson said...

err....sorry Phil, I don't follow?

Anonymous said...

I find that the google ads can get pretty funky when you first install the code on your pages but after an hour they level out and show relevant ads according to your keyword density.