I have been catching up on Daniel Tunkelang’s weblog and came across a write-up of the SIGIR 2009 presentation by Vanja Josifovski.

The crux of the presentation is that treating ads as documents in an IR system works well when you evaluate searches over the ad corpus.

What is interesting is that I spent a weekend testing just that when I was at Feedster. I pulled the ads from our ad provider, did a little bit of cleanup and built an index from them. I think there were about 80,000 ads, so this step was very quick, on the order of seconds.

I then ran sample searches on the index to see which ads were retrieved, and it seemed to work pretty well.

I also tested a scenario which took the text from posts in a weblog and ran that text as feedback over the ads index. By feedback I mean that there was no search query, just a ‘bag of terms’ used to rank the ads. Again this worked pretty well. I pulled a popular feed (which shall remain nameless) and got good ads that were very relevant to the individual posts in the weblog. Even using a large number of terms for feedback was not a problem: because the ads index was so small, searches ran on the order of milliseconds.
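
To make the ‘bag of terms’ idea concrete, here is a minimal sketch in Python. It is not the Feedster code. The AdIndex class, the tokenizer, the TF-IDF style weighting and the sample ads are all assumptions made for illustration, and a real system would most likely use a proper search engine rather than a hand-rolled index.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Hypothetical cleanup step: lowercase and keep alphanumeric runs only.
    return re.findall(r"[a-z0-9]+", text.lower())


class AdIndex:
    """Tiny in-memory inverted index over ad texts (illustrative only)."""

    def __init__(self, ads):
        self.ads = ads
        # term -> {ad_id: term frequency in that ad}
        self.postings = defaultdict(dict)
        for ad_id, text in enumerate(ads):
            for term, tf in Counter(tokenize(text)).items():
                self.postings[term][ad_id] = tf

    def idf(self, term):
        # Assumed weighting: rarer terms count more; unknown terms count zero.
        df = len(self.postings.get(term, ()))
        return math.log(1 + len(self.ads) / df) if df else 0.0

    def rank(self, text, top_k=3):
        # Treat the whole text (a query or an entire post) as a bag of terms
        # and accumulate a TF-IDF style score for every ad sharing a term.
        scores = defaultdict(float)
        for term, weight in Counter(tokenize(text)).items():
            idf = self.idf(term)
            for ad_id, tf in self.postings.get(term, {}).items():
                scores[ad_id] += weight * tf * idf * idf
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return [(self.ads[ad_id], score) for ad_id, score in ranked[:top_k]]


# Made-up sample data, standing in for the ad corpus and a weblog post.
ads = [
    "Cheap flights to Paris and Rome",
    "Running shoes on sale",
    "Marathon training plans for runners",
]
post = "I finally bought new running shoes for my marathon training"

index = AdIndex(ads)
results = index.rank(post)
for ad_text, score in results:
    print(f"{score:.2f}  {ad_text}")
```

The same rank method serves both cases described above: a short query and a full post are just bags of terms of different sizes. With an index this small, every term lookup is a dictionary access, which is why even long posts are cheap to score.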

Unfortunately we never used this, but it did show me that it could be done.


