Relevance

This is a very interesting article on relevance by Elizabeth Van Couvering (Department of Media and Communications, London School of Economics).

The abstract reads:

In the face of rising controversy about search engine results—that they are too restrictive, too comprehensive, lacking in certain areas, over-represented in others—this article presents the results of in-depth interviews with search engine producers, examining their conceptions of search engine quality and the implications of those conceptions. Structuration theory suggests that the cultural schemas that frame these discourses of quality will be central in mobilizing resources for technological development. The evidence presented here suggests that resources in search engine development are overwhelmingly allocated on the basis of market factors or scientific/technological concerns. Fairness and representativeness, core elements of the journalists’ definition of quality media content, are not key determiners of search engine quality in the minds of search engine producers. Rather, alternative standards of quality, such as customer satisfaction and relevance, mean that tactics to silence or promote certain websites or site owners (such as blacklisting, whitelisting, and index “cleaning”) are seen as unproblematic.

The article does a very good job of illustrating the issues with defining relevance and concludes that it is not as cut and dried as one might believe, though we all knew that:

The research questions that began this article were first, how do search engine producers conceive of quality? and second, what are the implications of these conceptions of quality for the future development of search engines? The evidence from the interviews examined in this article suggests that search engine producers conceive of quality in two separate but interrelated ways. First, a quality search engine, from the producer’s perspective, has high customer satisfaction. This definition of search quality is embedded in a larger cultural schema that I have called the “market” schema, in which search engines are primarily conceived of as businesses. Second, a quality search engine produces very relevant responses to queries. Again, this definition of quality is related to the cultural schema that I have characterized as “science/technology.” Search engines from the science-technology point of view are primarily pieces of engineering.

The implications of these conceptions of quality are far reaching precisely because they are embedded in larger cultural schemas. Structuration theory emphasizes how cultural schemas and their associated norms guide the allocation of resources. This article has shown that in the case of search engines, several schemas are at work simultaneously. The schemas clearly in the ascendant—the dominant market schema and the science-technology schema—provide little scope to raise issues of public welfare, fairness, or bias. Instead, they emphasize profit, in the case of the market schema, or progress and efficiency, in the case of the science-technology schema, or defense, in the case of the war schema.

Van Couvering conducted extensive interviews and provides a few excerpts in the article. I wish there were more, because they are very interesting to read.

Mac Plus beats Athlon Dual Core

Admittedly this is a little outlandish, but this performance face-off between a Mac Plus and an Athlon Dual Core shows the former beating the latter on more than half the tests.

There is no symmetry at all between the tests, but it is fun to read. My first Mac was a Mac Plus, the first ‘platinum’ variety, tricked out with 4MB of RAM and a 20MB hard disc, so I got a chuckle from this article.

The article summarizes the test pretty well:

Is this to say that the Mac Plus is a better computer than the AMD? Of course not. The technological advancements of 21 years have placed modern PCs in a completely different league of varied capacities. But the “User Experience” has not changed much in two decades. Due to bloated code that has to incorporate hundreds of functions that average users don’t even know exist, let alone ever utilize, the software companies have weighed down our PCs to effectively neutralize their vast speed advantages. When we compare strictly common, everyday, basic user tasks between the Mac Plus and the AMD we find remarkable similarities in overall speed, thus it can be stated that for the majority of simple office uses, the massive advances in technology in the past two decades have brought zero advance in productivity.

YouTube on AppleTV

So we are now going to get YouTube on AppleTV.

This parallels Apple’s earlier move to bring podcasts to iTunes, which made managing podcasts easier for all of us (including me). Incidentally, that kind of convenience is what moves a technology from early adopters to the mainstream. It also allows Apple to leverage outside sources of content to make its products more appealing.

It is hard for me to get excited about this. I don’t watch YouTube at all; LonelyGirl15 got about 5 seconds of my attention before I clicked the close button on the window (I must be gettin’ old).

But I am not oblivious to the fact that lots and lots of people watch YouTube, and I am sure that this will draw more people to AppleTV which, if you believe the rumors, has not been selling all that well.

Personally I would like to see more content on iTunes. Where is HBO? Where is the BBC? Now that would be something I would get excited about.

Thriving on chaos

I came across this very interesting article by Dan Pritchett on chaos in systems.

Well worth reading if you build large systems.

Sowing the seed of one’s own demise

Bob Cringely penned “The Final Days of Google”, which was picked up by the Search Engine Journal.

Cringely’s basic argument is this: there are lots of smart people working at Google right now, and part of the Google policy is to allow people to spend 20% of their time on their own projects. Since there are so many people, there will be many projects, and because of the nature of the company most of these projects will never make it to the Google web site. So these people will eventually take their ideas elsewhere, potentially building a Google killer. That is somewhat abbreviated, so go read the article.

The Search Engine Journal disagrees but won’t reveal why, which rather dents the credibility of its article, relegating it, in a rather childish way, to an opinion piece with no backing.

I have a different take on the whole thing. I agree that people will be leaving Google to start their own thing. It makes sense that once your stock vests it is time to leave and do your own thing, whether that is angel investing, becoming a VC, or starting your own company. This has happened many times before, with AOL, with Microsoft, with eBay, with Yahoo; the list goes on.

And one of these new companies may be the one that takes on Google, but Cringely draws a cause and effect relationship which is not as definite as he makes out.

I think the real danger to Google is from within. Once the company gets settled and cosy, it will get slow and will have a harder time identifying new threats when they appear, or have a harder time reacting to them because of its own inertia. The key is to keep Google hungry and foolish, something which is very hard to do as a company gets larger.

iPhone stuff

A couple of interesting tidbits about the iPhone now that we are getting closer to release.

There is already a rumor that there are going to be different versions, a different case in fact. That would not be surprising. If we look at the iPod product line, there are three form factors, so I would expect that there would be different form factors for the iPhone.

The other rumor is that Google is already working on other applications for the iPhone. While I think this is a good idea, I would like to see the iPhone opened up to all developers, not just Google. I feel the iPhone’s adoption rate would be hampered if it were restricted to Apple and Google applications only, locking out other developers.

Google performance tools – part deux

Thinking about my previous post about the Google performance tools, and specifically about the alternative memory allocator, it struck me as odd that it would be a separate product.

When you think about it, if you are Google you have a boatload of machines (on the order of 500,000, and then some). So it would make perfect sense to build your own Linux distribution, putting in the stuff you want and ripping the cruft out. Remember a couple of years ago, when Google was hiring operating systems developers? The rumor was that they were going to release a Google Linux (let’s call it GoogleOS). Well, they did, for internal consumption only.

Google performance tools

Earlier today I came across the Google performance tools, which include an alternative memory allocator which is supposed to be fast (still testing it).

There is also a heap leak checker, a heap profiler and a CPU profiler.

At first glance it does not look like the CPU profiler offers much more than gprof except for a graphical output (I prefer text; old school, I know), and the heap leak checker looks similar to Valgrind.

But I will keep playing with these tools. It is nice to see that Google released them as open source, and having a range of tools to choose from can only be a good thing.

Sharing the Memories

Shared memory is one of those things that most applications don’t use but that can be very useful. I am only going to cover its implementation on Linux, how to size it, what its properties are and how to abuse it. What I am not going to cover are the “shm_*” functions, which allow applications to share memory between them. While these are implemented on top of shared memory, that is an entirely different subject and well covered online or in various books (I can recommend “Unix Network Programming – Volume 2” by Stevens).

Shared memory on Linux is implemented as a file system mounted on “/dev/shm”. By default this partition is set to half of available RAM when the operating system is installed. So a system with 4GB of RAM will have a shared memory partition of 2GB.
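To see how this plays out on a given machine, here is a minimal Python sketch that compares the size of “/dev/shm” with physical RAM. The function names are mine, and it assumes a Linux box where `os.sysconf` reports `SC_PHYS_PAGES`; treat it as a quick check rather than anything authoritative.

```python
import os
import shutil


def expected_default_shm_bytes(ram_bytes: int) -> int:
    """Half of physical RAM, the default size described above."""
    return ram_bytes // 2


def physical_ram_bytes() -> int:
    """Total physical RAM, from page size times number of physical pages."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def report(path: str = "/dev/shm") -> str:
    """Describe the size of the file system at `path` relative to RAM."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    ram = physical_ram_bytes()
    return (
        f"{path}: {usage.total} bytes total, {usage.used} used; "
        f"RAM: {ram} bytes; half of RAM: {expected_default_shm_bytes(ram)}"
    )


if __name__ == "__main__" and os.path.isdir("/dev/shm"):
    print(report())
```

If the partition has been resized (or you are inside a container), the total will not match the half-of-RAM figure, which is a handy way to spot a non-default setup.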

This can be easily changed in /etc/fstab as follows:


tmpfs /dev/shm tmpfs size=4G 0 0

All that needs to be done is to change the size option to whatever size you want your shared memory partition to be, and then unmount and remount the file system. Two things to note: this memory is not actually allocated until you use it, so a larger limit won’t have any impact on your system by itself; and whatever you store there has to fit in RAM plus swap, so don’t size the partition beyond that. I generally set my swap space to equal the amount of RAM available, though for special configurations I will set it to twice that.
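As a sanity check before remounting, it can help to work out what a given size option means in bytes. The sketch below is a hypothetical helper of my own, not part of any tool; it assumes the suffixes tmpfs accepts (k, m, g, and % of physical RAM) and pulls the size out of an fstab line like the one above.

```python
from typing import Optional

# Binary multipliers for the tmpfs size suffixes.
_SUFFIXES = {"k": 1024, "m": 1024**2, "g": 1024**3}


def parse_tmpfs_size(value: str, ram_bytes: int) -> int:
    """Convert a tmpfs size value such as '4G', '512m' or '50%' to bytes."""
    value = value.strip().lower()
    if value.endswith("%"):
        # A percentage is taken relative to physical RAM.
        return ram_bytes * int(value[:-1]) // 100
    if value and value[-1] in _SUFFIXES:
        return int(value[:-1]) * _SUFFIXES[value[-1]]
    return int(value)  # plain number of bytes


def shm_size_from_fstab_line(line: str, ram_bytes: int) -> Optional[int]:
    """Return the size option of a tmpfs fstab entry in bytes, or None."""
    fields = line.split()
    if len(fields) < 4 or fields[2] != "tmpfs":
        return None
    for option in fields[3].split(","):
        if option.startswith("size="):
            return parse_tmpfs_size(option[len("size="):], ram_bytes)
    return None  # no explicit size, so the default applies


if __name__ == "__main__":
    ram = 8 * 1024**3  # pretend the machine has 8GB of RAM
    print(shm_size_from_fstab_line("tmpfs /dev/shm tmpfs size=4G 0 0", ram))
```

Comparing the result against RAM plus swap gives a rough idea of whether the limit you are about to set is sensible.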

Given that the shared memory is presented as a file system, shouldn’t we be able to use it as such? Well, we can, and the best way I like to think about it is as a RAM disk. I can write stuff there and the files will be stored in RAM (not always; more on that later), and I can access that stuff at a later date.

There are two caveats though. The first is that this file system gets cleaned upon reboot, so don’t put anything there which you will want to access after the system is rebooted; use the regular file system for that. The second caveat is that this data has to co-exist with everything else the operating system is running, so you probably would not want to use all the space available if you want to maintain good performance. If the operating system needs space, it will not hesitate to write pages from shared memory out to swap (which is probably why the default file system size is set to half of available RAM).

All that being said, the shared memory partition can be very useful as a storage space for temporary files, or for static files which you need to access a lot. The operating system will do its best to keep that data in memory, giving you fast access to it.
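Here is a small Python sketch of the temporary-file use case. The helper names are made up for illustration, and it assumes “/dev/shm” is writable by the current user; where it is not, it falls back to the regular temporary directory.

```python
import os
import tempfile


def ram_backed_dir(preferred: str = "/dev/shm") -> str:
    """Return `preferred` if it is a writable directory, else the normal temp dir."""
    if os.path.isdir(preferred) and os.access(preferred, os.W_OK):
        return preferred
    return tempfile.gettempdir()


def roundtrip(payload: bytes) -> bytes:
    """Write `payload` to a scratch file, read it back, and let the file be removed."""
    with tempfile.NamedTemporaryFile(dir=ram_backed_dir(), prefix="scratch-") as handle:
        handle.write(payload)
        handle.flush()  # make sure the bytes are visible to the second open
        with open(handle.name, "rb") as reader:
            return reader.read()
    # The scratch file is deleted when the outer `with` block exits.


if __name__ == "__main__":
    print(ram_backed_dir())
    print(roundtrip(b"scratch data that does not need to survive a reboot"))
```

Because the file is deleted as soon as the block exits, nothing is left behind to eat into the space the rest of the system needs.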

Herculean effort

Quite the Herculean effort: Two A Day did a review of 300 search engines.

If that wasn’t enough, they also did a review of web apps.
