Google performance tools – part quatre

I have written three posts (part 1, part 2 & part 3) about the Google performance tools, in which I claimed that I did not see much improvement from using the alternative memory allocator.

It turns out that my test was flawed. The test I ran was to create an in-memory index, a process which requires the allocation of lots of little bits of memory. Unfortunately the process runs as a single thread, which is not the use case the alternative memory allocator is supposed to address. It is supposed to address the case where there are lots of threads running, all allocating memory.


I set up and ran a different test. A search engine I have been playing around with here allows me to set it up as a number of processes, a number of threads, or both. So I took a chunk of data (2GB of text), created an index (2GB of index), and proceeded to run a large number of tests, setting the search engine up in two configurations: 20 instances with a single thread each, and one instance with 20 threads.
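For reference, swapping in the alternative allocator requires no recompile; tcmalloc can be interposed at launch via LD_PRELOAD. The binary name, flags, and library path below are hypothetical placeholders, and the exact path varies by distribution:

```shell
# Baseline: glibc malloc
./searchd --instances 20 --threads 1

# Same binary with tcmalloc interposed
LD_PRELOAD=/usr/lib/libtcmalloc.so ./searchd --instances 1 --threads 20
```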

I am not going to publish the raw data here, it is not all that interesting, but I will describe what I noticed.

When running the search engine as 20 instances with a single thread each, the Google alternative memory allocator actually delivers about 10% worse performance than the glibc memory allocator with the search cache turned off, but about 10% better performance with the search cache turned on.

When running the search engine as one instance with 20 threads, the results are quite different. The Google alternative memory allocator delivers about 10-15% better performance than the glibc memory allocator with the search cache turned off, and 25-30% better performance with the search cache turned on.

What is also interesting is the shape of the performance curve. When running 20 instances with a single thread each, throughput peaks at 4 concurrent clients (the machine sports dual Xeon processors with hyperthreading) at 520 searches/sec and then flattens out at 500 searches/sec. When running one instance with 20 threads, throughput peaks at 860 searches/sec and then drops, flattening out at 570 searches/sec.

So my results bear out that the Google alternative memory allocator delivers when used in a threaded application.


“Human Touch”

The NY Times has an article wondering if the ‘Human Touch’ will loosen Google’s grip on the search industry.

First the obvious comments. The press thrives on controversy; it makes for interesting news, which sells. There also seems to be this obsession with looking for the next Google. I see it in the press and among the VCs: everyone wants to find, and back, the next Google, thereby securing fame and/or fortune (most likely both.)

Now for the less obvious comments.

Matt Cutts (by way of John Battelle) verbalizes this one much more eloquently than I ever could, so go and read his article. The crux of his argument is that even though the article draws a contrast between algorithm based search engines (cold machines) and social based search engines (warm fuzzy humans), the former are built by humans and rely on data created and compiled by humans. I do know that many engineers (like me) pour their heart and soul into the systems they develop, so all systems have very human roots.

Google is very deeply ensconced in the market in ways which make it very difficult to dislodge in the short term. Adwords has deep roots all over the internet, their search works well enough, their applications are good enough, smart enough and da’gone’it people like them. Their share price strongly suggests that Wall Street is very confident that there is plenty of market share and revenue left for them to grow into.

Even if a Google ‘killer’ appeared on the scene, I don’t feel there is enough oxygen in the market for a newcomer to take them on. Even Microsoft, Yahoo and Ask, all of whom have deep pockets and roughly equal technology, are slowly being asphyxiated.

From a strategic point of view, an eye needs to be kept on the following:

  • There will be a Google ‘killer’ at some point, but not in the short-term, and probably not in the medium-term. The chink in their armor has yet to reveal itself.
  • Any Google ‘killer’ has to be much, much better than Google, simply because people feel that Google’s search engine is just ‘better’.
  • Google is not standing still, they will keep improving.
  • Google needs to pay attention to their market share, making sure that healthy competition exists in the market; otherwise the trust-busters will come calling. Google needs Microsoft, Yahoo, Ask, Amazon and eBay to keep competition healthy and keep everyone honest. This only helps the customer.

While I don’t feel there is currently much oxygen in the market for a Google ‘killer’, there is plenty of room for vertical search engines which combine web-based and user-generated content, provide tools to engage users such as blogs and forums, and include some sort of commercial component.

Don’t get me wrong, I like Google, I use their products every day because they are good. It has been a very interesting ten years since they came on the scene, and the next ten years promise to be an interesting ride. I would not miss it for the world.

Update: SearchEngineLand also has a post on this with additional links.