François Schiettecatte’s Blog

August 17, 2008

Pre-generated static files

Filed under: Scaling, Software Development — François Schiettecatte @ 3:26 pm

From “Strategy: Serve Pre-generated Static Files Instead Of Dynamic Pages” on “High Scalability“:

Pre-generating static files is an oldy but a goody, and as Thomas Brox Røst says, it’s probably an underused strategy today. At one time this was the dominate technique for structuring a web site. Then the age of dynamic web sites arrived and we spent all our time worrying how to make the database faster and add more caching to recover the speed we had lost in the transition from static to dynamic.

Indeed, serving plain files is really easy and very quick. I reminded me a question that was asked in the Google Maps session of the Google Scalability conference in 2007. The question was along the lines of “How much work does the web server do when serving browser requests?”, the answer to which was along the lines of “It all files and we can’t hand them over to the client fast enough, here they are, take ‘em, we’re done!”

August 9, 2008

Caching and system optimization

Filed under: Feedster, Scaling, Search — François Schiettecatte @ 1:41 pm

Greg Linden mentions an interesting paper out of Yahoo Research and presented at SIGIR 2008 “ResIn: A Combination of Results Caching and Index Pruning for High-performance Web Search Engines”.

The excerpt that Greg republishes in his post talks about how using a pruned index worked against having a cache on the main index since only singleton searches would reach the main index.

Which got me thinking about the index we had at Feedster. I actually implemented the search engine there, and managed its day to day running, so I had quite a bit to say on how it was organized.

The index was broken down into segments (shard if you like) of 1 million entries. The entries were added to the index segments in the order they were crawled, a new index segments being generated every 10 minutes, in other words we would be adding new entries to an index segments until we reached 1 million entries and then we would start a new index segments. We had only a limited number of machines on which to run the search engines, so we organized the overall index into a pipeline, new segments were added to the top and older segments were removed from the bottom and placed into an archive. While the overall index was quite big (about 750GB if you must know), we usually only had about 45 days’ worth of posts seachable, which was fine because our default sort order was reverse chronological and we could satisfy most search terms with that. The 45 days of indices (about 66 index segments) were spread across five machines and this whole setup was replicated for redundancy and load balancing. In front of those there were five search gateways which would handle searches coming from the front end. The search gateway would autoconfigure themselves, searching the local net for search engines, figuring out what was there and putting together a map of the index segments, presenting that as a single index to the front end. So search machines could go down and come back up and stuff would just work, and I did not need to tell the search gateways where the indices were located.

The search engines themselves would create local caches of search results for each index segment, any cache file older than seven days would be deleted to make space for new cache files. One feature I implemented in the cache was a switch which would allow me to cache either the whole search or components of the search. For example, searches could be restricted to a set of blogs, and that restriction was implemented as a search, so it made sense to cache both the search and the restriction separately so that they could be reused for other searches which contained either components.

In addition I implements automatic index pruning, where the search engine (and the search gateway) would know the order of the index, ie reverse chronological, and would search each index segment in turn until the search was satisfied. Users could also control this index pruning from within the search through search modifiers. In fact I implemented a long list of search modifiers most of which we did not document on the site for one reason or another.

At peak times the search engine was handling about 1.5 million searches a day, since the searches were spread across pairs of machines, each machine was handling 750,000 searches/day over an index of 14 million entries.

August 7, 2008

Memcached, again…

Filed under: Scaling, Software Development — François Schiettecatte @ 1:33 pm

The High Scalability blog has a good summary of a presentation given by Farhan Mashraqi of Fotolog.

As I have written before, I am really, really, really ambivalent about using memcached to cache data coming out of MySQL.

Fotolog has 51 instances of memcached on 21 servers with 175G in use and 254G available.

I can’t help but wonder how MySQL would perform if given 21 extra servers with all that memory.

They also mention MySQL’s cache, my advice on that is don’t even bother with it, it is worse than useless.

One place where I was interested in their use of memcached is in caching filesystems accessed via NFS. Again I have to ask whether they are solving the wrong issue. It is a bit like saying “my car is slow therefore I will buy a faster car to tow it with, and my slow car will now go faster.” The real solution is to see why your car is slow in the first place, and then trade it in for a faster one if need be.

Installers ahoy

Filed under: Scaling, Software Development — François Schiettecatte @ 1:16 pm

I am wrapping up work on the installer I started working on over the weekend.

The goal of the installer is to have a single script I can run on a pristine machine (or image), tell it what I wanted installed (crawler, api, indexer, etc…) and boom! five minutes later I have a fully installed, fully configured instance. Or course the best thing to do is to create an instance image which I can run in the cloud, but I need the installer to create the image. The exercise of building the installer is very good at getting a handle on which files, libraries and what-have-yous go with each sub-system. As a side-bar we did not have that at Feedster for the longest time, which was a bad mistake.

By the end of the weekend I decided that the best way to approach this is to assume that I was installing on a pristine machine (say a CentOS installation) and assume nothing, well assume that there is at least a network and a few tools like svn, gcc and make. So everything needs to checked for and, if needed, checked out and installed, right down to things like java and ant. This caused me to look again at the structure of my svn repository and effectuate a reorganization which helped a lot both from a code organization point of view and an install process.

August 4, 2008

Drizzling away

Filed under: Scaling, Software Development — François Schiettecatte @ 5:18 pm

I have been reading more and more about the Drizzle project these past few weeks and it looks like a very interesting project. The project itself started off from MySQL, lots of bits got ripped out and new bits are being added in. You can track progress on Planet MySQL and on the project home page.

Talking about the project home page, here is what it says:

A Lightweight SQL Database for Cloud and Web

The Drizzle project is building a database optimized for Cloud and Net applications. It is being designed for massive concurrency on modern multi-cpu/core architecture. The code is originally derived from MySQL.

While Jay Pipes does not think that this will ever make it out of the lab, I think there is a gap in the market for a lightweight, SQL-based, networked DBMS. At one end of the spectrum you have MySQL, a fairly complete and heavyweight RDBMS, further along you have SQLite and eventually BerkeleyDB from (Oracle, originally SleepyCat).

As an application grows and there is more and more data to manage, a switch has to be made from a monolithic database to a sharded database, which means that a lot of the work that was being done in the monolithic database server (referential integrity, joins, etc…) has to move to a middleware layer (this is documented ad nauseam so I am not going to expand on that.)

So if you are using MySQL in this scenario, you wind up not using 80% of the features that MySQL offers which just makes them overhead. Trouble is that at the other end of the spectrum (touched on above) there isn’t anything which does the 20% you need.

So I think (hope!) that this is where Drizzle is heading, because it really just makes sense.

Updated August 8th, 2008 - Drizzle is the subject of the current FLOSS Weekly podcast over on the TWIT network.

July 17, 2008

To normalize or not to normalize, that is the question

Filed under: Scaling, Software Development — François Schiettecatte @ 7:56 am

Jeff Atwood published a very interesting post “Maybe Normalizing Isn’t Normal” where he delves into whether you should normalize or denormalize. Be sure to check the comments, or this summary on the High Scalability blog if the number of comments (and tone in some cases) gives you a headache.

The post is very interesting but I took issue with this:

Both solutions have their pros and cons. So let me put the question to you: which is better — a normalized database, or a denormalized database?

Trick question! The answer is that it doesn’t matter! Until you have millions and millions of rows of data, that is. Everything is fast for small n. Even a modest PC by today’s standards — let’s say a dual-core box with 4 gigabytes of memory — will give you near-identical performance in either case for anything but the very largest of databases. Assuming your team can write reasonably well-tuned queries, of course.

While it is true that for small data sets there is no difference in performance whether you normalize you schema or not, it will make a huge difference once your data set grows. Adding to the fun is that changing your schema becomes more and more difficult as the data set grows.

Then things settle down:

First, a reality check. It’s partially an act of hubris to imagine your app as the next Flickr, YouTube, or Twitter. As Ted Dziuba so aptly said, scalability is not your problem, getting people to give a shit is. So when it comes to database design, do measure performance, but try to err heavily on the side of sane, simple design. Pick whatever database schema you feel is easiest to understand and work with on a daily basis. It doesn’t have to be all or nothing as I’ve pictured above; you can partially denormalize where it makes sense to do so, and stay fully normalized in other areas where it doesn’t.

A sane, simple design is a “good thing”, but you also need to plan for the future, you want a sane simple design which can evolve and scale.

Finally sanity is restored:

Pat Helland notes that people normalize because their professors told them to. I’m a bit more pragmatic; I think you should normalize when the data tells you to:

  1. Normalization makes sense to your team.
  2. Normalization provides better performance. (You’re automatically measuring all the queries that flow through your software, right?)
  3. Normalization prevents an onerous amount of duplication or avoids risk of synchronization problems that your problem domain or users are particularly sensitive to.
  4. Normalization allows you to write simpler queries and code.

In my experience (with Feedster amongst others), a heavily denomalized schema is easy to work with but simply does not scale well.

With my current project I took a different tack:

  • Normalize where it makes sense and group logical chunks of data together, even if it means having 1 to 1 relationships. From a performance point of view this means that you get and update the chunks you need rather than accessing tables with 50+ fields were 90% of the fields are null (don’t laugh, I have seen it happen).
  • Never ever ever join to get data, better to issue two simple queries rather than one join. With the caveat that this is born of experience with MySQL and large amounts of data (1/2 TB), even with indices performance can be unpredictable.
  • Sharding your data is pretty much the only way to scale, so design that in from the start.
  • Build a data access layer which hides the schema from the application.

I am sure there is more, but this is a start.

July 16, 2008

How many cores for innodb?

Filed under: Scaling — François Schiettecatte @ 1:00 pm

I came across two articles on Planet MySQL with some benchmarks on Innodb scalability on machines with multiple cores. The first one being “MySQL, Innodb, DBT2 Core Scalability Graphs” and the second one being “Innodb Multi-core Performance“.

What is interesting is that both seem to suggest that Innodb peaks around 8 cores, though I would be curious how many CPUs there actually were and what the memory bandwidth was. I guess what I am really saying is that I am curious where the bottleneck actually is.

July 8, 2008

Moving along

Filed under: General, Scaling, Software Development — François Schiettecatte @ 8:04 am

I have resisted posting anything about the current issues at Twitter, though I have complained about their service in the past, but this time it seems that people are starting to walk.

Two weeks ago TechCrunch carried a story about this with the following comment:

So why aren’t people screaming about the feature being gone? Because this time, they’re just heading over to Friendfeed to have those very same conversations. Friendfeed for most users was just a place to bookmarks all their activities on other social networks. Now, more and more, it’s a place that people start conversations. The early adopters got that a while ago. Now, the not so early adopters are using it as a Twitter replacement, too.

My feeling is that Twitter still has “critical mass”, but my gut tells me that there is a shift going on. I am hearing more and more stories about people tiring of the service outages and while there is a lot of understanding and patience over growing pains and the fact that it is a free service, the lack of reliability and constancy seems to be wearing people down. Migrations always start slowly and happen very quickly when “critical mass” has been achieved, some also call that an “inflection point.”

July 3, 2008

CAP, BASE & ACID

Filed under: Scaling, Software Development — François Schiettecatte @ 7:53 am

This post by Dan Pritchett on ACM Queue about “An ACID Alternative” is worth reading if you are into scaling databases.

Nothing really new if you read his “Adding Simplicity” weblog, but a good recap nonetheless.

June 22, 2008

Make it go faster!!

Filed under: Scaling, Software Development — François Schiettecatte @ 12:52 pm

I have been enjoying listening to the StackOverflow podcasts by Jeff Atwood and Joel Spolsky. As a developer it is always good to get different (and sometimes controversial perspectives on things).

The last podcast, their 10th episode, covered a wide range of topics, the last one of which was about Ruby and its suitability for enterprise usage. The show notes on this subject are:

On Ruby performance, scaling, “enterpriseyness” and whether or not this is even the right question to ask. Shouldn’t we be thinking of this in terms of the solution first, and the language as a side-effect of that?

Which I think is right on, you need to look at the task you want to accomplish and choose the right tool for the job. The problem is that some people get ‘wedded’ to a single language and use that for every problem they encounter, winding up with some very good implementation as well as some very bad ones as well.

I tend to approach the language choice using a variety of parameters which really boil down to the suitability of the language for the task, looking at how well the language deal with the problem being solved (such as Perl for text processing), scalability & performance, maintainability.

One comment Jeff Atwood made about scalability was about looking at the number of machines you need to scale up your operation. If you can only run one process per machine, scaling by an order of magnitude will be a lot more painful than if you can run 10 processes per machine, which I think is right on the money as it were. It maybe much more cost effective to spend more time upfront developing your app in a language such as C (or derivative) than putting something together quickly in Perl or Ruby and having to buy much more hardware later on.

« Newer PostsOlder Posts »

Blog at WordPress.com.