June 24, 2007
Good article by Brian Moon about caching and the different ways you can cache stuff.
In particular I liked the distinction between pulled caching and pushed caching:
We have been using this method for a while in our ad serving software. We are now using it more and more. IMO, it's the most surefire way to handle increased load. Basically, you don’t have the pages of your web site make SQL requests to the live SQL data in the event no cache is found. That is what I call a pulled cache. Instead, you push the data from your primary database into some caching (or even another, optimized SQL server) for your web site to use. We are actually using MySQL Cluster for this purpose on our web site. The forward facing web site hits only the MySQL Cluster. If the data is not there, it's just not there. We have processes on our backend that gather data from our primary database, assemble it for presentation and populate the cluster. The queries that the web site uses to access the cluster are highly optimized. You could do the same with memcached, but memcached is volatile. With cluster, we have high availability and get about the same performance as we did with a fully cached page.
This is very important because the less stuff you need to pull together and build to render a page, the more efficiently your site will scale.
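To make the pulled versus pushed distinction concrete, here is a minimal Python sketch. It is not Brian's code: plain dicts stand in for the primary database and for the cache store (MySQL Cluster or memcached in his setup), and the names `PRIMARY_DB`, `render`, `PulledCache`, `PushedCache` and `populate` are all made up for illustration.

```python
from typing import Dict, Optional

# Hypothetical stand-in for the primary database.
PRIMARY_DB: Dict[int, Dict[str, str]] = {
    1: {"title": "Caching", "author": "brian"},
    2: {"title": "Scaling", "author": "brian"},
}


def render(row: Dict[str, str]) -> str:
    """Assemble a row into its presentation form (the expensive step)."""
    return f"{row['title']} by {row['author']}"


class PulledCache:
    """On a miss, the web request itself goes to the primary database."""

    def __init__(self, db: Dict[int, Dict[str, str]]) -> None:
        self.db = db
        self.store: Dict[int, str] = {}
        self.db_hits = 0  # counts how often a page request touched the database

    def get(self, key: int) -> Optional[str]:
        if key not in self.store:
            self.db_hits += 1
            row = self.db.get(key)
            if row is None:
                return None
            self.store[key] = render(row)
        return self.store[key]


class PushedCache:
    """The web request only reads the store; a miss is simply a miss."""

    def __init__(self) -> None:
        self.store: Dict[int, str] = {}

    def get(self, key: int) -> Optional[str]:
        return self.store.get(key)


def populate(db: Dict[int, Dict[str, str]], cache: PushedCache) -> None:
    """Backend job: gather data, assemble it, and push it into the store."""
    for key, row in db.items():
        cache.store[key] = render(row)
```

With the pulled cache, the first `get(1)` triggers a database read inside the page request. With the pushed cache, `get(1)` returns `None` until the backend job `populate(PRIMARY_DB, cache)` has run, and the page request never touches the primary database at all.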