To Shard or not to Shard

This post on (not) sharding over at 37 Signals is getting some airtime.

So rather than sharding, they figured that they would just throw more RAM at the database server and scale that way:

Our read performance is in some aspect being taken care of by the fact that you can get machines with 256GB RAM now. We upgraded the Basecamp database server from 32GB to 128GB RAM a while back and we thought that would be the end of it.

The box was maxed out and going beyond 128GB at the time was stupid expensive. But now there’s 256GB to be had at a reasonable price and I’m starting to think that by the time we reach that, there’ll be reasonably priced 512GB machines.

I have to admit that I did a bit of a double take on this, 128GB of RAM is a lot of memory. I can see why they would want to avoid sharding. Sharding introduces all sort of complications, for example getting all rows matching a criteria gets more complicated (you have to address all the shards), doing joins gets more complicated, referential integrity gets more complicated, etc… On the other hand by leaving all the data on a single machine, even if that machine has a lot of RAM, does not remove other bottlenecks such as memory bandwidth, CPU power (to an extent), so there are still going to be limits to how far you can go.

Admittedly sharding is hard and has limited tool support, and those tools which support it are primitive at best, but I think there is much more upside to sharding than downside.

That being said it is a choice that should be based on the requirements at hand. Some of the databases in the current system I am building are sharded and others are not.


Leave a Reply

Please log in using one of these methods to post your comment: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

%d bloggers like this: