August 31, 2007 1 Comment
Last night I had a conversation with a colleague who is setting up a set of MySQL servers, he was specifically asking me about replication, so I came up with a bullet list of items. No great revelations and there is still a lot of work to be done to get some of these ideas going, but it is starting point:
- Masters and slaves should be the same machine types. There is a temptation to take your best machines and make them the master. You don’t want to do that because your slaves will become masters at some point.
- The slaves should only be used for replication, not for reading. This is important because putting any load on a slave might very well slow down replication which will cause problems if you need to turn the slave into a master and it is behind the (now defunct) master.
- MySQL permissions on the slave should be set to read-only until you need to use it as a master. This will prevent ‘bozo’ events where an errant application updates data on a slave.
- Create a system to promote a slave to a master automatically should the master die. Rebuilding a master can be very painful (been there, done that,) and it is much easier to create a new slave.
- Create a system to create a new slave from a master, MySQL has some support for that, but you may need to create your own solution.
- If your system is composed of shards/segments/slices, create a system to move them around as need be. It is not uncommon to put multiple shards/segments/slices on a single server to start with, but you need to be able to move them around easily as the quantity of data grows.
- Use the Google MySQL tools, this will ensure that data is copied to the slave before it is committed on the master.
- I would recommend using MySQL 5.0.x or 5.1.x. The latter has support for row based replication which is supposedly faster than statement based replication, though I don’t know how stable 5.1.x is, it was not stable when I last tested it in december last year, but I am sure that has changed since then.
- Make sure that you do not put too much data on the servers, this makes recovery hell (trust me I have been there).
- The most common error with slaves is duplicate keys, where you try to insert a new row with a key which already exists, less common is data disparity were the data is different for a specific key. Both things are usually due to ‘bozo’ programming but you need to be able to handle that.
- Finally make sure that you have a sensible numbering systems for the servers, the best idea is to derive the number from the IP address so you don’t even need to think about it.