MySQL replication

Last night I had a conversation with a colleague who is setting up a set of MySQL servers, he was specifically asking me about replication, so I came up with a bullet list of items. No great revelations and there is still a lot of work to be done to get some of these ideas going, but it is starting point:

  • Masters and slaves should be the same machine types. There is a temptation to take your best machines and make them the master. You don’t want to do that because your slaves will become masters at some point.
  • The slaves should only be used for replication, not for reading. This is important because putting any load on a slave might very well slow down replication which will cause problems if you need to turn the slave into a master and it is behind the (now defunct) master.
  • MySQL permissions on the slave should be set to read-only until you need to use it as a master. This will prevent ‘bozo’ events where an errant application updates data on a slave.
  • Create a system to promote a slave to a master automatically should the master die. Rebuilding a master can be very painful (been there, done that,) and it is much easier to create a new slave.
  • Create a system to create a new slave from a master, MySQL has some support for that, but you may need to create your own solution.
  • If your system is composed of shards/segments/slices, create a system to move them around as need be. It is not uncommon to put multiple shards/segments/slices on a single server to start with, but you need to be able to move them around easily as the quantity of data grows.
  • Use the Google MySQL tools, this will ensure that data is copied to the slave before it is committed on the master.
  • I would recommend using MySQL 5.0.x or 5.1.x. The latter has support for row based replication which is supposedly faster than statement based replication, though I don’t know how stable 5.1.x is, it was not stable when I last tested it in december last year, but I am sure that has changed since then.
  • Make sure that you do not put too much data on the servers, this makes recovery hell (trust me I have been there).
  • The most common error with slaves is duplicate keys, where you try to insert a new row with a key which already exists, less common is data disparity were the data is different for a specific key. Both things are usually due to ‘bozo’ programming but you need to be able to handle that.
  • Finally make sure that you have a sensible numbering systems for the servers, the best idea is to derive the number from the IP address so you don’t even need to think about it.

High Performance MySQL, Second Edition

I see that there is a new edition of the book “High Performance MySQL” in the works.

That is good to see and I hope that they add a lot to the book. Frankly that would not be difficult because the first edition sucked. I had been using MySQL for all of 6 months when I got it and I did not learn anything new.

On the other hand the book is useful if you don’t know anything about MySQL, but there are a number of better books out there than that one.

Google is a capitalist tool?

Two (1, 2) good articles in the Economist about Google.

Google will face more and more scrutiny as time goes by and will need to deal with it:

One obvious strategy is to allay concerns over Google’s trustworthiness by becoming more transparent and opening up more of its processes and plans to scrutiny. But it also needs a deeper change of heart. Pretending that, just because your founders are nice young men and you give away lots of services, society has no right to question your motives no longer seems sensible. Google is a capitalist tool—and a useful one. Better, surely, to face the coming storm on that foundation, than on a trite slogan that could be your undoing.

Tumblr on the UK Telegraph

The UK Telegraph has an interesting article on Tumblr:

Tumblelogs are the punk rock of blogging. They strip away all that prog-rock space jazz and focus on the content: short thoughts, quotes, photos, music, video clips and links. Unlike the verbose ramblings of most weblogs, where anything posted tends to be accompanied by several paragraphs of quotes, opinion and additional links, a tumblelogger just posts one thing at a time. An interesting photo or a single link. There might be one line of comment, but rarely more than that.

I have a tumblelog too.

OpenCoffee Boston

I went to the OpenCoffee Boston meeting this morning, met lots of interesting people, among them was Mark Doerschlag of Marks who runs the eponymous MarksGuide in the Boston area.

The guide is a calendar of professional networking events in the Boston area.

OpenCoffee Boston

I went to the OpenCoffee Boston meeting this morning, met lots of interesting people, among them was Mark Doerschlag of Marks who runs the eponymous MarksGuide in the Boston area.

The guide is a calendar of professional networking events in the Boston area.

iTunes and iPlayer

It looks like iTunes UK is finally getting some TV shows.

I am sure this will be seized by the various UK broadband providers as an excuse to complain about the capacity (or lack thereof) of their pipes.

But I digress…

I will be curious if this move by Apple will spur the BBC into rethinking its plans for the BBC iPlayer. This is the BBC’s solution to providing access to its (great!) content to consumers on the internet. Personally I think it is flawed for a number of reasons.

Think about it. The BBC is sitting on a wealth of great content with two options for distributing it. You can either build your own player with your own DRM, with all the distribution and support issues that entails, or you could enter a deal with a company that has a media delivery system installed on 300 million desktops, along with built-in DRM and a distribution infrastructure.

You have to wonder what they are thinking and what internal politics are being played out there.

But then again, I should not expect too much from a company whose source of income is a compulsory license that has to be renewed yearly by every TV owner in the country.