September 26, 2007
Elias Torres has a very interesting post about an investigation into compressing content stored in MySQL.
At Feedster, feed post content is compressed and stored in MySQL. When I was there, we were using MySQL 4.1.x, which did not support compression natively, so we had to roll our own.
We used zlib to compress the content, and stored the compressed version only if it was smaller than the original. This matters because some content, typically very short posts, compressed to a size larger than the original. So when we extracted the content, we had to check the first two bytes and decompress the content only if we found “\a120\a156” (bytes 120 and 156, or 0x78 0x9C in hex, the zlib header) at the start of the stored value. We stored all our content in utf-8, and “\a120\a156” is not valid utf-8, so we knew that we would not decompress content by mistake.
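The scheme can be sketched in a few lines of Python. This is only an illustration, not the code we ran at Feedster: the names `pack`, `unpack`, and `ZLIB_MAGIC` are made up for the example, and it assumes zlib compression level 6 (the usual default), which produces the 0x78 0x9C header.

```python
import zlib

# Bytes 120 and 156: the zlib header produced at compression level 6.
# Hypothetical constant name, used only in this sketch.
ZLIB_MAGIC = b"\x78\x9c"


def pack(text: str) -> bytes:
    """Return the bytes to store: compressed only if that saves space."""
    raw = text.encode("utf-8")
    compressed = zlib.compress(raw, 6)
    # Short content can grow when compressed, so keep the original then.
    return compressed if len(compressed) < len(raw) else raw


def unpack(blob: bytes) -> str:
    """Reverse pack(): decompress only when the zlib header is present."""
    if blob[:2] == ZLIB_MAGIC:
        blob = zlib.decompress(blob)
    return blob.decode("utf-8")
```

The check in `unpack` is safe because byte 0x9C can only appear in utf-8 as a continuation byte after a multi-byte lead byte, and 0x78 (the letter "x") is not one, so no valid utf-8 text starts with those two bytes.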
The decompression was done by whatever client accessed the data (an API in our case), and we generally found that the overhead was not onerous.