François Schiettecatte’s Blog

May 9, 2008

MySQL Forge

Filed under: Scaling, Software Development — François Schiettecatte @ 12:29 pm

I came across MySQL Forge a few days ago when Brian Moon was asking about sample my.cnf configuration files for MySQL. The samples provided in the distribution are very, very old, and more up-to-date ones are needed.

So I uploaded a copy of the my.cnf configuration file which I have used for database servers sporting 16GB of RAM. You can check out all the sample my.cnf files which have been uploaded here.

I have to admit mine is sparsely commented (I usually just point to the relevant place in the MySQL documentation), but it does the job.

Boston scalability user group meeting

Filed under: Scaling, Software Development — François Schiettecatte @ 12:16 pm

There is another Boston scalability user group meeting coming up on May 28th. I was not able to attend the last two but I think I will go to this one because it looks interesting:

Orion Letizi from Terracotta will conduct an interactive session exploring the open-source Terracotta project. Terracotta is useful in many different cases including use as Network Attached Memory for working with large datasets, clustering HTTP sessions, reducing database load, acting as a second-level cache for Hibernate objects and more. You can find out more about some of these use cases by visiting the Start Learning Terracotta page.

May 1, 2008

Interview with Donald Knuth

Filed under: Software Development — François Schiettecatte @ 7:52 am

InformIT has a very good interview with Donald Knuth (by way of Artima).

A couple of things resonated with me:

As to your real question, the idea of immediate compilation and “unit tests” appeals to me only rarely, when I’m feeling my way in a totally unknown environment and need feedback about what works and what doesn’t. Otherwise, lots of time is wasted on activities that I simply never need to perform or even think about. Nothing needs to be “mocked up.”

I understand the need for unit tests and I use them all the time; around half the code I have written so far on my current project is test code. But I spend most of my time working on the actual code, thinking through the structure, the flow and the data, and only a small amount of time on test code. I usually work on the big picture first and then focus on the details. I also spend a lot of time ‘refactoring’ (my definition of ‘refactoring’ is restructuring code because you have learned things that were not apparent when you started). I rarely jump into coding because more often than not it leads to dead-ends. Not that dead-ends are a bad thing, but you want to avoid them.

Another thing I do is not code. I spend about six hours a day coding on average, the rest of the time is spent thinking and reading other people’s code.

I also must confess to a strong bias against the fashion for reusable code. To me, “re-editable code” is much, much better than an untouchable black box or toolkit. I could go on and on about this. If you’re totally convinced that reusable code is wonderful, I probably won’t be able to sway you anyway, but you’ll never convince me that reusable code isn’t mostly a menace.

I am not completely sure what to make of this. I think the tension here is between the generic and the specific. You can build generic code which will work for lots of applications, but that code will never be optimized for any specific application, or you can build very specific code which will be highly optimized for a single application. I like the idea of reusable code mainly because I am lazy and want to get the best return on my investment; on the other hand, I have been known to spend days on very small chunks of code to make them perform as fast as possible.

Perhaps this falls in with the tenet that there is no such thing as portable code, only code that is ported.

I don’t want to duck your question entirely. I might as well flame a bit about my personal unhappiness with the current trend toward multicore architecture. To me, it looks more or less like the hardware designers have run out of ideas, and that they’re trying to pass the blame for the future demise of Moore’s Law to the software writers by giving us machines that work faster only on a few key benchmarks! I won’t be surprised at all if the whole multithreading idea turns out to be a flop, worse than the “Itanium” approach that was supposed to be so terrific, until it turned out that the wished-for compilers were basically impossible to write.

And this I took issue with. The reality of chip design is that we are hitting some very real limits with speed, size, power usage and heat dissipation. I think that multicore chips are a very good way to speed up processing within those limits. While it is true that most applications don’t parallelize well, and that writing parallelized code is hard, multicore chips work very well for multiprocessing, which is what most operating systems do these days.

April 25, 2008

MySQL engines, MyISAM vs. Innodb

Filed under: Feedster, Software Development — François Schiettecatte @ 1:20 pm

I think Narayan Newton does a very good job of summarizing the pros and cons of MyISAM and Innodb in this post “MySQL engines, MyISAM vs. Innodb”.

I have seen a lot written about this before, but I think his post neatly summarizes the arguments on both sides and is worth reading if you have to make a decision about this.

My reflex is to always use Innodb unless there is a compelling reason for using MyISAM, and it has to be really, really compelling.

I did take issue with one point he makes which he illustrates with an experience:

On the other hand, InnoDB is a largely ACID (Atomicity, Consistency, Isolation, Durability) engine, built to guarantee consistency and durability. It does this through a transaction log (with the option of a two-phase commit if you have the binary log enabled), a double-write buffer and automatic checksumming and checksum validation of database pages. These safety measures not only prevent corruption on “hard” shutdowns, but can even detect hardware failure (such as memory failure/corruption) and prevent damage to your data.

Drupal.org has made use of this feature of InnoDB as well. The database in question contains a large amount of user contributed content, cvs messages, cvs history, forum messages, comments and, more critically, the issue queues for the entire Drupal project. This is not data where corruption is an option. In 2007, the master database server for the project went down. After examining the logs, it became clear that it hadn’t crashed as such, but InnoDB had read a checksum from disk that didn’t match the checksum it had in memory. In this case, the checksum miss-match was a clear sign of memory corruption. Not only did it detect this, but it killed the MySQL daemon to prevent data corruption. In fact, it wouldn’t let the MySQL daemon run for more than a half hour on that server without killing it after finding a checksum miss-match. When your data is of the utmost importance, this is very comforting behavior.

I have certainly had this happen to me, once or twice, and it is very satisfying to see Innodb recover and carry on its merry way. However, I also experienced a very nasty hardware failure where a RAID controller went nuts and sprayed bad data out to storage. Innodb won’t prevent this: the database had turned into a small pile of bit goo and Innodb was not able to recover it regardless of how high the ACP(*) innodb_force_recovery was set. We had to switch to backup and wipe the original system clean. It is probable that MyISAM would have been able to recover the database because of its simpler structure.

(*) ass-covering parameter.

April 18, 2008

MySQL Proxy for sharding

Filed under: Scaling, Software Development — François Schiettecatte @ 7:41 am

I have been reading about various experiments using MySQL Proxy to handle sharding (and by extension scaling) for applications by rewriting SQL queries as they come through and directing them to the appropriate shards.

The most visible project seems to be HScale, which is well worth looking at and reading about.

The premise is very compelling, which is to remove the issue of sharding from the application layer, moving it into the database layer. This makes the application less complex because it no longer needs to deal with sharding (though it could be argued that sharding, if correctly done, has very little ‘imprint’ on the application.)

I think this project has promise, but there are some questions that need to be addressed before it is really ready to be used in a production setting:

  • First is that the MySQL Proxy introduces a single point of failure. If it fails, the application stops. At the very least, there needs to be a number of proxies and the application needs to be able to detect when one has failed and switch over to another one. I suspect you could get around that issue with a load balancer.
  • Second, sharding does not mean that your application automatically becomes fault tolerant. If you have more machines, the odds of one failing go up, so the proxy needs to be able to handle failing over from a failing server to a backup server.

Both of those are difficult problems to deal with, and like a lot of software projects it is the 20% that is going to take 80% of the time.
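
To make the second point concrete, here is a minimal sketch of shard routing with failover. It is written in Python for brevity and is not how HScale or MySQL Proxy implement it; the function names, the server names and the idea of one primary plus one backup per shard are all assumptions made for illustration.

```python
import zlib


def pick_shard(key, shard_count):
    """Map a key to a shard index with a stable hash (CRC32 here)."""
    return zlib.crc32(key.encode("utf-8")) % shard_count


def pick_server(key, shards, is_healthy):
    """Return the primary for the key's shard, or its backup if the primary is down.

    `shards` is a list of (primary, backup) pairs and `is_healthy` is a
    caller-supplied health check; both are hypothetical.
    """
    primary, backup = shards[pick_shard(key, len(shards))]
    for server in (primary, backup):
        if is_healthy(server):
            return server
    raise RuntimeError("no healthy server for key %r" % key)
```

The hard parts are exactly the ones this sketch leaves out: deciding reliably that a server is unhealthy, and making sure the backup actually holds the same data as the primary at the moment of the switch.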

When in Rome…

Filed under: Java, Software Development — François Schiettecatte @ 7:12 am

I have been doing a lot of work parsing feeds (both RSS and ATOM) lately and have been using a tool called “Project ROME” for that. I know there is another tool called Abdera but that only handles ATOM feeds.

The ROME project page describes it as follows:

ROME is an set of open source Java tools for parsing, generating and publishing RSS and Atom feeds. The core ROME library depends only on the JDOM XML parser and supports parsing, generating and converting all of the popular RSS and Atom formats including RSS 0.90, RSS 0.91 Netscape, RSS 0.91 Userland, RSS 0.92, RSS 0.93, RSS 0.94, RSS 1.0, RSS 2.0, Atom 0.3, and Atom 1.0. You can parse to an RSS object model, an Atom object model or an abstract SyndFeed model that can model either family of formats.

Which is what it does and it does it very well. I have thrown any number of feeds at it and it has performed very well. What I particularly like is the fact that foreign markup is accessible, so any special tags like those from iTunes and Media RSS can be read.

No tool is perfect and there are a few ‘lackings’ in it.

  • For some reason it does not support comment urls in items. I am not sure why this is the case since I would have expected it.
  • Some feeds contain some XSL/CSS directives located just before the feed itself, those are used to direct a browser to “pretty print” the feed when it displays it rather than raw XML. ROME does not like that at all and this stuff needs to be stripped from the feed before it is handed over for parsing.
  • Some feeds (like the NY Times, ahem…), have lots of null characters past the end of the feed, but which are part of the document. I suspect what is happening somewhere is that the feed is deemed to be longer than it actually is and the empty space is filled with null characters (let us pass on the existential issue of filling empty space with nulls). Those also need to be stripped out.
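
The last two clean-up steps can be sketched as follows. This is Python rather than Java, purely to keep it short, and it does not use any ROME API; the function name and the regular expression are assumptions, and it assumes an ASCII-compatible encoding such as UTF-8 (in UTF-16, null bytes are part of ordinary characters and must not be stripped this way).

```python
import re

# Matches an <?xml-stylesheet ... ?> processing instruction and trailing whitespace.
# The pattern is an illustrative assumption, not taken from ROME.
STYLESHEET_PI = re.compile(rb"<\?xml-stylesheet\b.*?\?>\s*", re.DOTALL)


def sanitize_feed(raw):
    """Strip trailing null bytes and stylesheet directives from raw feed bytes."""
    cleaned = raw.rstrip(b"\x00")             # nulls padded past the end of the feed
    cleaned = STYLESHEET_PI.sub(b"", cleaned)  # XSL/CSS "pretty print" directives
    return cleaned
```

The cleaned bytes are then what gets handed to the feed parser.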

Unfortunately the last release was made in December 2006 and the project does not seem to have had any work done on it since. Hopefully someone will step up to the plate and take it on; I might when work lets up. The one obvious thing I would do is add Generics to it.

April 14, 2008

Java 5.0

Filed under: Software Development — François Schiettecatte @ 2:27 pm

Recently I asked a colleague if he could recommend some good books on Java, specifically covering the new features in Java 5.0. I have used a number of the 5.0 features by gleaning them from code and documentation I gathered from the internet, but I was looking for something which brought everything together. He did not have any ideas off the top of his head so I did a little digging and found two books which fit the bill:

The first one is “Java In A Nutshell, 5th Edition” (by O’Reilly Media). It has a chapter dedicated to the bigger additions to Java, namely generics, enumerations and annotations. It also covers smaller additions like ‘for/in’ loops and other control flow features in the chapter on Java syntax, making it clear what features were added in 5.0.

The second one is “Learning Java, 3rd Edition” (by O’Reilly Media). It has two chapters dedicated to Java 5.0 features.

While all the information in those books can be found on the internet, I find it easier to have it all collated in books.

April 10, 2008

Read replication with MySQL - part deux

Filed under: Feedster, Scaling, Search, Software Development — François Schiettecatte @ 1:32 pm

Following up on my last post on read replication with MySQL, I read this post by Greg Linden on the subject of caching which mirrors my thinking on the matter (except that his is better written):

My opinion on this differs somewhat. I agree that read-only replication is at best a temporary scaling solution, but I disagree that object caches are the solution.

I think caching is way overdone, to the point that, in some designs, the caching layers sometimes contains more machines than the database layer. Caching layers add complexity to the design, latency on a cache miss, and inefficiency to use of cluster resources.

My experience at Feedster confirms this. Once we got powerful enough servers for the DBMS, we found that we did not need to use memcached at all; in fact it was a hindrance more than anything because it added to the number of machines that needed to be administered.

As a small postscriptum, this post by Ronald Bradford does a very good job of listing out the reasons for replication along with the advantages and disadvantages of each.

April 9, 2008

Being lazy sometimes pays off

Filed under: Scaling, Software Development — François Schiettecatte @ 5:04 pm

Interesting post on Gojko.net on how to make web sites go faster.

The crux of the article is summed up in these three points:

  • Delegate all long operations to a background process
  • Never ever talk to an external system synchronously, no matter how fast it is
  • Be lazy – if something does not have to be processed now, leave it for later

The one that resonated with me is the third one. The last search engine I built depended on two external libraries to support very specific foreign languages. One of these libraries took a while to initialize, so I would initialize it only when I knew it was going to be needed rather than doing the initialization with every new search that came in.
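
A minimal sketch of that kind of lazy initialization, in Python. The class name and the loader passed to it are hypothetical; the point is only that the expensive call runs at most once, and only if something actually asks for the result.

```python
import threading


class LazyResource:
    """Runs an expensive loader at most once, on first use."""

    def __init__(self, loader):
        self._loader = loader
        self._lock = threading.Lock()
        self._value = None
        self._loaded = False

    def get(self):
        # Check again under the lock so concurrent first callers
        # do not both run the loader.
        if not self._loaded:
            with self._lock:
                if not self._loaded:
                    self._value = self._loader()
                    self._loaded = True
        return self._value
```

A search that never touches the slow language library never pays for it, and one that does pays only once.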

April 8, 2008

Read replication with MySQL

Filed under: Feedster, Scaling, Software Development — François Schiettecatte @ 4:31 pm

I have been following the thread about the death of read replication over on the Planet MySQL weblog with interest. The notion of caching has been thrown into the discussion as well, to illustrate that it can be used as a substitute for read replication. (See this, this and this.)

Personally I think the two issues are separate and should be treated as such, and I will be basing this on my experiences at Feedster scaling a MySQL database from about 1GB to around 1.5TB.

Initially we relied on read-replication to shift the read load from the master server to alternative read servers. For a while this worked, but as our hardware got better (post-funding) we found that the read servers were not keeping up with replication. After some amount of digging and consultation, what became very clear to me was that the read servers were never going to catch up for a very simple reason.

While the master server and the read servers were roughly the same in terms of capacity, the issue was that each read server had to support the same write load as the master server and, in addition, a much higher read load. Combine that with the fact that replication takes place in a single thread (whereas the master uses multiple threads to write data), and you have a situation where the read servers cannot catch up with the master server.
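
A back-of-the-envelope model shows why the gap only grows. The function and the rates used with it are made up purely for illustration; they are not Feedster measurements.

```python
def backlog_after(seconds, master_writes_per_sec, replica_applies_per_sec):
    """Number of unapplied writes after `seconds`, assuming constant rates."""
    surplus = max(0, master_writes_per_sec - replica_applies_per_sec)
    return surplus * seconds
```

If the master commits 1,000 writes a second and the single replication thread manages 800, `backlog_after(3600, 1000, 800)` gives 720,000 writes of backlog after one hour, and the replica never gets a chance to work it off while the master keeps writing at that rate.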

There are a couple of tricks you can employ to make the slave servers faster. One is to do the replication across multiple threads using a script (which I have done), but you lose referential integrity. The other is to write a utility which pre-reads the replication log and touches the relevant rows ahead of the replication thread, so that replication is not slowed down waiting for data to be read off storage (this was the solution implemented by YouTube for a while).

Looping back to read replication. I agree that read replication is dead, and it should be. Replication should be used for backup purposes only, which is what we eventually did at Feedster. And your replication server should be ready to take over if the master server fails.

Onto the second issue of caching. The caching that memcached does is actually pretty simplistic. You can cache a ‘chunk of data’ somewhere and access it later if it has not been flushed to make room for other ‘chunks of data’. I say ‘chunk of data’ because that is how memcached sees it; you are responsible for serializing the data (flattening it into a contiguous area of memory) and decoding it when you get it back. Caching makes sense if it takes you more time to get data out of your database than it does to get it from cache. Ideally you want to be in a situation where you don’t need to use caching because you can get to your data fast enough. Getting to that point means having an optimized schema and a sharded database so you can take advantage of the bandwidth that multiple machines afford you. The point is to take the memory you would use for caching and give it to your database servers.
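
To show how little the cache itself does, here is a minimal sketch in Python. `TinyCache` is an in-process stand-in for memcached, not a memcached client, and `cached_fetch` and `fetch_from_db` are hypothetical names; the cache only ever sees opaque bytes, and the caller does the serializing and decoding.

```python
import pickle
from collections import OrderedDict


class TinyCache:
    """Stores opaque byte chunks and evicts the least recently used one when full."""

    def __init__(self, max_items):
        self._max_items = max_items
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as recently used
        return self._items[key]

    def set(self, key, blob):
        self._items[key] = blob
        self._items.move_to_end(key)
        while len(self._items) > self._max_items:
            self._items.popitem(last=False)  # flush the oldest chunk


def cached_fetch(cache, key, fetch_from_db):
    """Return the value for key, going to the database only on a cache miss."""
    blob = cache.get(key)
    if blob is not None:
        return pickle.loads(blob)        # decode the chunk we stored earlier
    value = fetch_from_db(key)
    cache.set(key, pickle.dumps(value))  # flatten to bytes before caching
    return value
```

Every miss costs a cache round trip plus the database query, which is the added latency to weigh against simply giving that memory to the database servers.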
