François Schiettecatte’s Blog

August 7, 2008

Memcached, again…

Filed under: Scaling, Software Development — François Schiettecatte @ 1:33 pm

The High Scalability blog has a good summary of a presentation given by Farhan Mashraqi of Fotolog.

As I have written before, I am really, really, really ambivalent about using memcached to cache data coming out of MySQL.

Fotolog has 51 instances of memcached on 21 servers with 175G in use and 254G available.

I can’t help but wonder how MySQL would perform if given 21 extra servers with all that memory.
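To put those numbers in per-machine terms, here is a quick back-of-the-envelope calculation. It is only a sketch: it assumes that "254G available" means the total memory allocated to memcached (the summary does not spell that out) and that instances and memory are spread evenly across the servers.

```python
# Figures quoted above from the Fotolog summary.
instances = 51
servers = 21
used_gb = 175
total_gb = 254  # assumption: "available" means total allocated, not free


def fleet_summary(instances, servers, used_gb, total_gb):
    """Return per-server and per-instance averages, assuming an even spread."""
    return {
        "gb_per_server": total_gb / servers,
        "gb_per_instance": total_gb / instances,
        "instances_per_server": instances / servers,
        "utilization": used_gb / total_gb,
    }


summary = fleet_summary(instances, servers, used_gb, total_gb)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Under those assumptions that works out to roughly 12 GB of cache per server, with about 69% of it in use.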

They also mention MySQL’s cache. My advice on that is don’t even bother with it; it is worse than useless.

One place where I was interested in their use of memcached is in caching filesystems accessed via NFS. Again I have to ask whether they are solving the wrong issue. It is a bit like saying “my car is slow therefore I will buy a faster car to tow it with, and my slow car will now go faster.” The real solution is to see why your car is slow in the first place, and then trade it in for a faster one if need be.

Installers ahoy

Filed under: Scaling, Software Development — François Schiettecatte @ 1:16 pm

I am wrapping up work on the installer I started working on over the weekend.

The goal of the installer is to have a single script I can run on a pristine machine (or image), tell it what I want installed (crawler, api, indexer, etc…) and boom! Five minutes later I have a fully installed, fully configured instance. Of course the best thing to do is to create an instance image which I can run in the cloud, but I need the installer to create the image. The exercise of building the installer is very good for getting a handle on which files, libraries and what-have-yous go with each sub-system. As a side-bar, we did not have that at Feedster for the longest time, which was a bad mistake.

By the end of the weekend I decided that the best way to approach this is to assume that I am installing on a pristine machine (say a CentOS installation) and assume nothing; well, assume that there is at least a network and a few tools like svn, gcc and make. So everything needs to be checked for and, if needed, checked out and installed, right down to things like java and ant. This caused me to look again at the structure of my svn repository and carry out a reorganization, which helped a lot from both a code organization and an install process point of view.
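The "assume nothing, check for everything" approach can be sketched roughly as follows. This is Python purely for illustration, not the actual installer; the split into two tool lists and the function names are assumptions made for the example.

```python
import shutil

# Hypothetical split: tools assumed to exist on a pristine machine versus
# tools the installer must be ready to fetch and install itself.
ASSUMED_TOOLS = ["svn", "gcc", "make"]
INSTALLABLE_TOOLS = ["java", "ant"]


def find_missing(tools, which=shutil.which):
    """Return the tools from `tools` that are not found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


def plan_install(which=shutil.which):
    """Return the installable tools that still need installing.

    Raises RuntimeError if a base tool is missing, since without those
    nothing can be checked out or built.
    """
    blockers = find_missing(ASSUMED_TOOLS, which)
    if blockers:
        raise RuntimeError("missing base tools: " + ", ".join(blockers))
    return find_missing(INSTALLABLE_TOOLS, which)


if __name__ == "__main__":
    try:
        to_install = plan_install()
        print("need to install:", to_install or "nothing")
    except RuntimeError as error:
        print("cannot continue:", error)
```

The `which` parameter exists only so the lookup can be swapped out for testing; by default it uses the standard library's `shutil.which`.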

August 4, 2008

Yellowline Arrow Crab

Filed under: Scuba — François Schiettecatte @ 6:04 pm


This strange little guy is a Yellowline Arrow Crab straddling the top of a tube sponge. I got a second shot too.

It was odd to see it there, but cool nonetheless. Usually these little guys are tucked into the folds of a barrel sponge or in some cracks in the coral wall.

They have a really long nose (not sure why) and really long delicate legs. Personally I think they are quite beautiful.

Drizzling away

Filed under: Scaling, Software Development — François Schiettecatte @ 5:18 pm

I have been reading more and more about the Drizzle project these past few weeks and it looks very interesting. The project itself started off from MySQL; lots of bits got ripped out and new bits are being added in. You can track progress on Planet MySQL and on the project home page.

Talking about the project home page, here is what it says:

A Lightweight SQL Database for Cloud and Web

The Drizzle project is building a database optimized for Cloud and Net applications. It is being designed for massive concurrency on modern multi-cpu/core architecture. The code is originally derived from MySQL.

While Jay Pipes does not think that this will ever make it out of the lab, I think there is a gap in the market for a lightweight, SQL-based, networked DBMS. At one end of the spectrum you have MySQL, a fairly complete and heavyweight RDBMS; further along you have SQLite and eventually BerkeleyDB (from Oracle, originally Sleepycat).

As an application grows and there is more and more data to manage, a switch has to be made from a monolithic database to a sharded database, which means that a lot of the work that was being done in the monolithic database server (referential integrity, joins, etc…) has to move to a middleware layer (this is documented ad nauseam so I am not going to expand on that.)
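As a rough illustration of what moves into the middleware, here is a minimal sketch using in-memory dictionaries as stand-ins for shards. The shard count, the table layout and every function name are assumptions made for the example, not anything taken from Drizzle or MySQL.

```python
import zlib

NUM_SHARDS = 4  # hypothetical shard count


def shard_for(key):
    """Pick a shard from the key using a stable hash."""
    return zlib.crc32(str(key).encode("utf-8")) % NUM_SHARDS


# In-memory stand-ins for per-shard tables; a real system would hold one
# database connection per shard instead.
users = [dict() for _ in range(NUM_SHARDS)]   # sharded by user id
photos = [dict() for _ in range(NUM_SHARDS)]  # sharded by photo id


def add_user(user_id, name):
    users[shard_for(user_id)][user_id] = {"id": user_id, "name": name}


def add_photo(photo_id, owner_id, title):
    # Referential integrity is now the middleware's job: the owner may live
    # on a different shard, so no foreign key can enforce this.
    if owner_id not in users[shard_for(owner_id)]:
        raise ValueError(f"unknown owner {owner_id}")
    photos[shard_for(photo_id)][photo_id] = {
        "id": photo_id, "owner_id": owner_id, "title": title,
    }


def photos_with_owner_names():
    """Application-side join of photos to users across all shards."""
    rows = []
    for shard in photos:
        for photo in shard.values():
            owner = users[shard_for(photo["owner_id"])][photo["owner_id"]]
            rows.append((photo["title"], owner["name"]))
    return sorted(rows)


add_user(1, "alice")
add_user(2, "bob")
add_photo(10, 1, "reef")
add_photo(11, 2, "crab")
print(photos_with_owner_names())
```

The existence check in `add_photo` and the loop in `photos_with_owner_names` are exactly the kind of work a single database server would otherwise do with a foreign key and a join.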

So if you are using MySQL in this scenario, you wind up not using 80% of the features that MySQL offers which just makes them overhead. Trouble is that at the other end of the spectrum (touched on above) there isn’t anything which does the 20% you need.

So I think (hope!) that this is where Drizzle is heading, because it really just makes sense.

Updated August 8th, 2008 - Drizzle is the subject of the current FLOSS Weekly podcast over on the TWIT network.

August 3, 2008

Installers over the weekend

Filed under: Software Development — François Schiettecatte @ 6:21 pm

I have been spending the weekend on an installer script for a project I am currently working on. For me this is a very boring task, right up there with writing documentation. The incentive to get it over and done with is that I can get back to more interesting stuff.

The installer script is designed to be a standalone script which can be used to install complete sub-systems on a machine, so it could install or upgrade the crawler on a machine, or a search engine. The machine could be an actual machine, a virtual machine, or eventually an image which could then be distributed across virtual machines. The devil is in the details: tracking down all the files needed by each sub-system, etc… The reward is a very simple, completely automated installer.
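One way to keep track of which files belong to which sub-system is a manifest that the script consults. The sketch below is in Python for illustration only (it is not the installer described here), and every path, sub-system dependency and option in it is made up.

```python
import argparse

# Hypothetical manifest: all paths and dependencies are invented for the example.
MANIFEST = {
    "common": {"needs": [], "files": ["lib/common.so", "etc/common.conf"]},
    "crawler": {"needs": ["common"], "files": ["bin/crawler", "etc/crawler.conf"]},
    "indexer": {"needs": ["common"], "files": ["bin/indexer"]},
    "search": {"needs": ["common", "indexer"], "files": ["bin/searchd"]},
}


def resolve(requested):
    """Return sub-systems in install order, dependencies first, no repeats."""
    ordered = []

    def visit(name):
        if name in ordered:
            return
        for dependency in MANIFEST[name]["needs"]:
            visit(dependency)
        ordered.append(name)

    for name in requested:
        visit(name)
    return ordered


def files_for(requested):
    """List every file needed by the requested sub-systems."""
    return [path for name in resolve(requested) for path in MANIFEST[name]["files"]]


def main(argv=None):
    parser = argparse.ArgumentParser(description="Sketch of a sub-system installer")
    parser.add_argument("subsystems", nargs="+", choices=sorted(MANIFEST))
    args = parser.parse_args(argv)
    for path in files_for(args.subsystems):
        print("would install", path)


if __name__ == "__main__":
    main(["crawler", "search"])
```

Keeping the manifest as plain data means the same table can drive an install, an upgrade, or the build of an image.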

I have been spending so much time on this, my mind is beginning to ‘think’ only in installer terms, so I should “./Installer.pl --make-dinner” now.

August 2, 2008

Obscure Apple Raid bug

Filed under: Apple — François Schiettecatte @ 8:14 am

I ran into an obscure bug with Apple RAID. I had 4 drives in my MacPro, two of which were mirror raided together. My main drive failed recently and I decided to stop the RAID and use the two drives which made it up as primary and backup drives. The Apple Disk Utility allows you to stop RAID, which I did, and the two drives mounted on the desktop as expected.

After copying lots of data around, I was left with just those two drives in my computer, each with a different name. Oddly, the Apple Disk Utility would show the backup drive as having the same name as the primary drive and as being mounted on ‘/’, again the same as the primary drive, but it would show the drive as being in the correct slot. SuperDuper (which I use for backups) was very confused as well, but the ‘df’ utility would show both drives correctly.

I had to erase and repartition the backup drive for this oddity to go away.

July 28, 2008

Time to clean house

Filed under: General — François Schiettecatte @ 9:39 pm

Periodically I have to clean house, meaning that I close down all those online accounts which I opened and never used, unsubscribe from lists that are no longer useful, cut out RSS feeds I don’t read, ‘unfriend’ people I never talk to, and go through all the computer gear I have amassed and get rid of unwanted and unused stuff.

Now is that time.

UPDATED - 7/30/08 - Cleaned up a lot of stuff, got one taker on craigslist for all of it and then heard nothing more from them. Unbelievable. I guess free means the person does not value the stuff.

When not to send a drive in even if it is under warranty

Filed under: General — François Schiettecatte @ 3:53 pm

One of my drives failed over the weekend. It was the main drive on the main computer (I had a current backup so lost no data), but I ran into the issue of whether to send the drive in under warranty or not. The drive is an expensive 150GB WD Raptor which spins at 10,000RPM (it does make a difference), so I would like to get it fixed under the warranty. The problem is that the drive won’t even spin up so I can’t wipe it.

So it won’t get sent in because of the data that is on it; there is way too much personal stuff there. I will use it as a paperweight for a while, take it apart to satisfy my inner geek and then take a hammer to it.

July 26, 2008

Grouper getting a make-over

Filed under: Scuba — François Schiettecatte @ 1:11 pm


Last week I posted something small, so this week it is time for something big (I try to alternate.)

This is a Nassau Grouper getting a make-over from some Peterson Cleaner Shrimp at a cleaning station. If you look carefully you can see the shrimp in its mouth.

A cleaning station is a place where fish (and other creatures) can come and get a cleaning from resident shrimps and other cleaning fishes. The cleaning consists of removing dead skin and flesh, as well as any parasites that may be on the fish getting cleaned.

A reef will have cleaning stations all over. You only need to look where fish are hovering, usually at a slight angle, with their mouths and/or gills open. This is usually the signal that they want a cleaning rather than hunting for dinner!

On this particular shot I was able to loiter for about 5 minutes, taking lots of shots as the shrimp clambered all over the grouper to clean it. Cleaning stations are a great place to get good shots because the fish are usually very relaxed and don’t move, so if you are patient you will usually be rewarded with great shots.

The oddity here is that I could not see the Corkscrew Anemone in which Peterson Cleaner Shrimp usually hide, but maybe it was tucked away somewhere out of sight.

Peterson Cleaner Shrimp will also clean divers’ hands if you let them, which I have done before and which will be the subject of a future post.

“stuck with a multilanguage future”

Filed under: Software Development — François Schiettecatte @ 11:55 am

At the end of a fairly predictable article where SOAP and REST supporters take cheap shots at each other, Tim Bray being one of them, Bray comes out with some pretty eye-rolling stuff:

During a keynote presentation at OSCON on Friday, Bray will talk about the “language inflection point,” in which various languages such as Perl, Python, and Ruby have been gathering momentum at the expense of the established Java and .Net platforms.

“Up until two years ago, if you were a serious programmer you wrote code in either Java or .Net,” Bray said. “[Now], there are all these options that people are looking at and it’s really an inflection point.”

I fail to see what “serious programmer” and specific languages have to do with each other. I would have thought that a “serious programmer” would pick the language best suited to the task at hand.

The Java platform is accommodating scripting languages such as Ruby and Python on the JVM, Bray noted. Sun has been enabling these to work on the Java Virtual Machine. “The Java language is not what the cool kids are choosing to use these days,” said Bray.

IMHO the “cool kids” who are really smart learn a variety of languages and keep learning new ones. They do this to increase the breadth of their knowledge and toolbox, so they don’t approach every programming problem with the same hammer.

Still, Java will stay around, he said. “The Java language isn’t going away. It’s the world’s most popular programming language,” Bray said.

I have not seen any specific figures as to how popular a specific language is. In fact, how would you measure that? Lines written? Programmers using it? Users using applications written in it?

“I think that like it or not, we’re stuck with a multilanguage future,” he stressed.

What’s not to like about a “multilanguage future”? We have a multilanguage present and we have had a multilanguage past; multilanguage has served us well and will continue to do so. As for being “stuck”, I am glad we were not “stuck” 30 years ago, otherwise we would all be writing stuff in COBOL, or worse, assembler.

