François Schiettecatte’s Blog

May 10, 2008

Smooth Goose-Neck Barnacles

Filed under: Scuba — François Schiettecatte @ 10:31 am

This is a buoy floating at around 20 feet at Alcyone, at Cocos Island. I was surprised at the number of barnacles growing on it. This shot was taken at the end of the dive when we were doing our safety stop. I got two shots of this, which was not easy as there was a bit of a current (everyone was drifting off into the blue, about to surface and be picked up by the panga, and I needed to stay with them).

These barnacles are free-swimming as larvae and as adults form a hard shell and live attached to submerged surfaces. Obviously these are adults but I saw lots of larvae swimming around at all depths. I tried to take some pictures of larvae but none were any good.

I really like the colors in this shot. The water was also very clear, which made for a lot of detail.

Scaling MySQL at Facebook

Filed under: Feedster, Scaling, Search — François Schiettecatte @ 10:16 am

By way of Greg Linden, some interesting notes and figures from various high traffic web sites on scaling MySQL.

As Greg points out, Facebook’s strategy is to partition the data and spread it across a lot of servers, which is pretty much the only way to go if you want to scale MySQL, or any site for that matter.
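To make the partitioning idea concrete, here is a minimal sketch of routing a record to one of several database servers by hashing its key. This is not Facebook's actual scheme; the host names, the `SHARDS` list and the `shard_for` function are all made up for illustration.

```python
import hashlib

# Hypothetical shard list; these host names are placeholders.
SHARDS = [
    "db0.example.com",
    "db1.example.com",
    "db2.example.com",
    "db3.example.com",
]


def shard_for(user_id, shards=SHARDS):
    """Pick the server that holds the data for a given user ID.

    A stable hash (MD5 here, used only for distribution, not security)
    ensures the same ID always maps to the same server.
    """
    digest = hashlib.md5(str(user_id).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


if __name__ == "__main__":
    for uid in (1, 2, 3, 42):
        print(uid, "->", shard_for(uid))
```

One caveat with this simple modulo approach is that adding or removing a server changes where most keys land, so a real deployment would probably need a lookup table or a consistent hashing scheme instead.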

Crawling is indeed harder than it looks

Filed under: Feedster, Scaling, Search — François Schiettecatte @ 10:10 am

Greg Linden (a must-read blog because he picks up new publications very quickly) has a good post aggregating a number of papers from WWW 2008 on crawling and why crawling is hard.

I wrote the version one crawler for Feedster (version zero was not very good and got ditched very quickly) and can confirm that it is very difficult to write a good crawler. It is basically a balancing act: currency versus bandwidth usage, etc.

I finished writing a crawler a month or so ago for the current project I am working on and it took me a while to adjust the crawl interval based on how frequently a feed changed. I am not sure I have it quite right yet and the algorithm still needs more adjustment.
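For what it is worth, one simple way to adapt the crawl interval looks like the sketch below. This is not the algorithm I actually use; the shrink and grow factors and the minimum and maximum bounds are assumptions picked for illustration.

```python
# Assumed bounds, in seconds; tune these for the feeds being crawled.
MIN_INTERVAL = 15 * 60         # never crawl more often than every 15 minutes
MAX_INTERVAL = 24 * 60 * 60    # never wait longer than a day


def next_interval(current, changed, shrink=0.5, grow=1.5):
    """Return the next crawl interval in seconds for a feed.

    If the feed changed since the last crawl, come back sooner;
    if it did not, back off. The result is clamped to the bounds above.
    """
    proposed = current * (shrink if changed else grow)
    return max(MIN_INTERVAL, min(MAX_INTERVAL, proposed))


if __name__ == "__main__":
    interval = 60 * 60  # start at one hour
    for changed in (False, False, True, False):
        interval = next_interval(interval, changed)
        print(changed, interval)
```

Shrinking faster than growing biases the crawler towards currency at the cost of bandwidth; swapping the two factors would do the opposite.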

May 9, 2008

MySQL Forge

Filed under: Scaling, Software Development — François Schiettecatte @ 12:29 pm

I came across MySQL Forge a few days ago when Brian Moon was asking about sample my.cnf configuration files for MySQL. The samples provided in the distribution are very, very old, and more up-to-date ones are needed.

So I uploaded a copy of the my.cnf configuration file which I have used for database servers sporting 16GB of RAM. You can check out all the sample my.cnf files which have been uploaded here.

I have to admit mine is sparsely commented (I usually just point to the relevant place in the MySQL documentation), but it does the job.

Boston scalability user group meeting

Filed under: Scaling, Software Development — François Schiettecatte @ 12:16 pm

There is another Boston scalability user group meeting coming up on May 28th. I was not able to attend the last two but I think I will go to this one because it looks interesting:

Orion Letizi from Terracotta will conduct an interactive session exploring the open-source Terracotta project. Terracotta is useful in many different cases including use as Network Attached Memory for working with large datasets, clustering HTTP sessions, reducing database load, acting as a second-level cache for Hibernate objects and more. You can find out more about some of these use cases by visiting the Start Learning Terracotta page.

May 3, 2008

Furry sea cucumber & Bumblebee shrimp

Filed under: Scuba — François Schiettecatte @ 6:46 am


This is a furry sea cucumber, shot in the Turks and Caicos. These live on sandy sea floors and feed on whatever they find there. We were diving on a large sandy area with lots of these little guys crawling around. They are about 12 inches long and quite stocky. You can see a side-on photo here.

What is interesting about these guys is that they serve as hosts to bumblebee shrimps, which spend their lives on them. You can see three in this picture; the most obvious one is right in the middle. These shrimps are very small, between 1/8 and 1/4 of an inch long. They are called ‘bumblebee’ because of the distinctive black and white stripes on their body. There are also some yellow marks on their claws.

You can see another picture here.

May 2, 2008

Barcamp Boston

Filed under: General — François Schiettecatte @ 7:46 am

A Barcamp event is being held in Boston on May 17th-18th. It looks interesting, and I may well attend on one of the days:

What: BarCamp is an unConference, organized on the fly by attendees,
for attendees.

There is no registration fee, but you don’t just attend a BarCamp –
you can participate in discussions, demo your projects, or join into
other cooperative events.

Topics may include, but are not limited to: open source software,
startups, UI design, entrepreneurship, AJAX, hardware hacking,
robotics, mobile computing, bioinformatics, RSS, Social Software,
programming languages, and the future of technology.

Who: You, if you’re a geek or somewhat geeky. Pre-registration is
highly recommended.

When: May 17/18, 2008 starting each day at 9 AM or whenever you
want to arrive.

Where: Matignon High School, 1 Matignon Rd., Cambridge, MA. The
school is a short 10 minute walk from either the Davis Sq. or Alewife
stations on the MBTA Red Line. Parking is available on site.

Details: For more information go to http://barcampboston.org.

May 1, 2008

More on the NVIDIA GeForce 8800 GT

Filed under: General — François Schiettecatte @ 9:02 am

Previously on this blog I described some issues I had with the NVIDIA GeForce 8800 GT video card upgrade from Apple.

In short I ordered two cards thinking that I could use both in my MacPro, but one of the cards died on me and Apple support told me that they do not support putting two of those cards on a MacPro.

The dead card is now on its way back to Apple for a refund, and I put the two NVIDIA GeForce 7300 GT video cards I had pulled from the machine back in (it is now running with three cards, one for each display).

I did a little more research on Apple’s web site and a dual NVIDIA GeForce 8800 GT video card configuration should be supported. The NVIDIA GeForce 8800 GT video card draws 110W, so two of them would draw 220W, and the MacPro supports up to 300W of power for video cards (presumably from the PCIe bus itself and via the motherboard jumper).

Profiling Java

Filed under: Java — François Schiettecatte @ 8:55 am

Recently I asked a colleague about Java profilers. Four years ago I worked on a Java project (my first) and am currently working on my second one. Profiling was not really an issue on the first project but I am hitting some performance issues with this current one.

He pointed me to YourKit, which looks very full-featured but is not cheap at $500/license.

I did a little bit of looking around and found JRat, a command line profiling tool which seems to work well for command line applications. You use JRat Desktop (a Swing application) to look at the profile and coverage information, and it does a good job of laying out where time is being spent in the application.

What it is telling me now is that the Java MySQL connector is a real performance hog.

Interview with Donald Knuth

Filed under: Software Development — François Schiettecatte @ 7:52 am

InformIT has a very good interview with Donald Knuth (by way of Artima).

A couple of things resonated with me:

As to your real question, the idea of immediate compilation and “unit tests” appeals to me only rarely, when I’m feeling my way in a totally unknown environment and need feedback about what works and what doesn’t. Otherwise, lots of time is wasted on activities that I simply never need to perform or even think about. Nothing needs to be “mocked up.”

I understand the need for unit tests and I use them all the time; around half the code I have written so far on my current project is test code. But I spend most of my time working on the actual code, thinking through the structure, the flow and the data, and only a small amount of time on test code. I usually work on the big picture first and then focus on the details. I also spend a lot of time ‘refactoring’ (my definition of ‘refactoring’ is the restructuring of code due to learning things that were not apparent when you started). I rarely jump into coding because more often than not it leads to dead-ends. Not that dead-ends are a bad thing, but you want to avoid them.

Another thing I do is not code. I spend about six hours a day coding on average, the rest of the time is spent thinking and reading other people’s code.

I also must confess to a strong bias against the fashion for reusable code. To me, “re-editable code” is much, much better than an untouchable black box or toolkit. I could go on and on about this. If you’re totally convinced that reusable code is wonderful, I probably won’t be able to sway you anyway, but you’ll never convince me that reusable code isn’t mostly a menace.

I am not completely sure what to make of this. I think the tension here is between the generic and the specific. You can build generic code which will work for lots of applications but that code will never be optimized for any specific application, or you can build very specific code which will be highly optimized for a single application. I like the idea of reusable code mainly because I am lazy and want to get the best return on my investment. On the other hand, I have been known to spend days on very small chunks of code to make them perform as fast as possible.

Perhaps this falls in with the tenet that there is no such thing as portable code, only code that is ported.

I don’t want to duck your question entirely. I might as well flame a bit about my personal unhappiness with the current trend toward multicore architecture. To me, it looks more or less like the hardware designers have run out of ideas, and that they’re trying to pass the blame for the future demise of Moore’s Law to the software writers by giving us machines that work faster only on a few key benchmarks! I won’t be surprised at all if the whole multithreading idea turns out to be a flop, worse than the “Titanium” approach that was supposed to be so terrific—until it turned out that the wished-for compilers were basically impossible to write.

And this I took issue with. The reality of chip design is that we are hitting some very real limits with speed, size, power usage and heat dissipation. I think that multicore chips are a very good way to speed up processing within those limits. While it is true that most applications don’t parallelize well, and that writing parallelized code is hard, multicore chips work very well for multiprocessing, which is what most operating systems do these days.
