François Schiettecatte’s Blog

Apple Subscription Plan

Posted in Apple, Personal, Podcasts by François Schiettecatte on November 2, 2009

Apparently Apple is shopping a $30-a-month subscription plan to the media networks for content (1, 2).

I think this is a good idea.

Most of the content on the iTunes Movie store is buy-only, which is not an attractive proposition since I suspect that most of it will only get watched once (or twice, maybe). The high cost of buying programs is what puts me off buying anything on the store. For example, if I want to catch up on Family Guy on the iTunes Movie store, I would need to buy all the seasons, which would cost me between $10 and $30 a season, a show-stopper for me. For that kind of money I am better off going to Netflix, but then I have to wait for 2 days. I know you can rent movies, and I do once in a while, but the selection is pretty thin.

So if I was able to have access to the content on the iTunes Movie store for $30/month and pay per movie rented, that would be an attractive proposition for me.

Attractive in the way cable wasn’t because I would get 70 channels for a basic cable subscription (67 of which I did not watch), and I would need to get premium subscriptions for better quality content, which is why I killed off my cable 4 years ago.

The interesting thing about this plan though is that it moves close to the a-la-carte programming that cable companies fought tooth and nail against time and time again.

Custom GelaSkins

Posted in General, Personal by François Schiettecatte on October 31, 2009

I recently found out that you can make custom GelaSkins. The process is a little tedious, but I got my three custom GelaSkins last week and they look great. The colors are well reproduced, crisp and clean. Be sure to upload high resolution pictures because a laptop lid is quite large, and bear in mind that you will probably have to crop the top and bottom of the image because of the difference in aspect ratio between the image and the laptop lid.

Sharks

Posted in Scuba by François Schiettecatte on October 26, 2009

This is a small Nurse Shark resting on some sand. Nurse Sharks feed at night and rest during the day, and have a set of teeth which are used for grinding because they typically feed off the bottom looking for shellfish and crustaceans. They are not aggressive towards divers and are in fact rather curious. Some of them came very close to me.

Last week I said that I would give an update on the sorry state of the shark population, so here are some stats.

Officially 38 million sharks are fished from the ocean each year, mostly for their fins (the finless shark is dumped back in the water to drown) and some as by-catch (I don’t have any numbers on the ratios). The real number is most likely between 70 and 100 million because most of the shark fishing is not reported.

In the past 20 years the global shark population has been reduced by 50%. At this rate sharks will be gone by 2030.

Some shark populations have crashed completely: 95% of the Mako sharks off the east coast of the USA are gone.

Each year about 10 people are killed by sharks.

What is really obvious is that sharks are not cuddly or cute, like polar bears or whales for example, which is a real problem for any conservation campaign, and I am still at a loss as to how this can be remedied. A lot of people think sharks are dangerous, but in fact the statistics show quite the opposite. As one of the top predators in the oceans, sharks have a vital place in the ecosystem. Take them away and there will be major disruptions to that ecosystem. We are already seeing that with an increase in the number of stingrays off the east coast of the US.

I hope that we are more enlightened now than we were when we hunted whales close to extinction, but I am not sure given the lack of response I have seen on this issue.

VC Website Organization

Posted in General, User Interface Design by François Schiettecatte on October 25, 2009

I have been spending a lot of time this weekend looking at VC portfolios on their websites for a client. My sense is that VCs would want these portfolios to be well structured and easy to navigate; after all, the idea is to have a favorable exit. Turns out that there is a lot of variability in the way these portfolios are presented, which sometimes makes navigation very difficult.

So here are some suggestions on how to improve this:

  • Show the company logo, web site and a short description together in a simple list. Do not force the user to click a link to see the company description.
  • If there are lots of companies be sure to give a way to narrow the list down quickly either through a faceted search or via tagging, so Networking, Social, Music, etc… And there is no harm in having multiple tags for each company.
  • Separate prior investments from current investments, or provide a way to do so easily.
  • Make sure the company descriptions are accurate and skip the marketing/hype adjectives. If I never see descriptions such as “Delivers revolutionary software solutions” or “Discover breakthrough improvement opportunities” again, it will be too soon. This tells me nothing about what the company does and my time is limited.

The idea behind listing a portfolio is to communicate; the less I have to futz with the website, the better the communication.

Anti-Virus for Mac

Posted in Apple by François Schiettecatte on October 25, 2009

Interesting, PC Tools now has anti-virus software for Macs.

I tried it out and the scan came up negative, and I uninstalled the software.

I have not run anti-virus software since the days of Mac OS 9 and may even have dropped it since Mac OS 8.

Still you have to assume that Mac OS X will become a target once the ecosystem is large enough.

New Apple Products and the Inevitable

Posted in Apple by François Schiettecatte on October 20, 2009

Apple just announced a new raft of products today. I’ll spare you the me-too and just direct you to the Apple site for details.

Of interest to me is the new mouse, which looks very nice. I currently use the wired Mighty Mouse from Apple because it fits my hand nicely and I don’t like the tracking (floating) of the Bluetooth variant. But I will certainly try this new one and see how it goes.

The new remotes are very cute; I will have to try one of those too.

And there is the inevitable. My Airport Express died two weeks ago, on Sunday morning at 6:45am. It was out of warranty, so I went and bought myself a new one. And it just figures that Apple would update it two weeks later (today).

Lionfish

Posted in Scuba by François Schiettecatte on October 19, 2009

When I was in the Turks and Caicos a couple of months ago, we saw Lionfish on pretty much every dive we did, if not all the dives. This fish is a native of the Pacific Ocean, so it has no business being in the Atlantic. It would be like seeing a polar bear in the middle of the jungle.

In fact it is a voracious predator consuming very large quantities of small fish, threatening reef ecosystems.

The Economist has a very interesting article about the issue and steps which are being taken to control their population:

Mr Dimin’s company works with fishermen who practise sustainable fisheries management, and helps them get their catches into the sort of high-class restaurants frequented by wealthy conservationists. Mr Dimin got his idea from the appearance in some resorts of “lionfish rodeos”, in which holidaymaking divers round the fish up, and which are usually followed by lionfish cook-ups on the beach. He learned from these that the fish, suitably de-spined, are delicious (they taste like snapper). That got him wondering if consumer demand might be a force powerful enough to halt even an invasive species as successful as the lionfish.

Next week I will post an update about the sorry state of shark populations.

MacResearch Weblog

Posted in Apple by François Schiettecatte on October 18, 2009

Can’t think how I missed it, but the MacResearch weblog tracks the use of Macs in research.

Particularly interesting to me is the tutorial on OpenCL (currently there are six parts, 1, 2, 3, 4, 5, 6).

Number Encoding II

Posted in Scaling, Search, Software Development by François Schiettecatte on October 16, 2009

To conclude my little foray into number encoding (see the presentation by Jeffrey Dean from Google titled “Challenges in Building Large-Scale Information Retrieval Systems” (video, slides)), here are a few conclusions:

  • In terms of raw performance the “Varint Encoding” is much faster than “Byte-Aligned Variable-length Encodings” and I was able to get better numbers than Google got, most likely because I am using a different machine. It would be interesting to know what kind of machine/OS they used for their timings so I could do a direct comparison. My lookup array structure is different from (and more compact than) Google’s, assuming I understood Google’s lookup array structure in the presentation.
  • The “Byte-Aligned Variable-length Encodings” is faster if you are storing three numbers per posting, namely a document ID, a term position and a field ID. The “Group Varint Encoding” is faster if you are storing four numbers per posting, namely a document ID, a term position, a field ID and a weight.
  • As I described in the last comment in the original post, two bits are used in the header for each varint to indicate its size in bytes, so 0, 1, 2 or 3 indicate whether your varint is 1, 2, 3 or 4 bytes long respectively. However, if you store deltas, a lot of the numbers you store will be 0, and using a byte to store 0 seems wasteful to me. So I changed this so that the two bits indicate the actual number of bytes in the varint, and 0 bytes means 0. This way I don’t actually allocate space unless there is a value other than 0 to store. This saves about 10% in my overall index size, and a lot more if you only take the term postings into account because I store some amount of document metadata in my index. Of course this means that you can’t store a number greater than 16,777,215 (the largest value that fits in three bytes), which won’t happen unless you are creating huge indices with more than 16,777,216 documents in them or have documents longer than 16,777,216 terms.
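To make the last point concrete, here is a minimal Python sketch of the modified scheme: four numbers share one header byte, each two-bit field holds the actual byte count (0 to 3), and a value of 0 takes no bytes at all. This is not my index code. The function names, the little-endian byte order and the bit layout of the header are assumptions made for illustration, and the example posting values are made up.

```python
def byte_length(value: int) -> int:
    """Bytes needed to store value; 0 needs none, the maximum is three."""
    if value < 0 or value > 0xFFFFFF:
        raise ValueError("value must be in the range 0..16,777,215")
    return (value.bit_length() + 7) // 8


def encode_group(values) -> bytes:
    """Encode exactly four numbers as one header byte plus 0-3 bytes each."""
    if len(values) != 4:
        raise ValueError("a group holds exactly four values")
    header = 0
    body = bytearray()
    for i, value in enumerate(values):
        n = byte_length(value)
        header |= n << (2 * i)               # two header bits per value
        body += value.to_bytes(n, "little")  # writes nothing when n == 0
    return bytes([header]) + bytes(body)


def decode_group(data: bytes, offset: int = 0):
    """Decode one group at offset; return (values, offset of next group)."""
    header = data[offset]
    offset += 1
    values = []
    for i in range(4):
        n = (header >> (2 * i)) & 0b11       # byte count for value i
        values.append(int.from_bytes(data[offset:offset + n], "little"))
        offset += n
    return values, offset


# Hypothetical posting: document ID delta, term position, field ID, weight.
posting = [0, 5, 300, 0]
encoded = encode_group(posting)
decoded, next_offset = decode_group(encoded)
```

In this sketch the posting above takes four bytes (one header byte, one byte for 5, two bytes for 300), and the two zeros cost nothing beyond their header bits. A production decoder would more likely use a lookup table indexed by the header byte rather than shifting and masking for each value.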

Basically it comes down to trade-offs, index compactness vs. decode speed, and looking at speed both in test code (usually a contrived example) and performance on a real data set. I used the Wikipedia data for that along with 200 relatively complex searches designed to read lots of postings lists.

Danger Danger

Posted in General, Scaling, Software Development by François Schiettecatte on October 13, 2009

Plenty has been written about the Danger data loss over the weekend (TechCrunch). For me the most interesting commentary came from John C. Dvorak. He got some things right but he also got some things wrong:

Over the past week, users of the T-Mobile Sidekick platform found that all their contacts and other important information was permanently lost, because of server mishaps. If Microsoft had wanted to throw a monkey wrench into cloud computing, it could not have done a better job.

Huh, don’t think so, this was a data loss screw-up, nothing to do with the cloud. If what we are reading is to be believed, a SAN upgrade went wrong and the data was lost, with no backup.

Insert Ellen Feiss ad here…

Seriously though, backups are essential because things will go wrong. Note that I use ‘will’ and not ‘may’, and the failures may or may not be under our control.

The other thing I used to tell clients is to do a fire drill on a regular basis. By that I mean taking the backups and making sure they can be restored properly and completely. I had one client who discovered that all their backups were useless when they checked.
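One way to automate part of such a fire drill is to restore a backup into a scratch directory and compare it file by file against the live data. The Python sketch below is only an illustration under that assumption: the function names are hypothetical, it compares SHA-256 checksums of regular files only, and it ignores permissions, ownership and anything that is not a plain file (a database, for example, would need its own restore test).

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in 1 MB chunks to keep memory use flat."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root: Path) -> dict:
    """Map each file's path relative to root onto its checksum."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in root.rglob("*")
        if p.is_file()
    }


def verify_restore(original: Path, restored: Path) -> list:
    """Return a list of problems; an empty list means the trees match."""
    want = tree_digests(original)
    got = tree_digests(restored)
    problems = []
    for name in sorted(want.keys() | got.keys()):
        if name not in got:
            problems.append(f"missing from restore: {name}")
        elif name not in want:
            problems.append(f"unexpected in restore: {name}")
        elif want[name] != got[name]:
            problems.append(f"content differs: {name}")
    return problems
```

Calling verify_restore(Path("/data"), Path("/tmp/restore-test")), with both paths being placeholders, returns an empty list only if every file came back intact. Files that legitimately changed after the backup was taken will show up as differences, so the check works best against a snapshot taken at backup time.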