Google Please Don’t Crawl This Server

For some reason the Googlebot has found it necessary to crawl a development server of mine. I suspect that one of the users runs Google Chrome, which probably reports the URLs it browses.

Google tells us that one way to make a crawler go away is to return the 410 (Gone) HTTP status code, and the way to enforce this in httpd.conf is:


# Tell crawlers to go away
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (Googlebot|bingbot|Validator|MJ12bot|Baiduspider)
RewriteRule ^.* - [G]

 

I have included other crawlers in the list just to make sure.
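To check which user agents the condition above would catch before deploying it, here is a minimal Python sketch that mirrors the pattern. The function name `is_blocked` and the sample user-agent strings are my own assumptions for illustration; the match is an unanchored, case-sensitive search, which is how RewriteCond behaves without the [NC] flag.

```python
import re

# Same alternation as the RewriteCond pattern in httpd.conf.
# The search is unanchored and case-sensitive, like RewriteCond without [NC].
CRAWLER_PATTERN = re.compile(r"(Googlebot|bingbot|Validator|MJ12bot|Baiduspider)")


def is_blocked(user_agent: str) -> bool:
    """Return True if the rule would answer this user agent with 410 Gone."""
    return CRAWLER_PATTERN.search(user_agent) is not None


# Illustrative user-agent strings (assumed examples, not taken from real logs).
samples = [
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7) Safari/534.48.3",
]
for agent in samples:
    print(is_blocked(agent), agent)
```

A crawler that changes the capitalization of its name would slip through, so adding [NC] to the RewriteCond may be worth considering.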

 


Chromium OS X

I came across the Chromium browser for Mac OS X; I gather it is the open source version of Google Chrome.

I can’t really see much difference between the two, but I will try it out for a while and see.

 

French Court Fines Google $660,000 Because Google Maps Is Free

I am so glad I escaped France and that I am no longer French. Only in France could this happen: a court protecting a business whose model has been disrupted by a competitor. The sad thing is that this only benefits the incumbent, not its customers, who could cut costs by shifting to a cheaper (free!) product, or consumers at large, for the same reason. So whatever friction was removed from the system has now been artificially reintroduced. Sucks to be a French consumer.

 

French Court Fines Google $660,000 Because Google Maps Is Free:

Google faces a $660,000 fine after a French court ruling that the company is abusing its dominant position in mapping by making Google Maps free.

According to The Economic Times, the French commercial court “upheld an unfair competition complaint lodged by Bottin Cartographes against Google France and its parent company Google Inc. for providing free web mapping services to some businesses.”

Bottin Cartographes provides mapping services for a cost, and its website boasts several business clients such as Louis Vuitton, Airbus and several automobile manufacturers.

Interesting Bug in Django

I am working on a project that involves Django and ran into an interesting issue. Django creates a small database to keep track of various bits of data, one of which is user session information, stored in a table called django_session as follows:

CREATE TABLE django_session (
  session_key varchar(40) NOT NULL,
  session_data longtext NOT NULL,
  expire_date datetime NOT NULL,
  PRIMARY KEY (session_key)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

This is all well and good but there is an issue. InnoDB stores the rows in primary key order, and the primary keys here are SHA1 hex digests. These keys are effectively random, so a new session row can be inserted anywhere in the table, causing data to move around with every insert. While this might work when the table is small, it does not work so well when you have 500,000+ rows in it (which is another issue that I will get to).

A better schema for the table is as follows:

CREATE TABLE django_session (
  id int(11) NOT NULL AUTO_INCREMENT,
  session_key varchar(40) NOT NULL,
  session_data longtext NOT NULL,
  expire_date datetime NOT NULL,
  PRIMARY KEY (id),
  UNIQUE KEY session_key (session_key)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

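With an auto-increment primary key, new rows are always appended at the end of the table, which should give better insert performance as the table grows. The unique key on session_key keeps lookups by session key indexed.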
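To get a feel for the difference, here is a rough Python sketch that uses a sorted list as a stand-in for rows kept in primary key order. It is a simplified model and not how InnoDB actually manages pages; the function name `count_mid_table_inserts` and the sample size are assumptions made for the illustration. It counts how many inserts land somewhere before the current last key.

```python
import bisect
import hashlib


def count_mid_table_inserts(keys):
    """Count inserts that land before the current last key in sorted order."""
    ordered = []       # stand-in for rows stored in primary key order
    mid_inserts = 0
    for key in keys:
        pos = bisect.bisect_left(ordered, key)
        if pos < len(ordered):
            mid_inserts += 1   # existing entries after pos have to shift
        ordered.insert(pos, key)
    return mid_inserts


n = 1000
# Effectively random keys, like hex digests used as session keys.
random_keys = [hashlib.sha1(str(i).encode()).hexdigest() for i in range(n)]
# Monotonically increasing keys, like an AUTO_INCREMENT id.
sequential_keys = list(range(1, n + 1))

random_mid = count_mid_table_inserts(random_keys)
sequential_mid = count_mid_table_inserts(sequential_keys)
print("random keys, inserts in the middle:", random_mid)
print("sequential keys, inserts in the middle:", sequential_mid)
```

With sequential keys every insert is an append, while with digest keys nearly every insert lands in the middle of the existing data.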

Two things to note:

I am not sure whether Django specifies the ENGINE to use when creating these tables, but MySQL 5.5 defaults to InnoDB rather than MyISAM, and I don’t think this will be an issue with the latter.

The other thing is that Django does not seem to clear out sessions past their expiry date, so one needs to do that regularly with the following statement:

DELETE FROM django_session WHERE expire_date <= NOW()
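As a sketch of how that cleanup could be scripted and run on a schedule, here is a small Python example. It uses an in-memory SQLite database as a stand-in for MySQL so that it runs on its own; the helper name `purge_expired_sessions`, the simplified table definition, and the sample rows are all assumptions for illustration.

```python
import sqlite3
from datetime import datetime, timedelta


def purge_expired_sessions(conn, now):
    """Delete sessions whose expiry date has passed; return the number removed."""
    cutoff = now.isoformat(sep=" ")
    cur = conn.execute(
        "DELETE FROM django_session WHERE expire_date <= ?", (cutoff,)
    )
    conn.commit()
    return cur.rowcount


# In-memory SQLite stand-in for the MySQL table (simplified column types).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE django_session ("
    " session_key TEXT PRIMARY KEY,"
    " session_data TEXT NOT NULL,"
    " expire_date TEXT NOT NULL)"
)

now = datetime(2012, 1, 1, 12, 0, 0)
rows = [
    ("expired-key", "", (now - timedelta(days=1)).isoformat(sep=" ")),
    ("live-key", "", (now + timedelta(days=1)).isoformat(sep=" ")),
]
conn.executemany("INSERT INTO django_session VALUES (?, ?, ?)", rows)

deleted = purge_expired_sessions(conn, now)
remaining = [r[0] for r in conn.execute("SELECT session_key FROM django_session")]
print(deleted, remaining)
```

Against the real MySQL table the same DELETE would be issued through a MySQL driver, for instance from a nightly cron job.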

One more thing: I think the same is true of database-backed caches.

 

How to speed up an aging MacBook with a solid state drive

About 3 months ago I replaced the disk drive in my MacBook Pro with an SSD. The interesting thing was how easy it was to do the upgrade with the help of iFixit. Ars does the same to a MacBook:

How to speed up an aging MacBook with a solid state drive: “When we recently detailed how to boost the storage space in a MacBook Air with a replacement solid state drive module, some readers asked what it would be like to swap the hard drive in an older MacBook with a similarly speedy SSD. We decided to investigate, and as it turns out, thanks to a common 2.5″ drive size and widely available external enclosures, the swap is quicker, easier, and cheaper than the one for a MacBook Air.”

The upgrade was a real shot in the arm for my MacBook Pro in terms of performance, even though the previous disk drive was a 7,200rpm model.

 

BBEdit 10.0

BBEdit 10.0 came out last week. I have been a long-time user of BBEdit (pretty much since it came out). This release comes with a lot of changes, a lot, and has required me to change some of my work processes. There are a number of rough edges too, so it might be an idea to wait for the .1 release. And you will want to hold off on upgrading until you have some downtime to work through all the changes.

 

Mac OS X Lion

I upgraded to Mac OS X Lion over the weekend. Things went mostly well: backup, install, repeat on the other machine.

There are a number of good reviews out there, notably John Siracusa’s at Ars Technica, and Robert Mohns’ at MacInTouch.

Some things that are driving me crazy:

  • The way the new scroll bars appear and disappear. I tried leaving them on automatic, but they would not always appear and you lose the ability to see what is and is not scrollable, so I just made them permanent.
  • The Address Book and iCal are a train wreck.
  • For some reason my folder/document locations keep getting reset.

That being said, I really like the new Mail, and Mission Control is nice (though they have made it more difficult to see minimized windows).

Overall this version is optimized to run on laptops, not desktop machines, but you can turn off all the ‘laptopy’ things.

Lastly, I found that John Siracusa has a podcast called Hypercritical. Worth listening to.