So you want to go 64 bit

Warning: this post is going to get a little technical and will mainly interest those who program in C or C++, though others may be interested in some of the issues raised. I will also note some of the effects that 32/64 bit has on scripting languages.

The main (and pretty much only) reason to migrate your application from 32 bit to 64 bit is to take advantage of more addressable memory. In general 32 bit machines max out at 4GB; Linux on those machines can effectively only address about 3.5GB, and a single process is limited to 2GB or 3GB depending on how the kernel splits the address space. With a 64 bit machine, you can address oodles and boodles of memory.

What 64 bit won’t do is make your code go faster, or allow you to access larger files (32 bit machines can already do that), or make your smile brighter.

So where am I coming from that I can write about this?

I wrote the search engine for Feedster, consisting of about 260,000 lines of C code with some C++ and Perl scripts, give or take. While it was originally written for a 32 bit platform, very early on it became clear that it needed to be migrated to a 64 bit platform if it was going to scale to deal with the quantities of data that were being indexed and searched.

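There are a number of programming models defined to handle 32 bit and 64 bit development, but only two are applicable to us: LP32 and LP64. (Strictly speaking, the 32 bit model described here, with its 32 bit int, is usually called ILP32; I will keep calling it LP32 in this post.) The others are interesting, but two is enough to deal with; more would become too complicated, and we want to keep this simple, right?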

The LP32 model defines sizes in bits as follows:

char		8
short		16
int		32
long		32
long long	64
pointer		32

The LP64 model defines sizes in bits as follows:

char		8
short		16
int		32
long		64
long long	64
pointer		64

What interests us are the sizes for the int, long and long long data types. Note that the sizes of int and long long are the same across the two models, and that long is the only numeric type which changes. If you used a long for 32 bit integers (which I did, duh!), you will most likely need to change it to an int to preserve it as a 32 bit integer, and use a long long type for 64 bit integers. One thing to watch out for is that your application will have a tendency to gulp much larger quantities of data because you have much more memory available, so some of these 32 bit integers you were using will need to be increased to 64 bit integers.
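If you are not sure which of the two tables applies to the machine you are compiling on, a small check along these lines can tell you. This is only a sketch: the enum and function names are made up for illustration, and anything that matches neither table is simply reported as "other".

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical names, for illustration only. */
enum data_model { MODEL_LP32, MODEL_LP64, MODEL_OTHER };

/* Classify a platform from the sizes (in bits) of int, long and pointer. */
enum data_model classify_model(size_t int_bits, size_t long_bits, size_t ptr_bits)
{
    if (int_bits == 32 && long_bits == 32 && ptr_bits == 32)
        return MODEL_LP32;    /* matches the first table */
    if (int_bits == 32 && long_bits == 64 && ptr_bits == 64)
        return MODEL_LP64;    /* matches the second table */
    return MODEL_OTHER;       /* some other model */
}

/* Classify the platform this code was compiled for. */
enum data_model current_model(void)
{
    assert(CHAR_BIT == 8);    /* the tables assume an 8 bit char */
    return classify_model(sizeof(int) * CHAR_BIT,
                          sizeof(long) * CHAR_BIT,
                          sizeof(void *) * CHAR_BIT);
}
```

Calling current_model() from a 32 bit build and from a 64 bit build of the same source is a quick way to confirm that long and the pointer size are the only entries that moved.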

Now you could just leave your longs alone and let them automagically become 64 bit integers. That would be fine, but you will save yourself space by making the change, and it would keep the size of the integers consistent across the LP32 and LP64 platforms. This will preserve your sanity when you need to hunt down bugs that come out on LP32 and not LP64, specifically with casting (more on that later).
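To put a number on the space argument, here is a rough sketch using two made-up record layouts (the struct and field names are hypothetical, not taken from any real code base). It measures how many bytes each record sheds when its fields are declared as int rather than long.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical index entry that kept its fields as long. */
struct entry_long {
    long document_id;
    long term_count;
};

/* The same entry with its fields pinned to int. */
struct entry_int {
    int document_id;
    int term_count;
};

/* Bytes saved per entry by using int instead of long. */
size_t bytes_saved_per_entry(void)
{
    /* Guard the unsigned subtraction below against wrapping around. */
    assert(sizeof(struct entry_long) >= sizeof(struct entry_int));
    return sizeof(struct entry_long) - sizeof(struct entry_int);
}
```

On a typical LP64 system this should come out at 8 bytes per entry, and at 0 on LP32 where int and long are the same size. Multiply that by a few hundred million entries and the saving is worth having.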

Constants are pretty simple: 1 is an int, 1L is a long and 1LL is a long long; add a U for unsigned integers (1U, 1UL and 1ULL). It is a good idea to make them explicit, again to maintain consistency across the LP32 and LP64 platforms.
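Bit masks are one place where the suffix really matters. The helpers below are a minimal sketch (the function names are invented for this example): they build a value with a single bit set, starting from an explicitly 64 bit constant so the result is the same on both models.

```c
#include <assert.h>

/* Hypothetical helpers: return a value with only bit n set. */

/* Signed variant, valid for n in 0..62. */
long long bit_ll(unsigned n)
{
    assert(n < 63);     /* bit 63 is the sign bit of a 64 bit long long */
    return 1LL << n;    /* 1LL is at least 64 bits wide on both models */
}

/* Unsigned variant, valid for n in 0..63. */
unsigned long long bit_ull(unsigned n)
{
    assert(n < 64);
    return 1ULL << n;
}
```

Writing the same thing as 1 << 40 shifts a 32 bit int past its width, which is undefined behaviour on both models, and 1L << 40 only works where long is 64 bits wide.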

One option would be to use types that specify the size of integers, such as int32_t, int64_t, uint32_t, etc. I would recommend staying away from those. Way back in the day when there were many more flavors of Unix than there are now (remember Ultrix and OSF/1?), the size of integers was not written in stone and vendors would create their own implementation, so an int could be 32 bit on one platform and 16 bit on another. Very messy, but those days are behind us, thank goodness.

Casting can get interesting. Consider the (contrived) statement below: it will work on LP32, but on LP64 the 64 bit result of the subtraction gets squeezed into a 32 bit int, so fee can end up holding garbage. I know you will tell me that this is poor coding, but I have seen this before and it caused the application to crash:

	int fee;
	long fi;
	int fo;

	fee = fi - fo;

The fee variable should be a long. Casting fo to a long would be good form, but it is not really required in this case because the compiler promotes fo to a long before doing the subtraction.
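One way to write the safer version is sketched below. The function names are made up for illustration, and the range check is just one possible way of catching a value that no longer fits before it is narrowed.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helpers, for illustration only. */

/* Keep the result as wide as the widest operand. */
long difference(long fi, int fo)
{
    return fi - (long)fo;    /* the cast is for form; fo is promoted anyway */
}

/* Non-zero if value can be stored in an int without losing bits. */
int fits_in_int(long value)
{
    return value >= INT_MIN && value <= INT_MAX;
}

/* Narrow to int only after checking that nothing is lost. */
int checked_narrow(long value)
{
    assert(fits_in_int(value));
    return (int)value;
}
```

On LP32 fits_in_int() is always true because int and long have the same range. On LP64 it is the check that turns a silently mangled value into a failure you can see in testing.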

Generally scripting languages, along with Java and C# (amongst others), will shield you from these issues, but I did notice that integers in Perl were represented as longs for bit shifting purposes, so shifting 48 bits to the left would work on LP64 but fail silently on LP32. There is some leakage there.
