Interesting list of distributed key-value stores:

Perhaps you’re considering using a dedicated key-value or document store instead of a traditional relational database. Reasons for this might include:

  1. You’re suffering from Cloud-computing Mania.
  2. You need an excuse to ‘get your Erlang on’.
  3. You heard CouchDB was cool.
  4. You hate MySQL, and although PostgreSQL is much better, it still doesn’t have decent replication. There’s no chance you’re buying Oracle licenses.
  5. Your data is stored and retrieved mainly by primary key, without complex joins.
  6. You have a non-trivial amount of data, and the thought of managing lots of RDBMS shards and replication failure scenarios gives you the fear.

Whatever your reasons, there are a lot of options to choose from. At we do a lot of batch computation in Hadoop, then dump it out to other machines where it’s indexed and served up over HTTP and Thrift as an internal service (stuff like ‘most popular songs in London, UK this week’ etc). Presently we’re using a home-grown index format which points into large files containing lots of data spanning many keys, similar to the Haystack approach mentioned in this article about Facebook photo storage. It works, but rather than build our own replication and partitioning system on top of this, we are looking to potentially replace it with a distributed, resilient key-value store for reasons 4, 5 and 6 above.
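To make the "index which points into large files" idea concrete, here is a minimal sketch in Python. It is not the quoted author's format or Facebook's Haystack; the names `build` and `lookup` and the in-memory layout are assumptions made purely for illustration. Values for many keys are packed back to back into one blob, and a small index maps each key to an (offset, length) pair.

```python
import io


def build(records):
    """Pack values back to back into one blob.

    Returns (blob, index), where index maps key -> (offset, length).
    Hypothetical layout for illustration only.
    """
    buf = io.BytesIO()
    index = {}
    for key, value in records:
        # Record where this value starts and how long it is.
        index[key] = (buf.tell(), len(value))
        buf.write(value)
    return buf.getvalue(), index


def lookup(blob, index, key):
    """Return the bytes stored for key, or None if the key is unknown."""
    entry = index.get(key)
    if entry is None:
        return None
    offset, length = entry
    return blob[offset:offset + length]


blob, index = build([("london", b"song-a,song-b"), ("paris", b"song-c")])
print(lookup(blob, index, "paris"))
```

A real system would keep the blob on disk and seek to the offset rather than slicing bytes in memory, and it would still need the replication and partitioning that the quoted post is looking to get from an existing store.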

At Feedster we initially used MySQL as an RDBMS and quickly switched to using it pretty much as a key-value store. That is a little simplistic: we did run selects to gather multiple tuples, but we eschewed anything more complex than a simple select … from table where … (basically single-table selects).

I am not ready to jump ship to an Anti-RDBMS just yet. I still need to be able to get multiple tuples containing a specific indexed value, so select … from table where … is still required.
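The two access patterns I am talking about can be sketched roughly as follows. This uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table name `items`, its columns, and the helper functions are all made up for the example; it is not the Feedster schema.

```python
import sqlite3

# In-memory database standing in for MySQL; schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items ("
    "id INTEGER PRIMARY KEY, feed_id INTEGER NOT NULL, body TEXT NOT NULL)"
)
# Secondary index: this is what a pure key-value store does not give you.
conn.execute("CREATE INDEX items_feed_id ON items (feed_id)")
conn.executemany(
    "INSERT INTO items (id, feed_id, body) VALUES (?, ?, ?)",
    [(1, 10, "first"), (2, 10, "second"), (3, 20, "third")],
)


def get_by_key(conn, item_id):
    """Key-value style access: fetch one value by primary key."""
    row = conn.execute(
        "SELECT body FROM items WHERE id = ?", (item_id,)
    ).fetchone()
    return row[0] if row else None


def get_by_feed(conn, feed_id):
    """Single-table select: fetch every tuple with a given indexed value."""
    return conn.execute(
        "SELECT id, body FROM items WHERE feed_id = ? ORDER BY id", (feed_id,)
    ).fetchall()


print(get_by_key(conn, 2))
print(get_by_feed(conn, 10))
```

The first function is all a key-value store needs to support. The second is the part I would have to rebuild myself, for example by maintaining my own list of ids per indexed value.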


2 Responses to Anti-RDBMS

  1. Mihaï says:

    About Vertica what do you think?

  2. I only know what is published on their website, and it looks interesting, though it seems to be geared towards analytics rather than storage, which is what the products listed in the post I refer to deal with.
