Online Mendelian Inheritance in Man (OMIM)

Been spending time working on the OMIM website. Basically a tiered system with an API (developed with Java, MySQL, myBatis and Lucene/Solr), and a front end (developed with Django and JQuery).

Lots of moving parts to the site, every night we download data from about 20 sources (about 3GB of data in total), parse it all and assemble the database and all the links to external resources. Basically a big ETL machine.

What is interesting to me is the breath of quality in the data and the lack of standardization. Actually the only standard that exists is the comma delimiter. The other interesting thing is that some sites really strive to keep their data up to date while others are much more, shall we say, relaxed about it.

OMIM also now has a Twitter account.

Advertisements

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: