# Tuesday, 04 July 2006

I'm here again!

Sorry, I have few excuses for not posting for so long (more than a month!). The only justification is that I have been abroad: I spent the last month in Salt Lake City, UT, working on the project I am currently employed on at IASMA.

Over the last year, since I started working here on the Grape Genome Project in July 2005, we have prepared all the programs, databases and informatics infrastructure needed to house and analyze the grape genome data. This included databases for handling huge data volumes, applications for mass analysis of data (gene prediction, automatic annotation of as many genes as possible using different sources, etc.) and a first infrastructure to query and retrieve data on the genome.

This part involved a lot of scripting, since it was mostly a matter of handling data and gluing existing programs together, and this is the reason I had to learn Perl. But there was also room for building interesting programs that make good use of structured data, such as the Gene Ontology. Ontologies have a lot of limitations (I will probably dedicate a post to this subject), but they are a great leap forward from the current state of biological data handling. The structure of the Gene Ontology allows us to do some very interesting things, like inferring functions more precisely by taking into consideration more than one source, the “informativeness” of the assigned term, and the degree of confidence we have in it; it is possible, for example, to see whether a term is supported consistently across different sources. We also used the Gene Ontology to create a new and flexible tool for analyzing expression in different classes of biological processes from microarray experiments, and a new data-mining tool that queries genes not through textual searches (as most public databases do) or sequence similarity searches, but through “semantic” queries, i.e. queries based on the “type” of the gene.
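To give a rough idea of what a “semantic” query means here, this is a minimal Python sketch, not the tool we actually built. It assumes a toy ontology with only "is_a" links; the term names, the gene identifiers and the helper functions `ancestors` and `genes_of_type` are all invented for illustration. Asking for a term returns every gene annotated with that term or with any more specific term below it.

```python
# Toy ontology: child term -> set of parent terms ("is_a" links only).
# Term names are invented for illustration; they are not real GO identifiers.
PARENTS = {
    "transport": {"biological_process"},
    "ion_transport": {"transport"},
    "sugar_transport": {"transport"},
    "metabolism": {"biological_process"},
    "sugar_metabolism": {"metabolism"},
}

# Gene -> directly assigned terms (hypothetical annotations).
ANNOTATIONS = {
    "gene_A": {"ion_transport"},
    "gene_B": {"sugar_transport", "sugar_metabolism"},
    "gene_C": {"metabolism"},
}


def ancestors(term):
    """Return the term itself plus every term reachable through parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, ()))
    return seen


def genes_of_type(query_term):
    """Return genes annotated with query_term or with any more specific term."""
    return sorted(
        gene
        for gene, terms in ANNOTATIONS.items()
        if any(query_term in ancestors(term) for term in terms)
    )


print(genes_of_type("transport"))   # ['gene_A', 'gene_B']
print(genes_of_type("metabolism"))  # ['gene_B', 'gene_C']
```

A textual search for "transport" would only find it if the word appeared in a gene description; here gene_A is found because its term sits below "transport" in the hierarchy.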

Be aware that ontologies and their terms are not type systems and types; personally I think they are a starting point on which to build more formal and complete techniques. But this will be part of future work, maybe even my future work… :)
