New gene prediction method capitalizes on multiple genomes

December 20, 2007

Researchers at Stanford University report a new approach to computationally predicting the locations and structures of protein-coding genes in a genome. The work appears in the online open-access journal Genome Biology. Gene finding remains an important problem in biology, as scientists are still far from fully mapping the set of human genes.

Furthermore, gene maps for other vertebrates, including important model organisms such as mouse, are far less complete than the human annotation. The new technique, known as CONTRAST (CONditionally TRAined Search for Transcripts), works by comparing a genome of interest to the genomes of several related species.

CONTRAST exploits the fact that protein-coding genes play a specific functional role within a cell and are therefore subject to characteristic evolutionary pressures. For example, mutations that alter an important part of a protein's structure are likely to be deleterious and thus selected against. On the other hand, mutations that preserve a protein's amino acid sequence are normally well tolerated. Thus, protein-coding genes can be identified by searching a genome for regions that show evidence of such patterns of selection. However, learning to recognize such patterns when more than two species are compared has proved difficult.
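To make the selection pattern concrete, the following Python sketch counts how many codon differences between two aligned coding sequences preserve the encoded amino acid (synonymous) and how many change it (nonsynonymous). It is an illustration only, not CONTRAST's method: the function name `classify_codon_changes` and the example sequences are hypothetical, and the code assumes both sequences are in frame, ungapped, and translated with the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_changes(target: str, informant: str) -> tuple[int, int]:
    """Return (synonymous, nonsynonymous) counts of differing codons.

    Assumes (for illustration) that both sequences are in-frame,
    ungapped, of equal length, and a multiple of three bases long.
    """
    if len(target) != len(informant) or len(target) % 3 != 0:
        raise ValueError("sequences must be aligned, in-frame codons")

    synonymous = 0
    nonsynonymous = 0
    for i in range(0, len(target), 3):
        codon_a = target[i:i + 3].upper()
        codon_b = informant[i:i + 3].upper()
        if codon_a == codon_b:
            continue  # identical codons carry no substitution
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1  # amino acid preserved
        else:
            nonsynonymous += 1  # amino acid changed


    return synonymous, nonsynonymous


# Hypothetical toy sequences: Met-Ala-Lys versus Met-Ala-Arg.
# GCT -> GCC keeps alanine; AAA -> AGA turns lysine into arginine.
print(classify_codon_changes("ATGGCTAAA", "ATGGCCAGA"))  # (1, 1)
```

In a real coding region under selection, synonymous differences would be expected to outnumber nonsynonymous ones; a region where the two occur at similar rates is less likely to encode a protein.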

Previous systems for gene prediction were able to make effective use of one additional 'informant' genome. For example, when searching for human genes, taking into account information from the mouse genome led to a substantial increase in accuracy. But no system was able to leverage additional informant genomes to improve upon state-of-the-art performance using mouse alone, although it was expected that adding informants would make patterns of selection clearer.

CONTRAST solves this problem by learning to recognize the signature of protein-coding gene selection in a fundamentally different way from previous approaches. Instead of constructing a model of sequence evolution, CONTRAST directly 'learns' which features of a genomic alignment are most useful for recognizing genes. This approach yields higher overall accuracy and is able to extract useful information from several informant sequences.

In a test on the human genome, CONTRAST exactly predicted the full structure of 59% of the genes in the test set, compared with the previous best result of 36%. Its exact exon sensitivity of 93%, compared with a previous best of 84%, translates into many thousands of exons correctly predicted by CONTRAST but missed by previous methods. Importantly, CONTRAST's accuracy using a combination of eleven informant genomes was significantly higher than its accuracy using any single informant. The substantial advance in predictive accuracy represented by CONTRAST will further efforts to complete protein-coding gene maps for human and other organisms.
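The two headline figures above can be read as simple ratios. The Python sketch below shows one common way such metrics are computed; it is an assumption about the definitions, which the article does not spell out. Here an exon counts as correct only if both of its boundaries match the annotation, and a gene counts as correct only if its full set of exons matches. The function names and coordinates are hypothetical, and chromosome and strand are ignored for brevity.

```python
from typing import Sequence, Tuple

Exon = Tuple[int, int]      # (start, end) coordinates of one exon
Gene = Tuple[Exon, ...]     # a gene structure as a tuple of exons


def exon_sensitivity(annotated: Sequence[Gene],
                     predicted: Sequence[Gene]) -> float:
    """Fraction of annotated exons predicted with both boundaries exact."""
    true_exons = {exon for gene in annotated for exon in gene}
    predicted_exons = {exon for gene in predicted for exon in gene}
    if not true_exons:
        return 0.0
    return len(true_exons & predicted_exons) / len(true_exons)


def exact_gene_fraction(annotated: Sequence[Gene],
                        predicted: Sequence[Gene]) -> float:
    """Fraction of annotated genes whose full exon structure is predicted."""
    predicted_structures = {tuple(sorted(gene)) for gene in predicted}
    if not annotated:
        return 0.0
    hits = sum(1 for gene in annotated
               if tuple(sorted(gene)) in predicted_structures)
    return hits / len(annotated)


# Hypothetical toy annotation: two genes with five exons in total.
annotated = [
    ((100, 200), (300, 400)),
    ((1000, 1100), (1200, 1300), (1500, 1600)),
]
# Hypothetical prediction: one exon boundary in the second gene is off.
predicted = [
    ((100, 200), (300, 400)),
    ((1000, 1100), (1200, 1310), (1500, 1600)),
]

print(exon_sensitivity(annotated, predicted))     # 0.8
print(exact_gene_fraction(annotated, predicted))  # 0.5
```

The toy case also shows why gene-level accuracy is lower than exon-level accuracy: a single misplaced exon boundary is enough to make the whole gene structure count as wrong.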

Further information about existing gene-prediction methods and the advance CONTRAST brings to the field can be found in a minireview by Paul Flicek, which accompanies the article by Batzoglou and colleagues.

Source: BioMed Central



