New gene prediction method capitalizes on multiple genomes
December 20, 2007
Researchers at Stanford University report in the online open-access journal Genome Biology a new approach to computationally predicting the locations and structures of protein-coding genes in a genome. Gene finding remains an important problem in biology, as scientists are still far from fully mapping the set of human genes.
Furthermore, gene maps for other vertebrates, including important model organisms such as mouse, are far less complete than the human annotation. The new technique, known as CONTRAST (CONditionally TRAined Search for Transcripts), works by comparing a genome of interest to the genomes of several related species.
CONTRAST exploits the fact that protein-coding genes play a specific functional role within a cell and are therefore subject to characteristic evolutionary pressures. For example, mutations that alter an important part of a protein's structure are likely to be deleterious and thus selected against. On the other hand, mutations that preserve a protein's amino acid sequence are normally well tolerated. Thus, protein-coding genes can be identified by searching a genome for regions that show evidence of such patterns of selection. However, learning to recognize such patterns when more than two species are compared has proved difficult.
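The contrast between mutations that change a protein and those that preserve its amino acid sequence can be illustrated with a small sketch. This is not CONTRAST's method, only a toy view of the kind of signal involved: it compares two aligned, in-frame sequences codon by codon and counts differences that keep the amino acid (synonymous) versus those that change it (nonsynonymous). The function name `classify_codon_changes` and the example sequences are hypothetical; the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_changes(target, informant):
    """Count (synonymous, nonsynonymous) codon differences.

    Assumes both sequences are already aligned, equal in length, and in frame.
    Codons containing gaps or ambiguous bases are skipped.
    """
    if len(target) != len(informant):
        raise ValueError("aligned sequences must have equal length")
    synonymous = nonsynonymous = 0
    usable = len(target) - len(target) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        a = target[i:i + 3].upper()
        b = informant[i:i + 3].upper()
        if a == b or a not in CODON_TABLE or b not in CODON_TABLE:
            continue
        if CODON_TABLE[a] == CODON_TABLE[b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Hypothetical toy alignment: GCT -> GCC keeps alanine, AAA -> AGA changes
# lysine to arginine.
counts = classify_codon_changes("ATGGCTAAA", "ATGGCCAGA")
```

In a region under the selection described above, one would expect synonymous differences to be over-represented relative to nonsynonymous ones; a real gene finder has to weigh this kind of evidence across many aligned species at once.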
Previous systems for gene prediction were able to make effective use of one additional 'informant' genome. For example, when searching for human genes, taking into account information from the mouse genome led to a substantial increase in accuracy. But no system was able to leverage additional informant genomes to improve upon state-of-the-art performance using mouse alone, although it was expected that adding informants would make patterns of selection clearer.
CONTRAST solves this problem by learning to recognize the signature of protein-coding gene selection in a fundamentally different way from previous approaches. Instead of constructing a model of sequence evolution, CONTRAST directly 'learns' which features of a genomic alignment are most useful for recognizing genes. This approach leads to overall higher levels of accuracy and is able to extract useful information from several informant sequences.
In a test on the human genome, CONTRAST exactly predicted the full structure of 59% of the genes in the test set, compared with the previous best result of 36%. Its exact exon sensitivity of 93%, compared with a previous best of 84%, translates into many thousands of exons correctly predicted by CONTRAST but missed by previous methods. Importantly, CONTRAST's accuracy using a combination of eleven informant genomes was significantly higher than its accuracy using any single informant. The substantial advance in predictive accuracy represented by CONTRAST will further efforts to complete protein-coding gene maps for human and other organisms.
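For readers unfamiliar with the metric, exact exon sensitivity is commonly taken to mean the fraction of annotated exons whose start and end coordinates are both predicted exactly. The sketch below works under that assumed definition; the function name and the coordinates are hypothetical and are not data from the study.

```python
def exact_exon_sensitivity(annotated, predicted):
    """Fraction of annotated exons matched exactly (same start and end)."""
    annotated = set(annotated)
    if not annotated:
        raise ValueError("no annotated exons to evaluate against")
    return len(annotated & set(predicted)) / len(annotated)


# Hypothetical exons as (start, end) pairs. The second prediction misses the
# annotated end by ten bases, so it does not count as an exact match.
annotated_exons = [(100, 200), (300, 450), (600, 720), (900, 1000)]
predicted_exons = [(100, 200), (300, 440), (600, 720), (900, 1000), (1200, 1300)]
sensitivity = exact_exon_sensitivity(annotated_exons, predicted_exons)
```

Note that the extra predicted exon at (1200, 1300) does not affect sensitivity; it would count against specificity instead.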
Further information about existing gene-prediction methods and the advance CONTRAST brings to the field can be found in a minireview by Paul Flicek, which accompanies the article by Batzoglou and colleagues.
Source: BioMed Central