Scientists compare 12 fruit fly genomes

November 7, 2007
Genomic revelations from fly's family tree

In one of the first large-scale comparisons of multiple animal genomes, scientists at the Broad Institute of MIT and Harvard, the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT, and many collaborating institutions have analyzed the genomes of twelve species of the fruit fly Drosophila to gain insights into the evolution of genes and genomes and to discern the functional elements encoded in animal DNA.

The work appears in the November 8 issue of Nature and in more than 40 accompanying papers in Genome Research and other journals.

The method of comparing the genomes of multiple related species, fly or otherwise, not only reveals new insights into species evolution and identifies thousands of novel genes and other functional elements, but also provides a powerful tool for unraveling genome function that may help researchers unlock the secrets of our own genome.

In these papers, the international consortium reported the genomes of ten newly sequenced Drosophila species, some very closely related and others less so, and their comparison to two previously sequenced flies including Drosophila melanogaster, one of the most powerful model organisms for the study of animal biology and evolution. The availability of so many Drosophila genomes has yielded numerous new insights into genome function and aided the study of how genomes have changed over evolutionary time.

“Having the sequences of many closely related species allows us to study the evolutionary forces that have shaped the fruit fly’s family tree, and to discover the working parts of the fly genome in a systematic way,” said Manolis Kellis, associate member of the Broad Institute, assistant professor in MIT’s CSAIL, and one of the consortium’s project leaders.

On one hand, the researchers studied the differences across species to help elucidate how evolution has shaped fly biology over millions of years. Their analysis revealed that while many attributes of Drosophila genomes are conserved across multiple species, each species has novel features not seen in any other. Only 77 percent of the approximately 13,700 protein-coding genes in D. melanogaster are shared with all of the other 11 species. Moreover, the genes involved in interactions with the environment and in reproduction showed signs of adaptive evolution, meaning that changes in them likely provided some survival advantage to the organism.

On the other hand, the researchers studied the similarities of the different species to help define the functional parts of the fly genome. The parts of a genome that are unchanged (conserved) are those that have been kept by evolution, and are thus likely to play crucial roles. Genome comparison can therefore reveal which regions of the genome are functional, based on the degree to which evolution has conserved them.
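The idea of scoring conservation can be illustrated with a minimal Python sketch. It is not the consortium's method: it assumes the sequences are already aligned and of equal length, and it scores each alignment column simply as the fraction of species that share the most common nucleotide. The function name and the toy alignment are invented for illustration.

```python
from collections import Counter


def column_conservation(aligned_seqs):
    """Return, for each alignment column, the fraction of sequences
    that carry the most common nucleotide in that column.

    Assumes the sequences are pre-aligned and of equal length
    (a simplification made for this illustration).
    """
    lengths = {len(seq) for seq in aligned_seqs}
    if len(lengths) != 1:
        raise ValueError("expected a non-empty list of equal-length sequences")

    scores = []
    for column in zip(*aligned_seqs):
        # Count of the most frequent nucleotide in this column.
        top_count = Counter(column).most_common(1)[0][1]
        scores.append(top_count / len(aligned_seqs))
    return scores


# Toy alignment of four hypothetical species (invented data).
toy_alignment = ["ACGT", "ACGA", "ACCT", "ACGT"]
print(column_conservation(toy_alignment))
```

In this toy alignment the first two columns are identical in every sequence and score 1.0, while the last two each contain one differing nucleotide and score 0.75. Real analyses also account for how the species are related and how much change is expected by chance.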

“Focusing on the conserved part of the genome is a great way to discover what has been maintained by evolution,” said Kellis. “Moreover, by looking more closely at the subtle patterns of mutation within conserved regions, we can predict the functional roles they play.”

Indeed, at the level of DNA, several combinations of letters, or nucleotides, may encode the same function, in the way that a storyteller can use different combinations of words to tell the same tale. For example, four different nucleotide combinations – GTT, GTC, GTA, and GTG – all encode the same protein building block, the amino acid valine. Thus, a change in the third letter of any of these combinations would leave the amino acid unchanged, one example of how DNA changes can be tolerated while still preserving the function of the corresponding protein.
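This redundancy can be checked directly against the standard genetic code. The short Python sketch below builds the 64-entry codon table from its usual compact form (bases in T, C, A, G order, one-letter amino acid codes, `*` for stop) and lists every codon that encodes the same amino acid as a given one. The helper names are illustrative and do not come from the papers.

```python
from itertools import product

# Standard genetic code in compact form: codons enumerated with bases in
# T, C, A, G order; one-letter amino acid codes; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def is_synonymous(codon_a, codon_b):
    """True if both codons encode the same amino acid (or both are stops)."""
    return CODON_TABLE[codon_a] == CODON_TABLE[codon_b]


def synonymous_codons(codon):
    """All codons that encode the same amino acid as `codon`, sorted."""
    target = CODON_TABLE[codon]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == target)


print(CODON_TABLE["GTT"])            # V (valine)
print(synonymous_codons("GTT"))      # ['GTA', 'GTC', 'GTG', 'GTT']
print(is_synonymous("GTT", "GTG"))   # True: third-letter change is silent
print(is_synonymous("GTT", "GCT"))   # False: second-letter change gives alanine
```

Any change to the third letter of GTT is silent, whereas changing the second letter (GTT to GCT) swaps valine for alanine.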

Through these kinds of random mutations, evolution explores the space of possible nucleotide combinations that preserve function. This exploration produces unique patterns of genomic change, described by the researchers as “evolutionary signatures,” that are specific to the function of that region of DNA. Protein-coding genes, for example, show frequent substitutions at every third nucleotide, because one amino acid can be encoded by several nucleotide triplets. In contrast, some genes that don’t encode proteins, so-called RNA genes, show changes that preserve the overall structure of RNA while tolerating changes in the genes’ DNA sequence.
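The protein-coding signature described here, substitutions concentrated at every third nucleotide, can be illustrated with a toy calculation. The Python sketch below is a simplified illustration rather than the researchers' algorithm: it assumes two ungapped, in-frame coding sequences of equal length and tallies mismatches by codon position. The function name and both sequences are invented for the example.

```python
def substitutions_by_codon_position(seq_a, seq_b):
    """Count mismatches between two aligned coding sequences, split by
    codon position (first, second, third).

    Assumes both sequences are ungapped, equal in length, and start in
    frame (simplifications made for this illustration).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")

    counts = [0, 0, 0]
    for index, (base_a, base_b) in enumerate(zip(seq_a, seq_b)):
        if base_a != base_b:
            counts[index % 3] += 1
    return counts


# Two invented sequences that both encode Val-Ala-Gly-Leu but differ
# at the third position of every codon.
species_1 = "GTTGCTGGTCTT"
species_2 = "GTCGCAGGGCTA"
print(substitutions_by_codon_position(species_1, species_2))  # [0, 0, 4]
```

A strong excess of third-position changes, as in this toy pair, is the kind of pattern that suggests a region encodes protein. A region evolving without that constraint would be expected to show mismatches spread more evenly across all three positions.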

Like codebreakers turning their knowledge of biology into computational algorithms, Kellis and his colleagues identified distinct evolutionary signatures for a variety of functional roles in the genome: protein-coding genes, non-coding RNAs, microRNAs, and regulatory motifs. Each signature is defined by the changes that evolution tolerates because they still preserve that function.

The researchers then used these evolutionary signatures to systematically identify the functional elements encoded in the fly genome, leading to the discovery of hundreds of novel functional elements and many new insights into animal biology.

The work led to the discovery of 1,193 new sequences that encode proteins, the flagging of 414 regions that were mistakenly labeled as protein-coding genes, and corrections to hundreds of previously annotated protein-coding genes. This allowed the researchers to revise the catalog of protein-coding genes for Drosophila melanogaster, with updates affecting 10 percent of all genes. The revision was confirmed through manual curation by scientists at the FlyBase consortium and through large-scale experimental validation led by the Berkeley Drosophila Genome Project.

In addition, the researchers identified hundreds of new RNA genes and structures, new microRNA genes, and new DNA sequences involved in the control of gene expression during embryo development and environmental changes. The twelve genomes also allowed the prediction of very small regulatory targets in the genome, which can help piece together the first regulatory network for an animal genome without having to perform intensive and expensive experiments.

The work also led to many surprises. The researchers found many protein-coding genes that defy the traditional rules of how the DNA code gets translated into protein. For example, 150 genes apparently bypass signals that would normally cause translation to stop, and other genes encode multiple proteins in a single RNA transcript. Other findings include surprising evidence that a single microRNA gene locus can produce up to four functional microRNAs, each with distinct functions.

The team’s analysis marks the first time that such a diverse range of evolutionary signatures has been applied to identify the functional elements of a genome in a comprehensive way. “By comparing many closely related genomes, we were able to discover things we never thought were possible using one genome sequence alone,” said Kellis. One intriguing possibility is that evolutionary signatures may even identify novel, as yet unknown classes of functions. For example, although the fruit fly has been intensely studied for over a century, microRNAs were only discovered in the last decade, and are now known to play a central role in development. Many other classes of as yet unknown functional elements may be hidden in the fly genome, and recognition of their common evolutionary properties may help lead to their discovery.

The study of the 12 flies has immediate implications for the discovery of functional elements in the human genome. “We are now using similar methods to analyze 32 mammalian genomes, in order to help understand the human genome,” Kellis explained. “We should be able to apply the methodology of evolutionary signatures to any group of closely related species.” Peering into the past and interpreting clues carved in the genome by evolution is yet one more way to make revelations about human biology. As the genome sequences of more organisms become available, the power to make discoveries about functions encoded in the genome will likely continue to increase.

On the whole, genome sequencing projects have given us a glimpse of the incredible variety of life, recording the genetic plans of organisms as wide-ranging as bacteria, algae, insects, and mammals and exposing common genes and functions conserved by evolution. The approach of sequencing many close relatives on the family tree of life provides a rare view of the precise workings of evolution, giving scientists the tools to decipher the secrets hidden in our genome.

Papers cited:

Drosophila 12 Genomes Consortium. (2007) Evolution of genes and genomes in the Drosophila phylogeny. Nature DOI:10.1038/nature06341.

Stark et al. (2007) Discovery of functional elements in 12 Drosophila genomes using evolutionary signatures. Nature DOI:10.1038/nature06340.

Lin et al. (2007) Revisiting the protein-coding gene catalog of Drosophila melanogaster using twelve fly genomes. Genome Research DOI:10.1101/gr6679507.

Stark et al. (2007) Systematic discovery and characterization of fly microRNAs using 12 Drosophila genomes. Genome Research DOI:10.1101/gr6593807.

Stark et al. (2007) Reliable prediction of regulator targets using 12 Drosophila genomes. Genome Research DOI:10.1101/gr7090407.

Rasmussen, Kellis. (2007) Accurate gene-tree reconstruction by learning gene- and species-specific substitution rates across multiple complete genomes. Genome Research DOI:10.1101/gr7105007.

Ruby et al. (2007) Evolution, biogenesis, expression, and target predictions of a substantially expanded set of Drosophila microRNAs. Genome Research DOI:10.1101/gr6597907.

Source: Broad Institute of MIT and Harvard, by Leah Eisenstadt
