Future of biology rests in harnessing data avalanche

September 4, 2008

(PhysOrg.com) -- Like most sciences, biology is inundated with data. However, a group of researchers warns in a Nature feature that the avalanche of biological information has reached the point where the discipline may be unable to reach its full potential without improvements in curating data into online databases. The commentary appears in the September 4 issue of the journal and outlines specific remedies to harness the information overload.

By July 2008, data extractors, or curators, had indexed over 18 million articles in PubMed and entered sequences from over 260,000 organisms into GenBank. Both are examples of databases where biological information is stored for public access. Data curation is very labor intensive.

“There is a lack of standardization or consistency in the way scientists report their findings in different journals,” remarked corresponding author Sue Rhee of the Carnegie Institution’s Department of Plant Biology and principal investigator of The Arabidopsis Information Resource (TAIR). “In some cases the researchers don’t even specify the species of a gene under study. That leaves biocurators, who have advanced degrees in biology and expertise with databases and scripting languages, to read the full text and transfer the essence of the information into specific fields in the database. They spend a lot of time just figuring out the basics. And that leaves a lot of room for error.”

Curation is not just a data organization tool. Such input has become essential to biological research. The authors note that eleven different databases had ¾ of a million visitors who viewed 20 million pages in just one month. And with inference programs that feed on the curated data, researchers can now tap into other work that relates to theirs and use that data in their own experiments, a huge advancement that is accelerating the pace of biology. “With this vast universe of information, the whole nature of experimentation is changing,” continued Rhee. “But the field is being held back with the curation backlog.”

The group of authors outlined a series of solutions to the problem. The first is to have authors input their data directly into databases upon acceptance in refereed journals. This step has already begun with Plant Physiology and TAIR. When a manuscript is accepted, researchers now fill in a web form about Arabidopsis genes. Second, the commentators urge the biological community to adopt standard reporting formats that are universally agreed upon. And third, curation needs to be given higher standing by academic institutions and funding agencies. There should also be incentives for researchers to curate their own data, such as increases in academic recognition, career advancement, and funding. They additionally suggest that “community annotation” could be modeled after large-scale astronomy projects like the Sloan Digital Sky Survey or Galaxy Zoo, where 80,000 astronomers and interested amateurs classified one million galaxies in less than three weeks.

“The effort and cost required to curate the data is small compared with the cost of carrying out the research in the first place, yet this additional step adds tremendously to the value of the research results to society,” commented Eva Huala, director of TAIR.

Wolf Frommer, acting director of Carnegie’s Department of Plant Biology, noted that “advances in our understanding of biology will affect our food supply, our health-care system, the development of remedies for climate change, and many other aspects of daily life. Basic and applied research have to go hand in hand with curation of databases so that humanity can adapt to the quickly changing world as fast as possible.”

Provided by Carnegie Institution
