Establishing standard definitions for genome sequences
October 8, 2009
Organizers of the 2009 "Sequencing, Finishing, Analysis in the Future" meeting held in May at Santa Fe, N.M. Left to Right: Michael Fitzgerald, Broad Institute; Patrick Chain, DOE JGI; Bob Fulton, Washington University in St. Louis; Donna Muzny, Baylor College of Medicine; Johar Ali, Ontario Institute for Cancer Research; Chris Detter, DOE JGI; and Alla Lapidus, DOE JGI. Credit: DOE Joint Genome Institute
In 1996, researchers from major genome sequencing centers around the world convened on the island of Bermuda and defined a finished genome as a gapless sequence with a nucleotide error rate of one or less in 10,000 bases. This effectively set the quality target for the human genome effort and was quickly applied to other genome projects. If a genome sequence didn't meet this stringent criterion, it was simply considered a "draft."
More than a decade later, researchers are finding that with the advent of the latest sequencing technologies the terms "draft" and "finished" are no longer sufficient to describe the varying levels of genome sequence quality being produced. The quality issue is of particular concern for any researcher who wants to use the sequence, in order to know its integrity and reliability. This is of even greater concern for reference genome sequences, such as those genome projects conducted in support of the U.S. Department of Energy missions of bioenergy and environmental clean-up, because they provide the foundational knowledge of the gene content and how these organisms interact with the environment.
As the proverbial "fire hose of data" becomes a Niagara torrent, with conservative estimates of 12,000 draft genomes hitting the public databases by 2012, researchers may be surprised to find that these datasets describe genomes that are not complete. Recognizing the problem, a group of researchers from several sequencing centers, including the DOE Joint Genome Institute (JGI), the Sanger Institute and the Human Microbiome Project (HMP) Jumpstart Consortium sequencing institutes, has proposed a new set of standards that expand upon the so-called "Bermuda standard." In the October 9 issue of the journal Science, they propose four additional categories between "draft" and "finished" status that reflect varying levels of completeness.
This audio file shares an interview with DOE JGI's Patrick Chain regarding his Oct. 9 paper in Science. Credit: DOE Joint Genome Institute
"In the past we've been limited to two options, requiring us and the other centers to come up with internal definitions," said DOE JGI metagenomics researcher Patrick Chain at Los Alamos National Laboratory (LANL), first author of the Science paper. "But these are not clear and they're not propagated to the databases to which we submit sequences. So when users try to download genomes they get data of unknown quality with no information, or a complete genome that they assume has been checked for missing-data errors."Chain said that when he and the other organizers of the Sequencing, Finishing, Analysis in the Future meeting hosted by LANL first gathered in 2005, they were concerned by the varying quality of the new genomes being submitted to public archives . As the meeting organizers all represented major sequencing centers (and smaller groups as well), the genome projects standards group was initiated at LANL, stimulated by these concerns.
The six categories defined by the group are:
- "Standard draft," which is the minimum amount of information needed for submission to a public database;
- "High quality draft," which is typically generated by large sequencing centers such as DOE JGI, and which has little or no manual review;
- "Improved high quality draft," which consists of data reviewed by either people or machines to some extent so most of the genetic data is assembled correctly, but some errors may still be present;
- "Annotation-directed improvement," which is a sequenced segment that presents all the information in various gene regions as accurately as possible;
- "Noncontiguous finished," which includes sequences that have been reviewed by both people and machines and would be considered complete except for "recalcitrant regions" that are proving problematic;
- "Finished," which defines complete sequences that have minimal errors, if any.
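For readers who want to work with these labels in software, the six categories can be sketched as an ordered enumeration. This is only an illustration: the class name `AssemblyStatus`, the numeric values, the helper `meets_minimum`, and the sample names are all assumptions, and treating the list as a strict ranking from least to most complete is an interpretation of the order given above, not something the proposal assigns codes for.

```python
from enum import IntEnum


class AssemblyStatus(IntEnum):
    """The six proposed categories, in the order the article lists them.

    The integer values are hypothetical and exist only so that the
    categories can be compared; the proposal defines no numeric codes.
    """
    STANDARD_DRAFT = 1
    HIGH_QUALITY_DRAFT = 2
    IMPROVED_HIGH_QUALITY_DRAFT = 3
    ANNOTATION_DIRECTED_IMPROVEMENT = 4
    NONCONTIGUOUS_FINISHED = 5
    FINISHED = 6


def meets_minimum(status: AssemblyStatus, minimum: AssemblyStatus) -> bool:
    """Return True if `status` is at least as complete as `minimum`."""
    return status >= minimum


# Hypothetical records: the sample names are placeholders, not real entries.
records = {
    "sample_a": AssemblyStatus.STANDARD_DRAFT,
    "sample_b": AssemblyStatus.NONCONTIGUOUS_FINISHED,
    "sample_c": AssemblyStatus.FINISHED,
}

# Keep only the records that are at least "improved high quality draft".
usable = sorted(
    name
    for name, status in records.items()
    if meets_minimum(status, AssemblyStatus.IMPROVED_HIGH_QUALITY_DRAFT)
)
print(usable)
```

A researcher downloading sequences could use a filter like this to screen out data below the quality level an analysis requires, which is the kind of decision the labels are meant to support.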
"My hope is all the major genome centers and advanced genomics groups use the gradations that fit their needs," he said. "Some centers may want all six, while some may only want three, but as long as they keep them intact we are in good shape. Then, my hope is that the smaller genomics groups adopt the classes as written to help the rest of the scientific community know what they are generating and submitting."
Chain added that coming up with the proposed standards was not an easy task, since all major centers "have different pipelines, different sequencing techniques, different internal standards." The group also recognized that the attempt to develop a "one size fits all" set of standards is still a work in progress. The definitions provided in the Science paper are fairly flexible, designed to apply regardless of the genome project or sequencing technologies employed and to meet each group's needs.
"We do expect that a number of people will comment on these standards, and possibly expand on the categories," he said, "but we feel we've covered all the bases with these six categories."
Chain said the group plans to team with the Genomic Standards Consortium, a grassroots movement begun by scientists who were concerned about the need for data collection standards in genome projects. The group has also talked to public archives such as GenBank to append these proposed standards to GenBank entries so that researchers can tell if the sequences will be useful to them. "Standards are a major issue to be tackled in genomics right now," Chain said. "These proposals are guideposts meant to inform users and generators."
More information: Chain PSG, Grafham DV et al. (2009) Genome project standards in a new era of sequencing. Science.