Contamination found in nearly a quarter of genome databases
February 18, 2011 By Christine Buckley
Mark Longo, a graduate student in molecular and cell biology, and associate professor Rachel O'Neill. Photo by Dan Buttrey
(PhysOrg.com) -- UConn scientists say the results could complicate disease identification in humans.
A new genomics study by molecular biologists at the University of Connecticut has shown that at least 22 percent of non-human genome databases are contaminated with human DNA. Their results imply that this level of contamination could also exist in records of the human genome, which could produce major problems in identifying human diseases.
Associate professor Rachel O'Neill, graduate student Mark Longo, and associate professor Michael O'Neill of the molecular and cell biology department in the College of Liberal Arts and Sciences published their findings today in an online edition of the journal PLOS One.
Longo says that he had originally been scanning the genome of zebrafish and comparing it with the human genome to find what are called ultraconserved regions, or bits of DNA that are so ancient they are similar among species that are distantly related, like humans and fish.
But, to Longo's surprise, he found a region of DNA that was identical to one in humans and couldn't be a part of the fish genome. That's when he knew that the fish genome database he was using was contaminated.
"Contamination in these databases could be from people's skin or hair, or it could be DNA from other sequence libraries kept in the same facility," says Longo. "We knew we needed to quantify this to see how many of the databases contained human contamination."
The researchers gathered sequences from all the major global DNA repositories, including the archives at the National Center for Biotechnology Information, the University of California Santa Cruz, the Joint Genome Databases, and the Ensembl genome browser. Sequence data from any federally funded sequencing project must be deposited in one of these archives.
Using a section of DNA that is specific to primates and abundant in the human genome, the researchers identified 454 non-primate genomes out of the 2,027 they sampled as contaminated with human DNA.
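The screening idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the researchers' actual pipeline: the marker string, the record names, and the functions `reverse_complement`, `is_flagged`, and `screen` are all hypothetical, and exact substring matching stands in for the sequence-alignment tools a real study would more likely use. The only figures taken from the article are the 454 flagged genomes out of 2,027 sampled.

```python
# Toy contamination screen: flag any record that contains a marker sequence
# on either DNA strand. MARKER is a made-up placeholder, not a real
# primate-specific element.
MARKER = "GATTACAGATTACA"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_flagged(seq: str, marker: str = MARKER) -> bool:
    """True if the marker occurs on the forward or the reverse strand."""
    seq = seq.upper()
    marker = marker.upper()
    return marker in seq or reverse_complement(marker) in seq


def screen(records: dict[str, str], marker: str = MARKER) -> list[str]:
    """Return the names of all records that contain the marker."""
    return [name for name, seq in records.items() if is_flagged(seq, marker)]


# Hypothetical records, for illustration only.
records = {
    "sample_a": "ccatGATTACAGATTACAttg",   # marker on the forward strand
    "sample_b": "AATGTAATCTGTAATCGG",      # marker on the reverse strand
    "sample_c": "ATATATATATATAT",          # no marker
}
flagged_names = screen(records)

# Proportion reported in the article: 454 flagged out of 2,027 sampled.
reported_percent = 454 / 2027 * 100
print(flagged_names, f"{reported_percent:.1f}%")
```

The last lines reproduce the article's headline figure: 454 of 2,027 works out to roughly 22.4 percent, consistent with the "at least 22 percent" quoted earlier.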
Rachel O'Neill says this result led them to reason that if these non-human genome databases were contaminated with human DNA, then it's just as likely that many human databases would be contaminated as well. But, she says, the catch is that it's virtually impossible to identify a foreign bit of human DNA in a human genome database.
"In sequencing, you have to put all the pieces of the genome together like a big jigsaw puzzle. The pieces that don't fit stand out," Longo says. "But if you're working on a human puzzle, it's like working on a three-billion piece puzzle, and it's all black."
"It's virtually impossible to find human contamination in human genome databases," O'Neill adds, because the contaminating sequences simply don't stand out as anything unusual in a human genome. This, she says, could lead to some terrible mistakes.
A portion of the National Center for Biotechnology Information includes a Cancer Genome Atlas: a library documenting mutations that occur in cancer cells. O'Neill says there's no room for error in these databases.
"It would be very upsetting to be told you have a mutation for breast cancer, when in fact you don't, and it was just a contamination from another sample," she says.
O'Neill emphasizes that scientists need to exercise extreme caution when performing their sequencing, and that they should validate results through tests in their own laboratories before submitting them to databases. Longo points out that the UConn researchers found contamination in some sequences that they had produced in their own laboratories, which they then discarded. O'Neill says these practices should be the norm.
"We're compounding this problem in our rush to move forward with genomics," she says. "Millions of dollars are invested each year in these sequence databases, but we're plowing ahead with less caution than we should. The result is that we might have a harder time recognizing the etiology of something like cancer."
Longo notes that in his analysis, there was one type of DNA database that showed no contamination at all: that of influenza. "Because viruses are so dangerous, great care is taken in their preparation," he says, much more than is usually taken with a commonplace and harmless genome. This kind of caution should be extended to all sequencing, says O'Neill.
"The sequencing world has moved in leaps and bounds," she says. "It's time for validation to catch up."
Feb 18, 2011
Rank: not rated yet
I wonder that too. Lawyers have always presented DNA evidence to juries as if it were essentially infallible. It is not infallible, and this information may provide defense attorneys with a counterargument.
Feb 19, 2011
Rank: not rated yet
Probably exactly nothing, since people have millions of copies of their DNA throughout their bodies, and I doubt the same contamination is present in every single one.
Feb 19, 2011
Rank: not rated yet
If nearly a quarter of DNA databases, which were carefully gathered in a manner to minimize contamination, are contaminated, what level of contamination can be expected of DNA evidence gathered from dirty crime scenes where there have been no means to ensure uncontaminated samples?
Suppose a match to a suspect has a 1 in a million chance of being wrong. Now recognize that the sample taken at the crime scene has a 50% chance of contamination. The 1 million to 1 probability of guilt has just dropped to 1 in 2 (50%), since there is only a 50% probability that the sample is not contaminated.
Such "evidence" becomes meaningless.
Feb 19, 2011
Rank: not rated yet
Though the contamination is probably a small percentage of the total DNA present. The article states that the researchers found a bit of DNA while looking at ancient strands. If at a crime scene, they find the DNA of a person and it has small portions of another person, they'll probably go with whichever DNA is predominant.
Feb 21, 2011
Rank: 5 / 5 (1)
Contamination shows up lighter than the predominant set of DNA. So it is usually possible to compare the sample with a suspect's sample even with some contamination.
DNA sequencing is different because the cheap way to sequence was to chop it all up and then put it in the right order with a computer program. They don't see it as a whole, just a set of A-G-C-T, mostly for proteins. You would not have contamination looking like a ghost image as it does in the forensic testing.
IIRC Venter's technique ignored a lot of the DNA that didn't code for proteins. Might have stopped that shortcut by now.
Ethelred
Feb 21, 2011
Rank: 3 / 5 (2)
First of all, you're calculating the probability of the evidence given innocence (i.e. P(DNA match | innocent)) and should instead be doing probability of guilt given the evidence (i.e. P(guilt|DNA match)), which, if you'd listen to most biologists, isn't all that high. It's fairly hard to find samples to get DNA analysis from in most crime scenes, since the chance of contamination is always very high. Unfortunately, a society that's led by expectations of CSI-esque evidence believes that it's only a matter of looking hard enough.
Secondly, as Ethelred pointed out, you don't sequence evidence, you gel it.
Feb 22, 2011
Rank: 3 / 5 (2)
"Contamination is problematic, however, when comparing genes from closely related taxa"
"But, she says, the catch is that it’s virtually impossible to identify a foreign bit of human DNA in a human genome database."
SHUCKS!!! That is and has been my argument!!
“In sequencing, you have to put all the pieces of the genome together like a big jigsaw puzzle. The pieces that don’t fit stand out,” ... “But if you’re working on a human puzzle, it’s like working on a three-billion piece puzzle, and it’s all black.”
“It’s virtually impossible to find human contamination in human genome databases,” she adds, because they simply don’t stand out as anything unusual in a human genome. This..., could lead to some terrible mistakes. INDEED, or bad assumptions!
Feb 23, 2011
Rank: 5 / 5 (1)
Ethelred