Finding deep roots, new genome software infers ancestry with high accuracy
March 19, 2008
Some people may know where their ancestors lived 10 or 20 generations ago, but the rest of us can learn our distant biological heritage only from our DNA. New genomics analysis software developed by computer scientists at Stanford appears far more adept than prior methods at unraveling the ancestry of individuals. A paper describing the HAPAA system, which takes its name from "hapa," the Hawaiian word for someone of mixed ancestry, appears online today and in the April printed issue of the journal Genome Research.
Going back 20 generations the software can identify what continent or broad global region an individual's ancestors were from. But going back about 10 generations the software can be much more precise, making distinctions as fine-grained as the traditional gene pools of nearby population groups—hypothetically differentiating Greek from Italian, or Russian from German.
Specifically, the software compares an individual's genome to the genomes of everyone in the International HapMap database to find the distinct spans of genetic snippets, called haploblocks, that they share.
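To make the idea of shared spans concrete, here is a minimal Python sketch. It is not HAPAA's actual algorithm: it assumes each genome has been reduced to a string of alleles at the same marker positions, and the function name `shared_runs` and the `min_len` threshold are hypothetical choices made for illustration. It simply reports the stretches where two sequences agree.

```python
from typing import List, Sequence, Tuple


def shared_runs(query: Sequence, reference: Sequence, min_len: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, of identical stretches.

    Only stretches of at least `min_len` markers are kept. This is a toy
    stand-in for finding shared haploblocks, not the published method.
    """
    runs = []
    start = None  # index where the current matching stretch began, if any
    n = min(len(query), len(reference))
    for i in range(n):
        if query[i] == reference[i]:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a stretch that runs to the end of the sequences.
    if start is not None and n - start >= min_len:
        runs.append((start, n))
    return runs


# The first four markers match; the later two-marker match is too short.
print(shared_runs("AACCGGTT", "AACCTGTA"))  # [(0, 4)]
```

Running the same comparison against every reference individual, rather than against one summary sequence, is what makes the per-individual approach more expensive to compute.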
"With very high accuracy, even for 20 generations, we can trace the populations of those individuals who are indeed represented in your genome," says Stanford computer science Assistant Professor Serafim Batzoglou, who led a team of graduate students to create HAPAA. They include co-lead authors Andreas Sundquist and Eugene Fratkin, as well as Chuong B. Do.
Batzoglou points out that because the HapMap database, a genetic record of 270 individuals of Western European, West African and East Asian ancestry, is very small, HAPAA now can only generate an ethnic profile in terms of these populations.
Fratkin himself was able to verify that he is of European ancestry, but not that he is 1/64th Polish. But more genomics data will become available, the researchers said, which will further expand the software's ability to help people discern their roots.
Low error, high precision
In the Genome Research paper the researchers tested the system's accuracy using real individuals in the database and by synthesizing virtual people, essentially simulating mating for 20 generations among individuals in the database.
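The idea of synthesizing virtual people can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the procedure used in the paper: each individual is a single haplotype stored as a list of per-marker ancestry labels, each child is formed by one crossover between two randomly chosen parents, and the names `recombine` and `simulate_admixture` are hypothetical. Because the labels travel with the markers, the true ancestry of every simulated marker is known and can later be compared with what the software infers.

```python
import random
from typing import List


def recombine(hap_a: List[str], hap_b: List[str], rng: random.Random) -> List[str]:
    """Build a child haplotype from two parents using a single crossover point."""
    if len(hap_a) != len(hap_b):
        raise ValueError("parental haplotypes must have the same length")
    point = rng.randint(0, len(hap_a))  # crossover position, both ends inclusive
    # Pick at random which parent contributes the left-hand segment.
    left, right = (hap_a, hap_b) if rng.random() < 0.5 else (hap_b, hap_a)
    return left[:point] + right[point:]


def simulate_admixture(founders: List[List[str]], generations: int,
                       pop_size: int, rng: random.Random) -> List[List[str]]:
    """Mate randomly chosen pairs for a number of generations."""
    if len(founders) < 2 or pop_size < 2:
        raise ValueError("need at least two individuals per generation")
    population = [list(h) for h in founders]
    for _ in range(generations):
        next_generation = []
        for _ in range(pop_size):
            parent_a, parent_b = rng.sample(population, 2)
            next_generation.append(recombine(parent_a, parent_b, rng))
        population = next_generation
    return population


rng = random.Random(0)  # fixed seed so the run is repeatable
founders = [["EUR"] * 8, ["EUR"] * 8, ["AFR"] * 8, ["AFR"] * 8]
descendants = simulate_admixture(founders, generations=20, pop_size=6, rng=rng)
for hap in descendants:
    print(hap)
```

The population labels "EUR" and "AFR" and the eight-marker genomes are placeholders; real data would involve hundreds of thousands of markers and a more realistic recombination model.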
The team also compared HAPAA to the current state-of-the-art system, known as SABER. Using mean squared error, a standard statistical measure, Batzoglou and his students found that HAPAA's error rates were between one-third and one-half of SABER's. The gap widened the further back in generations the analysis reached, meaning that HAPAA's error rate remained consistently low even at 15 or 20 generations.
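For readers unfamiliar with the measure, mean squared error is the average of the squared differences between an estimate and the truth. The sketch below assumes a simple encoding that may differ from the paper's: at each marker the truth is 1 if the marker came from a given population and 0 otherwise, and the estimate is the probability the software assigns to that population.

```python
from typing import Sequence


def mean_squared_error(truth: Sequence[float], estimate: Sequence[float]) -> float:
    """Average squared difference between true and estimated values."""
    if len(truth) != len(estimate):
        raise ValueError("truth and estimate must have the same length")
    if len(truth) == 0:
        raise ValueError("need at least one marker")
    return sum((t - e) ** 2 for t, e in zip(truth, estimate)) / len(truth)


# Four markers, one of which is called wrongly with full confidence.
print(mean_squared_error([1, 0, 1, 0], [1, 0, 0, 0]))  # 0.25
```

A method whose error is one-third of another's under this measure is, on average, much closer to the true ancestry at each marker.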
An important advance behind HAPAA's accuracy is its more faithful modeling of individual variation. The Stanford computer scientists created an algorithm efficient enough to compare the genetic information of the test individual to that of every individual in the database. Other systems, including SABER, rely on comparisons to a composite that averages the data from many individuals. That methodology is easier to program and run on a computer, but the problem with averaging is that a lot of information is lost.
Consider an analogy: characterizing a soccer player by comparison. One could compare her total goals scored to the historical league average. Such a comparison would reveal whether she was generally a high scorer, but it could not show whether her scoring patterns (e.g., game winners, late-game goals, penalty kicks) were more like those of Mia Hamm or Birgit Prinz.
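The information lost by averaging can be shown with a toy example in Python. Everything here is an illustrative assumption rather than either system's real model: genomes are short lists of 0/1 alleles, the composite is the per-marker allele frequency, and the two made-up populations are constructed so that their composites are identical even though their members are not.

```python
from typing import List


def composite(population: List[List[int]]) -> List[float]:
    """Per-marker allele frequency: the 'average individual' of a population."""
    size = len(population)
    markers = len(population[0])
    return [sum(hap[i] for hap in population) / size for i in range(markers)]


def distance_to_composite(query: List[int], freq: List[float]) -> float:
    """Sum of absolute differences between a genome and a composite."""
    return sum(abs(q - f) for q, f in zip(query, freq))


def nearest_individual_distance(query: List[int], population: List[List[int]]) -> int:
    """Smallest number of mismatching markers against any single member."""
    return min(sum(q != h for q, h in zip(query, hap)) for hap in population)


pop_a = [[0, 0, 0, 0], [1, 1, 1, 1]]
pop_b = [[0, 1, 0, 1], [1, 0, 1, 0]]
query = [1, 1, 1, 1]

# Both composites are [0.5, 0.5, 0.5, 0.5], so averaging cannot tell them apart.
print(distance_to_composite(query, composite(pop_a)))  # 2.0
print(distance_to_composite(query, composite(pop_b)))  # 2.0

# Comparing against each individual reveals an exact match in population A.
print(nearest_individual_distance(query, pop_a))  # 0
print(nearest_individual_distance(query, pop_b))  # 2
```

The cost of the second approach grows with the number of reference individuals, which is why an efficient algorithm matters.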
For now the HAPAA software serves as a proof of concept, but its utility is limited by the small size of the HapMap database. In the future the software will benefit not only from having more individuals available for comparison, Batzoglou said, but also from more detailed data about each individual. Today's genome samples track about 500,000 markers, or common genetic differences, but there are about 10 million candidates. Most individuals have about 3 million such specific differences. As genomics technology improves, he says, so will HAPAA's ability to infer ancestry from the data.
The research was supported by a grant from the National Institutes of Health and a Stanford graduate fellowship provided by the German software company SAP AG.
Source: David Orenstein, Stanford University