BaBar Collaboration Completes Data Reprocessing
December 18, 2008 By Kelen Tuttle
(PhysOrg.com) -- One might think that processing the records of 22 billion electron and positron collisions once would be enough. But not so for the BaBar collaboration, which this week announced the completion of reprocessing for 99.99 percent of its huge coffers of Upsilon(4S) raw data.
Processing is one of the very first steps in data analysis and involves putting raw data into a more useful form. This requires taking the signals recorded by BaBar's many layers of detectors and reconstructing which types of particles left them, in which directions they were traveling, and at what speeds. These reconstructed data are then compared to simulated data to identify particularly interesting events and divided into many different streams from which researchers can pluck event types of interest.
Over the years, the collaboration has repeatedly reworked the methods and programs it uses to process data. By reprocessing the entire dataset with the newest software, the collaboration has now created a standardized dataset across the experiment's eight years of data collection.
"This was a huge effort undertaken by many people," said BaBar Computing Coordinator Homer Neal. "It takes a lot of work to do something like this, but it's worthwhile to create such a uniform and deep dataset."
The reprocessing project began in 2007, when the collaboration decided to invest the time and effort to produce the best software possible for the final phase of data-taking. "And from there, the argument was easy for reprocessing everything with that same software," said Emeritus BaBar Computing Coordinator Gregory Dubois-Felsmann.
The first step was to write the new reconstruction software—no easy task. Taking the signals from the detector and working backward to figure out what actually happened is an extremely complex process. When you make improvements in one area of the software, Dubois-Felsmann said, there is always the chance that you have worsened some other aspect accidentally. Nonetheless, through multiple iterations and by checking the software against large amounts of data, researchers validated the new software last spring.
"We were slowed down a bit by the bad budget news and the decision to take the last few months of data at a lower energy," said Dubois-Felsmann. "We needed to rewrite the software for this lower energy as well, so we didn't finish until about a month later than originally planned."
Even with this late start, the collaboration finished reconstructing BaBar's eight years of data with the new software ahead of schedule. The success, Neal and Dubois-Felsmann agreed, is a result of hard work and the ability to expand computing resources both at SLAC and at the Padova computing center in Italy, where much of the reprocessing took place. "Both SLAC and Padova were wonderfully supportive and made this happen," said Neal.
In addition to improving the event reconstruction, researchers also made improvements to two other areas of the production process: simulation and data skimming.
Although it seems slightly counterintuitive at first, simulated data are integral to the analysis of real data. That's because the only way to understand the output of BaBar's detectors is to simulate the many different types of collisions that could occur—and what those collisions would look like when recorded by the layers of detectors—and then compare the real data to the simulations. "Essentially, you see how theoretical, fundamental physics interacts with your detector," said Neal. "Doing all of these simulations was the biggest challenge to the reconstruction effort."
About 20 computing sites around the world contributed to this effort. Thanks to these sites (including SLAC, where simulation production nearly doubled in 2008), the collaboration simulated about 7.5 billion events. While almost all sites achieved record production levels, two of them, the computing center at IN2P3 in France and the Rutherford Appleton Laboratory in the United Kingdom, each produced as much as 20 percent of the total at times. "It was really a great effort from the collaboration's computing centers," said Dubois-Felsmann. "Not only did we expand infrastructure to make this happen, but we also optimized the process, accumulating a lot of one-percent improvements. In this way, we increased the speed by 20 to 30 percent."
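As a rough illustration of how many small gains it takes to reach a figure like that, the short Python sketch below counts the one-percent improvements needed for a given overall speedup. It assumes, purely for illustration, that every improvement is exactly one percent and that the gains compound multiplicatively; the article does not say how the collaboration's individual optimizations actually combined, and the function names are hypothetical.

```python
import math


def steps_needed(target_gain: float, step_gain: float = 0.01) -> int:
    """Smallest number of compounding `step_gain` improvements that
    reaches an overall fractional gain of at least `target_gain`.

    Assumption (not from the article): gains multiply, so n steps
    give an overall factor of (1 + step_gain) ** n.
    """
    return math.ceil(math.log(1 + target_gain) / math.log(1 + step_gain))


def compounded_gain(n_steps: int, step_gain: float = 0.01) -> float:
    """Overall fractional gain after `n_steps` compounding improvements."""
    return (1 + step_gain) ** n_steps - 1


if __name__ == "__main__":
    for target in (0.20, 0.30):
        n = steps_needed(target)
        print(f"{target:.0%} overall needs {n} one-percent steps "
              f"(actual gain {compounded_gain(n):.1%})")
```

Under these assumptions, somewhere between 19 and 27 one-percent improvements would add up to a 20-to-30-percent speedup.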
The last step in processing is the separation of data into different streams based on the event types apparent in each collision. This process, called skimming, was performed at four computing centers: SLAC, GridKa in Germany, and the Rutherford Appleton Laboratory (RAL) and the University of Manchester in the U.K. By improving the efficiency of the data skimming system, researchers sped up this time-intensive process.
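The idea behind skimming can be sketched in a few lines of Python: each stream is defined by a selection rule, and every reconstructed event is copied into each stream whose rule it passes. This is a toy sketch only. The stream names, event fields, and thresholds are invented for illustration and are not BaBar's actual skim definitions or software.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List

Event = Dict[str, float]
Selector = Callable[[Event], bool]

# Hypothetical stream definitions; names and cuts are illustrative only.
SKIMS: Dict[str, Selector] = {
    "multihadron": lambda ev: ev["n_tracks"] >= 3,
    "two_prong": lambda ev: ev["n_tracks"] == 2,
    "high_energy": lambda ev: ev["total_energy_gev"] > 8.0,
}


def skim(events: Iterable[Event],
         skims: Dict[str, Selector] = SKIMS) -> Dict[str, List[Event]]:
    """Route each event into every stream whose selector accepts it.

    An event may land in several streams, or in none at all.
    """
    streams: Dict[str, List[Event]] = defaultdict(list)
    for ev in events:
        for name, selector in skims.items():
            if selector(ev):
                streams[name].append(ev)
    return dict(streams)


if __name__ == "__main__":
    sample = [
        {"n_tracks": 5, "total_energy_gev": 9.5},
        {"n_tracks": 2, "total_energy_gev": 3.0},
        {"n_tracks": 1, "total_energy_gev": 1.0},
    ]
    for name, selected in skim(sample).items():
        print(name, len(selected))
```

Because the selection rules are independent of one another, an analyst interested in one event type only needs to read the matching stream rather than the full dataset.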
In all, the reprocessing was so successful that not only is the BaBar data set more accurate, but more of the data are being used than ever before. "Our data set has actually been growing since we turned the detector off," said Neal.
This is possible because in the past, any possibly distorted data were immediately discarded from analysis. "For example, for each hour worth of data we took, we would make a 'rough and ready' decision on the data to decide if it was good enough for analysis," said Dubois-Felsmann. "This time around, we went back and looked at the excluded data to decide if the original decision was too conservative." Researchers also loosened the filters that determined whether specific events were interesting and thus worth looking at in the future, and revived data that had been excluded because they seemed too anomalous at the time.
"Essentially, we were able to fix some problems that we were not able to fix in the past," said Dubois-Felsmann. "As a result, our dataset is about five percent larger and better than ever before."
Provided by SLAC National Accelerator Laboratory