BaBar Collaboration Completes Data Reprocessing

December 18, 2008 By Kelen Tuttle

(PhysOrg.com) -- One might think that processing the records of 22 billion electron and positron collisions once would be enough. But not so for the BaBar collaboration, which this week announced the completion of reprocessing for 99.99 percent of its huge coffers of Upsilon(4S) raw data.

Processing is one of the very first steps in data analysis, and involves putting raw data into a more useful form. This requires taking the signals recorded by BaBar's many layers of detectors and reconstructing which types of particles left them, in which directions they were traveling, and at what speeds. These reconstructed data are then compared to simulated data to identify particularly interesting events, and divided into many different streams from which researchers can pluck event types of interest.

Over the years, the collaboration has repeatedly reworked the methods and programs it uses to process data. By reprocessing the entire dataset with the newest software, the collaboration has now created a standardized dataset across the experiment's eight years of data collection.

"This was a huge effort undertaken by many people," said BaBar Computing Coordinator Homer Neal. "It takes a lot of work to do something like this, but it's worthwhile to create such a uniform and deep dataset."

The reprocessing project began in 2007, when the collaboration decided to invest the time and effort to produce the best software possible for the final phase of data-taking. "And from there, the argument was easy for reprocessing everything with that same software," said Emeritus BaBar Computing Coordinator Gregory Dubois-Felsmann.

The first step was to write the new reconstruction software—no easy task. Taking the signals from the detector and working backward to figure out what actually happened is an extremely complex process. When you make improvements in one area of the software, Dubois-Felsmann said, there is always the chance that you have worsened some other aspect accidentally. Nonetheless, through multiple iterations and by checking the software against large amounts of data, researchers validated the new software last spring.

"We were slowed down a bit by the bad budget news and the decision to take the last few months of data at a lower energy," said Dubois-Felsmann. "We needed to rewrite the software for this lower energy as well, so we didn't finish until about a month later than originally planned."

Even with this late start, the collaboration finished reconstructing BaBar's eight years of data with the new software ahead of schedule. The success, Neal and Dubois-Felsmann agreed, is a result of hard work and the ability to expand computing resources both at SLAC and at the Padova computing center in Italy, where much of the reprocessing took place. "Both SLAC and Padova were wonderfully supportive and made this happen," said Neal.

In addition to improving the event reconstruction, researchers improved two other areas of the production process: simulation and data skimming.

Although it seems slightly counterintuitive at first, simulated data are integral to the analysis of real data. That's because the only way to understand the output of BaBar's detectors is to simulate the many different types of collisions that could occur—and what those collisions would look like when recorded by the layers of detectors—and then compare the real data to the simulations. "Essentially, you see how theoretical, fundamental physics interacts with your detector," said Neal. "Doing all of these simulations was the biggest challenge to the reconstruction effort."

About 20 computing sites around the world contributed to this effort. Thanks to these sites (including SLAC, where simulation production nearly doubled in 2008), the collaboration simulated about 7.5 billion events. While almost all sites achieved record production levels, two of them, the computing center at IN2P3 in France and the Rutherford Appleton Laboratory (RAL) in the United Kingdom, were each at times responsible for as much as 20 percent of total production. "It was really a great effort from the collaboration's computing centers," said Dubois-Felsmann. "Not only did we expand infrastructure to make this happen, but we also optimized the process, accumulating a lot of one-percent improvements. In this way, we increased the speed by 20 to 30 percent."
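As a rough illustration of how many small gains it takes to reach the quoted 20 to 30 percent, the sketch below assumes that each improvement is an independent one-percent speedup and that the gains compound multiplicatively. That model, and the function name `improvements_needed`, are assumptions made for illustration; the article does not say how the individual improvements combined.

```python
import math

def improvements_needed(target_speedup, step=0.01):
    """Smallest number of compounding gains of size `step` that reach
    `target_speedup` (both given as fractions, e.g. 0.20 for 20 percent).

    Assumption: each gain multiplies the processing speed by (1 + step).
    """
    return math.ceil(math.log(1 + target_speedup) / math.log(1 + step))

for target in (0.20, 0.30):
    n = improvements_needed(target)
    print(f"{target:.0%} faster takes about {n} one-percent improvements")
```

Under this assumption, roughly 19 to 27 separate one-percent gains would account for the overall speedup, which fits the description of "a lot" of small improvements.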

The last step in processing is the separation of data into different streams based on the event types apparent in each collision. This process, called skimming, was performed at four computing centers: SLAC, GridKa in Germany, and RAL and the University of Manchester in the U.K. By improving the efficiency of the data skimming system, researchers sped up this time-intensive process.
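The idea behind skimming can be sketched in a few lines. This is a toy illustration, not BaBar's software: the event fields (`n_leptons`, `n_tracks`), the stream names, and the selection thresholds are all hypothetical, chosen only to show events being sorted into streams by type.

```python
from collections import defaultdict

# Hypothetical stream definitions: each stream name maps to a predicate
# over a reconstructed event, represented here as a plain dict.
STREAMS = {
    "multi_lepton": lambda ev: ev["n_leptons"] >= 2,
    "high_multiplicity": lambda ev: ev["n_tracks"] >= 8,
}

def skim(events, streams=STREAMS):
    """Assign each event to every stream whose predicate it satisfies.

    An event may land in several streams, or in none at all.
    """
    out = defaultdict(list)
    for ev in events:
        for name, selects in streams.items():
            if selects(ev):
                out[name].append(ev["id"])
    return dict(out)

events = [
    {"id": 1, "n_leptons": 2, "n_tracks": 4},
    {"id": 2, "n_leptons": 0, "n_tracks": 9},
    {"id": 3, "n_leptons": 3, "n_tracks": 10},
    {"id": 4, "n_leptons": 0, "n_tracks": 3},
]
print(skim(events))
```

In this sketch, event 3 appears in both streams and event 4 in neither. Loosening a predicate's threshold would admit more events into a stream without touching the underlying data.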

In all, the reprocessing was so successful that not only is the BaBar dataset more accurate, but more of the data are being used than ever before. "Our data set has actually been growing since we turned the detector off," said Neal.

This is possible because in the past, any possibly distorted data were immediately discarded from analysis. "For example, for each hour worth of data we took, we would make a 'rough and ready' decision on the data to decide if it was good enough for analysis," said Dubois-Felsmann. "This time around, we went back and looked at the excluded data to decide if the original decision was too conservative." Researchers also loosened the filters that determined whether specific events were interesting and thus worth looking at in the future, and revived data that had been excluded because they seemed too anomalous at the time.

"Essentially, we were able to fix some problems that we were not able to fix in the past," said Dubois-Felsmann. "As a result, our dataset is about five percent larger and better than ever before."

Provided by SLAC National Accelerator Laboratory

