More Power to Google

April 7, 2007

Google is seeking the optimal energy efficiency for its large data centers, and it is counting on its top engineers to help deliver it.

Luiz Barroso, a distinguished engineer at Google, discussed the company's projects to reach optimal energy efficiency in a talk entitled "Watts, faults and other fascinating 'dirty' words computer architects can no longer afford to ignore," at the company's complex here on April 5.

Barroso, a former Digital Equipment engineer with a history of delivering load-balancing software for large-scale systems and of working on the design of the core Google infrastructure, summarized two projects he has been working on.

One, a power provisioning study, will be formally released in a paper this summer, Barroso said.

Two main points arose from the power provisioning study, he said: "Maximizing usage of available power capacity is key," and "systems are typically very power-inefficient on nonpeak conditions."

Moreover, Barroso said, "Power/energy efficiency and fault-tolerance are central to the design of large-scale computing systems today. And technology trends are likely to make them even more relevant in the future, increasingly affecting smaller-scale systems."

Barroso acknowledged that Google is building data centers where there is hydroelectric power and "engineers are squeezing every little watt out of every card."

Indeed, while circuit designers have to worry about temperature and other issues, "we worry about the affordability of building data centers," Barroso said.

He noted that it costs between $10 and $22 per watt to build a data center, while the U.S. average energy cost is only 80 cents per watt per year. So "it costs more to build a data center than to power it for 10 years," Barroso said.

"You want to get as close as possible to optimal usage," because unused watts cost money, he said.
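The comparison between build cost and energy cost can be checked with a rough calculation. The sketch below is illustrative only: it assumes the 80-cent energy figure is an annual cost per watt (the time unit is an assumption), and the function and constant names are hypothetical.

```python
# Back-of-the-envelope comparison of data center build cost and energy cost.
# ASSUMPTION: the 80-cent energy figure is an annual cost per watt.
# All names here are illustrative and do not come from the talk.

BUILD_COST_LOW = 10.0    # dollars per watt, low end quoted in the talk
BUILD_COST_HIGH = 22.0   # dollars per watt, high end quoted in the talk
ENERGY_COST_PER_WATT_YEAR = 0.80  # dollars per watt per year (assumed unit)


def energy_cost(years, rate=ENERGY_COST_PER_WATT_YEAR):
    """Energy cost in dollars of drawing one watt continuously for `years`."""
    return years * rate


def break_even_years(build_cost_per_watt, rate=ENERGY_COST_PER_WATT_YEAR):
    """Years of energy spending needed to match the build cost of one watt."""
    return build_cost_per_watt / rate


if __name__ == "__main__":
    print(f"10 years of energy per watt: ${energy_cost(10):.2f}")
    print(f"Break-even, low build cost:  {break_even_years(BUILD_COST_LOW):.1f} years")
    print(f"Break-even, high build cost: {break_even_years(BUILD_COST_HIGH):.1f} years")
```

Under that assumption, ten years of energy comes to about $8 per watt, which is below even the low-end $10 build cost and is consistent with the quoted claim.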

So for the power provisioning study, Google looked at how much energy its machines were using over six months.

The example for the study covered only 800 machines of the thousands Google employs, and one of the findings was that "you spend 60 percent of your time at or below your peak, and racks of machines are never at peak at the same time."

Moreover, "the data center as a whole is never going above 70 percent of capacity, and that shows we could have deployed 40 percent more machines."
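The arithmetic behind these two findings can be sketched briefly. If a facility never exceeds 70 percent of its provisioned capacity, the unused headroom is 1/0.7 - 1, or roughly 43 percent, in line with the rounded "40 percent more machines." The rack traces below are invented for illustration and are not Google's data; the function names are hypothetical.

```python
# Sketch of the power-provisioning arithmetic.
# ASSUMPTION: the rack traces are made-up sample data, in kilowatts.

def extra_machine_fraction(peak_utilization):
    """Fraction of additional machines that fit if the facility peaks at
    `peak_utilization` (0 < value <= 1) of its provisioned capacity."""
    if not 0 < peak_utilization <= 1:
        raise ValueError("peak_utilization must be in (0, 1]")
    return 1.0 / peak_utilization - 1.0


def sum_of_peaks(rack_traces):
    """Capacity needed if every rack were provisioned for its own peak."""
    return sum(max(trace) for trace in rack_traces)


def peak_of_sum(rack_traces):
    """Highest combined draw actually observed across all racks."""
    return max(sum(samples) for samples in zip(*rack_traces))


# Two hypothetical racks sampled at three points in time; they peak at
# different moments, so the combined peak is below the sum of the peaks.
RACK_TRACES_KW = [
    [5, 9, 6],
    [9, 5, 6],
]

if __name__ == "__main__":
    print(f"Extra machines at 70% peak: {extra_machine_fraction(0.7):.0%}")
    print(f"Sum of per-rack peaks: {sum_of_peaks(RACK_TRACES_KW)} kW")
    print(f"Peak of combined draw: {peak_of_sum(RACK_TRACES_KW)} kW")
```

Because the racks do not peak together, provisioning for the combined peak rather than the sum of individual peaks leaves room for more machines under the same power budget.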

Barroso highlighted two hot areas of computer design made famous in the '90s that have proven to be flawed. One is the acceleration of single-thread performance, which he referred to as the megahertz race. The other is the building of big, distributed shared memory systems, which he called the DSM race.

The theory behind the DSM race was that large-scale computing systems should use a shared-memory programming model because it was familiar to programmers and facilitates sharing of expensive resources, among other things. But the undoing of the DSM race was fault containment, Barroso said.

"A single fault can bring down the entire shared memory domain," Barroso said. "It's a very hard problem to solve … and most of the solutions are inadequate."

Meanwhile, the megahertz race made even unmodified software faster by itself through computer architectural tricks. But "the megahertz race crashes into the power wall," Barroso said.

He said that every year enterprises can buy faster servers for about the same price, "but much more energy is being used so systems become power-inefficient."

Joked Barroso: "When you get to the point where power costs more than servers, you'll have a situation like the cell phone industry model where utility companies might say, 'I'll give you these servers for free if you sign this energy contract.'"

Barroso also mentioned H.R. 5646, a congressional bill signed into law last year to promote the use of energy-efficient computer servers in the United States.

"There are a lot of things you can do to reduce energy conversion losses, like go to single-voltage rail power supply units [PSUs]," Barroso said. "You can get up to a four times reduction in conversion losses."

Moreover, Barroso said Google is "working with [its] partners to create open standards for higher-efficiency PSUs." He later said the list of partners includes Intel and AMD.

Meanwhile, new technologies such as multicore processors and increasing parallelism offer promise. "But there's a catch," Barroso said. "Are there enough threads? Can we expect programmers to build efficient/concurrent programs?"

Indeed, with more data it is easier to do parallelism. "At Google we're interested in problems where there's a truckload of data, so it might be a little easier for us," Barroso said.

Fault-tolerant software is powerful, but it is not enough, Barroso said. Large-scale systems also need additional monitoring.

Google employs what it calls its System Health Infrastructure, which talks to every server in the system frequently and collects health signals and activity information, Barroso said.

Asked if Google might consider open-sourcing this technology, Barroso said "We've been looking at open-sourcing some of the code for some time." However, "some of this is infrastructure and we build it so intertwined with other software we have that it's hard to pull things apart."

In addition, Google uses self-monitoring, analysis and reporting technology, or SMART, to do early detection of problems. And it found that disk drives with scan errors are 10 times more likely to fail than those with no errors, Barroso said.

However, the company found that more than half of the drives that failed gave no clear warning: 56 percent showed no strong signals at all, he said.
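The 56 percent figure puts a ceiling on what signal-based prediction can achieve: a predictor that relies only on strong SMART signals could flag at most about 44 percent of failures. A minimal sketch of that bound follows; the function names and the 1,000-drive population are hypothetical.

```python
# Upper bound on failures a signal-only predictor could catch.
# ASSUMPTION: the 1,000 failed drives are an illustrative population,
# not a number from the study.

NO_SIGNAL_FRACTION = 0.56  # failed drives with no strong signal, per the talk


def max_detectable_fraction(no_signal_fraction=NO_SIGNAL_FRACTION):
    """Largest share of failures that could be flagged from strong signals."""
    return 1.0 - no_signal_fraction


def expected_missed_failures(failed_drives, no_signal_fraction=NO_SIGNAL_FRACTION):
    """Failures that would arrive with no strong warning signal."""
    return failed_drives * no_signal_fraction


if __name__ == "__main__":
    print(f"Best-case detection rate: {max_detectable_fraction():.0%}")
    print(f"Unwarned failures per 1,000: {expected_missed_failures(1000):.0f}")
```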

"It's fairly easy to predict something if you give a long enough time frame," Barroso said. "I predict we're all going to die," he quipped.

In addition, Barroso said the Google study found that temperature was not a significant factor in disk failures; slightly warmer temperatures did not cause any more failures than cooler ones.

"If the variability of temperature is not that great then data center designers have a lot more flexibility" in designing more energy-efficient facilities, Barroso said.

Copyright 2007 by Ziff Davis Media, Distributed by United Press International
