Conquering the chaos in modern, multiprocessor computers

March 10, 2010 by Hannah Hickey

(PhysOrg.com) -- Computers should not play dice. That, to paraphrase Einstein, is the feeling of a University of Washington computer scientist with a simple manifesto: If you enter the same computer command, you should get back the same result. Unfortunately, that is far from the case with many of today's machines. Beneath their smooth exteriors, modern computers behave in wildly unpredictable ways, said Luis Ceze, a UW assistant professor of computer science and engineering.

"With older, single-processor systems, computers behave exactly the same way as long as you give the same commands. Today's computers are non-deterministic. Even if you give the same set of commands, you might get a different result," Ceze said.

He and UW associate professors of computer science and engineering Mark Oskin and Dan Grossman and UW graduate students Owen Anderson, Tom Bergan, Joseph Devietti, Brandon Lucia and Nick Hunt have developed a way to get modern, multiple-processor computers to behave in predictable ways, by automatically parceling sets of commands and assigning them to specific places. Sets of commands get calculated simultaneously, so the well-behaved program still runs faster than it would on a single processor.
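
The article does not describe the UW system's internals, so the following Python sketch is only an illustration of the general idea, not the researchers' method. It assumes that each parcel of work writes to its own private buffer and that the buffers are merged in a fixed order, so timing differences between threads cannot change the result. The names `worker` and `run_deterministic` are hypothetical.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


def worker(task_id):
    """Do some work in parallel and return output in a private buffer."""
    # Random delay stands in for timing jitter between processors.
    time.sleep(random.uniform(0, 0.01))
    return [f"task {task_id}: step {i}" for i in range(2)]


def run_deterministic(num_tasks, num_threads=4):
    """Run tasks concurrently, then merge buffers in task-id order."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Executor.map yields results in submission order,
        # regardless of which task finishes first.
        buffers = list(pool.map(worker, range(num_tasks)))
    log = []
    for buf in buffers:  # fixed commit order: task 0, 1, 2, ...
        log.extend(buf)
    return log


if __name__ == "__main__":
    first = run_deterministic(8)
    second = run_deterministic(8)
    print(first == second)  # prints True
```

The tasks still overlap in time, so the work is parallel, but the merged log is the same on every run because the commit order never depends on which thread finished first.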

Next week at the International Conference on Architectural Support for Programming Languages and Operating Systems in Pittsburgh, Bergan will present a software-based version of this system that could be used on existing machines. It builds on a more general approach the group published last year, which was recently chosen as a top paper for 2009 by the Institute of Electrical and Electronics Engineers' journal Micro.

In the old days one computer had one processor. But today's consumer standard is dual-core processors, and even quad-core machines are appearing on store shelves. Supercomputers and servers can house hundreds, even thousands, of processing units.

On the plus side, this design creates computers that run faster, cost less and use less power for the same performance delivered on a single processor. On the other hand, multiple processors are responsible for elusive errors that freeze Web browsers and crash programs.

It is not so different from the classic chaos problem in which a butterfly flaps its wings in one place and can cause a hurricane across the globe. Modern shared-memory computers have to shuffle tasks from one place to another. The speed at which the information travels can be affected by tiny changes, such as the distance between parts in the computer or even the temperature of the wires. Information can thus arrive in a different order and lead to unexpected errors, even for tasks that ran smoothly hundreds of times before.
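
The effect of arrival order can be shown with a small simulation. The Python sketch below is an illustration under simplifying assumptions, not code from the UW group: two simulated threads, `A` and `B`, each add 1 to a shared counter in two separate steps (read, then write), and the hypothetical helper `all_outcomes` tries every possible ordering of those four steps.

```python
from itertools import permutations


def run_schedule(schedule):
    """Run two simulated threads that each do counter += 1.

    Each thread performs two steps: read the shared counter into a
    private register, then write register + 1 back. `schedule` is a
    sequence of thread ids giving the order in which steps execute.
    """
    counter = 0
    register = {}  # per-thread private copy of the counter
    for tid in schedule:
        if tid not in register:
            register[tid] = counter      # step 1: read
        else:
            counter = register[tid] + 1  # step 2: write
    return counter


def all_outcomes():
    """Map each distinct interleaving of the four steps to its result."""
    schedules = set(permutations(["A", "A", "B", "B"]))
    return {s: run_schedule(s) for s in sorted(schedules)}


if __name__ == "__main__":
    for schedule, result in all_outcomes().items():
        print("".join(schedule), "->", result)
```

Of the six possible orderings, only `AABB` and `BBAA` leave the counter at 2. In the other four, both threads read 0 before either writes, so one update is lost and the counter ends at 1.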

"With multi-core systems the trend is to have more bugs because it's harder to write code for them," Ceze said. "And these concurrency bugs are much harder to get a handle on."

One application of the UW system is to make errors reproducible, so that programs can be properly tested.
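
What reproducibility buys a tester can be sketched in a few lines of Python. This is a toy simulation, not the UW system or any real tool: a seeded random scheduler (an assumption made for illustration) decides which of two simulated threads takes the next step, so the same seed always replays the same interleaving and the same bug. The name `run_with_seed` is hypothetical.

```python
import random


def run_with_seed(seed, increments_per_thread=3):
    """Simulate two threads doing unsynchronized counter += 1.

    A seeded random scheduler decides which thread takes the next step,
    so the seed fully determines the interleaving and the result.
    """
    rng = random.Random(seed)
    counter = 0
    # Each thread tracks remaining increments and a pending read value.
    state = {"A": {"left": increments_per_thread, "reg": None},
             "B": {"left": increments_per_thread, "reg": None}}
    trace = []
    while any(t["left"] for t in state.values()):
        runnable = sorted(tid for tid, t in state.items() if t["left"])
        tid = rng.choice(runnable)
        t = state[tid]
        if t["reg"] is None:
            t["reg"] = counter      # read step
        else:
            counter = t["reg"] + 1  # write step
            t["reg"] = None
            t["left"] -= 1
        trace.append(tid)
    return counter, "".join(trace)


if __name__ == "__main__":
    # The same seed replays the same schedule and the same final value.
    print(run_with_seed(7) == run_with_seed(7))  # prints True
```

A correct run would end with the counter at 6. Once a seed that produces a smaller value is found, rerunning with that seed reproduces the failure exactly, which is what makes it possible to test a fix.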

"We've developed a basic technique that could be used in a range of systems, from cell phones to data centers," Ceze said. "Ultimately, I want to make it really easy for people to design high-performing, low-energy and secure systems."

Last year Ceze, Oskin, and Peter Godman, a former director at Isilon Systems, founded a company to commercialize their technology. The company is initially named Petra, after the Greek word for rock, because it hopes to develop "rock-solid systems," Ceze said. The Seattle-based startup will soon release its first product, Jinx, which makes any errors that are going to crop up in a program happen quickly.

"We can compress the effect of thousands of people using a program into a few minutes during the software's development," Ceze said. "We want to allow people to write code for multi-core systems without going insane."

The company already has some big-name clients trying its product, Ceze said, though it is not yet disclosing their identities.

"If this erratic behavior irritates us, as software users, imagine how it is for banks or other mission-critical applications."

More information: http://www.ece.cmu.edu/CALCM/asplos10/doku.php

Provided by University of Washington


XopherMV
Mar 10, 2010

Rank: 3 / 5 (2)
Ok, great. This article tells me nothing. How about at least a high-level description of the techniques?
JayK
Mar 10, 2010

Rank: 1 / 5 (2)
I agree, this article is fairly useless, doesn't begin to describe the methodologies or even configuration data that was used to generate the conclusion.

There are also a lot of accusations in this article that are unbacked by proof, such as "web browsers locking up on multiprocessor machines." Poorly designed code will lock up on any sort of architecture, and code that is poorly written can easily lock up due to race conditions that work on older systems.
jj2009
Mar 10, 2010

Rank: not rated yet
a rock solid system. great, i'll believe it when i see it! will it run microsoft windows?
eachus
Mar 11, 2010

Rank: not rated yet
Lol! Two things going on here. One is that they have a product which, in effect, rattles the cage to get events to happen in different (but still legal under the API) orders.

The other is a lesson which Software Engineers discovered a long time ago. You can't test in quality, the most you can do is remove some of the bugs. And then you run into the problem that fixing those bugs can introduce more bugs, until the project is eventually dropped without ever shipping a product.

For complex reasons, Ada was designed to fix these problems by making it easy to see that the program was correct, even if that made it harder to write. In 1983 I taught one of the earliest programming courses to existing programmers, and we got a rude shock. Around 30% of the existing, professional programmers in the course could not write a correct Ada program. (Not Hello World, but on the order of 50 lines of code.) What went wrong?
eachus
Mar 11, 2010

Rank: not rated yet
We found out one of the problems when we took a scheduling algorithm (written in Fortran) from an operating system and re-implemented it in Ada to see if it was slower or faster. We found 13 errors in 110 SLOC (ignoring block comments) in the Fortran. Once we fixed them, and compared the generated code, the resulting machine code was almost identical.

For some reason management was more interested in the errors in the existing code that had been shipping for over a year, than in how fast either version ran. Anyway we learned that the problem with programming in Ada was that it expected you to get all these exceptional and edge cases right the first time.

Tools were developed for writing (hard real-time) tasking code in Ada that work and result in code that can be proven correct even when distributed across multiple homogeneous or heterogeneous processors.

The problem is that it takes at least a year or more to become productive. :-( The good news is no debugging.
komone
Mar 11, 2010

Rank: not rated yet
Partly in response to Xopher... the issue is that imperative languages like c, java, ada, etc share memory and thus cause side-effects. Software transactional memory (STM) is certainly one response to the problem, but actor based systems and lambda calculus offer a better approach to resolving these issues at the heart of imperative programming. Functional languages e.g. erlang, haskell et al. appear to be the most productive way forward when faced with the multicore situation.
abhishekbt
Mar 11, 2010

Rank: not rated yet
@jj2009 - You must be kidding right?
The two items in your post are antonyms.
Jo01
Mar 11, 2010

Rank: not rated yet
Logic is the ultimate (and only) tool to master programming on single, multi or ultra core hardware.

The complexity increases maybe, depending on the tools and paradigm used, but logic is always sufficient to resolve the problem.

When it isn't, it is safe to say it is a hardware problem, most of the time related to disk errors.
In 30 years' time I've never encountered a CPU related hardware error or memory related hardware error that resulted in unexplainable code behavior.
(I know that memory stress tests on common hardware without error correction reveal errors, but that's not what I am talking about. What I mean is that debugging an error manifested by using a program never led to faulty hardware, in my case.)

Also, avoiding (and resolving) memory leaks when programming in for example C is as hard as parallel programming. The combination, C and parallel, is of course the ultimate fun.

...
Jo01
Mar 11, 2010

Rank: not rated yet
I think it's completely wrong to state that modern computers are chaotic; unpredictable and chaotic aren't necessarily synonyms.

J
CSharpner
Mar 11, 2010

Rank: not rated yet
"and even quad-core machines are appearing on store shelves"

What?? They've been on store shelves for YEARS. I got mine over 2 years ago at CompUSA, which has been closed for over a year now.

This article gives zero information on what this magic algorithm or process is. From what little information is provided, it sounds like it's just something that does load balancing of threads, which all modern OS's already do, so I'm still curious what this is. This article is essentially a really long title to a story that was never written.

And what's this "unpredictable" so-called problem they speak of? Things are fairly predictable in my software development... even in my multi-threaded development, except of course, for things that HAVE to be unpredictable like threads that pause until external events and the like.
JayK
Mar 11, 2010

Rank: 2.3 / 5 (3)
CSharpner: I think they're selling something and this is just a pitch full of hyperbole and gross exaggerations.
raron
Mar 13, 2010

Rank: not rated yet
"Today's computers are non-deterministic."

L-O-L. I'd never thought I'd read THAT in a physics related website.

However, I must confess to thinking the same thoughts myself (especially concerning some popular OS out there), but that doesn't mean it is "non-deterministic". Only a very, very complex machine dependent on variables not easily controlled by mere users.

This is either a joke, or there is actually a very, very frustrated "computer scientist" at University of Washington. Which isn't all that unthinkable, really.
:-)
El_Nose
Mar 16, 2010

Rank: not rated yet
This still does not get rid of coding errors -- in fact no program can ensure that a piece of code does not have a programming error in it - outside of checking syntax

@raron

the nondeterministic part comes from the Processor - today's processors use branch prediction and this adds the nondeterministic part -- there was a very good explanation of this as a lecture - that i cannot find the link to ;-( -- anyway because of branch prediction in the processor you cannot exactly guarantee that the same input will lead to exactly the same output IF -- BIG IF -- you are using multiple cores - basically you have one cores.

1) branch prediction on a single process with the same input will have exactly the same effect with the same input

2) but what if you depend on data from another process and the code in that process on another core or thread is momentarily wrong -- and then you have a context switch -- you have stored data that has been updated and you continue forward
El_Nose
Mar 16, 2010

Rank: not rated yet
these are really programming issues that a careful programmer can overcome and the fact that there are multiple cores is really not the point -- anyone making a program with a lot of threads will run into the same issue even if on one core without the proper locking of information.
taka
Mar 19, 2010

Rank: 1 / 5 (1)
All programming nowadays is based on sequential execution of commands (including functional languages unfortunately). If there is parallelism of any kind involved then this sequence does not exist any more and the programmer is not capable of handling it as soon as it is anything other than trivial. If things become parallel then there are no ways to abstract or divide tasks any more. A proper parallel assembler must emerge before truly parallel high-level languages become possible and it should obviously be based on moving data around between commands-functions that just transform it (no memory to spoil, no next command to execute at the wrong time).