AMD Stream Processor First to Break 1 Teraflop Barrier

June 16, 2008

At the International Supercomputing Conference, AMD today introduced its next-generation stream processor, the AMD FireStream 9250, specifically designed to accelerate critical algorithms in high-performance computing (HPC), mainstream and consumer applications.

Leveraging the GPU design expertise of AMD’s Graphics Product Group, the AMD FireStream 9250 breaks the one teraflop barrier for single-precision performance. It occupies a single PCI slot for unmatched density, and with power consumption of less than 150 watts it delivers unprecedented performance-per-watt efficiency of up to eight gigaflops per watt.
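The efficiency figure can be checked against the other numbers quoted above. The short sketch below is illustrative only: it assumes a peak of exactly 1,000 gigaflops (the announcement says only that the card breaks one teraflop), and the function and constant names are hypothetical. Under that assumption, the 150-watt ceiling corresponds to about 6.7 gigaflops per watt, so the "up to eight" figure would imply a power draw of roughly 125 watts at peak throughput.

```python
# Figures quoted in the announcement. The exact peak is an assumption:
# the text says only that the card "breaks the one teraflop barrier".
PEAK_SINGLE_PRECISION_GFLOPS = 1000.0  # assumed: exactly one teraflop
MAX_POWER_WATTS = 150.0                # "less than 150 watts"
CLAIMED_GFLOPS_PER_WATT = 8.0          # "up to eight gigaflops per watt"


def gflops_per_watt(gflops: float, watts: float) -> float:
    """Return performance-per-watt efficiency in gigaflops per watt."""
    return gflops / watts


def implied_power_watts(gflops: float, efficiency: float) -> float:
    """Return the power draw implied by a throughput and an efficiency."""
    return gflops / efficiency


# Efficiency if the card drew the full 150 W at peak throughput.
floor = gflops_per_watt(PEAK_SINGLE_PRECISION_GFLOPS, MAX_POWER_WATTS)

# Power draw that would be needed to reach the quoted 8 GFLOPS/W.
implied = implied_power_watts(PEAK_SINGLE_PRECISION_GFLOPS, CLAIMED_GFLOPS_PER_WATT)

print(f"Efficiency at the 150 W ceiling: {floor:.2f} GFLOPS/W")
print(f"Power implied by 8 GFLOPS/W:     {implied:.0f} W")
```

A higher peak or a lower typical power draw would both raise the result, which is consistent with the "up to" wording.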

Customers can leverage AMD’s latest FireStream offering to run critical workloads such as financial analysis or seismic processing dramatically faster than with the CPU alone, helping them to address more complex problems and achieve faster results. For example, developers are reporting up to a 55x performance increase on financial analysis codes compared with processing on the CPU alone, which supports their efforts to make better and faster decisions. Additionally, the use of flexible GPU technology rather than custom accelerators helps those creating application-specific systems to enhance and maintain their solutions easily.

The AMD FireStream 9250 stream processor includes a second-generation double-precision floating point hardware implementation delivering more than 200 gigaflops, building on the capabilities of the earlier AMD FireStream 9170, the industry’s first GP-GPU with double-precision floating point support. The AMD FireStream 9250’s compact size makes it ideal for small 1U servers as well as most desktop systems, workstations, and larger servers. It also features 1 GB of GDDR3 memory, enabling developers to handle large, complex problems.

AMD supports development for the FireStream family of processors with its AMD Stream SDK, designed to help developers create accelerated applications for AMD FireStream, ATI FireGL and ATI Radeon GPUs. AMD takes an open-systems approach to its stream computing development environment to ensure that developers can access and build on the tools at any level. AMD offers published interfaces for its high-level language API, intermediate language, and instruction set architecture, and the AMD Stream SDK’s Brook+ front-end is available as open source code.

In keeping with its open systems philosophy, AMD has also joined the Khronos Compute Working Group. This working group’s goals include developing industry standards for data parallel programming and working with proposed specifications like OpenCL. The OpenCL specification can help provide developers with an easy path to development across multiple platforms.

“An open industry standard programming specification will help drive broad-based support for stream computing technology in mainstream applications,” said Rick Bergman, senior vice president and general manager, Graphics Product Group, AMD. “We believe that OpenCL is a step in the right direction and we fully support this effort. AMD intends to ensure that the AMD Stream SDK rapidly evolves to comply with open industry standards as they emerge.”

The growth of the stream computing market has accelerated over the past few years with Fortune 1000 companies, leading software developers and academic institutions utilizing stream technology to achieve tremendous performance gains across a variety of applications.

“Stream computing is increasingly important for mainstream and consumer applications and is no longer limited to just the academic or engineering industries. Today we are truly seeing a fundamental shift in emerging system architectures,” said Jon Peddie, president, Jon Peddie Research. “As the industry’s only provider of both high-performance discrete GPUs and x86-compatible CPUs, AMD is uniquely well-suited to developing these architectures.”

AMD customers, including ACCIT, Centre de Physique de Particules de Marseille, Neurala and Telanetix, are using the AMD Stream SDK and current AMD FireStream, ATI FireGL or ATI Radeon boards to achieve dramatic performance gains on critical algorithms in HPC, workstation and consumer applications. Neurala currently reports speedups of 10x to 200x over the CPU alone on biologically inspired neural models, which are applicable to finance, image processing and other applications.

AMD is also working closely with world-class application and solution providers to ensure customers can achieve optimum performance results. Stream computing application and solution providers include CAPS entreprise, Mercury Computer Systems, RapidMind, RogueWave and VizExperts. Mercury Computer Systems provides high-performance computing systems and software designed for complex image, sensor, and signal processing applications. Its algorithm team reports that it has achieved 174 GFLOPS for large 1D complex single-precision floating point FFTs on the AMD FireStream 9250.

AMD plans to deliver the FireStream 9250 and the supporting SDK in Q3 2008 at an MSRP of $999 USD. AMD FireStream 9170, the industry’s first double-precision floating point stream processor, is currently available for purchase and is competitively priced at $1,999 USD.

Source: AMD



  • Graeme - Jun 17, 2008
    • Rank: 4 / 5 (3)
    Nvidia Tesla-10 Series processor has beaten AMD to the Teraflop speed. It has 240 cores.
  • Egnite - Jun 17, 2008
    • Rank: 3.5 / 5 (2)
    "the industry's first double-precision floating point stream processor, is currently available for purchase and is competitively priced at $1,999 USD."

    Would this be the Nvidia Tesla-10 then? At half the price, I know which I'll be buying for the next upgrade :-)
  • ryuuguu - Jun 17, 2008
    • Rank: 4.3 / 5 (3)
    At 700 W, the Tesla-10 produces a lot of heat. Also, it does not just plug into a PCI slot.
  • fleem - Jun 17, 2008
    • Rank: 4.7 / 5 (3)
    Stream computing has some advantages and some disadvantages of both pure parallel (multi-core with totally separate, but synchronous threads) processing and pure serial (single core--one thread) processing. So when comparing hardware platforms it's a little hard to compare apples to apples. It depends on the number of pipelines and cores, pipeline lag, and application requirements. Granted, most apps that need tons of CPU can be distributed.
  • donjoe0 - Jun 20, 2008
    • Rank: 4.5 / 5 (2)
    "It occupies a single PCI slot, for unmatched density"

    Funny you should use a picture of a dual-slot card to illustrate that. :rolleyes:
