Computer scientists develop solutions for long-term storage of digital data

April 21, 2008

Although the digital age is well under way, one crucial detail remains to be worked out--how to store vast amounts of digital information in a way that allows future generations to recover it.

"The problem is how to build a large-scale data storage system to last 50 to 100 years," said Ethan Miller, associate professor of computer science in the Baskin School of Engineering at the University of California, Santa Cruz.

Tape libraries are widely used for data storage, but digital tape has many shortcomings as an archival medium. Miller's group has come up with a new approach, called Pergamum, which uses hard disk drives to provide energy-efficient, cost-effective storage. The declining cost of hard drives has made them more competitive with tape, and they offer numerous advantages for searching and retrieving data. "It's like the difference between a VCR and TiVo," Miller said.

Pergamum, named after the ancient Greek library that made the transition from fragile papyrus to more durable parchment, is a distributed network of intelligent, disk-based storage devices. The team that developed it includes UCSC graduate students Mark Storer and Kevin Greenan, along with researcher Kaladhar Voruganti of NetApp (formerly Network Appliance), a company that focuses on storage and data management solutions.

Archival storage is a big issue for businesses, partly due to legal requirements for the preservation of financial and business records, and also because data mining strategies can turn stored data into a valuable resource. Long-term storage is also a growing issue for individuals who are filling their personal computers with digital photos, movies, and documents.

"There is a risk that an entire generation's cultural history could be lost if people aren't able to retrieve that data," Storer said. "Everyone is switching to digital cameras, but we've never demonstrated that digital data can be reliably preserved for a long time."

Pergamum has attracted considerable attention from industry since Storer presented it at a leading conference in the field, the USENIX Conference on File and Storage Technologies (FAST '08), held in San Jose in February. Robin Harris, an industry consultant who writes an influential blog called StorageMojo, called the Pergamum paper his "favorite FAST '08 paper."

The researchers designed the system to provide reliable, energy-efficient data storage using off-the-shelf components. It also has the ability to evolve over time as storage technologies change. "You want to avoid 'forklift upgrades,' where you have to get rid of the old system and transfer all your data to a whole new system," Miller said.

According to Storer, businesses are beginning to recognize that archival storage is very different from simply backing up their data. "A backup is a safety net--you hope you won't need it. Archival data you do want to use--it's a valuable resource and you want to be able to mine it for information," he said.

Tapes work well for backups, in which data are written once, rarely read, and not kept indefinitely. But archival data should be easy to read, query, browse, and search, and tape has inherent weaknesses in these areas. Existing disk-based systems offer excellent performance, but rely on power-hungry central controllers.

"Energy usage is a big issue, so a lot of our effort in designing Pergamum focused on dramatically reducing power use," Miller said.

Pergamum uses individual building blocks consisting of a hard drive; a small, low-power processor (like the chip in an iPhone); a flash memory card; and an Ethernet port. These units, called "tomes," are connected using relatively inexpensive Ethernet switches.

"Each tome is like a minicomputer, but with very low power demands," Miller said. "When not in use, it can shut down almost completely."

Even when active, the devices use very little power (less than 13 watts), which can be delivered over the network using Power over Ethernet technology. As a result, each unit is essentially a self-contained box with a network connection. The flash memory provides low-power, persistent storage so that many operations can be performed without activating the hard drive.
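To get a feel for what the power figure means over a year, here is a small back-of-the-envelope sketch in Python. Only the 13-watt ceiling comes from the article; the idle draw (`IDLE_WATTS`) and the 5 percent duty cycle used below are hypothetical values chosen for illustration, not measurements of Pergamum.

```python
ACTIVE_WATTS = 13.0        # upper bound quoted in the article
IDLE_WATTS = 0.5           # hypothetical draw when a tome is almost fully shut down
HOURS_PER_YEAR = 24 * 365


def annual_kwh(active_fraction, active_watts=ACTIVE_WATTS, idle_watts=IDLE_WATTS):
    """Estimate yearly energy use (kWh) of one tome for a given duty cycle."""
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    # Time-weighted average power across active and idle periods.
    average_watts = (active_fraction * active_watts
                     + (1.0 - active_fraction) * idle_watts)
    return average_watts * HOURS_PER_YEAR / 1000.0


always_on = annual_kwh(1.0)      # tome active around the clock
mostly_idle = annual_kwh(0.05)   # hypothetical: active 5% of the time
```

Under these assumptions, a tome that never sleeps uses roughly 114 kWh a year, while one that is active 5 percent of the time uses under a tenth of that, which is why shutting down almost completely when idle matters so much.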

For reliability, Pergamum uses two levels of redundancy--within and between disks--to protect against both disk failures and errors in writing data to a disk (so-called "latent sector errors"). Tomes can easily be added to expand the system or to replace failed disks. And if hard disk drives become obsolete in 10 years, Pergamum won't suffer the same fate. The system doesn't care what the actual storage medium is, as long as the device can implement the simple protocol that allows it to function as part of the network.
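As a rough illustration of what redundancy "within and between disks" can look like, the Python sketch below applies simple XOR parity at both levels. This is an assumption made for the example: the article does not say which codes Pergamum actually uses, and the names `xor_blocks` and `rebuild` are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(blocks, parity, lost_index):
    """Recover blocks[lost_index] from the surviving blocks plus the parity block."""
    survivors = [b for i, b in enumerate(blocks) if i != lost_index]
    return xor_blocks(survivors + [parity])


# Level 1 (within a disk): parity over the sectors of one disk, so a single
# unreadable sector can be repaired using only data stored on that disk.
sectors = [b"AAAA", b"BBBB", b"CCCC"]
sector_parity = xor_blocks(sectors)
repaired_sector = rebuild(sectors, sector_parity, lost_index=1)

# Level 2 (between disks): parity over the contents of several disks, kept on
# a separate disk, so the contents of one failed disk can be rebuilt.
disks = [b"disk-0..", b"disk-1..", b"disk-2.."]
disk_parity = xor_blocks(disks)
rebuilt_disk = rebuild(disks, disk_parity, lost_index=2)
```

Plain XOR parity tolerates only one loss per group; it is used here because it is the simplest scheme that shows the idea at both levels.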

"In 50 years, the devices might use holographic storage," Storer said. "As long as you can wrap the new storage medium in this intelligent layer that speaks the protocol, it can participate in the network."

Pergamum is one of several related projects being developed by researchers in the Storage Systems Research Center (SSRC) at UCSC's Baskin School of Engineering. The center's other archival storage projects include Deep Store, which dramatically reduces the amount of space required to store data, and POTSHARDS, which provides long-term secure storage using "secret splitting" instead of traditional encryption. Both of these projects would be compatible with Pergamum, Miller said.

Source: University of California - Santa Cruz



  • superhuman - Apr 21, 2008
    A system built with off-the-shelf hard disks definitely won't "last 50 to 100 years".

    The described system is cheap and easily upgradable short-term storage.
  • gopher65 - Apr 21, 2008
    Gold-plated DVDs will last up to 300 years in optimal storage conditions, but they are a wee bit expensive. "Cheap" seems to be the key word in their idea, because we have better expensive systems right now.
  • holoman - Apr 21, 2008
    Non-volatile ferroelectric / multiferroic molecules have been shown in studies to maintain binary dipole position for ~100 years.

    Colossal Storage Corp., using multiferroics, is now in testbed proof at several universities, but don't expect any storage device until 2010-2012.
  • Star_Gazer - Apr 22, 2008
    I think the design of the device allows the hard drive to be completely off during periods of no access. Flash can cache some of the frequently used data, so it doesn't have to be read from the hard drive over and over. Hard drives in desktop and laptop computers spin for about 99% of the computer's uptime. They work reliably for 3-5 years, and after that the bearings start wearing out. So when the hard drive is on for a couple of hours a month, or a couple of hours total in a year, it may potentially last several hundred years, provided the humidity and temperature of the environment remain within operational parameters.
