Seeing things: Researchers teach computers to recognize objects

October 13, 2009 by Larry Hardesty

A new object recognition algorithm treats unrelated images as if they were consecutive frames of video. Since it assumes that the objects in both images are the same, it tries to deform the objects in the first image until they map onto the objects in the second. If the objects in one of the images have already been outlined and labeled, the algorithm can simply transfer the labels to the other image. Courtesy of Ce Liu

(PhysOrg.com) -- If computers could recognize objects, they could automatically search through hours of video footage for a particular two-minute scene. A tourist strolling down a street in a strange city could take a cell-phone photo of an unmarked monument and immediately find out what it was. And an Internet image search on, say, "Shakespeare" would pull up pictures of Shakespeare, not pictures of Gwyneth Paltrow in the movie Shakespeare in Love. Though object recognition is one of the major research topics in computer vision, MIT researchers may have found a way to make it much more practical.

Typically, object recognition algorithms need to be "trained" using images in which objects have been outlined and labeled by hand. By looking at a million pictures of cars labeled "car," an algorithm can learn to recognize features shared by images of cars. The problem is that for every new class of objects (trees, buildings, telephone poles), the algorithm has to be trained all over again.

But Antonio Torralba, the Esther and Harold E. Edgerton Associate Professor of Electrical Engineering and Computer Science, and Computer Science and Artificial Intelligence Lab graduate students Ce Liu, PhD '09, and Jenny Yuen have developed an object recognition system that doesn't require any training. Nonetheless, it still identifies objects with 50 percent greater accuracy than the best prior algorithm.

The system uses a modified version of a so-called motion estimation algorithm, a type of algorithm common in video processing. Since consecutive frames of video usually change very little, data compression schemes often store the unchanging aspects of a scene once, updating only the positions of moving objects. The motion estimation algorithm determines which objects have moved from one frame to the next. In a video, that's usually fairly easy to do: most objects don't move very far in one-thirtieth of a second. Nor does the algorithm need to know what the object is; it just has to recognize, say, corners and edges, and how their appearance typically changes under different perspectives.
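To make the idea of motion estimation concrete, here is a minimal sketch of block matching, one common textbook form of the technique. It is not the researchers' modified algorithm, which the article does not spell out. The function name `block_match` and the `block` and `search` parameters are hypothetical choices for illustration.

```python
import numpy as np

def block_match(prev, curr, block=8, search=4):
    """Estimate one (dy, dx) motion vector per block by exhaustive search.

    For each block of `prev`, try every offset within +/- `search` pixels
    and keep the one whose block in `curr` has the smallest sum of
    absolute differences. Illustrative only; assumes grayscale frames.
    """
    h, w = prev.shape
    rows, cols = h // block, w // block
    vectors = np.zeros((rows, cols, 2), dtype=int)
    for r in range(rows):
        for c in range(cols):
            y, x = r * block, c * block
            ref = prev[y:y + block, x:x + block].astype(float)
            best = None
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    yy, xx = y + dy, x + dx
                    # Skip candidate blocks that fall outside the frame.
                    if yy < 0 or xx < 0 or yy + block > h or xx + block > w:
                        continue
                    cand = curr[yy:yy + block, xx:xx + block].astype(float)
                    cost = np.abs(ref - cand).sum()
                    # On equal cost, prefer the smaller displacement.
                    key = (cost, abs(dy) + abs(dx))
                    if best is None or key < best[0]:
                        best = (key, (dy, dx))
            vectors[r, c] = best[1]
    return vectors
```

Note that the search compares only pixel patterns, which matches the article's point that the algorithm never needs to know what the object is.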

The MIT researchers' new system essentially treats unrelated images as if they were consecutive frames in a video sequence. When the modified motion estimation algorithm tries to determine which objects have "moved" between one image and the next, it usually picks out objects of the same type: it will guess, for instance, that the 2006 Infiniti in image two is the same object as the 1965 Chevy in image one.

If the first image comes from the type of database used to train computer vision systems, the Chevy will already be labeled "car." The new system simply transfers the label to the Infiniti.
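The label transfer step can be sketched as follows, assuming the matching stage has already produced a per-pixel correspondence between the two images. The function `transfer_labels`, the integer label map, and the `flow` array layout are all assumptions made for this illustration, not details from the researchers' system.

```python
import numpy as np

def transfer_labels(labels_src, flow):
    """Copy labels from a labeled image onto an unlabeled one.

    labels_src: (H, W) integer label map of the labeled image.
    flow: (H, W, 2) integer offsets. Pixel (y, x) of the unlabeled image
          is assumed to correspond to (y + flow[y, x, 0], x + flow[y, x, 1])
          in the labeled image.
    """
    h, w = labels_src.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Clamp correspondences that point outside the labeled image.
    src_y = np.clip(ys + flow[..., 0], 0, h - 1)
    src_x = np.clip(xs + flow[..., 1], 0, w - 1)
    return labels_src[src_y, src_x]

# Toy example: label 1 stands for "car", 0 for background.
labels = np.zeros((4, 6), dtype=int)
labels[1:3, 1:3] = 1
flow = np.zeros((4, 6, 2), dtype=int)
flow[..., 1] = -2  # every pixel matches the one two columns to its left
transferred = transfer_labels(labels, flow)
```

In this toy case the "car" region lands two columns to the right in the unlabeled image, which is the same bookkeeping the article describes for the Chevy and the Infiniti.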

The greater the resemblance of the labeled and unlabeled images, the better the algorithm works. Fortunately, Torralba's earlier work was largely directed toward amassing a huge database of labeled images. Torralba and his colleagues have developed a simple web-based system called LabelMe that lets online volunteers tag objects in digital images, and they also created a web site called 80 Million Tiny Images that sorts the images according to subject matter. When confronted with an unlabeled image, the new object recognition algorithm is likely to find something similar in Torralba's database. And as the database grows larger, that likelihood will only increase.
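Finding "something similar" in a large labeled database can be sketched as a nearest-neighbor lookup. The article does not say how the researchers measure similarity, so the plain pixel-wise distance below, along with the name `nearest_labeled`, is only an assumption for illustration.

```python
import numpy as np

def nearest_labeled(query, database):
    """Return the index of the database image closest to `query`.

    query: (H, W) array. database: (N, H, W) array of same-size images.
    Similarity is the sum of squared pixel differences (an assumption;
    a real system would likely use richer image features).
    """
    diffs = database.astype(float) - query.astype(float)
    dists = (diffs ** 2).sum(axis=(1, 2))
    return int(np.argmin(dists))
```

With a lookup like this, adding more labeled images can only keep or shrink the distance to the best match, which is why a growing database helps.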

"It's a real commonsense solution to a fundamental problem in computer vision," says Marshall Tappen, a researcher at the University of Central Florida. "The results are great and better than you can get with much more complicated methods." Tappen adds that "a large database makes it possible to do lots of really interesting things that no one's even envisioned. There are lots of interesting things it can do beyond just standard object recognition, so I think it's really going to enable a lot of innovation." Tappen points in particular to recent work on image editing and image completion done by Alyosha Efros at Carnegie Mellon University. "If you look at his last few Siggraph papers, they're all using LabelMe," Tappen says, referring to papers presented at Siggraph, the major conference in the field of computer graphics.

Provided by Massachusetts Institute of Technology

