Context is ev ... well, something, anyway
March 5, 2010 by Larry Hardesty
Standard object recognition software mistakenly detects a sofa, a cabinet, and a mirror in a street scene (right), but a new MIT system corrects those errors (left) using statistical information about how often certain types of objects occur together. Image: Myung 'Jin' Choi
Today, computers can't reliably identify the objects in digital images. But if they could, they could comb through hours of video for the two or three minutes that a viewer might be interested in, or perform web searches where the search term was an image, not a sequence of words. And of course, object recognition is a prerequisite for the kind of home assistance robot that could execute an order like "Bring me the stapler." Now, MIT researchers have found a way to improve object recognition systems by using information about context. If the MIT system thinks it’s identified a chair, for instance, it becomes more confident that the rectangular thing nearby is a table.
A typical object recognition system will scan a digital image for groups of pixels that differ from those around them; those pixels could define an edge, a corner or some other feature of an object. Usually, the system has been trained on a set of sample images, which teaches it how to correlate feature patterns with particular objects.
Some researchers have tried to use context information to refine those correlations. But according to Myung “Jin” Choi, a grad student in MIT’s Laboratory for Information and Decision Systems and one of the leaders of the new project, those researchers were generally working with a standard training set that included examples of only about 20 different types of object. In that case, it was fairly straightforward to specify how frequently each object co-occurred with every other object in the set.
Upping the ante
A system that could recognize only 20 different objects, however, wouldn’t be very useful. And with a large number of objects, it becomes computationally impractical to consider the frequency of all possible two-object combinations. In work to be presented at the IEEE Conference on Computer Vision and Pattern Recognition this summer, Choi and her colleagues — including graduate student Joseph Lim and Professors Antonio Torralba and Alan Willsky — describe a different approach. Working with a training set that included more than 4,000 images and 107 different types of objects, they created algorithms that pored through the images and automatically constructed a hierarchical map of the object categories — kind of like the organizational chart for a large company, which shows who reports to whom. In the map, each object is connected to at most one object above it in the hierarchy (everyone in the organization reports to only one person), drastically reducing the number of connections that the system has to consider. The connection between any two objects is given a weight that indicates how often the objects appear together in the training images. The map also encodes information about the typical relative locations of two connected objects: buildings generally appear above roads, for instance, not below them.
When the system analyzes a new image, it uses standard object recognition algorithms to generate a list of candidate objects, together with each object’s “confidence score” — a statistical measure of how likely the object is to have been correctly identified. Then it revises those scores on the basis of the information encoded in the contextual map.
In experiments, Choi compared the performance of the bare object recognition algorithms with their performance when augmented by the contextual map. In both cases, she considered the three objects per image with the highest confidence scores. The bare algorithms correctly identified all three objects roughly 14 percent of the time; with the addition of the contextual map, the success rate jumped to about 25 percent.
Long row to hoe
Of course, that means that the system still failed to correctly identify three objects per image about 75 percent of the time, which shows just how difficult the problem of object recognition remains. “Context really is essential,” says Serge Belongie, an associate professor of computer science at the University of California, San Diego who has worked on both object recognition in general and context-based object recognition in particular. “It deserves a proper treatment, and Jin is doing that.” But Belongie cautions that context awareness will never be more than an augmentation of an underlying system that recognizes objects from visual features. “We absolutely cannot afford to take our eye off the ball of the component recognition systems that need to feed these context engines,” he says. And, he adds, to be useful, object recognition systems will need to be much more precise than today’s prototypes are. “Imagine that you take a picture of a wild mushroom while you’re hiking,” Belongie says, “and then you send it to the system to find out what it is. And it says, ‘Mushroom!’ You’re like, Thanks. That’s really useful. I knew that part.”
Nonetheless, Choi is continuing to improve her contextual-map system, against the day when the underlying algorithms are more reliable. The next version of the system, she says, will add entries to the map that, in effect, represent higher-level scene descriptions. Street scenes, for instance, may frequently feature sky, buildings and roads, while building interiors may frequently feature floors, walls and windows. The system won’t need to explicitly label these additional map entries, however; it will simply register them as foci around which certain types of objects regularly cluster. She’s confident that this modification will make the added benefits of context awareness even more acute.
Provided by Massachusetts Institute of Technology (news : web)
-
Seeing things: Researchers teach computers to recognize objects
Oct 13, 2009 |
not rated yet |
0
-
Out of sight, out of mind? Not really
Aug 23, 2005 |
not rated yet |
0
-
Shape encoding may start in the retina
Sep 12, 2007 |
not rated yet |
0
-
New search technique for images and videos has broad applications
Nov 10, 2009 |
not rated yet |
0
-
Toddlers link new words to whole objects
Oct 17, 2007 |
not rated yet |
0
-
Engineers build first sub-10-nm carbon nanotube transistor
Feb 01, 2012 |
4.9 / 5 (30) |
30
-
Something old, something new: Evolution and the structural divergence of duplicate genes
Jan 31, 2012 |
4.6 / 5 (7) |
1
-
The hidden nanoworld of ice crystals: Revealing the dynamic behavior of quasi-liquid layers
Jan 30, 2012 |
5 / 5 (3) |
1
-
Stock market network reveals investor clustering
Jan 27, 2012 |
3.9 / 5 (23) |
8
-
Of microchemistry and molecules: Electronic microfluidic device synthesizes biocompatible probes
Jan 26, 2012 |
5 / 5 (1) |
0
-
Synergistic relations between computer science and technology.
Feb 06, 2012
-
how do iphone gloves work?
Feb 05, 2012
-
iPhone battery over time
Jan 30, 2012
-
Best alternate Tablet to an iPad for writing math or physics equations?
Jan 26, 2012
-
Sending SMS to a website
Jan 20, 2012
-
Need help with my technical fest!
Jan 19, 2012
- More from Physics Forums - Computing & Technology
More news stories
Advanced power-grid model finds low-cost, low-carbon future in West
(PhysOrg.com) -- The least expensive way for the Western U.S. to reduce greenhouse gas emissions enough to help prevent the worst consequences of global warming is to replace coal with renewable and other ...
Technology / Energy & Green Tech
1 hour ago |
5 / 5 (1) |
1
|
Small modular reactor design could be a 'SUPERSTAR'
(PhysOrg.com) -- Though most of today's nuclear reactors are cooled by water, we've long known that there are alternatives; in fact, the world's first nuclear-powered electricity in 1951 came from a reactor ...
Technology / Energy & Green Tech
1 hour ago |
5 / 5 (3) |
3
|
Engineering images bring life to submerged city
(PhysOrg.com) -- Photo-realistic 3D mapping and digital reconstruction of an ancient underwater city in Greece have earned a team from the University of Sydney's Faculty of Engineering and Information Technologies ...
36 minutes ago |
not rated yet |
0
New power source discovered
(PhysOrg.com) -- Researchers at the Massachusetts Institute of Technology (MIT) and RMIT University have made a breakthrough in energy storage and power generation.
Technology / Energy & Green Tech
36 minutes ago |
5 / 5 (1) |
0
World's first 300mm-fab compatible directed self-assembly process line
At next weeks SPIE Advanced Lithography conference (San Jose, CA), imec announces the successful implementation of the world first 300mm fab-compatible Directed Self-Assembly (DSA) process line all-under-one-roof ...
42 minutes ago |
not rated yet |
0
Mars Science Laboratory computer issue resolved
(PhysOrg.com) -- Engineers have found the root cause of a computer reset that occurred two months ago on NASA's Mars Science Laboratory and have determined how to correct it.
Clam fields found at deep, low-temperature Mariana vents
(PhysOrg.com) -- Scientists have marveled at the unusual life forms thriving at high temperature hydrothermal vents of the deep ocean.
Seeing colors in music, tasting flavors in shapes may happen in life's early months
Famed violinist Itzhak Perlman sees a deep forest green whenever he plays a B-flat on his Stradivarius' G string. The A on the E string is red.
Could Venus be shifting gear?
(PhysOrg.com) -- ESAs Venus Express spacecraft has discovered that our cloud-covered neighbour spins a little slower than previously measured. Peering through the dense atmosphere in the infrared, the ...
The question of life in the ancient world
Theres a general feeling that we dont get the Greeks ancient or modern. Many, including heads of state like Angela Merkel, visibly shake their head in exasperation, rightly or wrongly, at ...
Study suggests girls can 'rewire' brains to ward off depression
(Medical Xpress) -- What if you could teach your brain to respond differently to things that make you feel sad, down or stressed out? What if doing that helped ward off depression?