Can networked human computation solve computer language comprehension?

January 26, 2009

Researchers at the University of Essex hope to answer this question by getting more volunteers to take part in their online game, Phrase Detectives.

Jon Chamberlain, from Essex's School of Computer Science and Electronic Engineering, explains: ‘Human language is not an unconnected series of words, phrases and sentences but a series of people, objects and ideas that refer to each other in different ways. The complexity of language makes it sound "natural" to a reader but it can be difficult to define the rules that allow us to understand it.

‘Consider the statement: "Mary is a teacher who is 25 years old. She lives in England." A human reader can easily ascertain facts about Mary's occupation, age and residence by, for example, knowing that the word "she" refers to the person "Mary". However, comprehending this type of language referencing is a challenge facing programmers when designing computer systems that try to understand text, such as search, translation and summarisation systems.’

This is where the work of those playing Phrase Detectives becomes important. The game, part of a larger project called AnaWiki, is an attempt to address the bottleneck in creating annotated linguistic resources. By initially investigating anaphoric references (as in the example above) the project aims to develop a resource larger than anything currently available.

Players (or detectives) register at: http://www.phrased … tectives.org and read through texts, making annotations to highlight relationships between words and phrases. They may be asked to ‘name the culprit’: given a word or phrase, they must find the earlier mention in the text that it refers to. For example: ‘Sherlink Holmes went to the shop. He got some tobacco for his pipe.’ The word ‘he’ refers to ‘Sherlink Holmes’.
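As a rough illustration of the kind of record a 'name the culprit' answer could produce, here is a minimal Python sketch. The `Annotation` class, the `name_the_culprit` function, the player name and the whitespace tokenisation are all assumptions made for this example; the article does not describe how Phrase Detectives actually stores its data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One player's answer to a 'name the culprit' task (hypothetical schema)."""
    player: str
    markable: str      # the word or phrase shown to the player
    antecedent: str    # the earlier phrase the player says it refers to


def name_the_culprit(tokens, player, markable_pos, ante_start, ante_len):
    """Record that tokens[markable_pos] refers back to an earlier span."""
    # The antecedent span must end before the markable begins.
    if ante_start < 0 or ante_len < 1 or ante_start + ante_len > markable_pos:
        raise ValueError("the antecedent must appear earlier in the text")
    return Annotation(
        player=player,
        markable=tokens[markable_pos],
        antecedent=" ".join(tokens[ante_start:ante_start + ante_len]),
    )


# Naive whitespace tokenisation, for illustration only.
tokens = "Sherlink Holmes went to the shop . He got some tobacco for his pipe .".split()
ann = name_the_culprit(tokens, "detective_1", markable_pos=7, ante_start=0, ante_len=2)
print(ann.markable, "->", ann.antecedent)
```

Storing token positions rather than raw strings is one way to keep two mentions of the same word apart, which matters once a text contains several instances of 'he'.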

Jon added: ‘Players of the game are helping to create a resource that is rich in linguistic information and improves future technology. This project aims to collect a significant amount of data and investigate the possibility of using mass collaboration to train computer systems.

‘The best way to understand a language is to have lots of examples where the meaning has been clarified. Unfortunately creating this type of resource is both time-consuming and expensive, but the new approach offered by Phrase Detectives should address this resource shortage. The same methodology could also be used to create resources for machine translation, semantics and other linguistic phenomena.’

So far, players have made over 40,000 annotations in four weeks. The researchers hope more people will join as detectives and add new text to the site for analysis.

Phrase Detectives can be defined as part of a genre of “games with a purpose” (GWAP) that collect data on images, texts and music. The crucial element of these games is that players receive points for agreeing with each other. They are motivated to collaborate with their partners in order to score maximum points. This ensures that players are attempting to provide good quality information, as this will result in the most agreement.
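The agreement idea can be sketched in a few lines of Python. This is an assumed scoring rule for illustration only (one point for every other player who gave the same answer); the article does not specify the scoring used by Phrase Detectives or other GWAP titles, and the function and player names are hypothetical.

```python
from collections import Counter


def score_agreement(choices):
    """Return the most common answer and each player's agreement points.

    `choices` maps a player name to the antecedent that player selected.
    Assumed rule: a player earns one point per other player who agrees.
    """
    counts = Counter(choices.values())
    points = {player: counts[answer] - 1 for player, answer in choices.items()}
    consensus, _votes = counts.most_common(1)[0]
    return consensus, points


# Three hypothetical players answer what 'He' refers to.
choices = {
    "ann": "Sherlink Holmes",
    "bob": "Sherlink Holmes",
    "cat": "the shop",
}
consensus, points = score_agreement(choices)
print(consensus, points)
```

Under a rule like this, a careless or random answer rarely matches anyone else's and so earns nothing, which is the incentive for good-quality information that the paragraph above describes.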

The Essex researchers believe Phrase Detectives is the first attempt to collect linguistic judgements using a fun, collaborative online game. They aim to make the tasks and the texts interesting so it feels more like a computer game than a linguistic task. The data collected can then be used to improve computer systems that try to understand text. For example, it could help search engines find information more relevant to your searches.

So, can networked human computation really solve complex language comprehension tasks on computers? Initial results from the beta version of the game look promising, and more detailed analysis will be completed in early 2009.

Source: University of Essex


gmurphy
Jan 26, 2009

wow, this is really cool, a solid dataset for training text comprehension algorithms.