Sorting facts and opinions for Homeland Security

September 24, 2006

What are newspapers around the world saying about the latest speech by President George W. Bush? More importantly, how much of what they are saying is factual and how much opinion? And down the line, are some of the opinions being presented as if they were facts?

A new research program by a Cornell computer scientist, in collaboration with colleagues at the University of Pittsburgh and University of Utah, aims to teach computers to scan through text and sort opinion from fact. The research is funded by the U.S. Department of Homeland Security, which has designated the consortium of three universities as one of four University Affiliate Centers (UAC) to conduct research on advanced methods for information analysis and to develop computational technologies that contribute to national security. Cornell will receive $850,000 of $2.4 million in funding provided for the consortium over three years.

"Lots of work has been done on extracting factual information -- the who, what, where, when," explained Claire Cardie, Cornell professor of computer science, who is one of three co-principal investigators for the grant. "We're interested in seeing how we would extract information about opinions."

Cardie is an expert on "information extraction," in which computers scan text to find meaning in natural language. Computer programmers and science fiction fans know that computers are usually very literal and demand that information be presented according to rigid rules. Humans, on the other hand, are capable of understanding that "Please pass the salt," "May I have the salt," "Hey, is there any salt down there?" and "Yuk, this really needs salt" all mean much the same thing. Cardie's computer programs try to bridge the gap by identifying subjects, objects and other key parts of sentences to determine meaning.

The new research will use machine-learning algorithms: computers will be given examples of text expressing both fact and opinion and trained to tell the difference. A simplified approach might be to look for phrases like "according to" or "it is believed." Ironically, Cardie said, one of the phrases most likely to indicate opinion is "It is a fact that ..."
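The simplified phrase-spotting idea can be sketched in a few lines of Python. This is only an illustration of the heuristic mentioned above, not the machine-learning system the researchers are building; the function name and the cue list are assumptions, with the three phrases taken from the article.

```python
# Hypothetical cue phrases, taken from the examples quoted in the article.
# A real system would learn such signals from labeled text instead.
OPINION_CUES = ("according to", "it is believed", "it is a fact that")


def find_opinion_cues(sentence):
    """Return the cue phrases that appear in the sentence (case-insensitive)."""
    lowered = sentence.lower()
    return [cue for cue in OPINION_CUES if cue in lowered]


if __name__ == "__main__":
    for s in ("It is a fact that taxes are too high.",
              "The bill passed on Tuesday."):
        cues = find_opinion_cues(s)
        print(s, "->", cues if cues else "no cue found")
```

A fixed phrase list like this misses most opinions and flags some plain reporting, which is why the project relies on algorithms trained on examples rather than hand-written rules.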

The work also will seek to determine the sources of information cited by a writer. "We're making sure that any information is tagged with a confidence. If it's low confidence, it's not useful information," Cardie added.

In addition to the research project, Cardie said, the new UAC has educational goals, seeking to train students to work in information extraction and presenting seminars and workshops for other researchers. The center also will offer summer seminars for women and underrepresented minority undergraduates.

The Department of Homeland Security established the UACs, Cardie said, partly because it currently lacks enough in-house expertise in natural-language processing. Although the research may raise fears about invasions of privacy, Cardie said she will be working only with publicly available material, primarily news reports and editorials from English-language newspapers worldwide.

"The techniques would have to be changed considerably to work on documents like e-mails," she noted.

The results, she added, will always include pointers to the original sources, so that when a computer draws some conclusion, human beings will be able to look at the original material and determine whether or not the conclusion was correct.
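One way to picture a result that carries both a confidence tag and a pointer back to its source is a small record type. The sketch below is an assumption made for illustration: the field names, the 0-to-1 confidence scale and the 0.5 cutoff are hypothetical and do not come from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractedStatement:
    text: str          # the extracted statement
    source: str        # pointer to the original document (hypothetical format)
    confidence: float  # assumed scale: 0.0 (no confidence) to 1.0 (certain)


def keep_confident(statements, threshold=0.5):
    """Drop low-confidence statements; the 0.5 threshold is an arbitrary choice."""
    return [s for s in statements if s.confidence >= threshold]


if __name__ == "__main__":
    results = [
        ExtractedStatement("Example statement A", "newspaper-1, page 3", 0.9),
        ExtractedStatement("Example statement B", "newspaper-2, editorial", 0.2),
    ]
    for s in keep_confident(results):
        # The source pointer lets a human check the original material.
        print(f"{s.text} (see {s.source}, confidence {s.confidence})")
```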

Source: Cornell University

