Researchers Use Wikipedia To Make Computers Smarter

January 6th, 2007

Researchers at the Technion-Israel Institute of Technology have found a way to give computers encyclopedic knowledge of the world to help them “think smarter,” making common sense and broad-based connections between topics just as the human mind does.

The new method will help computers filter e-mail spam, perform Web searches and even conduct electronic intelligence gathering at a much more sophisticated level than current programs, according to researchers Evgeniy Gabrilovich and Shaul Markovitch of the Technion Faculty of Computer Science. The findings will be presented next week in Hyderabad, India, at the Twentieth International Joint Conference on Artificial Intelligence.

The program devised by the Technion researchers helps computers map single words and larger fragments of text to a database of concepts built from the online encyclopedia Wikipedia, which has over one million articles in its English-language version. The Wikipedia-based concepts act as “background knowledge” to help computers figure out the meaning of the text entered into a Web search, for instance.
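To make the idea of mapping text to concepts concrete, here is a minimal sketch in Python. It is an illustration only, not the researchers' program: the `WORD_TO_CONCEPTS` table, its weights, and the function names are all invented for this example, standing in for an index that would be built from the full encyclopedia.

```python
import re
from collections import defaultdict

# Hypothetical, hand-written stand-in for an index built from Wikipedia:
# each word points to article titles ("concepts") with an invented weight.
WORD_TO_CONCEPTS = {
    "jaguar": {"Jaguar (animal)": 0.6, "Jaguar Cars": 0.5},
    "engine": {"Jaguar Cars": 0.4, "Internal combustion engine": 0.7},
    "jungle": {"Jaguar (animal)": 0.5, "Rainforest": 0.8},
}


def concept_vector(text):
    """Sum the concept weights of every known word in the text."""
    vector = defaultdict(float)
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        for concept, weight in WORD_TO_CONCEPTS.get(word, {}).items():
            vector[concept] += weight
    return dict(vector)


def top_concepts(text, n=3):
    """Return up to n concept names, strongest association first."""
    vector = concept_vector(text)
    return sorted(vector, key=vector.get, reverse=True)[:n]


if __name__ == "__main__":
    print(top_concepts("jaguar engine"))
```

In this toy table, a search for "jaguar engine" ranks the car maker above the animal because both words contribute weight to the same concept, which is the kind of background knowledge the article describes.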

Giving computers this deeper knowledge has been a long-standing problem in artificial intelligence, according to Markovitch. “Humans use a significant amount of background knowledge” to understand text, “but we didn’t know how to have computers access such knowledge,” he said.

Most Web search and e-mail filter programs appear smart by calculating how often certain words appear in two texts, Markovitch explained. “But what is common to all these applications is that the programs that actually do this kind of thing don’t understand text. They treat text as a collection of words, but they don’t understand the meaning of words.”
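The word-counting approach Markovitch describes can be sketched as follows. This is a generic bag-of-words comparison using cosine similarity, chosen here as a common example of the technique; the article does not say which formula any particular search engine or filter uses, and the function names are assumptions.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Count how often each lowercase word appears in the text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 to 1.0)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    first = bag_of_words("cheap vitamin pills")
    second = bag_of_words("discount B12 supplements")
    # The two texts share no words, so the score is zero even though
    # a human reader would see them as closely related.
    print(cosine(first, second))
```

Because the comparison looks only at shared words, two texts on the same subject that happen to use different vocabulary look entirely unrelated to the program.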

This shallow understanding is what makes an e-mail spam filter block all messages containing the word “vitamin” but fail to block messages containing the word “B12.” “If the program never saw ‘B12’ before, it’s just a word without any meaning. But you would know it’s a vitamin,” Markovitch said.

“With our methodology, however, the computer will use its Wikipedia-based knowledge base to infer that ‘B12’ is strongly associated with the concept of vitamins, and will correctly identify the message as spam,” he added.
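The difference between the two kinds of filter in the "B12" example can be shown with a small Python sketch. Everything in it is assumed for illustration: the keyword list, the word-to-concept weights, the threshold and the function names are invented, and a real filter built on the researchers' method would draw its associations from Wikipedia rather than from a hand-written table.

```python
import re

# Words a simple keyword filter has been told to block.
SPAM_KEYWORDS = {"vitamin"}

# Hypothetical word-to-concept associations with invented weights.
WORD_TO_CONCEPTS = {
    "vitamin": {"Vitamin": 1.0},
    "b12": {"Vitamin B12": 1.0, "Vitamin": 0.8},
    "meeting": {"Meeting": 1.0},
}
SPAM_CONCEPTS = {"Vitamin"}
THRESHOLD = 0.5  # assumed cutoff for "strongly associated"


def tokens(message):
    return re.findall(r"[a-z0-9]+", message.lower())


def keyword_filter(message):
    """Flag a message only if it contains a blocked word verbatim."""
    return any(word in SPAM_KEYWORDS for word in tokens(message))


def concept_filter(message):
    """Flag a message if any word is strongly tied to a blocked concept."""
    best = 0.0
    for word in tokens(message):
        for concept, weight in WORD_TO_CONCEPTS.get(word, {}).items():
            if concept in SPAM_CONCEPTS:
                best = max(best, weight)
    return best >= THRESHOLD


if __name__ == "__main__":
    message = "Buy B12 now"
    print(keyword_filter(message), concept_filter(message))
```

The keyword filter lets the message through because the literal word "vitamin" never appears, while the concept-based check flags it through the association between "B12" and vitamins.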

Or, computers could look at a chunk of text about Saddam Hussein and weapons of mass destruction and know that it is conceptually related to topics such as the Iraq war and U.S. Senate debates on intelligence—even if those terms do not appear anywhere in the original text.

The method also helps computers figure out ambiguous terms—deciding, for instance, whether the word “mouse” refers to the computer device or the fuzzy animal. This can be especially important in translated documents, Markovitch said.
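One simple way to picture the "mouse" decision is to compare the surrounding words with words typical of each meaning. The Python sketch below does this with two tiny, hand-picked word lists; the lists, sense labels and function name are assumptions made for the example and are not taken from the researchers' work.

```python
import re

# Hypothetical context words for each sense of "mouse".
SENSES = {
    "Mouse (computing)": {"computer", "click", "cursor", "usb", "keyboard"},
    "Mouse (animal)": {"cheese", "rodent", "tail", "cat", "fur"},
}


def disambiguate(text):
    """Pick the sense whose context words overlap most with the text."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    scores = {sense: len(words & context) for sense, context in SENSES.items()}
    best = max(scores, key=scores.get)
    # With no overlapping context words there is nothing to decide on.
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(disambiguate("The mouse stopped moving the cursor"))
    print(disambiguate("The cat chased the mouse"))
```

A sentence with no telltale context words returns `None` in this sketch, reflecting that the choice depends entirely on what else the text says.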

In the near future, the Technion researchers hope to improve their method by adding information from the Web page links inside Wikipedia articles. They are already pursuing a patent on their work, which they say will be of interest to the intelligence community and Web search engine companies, among others.

Source: American Technion Society

