Researchers Use Wikipedia To Make Computers Smarter

January 6, 2007

Researchers at the Technion-Israel Institute of Technology have found a way to give computers encyclopedic knowledge of the world to help them “think smarter,” making common sense and broad-based connections between topics just as the human mind does.

The new method will help computers filter e-mail spam, perform Web searches and even conduct electronic intelligence gathering at a much more sophisticated level than current programs, according to researchers Evgeniy Gabrilovich and Shaul Markovitch of the Technion Faculty of Computer Science. The findings will be presented next week in Hyderabad, India, during the Twentieth International Joint Conference on Artificial Intelligence.

The program devised by the Technion researchers helps computers map single words and larger fragments of text to a database of concepts built from the online encyclopedia Wikipedia, which has over one million articles in its English-language version. The Wikipedia-based concepts act as “background knowledge” to help computers figure out the meaning of the text entered into a Web search, for instance.
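As a rough illustration of mapping text to encyclopedia-derived concepts, the Python sketch below builds a small word-to-concept index and uses it to rank concepts for a query. It is only a sketch under stated assumptions: the three "articles" in `CONCEPTS` are invented stand-ins for Wikipedia entries, the TF-IDF weighting is a common choice rather than a detail given in this article, and the function names are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stand-ins for Wikipedia articles: title -> article text.
CONCEPTS = {
    "Vitamin": "vitamin b12 and vitamin c are nutrients sold as supplement pills",
    "Computer mouse": "a mouse is a pointing device used to move a cursor on a computer screen",
    "House mouse": "the mouse is a small rodent with fur whiskers and a long tail",
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(concepts):
    """Return a mapping: word -> {concept title: TF-IDF weight}."""
    term_counts = {title: Counter(tokenize(body)) for title, body in concepts.items()}
    doc_freq = Counter(word for counts in term_counts.values() for word in counts)
    n = len(concepts)
    index = defaultdict(dict)
    for title, counts in term_counts.items():
        for word, tf in counts.items():
            idf = math.log(n / doc_freq[word])
            if idf > 0:  # a word found in every article carries no signal
                index[word][title] = tf * idf
    return index

def interpret(text, index):
    """Map a text fragment to concepts, ranked by summed word weights."""
    scores = Counter()
    for word in tokenize(text):
        for title, weight in index.get(word, {}).items():
            scores[title] += weight
    return scores.most_common()

INDEX = build_index(CONCEPTS)
# "Vitamin" ranks first: it is the only toy article mentioning "b12" or "pills".
print(interpret("cheap B12 pills", INDEX))
```

With a realistic number of articles, the same idea would give each text a long weighted list of concepts that can then serve as background knowledge for search or filtering.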

Giving computers this deeper knowledge has been a long-standing problem in artificial intelligence, according to Markovitch. “Humans use a significant amount of background knowledge” to understand text, “but we didn’t know how to have computers access such knowledge,” he said.

Most Web search and e-mail filter programs appear smart by calculating how often certain words appear in two texts, Markovitch explained. “But what is common to all these applications is that the programs that actually do this kind of thing don’t understand text. They treat text as a collection of words, but they don’t understand the meaning of words.”
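The word-counting approach Markovitch describes can be sketched in a few lines of Python. The example below is an assumption-laden illustration, not any particular product's filter: it represents each message as a bag of word counts and compares two messages with cosine similarity, a standard measure chosen here for simplicity. The sample messages are invented.

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    """Represent a text purely as word counts, ignoring meaning and order."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

known_spam = bag_of_words("Buy vitamin supplements today")
same_word = bag_of_words("Discount vitamin offer")
new_word = bag_of_words("Discount B12 offer")

print(cosine(known_spam, same_word))  # positive: both messages contain "vitamin"
print(cosine(known_spam, new_word))   # 0.0: the messages share no words
```

Because the second comparison shares no words with the known spam, a purely word-based program sees no relationship at all between the two messages.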

This shallow understanding is what makes an e-mail spam filter block all messages containing the word “vitamin,” but fail to block messages containing the word “B12.” “If the program never saw ‘B12’ before, it’s just a word without any meaning. But you would know it’s a vitamin,” Markovitch said.

“With our methodology, however, the computer will use its Wikipedia-based knowledge base to infer that ‘B12’ is strongly associated with the concept of vitamins, and will correctly identify the message as spam,” he added.

Or, computers could look at a chunk of text about Saddam Hussein and weapons of mass destruction and know that it is conceptually related to topics such as the Iraq war and U.S. Senate debates on intelligence—even if those terms do not appear anywhere in the original text.

The method also helps computers figure out ambiguous terms—deciding, for instance, whether the word “mouse” refers to the computer device or the fuzzy animal. This can be especially important in translated documents, Markovitch said.
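One simple way to picture this kind of disambiguation is a word-overlap heuristic: pick the sense whose typical vocabulary shares the most words with the surrounding sentence. The Python sketch below does exactly that. It is a deliberately simplified stand-in, not the researchers' method, and the sense vocabularies in `SENSES` are hypothetical word lists rather than real Wikipedia content.

```python
import re

# Hypothetical senses of "mouse", each with words an article on it might contain.
SENSES = {
    "Computer mouse": {"computer", "cursor", "click", "button", "usb", "screen", "device"},
    "House mouse": {"rodent", "fur", "tail", "cheese", "cat", "whiskers", "animal"},
}

def disambiguate(sentence, senses):
    """Return the sense sharing the most words with the sentence, or None if none match."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    overlap = {sense: len(words & vocab) for sense, vocab in senses.items()}
    best = max(overlap, key=overlap.get)
    return best if overlap[best] > 0 else None

print(disambiguate("The mouse cursor froze on my screen", SENSES))   # Computer mouse
print(disambiguate("The cat chased the mouse by its tail", SENSES))  # House mouse
```

A sentence with no supporting context words returns `None`, which reflects the fact that the word alone does not settle which meaning is intended.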

In the near future, the Technion researchers hope to improve their method by adding information from the Web page links inside Wikipedia articles. They are already pursuing a patent on their work, which they say will be of interest to the intelligence community and Web search engine companies, among others.

Source: American Technion Society
