Researchers classify Web searches

April 10, 2008

Although millions of people use Web search engines, researchers have shown that most submitted queries can be classified into one of three categories using relatively simple methods.

Jim Jansen, assistant professor in Penn State's College of Information Sciences and Technology, worked with IST undergraduate Danielle Booth and Amanda Spink of Queensland University of Technology to find that Web search engine users primarily perform informational, navigational or transactional searches.

Informational searching involves looking for a specific fact or topic; navigational searching seeks to locate a specific Web site; and transactional searching looks for information related to buying a particular product or service.

The research was the first published work of its kind done using actual searching data, with the aim of real-time classification. Researchers analyzed more than 1.5 million queries from hundreds of thousands of search engine users. Findings showed that about 80 percent of queries are informational and about 10 percent each are navigational and transactional.

Jansen and his colleagues arrived at those results by selecting random samples of records and analyzing query length, the order of the query in the session and the search results. These fields helped the team develop an algorithm that classified the searches with a 74-percent accuracy rate.
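To give a rough sense of what a computationally light, rule-based classifier of this kind might look like, here is a minimal sketch in Python. It is not the researchers' algorithm: the cue lists, the function names `classify` and `category_shares`, and the rules themselves are assumptions made purely for illustration, and the sketch looks only at the query text, leaving out the session order and search results the team also analyzed.

```python
from collections import Counter

# Hypothetical cue lists, chosen for illustration only (not from the paper).
NAVIGATIONAL_CUES = ("www.", "http", ".com", ".org", ".net", ".edu")
TRANSACTIONAL_CUES = {"buy", "download", "price", "order", "purchase", "coupon"}


def classify(query_text):
    """Assign one of the three intent categories using simple text rules."""
    text = query_text.lower().strip()
    terms = text.split()

    # Assumed rule: URL-like fragments suggest the user wants a specific site.
    if any(cue in text for cue in NAVIGATIONAL_CUES):
        return "navigational"

    # Assumed rule: purchase-related terms suggest a transactional intent.
    if any(term in TRANSACTIONAL_CUES for term in terms):
        return "transactional"

    # Everything else falls back to the most common category.
    return "informational"


def category_shares(queries):
    """Return the fraction of queries that fall into each category."""
    counts = Counter(classify(q) for q in queries)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


if __name__ == "__main__":
    sample = ["www.psu.edu", "buy laptop", "history of rome", "what is dna"]
    for q in sample:
        print(q, "->", classify(q))
    print(category_shares(sample))
```

Each rule is a cheap string check, which is the kind of property that would make classification in real time plausible; a real system would need rules derived from and validated against actual query logs.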

"Other results have classified comparatively much smaller sets of queries, usually manually," Jansen said. "This research aimed to classify queries automatically.

"Our findings have broad implications for search engines and e-commerce if they can classify the user intent of queries in real time. This is why we wanted a computationally undemanding algorithm," Jansen continued. "It proves the 80/20 rule that 80 percent of the cases can be achieved with these clear-cut methods."

The paper "Determining the informational, navigational and transactional intent of Web queries" will appear in the May 2008 issue of Information Processing & Management. The article is currently available online.

The Penn State researcher said he plans to continue this research using a more complex algorithm that will hopefully yield a 90-percent accuracy rate using similar searching criteria.

Source: Penn State
