New metasearch engine leaves Google, Yahoo crawling

March 25, 2009

Weiyi Meng, a professor of computer science at Binghamton University, State University of New York. Credit: Jonathan Cohen

One day in the not-too-distant future, you'll be able to type a query into an online search engine and have it deliver not Web pages that may contain an answer, but just the answer itself, says Weiyi Meng, a professor of computer science at Binghamton University, State University of New York.

For instance, imagine typing in "Who starred in the film Casablanca?" The search engine would respond with "Humphrey Bogart and Ingrid Bergman."

Not impressed?

Try asking a more nuanced question, such as "What do Americans think of universal health care?" A search engine will create a report indicating trends in opinion based on what has been posted to the Web.

Search engines may eventually be used to conduct polling and even help sort fact from fiction, said Meng, who is helping to make such possibilities a reality, both through his research and as president of a company called Webscalers.

The way Meng sees it, big search engines such as Google and Yahoo are fundamentally flawed. The Web has two parts: the surface Web and the deep Web. The surface Web is made up of perhaps 60 billion pages. The deep Web, at some 900 billion pages, is about 15 times larger.

Google, which relies on a "Web crawler" to examine pages and catalog them for future searches, can search about 20 billion pages. Web crawlers follow links to reach pages and often miss content that isn't linked from any other page or is in some way "hidden."
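To illustrate why link-following misses unlinked content, here is a minimal sketch of a crawler as a breadth-first traversal. It is not how any real search engine is implemented; the toy link graph and page names such as `db-record-42` are invented for illustration.

```python
from collections import deque

def crawl(link_graph, seeds):
    """Return every page reachable by following links from the seed pages."""
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        for target in link_graph.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

# Hypothetical toy Web: keys are pages, values are the pages they link to.
TOY_WEB = {
    "home": ["news", "about"],
    "news": ["story-1"],
    "about": [],
    "story-1": ["home"],
    "db-record-42": [],  # exists, but no other page links to it
}

reached = crawl(TOY_WEB, ["home"])
missed = set(TOY_WEB) - reached  # pages the crawler never discovers
```

In this sketch the page that only a site's own search form would surface, `db-record-42`, ends up in `missed`, which is the situation the article describes for deep Web content.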

Meng, along with researchers at the University of Illinois at Chicago and the University of Louisiana at Lafayette, has helped pioneer large-scale metasearch-engine technology that harnesses the power of small search engines to come up with results that are more accurate and more complete.

"Most of the pages on the deep Web aren't directly 'crawlable.' We want to connect to small search engines and reach the deep Web," he said. "That's the idea. Many people have the [misconception] that Google can search everything, and if it's not there it doesn't exist. But we should be able to retrieve many times more than what Google can search."

Not only can a metasearch engine probe deeper, it can also offer the latest information.

"In principle," Meng said, "small guys are much better able to maintain the freshness of their data. Google has a program to 'crawl' all over the world. Depending on when the crawler has last visited your server, there's a delay of days or weeks before a new page will show up in that search. We can get fresher results."

The concept is not new. In fact, the first metasearch engine was built in 1994.

"The big difference between our technology and the ones pursued by other people is that most of the other technologies do the metasearching on top of a small number of general-purpose search engines, such as Yahoo, Google or MSN," Meng explained. "We have a completely different perspective. We want to build large-scale metasearch engines on top of many small search engines."

The Web has millions of search engines at businesses, universities, newspapers and other organizations. Since 1997, and with continued funding from the National Science Foundation, Meng and his collaborators have found ways to run queries across multiple search engines and sort through the results.

Webscalers is based in the Start-Up Suite at Binghamton University's Innovative Technologies Complex, which is home to several young companies that have their roots in faculty inventions.

"If the Web keeps on growing, a company like Google may run out of resources to crawl all of those pages," said Vijay V. Raghavan, vice president of Webscalers and a faculty member at the University of Louisiana at Lafayette. "We won't have that problem. We will scale much better."

Webscalers' technology could be useful for large organizations with many divisions. For example, Webscalers has developed a prototype that would allow a search of all 64 campuses in the State University of New York system as well as SUNY's central administration.

"People can use it to find collaborators," Meng said. "It could also help prospective students find programs they're interested in."

The technology could be adapted to large companies or even the government, Meng said.

Challenges for large-scale metasearch engines include determining which search engines are best for a given query, automating the interaction with those search engines, and organizing the search results.
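The three challenges can be sketched as a small pipeline: select engines, query them, merge the results. This is only an illustration under stated assumptions, not Webscalers' method. Engine selection here is plain keyword overlap, the engines are stub functions with invented names, and the merge step uses reciprocal rank fusion, a generic technique chosen for brevity.

```python
from collections import defaultdict

def select_engines(query, engines, top_n=2):
    """Pick the engines whose topic profile shares the most terms with the query."""
    terms = set(query.lower().split())
    scored = [(len(terms & profile["topics"]), name) for name, profile in engines.items()]
    scored = [(score, name) for score, name in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_n]]

def merge_results(ranked_lists, k=60):
    """Reciprocal rank fusion: each list adds 1 / (k + rank) to a document's score."""
    scores = defaultdict(float)
    for results in ranked_lists:
        for rank, doc in enumerate(results, start=1):
            scores[doc] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc: (-scores[doc], doc))

def metasearch(query, engines):
    """Select engines, query each one, and merge their ranked result lists."""
    chosen = select_engines(query, engines)
    ranked_lists = [engines[name]["search"](query) for name in chosen]
    return chosen, merge_results(ranked_lists)

# Hypothetical small engines; a real system would send the query over the network.
ENGINES = {
    "campus-news": {"topics": {"campus", "research", "faculty"},
                    "search": lambda q: ["doc-faculty-labs", "doc-grant-news"]},
    "library": {"topics": {"research", "journal", "thesis"},
                "search": lambda q: ["doc-grant-news", "doc-thesis-index"]},
    "sports": {"topics": {"score", "team"},
               "search": lambda q: ["doc-game-recap"]},
}

chosen, merged = metasearch("faculty research", ENGINES)
```

For the query "faculty research", the sketch skips the sports engine entirely and ranks the document returned by both remaining engines first.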

Meng hopes to build a grand metasearch engine one day that would integrate all of the 1 million small search engines into a single system. "There are still a lot of significant challenges in creating a system of such magnitude," he said, "but I am optimistic that such a metasearch engine can be built."

Try out the concept online

Webscalers has already launched several metasearch products:

The first is a news metasearch engine called AllinOneNews. Available at http://www.allinonenews.com, it connects to 1,800 news sources in 200 countries. That's the largest metasearch engine in the world.

Webscalers also offers MySearchView, a system that allows any user to create his or her own metasearch engine just by checking off a few options at http://www.mysearchview.com.

Source: Binghamton University


Comments:

  • earls - Mar 25, 2009
    I guess every month from here on out we'll hear about the next big "Google Killer."

    This "metasearch," however, has little to do with a typical "Google Search." It seems to be more related to the "post-processing" of the results than actually finding them.

    Airkin also made me aware in another article that it "seems to be limited to its own database (think Wiki) not like Google's large ones of the internet."

    This is evidenced by "Meng hopes to build a grand metasearch engine one day that would integrate all of the 1 million small search engines into a single system."

    It seems like a step back in my mind... Or at least, too infantile to be of any use (yet).

    Another issue that should be considered is "Should you trust the result?"

    The converse of "one simple answer" is being painted as a negative: Google search returns many (too many?) results that have to be pored over to distill an answer... Though this is not really the case, as (generally) you'll get the answer you're looking for in the top 10 results.

    However, with one absolute answer, "just because the computer said," how do you know it's the correct answer? Many results give you the ability to compare and contrast and decide for yourself what's true.

    I suppose (and understand) this is what Wolfram and Meng are attempting to accomplish... An authoritative response that falls within the human margin of error... But it just seems to me "humans are computers, and computers aren't human." Is there simply a natural disconnect between the two different "mediums" or will a singularity be reached in the future?

    I wonder what the metasearch would have to say about that question. ;) "42."
  • vlam67 - Mar 25, 2009
    Yeah, sure, great, Meng. Type in "Tibet" and the answer is "China's territory." Enough said.
  • ealex - Mar 26, 2009
    Wasn't there another one of these recently? What's up with that? Is there a grudge against Google on the PhysOrg team? This is basically the exact same stuff, only a different search engine.

    Let it go already, we get it.
  • Choice - Mar 29, 2009
    The program should return several answers and let the asker choose the one he or she likes.
  • pcunix - Mar 29, 2009
    "Depending on when the crawler has last visited your server, there's a delay of days or weeks before a new page will show up in that search. We can get fresher results."

    Really? Gosh, I've seen pages I post show up literally minutes later.

    For this kind of stuff, I'll believe it when I see it, and I think seeing it is a long, long way off.
  • denijane - Apr 07, 2009
    As much as I like it, there is one thing that we must admit: search the way it is provides us with more information. For example, wanting to know something more about a date will provide you with pages and pages of related content that you have to skip through in order to find what you're looking for. And during this process, you learn a lot more, sometimes even things that are quite useful to you but that you wouldn't have known otherwise. Whereas if you got the answer in a line or three, you would limit your knowledge.

    Yes, I know this isn't really a flaw. I just wanted to point out that all the search engines have their good and their bad sides and can develop simultaneously.
