New metasearch engine leaves Google, Yahoo crawling

March 25, 2009

Weiyi Meng, a professor of computer science at Binghamton University, State University of New York, is hopeful that one day in the not-too-distant future, you'll be able to type a query into an online search engine and have it deliver not Web pages that may contain an answer, but just the answer itself. Credit: Jonathan Cohen

One day in the not-too-distant future, you'll be able to type a query into an online search engine and have it deliver not Web pages that may contain an answer, but just the answer itself, says Weiyi Meng, a professor of computer science at Binghamton University, State University of New York.

For instance, imagine typing in "Who starred in the film Casablanca?" The search engine would respond with "Humphrey Bogart and Ingrid Bergman."

Not impressed?

Try asking a more nuanced question, such as "What do Americans think of universal health care?" A search engine will create a report indicating trends in opinion based on what has been posted to the Web.

Search engines may eventually be used to conduct polling and even help sort fact from fiction, said Meng, who is helping to make such possibilities a reality, both through his research and as president of a company called Webscalers.

The way Meng sees it, big search engines such as Google and Yahoo are fundamentally flawed. The Web has two parts: the surface Web and the deep Web. The surface Web is made up of perhaps 60 billion pages. The deep Web, at some 900 billion pages, is about 15 times larger.

Google, which relies on a "Web crawler" to examine pages and catalog them for future searches, can search about 20 billion pages. Web crawlers follow links to reach pages and often miss content that isn't linked to any other page or is in some way "hidden."
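
A toy sketch can show why following links misses unlinked content. The page names and link graph below are invented for illustration; this is a simplified breadth-first traversal, not a description of how Google's crawler actually works.

```python
from collections import deque


def crawl(links, seed):
    """Breadth-first traversal: only pages reachable by links from the seed are found."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical site: "database-record" exists, but no other page links to it.
links = {
    "home": ["about", "news"],
    "about": ["home"],
    "news": ["home", "archive"],
    "archive": [],
    "database-record": [],
}

reached = crawl(links, "home")
missed = set(links) - reached  # pages that exist but were never discovered
print(sorted(reached))
print(sorted(missed))
```

In this sketch the unlinked page plays the role of deep Web content: it exists on the server, but a crawler that starts from the home page never finds it.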

Meng, along with researchers at the University of Illinois at Chicago and the University of Louisiana at Lafayette, has helped pioneer large-scale metasearch-engine technology that harnesses the power of small search engines to come up with results that are more accurate and more complete.

"Most of the pages on the deep Web aren't directly 'crawlable.' We want to connect to small search engines and reach the deep Web," he said. "That's the idea. Many people have the misconception that Google can search everything, and if it's not there it doesn't exist. But we should be able to retrieve many times more than what Google can search."

Not only can a metasearch engine probe deeper, it can also offer the latest information.

"In principle," Meng said, "small guys are much better able to maintain the freshness of their data. Google has a program to 'crawl' all over the world. Depending on when the crawler has last visited your server, there's a delay of days or weeks before a new page will show up in that search. We can get fresher results."

The concept is not new. In fact, the first metasearch engine was built in 1994.

"The big difference between our technology and the ones pursued by other people is that most of the other technologies do the metasearching on top of a small number of general-purpose search engines, such as Yahoo, Google or MSN," Meng explained. "We have a completely different perspective. We want to build large-scale metasearch engines on top of many small search engines."

The Web has millions of search engines at businesses, universities, newspapers and other organizations. Since 1997, and with continued funding from the National Science Foundation, Meng and his collaborators have found ways to run queries across multiple search engines and sort through the results.

Webscalers is based in the Start-Up Suite at Binghamton University's Innovative Technologies Complex, which is home to several young companies that have their roots in faculty inventions.

"If the Web keeps on growing, a company like Google may run out of resources to crawl all of those pages," said Vijay V. Raghavan, vice president of Webscalers and a faculty member at the University of Louisiana at Lafayette. "We won't have that problem. We will scale much better."

Webscalers' technology could be useful for large organizations with many divisions. For example, Webscalers has developed a prototype that would allow a search of all 64 campuses in the State University of New York system as well as SUNY's central administration.

"People can use it to find collaborators," Meng said. "It could also help prospective students find programs they're interested in."

The technology could be adapted to large companies or even the government, Meng said.

Challenges for large-scale metasearch engines include determining which search engines are best for a given query, automating the interaction with those search engines, and organizing the search results.
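
The three challenges can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the engine names, the topic keywords, the stub `search` functions, and the choice to merge by keeping each URL's highest score. It is not Webscalers' method, which the article does not describe.

```python
def select_engines(query, engines, top_n=2):
    """Engine selection: rank engines by how many query terms match their topic keywords."""
    terms = set(query.lower().split())
    scored = []
    for name, engine in engines.items():
        overlap = len(terms & engine["topics"])
        if overlap:
            scored.append((overlap, name))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_n]]


def metasearch(query, engines, top_n=2):
    """Query the selected engines and merge their results into one ranked list."""
    merged = {}
    for name in select_engines(query, engines, top_n):
        # Interaction: each hypothetical engine returns (url, score) pairs, score in [0, 1].
        for url, score in engines[name]["search"](query):
            # Merging: keep the best score seen for a URL returned by several engines.
            merged[url] = max(score, merged.get(url, 0.0))
    return sorted(merged.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical small search engines with canned results.
engines = {
    "campus-news": {
        "topics": {"campus", "news", "research"},
        "search": lambda q: [("news.example/a", 0.9), ("shared.example/x", 0.4)],
    },
    "library": {
        "topics": {"research", "books"},
        "search": lambda q: [("lib.example/b", 0.7), ("shared.example/x", 0.6)],
    },
    "athletics": {
        "topics": {"sports"},
        "search": lambda q: [("sports.example/c", 0.8)],
    },
}

selected = select_engines("campus research", engines)
results = metasearch("campus research", engines)
print(selected)
print(results)
```

A real system would have to learn each engine's coverage automatically and reconcile scores that are not directly comparable across engines; the keyword overlap and max-score merge here are only stand-ins for those steps.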

Meng hopes to build a grand metasearch engine one day that would integrate all of the 1 million small search engines into a single system. "There are still a lot of significant challenges in creating a system of such magnitude," he said, "but I am optimistic that such a metasearch engine can be built."

Try out the concept online

Webscalers has already launched several metasearch products:

The first is a news metasearch engine called AllinOneNews. Available at http://www.allinonenews.com, it connects to 1,800 news sources in 200 countries. That's the largest metasearch engine in the world.

Webscalers also offers MySearchView, a system that allows any user to create his or her own metasearch engine just by checking off a few options at http://www.mysearchview.com.

Source: Binghamton University



  • earls - Mar 25, 2009
    • Rank: 4 / 5 (1)
    I guess every month from here on out we'll hear about the next big "Google Killer."

    This "metasearch," however, has little to do with a typical "Google Search." It seems to be more related to the "post-processing" of the results than actually finding them.

    Airkin also made me aware in another article that it "seems to be limited to its own database (think Wiki) not like Google's large ones of the internet."

    This is evidenced by "Meng hopes to build a grand metasearch engine one day that would integrate all of the 1 million small search engines into a single system."

    It seems like a step back in my mind... Or at least, too infantile to be of any use (yet).

    Another issue that should be considered is "Should you trust the result?"

    The converse of "one simple answer" is being painted as a negative: Google search returns many (too many?) results that have to be pored over to distill an answer... Though this is not really the case, as (generally) you'll get the answer you're looking for in the top 10 results.

    However, with one absolute answer, "just because the computer said," how do you know it's the correct answer? Many results give you the ability to compare and contrast and decide for yourself what's true.

    I suppose (and understand) this is what Wolfram and Meng are attempting to accomplish... An authoritative response that falls within the human margin of error... But it just seems to me "humans are computers, and computers aren't human." Is there simply a natural disconnect between the two different "mediums" or will a singularity be reached in the future?

    I wonder what the metasearch would have to say about that question. ;) "42."
  • vlam67 - Mar 25, 2009
    • Rank: not rated yet
    yeah, sure, great Meng. Type in "Tibet" and the answer is " China's territory". Enough said.
  • ealex - Mar 26, 2009
    • Rank: not rated yet
    Wasn't there recently another one of these? What's up with that? Is there a grudge against Google on the physorg team? This is basically the exact same stuff, only a different search engine.

    Let it go already, we get it.
  • Choice - Mar 29, 2009
    • Rank: not rated yet
    The program should return several answers and let the asker choose the one he or she likes.
  • pcunix - Mar 29, 2009
    • Rank: not rated yet
    "Depending on when the crawler has last visited your server, there's a delay of days or weeks before a new page will show up in that search. We can get fresher results."

    Really? Gosh, I've seen pages I post show up literally minutes later.

    For this kind of stuff, I'll believe it when I see it, and I think seeing it is a long, long way off.
  • denijane - Apr 07, 2009
    • Rank: not rated yet
    As much as I like it, there is one thing that we must admit: search the way it is provides us with more information. For example, wanting to know something more about a date will give you pages and pages of related content that you have to skip through in order to find what you're looking for. And during this process, you learn a lot more, sometimes even things that are quite useful to you but that you wouldn't have known otherwise. Whereas if you got the answer in a line or three, you would limit your knowledge.

    Yes, I know this isn't really a flaw. I just wanted to point out that all the search engines have their good and their bad sides and can develop simultaneously.
