西西河

Topic: Google's challenger: Clusty -- 林小筑

  • Google's challenger: Clusty

    http://clusty.com/

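    Put simply, this search engine's selling point is that it automatically organizes search results into categories (clustering). For example, if you search for "java", it automatically groups the results into the following categories.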

    ⇨Technology (32)

    ⇨Open Source (16)

    ⇨FAQ, Java programming (16)

    ⇨JavaScript (22)

    ⇨Tutorials (14)

    ⇨Java Applets (17)

    ⇨Games (13)

    ⇨Download Java (6)

    ⇨Reviews (9)

    ⇨Class (8)

    Some of these categories can be expanded into smaller subcategories. For example, expanding the Technology category gives the following.

    Technology (32)

    ⇨Developer Forums (2)

    ⇨Mobile, Information Device Profile (3)

    ⇨Marketplace For Java Technology (2)

    ⇨Servlets, XML (3)

    ⇨Microsoft (2)

    ⇨Apple, Mac (2)

    ⇨Certification Java (2)

    ⇨Java Programming (3)

    ⇨Other Topics (13)

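    This is done with artificial intelligence techniques rather than manual classification by humans, so the results are naturally not perfect. But it embodies a brand-new idea: what should we do when the amount of information online reaches the point of overload? We should use computers to help people filter and organize it.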
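
    To make the idea of grouping results automatically more concrete, here is a minimal sketch. It is not Clusty's or Vivisimo's actual algorithm, which the article does not describe; it simply assumes that results sharing a common term belong together. The names `cluster_results`, `tokenize`, `STOPWORDS`, and the sample titles are all made up for illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stopword list; a real system would use a far larger one.
STOPWORDS = {"the", "a", "an", "of", "for", "and", "to", "in", "on", "with"}


def tokenize(text, query):
    """Return lowercase word tokens, minus stopwords and the query term."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOPWORDS and w != query]


def cluster_results(results, query, min_size=2):
    """Group result titles under their most widely shared term.

    Returns a dict mapping a cluster label to a list of titles. Titles
    that share no term with at least `min_size` results go to
    "Other Topics".
    """
    token_sets = [set(tokenize(r, query.lower())) for r in results]
    # Document frequency: in how many results does each term appear?
    doc_freq = Counter(t for tokens in token_sets for t in tokens)

    clusters = defaultdict(list)
    for title, tokens in zip(results, token_sets):
        candidates = [t for t in tokens if doc_freq[t] >= min_size]
        if candidates:
            # Most widely shared term wins; ties are broken alphabetically.
            label = min(candidates, key=lambda t: (-doc_freq[t], t))
        else:
            label = "Other Topics"
        clusters[label].append(title)
    return dict(clusters)


if __name__ == "__main__":
    sample = [
        "Java applets gallery",
        "Writing Java applets",
        "Java tutorials for beginners",
        "Advanced Java tutorials",
        "Download Java",
    ]
    for label, titles in cluster_results(sample, "java").items():
        print(f"{label} ({len(titles)})")
```

    The query term itself is dropped before counting, since every result contains it and it would otherwise swallow all the clusters. A real engine would presumably also need phrase labels, overlapping clusters, and nested subcategories, none of which this sketch attempts.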

    Come to think of it, Google already has something similar: its news aggregator. http://news.google.com.hk/news?ned=cn&hl=zh-CN

    New Company Starts Up a Challenge to Google

    September 30, 2004

    By JOHN MARKOFF

    SAN FRANCISCO, Sept. 29 - Google executives have long conceded that one of their great fears is to be overtaken by a more advanced Internet search technology. Vivisimo, a company founded by three former Carnegie Mellon University computer scientists, is hoping to prove that Google's worries are well founded.

    Four-year-old Vivisimo plans to start Clusty, a free, consumer search service based on results from Yahoo's Overture engine, Thursday.

    Vivisimo already offers a search service for corporate customers, which clusters results into categories to make them easier to sort through. Search "swift boat," for example, and Vivisimo returns 149 results - listing them one by one, and also as a table of categories, like "Swift Boat Veterans," "John Kerry" and "Patrol Craft Fast" on the left-hand side of the Web page.

    The new Clusty service for consumers, which will be free and supported by advertising revenue, uses a similar organizational structure. But it also presents a series of tabs enabling the user to see results from sources besides the general Web, including shopping information, yellow pages, news, blogs, and images.

    Vivisimo, which is privately held and is profitable, according to its executives, has been selling its clustering technology to corporations for research by their employees. Now Vivisimo is making an effort to compete more broadly by attracting consumers to its Web site, clusty.com.

    The service is meant to address the confusion that can be created when search engines return huge lists. Clustering is also intended to help users find related material they may overlook when they employ services that utilize page ranking methods. Such methods employ a variety of software algorithms to rank Web pages by their perceived relevance to a query.

    Many search experts say that clustering offers a better way of looking at information than Google's page ranking system.

    "As databases get larger, trying to pull the proverbial needle out of the haystack gets tougher and tougher," said Gary Price, a librarian who is also the news editor at SearchEngineWatch, a Web site that covers the industry. "Here, you're getting a bit of extra help."

    Vivisimo's co-founder and chief executive, Raul Valdes-Perez, was a protégé of Herbert A. Simon, a Nobel laureate who was a pioneer in artificial intelligence research. Before co-founding Vivisimo, Mr. Valdes-Perez was a computer scientist at Carnegie Mellon University. He professes that the way to deal with information overload is with information "overlook" - techniques that strip away extraneous information.

    Clusty would generate money for Vivisimo by placing several search-related advertisements from Overture on the right-hand side of each page. Revenue from the ads would be shared by Vivisimo and Overture.

    Unlike many start-ups, which are launched with venture capital financing, Vivisimo was created with help from a $1 million grant from the National Science Foundation Small Business Innovation Research program, which is intended to stimulate innovation by new companies.

    Vivisimo is not the first to introduce clustering for Web surfers. Northern Light, a search engine company founded in 1996, had offered a consumer service featuring what it called "custom search folders." But that company is now focused on corporate applications.

    Google is also using clustering technology, but in a more limited fashion: its news page provides links to topics that appear on news sites.

    Microsoft and Yahoo have been drawn into the search business in part because of Google's profitability and rapidly growing revenue - $962 million for the quarter that ended in June, up from $389 million in the previous quarter.

    The introduction of Clusty comes two weeks after A9, a subsidiary of Amazon.com, introduced a service focused on organizing information retrieved during various Web searches.

    "Search will look more like the magazine business than the soda market," said Oren Etzioni, a computer scientist at University of Washington and an advisory board member of Vivisimo. He predicts that users might select from a variety of services, rather than from a few dominant players.

    "The competition has shifted from crawling the Web and returning an answer quickly," Mr. Etzioni said, "to adding value to the information that has been retrieved."

    A Google spokesman declined to comment on the service.

    Vivisimo's executives are betting that there is an audience for providing a different view of Web search results.

    "Google is excellent at crawling as much of the Web as they can; we don't do that," said Mr. Valdes-Perez. Instead, Vivisimo tackles the question, "How do you solve the problem of information overload?"

    http://www.nytimes.com/2004/09/30/technology/30search.html?ex=1097903707&ei=1&en=87e20490beecdd4b

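    • Tried it; it seems to return fewer results than Google

      I think Google is enough for the average dummy user. This new search engine may be more useful for people doing research; for example, the categories let you do a thorough review of the search results.

      • Right, it's really just a repackaging of Overture, so it is handicapped from the start

        Overture's results naturally can't match Google's.

        Clusty merely points in a direction: because there is simply too much information, the computer should do some filtering and distilling before returning search results, so that they are more useful to people. This shows that search engines still have plenty of room to grow.

        Google has a deep talent pool. Building something comparable to Clusty (which presumably relies on natural language processing and machine learning, both Google strengths) should not be hard for them, and they could probably do it even better.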

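    • Probably the real-world version of Automated Categorization

      According to the description, it should still be based on text/context and keywords. Research in this area has been going on for a long time (much longer than search engines have existed, in fact), but it has never been put into practice across the whole Web the way Google has.

      A recent project is doing something similar: adding a categorization option to a search engine.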
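
      As a rough illustration of what a keyword-based categorization option might look like, here is a small sketch. It does not reflect the project mentioned above, whose details are not given; the category table `CATEGORY_KEYWORDS`, the functions `categorize` and `search`, and the `categorize_results` flag are all hypothetical.

```python
# Hypothetical category table; a real system would learn or curate this.
CATEGORY_KEYWORDS = {
    "Tutorials": {"tutorial", "tutorials", "guide", "howto"},
    "Downloads": {"download", "installer", "release"},
    "Games": {"game", "games", "arcade"},
}


def categorize(text, category_keywords=CATEGORY_KEYWORDS):
    """Return the category whose keywords overlap most with the text.

    Falls back to "Uncategorized" when no keyword matches.
    """
    words = set(text.lower().split())
    best, best_score = "Uncategorized", 0
    for category, keywords in category_keywords.items():
        score = len(words & keywords)
        if score > best_score:
            best, best_score = category, score
    return best


def search(index, query, categorize_results=False):
    """Naive substring search over a list of document titles.

    With categorize_results=True the hits are returned grouped by
    category instead of as a flat list.
    """
    hits = [doc for doc in index if query.lower() in doc.lower()]
    if not categorize_results:
        return hits
    grouped = {}
    for doc in hits:
        grouped.setdefault(categorize(doc), []).append(doc)
    return grouped


if __name__ == "__main__":
    index = [
        "Java tutorial for beginners",
        "Download Java installer",
        "Python guide",
    ]
    print(search(index, "java"))
    print(search(index, "java", categorize_results=True))
```

      Unlike clustering, this approach needs the categories defined in advance, which is why it is usually called categorization rather than clustering.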

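    • I'll give it a try too.
    • Not bad, there are some new ideas here!
    • Tried it

      It feels better organized than Google; now we have one more weapon in our arsenal.

      But it doesn't support Chinese!

      • It does seem to support Chinese after all

        I just ran a search in Chinese. The Chinese content still can't compare with Google or Baidu, and there are no cached page snapshots. Some links turned out to be long dead when clicked, whereas with Google or Baidu you can still learn something from the cached copy.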
