Apache Nutch 1.2 Released

Date: 2010-09-25  Source: 红薯

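Nutch is an open-source search engine implemented in Java. It provides all the tools needed to run your own search engine, including full-text search and a web crawler.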

Apache Nutch 1.2 includes a number of improvements and bug fixes; see the CHANGES file for details.

You can download the latest version of Apache Nutch from the following address:

http://www.apache.org/dyn/closer.cgi/nutch/

CHANGES:

* NUTCH-901 Make index-more plug-in configurable (Markus Jelsma via mattmann)
* NUTCH-908 Infinite Loop and Null Pointer Bugs in Searching (kubes via mattmann)
* NUTCH-906 Nutch OpenSearch sometimes raises DOMExceptions (Asheesh Laroia via ab)
* NUTCH-862 HttpClient null pointer exception (Sebastian Nagel via ab)
* NUTCH-905 Configurable file protocol parent directory crawling (Thorsten Scherler, mattmann, ab)
* NUTCH-877 Allow setting of slop values for non-quote phrase queries on query-basic plugin (kubes via jnioche)
* NUTCH-716 Make subcollection index field multivalued (Dmitry Lihachev via jnioche)
* NUTCH-878 ScoringFilters should not override the injected score
* NUTCH-870 Injector should add the metadata before calling injectedScore (jnioche via mattmann)
* NUTCH-858 No longer able to set per-field boosts on lucene documents (ab)
* NUTCH-869 Add parse-html back (jnioche)
* NUTCH-871 MoreIndexingFilter missing date format (Max Lynch via mattmann)
* NUTCH-696 Timeout for Parser (ab, jnioche)
* NUTCH-857 DistributedBeans should not close their RPC counterparts (kubes)
* NUTCH-855 ScoringFilter and IndexingFilter: To allow for the propagation of URL Metatags and their subsequent indexing (Scott Gonyea via mattmann)
* NUTCH-677 Segment merge filtering based on segment content (Marcin Okraszewski via mattmann)
* NUTCH-774 Retry interval in crawl date is set to 0 (Reinhard Schwab via mattmann)
* NUTCH-697 Generate log output for solr indexer and dedup (Dmitry Lihachev, Jeroen van Vianen via mattmann)
* NUTCH-850 SolrDeleteDuplicates needs to clone the SolrRecord objects (jnioche)
* NUTCH-838 Add timing information to all Tool classes (Jeroen van Vianen, mattmann)
* NUTCH-835 Document deduplication failed using MD5Signature (Sebastian Nagel via ab)
* NUTCH-831 Allow configuration of how fields crawled by Nutch are stored / indexed / tokenized (Jeroen van Vianen via mattmann)
* NUTCH-278 Fetcher-status might need clarification: kbit/s instead of kb/s shown (Alex McLintock via mattmann)
* NUTCH-833 Website is still Lucene branded (mattmann, Alex McLintock)
* NUTCH-832 Website menu has lots of broken links - in particular the API docs (Alex McLintock via mattmann)
