Summary
Nutch is a mature, production-ready web crawler. Nutch 1.x enables fine-grained configuration and relies on Apache Hadoop data structures, which are well suited to batch processing.
Current version
1.17 (src-tar, src-z
Author organization
The Apache Software Foundation
Tags: