Apache Hadoop is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation. It provides a so
Scientific software resource navigation
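HanLP is a Java toolkit made up of a collection of models and algorithms, with the goal of bringing natural language processing into widespread use in production environments. HanLP is feature-complete, efficient, cleanly architected, built on up-to-date corpora, and customizable.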
Heritrix is a web crawler designed for web archiving. It was written by the Internet Archive. It is available under a free software license and written in Java. The main interface is accessible using
HistCite is a software package used for bibliometric analysis and information visualization.
html2txt converts HTML to Markdown.
HAP (Html Agility Pack) is an HTML parser written in C# that reads and writes the DOM and supports plain XPath or XSLT.
HTML Parser is a Java library used to parse HTML in either a linear or nested fashion. Primarily used for transformation or extraction, it features filters, visitors, custom tags and easy to use JavaB
HyperCam captures the action from your Windows screen and saves it to an AVI (Audio Video Interleave) movie file. Sound from your system microphone is also recorded.
HyperRESEARCH gives you complete access and control, with keyword coding, mind-mapping tools, theory building and much more.
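IKAnalyzer is an open-source, lightweight Chinese word segmentation toolkit written in Java. Since version 1.0 was released in December 2006, IKAnalyzer has gone through three major versions. It began as a Chinese word segmentation component built around the open-source Lucene project, combining dictionary-based segmentation with grammar analysis algorithms. The newer IKAnalyzer 3.0 has evolved into a general-purpose segmentation component for Java that is independent of the Lucene project, while also providing a default optimized implementation for Lucene...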