Scientific Software Resource Navigation

MALLET

MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.

LJParser

Yiwei Wang

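The Lingjiu LJParser natural language semantic analysis system is a foundational toolkit for developing web search, natural language understanding, and text mining technology. The development platform consists of multiple middleware components whose APIs can be integrated seamlessly into customers' complex application systems. It is compatible with Windows, Linux, FreeBSD, and other operating systems, and can be used from Java, C, C#, and other programming languages.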

LingPipe

LingPipe is a toolkit for processing text using computational linguistics. It can be used for tasks such as finding the names of people, organizations, or locations in news.

jiebaR

Chinese text segmentation, keyword extraction, and part-of-speech tagging for R.

jieba

"Jieba" Chinese text segmentation: built to be the best Python Chinese word segmentation module.

IKAnalyzer

林良益

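IKAnalyzer is an open-source, lightweight Chinese word segmentation toolkit written in Java. Since version 1.0 was released in December 2006, IKAnalyzer has gone through three major versions. It began as a Chinese word segmentation component built around the open-source Lucene project, combining dictionary-based segmentation with grammar analysis algorithms. The newer IKAnalyzer 3.0 has evolved into a general-purpose segmentation component for Java that is independent of the Lucene project, while also providing a default implementation optimized for Lucene.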
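
Dictionary-based segmentation, mentioned in the description above, can be illustrated with a minimal sketch. The following uses forward maximum matching, a common greedy strategy. It is not IKAnalyzer's actual algorithm, and the function name `forward_max_match` and the toy dictionary are hypothetical, chosen only for illustration.

```python
def forward_max_match(text, dictionary, max_word_len=None):
    """Greedily segment text, taking the longest dictionary word at each position."""
    if max_word_len is None:
        # Longest entry in the dictionary bounds the search window.
        max_word_len = max((len(w) for w in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink the window.
        for size in range(min(max_word_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical toy dictionary, for illustration only.
TOY_DICTIONARY = {"中文", "分词", "工具", "工具包", "开源"}

if __name__ == "__main__":
    print(forward_max_match("开源中文分词工具包", TOY_DICTIONARY))
```

With this toy dictionary, the input is split into 开源 / 中文 / 分词 / 工具包, because the longer entry 工具包 is preferred over 工具. Real segmenters typically add disambiguation on top of plain matching, which this sketch does not attempt.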

HanLP

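HanLP is a Java toolkit composed of a series of models and algorithms, with the goal of popularizing the use of natural language processing in production environments. HanLP is characterized by complete functionality, efficient performance, a clear architecture, up-to-date corpora, and customizability.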

Gensim

Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. Target audience is the natural language processing (NLP) and information retrieval (IR) c

fastNLP

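fastNLP is a lightweight natural language processing (NLP) toolkit that aims to make it quick to implement NLP tasks and build complex models. It is an open-source Chinese natural language processing project written in Java that provides NLP tools, including word segmentation, part-of-speech tagging, syntactic parsing, and text similarity computation, along with the datasets needed for this processing. The project is no longer maintained.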

CRF++

2013

CRF++ is a simple, customizable, and open source implementation of Conditional Random Fields (CRFs) for segmenting/labeling sequential data. CRF++ is designed for generic purpose and will be applied t

Tags: #Natural language processing