# NodeJieba [简体中文](README.md)
## Introduction
`NodeJieba` provides Chinese word segmentation for Node.js, based on [CppJieba].
## Install
    npm install nodejieba

Or, to install through the [cnpm] mirror instead of the default npm registry:

    npm install nodejieba --registry=https://registry.npm.taobao.org --nodejieba_binary_host_mirror=https://npm.taobao.org/mirrors/nodejieba

## Usage
    var nodejieba = require("nodejieba");
    // Input: "南京市长江大桥" (Nanjing Yangtze River Bridge)
    var result = nodejieba.cut("南京市长江大桥");
    console.log(result);

See [test/demo.js](test/demo.js) for details.
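
Since `cut` returns an array of words, downstream processing is ordinary array code. The sketch below counts word frequencies across several sentences. `countWords` is a hypothetical helper, not part of NodeJieba. It takes the segmenter as an argument so that it can be tried without the native module, and the whitespace splitter at the bottom is only a stand-in for `nodejieba.cut`.

```javascript
// Hypothetical helper (not part of NodeJieba): count how often each word
// appears across a list of sentences. `cut` is any function that maps a
// string to an array of words, for example a wrapper around nodejieba.cut.
function countWords(cut, sentences) {
  var counts = new Map();
  sentences.forEach(function (sentence) {
    cut(sentence).forEach(function (word) {
      counts.set(word, (counts.get(word) || 0) + 1);
    });
  });
  return counts;
}

// Stand-in segmenter for demonstration only: splits on whitespace.
function whitespaceCut(sentence) {
  return sentence.split(/\s+/).filter(Boolean);
}

var demoCounts = countWords(whitespaceCut, ["a b a", "b c"]);
console.log(demoCounts.get("a")); // 2
```

With NodeJieba installed, passing `function (s) { return nodejieba.cut(s); }` as `cut` should give counts over real segmented words.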
### Initialization
Initialization is optional. If you skip it, the default dictionaries are loaded automatically the first time `cut` is called.

The default dictionaries can also be loaded explicitly:

    nodejieba.load();

This is similar to the internal call:

    nodejieba.load({
      dict: './dict/jieba.dict.utf8',
      hmmDict: './dict/hmm_model.utf8',
      userDict: './dict/userdict.utf8',
      idfDict: './dict/idf.utf8',
      stopWordDict: './dict/stop_words.utf8',
    });

If a dictionary parameter is missing, its default value will be used.
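
The fallback behaviour can be pictured as a merge of the caller's options over a table of defaults. The following is only a sketch of that idea: `DEFAULT_DICTS` and `resolveDictOptions` are hypothetical names, the paths are the ones listed above, and NodeJieba's real implementation may differ.

```javascript
// Hypothetical defaults, using the dictionary paths shown in this README.
var DEFAULT_DICTS = {
  dict: "./dict/jieba.dict.utf8",
  hmmDict: "./dict/hmm_model.utf8",
  userDict: "./dict/userdict.utf8",
  idfDict: "./dict/idf.utf8",
  stopWordDict: "./dict/stop_words.utf8"
};

// Hypothetical helper: any key the caller omits (or leaves empty)
// falls back to its default value.
function resolveDictOptions(options) {
  var resolved = {};
  Object.keys(DEFAULT_DICTS).forEach(function (key) {
    var given = options && options[key];
    resolved[key] = given ? given : DEFAULT_DICTS[key];
  });
  return resolved;
}

var resolved = resolveDictOptions({ userDict: "./my/userdict.utf8" });
console.log(resolved.userDict); // ./my/userdict.utf8
console.log(resolved.dict);     // ./dict/jieba.dict.utf8
```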
#### Dictionary description
+ dict: the main dictionary, with word weights and part-of-speech tags. Using the default dictionary is recommended.
+ hmmDict: the hidden Markov model. Using the default dictionary is recommended.
+ userDict: the user dictionary. Adapt it to your use case.
+ idfDict: IDF (inverse document frequency) values used for keyword extraction.
+ stopWordDict: the list of stop words used for keyword extraction.
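
To make the roles of `idfDict` and `stopWordDict` concrete, here is a simplified TF-IDF scorer. It illustrates the general technique and is not CppJieba's actual code: `scoreKeywords`, the tiny IDF table, and the fallback IDF value are all made up for the example.

```javascript
// Simplified TF-IDF keyword scoring (illustrative; not CppJieba's code).
// words:     array of already-segmented words
// idf:       Map from word to IDF weight (the role played by idfDict)
// stopWords: Set of words to ignore (the role played by stopWordDict)
// topN:      maximum number of keywords to return
function scoreKeywords(words, idf, stopWords, topN) {
  var fallbackIdf = 1.0; // assumed weight for words missing from the table
  var tf = new Map();
  words.forEach(function (word) {
    if (stopWords.has(word)) return; // stop words never become keywords
    tf.set(word, (tf.get(word) || 0) + 1);
  });
  var scored = [];
  tf.forEach(function (count, word) {
    var weight = count * (idf.has(word) ? idf.get(word) : fallbackIdf);
    scored.push({ word: word, weight: weight });
  });
  // Highest weight first, then keep the top N entries.
  scored.sort(function (a, b) { return b.weight - a.weight; });
  return scored.slice(0, topN);
}

// Made-up IDF values, for demonstration only.
var demoIdf = new Map([["CEO", 12], ["升职", 11], ["加薪", 10]]);
var demoStop = new Set(["的"]);
var keywords = scoreKeywords(["升职", "的", "加薪", "CEO", "升职"], demoIdf, demoStop, 2);
console.log(keywords);
```

In this toy run, "升职" occurs twice, so its weight (2 × 11) outranks "CEO" (1 × 12), and the stop word "的" is dropped before scoring.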
### POS Tagging
    var nodejieba = require("nodejieba");
    console.log(nodejieba.tag("红掌拨清波"));
    //[ { word: '红掌', tag: 'n' },
    //  { word: '拨', tag: 'v' },
    //  { word: '清波', tag: 'n' } ]

See [test/demo.js](test/demo.js) for details.
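
Because tagging results are plain `{ word, tag }` objects, filtering by part of speech is straightforward. The helper below, `wordsWithTag`, is hypothetical and works on the sample output shown above rather than calling NodeJieba.

```javascript
// Hypothetical helper: keep only the words whose tag starts with a prefix,
// e.g. "n" for nouns or "v" for verbs.
function wordsWithTag(tagged, prefix) {
  return tagged
    .filter(function (item) { return item.tag.indexOf(prefix) === 0; })
    .map(function (item) { return item.word; });
}

// Sample data copied from the tagging output in this README.
var tagged = [
  { word: "红掌", tag: "n" },
  { word: "拨", tag: "v" },
  { word: "清波", tag: "n" }
];
console.log(wordsWithTag(tagged, "n")); // [ '红掌', '清波' ]
```

Matching on a prefix rather than the exact tag means that more specific noun tags, if present, are also kept.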
### Keyword Extractor
    var nodejieba = require("nodejieba");
    var topN = 4;
    console.log(nodejieba.extract("升职加薪,当上CEO,走上人生巅峰。", topN));
    //[ { word: 'CEO', weight: 11.739204307083542 },
    //  { word: '升职', weight: 10.8561552143 },
    //  { word: '加薪', weight: 10.642581114 },
    //  { word: '巅峰', weight: 9.49395840471 } ]
    console.log(nodejieba.textRankExtract("升职加薪,当上CEO,走上人生巅峰。", topN));
    // Note: the sample output below lists five entries and words that do not
    // occur in the sentence above, so it appears to come from a longer input.
    //[ { word: '当上', weight: 1 },
    //  { word: '不用', weight: 0.9898479330698993 },
    //  { word: '多久', weight: 0.9851260595435759 },
    //  { word: '加薪', weight: 0.9830464899847804 },
    //  { word: '升职', weight: 0.9802777682279076 } ]

See [test/demo.js](test/demo.js) for details.
## Testing
Tests pass on the following Node.js versions:
+ `node v10`
+ `node v12`
+ `node v14`
+ `node v15`
## Use Cases
+ [gitbook-plugin-search-pro]
+ [pinyin]
## Similar Projects
+ [@node-rs/jieba](https://github.com/Brooooooklyn/node-rs/tree/master/packages/jieba)
## Performance
NodeJieba is reported to have the best performance of the available Node.js word segmentation modules. A benchmark write-up (in Chinese) is available: [Jieba中文分词系列性能评测].
## Online Demo
(Chrome is suggested)
