Class | Description |
---|---|
StringSplitter | Split a string into an array of tokens according to the tokenisation scheme defined in TokeniserRegex |
StringSplitterJP | Split a string into an array of tokens according to the tokenisation scheme defined in TokeniserJP |
SubcorpusIndexer | Tokenise a chunk of text and record the position of each token |
TokeniserGNU | Tokenise a chunk of text and record the position of each token |
TokeniserJP | Tokenise a chunk of *Japanese language* text and record the position of each token |
TokeniserJPLucene | Tokenise a chunk of *Japanese language* text and record the position of each token, using Lucene's infrastructure and the Kuromoji analyser |
TokeniserRegex | Tokenise a chunk of text and record the position of each token |
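
As a rough illustration of what "tokenise a chunk of text and record the position of each token" can mean in practice, here is a minimal sketch built only on `java.util.regex`. It is not the implementation of any class listed above: the class name `PositionTokeniserSketch`, the `Token` type, the `tokenise` and `split` methods, and the token pattern (runs of letters or digits) are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; names and token pattern are assumptions,
// not the API of TokeniserRegex or StringSplitter.
public class PositionTokeniserSketch {

    /** A token together with its character offsets in the source text. */
    public static final class Token {
        public final String text;
        public final int start; // inclusive offset of the first character
        public final int end;   // exclusive offset just past the last character

        Token(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    // Assumed tokenisation scheme: a token is a run of Unicode letters or digits.
    private static final Pattern TOKEN = Pattern.compile("[\\p{L}\\p{N}]+");

    /** Tokenise the text, recording where each token starts and ends. */
    public static List<Token> tokenise(String text) {
        List<Token> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            tokens.add(new Token(m.group(), m.start(), m.end()));
        }
        return tokens;
    }

    /** Split the text into an array of token strings, discarding positions. */
    public static String[] split(String text) {
        List<Token> tokens = tokenise(text);
        String[] out = new String[tokens.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = tokens.get(i).text;
        }
        return out;
    }

    public static void main(String[] args) {
        for (Token t : tokenise("The cat sat.")) {
            System.out.println(t.text + " [" + t.start + ", " + t.end + ")");
        }
    }
}
```

In this sketch the splitter is a thin wrapper over the tokeniser, which mirrors the relationship the table describes between the splitter and tokeniser classes. A whitespace-free language such as Japanese would need a morphological analyser rather than a character-class pattern.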