Interface | Description |
---|---|
TCInvertedIndex | Inverted indices for text categorisation must implement this interface. |
Class | Description |
---|---|
BVProbabilityModel | Store inverted indices of terms and categories (indexed to documents), which form the basis of a probability model. |
IdFrequencyPair | Represent a document id and the number of times a term (in BVProbabilityModel) occurs in that document. |
NewsItemAsBooleanVector | Store a text as an array of booleans, together with its id and its categories (also as a vector). |
NewsItemAsOccurVector | Store a text as an array of integers (each representing the number of times a term occurs in the text), together with its id and its categories (also as a vector). |
ParsedCorpus | Store ParsedDocuments. |
ParsedDocument | Store a text (as a StringBuffer), its id (as a String), and the categories to which it belongs (as a vector). |
ParsedNewsItem | Deprecated. Use ParsedDocument instead. |
ParsedText | Deprecated. Use ParsedCorpus instead. |
TCProbabilityModel | Define methods to be implemented by all probability models for text categorisation. |
TSReducedText | Store NewsItemAsOccurVector objects and handle conversions to a variety of formats (currently only plain strings and ARFF). N.B.: TSReducedText is usually big, so use it with caution unless you have plenty of memory to spare. |
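
The summaries above describe three related data structures: an inverted index from terms to documents, a pair of document id and term frequency, and texts stored as occurrence or boolean vectors. The following is a minimal sketch of how such structures could fit together. It is not this package's actual code: `InvertedIndexSketch`, `Posting`, `addDocument`, `postings`, `occurrenceVector` and `booleanVector` are hypothetical names chosen for illustration, and the tokenisation (lower-casing and splitting on non-letters) is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Illustrative sketch only; all names are hypothetical, not the package's API. */
public class InvertedIndexSketch {

    /** A document id paired with the number of times a term occurs in it. */
    public static final class Posting {
        public final String docId;
        public final int frequency;

        public Posting(String docId, int frequency) {
            this.docId = docId;
            this.frequency = frequency;
        }
    }

    // Maps each term to one posting per document that contains the term.
    private final Map<String, List<Posting>> termIndex = new HashMap<>();

    /** Assumed tokenisation: lower-case the text and split on non-letters. */
    private static String[] tokenise(String text) {
        return text.toLowerCase(Locale.ROOT).split("[^a-z]+");
    }

    /** Counts each term in the text and adds one posting per distinct term. */
    public void addDocument(String docId, String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : tokenise(text)) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            termIndex.computeIfAbsent(entry.getKey(), k -> new ArrayList<>())
                     .add(new Posting(docId, entry.getValue()));
        }
    }

    /** Returns the postings for a term, or an empty list if the term is unknown. */
    public List<Posting> postings(String term) {
        return termIndex.getOrDefault(term, Collections.<Posting>emptyList());
    }

    /** Counts how often each vocabulary term occurs in the text. */
    public static int[] occurrenceVector(String text, List<String> vocabulary) {
        int[] vector = new int[vocabulary.size()];
        for (String token : tokenise(text)) {
            int position = vocabulary.indexOf(token);
            if (position >= 0) {
                vector[position]++;
            }
        }
        return vector;
    }

    /** Reduces an occurrence vector to presence/absence flags. */
    public static boolean[] booleanVector(int[] occurrences) {
        boolean[] vector = new boolean[occurrences.length];
        for (int i = 0; i < occurrences.length; i++) {
            vector[i] = occurrences[i] > 0;
        }
        return vector;
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.addDocument("d1", "oil prices rise as oil supply falls");
        index.addDocument("d2", "wheat prices fall");
        for (Posting p : index.postings("oil")) {
            System.out.println(p.docId + " -> " + p.frequency);
        }
    }
}
```

In this sketch the occurrence vector keeps one integer per vocabulary term for every document, which hints at why a collection of such vectors can need a lot of memory on a large corpus, whereas the inverted index stores only the terms that actually occur in each document.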