| Interface | Description |
|---|---|
| TCInvertedIndex | Inverted indices for text categorisation must implement this interface. |

| Class | Description |
|---|---|
| BVProbabilityModel | Store inverted indices of terms and categories (indexed to documents) which form the basis of a probability model. |
| IdFrequencyPair | Represent a document id and the number of times a term (in BVProbabilityModel) occurs in that document. |
| NewsItemAsBooleanVector | Store text as an array of booleans, along with its id and categories (also as a vector). |
| NewsItemAsOccurVector | Store text as an array of integers (each representing the number of times a term occurs in the text), along with its id and categories (also as a vector). |
| ParsedCorpus | Store ParsedDocuments. |
| ParsedDocument | Store text (as a StringBuffer), its id (as a String), and the categories to which it belongs (as a vector). |
| ParsedNewsItem | Deprecated. Use ParsedDocument instead. |
| ParsedText | Deprecated. Use ParsedCorpus instead. |
| TCProbabilityModel | Define methods to be implemented by all probability models for text categorisation. |
| TSReducedText | Store NewsItemAsOccurVector objects and handle conversions to a variety of formats (currently only plain strings and ARFF). N.B.: TSReducedText is usually BIG, so use it with caution unless you have lots of memory to spare. |
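
The data structures summarised above (an inverted index of id/frequency pairs, an occurrence vector and a boolean vector) can be illustrated with a minimal, self-contained sketch. It is not the package's code: `InvertedIndexSketch`, `IdFrequency`, `addDocument`, `frequency`, `occurrenceVector` and `booleanVector` are hypothetical names chosen for this example, and the lower-casing, whitespace-based tokenisation is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: none of these names or signatures come from the package.
final class InvertedIndexSketch {

    // Hypothetical analogue of IdFrequencyPair: a document id plus a term count.
    static final class IdFrequency {
        final String docId;
        final int frequency;

        IdFrequency(String docId, int frequency) {
            this.docId = docId;
            this.frequency = frequency;
        }
    }

    // Maps each term to the documents it occurs in, with per-document counts.
    private final Map<String, List<IdFrequency>> postings = new LinkedHashMap<>();

    // Assumption: tokens are lower-cased and separated by whitespace.
    private static Map<String, Integer> countTerms(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : text.toLowerCase().split("\\s+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Adds one posting per distinct term in the document.
    void addDocument(String docId, String text) {
        for (Map.Entry<String, Integer> e : countTerms(text).entrySet()) {
            postings.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                    .add(new IdFrequency(docId, e.getValue()));
        }
    }

    // Returns how often a (lower-case) term occurs in a document, or 0 if it does not.
    int frequency(String term, String docId) {
        List<IdFrequency> list =
                postings.getOrDefault(term, Collections.<IdFrequency>emptyList());
        for (IdFrequency p : list) {
            if (p.docId.equals(docId)) {
                return p.frequency;
            }
        }
        return 0;
    }

    // Occurrence vector: one integer count per (lower-case) vocabulary term.
    static int[] occurrenceVector(String[] vocabulary, String text) {
        Map<String, Integer> counts = countTerms(text);
        int[] vector = new int[vocabulary.length];
        for (int i = 0; i < vocabulary.length; i++) {
            vector[i] = counts.getOrDefault(vocabulary[i], 0);
        }
        return vector;
    }

    // Boolean vector: true wherever the occurrence count is non-zero.
    static boolean[] booleanVector(int[] occurrences) {
        boolean[] vector = new boolean[occurrences.length];
        for (int i = 0; i < occurrences.length; i++) {
            vector[i] = occurrences[i] > 0;
        }
        return vector;
    }

    public static void main(String[] args) {
        String[] vocabulary = {"oil", "prices", "grain"};
        int[] occurrences = occurrenceVector(vocabulary, "oil prices rise as oil demand grows");
        System.out.println(Arrays.toString(occurrences));
        System.out.println(Arrays.toString(booleanVector(occurrences)));
    }
}
```

The sketch keeps a full integer count per vocabulary term for every document, which hints at why a structure holding many occurrence vectors can become large; the boolean form discards the counts and only records presence.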