modnlp.idx.query
public class WordQuery extends java.lang.Object
Modifier and Type | Field and Description |
---|---|
static byte | LEFTWILDCARD_TYPE |
static java.lang.String | QSEPTOKEN |
static byte | REGEX_TYPE |
static java.lang.String | REGEXPMARKER |
static byte | RIGHTWILDCARD_TYPE |
static char[] | SEPTKARR |
static java.lang.String | SEPTOKEN |
static byte | WORD_TYPE |
Constructor and Description |
---|
WordQuery(java.lang.String query, Dictionary dict, boolean cs) Creates a new WordQuery instance based on a query string, to be processed against dict. |
Modifier and Type | Method and Description |
---|---|
java.lang.String | getFirstWord() |
int | getIntervToWord(java.lang.String w) |
java.lang.String | getKeyword() |
WordForms | getKeyWordForms() |
WordForms | getLeastFrequentWord() |
Horizon | getLeftHorizon() Get a Horizon object containing the maximum distances allowed in this query between the main keyword and the keywords to the left of it in the query expression. |
Horizon | getRightHorizon() Get a Horizon object containing the maximum distances allowed in this query between the main keyword and the keywords to the right of it in the query expression. |
WordForms[] | getWFormsArray() |
static java.lang.String | getWildcardsLHS(java.lang.String key) |
static java.lang.String | getWildcardsRHS(java.lang.String key) |
WordForms | getWordForms(int i) |
boolean | isJustKeyword() |
static boolean | isLeftWildcard(java.lang.String key) |
static boolean | isRightWildcard(java.lang.String key) |
static boolean | isValidQuery(java.lang.String q) Perform a basic sanity check on a query (for use by clients, for instance, where full parsing of the query is impossible without accessing the server). |
static boolean | isWildcard(java.lang.String key) |
boolean | matchConcordance(java.lang.String cline, int ctx) Deprecated. |
public static final java.lang.String QSEPTOKEN
public static final java.lang.String REGEXPMARKER
public static char[] SEPTKARR
public static java.lang.String SEPTOKEN
public static final byte WORD_TYPE
public static final byte LEFTWILDCARD_TYPE
public static final byte RIGHTWILDCARD_TYPE
public static final byte REGEX_TYPE
public WordQuery(java.lang.String query, Dictionary dict, boolean cs) throws WordQueryException
Creates a new WordQuery instance based on a query string, to be processed against dict. The query expression can be expressed in the following syntax:
k_1[+[i_1]]k_2+[i_2]+k_3+...+[i_{n-1}]k_n].
Each k_i can be a single keyword or a (Unix-style) wildcard (e.g. test* will retrieve all words which start with test, such as test, tests, testament, etc.). i_n denotes the maximum number of intervening words between k_{n+1} and k_1.
The syntax also allows you to specify sequences of key words, and/or wildcards, and the maximum number of intervening words you wish to allow between each element in the sequence.
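The wildcard behaviour described above (test* retrieving test, tests, testament) can be pictured with a small stand-alone sketch. It is not modnlp code: the class WildcardSketch, its methods toPattern and expand, and the in-memory word list standing in for the Dictionary are all hypothetical names introduced for illustration. The sketch assumes that * stands for any run of characters and that every other character is literal.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, NOT part of modnlp: illustrates Unix-style wildcard expansion.
final class WildcardSketch {

    /** Translate a wildcard key such as "test*" into a regex; '*' becomes ".*". */
    static Pattern toPattern(String key, boolean caseSensitive) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : key.toCharArray()) {
            if (c == '*') {
                // Flush the literal text collected so far, quoted so it is matched verbatim.
                if (literal.length() > 0) {
                    regex.append(Pattern.quote(literal.toString()));
                    literal.setLength(0);
                }
                regex.append(".*");
            } else {
                literal.append(c);
            }
        }
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
        }
        int flags = caseSensitive ? 0 : Pattern.CASE_INSENSITIVE;
        return Pattern.compile(regex.toString(), flags);
    }

    /** Return the vocabulary entries that the wildcard key matches in full. */
    static List<String> expand(String key, List<String> vocabulary, boolean caseSensitive) {
        Pattern p = toPattern(key, caseSensitive);
        return vocabulary.stream()
                .filter(w -> p.matcher(w).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Assumed toy vocabulary; a real index would supply its own word list.
        List<String> vocab = List.of("test", "tests", "testament", "attest", "Testing");
        System.out.println(expand("test*", vocab, true));  // [test, tests, testament]
        System.out.println(expand("test*", vocab, false)); // [test, tests, testament, Testing]
    }
}
```

The caseSensitive flag mirrors the role of the cs constructor argument only loosely; how WordQuery itself applies case sensitivity is not described here.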
Examples:
seen+before will find ...never seen before... etc.;
seen+[1]before finds, in addition, ...seen her before..., ...seen ie before..., and all sequences in which there is at most one word between seen and before.
know+before* will find ...know before..., ...know beforehand, etc.
".*less.*" will find ...less..., ...hopeless, hopelessness, etc.
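To make the shape of these queries concrete, the following stand-alone sketch splits a query into its keywords and the bracketed intervals written between them. It is only an illustration based on the examples above, not the parser used by WordQuery: the class QuerySyntaxSketch, the Parsed holder and its fields are hypothetical. It assumes that + separates elements, that an optional [n] directly follows a +, and that a missing [n] means no intervening words. Quoted regular expressions that themselves contain + are not handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, NOT modnlp's parser: splits a query into keywords and intervals.
final class QuerySyntaxSketch {

    /** maxGaps.get(i) is the number written in brackets before keywords.get(i + 1), or 0 if omitted. */
    static final class Parsed {
        final List<String> keywords = new ArrayList<>();
        final List<Integer> maxGaps = new ArrayList<>();
    }

    static Parsed parse(String query) {
        Parsed out = new Parsed();
        // Limit -1 keeps trailing empty parts, so "seen+" is reported as an error below.
        String[] parts = query.split("\\+", -1);
        for (int i = 0; i < parts.length; i++) {
            String part = parts[i];
            int gap = 0;
            if (part.startsWith("[")) {
                int close = part.indexOf(']');
                if (i == 0 || close < 0) {
                    throw new IllegalArgumentException("misplaced or unclosed interval in: " + query);
                }
                // NumberFormatException (an IllegalArgumentException) signals a non-numeric interval.
                gap = Integer.parseInt(part.substring(1, close));
                part = part.substring(close + 1);
            }
            if (part.isEmpty()) {
                throw new IllegalArgumentException("empty keyword in: " + query);
            }
            if (i > 0) {
                out.maxGaps.add(gap);
            }
            out.keywords.add(part);
        }
        return out;
    }

    public static void main(String[] args) {
        Parsed p = parse("seen+[1]before");
        System.out.println(p.keywords + " " + p.maxGaps); // [seen, before] [1]
    }
}
```

Whether an interval is measured from the preceding element or from the main keyword is left open in this sketch; it only records the numbers as written.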
N.B.: much (perhaps most) of the functionality in this class should really be moved into modnlp.idx.database.Dictionary. Ideally, Dictionary should handle the functionality currently implemented in modnlp.idx.query.WordQuery.matchConcordance() and auxiliary methods, leaving only query parsing and related methods for WordQuery. This hasn't been done yet to preserve backward compatibility with tec-server.
Parameters:
query - a String: the query string
dict - a Dictionary: the top-level index accessor class
cs - a boolean: if true, the query is case sensitive.
Throws:
WordQueryException - if an error occurs
public WordForms getKeyWordForms()
public WordForms[] getWFormsArray()
public Horizon getRightHorizon()
Get a Horizon object containing the maximum distances allowed in this query between the main keyword and the keywords to the right of it in the query expression.
Returns:
a Horizon object, or null if the query has 1 or fewer keywords.
public Horizon getLeftHorizon()
Get a Horizon object containing the maximum distances allowed in this query between the main keyword and the keywords to the left of it in the query expression.
Returns:
a Horizon object, or null if the query has 1 or fewer keywords.
public static final boolean isLeftWildcard(java.lang.String key)
public static final boolean isRightWildcard(java.lang.String key)
public static final boolean isWildcard(java.lang.String key)
public static java.lang.String getWildcardsLHS(java.lang.String key)
public static java.lang.String getWildcardsRHS(java.lang.String key)
public java.lang.String getKeyword()
public java.lang.String getFirstWord()
public boolean isJustKeyword()
public WordForms getLeastFrequentWord()
public WordForms getWordForms(int i)
public boolean matchConcordance(java.lang.String cline, int ctx)
matchConcordance matches cline against this query (represented after parseQuery() by queryArray and intervArray).
NB: this method really belongs in Dictionary. TO DO: check potential backward compatibility problems in tec-server and deprecate WordQuery.matchConcordance() in favour of Dictionary.matchConcordance().
Parameters:
cline - a String value
ctx - an int value
Returns:
true if cline matches, false otherwise.
public int getIntervToWord(java.lang.String w)
public static boolean isValidQuery(java.lang.String q)
Parameters:
q - a String value
Returns:
a boolean value
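The rules that isValidQuery applies are not spelled out on this page. As a rough picture of what a client-side sanity check of this kind might look like, the sketch below tests a query's shape against the syntax shown for the constructor. Everything in it is an assumption made for illustration: the class QueryShapeCheck and the method looksValid are hypothetical, a term is taken to be any run of characters other than +, [, ] and whitespace, and a query wrapped in double quotes is treated as a regular expression that only needs to compile.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical client-side check, NOT the logic of WordQuery.isValidQuery.
final class QueryShapeCheck {

    // One term: a run of characters other than '+', '[', ']' and whitespace.
    private static final String TERM = "[^+\\[\\]\\s]+";

    // term, then zero or more of: '+', optional "[digits]", term.
    private static final Pattern SHAPE =
            Pattern.compile(TERM + "(?:\\+(?:\\[\\d+\\])?" + TERM + ")*");

    static boolean looksValid(String q) {
        if (q == null) {
            return false;
        }
        String t = q.trim();
        // Assumed convention: a query in double quotes is a regular expression.
        if (t.length() > 2 && t.startsWith("\"") && t.endsWith("\"")) {
            try {
                Pattern.compile(t.substring(1, t.length() - 1));
                return true;
            } catch (PatternSyntaxException e) {
                return false;
            }
        }
        return SHAPE.matcher(t).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksValid("seen+[1]before")); // true
        System.out.println(looksValid("seen++before"));   // false
    }
}
```

A check like this can reject obviously malformed input before a request is sent, but it cannot confirm that the keywords exist in the index; that still requires the server.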