Searching for Words and Phrases
If you select full-text and enter
words and phrases separated by commas, each comma represents the ACCRUE
operator, which is a fuzzy OR: it means "the more of these, the better." By default,
words and phrases in the query are stemmed, meaning the search is broadened to
include the stemmed variations of these words. The ACCRUE operator assigns a
score to each document that matches the query. The score is based on the number of
word matches the document contains and the density of those matches.
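The scoring idea can be illustrated with a small Python sketch. This is not the formula the search engine uses, which is not given here. The function name accrue_score, the equal weighting of the two factors, and the lack of stemming and phrase matching are all assumptions made for illustration.

```python
import re

def accrue_score(document: str, terms: list[str]) -> float:
    """Toy ACCRUE-style score: more matched terms and denser matches score higher.

    Hypothetical formula for illustration only; single-word terms, no stemming.
    """
    words = re.findall(r"[a-z0-9]+", document.lower())
    wanted = [t.lower() for t in terms]
    if not words or not wanted:
        return 0.0
    # Total number of word matches in the document.
    hits = sum(1 for w in words if w in wanted)
    if hits == 0:
        return 0.0
    # Fraction of query terms that appear at least once.
    coverage = sum(1 for t in wanted if t in words) / len(wanted)
    # Share of the document's words that are matches.
    density = hits / len(words)
    return 0.5 * coverage + 0.5 * density

query = ["editor", "publisher"]
both_terms = accrue_score("the publisher and the editor met", query)
one_term = accrue_score("the editor opened the editor window", query)
no_terms = accrue_score("nothing relevant here", query)
```

In this sketch a document matching both terms scores higher than one matching a single term, and a document with no matches scores zero, which is the "more of these, the better" behavior.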
The query below will search for the phrase "desktop publisher" and
stemmed variations of the word "editor":
desktop publisher, editor
The search engine automatically recognizes a topic name as a
Verity topic in a query expression and will treat a word as a topic if it
matches the name of an existing topic. For example, the following query will
search for the topic named "HTML" and stemmed variations of the word
"editor" because "HTML" is the name of a valid topic in
this Topic Internet Server demo:
HTML, editor
You may want the search engine to treat HTML as a word instead of a topic, or to search for the word "editor" alone, without its stemmed variations. To do this, enclose the search term in double quotation marks. For example, the following query will search for the word "HTML" and the word "editor":
"HTML", "editor"
Note that searches are not case-sensitive by default. This
means you can use "HTML" or "html" in the above examples
and get the same search results.
Searching for Text that Includes Punctuation or Special Characters
If you're looking
for the error message "E0-1234", express the string by
its components so that it is matched as a phrase. In the
search field, enter "E0-1234" as "E0 - 1234" (with
spaces between the terms).
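If you build search strings in a program, the rewriting step above can be sketched in Python. The helper name split_on_punctuation is hypothetical, and the rule that every punctuation character becomes its own component is an assumption generalized from the single "E0-1234" example.

```python
import re

def split_on_punctuation(term: str) -> str:
    """Separate runs of letters/digits from each punctuation character with spaces."""
    # Assumption: each non-alphanumeric, non-space character is its own component.
    parts = re.findall(r"[A-Za-z0-9]+|[^A-Za-z0-9\s]", term)
    return " ".join(parts)

rewritten = split_on_punctuation("E0-1234")
print(rewritten)  # E0 - 1234
```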
Verity full-text searching is based on a word index that is built when you
index documents. What counts as a word at index time is controlled by the
style.lex file. If no style.lex is used, our indexing engine uses its built-in
defaults. These defaults do not include the apostrophe as part of a word, but
treat it as punctuation and ignore it. So a word like "women's" is
indexed as two words: the word "women" followed by the word
"s". If you search for the phrase "national people s party"
rather than "national people's party", you should find the documents
you're looking for.
This behavior may seem odd, but if you include the apostrophe as part of a word
and you have a document that only has "women's" in it (and not
"women"), then a search for "women" will not retrieve that
document. So our defaults are tuned for searches on "simple" words, at
the expense of phrases including possessive forms.
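The trade-off can be simulated with a short Python sketch. It does not use style.lex syntax or the indexing engine itself. The function index_words and its keep_apostrophe flag are hypothetical stand-ins for the two word definitions being compared.

```python
import re

def index_words(text: str, keep_apostrophe: bool = False) -> list[str]:
    """Split text into index words, optionally treating the apostrophe as a word character."""
    pattern = r"[a-z0-9']+" if keep_apostrophe else r"[a-z0-9]+"
    return re.findall(pattern, text.lower())

default_index = index_words("women's rights")
custom_index = index_words("women's rights", keep_apostrophe=True)

print(default_index)  # ['women', 's', 'rights']
print(custom_index)   # ["women's", 'rights']
```

With the default-style split, a search for "women" finds the document. With the apostrophe kept as part of the word, it does not.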
For information on using your own custom style.lex, see chapter 7 of the
Collection Building manual. More information on style.lex is also available
online in the FAQ "Designing and using your own style.lex file (searching on
special characters)."
Using Verity Query Language
You can use operators and modifiers to apply logic to your query and pinpoint the exact information you are interested in.
Popular operators are: AND, OR, ACCRUE, and NEAR. A modifier can
be used with an operator to further define your question for the search engine.
Frequently-used modifiers are: MANY and NOT. By
default, the words "and," "or," and "not" are
interpreted as Verity query language; all other query language elements,
such as the NEAR operator, are interpreted as words unless surrounded by angle
brackets. Sample query expressions using query language are below.
The AND operator selects documents that contain all of the search elements
you specify. To find documents that contain both evidence of the topic
named "HTML" and at least one stemmed variation of the word
"editor," you can use the following query:
HTML and editor
The OR operator selects documents that show evidence of at least one of the search elements. To find documents that contain either evidence of the topic named "HTML" or at least one stemmed variation of the word "editor," you can use the following query:
HTML or editor
The MANY modifier is applied to words and phrases for a full-text search by default. This modifier affects how documents are scored and tells the search engine to give the highest scores to documents with the highest density of word matches. This modifier can be used explicitly with many operators, with the exception of AND, OR, and ACCRUE. When you enter a word such as "editor" as a query, the search engine interprets this as:
<MANY> <STEM>editor
The <STEM> operator tells the search engine to search for the stemmed variations of the word. The <STEM> operator and the related <WORD> operator can be used with other modifiers, such as CASE (for case-sensitive searches) and NOT (to exclude information from searches).
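The default interpretation rule described in this section (the bare words "and," "or," and "not" are query language, while other elements need angle brackets) can be sketched in Python. The function is_query_language is a hypothetical helper, not part of any Verity API, and treating the bare words case-insensitively is an assumption.

```python
# Words the text says are read as query language without angle brackets.
BARE_OPERATORS = {"and", "or", "not"}

def is_query_language(token: str) -> bool:
    """Return True if a token would be read as query language rather than a plain word."""
    token = token.strip()
    # Anything wrapped in angle brackets, such as <NEAR>, is query language.
    if len(token) > 2 and token.startswith("<") and token.endswith(">"):
        return True
    # Assumption: the bare operator words are matched case-insensitively.
    return token.lower() in BARE_OPERATORS

print(is_query_language("and"))     # True
print(is_query_language("near"))    # False: treated as an ordinary word
print(is_query_language("<NEAR>"))  # True
```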
Proximity Search Method
There are several search methods for doing proximity searches. A proximity search
looks for documents containing search terms within close proximity of each
other. The following operators enable proximity search methods: NEAR, PHRASE,
SENTENCE, PARAGRAPH.
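As a rough illustration of what "close proximity" measures, the Python sketch below counts the words separating two terms and turns that gap into a score. The function near_score and its 1 / (1 + gap) formula are assumptions chosen for illustration, not the search engine's actual calculation, and no stemming is applied.

```python
import re

def near_score(document: str, first: str, second: str) -> float:
    """Toy proximity score: 1.0 when the terms are adjacent, lower as they move apart."""
    words = re.findall(r"[a-z0-9]+", document.lower())
    first_pos = [i for i, w in enumerate(words) if w == first.lower()]
    second_pos = [i for i, w in enumerate(words) if w == second.lower()]
    if not first_pos or not second_pos:
        return 0.0
    # Smallest distance between any occurrence of each term (at least 1).
    distance = max(min(abs(i - j) for i in first_pos for j in second_pos), 1)
    gap = distance - 1  # number of words between the two terms
    return 1.0 / (1.0 + gap)

adjacent = near_score("html publishing tools", "HTML", "publishing")
far_apart = near_score("html is used for web publishing", "HTML", "publishing")
missing = near_score("html tools", "HTML", "publishing")
```

Here the adjacent pair scores higher than the pair separated by four words, and a document missing either term scores zero.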
The NEAR operator selects documents containing specified search terms within
close proximity to each other. Document scores are calculated based on the
relative number of words between search terms; the closer the search terms, the
higher the score. To find documents that contain the word "HTML" and
stemmed variations of the word "publishing" within close proximity to
each other, you can use this query:
"HTML"<NEAR>publishing
The SENTENCE and PARAGRAPH operators are used to specify a search within a sentence
or paragraph. The syntax for using these operators is similar. To find
documents that contain the word "HTML" and stemmed variations of the
word "publishing" within the same paragraph, you can use this query:
"HTML"<PARAGRAPH>publishing
Excluding Information
Want to exclude something from a search? That's what the NOT
modifier does. For example, to find documents containing stemmed variations of
the words "server" and "configuration" in close proximity
to each other, but not stemmed variations of the word "firewall", you
enter this query:
server<NEAR>configuration<AND><NOT>firewall
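If you assemble queries like this one in a program, a pair of small string helpers keeps the angle-bracket syntax consistent. The Python sketch below only builds the query text. The helper names near and and_not are hypothetical and do not belong to any Verity API.

```python
def near(left: str, right: str) -> str:
    """Join two search terms with the NEAR operator."""
    return f"{left}<NEAR>{right}"

def and_not(expression: str, excluded: str) -> str:
    """Require the expression and exclude documents matching the excluded term."""
    return f"{expression}<AND><NOT>{excluded}"

query = and_not(near("server", "configuration"), "firewall")
print(query)  # server<NEAR>configuration<AND><NOT>firewall
```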
Zone Searching
You can search in any named HTML zone, such as
<TITLE> and <H1>. This query
will find documents whose titles have stemmed variations of the words
"web" and "security" in them:
(web, security)<IN>title
An HTML zone name corresponds to an HTML tag name.
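A zone query can be assembled the same way in a program. The Python sketch below is only a string builder. The helper name in_zone is hypothetical, and it assumes the terms are combined with commas (the ACCRUE operator) as in the example above.

```python
def in_zone(terms: list[str], zone: str) -> str:
    """Build a query that searches for comma-separated terms inside a named zone."""
    return f"({', '.join(terms)})<IN>{zone}"

query = in_zone(["web", "security"], "title")
print(query)  # (web, security)<IN>title
```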
Getting Started with Queries
Word Combination Operators
Word Operators
Field Operators
Using Modifiers
Rules for Adding Modifiers