Last Revised:
Wed Sep 13 13:56:20 MET DST 1995
Htgrep Query Formats
This page describes the query formats that are supported by the htgrep
search engine package.
Support for boolean queries was provided by Paul Sutton.
Htgrep supports querying either by keywords (described here) or by
Perl regular expressions.
Htgrep can be configured to use either querying format.
By default, it assumes that queries are boolean keyword searches,
unless the query contains a backslash ("\"), in which case a Perl
regular expression is assumed.
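The default rule above can be sketched in a few lines. This is an illustration in Python rather than htgrep's own Perl source, and the function name query_mode is hypothetical:

```python
def query_mode(query: str) -> str:
    """Classify a query the way the default rule above describes.

    Hypothetical helper, not part of htgrep: any backslash in the
    query selects regular-expression mode; otherwise the query is
    treated as a boolean keyword search.
    """
    return "regex" if "\\" in query else "boolean"
```

For instance, query_mode("apple or orange") gives "boolean", while a query containing \w or \s gives "regex".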
How to enter search queries
A search query is either one or more words separated by spaces (a
simple search) or a boolean expression. By default, all searches are
case-insensitive: entering "HOUSE" is identical to entering "house",
"House" or even "HoUSe".
(If necessary, htgrep can now be configured to treat queries as
case-sensitive.)
Simple Searches
Enter a single word to find any search record that contains the
exact whole word entered. For example, the search entry "world"
would find records containing the word "world", but not "worldwide".
If you enter more than one word, it will find entries containing all of
the words you entered. For example, "world economy" will find entries
containing both the word "world" and the word "economy" (but not
necessarily next to each other or in that order).
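As a rough sketch of this behaviour (not htgrep's actual code), a simple search can be modelled as a case-insensitive whole-word test that must succeed for every word in the query. The function name matches_all_words and the use of \b as the word boundary are assumptions made for illustration:

```python
import re

def matches_all_words(query: str, record: str) -> bool:
    """Return True if the record contains every query word as a whole word.

    Illustrative only: words are split on whitespace, and each must
    appear in the record between word boundaries, in any order,
    ignoring case.
    """
    return all(
        re.search(r"\b" + re.escape(word) + r"\b", record, re.IGNORECASE)
        for word in query.split()
    )
```

With this model, "world" matches a record containing "Hello World!" but not one containing only "worldwide", and "world economy" matches "The economy of the world".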
To find parts of words, use an asterisk (*) to represent missing
parts of the word. For example, if you enter "world*" it will match
"worldwide", "worlds", etc. Similarly, "*world" would find
"underworld", etc.
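One way to picture the asterisk is as a translation into a regular expression in which each "*" stands for any run of word characters. The sketch below assumes exactly that, including that "*" may also stand for nothing at all, which the text above does not state; the name word_pattern is hypothetical:

```python
import re

def word_pattern(word: str) -> "re.Pattern":
    """Compile a search word with optional '*' wildcards to a regex.

    Assumption for illustration: '*' expands to zero or more word
    characters, and the whole pattern must still sit between word
    boundaries. Matching ignores case.
    """
    # Escape the literal pieces, then rejoin them with \w* in place of '*'.
    pieces = [re.escape(piece) for piece in word.split("*")]
    return re.compile(r"\b" + r"\w*".join(pieces) + r"\b", re.IGNORECASE)
```

Here word_pattern("world*") finds "worldwide" and "worlds" but not "underworld", and word_pattern("*world") finds "underworld" but not "worldwide".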
Boolean Searches
For more control over the search query, you can use a boolean
expression. If you enter the word "or" between two search words (with
a space between each word and the "or") it will find any record which
contains either the first word, or the second word, or both. For
example, "apple or orange" would find records containing the word
"apple" or the word "orange", or both.
If you entered "and" instead of "or", it would match only records
which contained both the word "apple" and the word "orange".
Note that this is the same as a simple search for "apple orange",
because when the boolean commands are omitted, an "and" is assumed
between each pair of search words.
To find records which do not contain a particular word, place
the word "not" before it. For example, "not blue" would find all the
records which do not contain the word "blue". You can combine the
"and", "or" and "not" commands: for example, "apple and not red" would
find records containing the word "apple" but not the word "red".
For advanced use, you can use brackets to group the expression. For
example, "apple and (red or green)" would find all records containing
the word "apple" and either "red" or "green" (or both). If the
brackets are omitted, the "and" command has higher precedence than
"or", so "apple and red or green" would find all records containing
both "apple" and "red", and also all records containing "green".
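The grouping and precedence rules can be made concrete with a small recursive-descent evaluator. This is a sketch in Python under stated assumptions, not htgrep's implementation: the names tokenize and evaluate are hypothetical, records are reduced to a set of lower-cased words, wildcards are ignored, and a malformed query simply raises an error.

```python
import re

def tokenize(query: str) -> list:
    # Brackets become their own tokens; everything else splits on spaces.
    return re.findall(r"[()]|[^\s()]+", query.lower())

def evaluate(query: str, record: str) -> bool:
    """Evaluate a boolean keyword query against one record (sketch)."""
    tokens = tokenize(query)
    words = set(re.findall(r"\w+", record.lower()))
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]  # IndexError here means a malformed query
        pos += 1
        return tok

    def or_expr():
        # Lowest precedence: and_expr ("or" and_expr)*
        value = and_expr()
        while peek() == "or":
            take()
            rhs = and_expr()
            value = value or rhs
        return value

    def and_expr():
        # Binds tighter than "or"; a missing operator means "and".
        value = not_expr()
        while peek() not in (None, "or", ")"):
            if peek() == "and":
                take()
            rhs = not_expr()
            value = value and rhs
        return value

    def not_expr():
        tok = take()
        if tok == "not":
            return not not_expr()
        if tok == "(":
            value = or_expr()
            take()  # consume the closing ")"
            return value
        return tok in words

    return or_expr()
```

Under this model, a record reading "green grass" satisfies "apple and red or green" but not "apple and (red or green)", which is exactly the difference the brackets make.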
Perl Regular Expressions
If you want to use a Perl regular expression rather than a simple or
boolean search, make sure you use a \char construct (e.g. \w or \s).
Any search query which contains a backslash will be treated as a Perl
regular expression.
You can also force htgrep to always prefer Perl regular expressions
by setting the tag "boolean=no".
See the htgrep FAQ list for more details.
Paul Sutton, 26 August 1994
Oscar Nierstrasz, 13 September, 1995