Purpose of Document
This document defines how to specify the search patterns that my
Common Gateway Interface (CGI) lookup uses to select bibliographic entries.
Summary
An entry is selected if at least one of its lines contains a substring
that matches the pattern. The search is carried out by the UNIX awk
program, so the pattern must be written as an awk regular
expression. This means that certain symbols, including periods, parentheses,
brackets, slashes, asterisks, and vertical bars, have special meanings and need
a "\" in front of them to be matched literally.
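As a rough illustration of the escaping rule, the sketch below filters two sample lines with and without a "\" in front of a period. The helper name match_lines and the sample text are made up for this example, and the helper only assumes that a pattern ends up between the slashes of an awk program; it is not the lookup script itself.

```shell
# Hypothetical helper: paste a pattern between awk's slashes and filter stdin.
match_lines() {
  awk "/$1/"
}

sample='version 1.5 released
version 105 released'

# Unescaped: "." matches any single character, so both lines are printed.
printf '%s\n' "$sample" | match_lines '1.5'

# Escaped: "\." matches only a real period, so only the first line is printed.
printf '%s\n' "$sample" | match_lines '1\.5'
```

The same applies to the other special symbols: a search for "f(x)" would be written as f\(x\).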
By a fluke, irregular operations can also be expressed.
[ Irregular Patterns ]
Example Patterns
Formal Definition
- Item::=@Line.
Each Item is essentially a set of Lines -- because the order is not
important to the search.
- Line::=#char.
A given line either matches or does not match a pattern... the set of
matching Lines is written M:
M:: pattern->@Line. The lines that match a given pattern
An item is selected by a pattern if any of its lines matches the pattern:
- For Item I, pattern p, selected(I,p)::= for some L:I(L in M(p) ).
The lookup CGI uses awk for searching. Typically M(p) is the set of lines that
contain a substring that matches p, where p is defined below.
- pattern::= See http://csci.csusb.edu/dick/samples/regular_expressions.html
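The definition of selected(I,p) can be sketched in awk. This is only an illustration under stated assumptions: items are taken to be separated by blank lines, and the function name select_items and the environment variable PAT are invented here. The real lookup script may store and scan its entries differently.

```shell
# select_items PATTERN FILE: print each blank-line-separated item in FILE
# that has at least one line matching PATTERN (an awk regular expression).
# Assumption: items are separated by blank lines (awk paragraph mode).
select_items() {
  PAT="$1" awk '
    BEGIN { RS = ""; ORS = "\n\n"; pat = ENVIRON["PAT"] }
    {
      n = split($0, lines, "\n")     # the Item as a collection of Lines
      for (i = 1; i <= n; i++)
        if (lines[i] ~ pat) {        # some L in I with L in M(p)
          print                      # the whole Item is selected
          break
        }
    }
  ' "$2"
}
```

For example, select_items 'Relational' refs.txt would print every item in a hypothetical file refs.txt that has a line containing "Relational". Each line is tested separately, so a pattern never matches across a line break.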
Irregular Patterns
Awk accepts Boolean operators in its searches:
awk '/string1/&&/string2/'
So by including "/&&/" or "/&&!/" in a pattern you can search for lines that
contain two strings in any order, or one string but not another
(always on a single line):
Program/&&/Law
- Items with a line with both "Program" and "Law" in some order on it.
Program/&&!/Law
- Items with a line with "Program" but without "Law" on it.
ACL/&&!/ORACLE
- Items referring to ACL but not ORACLE
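The reason these patterns work can be seen in a small experiment. The sketch assumes that the lookup pastes the pattern between the slashes of an awk program, as the awk command above suggests; the helper match_lines and the sample lines are hypothetical.

```shell
# Hypothetical stand-in for the lookup: the pattern is pasted between the
# slashes of an awk program, so "/&&/" inside it closes one regular
# expression and opens another.
match_lines() {
  awk "/$1/"
}

sample='Program Law
Law of the Program
Program design
Contract Law'

# Lines with both words, in either order (the first two sample lines).
printf '%s\n' "$sample" | match_lines 'Program/&&/Law'

# Lines with "Program" but without "Law" (only "Program design").
printf '%s\n' "$sample" | match_lines 'Program/&&!/Law'
```

With 'Program/&&!/Law' the program that awk receives is /Program/&&!/Law/, which is an ordinary Boolean expression over two regular expressions.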
Lexemes
- char::= See http://www.csci.csusb.edu/dick/samples/comp.text.ASCII.html#char,
-- the set of ASCII characters.
- empty::@#char= {""}, -- the set with a single empty string in it.