Query Language

What queries consist of

Queries in the Artefact query language are built from tokens, which may serve as operands, operators, parentheses, or modifiers. Parentheses specify the order in which the phrases of a query are evaluated, whereas modifiers restrict the context in which the operands must appear in a document. It is also possible to specify the document fields in which the operands are to be searched for.

Operands

Operators

Operator AND

Aliases: and .

The AND operator combines its operands, which can be words and/or expressions enclosed in parentheses. If AND combines two words A and B, then the documents relevant to the phrase A AND B will be those containing both A and B. The documents relevant to the phrase A AND B AND C will be those containing all three words A, B, and C. In other words, for a document to be relevant to the query, AND requires that all its operands appear in that document.

The AND operator can be omitted, so that the following phrases are equivalent:

repairing computers <=> repairing AND computers
repairing tractors AND carriages <=> repairing AND tractors AND carriages
  <=> repairing tractors carriages
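
The equivalences above can be illustrated with a minimal Python sketch that makes the omitted AND operators explicit. This is not Artefact's own parser: the function names tokenize and insert_implicit_and are hypothetical, and the sketch assumes that a query contains only words, parentheses, and the AND and OR operators (no sequencing operators or modifiers).

```python
OPERATORS = {"AND", "OR"}


def tokenize(query):
    """Split a query into words, operators, and parentheses."""
    return query.replace("(", " ( ").replace(")", " ) ").split()


def insert_implicit_and(tokens):
    """Return a copy of tokens with every omitted AND made explicit."""
    result = []
    for token in tokens:
        if result:
            prev = result[-1]
            # An operand ends with a word or a closing parenthesis ...
            prev_ends_operand = prev == ")" or (
                prev != "(" and prev.upper() not in OPERATORS
            )
            # ... and starts with a word or an opening parenthesis.
            token_starts_operand = token == "(" or (
                token != ")" and token.upper() not in OPERATORS
            )
            if prev_ends_operand and token_starts_operand:
                result.append("AND")
        result.append(token)
    return result
```

For example, insert_implicit_and(tokenize("repairing tractors AND carriages")) yields the token list for repairing AND tractors AND carriages.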

Operator OR

Aliases: or .

The OR operator combines its operands, which can be words and/or expressions enclosed in parentheses. If OR combines two words A and B, then the documents relevant to the phrase A OR B will be those containing either only A, or only B, or both A and B.

The operators AND and OR have the same(!) precedence, which means that the order in which they take effect is determined only by parentheses.
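
The following Python sketch shows one way an evaluator with equal precedence for AND and OR could behave. It is an illustration, not Artefact's implementation: the function name matches is hypothetical, a document is reduced to a set of lowercase words, and operators at the same nesting level are assumed to be applied from left to right (the text above only states that parentheses determine the order).

```python
def tokenize(query):
    """Split a query into words, operators, and parentheses."""
    return query.replace("(", " ( ").replace(")", " ) ").split()


def matches(query, document_words):
    """Evaluate a query of words, AND, OR, and parentheses against a word set."""
    tokens = tokenize(query)
    pos = 0

    def parse_operand():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = parse_expression()
            pos += 1  # skip the closing ")"
            return value
        return token.lower() in document_words

    def parse_expression():
        nonlocal pos
        value = parse_operand()
        # Assumption: AND and OR are applied strictly left to right.
        while pos < len(tokens) and tokens[pos] != ")":
            operator = tokens[pos].upper()
            if operator in ("AND", "OR"):
                pos += 1
            else:
                operator = "AND"  # an omitted operator means AND
            right = parse_operand()
            value = (value and right) if operator == "AND" else (value or right)
        return value

    return parse_expression()
```

Under these assumptions, for a document containing only repairing and computers, the query computers OR selling AND tractors does not match, because it is read as (computers OR selling) AND tractors, whereas computers OR (selling AND tractors) does match.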

Sequencing operator :

The : operator requires that its operands appear in a sentence in a document in the specified order. The words that match the operands may be separated in the document by arbitrary sequences of words. It is possible to limit the number of separating words.

For instance:

tax :3 (export OR import)

Consider the query

(repairing OR selling) : computers

It means that the document must contain a sentence in which repairing or selling (or both) is followed by computers; the matching words may be separated by an arbitrary sequence of words. A similar query

(repairing OR selling) :2 computers

means that the document must contain a sentence in which repairing or selling (or both) is followed by computers, with at most two words (that is, zero, one, or two) separating them.
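
The check described above can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the function name followed_within is hypothetical, a sentence is split on whitespace without any handling of punctuation or word forms, and each operand is given as a set of acceptable words so that an OR expression such as (repairing OR selling) becomes a two-element set.

```python
def followed_within(sentence, first, second, max_gap=None):
    """Return True if a word from `first` is followed by a word from `second`.

    max_gap is the largest number of separating words allowed;
    None means that any number of separating words is accepted.
    """
    words = sentence.lower().split()
    for i, word in enumerate(words):
        if word not in first:
            continue
        for j in range(i + 1, len(words)):
            gap = j - i - 1  # number of words between positions i and j
            if max_gap is not None and gap > max_gap:
                break
            if words[j] in second:
                return True
    return False
```

With the sentence "we are selling used office computers", the call followed_within(sentence, {"repairing", "selling"}, {"computers"}, max_gap=2) succeeds because two words separate selling from computers, while the same call with max_gap=1 fails.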

Field search

To search within a specified field (or group of fields) of the document, the operand or the query phrase in parentheses must be preceded by a "/" and the field name (or names).

For example:

/date 09.05.1996!d /title /text (Yeltsin : visit : USA)

Specifiers of the relevant context

Except for the queries containing sequencing operators, all queries we have so far considered specified conditions imposed on whole documents, i.e. the queries required that some words should appear in a document, without paying attention to the distances between those words in the document. Sometimes, if a database consists of large documents, this can result in too many documents being selected, a phenomenon that is often referred to as "search noise". This noise can be reduced by restricting the context in which the operands should appear in the document.

Modifiers

The context in which the search for words must be performed can be specified by modifiers. The search can be restricted to

The scope of a modifier in the query

The scope of a modifier appearing at the top level is the whole query. Otherwise, if a modifier appears inside a pair of parentheses, its scope is the part of the query enclosed in those parentheses. This is illustrated by the following picture, where A, B, C, ... stand for words, and /a, /b, /c, /d stand for some modifiers:

the scope of /a   the scope of /b   the scope of /c
((A B C /a) or (D : E F /b) or ((L or M) and N /c)) /d
the scope of /d

If several /wN or /sN modifiers have the same scope, only the last of them takes effect.

If several /f modifiers have the same scope, all of them take effect: the context will include all the fields thus specified.

The character / in modifiers can always be replaced with \, without any change in their meaning.
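
The rules for modifiers that share a scope can be sketched in Python as below. The sketch rests on several assumptions that the text does not confirm: the function name resolve_modifiers is hypothetical, /wN and /sN are treated as competing for a single slot (so the last of either kind wins), and every other modifier is treated as a field specifier whose name follows the slash.

```python
import re

# Matches /wN and /sN, where N is a non-negative integer.
DISTANCE_MODIFIER = re.compile(r"^/([ws])(\d+)$")


def resolve_modifiers(modifiers):
    """Resolve the modifiers of one scope.

    Returns (distance, fields): distance is the last /wN or /sN modifier as a
    (letter, N) tuple, or None; fields lists the names of all other modifiers
    in order of first appearance.
    """
    distance = None
    fields = []
    for raw in modifiers:
        # A leading backslash is equivalent to a slash.
        modifier = "/" + raw[1:] if raw.startswith("\\") else raw
        match = DISTANCE_MODIFIER.match(modifier)
        if match:
            distance = (match.group(1), int(match.group(2)))  # last one wins
        else:
            name = modifier[1:]
            if name not in fields:  # assumed field specifier: accumulate
                fields.append(name)
    return distance, fields
```

For instance, resolving the list /w5, /title, \s2, /text under these assumptions keeps only the distance modifier s with N = 2 and collects the fields title and text.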