AP5 Reference Manual

author: Don Cohen don@isis.cs3-inc.com
(The mail will bounce but the bounce message will give you a working address.)
last revised: 2006/5/15

Contents

Introduction

AP5 is an extension to commonlisp which allows users to "program" at a more "specificational" level. By this we mean that it allows you to focus more on what you want the machine to do and less on the details of how. AP5 is a compromise between lisp and the gist specification language. In particular, it incorporates those parts of gist that can be reasonably compiled. Unlike gist, it allows programmers to deal with performance issues. There are two mechanisms for this. One is that the compiler can be instructed via "annotations", descibed below. The fallback mechanism is to use lisp when necessary.

AP5 represents state (that is, data) as a set of relationships among a set of objects, as in a model of first order logic or a relational database. The language for accessing this data includes the language of first order logic. We will refer to it as FOL+. (The ways in which it extends the language of first order logic are described in many pieces spread through this manual.) A relation is a set of tuples, which are sequences of objects. All tuples of a relation have the same length (number of objects) which is called the arity of the relation. The combination of a relation and a tuple of the appropriate length may be regarded as a primitive sentence, or fact, which is either true or false. As in first order logic, there is no notion of a fact being "unknown" or "meaningless". If the relation < is the set of pairs of numbers such that the first is less than the second, it makes perfect sense to ask whether the tuple [John, Susan] is in the < relation - the answer is no. We assume that the set of objects (and the set of tuples) is not finite. The set of objects should be thought of as including all potential lisp objects, e.g., all numbers. A more precise definition of object appears in Equivalence.

AP5 programs can add and remove relationships among objects. They can also test (determine the truth value of) conditions expressed in FOL+ and iterate through objects that satisfy conditions expressed in FOL+. If it is assumed that relations do not change other than by AP5 mechanisms (this is the default assumption, but the reverse can be declared) then AP5 can provide automatic enforcement of FOL+ conditions and automatic invocation of programs when FOL+ conditions become true (see Rules).

AP5 programs are normally built in two phases. The first is specification, in which the programmer invents relations and rules and uses them to write programs that "do the right thing". In the second phase the program is optimized by adding appropriate annotations, such as representations or sizes of relations. These annotations influence the AP5 compiler's algorithm selection. They do not affect the meaning of the programs. They affect the source text only in very minor ways so the annotated programs need not be debugged again. Thus the programmer initially concentrates on behavior without worrying about efficiency concerns, and later can easily experiment with different performance tradeoffs. AP5 provides a fairly large library of annotations from which to choose, and allows programmers to add others.

AP5 differs from theorem proving systems and "logic programming" languages in that it does not allow assertion of arbitrary conditions, such as "all men are mortal". It may, however, be able to check whether all men are mortal, and if so, prevent this fact from becoming false. Another difference is that AP5 does not (normally) do unbounded searches. Queries that may cause theorem provers to start infinite computations cause AP5 to generate compile time errors.

Organization of the manual

This manual assumes that readers are familiar with lisp. The material above is meant to provide a general idea of what AP5 is. The next chapter tells a novice what he needs to know in order to browse and start writing programs that use the database. More serious users will be interested in the middle of the manual - rules, annotations, types and equivalence. The end of the manual is meant to contain details that even serious users will rarely need.

Basics

Wffs

At any moment only finitely many relations have been declared. These will be referred to as "primitive relations". Well-formed formulae (wffs) are expressions built from primitive relations, connectives (NOT, AND, OR, IMPLIES, EQUIV, XOR), quantifiers (A and E, meaning For-All and There-Exists), variables (which are identified by context) and lisp expressions. If R is an n-ary primitive relation (or the name of one), then any list with R as the CAR and a list of length n as the CDR is a "primitive wff", denoting the relation R applied to the arguments in the CDR. More syntax is introduced later [In particular, if you run across Start, Previously, or symbols containing the characters #, @, $, =, or ?, see Rules, Types, Equivalence and esoteric syntax.], but these need not be understood in order to write most wffs. Other than quantifiers, connectives, and relations, the only thing you need to understand is the interpretation of arguments. Each argument is either a constant such as 12 or '(this list), a symbol which names a bound variable (if the primitive wff is in the scope of the binding), or a lisp expression. Thus, in the wff

(E (x) (R 'John x (f x)))
'John is a constant, x is a bound variable, and (f x) is a lisp expression. If we were to ask for the truth value of this wff, the expression (f x) would be evaluated first - note that the x refers to some lisp variable bound outside the wff. We would then find out whether the relation R contained any 3-tuple starting with the symbol John and ending with the value computed for (f x).

The set of wffs includes TRUE, FALSE, (AND wff*), (OR wff*), (IMPLIES wff wff), (EQUIV wff wff), (XOR wff wff), (NOT wff), (A Vars wff), and (E Vars wff), where Vars is a list of distinct symbols representing bound variables. Wffs containing quantifiers, connectives, and a few other constructs introduced later are called compound wffs, while other wffs are called primitive.

Lists of the form (Vars s.t. wff) are called descriptions if wff contains no lisp forms to be evaluated, i.e., the arguments to relations must be variables or constants. These may be used as relations, e.g. (and (P x) (((y) s.t. (Q y)) x)) is a wff which means the same thing as (and (P x) (Q x)). We regard a description as a primitive relation if its wff is a primitive wff.

Descriptions are analogous to lambda expressions in lisp. The syntax of wffs also includes Funcall and Apply, analogous to the lisp functions of those names. A wff whose CAR is the symbol Funcall is interpreted by evaluating the second element and treating it as the relation to be applied to the (values of the) other arguments. A wff whose CAR is the symbol Apply is interpreted by evaluating the second element and treating it as a relation to be applied to an argument list that is constructed by evaluating all the other arguments and appending all but the last to the last one. Thus (Apply r x (list y z)) is usually equivalent to (Funcall r x y z). [The exception arises when y or z are used as bound variables. Apply wffs always treat their last arguments as lisp forms to be evaluated, whereas in funcall wffs, as in normal wffs, the arguments may refer to bound variables. Thus (E (x) (funcall r x x)) means that there is an x that is related to itself by the relation which is the value of r, whereas (E (x) (apply r x x)) means that the relation which is the value of r has at least one true tuple which can be obtained by putting something in front of the value of the lisp variable x.] We regard a wff starting with funcall, or apply as primitive when the value of the second element is a primitive relation.

Relations can also be defined by computations that cannot be expressed in wffs. Descriptions, in fact, may be regarded as a special case of relation definition. Just as it is possible to replace a relation in a wff with a description, it is also possible to replace it with a more general definition. This is described in more detail in the section on anonymous relations.

Simple Database Operations

The simplest ways to access a database are to ask whether something is true or to add or delete facts:

Macro (++ . primitivewff)
makes primitivewff true.
[The notation (A . B) refers to a list whose CAR is A and whose CDR is B.]

Macro (-- . primitivewff)
makes primitivewff false

Macro (?? . primitivewff)
returns NIL if wff is false, something else if it's true.

For example, if there were a binary relation named R, one would make it relate the symbol George to the number 3 by doing (++ R 'George 3). One could find out whether George was related by R to anything at all by doing (?? E (x) (R 'George x)).

Normally, non-primitive wffs can be tested but cannot be directly updated.

There is also an "update" macro that combines addition and deletion:

Macro (update relation of expression* {from expression*} to expression*)
The syntactic keywords OF, FROM, and TO may be in any package. Relation must be the name of a relation, the arity of which is the number of expressions following OF plus the number following TO. If FROM is supplied it must be followed by the same number of expressions as TO. The tuple formed by appending the OF expressions and the TO expressions is added to the relation. That tuple is allowed to have already been in the relation. If FROM is supplied, the tuple formed by appending the OF expressions and the FROM expressions is removed from the relation. If that tuple was not initially in the relation, update generates an error. If FROM is not supplied, and the tuple to be added was not initially in the relation, then if there is any tuple in the relation starting with the OF tuple, an arbitrary such tuple is removed. The updates are done atomically (see transition). Update is meant primarily for situations where the set of values following OF forms a key of the relation, in which case the result is deterministic.

Thus, for example,

(update phone of 'George to 126)
will make the phone of George be 126, and if George originally had another phone, one such phone will be removed.
(update phone of 'George from 225 to 126)
requires that his phone was initially 225. It removes 225 and adds 126 (although the addition may be a noop).

In addition there is a macro for creating an object and relating it to several other objects:

Macro (create type* {relation = expression*}*)
As an example,

(create person firstname = "John" lastname = "Smith"
	parents = mother father)
In a single transition (again, see transition), this creates a new object and adds a number of facts about it. In this case it would be equivalent to the following:
(let ((person (make-dbobject)))
   (atomic (++ person person)
	   (++ firstname person "John")
	   (++ lastname person "Smith")
	   (++ parents person mother father))
   person)
In the current version (we hope to fix this!) all create's allocate a lisp object of the same representation, called dbobject (for Data Base Object). It is still possible to do your own allocation, but the create macro does not help you. Several of the relation representations presented below require this representation.

Function (make-dbobject)
creates and returns a new DBObject.

Declaring Relations

The generally useful form for declaring a new relation is DefRelation. We provide a brief overview here, although the details are postponed for later. For a beginner it is sufficient to know that an n-ary relation can be declared by supplying only a name and arity, e.g.,

(DefRelation PhoneNumber :arity 2)
Typically one provides documentation and declares the types (unary relations) required of objects allowed in the tuples, e.g.,
(DefRelation PhoneNumber :arity 2 :types (person integer)
  :documentation
  "(PhoneNumber x y) means that the integer y is the phone number
   of the person x")
The other common form that will be useful to a novice is defining a relation by a description, e.g.,
(DefRelation sibling
  :definition ((x y) s.t. (E (parent) (and (child parent x)
                                           (child parent y)
                                           (not (eql x y)))))
  :documentation
  "(Sibling x y) means x and y are distinct and share a parent")
In this case the arity and types are not needed - they can be deduced from the definition. However they can be provided if you like.

When a DefRelation form is evaluated for an already existing relation, the assumption is that its definition is being changed, and the result ought to conform to the new definition.

As a preview and incentive to keep reading, we mention that DefRelation allows you to specify such things as

  • data representions of relations, e.g., hasharray, alist, etc.
  • relations that are computed or derived from other relations, e.g., transitive closures
  • cardinality constraints, e.g., every person has at least one office
  • how many tuples you expect to populate a relation, e.g., the typical person will be the child of 2 parents and the parent of 3 children.

Relation Names

DefRelation is analogous to Defun in lisp. It "binds" a name to a relation object, which is analogous to a lisp function object. Furthermore, it is useful not only for declaring new relations, but also for changing already existing relations. To emphasize this analogy we provide the following functions:

Function (rboundp symbol)
This is analogous to lisp Fboundp - it returns T if the argument is the name of a relation and NIL if not.

Function (symbol-relation symbol)
This is analogous to lisp Symbol-Function - it returns the relation object of a given name. It signals an error if there is none.

Function (rmakunbound symbol)
This is analogous to lisp Fmakunbound - it removes the association between a name and any associated relation object. This does not "destroy" the relation object, however. The closest thing in AP5 to "destroying" a relation is (-- Relation relation), which makes it no longer be a relation.

The functions above are trivially defined in terms of this relation:

Relation (relationname relation symbol)
means that symbol is the name of the relation object relation. Relations are normally named, in order to make them easier to refer to. (Actually, names are also needed in order to compile to a file.) Relation names are just symbols. It is possible to change the name of a relation, but, as in lisp, this tends to cause problems, since compiled files identify relations by name. The recommended way to create aliases for relations is via defined relations.

Generators

Typically a database is used not just to test whether a particular fact is true, but to find out what facts are true. The general mechanism for this is called generators. For any sequence of variables, V, (a list of distinct symbols), and wff, W which uses (only) the variables in V freely, there is a set of tuples that satisfy W, in the sense that substituting the values for corresponding variables results in a wff that is true. The usual mathematical notation for this set is {V | W}. AP5 provides access to this set via the loop keyword s.t., standing for "such that", which is preceded by V and followed by W. As a special case, if V is a symbol (other than NIL) it is treated as a list whose only element is that symbol. NIL is treated as a list of no variables. W may contain arbitrary lisp expressions as slot fillers in its primitive wffs. Within the loop body, the variables may be lexically accessed by name. It is considered bad style to assign to the variables within the loop body (and we do not promise that the iteration will work if this is done). The iteration may be terminated by executing RETURN from lexically within the loop body. Variables may be introduced with "#" and "@" notation (see esoteric syntax).

We mention in passing the degenerate cases:
(loop for nil s.t. true do body) will do the body once.
(loop for nil s.t. false do body) will not do the body at all.

For fans of the commonlisp DO macro and its relatives, an AP5 analog is also provided:

Macro (do-s.t. (vars wff {result-form}) . form*)
iterates over bindings of the variables vars satisfying wff. The remarks above all apply here as well. For each tuple of values generated for the variables from the wff, the statements are executed with those values lexically bound to the variables. If the generator terminates, the result form is executed and its value(s) becomes the value(s) of do-s.t.. It is NOT executed if a statement RETURNs from do-s.t.. Within the result-form, the variables are not lexically bound by the do-s.t. form; values, if any, from the enclosing environment are seen.

Examples

To get a list of George's siblings, do
(loop for x s.t. (sibling 'George x) collect x).

To count the pairs of known siblings do
(loop for (x y) s.t. (sibling x y) count t).

To print up to n people whose PhoneNumber is the value of x, do
(loop for y s.t. (PhoneNumber y x) as i below n do (print y)).

Useful generator macros

The following common uses for generators have been encapsulated as macros:

Macro (any vars {s.t. wff} {ifnone form*})
If there are values for vars that satisfy wff, then one such set of values is returned (multiple values if multiple vars), otherwise the ifnone forms are evaluated, and the result(s) of the last are returned. By default, ifnone generates an error. Vars may be either spliced in or a list of symbols, e.g., (any x y z s.t. ...) = (any (x y z) s.t. ...). Finally, if there is no s.t. in the form, the condition defaults to true. [Of course, "s.t. true" only makes sense if the variables are typed, e.g., (any relation) will return a relation, whereas (any x) would cause an error.]

Macro (theonly vars {s.t. wff} {ifnone form*} {ifmany form*})
is similar, but if there is more than one set of values for vars the ifmany forms (which also default to an error) are evaluated and the last value(s) returned. The ifmany forms do not have access to bindings of variables that satisfy the wff. Again, the wff defaults to true.

Macro (forany vars {s.t. wff} form* {ifnone form*})
If there is any set of values for vars, then the forms are evaluated, with the value(s) of the last being returned. The variables of vars are bound to some set of such values. If there is no such set of values, the ifnone forms are evaluated (the variables are not bound here) and the result(s) of the last returned. Again, ifnone defaults to an error, and wff defaults to true.

Macro (fortheonly vars {s.t. wff} form* {ifnone form*} {ifmany form*})
is analogous.

Macro (listof vars {s.t. wff})
If there is only one variable in vars this returns a list of the values for that variable that satisfy wff. If there are multiple variables it returns a list of lists of values (in the same order as the variable list). (This macro was written because I always found myself typing its expansion in order to browse the database. Perhaps another form would be better for programming.) Again, (listof relation) will return a list of all relations.

Macro (set-listof vars s.t. (relation . args) form)
The intent of this macro is that (listof vars s.t. (relation . args)) should return the result of evaluating form. It only works if the relation can be added and deleted. This is meant primarily as an initialization form, e.g.,

(set-listof x s.t. (Phone Don x) '(226))

Getting Started

After having read the preceeding sections you should know almost enough to actually start using AP5. Here we try to supply what's missing.

You have to begin with a machine on which AP5 is loaded. AP5 is defined in the AP5 package, so it will be convenient to use a package that uses the AP5 package, e.g.

(in-package "MY-PACKAGE"
  :use (list (find-package "AP5") (find-package "LISP")))

In addition to this documentation you have the AP5 database to help you figure out how to do things. This is a very significant resource. Of course, its value grows as you learn more about what it contains. In order to get you started, we mention the following:

Function (describerel rel-or-name {stream})
produces some output describing a relation. For instance, (Describerel 'relation) produces the following information:

RELATION  implementation: DEFINED
arity: 1
Compare Relations: (EQL)
Definition:
((R) S.T. (E (X) (RELATIONARITY R X)))
Appears in the definition of: 
(Typeconstraint-DELETEMATCHER-0 ...)
Immediate superrelations: (REL-OR-IMP RULE-OR-RELATION)
Immediate subrelations: (NONTRIGGERABLE TYPE ...)

(Relation relation) is true of all known relations, i.e., (loop for r s.t. (Relation r) collect r) will give a list of relations. The definition of a relation is an object in the first place of a tuple of the RelationArity relation.

Function (describerule rule-or-name {stream})
produces some output describing a rule. (Rules are described later.)

Function (tell-me-about object {:stream stream} {:verbose verbose} {:rels rels} {:more number})
will print to stream all true, primitive facts about object (which can be any lisp object) that it can find in the database. This tends to be a slow operation, especially the first time it's done for an object of a particular type after creating many new relations. Rels is a list of the relations you want to know about - it defaults to all relations not in the relation:

Relation (dont-tell-me-about relation)

If verbose is non-NIL (default is NIL), then it tells you what relation it's on, what relations (and positions) cannot be generated, which relations are equivalence relations (their tuples are not listed), and what relations (and positions) haven't been generated before (for these it has to compile a new generating function, which takes time). More (default 5) is the number of facts to report with object in any one position of any one relation before stopping to ask whether you want more. If more is NIL it will never ask, but just report all the facts it can find. A negative value for more means that after displaying (the absolute value of) that many facts tell-me-about should automatically go on to the next relation.

Two particularly useful relations for browsing are:
Relation (derived-from relation-or-rule relation)
and its transitive closure,
Relation (derived-from* relation-or-rule relation)
These relate a relation or rule to any relation from which it is derived. (In the case of a rule this means any relation from which any relation in the trigger is derived.) This is especially valuable for finding out what rules might react to a change in the database, or what would be affected by a change in a definition of a relation.

Annotations, representations and derivations

At this point we have described how to declare two different kinds of relations - those that are stored explicitly in memory and those defined by wff. While this is sufficient for obtaining many interesting behaviors, AP5 offers a lot more. First, it allows you to choose a representation of data that you think will make your program more efficient. You can even create your own representations, although that is described later in the manual. Next, you can provide information to the compiler that helps it to optimize the access of compound wffs. You can also optimize your program by deciding to redundantly store a representation of a relation defined by wff. (This is advantageous if you spend a lot of time reading the relation but it's not often updated.) All of these are called "annotations" because they do not affect the meaning of the specification. They may affect whether the specification can be compiled, but if it can be compiled, it will mean the same thing regardless of the annotations.

As part of the optimization process it is often useful to see what algorithms AP5 is compiling for you. The usual question is how it decides to generate a compound wff. The answer is supplied (in pseudo-code) by the following function:

Function (describe-algorithm description)
We also describe in this section some instructions that you can give AP5 which do affect the meaning of a specification. These are standard idioms for declaring how a relation is to be computed or derived from other relations. Again, the standard idioms can be augmented by your own, but that is described later in the manual.

We divide relations into three classes: stored, derived and computed.

Stored relations are those which can be directly updated. You can think of these relations as being represented in some data structure. The representation parameter in DefRelation determines what kind of data structure that is. The choice of representation for a stored relation is an annotation.

Derived relations are defined in terms of some other relation(s), and changes as those relations change. One decides whether a tuple is in a derived relation by doing a computation that access these other relations. (To get a relation that accesses both data structures and other relations you can factor out the data structures into stored relations and let a derived relation access those.) The derivation parameter is used to describe how derived relations are to be derived, and it does influence the meaning of the AP5 program

Computed relations never change. It seems academic to wonder whether tuples are tested by examining a data structure or whether the "data" is embedded in a program. A declaration that a relation is computed simply means that it is static, i.e., unchanging. In practice the mechanisms for declaring computed relations turn out to be the same as for declaring derived relations - both describe computations that, in the case of derived relations happen to access changing data.

You tell AP5 which type of relation you are declaring by supplying arguments to the DefRelation macro. Only one of the keyword arguments :derivation, :representation or :computation should be supplied - :computation for computed relations, :derivation for derived relations and :representation for stored relations. We will refer to these parameters collectively as the "implementation". A number of possible values for these parameters are described below. If none of derivation, computation or representation is supplied, then if a definition is provided, the derivation defaults to defined, otherwise the representation defaults to base.

Relations can, in fact, be both stored and derived, i.e., a data structure may redundantly record what could in principle be derived. We call these relations maintained relations. At present only relations defined by wffs may be maintained. (Anonymous relations make it convenient to define the relation that you want to maintain.) This annotation is supplied simply by providing both :definition and :representation arguments to DefRelation.

Another annotation is provided by the :size keyword to DefRelation. This is a list alternating between patterns and numbers. For instance, we might declare the Child relation as follows:

(DefRelation Child
 :documentation "(Child x y) means that y is a child of x"
 :size ((input output) 3 
        (output input) 2
        (output output) 1000))
This would mean that the expected values of the following programs
(loop for x s.t. (child a x) count t)
(loop for x s.t. (child x a) count t)
(loop for (x y) s.t. (child x y) count t)
would be 3, 2, and 1000 repectively, i.e., a typical person would have 3 children and 2 parents, and there would be 1000 tuples in the child relation. This information is used by the compiler to choose among algorithms. For instance, suppose you write this program:
(any x s.t. (and (child p1 x) (child x p2)))
Should this compile into a program that searches the children of p1 for one whose child is p2 or a program that searches the parents of p2 for one whose parent is p1? The answer depends on how expensive it is to find parents and children, but also how many there are. Normally when you find AP5 using a bad algorithm (even though you supplied what you thought were good representation choices!) the problem is that it doesn't know some critical size data.

In general, patterns can contain the symbols INPUT and OUTPUT, and constants. Given arbitrary objects for the slots labeled INPUT, the size annotation tells you how many tuples to expect from generating the slots labeled OUTPUT. The special size NIL means that there may be infinitely many matching tuples.

Predefined representations

While it is possible to define your own representations and derivations (this is described towards the back of the manual), this is usually not necessary. The following choices are already supplied.

Base is a particularly dumb, but general representation of relations, which simply stores a list of the true tuples.

Partial-index means that an index is maintained (implemented as a hasharray) on specified slots. The slot numbers are provided as arguments to partial-index, e.g., (partial-index 0 2) to maintain indices for slots 0 and 2. Such relations have quick access to all tuples for which a particular value occupies an indexed position. This implementation is similar to the "base" implementation in that it allows complete generation of relations of any arity. It's much faster for many types of generation but of course requires much more space. Specifying no slots, i.e., (partial-index), is functionally equivalent to base.

StructProp means that a binary relation relates only DBObjects to arbitrary other objects, y.

Tree relations are stored by using values of the first argument as discriminators which select subtrees for the remaining arguments. The discrimination is done by assoc or gethash, depending on the number of branches.

TreeProp relations are a combination of structprop and tree relations. If R is a treeprop relation and (R x y) is true, then y will be on the property list of x, which must be a DBObject, and there will also be a hashtable with an entry indexed on y having x in its value. This has the advantage of very fast access of y from x, while the ability to generate all the tuples in the relation is retained.

TreeProp+ relations are a generalization of treeprop in that domain values need not be DBObjects. Range values are stored on property lists when the domain values are DBObjects and symbols, and otherwise in a hashtable.

Two-Way-Structprop means that the relation is a binary relation that relates DBObjects to other DBObjects by storing each on the property list of the other. Thus for such a relation it is impossible to generate {x,y | (R x y)}.

Stored-In-Place and Stored-In-Place-Single are representations which accept as a parameter a piece of code that computes a place where values are stored. This is described in the section on creating new implementations. For the purpose of declaring relations we describe the implementation translators that DefRelation accepts to create specialized relations of these two "real" implementations.

GlobalVar means that the (unary) relation is stored as a list of values in the value of a global variable. This is declared in DefRelation by

:representation (globalvar [variable-name])
AP5 assumes that you will never set that variable other than with database update macros (,e.g., ++). If you do intend to set it, defrelation is the wrong way to declare it.

GlobalVar-single means that there is at most one value in the (unary) relation, and it is stored in the value of a global variable. This is declared in DefRelation by

:representation (globalvar-single [variable-name])
The remarks above still apply. The relation will be true of no values if the variable is unbound or a special empty marker.

The empty marker is the value of ap5::*no-single-value-stored*. Of course it is an error to add a tuple containing that value - it is only to be used for initializing data structures identified with "-Single" relations.

Declaring a -Single relation does not automatically declare a count constraint. You can, of course, include the constraint in DefRelation.

Structure means that the (binary) relation relates objects represented as structures to the values in a list stored in some field of the structure. This is declared in DefRelation by

:representation (structure [structure-name] [field-name])

Structure-Single is similar to defstruct except that the field contains either the single value related to the structure or an empty-marker.

SymbolProp means that the (binary) relation relates symbols to the values in a list stored as a property of the symbol. Note that it is not possible to generate all the tuples of such a relation. This is declared in DefRelation by

:representation (symbolprop [property-name])

SymbolProp-single is similar but only one value is stored under the property name.

Alist means that the (binary) relation is stored as an association list. You have to supply a piece of code that (a) can be used with SETF and (b) finds the alist. The DefRelation parameter is

:representation (alist [code to find alist])
For instance, (car *data*) would be acceptable code, since (setf (car *data) [value]) can be compiled.

Alist-single is similar but instead of associating a list of values with a value, only one value is stored.

HashTable means that the (binary) relation is stored in a hashtable that hashes on the domain to find a list of related values. Again, you must supply a form to find the hashtable:

:representation (hashtable [code to find hashtable])
You (not AP5) must arrange for the hashtable to use the appropriate test for the canonicalized tuples (see Equivalence).

HashTable-single is similar but only stores a single value.

Array means that the relation is stored as an array. The arity of the relation must be one more than the number of dimensions of the array. Such a relation relates array indices to the elements of a list stored in the array at those indices. Again, you must supply a form to find the array, and it is your responsibility to give it the right number of dimensions:

(defvar myarray (make-array '(2 3 4)))
(defrelation arrayrel :arity 4
  :representation (array myarray))
It is assumed that different array indicies are not considered equivalent by the comparison relation for the relation.

Array-single is similar but only stores a single value.

(defvar myarray1 (make-array '(2 3 4) :initial-element
			 ap5::*no-single-value-stored*))
(defrelation arrayrel1 :arity 4
   :representation (array myarray1))

Individual and Individual-Derived are a representation and a derivation with no capabilities whatever. They are to be used for relations where all the capabilities are supplied in the DefRelation form. Some of the "representations" above and "derivations" below are actually provided via translation functions that produce Individual and Individual-Derived relations.

Multiple (redundant) representations are allowed (if they are compatible) by supplying as the representation argument to defrelation a list of the form (:multiple rep1 rep2 ...), where the elements other than the first would be allowed as representation arguments.

Predefined derivations and computations

Most of these may be used with either the :derivation or :computation parameter of DefRelation. The only difference is that :computation declares that the relation is static and :derivation declares that it is not.

Defined means that the relation is defined by a description. This is what you get by supplying a :definition parameter but no implementation parameter. For example, one might define the Sibling relation from the Parent relation as

((x y) s.t.
 (E (z) (and (Parent x z) (Parent y z) (not (eql x y)))))
In other words, two distinct objects are siblings if they share a parent. In this case it would be possible to generate the sibling relation, but not to directly update it. Recursive relations are not supported. The reason is that they typically are not really definitive, i.e., several different sets of tuples satisfy the definition. [The section on derived relations describes how you can get relations to exhibit recursive behaviors, such as transitive closures.] In fact, defined relations may not contain funcall's or apply's, and the expressions used as arguments must all be either bound variables or constants (not expressions to be evaluated at run time). [See the description of Memo if you intend to use DBObjects (,e.g., relations) or other "weird" objects as constants in wffs.]

Coded means that a relation is computed from a lisp form. This is specified to DefRelation by a :computation or :derivation parameter of the form (coded (vars s.t. lispform)). The lispform should use the variable names in vars and evaluate to something other than NIL if the relation is true of those values. The lispform should never generate an error, since relations are supposed to be either true or false for all tuples. One could define the EQL relation with

(DefRelation EQL :arity 2
  :computation (coded ((x y) s.t. (eql x y))))
In general, such relations are expected to be static. If this is not what you want, see the section on derived relations, in particular, the explanation of triggers. See SimpleGenerator and RelAdder for information about updating and generating.

Lisp-Predicate means that the relation is computed by a lisp predicate. This is specified to DefRelation by

:derivation (Lisp-Predicate )
, e.g.,
(DefRelation EQL :arity 2
   :computation (Lisp-Predicate eql))
The parameter supplied to Lisp-Predicate must be the name of a lisp predicate of arity arguments. The new relation is testable, but not generable, assertable, or deletable. The lisp predicate serves as the testing function. The current implementation can be used even for predicates that cause errors, e.g., < is not defined for non-numeric arguments. Testing code will treat any error as an indication that the relation is false. (Be careful if you use your own predicates!)

Lisp-Function means that the relation is computed by a lisp function. This is specified to DefRelation by

:derivation
   (Lisp-Function   )
Fname is the name of a lisp function of input-arity arguments, returning output-arity values. The relation is true of a tuple of length input-arity + output-arity if the values of the function on the first input-arity elements are the last output-arity arguments. [In some cases it will be useful to assign equivalence relations to the output arguments - see Equivalence.] Such a relation is testable and can generate a single value for its output roles given values for its input roles. The current implementation, as above, catches errors. Any input that causes an error in the function is assumed to participate in no tuples of the relation.

BaseType means that the relation is an inherited type - see the description of basetypes in the section on types.

(DefRelation person :derivation basetype)
creates a basetype named person. The :types argument can be used to supply one supertype. (If more are needed they can be added separately.)

Unionclass means that the relation is a unary relation which is the union of several other unary relations.

(Defrelation person (unionclass man woman child))
is another possible definition of person. It is approximately the same as
(Defrelation person :definition 
  ((x) s.t. (or (man x) (woman x) (child x))))
except that recognizes more subtype relations. (In particular, the types man, woman, and child are recognized as subtypes of person.) It is possible to include a comparison (see Equivalence) as in
(Defrelation person (unionclass :equiv equal man woman child))

Intersectionclass means that the relation is a unary relation which is the Intersection of several other unary relations. The syntax is similar to unionclass.

TClosure means that the relation is the transitive closure of another relation. The transitive closure is true of objects x and y if there is a finite sequence (with length at least two) of objects starting with x and ending with y, such that every adjacent pair of objects in the sequence is related by the other relation (called the immediate relation). This is specified to DefRelation by

:derivation (tclosure )

A common variant of transitive closure relates every object (of some type - usually the domain of the immediate relation) to itself. For example, the subtype relation relates every type to itself. This is equivalent to the transitive closure of a new relation defined as:

((subrel superrel) s.t.
 (or (isubtype subrel superrel)
     (and (relation subrel) (eql subrel superrel))))

Cardinality means that the relation counts the number of tuples of some other relation. For purposes of exposition, suppose we have the following relation:

(Works-on% person project percent)
meaning that some percentage of a person's effort is spent on a given project. Suppose further that only non-zero percentages are represented by tuples, i.e., that when a person does not work on a project at all, there is simply no tuple for that person and project. We could derive the relation
(Number-of-Projects person count)
which relates a person to the number of projects he works on by doing:
(DefRelation Number-of-Projects 
  :derivation (cardinality works-on%
                           (input output output)))

In general, every tuple of the original relation corresponds to exactly one tuple of the derived relation. That tuple is obtained by keeping only the elements of the original tuple at the "input" positions, and adding an integer to the end. The integer at the end of a derived tuple is the number of tuples of the original relation that correspond to that derived tuple. Thus, if the pattern consists entirely of output's, the derived relation will contain at most a single one-tuple: the number of tuples in the entire relation. At the other extreme, if the pattern consists entirely of input's, then the derived relation will contain exactly as many tuples as the original relation, but each tuple will have a 1 at the end.

Sum is similar to cardinality, but instead of counting the tuples, the derived relation adds the elements of all the tuples at one particular position. Again using the works-on% relation above, we could derive the relation

(Total% person total)
which relates a person to the sum of his percentages on all projects by doing:
(DefRelation Total% :derivation
   (sum works-on% (input output sum)))

Again, every original tuple corresponds to exactly one derived tuple, and that tuple is obtained by keeping only the elements of the original tuple at the "input" positions. However the number at the end of the derived tuple is the sum of numbers at the "sum" position of all corresponding tuples. The implementation ASSUMES that all tuples will contain numbers in the "sum" position. Notice that it is possible for sum relation tuples to end in zero. There will be a derived tuple if and only if there are any original corresponding tuples.

Extreme is similar to cardinality and sum, but is used for finding extreme values. Again using the works-on% relation above we could derive the relation

(Main-Project person project percentage)
which includes a tuple of works-on% if there is no other tuple for the same person with a larger percentage by doing:
(DefRelation Main-Project :derivation
   (extreme works-on% > (input output extreme))
Notice that the tuples of an extreme relation are a subset of those of the original relation. The >= relation means that a tuple should be included if the object in its "extreme" position is greater or equal to the objects in that position of all other tuples that match in the input positions. Thus there will be at least one tuple for every person in the works-on% relation. There could also be more than one - a person might work 50% on two projects. The < relation would have meant to keep only the tuples with the smallest percentages. The relation <= does the same thing as the relation <. If we change the input to an output then all the tuples will have the same percentage - the highest percentage that anyone works on any single project. The implementation ASSUMES that the ordering relation is transitive and static.

Anonymous relations: (It may be possible to define anonymous stored relations, but these are generally not as useful as derived and computed relations.) Wherever a relation is needed in a wff, it is possible to declare an "anonymous" relation. This is just like a regular relation except that it is declared without a name. EQUAL definitions will refer to the same relations, however. The form of the anonymous relation is similar to defrelation, except that the symbol DEFRELATION is replaced by ANON-REL and there is no relation name argument.

Other implementations may appear in later versions of this manual.

Partial orders

A binary relation (or description) R is a partial order if it is non-reflexive (nothing is related to itself) and transitive. The macros below provide an expression language for defining partial orders out of other partial orders.

It is anticipated that these macros will be used in wffs, e.g., in the :definition argument to defrelation. Having defined a relation, R, that is a partial order, one might use it for such purposes as:

(sort seq #'(lambda (x y) (?? R x y)))

None of the arguments of these macros is evaluated. We use the term "ordering expression" to mean any of the following:

  • the NAME of a relation that is a partial order
  • a description, ((v1 v2) s.t. ...), that is a partial order
  • a list whose macroexpansion is an ordering expression

Macro (inverse-order ordering-expression)
This is just the inverse relation of the argument - if the argument relates x to y then the inverse relates y to x.

Macro (priority-order ordering-expression+)
A priority order P is a sequence of partial orders. The priority order relates x to y, if there is some member of the sequence that relates x to y, and no earlier member relates y to x.

Macro (order-by-class class-expression {ordering-expression})
A class-based ordering C causes members of the class (type) denoted by class-expression to precede non-members. Two members of the class are ordered with respect to one another by the order denoted by ordering-expression, if it is provided. Specifically, C relates x to y if and only if either, x is a member of the designated class and y is not, or both x and y are members of the class, and ordering-expression relates x to y. Class-expression may be either the name of a unary relation or a unary description. The default ordering-expression is the empty ordering -- ((x y) s.t. false).

Macro (image-order ordering-expression image-relation)
The image order orders values indirectly in terms of an ordering on values to which they are related. Image-relation may be the name of a binary relation or may be a binary description. Let R be the relation denoted by image-relation, and O be the relation denoted by ordering-expression. The image order relates x to y if and only if

(E (x' y') (and (O x' y') (R x x') (R y y')))
For an image order to be a partial order, it is sufficient that O be a partial order and that for any value x, all its images under R are interchangeable with respect to O. (The latter is true, in particular, if no value x has multiple images under R.)

Macro (Lexicographic-order ordering-expression)
A lexicographic order orders sequences in terms of an ordering on their positionally corresponding members. Suppose O is the partial order denoted by ordering-expression, and x and y are sequences (lists or vectors). Then the lexicographic order relates x to y if, there is some index, i, such that O relates the i'th element of x to the i'th element of y, and for all smaller indices, j, the j'th elements of x and y are not related (in either order) by O.

Macro (literal-order equivalence . values)
A literal order orders the elements of values according to their position in the list. It relates x to y if the first occurrence of x precedes the first occurrence of y in values, or if x occurs in values and y does not. Equivalence must be the name of an equivalence relation, or a binary description that satisfies the axioms for being an equivalence relation. It is used to define occurrence in the list. (See Equivalence.)

Rules, Constraints and Atomic Transitions

Rules provide a way to prevent or react to certain conditions arising in the database. Not surprisingly (given what has already been described), these conditions are specified in AP5 by wffs. Before rules are described, however, we point out that the presence of rules makes it important to be able to control the size (granularity) of changes to the database. As an example, suppose we have a constraint that every person must have a unique name. If every update (++ or --) were done separately, it would be impossible to create a new person or change the name of a person, since the first step would always result in a person without a unique name. Several updates have to be made "at the same time".

The Atomic construct provides control over the size of database updates. The lisp macro

Macro (atomic form* {ifabort abortform*} {ifnormal normalform*})
evaluates form* in order (barring an abort, described below). However, all attempts to update the database are held off until the end, at which time they are all done "at once". This means that no rule can react to a situation in which some but not all of these changes have been made. Every primitive update, u (a ++ or --) is semantically equivalent to (Atomic u).

As a first order approximation (refined below), we can view an AP5 program as doing a sequence of atomic updates:

(loop while t do (atomic ...))
AP5 expands each atomic update into a loop, something like:
(let (automation-queue)
  [do atomic update]
  (loop while automation-queue do 
        (funcall (pop automation-queue))))
where each atomic update may add activities to the queue, and the activities themselves may do further atomic updates. The order in which these activities are performed is not specified. However, all of them will be performed before control returns to the program making the original update. The mechanism by which activities are enqueued is called automation rules, and is described below.

Furthermore, AP5 expands each (do atomic update) into an inner loop something like:

(loop until [proposed atomic update consistent] do
   [attempt to fix atomic update]
where both consistency and the mechanism for fixing atomic updates are defined by consistency rules, also described below.

At any time inside the form* of an atomic, execution of the macro

Macro (abort tag format-string . objects)
will cause an exit to the (outermost) atomic, with no changes to the database. In this case the elements of abortform* for that atomic are evaluated, and the value(s) of the last returned from the atomic. These may refer to

Variable abortdata
This is bound to a list which starts with the arguments to abort, and then contains additional information (which we will try to formalize in a later edition of this manual), including the data built up so far to represent this transition in the global-history list, documented in the section on transition history. The default ifabort code calls error on the arguments to abort, excluding the tag. The idea is that the tag should be used by abortform* as a classification of the problem, so it can decide how to recover. Of course, it is free to use any other data returned from the abort as well. For atomics without ifabort forms, including database updates that are not contained in an explicit atomic, the format-string provides a way to describe to the user what went wrong.

The following macro is useful for debugging aborted updates:

Macro (from-abort {({:abortdata abortdata} {:updatestatus updatestatus} {:delta delta})} . body)
In its simplest form, when you're in an error caused by an abort (so abortdata is bound), (From-Abort ()) simply puts you in the context of the abort and executes (BREAK). This allows you to evaluate expressions from the "last" inconsistent state before the abort. Updates are not allowed. A saved previous value of abortdata may be used as the value of abortdata and forms to be evaluated in the aborted context may also be supplied.

There are also facilities for retrying and correcting aborted transitions. These are not detailed here because we hope they will be self explanatory.

If there is no abort, then after the database is updated, normalform* is executed and the values of the last are returned from the atomic. If no normalforms are supplied, the values of the successful atomic are those of the last of form*.

Within an atomic, (Atomic form* ifabort abortform* ifnormal normalform*) is equivalent to (progn form* normalform*), i.e., any abort will escape to the outermost atomic. Since atomics are dynamically scoped, the same piece of code might have different effects depending on whether it is executed from inside an atomic.

Macro (insist {explanation-string} . wff)
will cause the atomic to abort (in the end) if wff would not be true in the resulting database. This wff may refer to runtime values which are computed at the time of the insist. For example, if one has the lisp variable x bound to an object which should end up being in relation R, (Insist (R x)) will make sure that it is. The explanation is simply used as an error message if the wff turns out to be false.

Macro (inatomic)
returns a non-NIL value if executed inside an atomic and NIL otherwise. In particular, you can write code that is not allowed to be executed as part of an atomic by doing (when (InAtomic) (Abort ...)).

Although we don't expect it to be needed very often, we mention that at the top level of an outermost atomic, the symbol REVEAL allows things that are done afterward in the atomic to "see the effects" of updates done before the reveal. The effects do not include any rules that would react to those changes. This might be useful if, for example, you want to create a rule and, in the same atomic transition, assert something about that rule. If you do

(atomic (defautomation rule1 ...)
	(++ P (theonly r s.t. (rulename r 'rule1))))
you'll get an error: the form to find the rule named rule1 is being evaluated before there is any such rule. If we do
(atomic (defautomation rule1 ...) reveal
	(++ P (theonly r s.t. (rulename r 'rule1))))
then the rule will be visible when it is needed. Of course, if the atomic aborts then this relation will not survive.

For now we consider it an error for reveal to be used anywhere other than the top level of an outermost atomic. We may later define what it means in inner atomics, in consistency rule responses, etc.

Examining and undoing history

The ability to examine and "back up" through history, undoing database transitions makes it easier to recover from errors (and correspondingly more attractive to experiment). The database transitions are recorded in the global-history list. If all updates are reversible, events can be "undone".

Function (history ...)
is the history browser. It uses the *query-io* stream to display information, ask questions and get answers. Many print control parameters can be set interactively. Initial values for these parameters can be passed in as keyword arguments. These default to the values of special variables which can be set or bound in order to make defaults more permanent. The browser offers two types of "undo", which we'll call "forward" and "backward". Backward undoing corresponds to "going back in time". Only the most recent event can be undone backward, after which it is "forgotten" and the previous event can be undone backwards. Backward undoing ignores rules. Undoing "forward" means doing a new atomic update with the opposite effect of the current event: delete all the stored tuples added and add all the stored tuples deleted. Forward undo's are done with rules in effect. Any event may be undone forward (and in fact the undo may later be undone). Note, however, that forward undo's may abort or trigger automations.

Note: See the note in the section about decaching.

If you plan to back up to a particular point, another mechanism is more useful:

Function (history-label name)
creates an event labelled with name.

Function (undo-to name)
undoes (backward) all events back to (and including) the one labelled with name.

This facility is used to provide a database transaction macro:

Macro (transaction {label} . body)
If the first item in a transaction is a symbol then it is treated as a history label for the beginning of the transaction. A transaction may not be executed inside an atomic. It may contain atomic transactions to be executed sequentially. At any time inside a transaction the form:

Macro (abort-transaction . values)
will cause any transitions done inside the innermost transaction to be undone, and the values will be returned as the values of the transaction. If no AbortTransaction (or throw outside the transaction) is done, the values of the last form in the transaction are used as the values of the transaction. N.B. Any non-local transfer of control from inside a transaction to outside causes an abort-transaction.

Macro (intransaction)
returns T from inside a transaction, NIL otherwise. Transactions may be nested.

Consistency Rules

Consistency rules provide a way to maintain "constraints". Constraints are conditions, expressed by wffs, that are supposed to be true in any state of the database. Each constraint is associated with a consistency rule which reacts to proposed transitions that would violate the constraint by either causing the transition to abort or proposing additonal updates for the transition. We view an atomic update as requesting the database to reach a state in which all facts added by the atomic are true and all facts deleted are false. Therefore consistency rules are not allowed to remove changes from transitions. It also follows that any attempt to add and delete the same fact in the same transition must abort.

Intuitively we want atomic transitions to be augmented only with updates that are needed to satisfy the constraints. Unfortunately there may be alternative ways to satisfy this informal specification. It is also desirable for the augmentation to be easily predicted by the programmer (it helps for it to be deterministic) and relatively efficient for the machine to compute. We provide here a heuristic approximation that in practice seems to meet these requirements.

When a set of changes is proposed as an atomic transition, AP5 determines whether the state resulting from that set of changes would be consistent (satisfy all constraints). If so, the changes are made and the transition is complete. Otherwise, the consistency rules associated with the threatened constraints (those that would be violated by the proposed transition) are executed in an environment that allows access to both the (inconsistent) proposed state and the (consistent) current state. The rules do not see augmentations proposed by other rules. If no rules propose any augmentations, the transition aborts (since it's inconsistent and nothing can be done to fix it). If any rule proposes an augmentation that's incompatible with another proposed update (either original or augmentation), the transition aborts. Otherwise, all the proposed augmentations are made to the proposed set of updates and the result is proposed as an atomic transition (which may trigger more consistency rules).

This algorithm is deterministic (if the rules are deterministic) and in fact the result does not depend on the order in which the rules are considered. However infinite loops are possible, e.g., if there's a rule which reacts to the addition of R(n) for any integer n by adding R(n+1). Next, it is obvious (but important to understand) that the code in a consistency rule must be able to tolerate an inconsistent database. Also, when several cycles of consistency rules are needed, the cycle in which something happens (and thus the length of the path from one rule to another) can make a difference.

As an example, suppose we start in a situation where the (unary) relation P is everywhere false. There are two rules, R1 and R2. R1 requires that
(A (x) (implies (P x) (E (y) (Q y)))). It reacts to a threat of violation by doing (++ Q 1). R2 requires that (A (x) (implies (P x) (Q x))), and reacts to a threat of violation (for x) by doing (++ Q x). Clearly, if we do (++ P 2), the reaction of rule R2 will in some sense make the action of R1 unnecessary. However, the presence of both rules will cause the addition of two instances of Q. suppose, however, that R1 had been replaced by two rules, R1a and R1b, which communicate through a new relation P1: R1a requires that (A (x) (implies (P x) (P1 x))), and reacts by doing (++ P1 x), while R1b requires that (A (x) (implies (P1 x) (E (y) (Q y)))), and reacts by doing (++ Q 1). Now if we do (++ P 2), only one instance of Q will be added. Rule R1a will be triggered and will cause (P1 2) to be added, but after that, R1b will never be triggered, due to the addition of (Q 2).

One particularly important case is that if a rule creates some new object, and it is important that it not create two such objects, then its action should prevent it from triggering (as opposed to making an update that will cause some other rule to do something that will prevent it from triggering). Of course, if the action can be done many times without causing any trouble, such as making the same assertion, then a long path (in terms of consistency cycles) to the solution (in the sense that the rule will no longer trigger) still works.

Another point is that the inability to remove an update from a transition sometimes necessitates the toleration of redundancy. As an example, suppose we have a stored relation (R x y), but we really only want to access its transitive closure, R*. Then we might want to try to keep R minimal, in the sense that there's no need to store (R x y) when (R* x y) can be derived in its absence. However, a rule that prohibited such redundancy would prevent otherwise legal updates, such as

(atomic (++ R a b) (++ R b c) (++ R a c))
An attempt to delete the redundant (R a c) would contradict the original update. This problem can only be tackled by giving some preprocessor access to the entire set of updates. An alternative is to tolerate the redundancy locally (for the transition), but add an automation rule that deletes redundant facts in another transition.

note: We are considering how to provide the ability to install such a "preprocessor" for derived relations, but this is not yet provided.

In order to decide how to react to an inconsistent set of proposed changes, a consistency rule may want to see what has changed. For example, one interpretation of the constraint that nobody have more than one office is that whenever an office is given to a person, if he already had another office then he should be required to vacate it. In order to see what is changed, consistency rules are allowed to use a limited form of temporal reference in the wffs that they use for database access. In particular, the two special forms
(Previously wff) and
(Start wff)
are allowed as wffs. In the absence of these temporal operators a wff refers to the proposed, possibly inconsistent state. Previously refers to the (consistent) state before the atomic transition started. (Start wff) means (and wff (not (Previously wff))), i.e., the wff is proposed to become true. Temporal reference is allowed in the following places:

  • In reactions of consistencyrules
  • In triggers of consistency and automation rules (introduced below)
  • In functions for generating automation invocations
  • In Trigger code (introduced later)

Macro (previously . body)
is also defined as a lisp macro. The forms are evaluated as of the previous state. The macro generates an error if executed outside one of the situations (listed above) where temporal reference is allowed.

There are subtle differences between consistency rules that maintain time sensitive as opposed to time insensitive conditions. For example, if we prohibit (Start P), P is allowed to be true, but it may not become true after having been false. Also, the knowledge that a time insensitive wff was true (or false) before means that certain things need not be checked. For example, if we prohibit (E (x) (P x)), then whenever P becomes true of anything, the condition becomes violated. On the other hand, if we prohibit (Start (E (x) (P x))), then as long as there is already an example of P, adding a new one will not violate the condition. The great majority of consistency triggers will not use Start.

The normal ways to declare constraints are:

Macro (alwaysrequired name trigger {:repair repair} {:enforcement-level enforcement} {:reporter reporter} {:documentation documentation})

Macro (neverpermitted name trigger {:repair repair} {:enforcement-level enforcement} {:reporter reporter} {:documentation documentation})

The only difference is that AlwaysRequired creates a rule that reacts when the condition is (about to be) false, while NeverPermitted reacts when it's true. Functions can create rules with computed names, triggers, etc. by calling the following versions defined as functions rather than macros:

Function (alwaysrequire name trigger reaction enforcement {:reporter reporter} {:documentation documentation})

Function (neverpermit name trigger reaction enforcement {:reporter reporter} {:documentation documentation})
The macro versions do more at compile time (an advantage when you compile to a file). They also compile reactions that are specified as lambda-expressions.

Name is a symbol, trigger is a wff, and reaction is a lisp function. The rule may be interpreted as meaning that the wff should never be true (for neverpermitted) or false (for alwaysrequired). If it threatens to become true, then the reaction is executed in an attempt to "repair" the problem. If the trigger has the form (E vars ...) for neverpermitted or (A vars ...) for alwaysrequired (or macroexpands to such a form), then every binding of vars which serves as an example (and thus violates the constraint) will be passed to a separate invocation of the reaction. The reaction function is meant to do something to maintain the constraint (for this binding of vars).

Function (ignore ...)
is a function which does nothing with any number of arguments. This is useful as a reaction if you don't have a reaction to propose but do not want to automatically abort (it's possible that another rule might fix the problem).

Enforcement is one of {:total, :incremental, :none}, indicating the degree to which AP5 enforces the constraint. Total means that when the constraint is declared, AP5 checks to see that it is satisfied, [Actually, total means to insist that the constraint be satisfied in the state resulting from the addition of the rule.] and also notices (and takes the appropriate action) when it is about to be violated. Incremental means that AP5 assumes the constraint to be satisfied when it is declared but watches for violations. [In different versions, the rule itself may become active at different times, e.g., only after the atomic transition is complete, or after the consistency cycle in which the (proposed) database contains the rule. Of course, in the second case, it will be deactivated by an abort.] None means that AP5 simply assumes the constraint is always satisfied. Other enforcement levels may be supported in the future. Different levels are useful because sometimes it is impossible, impractical or unnecessary to enforce a constraint. The AP5 compiler will feel free to make use of constraints as assumptions, even if it is not enforcing them.

Reporter allows you to specify a function for reporting failures of consistency rules. If an atomic transition aborts because some rules could not be satisfied and if no IFABORT was supplied, this rule-specific error-reporting code is used (if present) to print error messages. The arguments to this function are the same as those for the reaction of the rule.

There's a constraint that only one rule have a given name. If you try to name a new rule with the same name as an old one, the old rule is deleted.

Ruletriggers are like wffdefns in that they are not allowed to contain arbitrary lisp forms to be evaluated. All arguments to relations must be bound variables or constants. (However, see the description of Memo.)

The RuleTrigger and Constraint-Enforcement-Level cannot be altered (other than by creating or destroying the rule). If you want to change them, you can redeclare the rule (with different values in those slots), which will have the effect of replacing the rule.

Example:

(neverpermit 'Two-offices-for-same-person
  '(E (x y person)
      (and (Office person x) (Office person y)
	   (not (eql x y))))
  #'(lambda (x y person) (declare (ignore x))
     (when (?? Previously (Office person y))
	(-- Office person y)))
  :incremental
  :reporter
  #'(lambda (ignore ignore2 person)
     (declare (ignore ignore ignore2))
     (format *error-output*
	     "~A should not have two offices" person)))

Note that when a state is proposed in which p has two offices, the rule will trigger twice - each office has a turn at being both x and y. If both offices are new, there is no reaction, and the condition remains violated - which causes an abort. An alternative reaction could have been

(when (?? Start (Office person x)) (-- Office person y))
which would have aborted from an attempt to add two new offices for a different reason - an attempt to add and delete the same fact in the same transition.

Several constraint idioms have been identified. These have been (or will be) made into relations:

  • subtype constraints, e.g., every integer is a number - these are discussed in the chapter on types
  • type constraints on relations, e.g., the FirstName relation would only relate people to strings - these are also discussed in the chapter on types.
  • disjoint types, e.g., nothing is both a symbol and a number - these will again be described in the chapter on types.
  • count restrictions, e.g., no person can have more than one spouse

Notes:
- Declaration of a rule involves compiling various pieces of triggering code. This tends to take some time.
- It may also generate errors (aborts) on the grounds that the rule cannot be triggered or tested. In some cases it may actually be possible to do what AP5 says it cannot, and I'm willing to see examples. In other cases it is not possible, and so the rule cannot be implemented, at least with the current implementations of the relations involved. See the section on large wffs.

Consistency Checkers

We have generalized consistency rules to allow constraints that cannot be conveniently expressed in our relational language.

Type (consistencychecker rule)
means that rule is one of these generalized constraints. Such rules may be declared by the macro

Macro (defconsistencychecker name description response {:reporter reporter} {:documentation documentation})
Description is either a description that could be used as an automation rule trigger (see the section on automations), or the special description "(nil s.t. true)". On every consistency cycle, every match of the description causes an invocation of response. Unlike a consistency rule, the fact that such an invocation takes place does not necessarily mean that a constraint has been violated. That is decided by the response. Like a consistency response, the response may make updates, do insist's and even abort. If it makes updates (even noops) AP5 considers the current situation to be inconsistent, i.e., it violates the constraint represented by this rule. Otherwise, if the value of the response is NIL, AP5 considers the rule to be satisfied, and any other value indicates that it is violated. This mechanism is clearly more general than consistency rules, but is usually not needed. Also, consistency rules have the advantage that the constraint is explicit in the trigger, while for consistency checkers the constraint is at least partly in the lisp code.

Count Constraints

Cardinality (or "count") constraints express facts such as "every employee has at most one office", or "every employee has at least one office". The general form is:
For every n-tuple of objects, [x1, ..., xn] of types t1, ..., tn (actually, descriptions can be used as well as types), there are count-restriction m+n-tuples in the relation R, which can be formed by inserting m arbitrary objects in specified positions in [x1, ..., xn].

The types and the positions at which objects are to be inserted are specified by "patterns" which are lists of length m+n, which must be the arity of R. The elements of the list are either types or descriptions, or the symbol OUTPUT which stands for an inserted object.

The currenly supported values for count-restriction are:

  • :none There are no such tuples, e.g., while employees may in general be assigned secretaries, there should not be any secretaries who have secretaries: the constraint for the pattern (SECRETARY OUTPUT) is :none.
  • :multiple The relation must contain at least one tuple, e.g., if every employee has an office, the constraint for the pattern (EMPLOYEE OUTPUT) would be :multiple.
  • :optional The relation must contain at most one tuple, e.g., if no employee can have more than one office, the constraint for the pattern (EMPLOYEE OUTPUT) would be :optional.

Of course, "none" implies "optional", and "multiple" is incompatible with "none". However, "multiple" and "optional" are independent. This leads to two more possible cases:

  • :unique Exactly one (both multiple and optional), e.g., every person has one office.
  • :any No constraint, e.g., a person can have any number of children.

Examples:
if we want to restrict the relation Works-On%(person, project, number) so that every employee works some amount on some project, the count-restriction would be :multiple and the pattern would be (employee output output), i.e., for every employee there is at least one project and number such that Works-On%(employee, project, number).
If we want to say that there can be no more than one number for any given person and project, we can use the pattern (employee project output) and the count-restriction :optional.

Cardinality constraints are usually declared as part of a relation declaration, with the :count argument of DefRelation. In fact, the value of this parameter is, in effect, the result of appending together a bunch of argument lists to the Restrict-Cardinality macro described below (except that the name of the relation being declared is left out).

Macro (restrict-cardinality relname pattern {:countspec countspec} {:enforcement enforcement} {:replacing replacing} {:default default} {:too-many-repair too-many-repair} {:too-few-repair too-few-repair} {:delete-any-other-countspecs delete-any-other-countspecs})
is the macro that declares count constraints. Relname is the name of a relation and pattern is a list, each element of which is a typename, description, or the symbol output. Countspec is one of {:any, :unique, :optional, :multiple, :none}. Enforcement is an enforcement level. The default countspec is :any and the default enforcement is :incremental.

If countspec is either :optional or :unique, and replacing is non-NIL, then the rule will have a reaction that deletes old tuples in order to "make room for" new ones. For example, giving a person a new name will cause his old name to be removed if the constraint is declared as replacing. The other side of the coin is that if countspec is either :unique or :multiple, the default parameter can be used to provide a reaction function capable of supplying one or more tuples when the constraint is violated. The default function will be applied to an N-tuple of values that are instances of the types named in the N typed slots of the pattern. It should return, as multiple values, zero or more M-tuples, each M-tuple containing a value for each of the M OUTPUT slots of the pattern. The M+N tuple formed by merging the input with the outputs of the function will be added to the database as the reaction. [If the function wishes to supply NO tuples in a particular case, it should return NO values, not return NIL.]

Other reaction code can be supplied via the arguments Too-Many-Repair and Too-Few-Repair. (Unique constraints are represented as two rules, so both reactions can be specified independently.) The default reactions do nothing.

In the case of Too-Many-Repair violation of an upper bound constraint will result in application of the function. If countspec is :optional, it will be applied to the N values corresponding to the typed pattern elements for which there are too many tuples. If countspec is :none, it will be applied to N+M values. They will be the slots of a tuple that violates the constraint. The arguments will be passed in the order of the slots of the relation, immaterial of which slots in the pattern were typed and which were OUTPUT. The function, like any other reaction function, may assert and retract facts to reestablish consistency. [The Replacing keyword is a special case of a Too-Many-Repair; it is an error to have non-NIL values for both keywords.]

In the case of Too-Few-Repair, the violation of a MULTIPLE constraint will result in application of the function to N values -- the same ones described above for a Default function. The function, like any other reaction function, may assert and retract facts to reestablish consistency. [The Default keyword is a special case of a Too-Few-Repair; it is an error to have non-NIL values for both keywords.]

Delete-any-other-countspecs may be T, NIL, or a list of patterns. T means the same thing as (list pattern). The meaning is that all count constraints for this relation except those for the patterns listed are to be removed. However NIL means not to remove any other constraints.

In order to find out what cardinality constraints apply in a given case, the following are provided:

Function (cardinality-of-pattern rel-or-name pattern)

Function (cardinality-for-tuple rel-or-name i-o-pattern {output-indicator}) The first accepts a relation and pattern and returns a countspec that is implied by existing count constraints. The second is similar but the pattern is altered in that all the types are replaced by objects and the "output" slots are identified by being EQ to output-indicator, which defaults to the symbol OUTPUT but can be supplied if that is one of the objects you want to pass in. This returns a countspec for those particular objects, sort of like asking about all the tuples of the types of those objects and merging the results.

These functions amount to very limited theorem provers. They recognize that count constraints imply similar constraints for more specialized types. They currently do not try to prove subrel relationships for descriptions. The also recognize that "at most n (x,y) s.t. (R a x y)" implies "at most n (y) s.t. (R x b y)". (A similar argument for "at least" is not currently used since it requires knowing that types are non-empty.)

Automation rules

An automation rule reacts to a completed atomic transition. Whatever it does cannot be part of that transition. The reaction of an automation rule does not have access to two states, like that of a consistency rule, but it can be sensitive to transitions because its trigger can refer to and thereby bind variables (available to the response) from two states. In fact, automations are required to trigger on transitions rather than just states. They are not supposed to trigger on transitions that have nothing to do with the trigger. Formally, a wff is not allowed as the trigger for an automation if there is some situation in which that wff would be true regardless of the previous situation (or some situation that would make it true of the next transition regardless of the next state). [I think this is the right condition, but I haven't yet proven it.] This implies that the wff must refer to both the current and previous state. (Actually, in the current version, it must use Start.)

(Not (Start P)) is not a legal trigger for an automation, because there are states (namely those in which P is false) for which it is true regardless of the previous state.

(Start (Not P)) is allowed.

(Or P (Start Q)) is not allowed, since if P is true then the condition is true regarless of the previous state.

(And P (Start Q)) is allowed.

Macro (defautomation name trigger action {:documentation documentation})
This is analogous to NeverPermitted.

Unlike consistency rules, the triggers for automation rules are descriptions of the form (vars s.t. wff). The action is evaluated for every binding of vars such that wff is true. This is the normal way to create automations.

Automation Generators

Sometimes it is inconvenient for an automation rule to have no access to the atomic transition other than the variables matched in its trigger. For example, one might want to print a warning that the following set of objects just entered the relation P, and not be satisfied with a separate warning for each. For this purpose we have generalized automation rules in a way analogous to consistency checkers.

Type (AutomationGenerator rule)
means that rule is such a generalized rule. These rules are declared by the macro

Macro (defautomationgenerator name description response {:documentation documentation})
Description is either a description that could be used as an automation rule trigger, or the special description "(nil s.t. true)".

On every state transition, every match of the description causes an invocation of response. Unlike automation rules, the response has access to both the previous and new states. However, it is not allowed to do updates. Rather its job is to decide what invocations should be run as automations later. In the above example, the rule that wanted to print a warning would have as its trigger (nil s.t. (E (x) (start (P x))). Its response would then collect all the data and enqueue as an automation a program to print the single warning. This last action is performed by

Macro (enqueue-automation . body)
which creates a closure out of body to be executed as an automation.

Note:
Recall that closures over loop variables do not necessarily refer to the values of those variables at closure time! If you find yourself writing enqueue-automation inside a loop, you probably want to copy the loop variables, e.g., rather than
(loop for x s.t. (P x) do (enqueue-automation (F x))), you want
(loop for x s.t. (P x) do (let ((x x)) (enqueue-automation (F x)))).

Types

In AP5, the term "type" is synonymous with "unary relation". We have endeavored to provide the generally useful features normally associated with types. Of course, one typically wants to be able to test whether an object is of a particular type, or generate the objects of some type. These capabilities are provided by virtue of types being relations.

Subtypes

Relation (subtype t1 t2)
means that t1 is a subtype of t2, i.e., every object of type t1 is also of type t2. Notice that this means every type is a subtype of itself. In fact, Subtype is a special case of the more general relation:

Relation (subrel subrel superrel)
Means that every tuple satisfying the first relation satisfies the second. (Thus SubType is a subrel of SubRel.) Subrel is actually derived from a relation named ISubRel (for "immediate subrelation"). The recommended way to update the relation (type) lattice is to update SubRel (or SubType, for types). The deletion code for SubRel deletes any ISubRel tuple and insists that the SubType tuple be false. This will abort if the SubType tuple is implied by other ISubRel tuples. The solution is to atomically delete all of the relevant SubRel tuples.

Notes:
Other semantics for updates of SubType are certainly reasonable. We choose this one because it generates errors when there is some chance that you're about to do something you didn't mean. Of course you can change this behavior if you wish by changing the update macros for SubType.
There is an automation rule that keeps the ISubRel relation minimal, i.e., deletes subrelation constraints implied by other ones.
Also, there's currently a rule that prohibits different relations from being subrelations of each other. Another way of declaring synonyms is to just give one relation two names. It's not clear whether this rule is a good idea, given that synonyms present other problems.
There is also a primitive "classifier" that attempts to recognize subtype relations between defined types and other types. (More generally, it attempts to recognize type constraints for defined relations from their definitions.) This is currently implemented as an automation rule which prints messages to tell you what it has added.

The initial type hierarchy includes:

 Entity (true for any object)
   Number (numberp)
     Integer (integerp)
   String (Stringp)
   Function (Functionp)
   Cons (Consp)
     Description ((vars s.t. wff) as in defined relation)
   Symbol (Symbolp)
   other lisp types
   DBObject 
     MatchNode 
     Rule 
       AutomationRule
       ConsistencyRule
       MaintainRule
     Relation 
       Type 
         BaseType (see below)

BaseTypes (inheritance)

In some applications (especially those where data tends to be called "knowledge") various forms of "inheritance" are desired. In the case of types the usual requirement is that by virtue of asserting that an object is of some type it should become an element of all supertypes. In AP5 this is accomplished by (you'd never guess!) first order logic definitions of the types involved. Types that "inherit" elements from their subtypes are called BaseTypes (which is an unfortunate name in that they have little to do with "base relations"). Basetypes are defined in terms of the SubType relation and another relation, named Classification. In particular, if some type, T1, is a basetype, it is defined as the set of objects, x, such that the Classification relation holds between x and some subtype T2 of T1 (which might be T1 itself).

Typically, after declaring a baseype (with DefRelation ... :derivation basetype) one makes a number of SubType assertions.

Type Constraints

A common use for constraints is assuring that objects in a particular slot of a particular relation are of a particular type. For example, to be sure that the relation (PhoneNumber x y) relates objects only to phone numbers (identified by the relation (Phone x)), one might do something like:

(Alwaysrequire 'Phone-Type-Restriction 
  '(A (x y) (implies (PhoneNumber x y) (Phone y)))
  'ignore :total)

This would abort something like (++ Phone (DBO Don) nil) (assuming NIL is not a phone), or (-- Phone 226) (if 226 was the Phone of someone), but would not abort something like
(Atomic (++ Phone (DBO Don) 'bam) (++ Phone 'bam)).

Type constraints are normally declared along with a relation as part of the DefRelation syntax. A list of types is supplied as the :types parameter. Enforcements can also be supplied (see DefRelation for details).

The default reaction code does the following: if the object in the constrained slot just ceased to be of the required type, the (now illegal) tuple is removed from the constrained relation. The intent is that when an object ceases to be of some type, any tuples involving that object, which therefore become illegal, will be removed. (AP5 does not support the notion of destroying an object.) The default reaction does nothing to react to attempts to put an object of the wrong type into a constrained relation - the attempt is allowed to abort.

Disjointness constraints

Another useful bit of domain model is the fact that certain types are disjoint, i.e., they have no common elements.

Macro (defdisjoint {enforcement} . types)
asserts that all the types are pairwise disjoint. The types may be type names or unary descriptions. The default enforcement is :none.

Programming constructs using types

The following macros have been defined to facilitate the use of types in lisp code:

Macro (declare-type . declarations)
Each declaration is a list of the form (typename . variablenames). The binding of each variable is checked to be sure it has the given type.

Macro (returning typename . body)
The body is evaluated, and the value returned is checked to be sure it has the given type. Multiple values can be checked by supplying a typename of the form (values typename1 typename2 ...).

Macro (defun-typed name lambda-list . body)

Macro (let-typed bindings . body)

Macro (prog-typed bindings . body)
All of these act just like their untyped versions, except that variables of the form type#name or type# (just like those allowed in wffs) are type checked on entry (see esoteric syntax).

Macro (aptypecase key . clauses)
This evaluates key and then looks for a clause whose CAR is the name of a type of that object. The CAR can also be a list of typenames, in which case the key matches if it is of any of those types. Finally, a CAR of T or OTHERWISE matches no matter what the key is. If such a clause is found, the remaining elements are evaluated, and the value(s) of the last returned.

Macro (wff-if . ifform)
This uses IF-THEN-ELSEIF-ELSE syntax (like interlisp if), but the conditions are wffs, not lisp expressions. Also, for any condition that starts with an existential quantifier, the corresponding forms (after the then) may use the existential variables freely. Omitted THEN clauses are treated like THEN T. Also, an omitted final ELSE clause is treated like ELSE T.

Type dependent I/O

The form in which DBObjects (not general lisp objects) are printed is dependent on their types.

Relation (proper-name dbobject symbol)
relates DBObjects to their proper-name(s).

Macro (dbo . identifying-data)
accepts a proper-name (if it is the proper-name of a unique object), and the printer for DBObjects prints a proper-name of the object if there is one. Notice that the proper-name of an object is a symbol, with a package. The printer for DBObjects does not print the package prefix, but the reader is sensitive to the package in which the name is read.

For particular types of DBObjects, the syntax (DBO type name) evaluates to the object of the given type with the given name, e.g.,
(DBO RELATION relationname) evaluates to the named relation,
(DBO TYPE typename) evaluates to the named type
(DBO RULE rulename) evaluates to the named rule.

Function (relationp x)
returns x if it is a relation, the relation named x if there is one, and otherwise NIL.

To print a DBObject (without a proper name), AP5 looks for a type of the object which has a printer function which is willing to do the printing. This function should accept the same arguments as the common lisp :print-function option of defstruct. It should return a string to print or the value NIL. If it returns NIL the search for a printer continues. The printer function is declared by adding tuples to the relation:

Relation (printer type function)

The suggested convention is that printers should print #,(DBO . x), where x is a list whose car is a type, and that types be given readers that can translate the list back into the object when invoked by DBO. Readers are similarly defined by:

Relation (reader type function)
A reader function should accept a list whose car is a type, and return either a DBObject or NIL. If it returns NIL (which is not a DBObject), the search for an acceptable reader will continue. As a convenience to those who write printers,

Macro (printdbo . x)
will print #,(DBO . x). AP5 supplies reader and printer functions for relations and rules, e.g., the way to refer to a relation object is #,(DBO Relation name). Notice that, since relations such as Type, Subtype and Printer all relate relations (rather than their names), DBO may be useful in your attempts to use AP5.

If there is no acceptable printing function, the printer prints
#,(DBO DBOBJECT address), which cannot be read back in.

The notion of "the types of an object" is really not first order, and so is not available as a relation. However, we recognize its usefulness, as in the case of printing and other "object-oriented" applications.

Function (all-types-of object)
returns a list of all the types of object, even those that are supertypes of others.

Function (most-specific-types-of list-of-types)
returns a list of the elements of its argument that are not supertypes of other elements.

Examples

(DefRelation person :derivation basetype)
NIL
(listof type)
(#,(DBO TYPE PERSON) #,(DBO TYPE RULE) ...)
(DefRelation employee :derivation basetype)
NIL
(++ Proper-Name (make-dbobject) 'Don)
NIL
(++ Classification (DBO Don) (DBO Type Employee))
NIL
(loop for p s.t. (person p) collect p)
NIL
(++ Subtype (DBO Type Employee) (DBO Type Person))
NIL
(listof person)
(DON)

Equivalence

In describing the semantics of AP5's virtual database, we have had to appeal to the concept of two references to values being references to "the same value". Unlike logicians, lisp programmers regularly deal with several theories of equality. For example they sometimes compare objects with EQ (in common-lisp, more often EQL), and sometimes with EQUAL. If we do (++ R '(1 2 3)), we might want (?? R '(1 2 3)) to return T, even though the reader considers the two lists to be different. It would be much more painful (and less efficient) to have to ask
(?? E (x) (and (R x) (Equal x '(1 2 3)))).

We will only refer to the equivalence of two values with respect to some equivalence relation. We use the term "equivalence relation" in the mathematical sense: a binary relation that is transitive, reflexive and symmetric. In addition we impose the restriction that equivalence relations are never updated. (Mathematics does not deal with updates to relations. The reason for this restriction will be described later.) The term "reflexive", meaning that every object is equivalent to itself, actually makes sense only in the context of some primitive "identity" relation. In AP5, as in lisp, that relation is EQ. Equivalence relations partition the universe of lisp values into disjoint equivalence classes. Two n-tuples of objects, W and V, are said to be equivalent with respect to an n-tuple of equivalence relations, E, iff for each i from 1 to n, Wi and Vi are equivalent with respect to (related by) Ei.

Non-uniform Equivalence

Up to now we have spoken of a relation in the database as arbitrarily partitioning tuples of lisp values into TRUE tuples and FALSE tuples. We now refine this concept. For each relation R, there is a fixed (not varying over time) n-tuple of equivalence relations E(R), where n is the arity of R. We will call this the equivalence signature of R. For any two n-tuples of values which are equivalent with respect to E(R), R must be either TRUE of both or FALSE of both. Intuitively, R cannot distinguish between tuples that are equivalent with respect to its signature. R may be thought of as mapping each tuple of equivalence classes C=(c1, ..., cn) to TRUE or FALSE, where ci is the equivalence class defined by E(R)i. Lisp objects are used merely as representatives of these classes. Thus the concept of object really corresponds to an equivalence class.

Adding or deleting tuples must make all equivalent tuples true or false. It is not possible to add and delete equivalent tuples in an atomic transition (any attempt to do so will abort). We can now explain why equivalence relations are not to be updated. Suppose the relation R uses equivalence relation E and that R holds for object o1 but not for object o2. If E starts to hold between o1 and o2 then clearly R has to change, but there seems to be no basis for deciding whether R should start to hold of o2 or cease to hold of o1.

The concept of equivalence also has semantic import for generating. When we generate (x1 ... xn) s.t. (R x1 ... xn) we obviously want each generated tuple to satisfy R, i.e., the result of testing R on the tuple would be TRUE. The other requirement is that for each tuple of objects satisfying R, exactly one equivalent tuple should be generated.

Relation (compare-relation relation position equivalence-relation)
describes the equivalence signatures of relations. (Positions start with 0 and end with one less than the arity of the relation.) This should be asserted of a relation as soon as it is declared (and before it is used). In the absence of such a fact, the default comparison is Eql.

This solves the problem of equivalence for primitive stored relations. The remaining problem is what to do about wffs (including defined relations, which are defined by wffs). As an extreme example of the problem, suppose we have a relation R with the equivalence signature [Same-Color, Same-Size]. What should be the result of generating x s.t. (R x x)? If we regard R as relating equivalence classes, we're looking for things that are both Same-Color and Same-Size equivale