AP5 Reference Manual

author: Don Cohen don@isis.cs3-inc.com
The mail will bounce but the bounce message will give you a working address.
last revised: 2021/02/12

Introduction
Basics
Annotations, representations and derivations
Rules, Constraints and Atomic Transitions
Types
Equivalence
Finer Control over Implementation
Database representation of domain model
Miscellaneous
Appendices
index

Introduction

AP5 is an extension to commonlisp which allows users to "program" at a more "specificational" level. By this we mean that it allows you to focus more on what you want the machine to do and less on the details of how. AP5 is a compromise between lisp and the gist specification language. In particular, it incorporates those parts of gist that can be reasonably compiled. Unlike gist, it allows programmers to deal with performance issues. There are two mechanisms for this. One is that the compiler can be instructed via "annotations", descibed below. The fallback mechanism is to use lisp when necessary.

AP5 represents state (that is, data) as a set of relationships among a set of objects, as in a model of first order logic or a relational database. The language for accessing this data includes the language of first order logic. We will refer to it as FOL+. (The ways in which it extends the language of first order logic are described in many pieces spread through this manual.) A relation is a set of tuples, which are sequences of objects. All tuples of a relation have the same length (number of objects) which is called the arity of the relation. The combination of a relation and a tuple of the appropriate length may be regarded as a primitive sentence, or fact, which is either true or false. As in first order logic, there is no notion of a fact being "unknown" or "meaningless". If the relation < is the set of pairs of numbers such that the first is less than the second, it makes perfect sense to ask whether the tuple [John, Susan] is in the < relation - the answer is no. We assume that the set of objects (and the set of tuples) is not finite. The set of objects should be thought of as including all potential lisp objects, e.g., all numbers. A more precise definition of object appears in Equivalence.

AP5 programs can add and remove relationships among objects. They can also test (determine the truth value of) conditions expressed in FOL+ and iterate through objects that satisfy conditions expressed in FOL+. If it is assumed that relations do not change other than by AP5 mechanisms (this is the default assumption, but the reverse can be declared) then AP5 can provide automatic enforcement of FOL+ conditions and automatic invocation of programs when FOL+ conditions become true (see Rules).

AP5 programs are normally built in two phases. The first is specification, in which the programmer invents relations and rules and uses them to write programs that "do the right thing". In the second phase the program is optimized by adding appropriate annotations, such as representations or sizes of relations. These annotations influence the AP5 compiler's algorithm selection. They do not affect the meaning of the programs. They affect the source text only in very minor ways so the annotated programs need not be debugged again. Thus the programmer initially concentrates on behavior without worrying about efficiency concerns, and later can easily experiment with different performance tradeoffs. AP5 provides a fairly large library of annotations from which to choose, and allows programmers to add others.

AP5 differs from theorem proving systems and "logic programming" languages in that it does not allow assertion of arbitrary conditions, such as "all men are mortal". It may, however, be able to check whether all men are mortal, and if so, prevent this fact from becoming false. Another difference is that AP5 does not (normally) do unbounded searches. Queries that may cause theorem provers to start infinite computations cause AP5 to generate compile time errors.

Organization of the manual

This manual assumes that readers are familiar with lisp. The material above is meant to provide a general idea of what AP5 is. The next chapter tells a novice what he needs to know in order to browse and start writing programs that use the database. More serious users will be interested in the middle of the manual - rules, annotations, types and equivalence. The end of the manual is meant to contain details that even serious users will rarely need.

Basics

Wffs

At any moment only finitely many relations have been declared. These will be referred to as "primitive relations". Well-formed formulae (wffs) are expressions built from primitive relations, connectives (NOT, AND, OR, IMPLIES, EQUIV, XOR), quantifiers (A and E, meaning For-All and There-Exists), variables (which are identified by context) and lisp expressions. If R is an n-ary primitive relation (or the name of one), then any list with R as the CAR and a list of length n as the CDR is a "primitive wff", denoting the relation R applied to the arguments in the CDR. More syntax is introduced later [In particular, if you run across Start, Previously, or symbols containing the characters #, @, $, =, or ?, see Rules, Types, Equivalence and esoteric syntax.], but these need not be understood in order to write most wffs. Other than quantifiers, connectives, and relations, the only thing you need to understand is the interpretation of arguments. Each argument is either a constant such as 12 or '(this list), a symbol which names a bound variable (if the primitive wff is in the scope of the binding), or a lisp expression. Thus, in the wff

    (E (x) (R 'John x (f x)))

'John is a constant, x is a bound variable, and (f x) is a lisp expression. If we were to ask for the truth value of this wff, the expression (f x) would be evaluated first - note that the x refers to some lisp variable bound outside the wff. We would then find out whether the relation R contained any 3-tuple starting with the symbol John and ending with the value computed for (f x).

The set of wffs includes TRUE, FALSE, (AND wff*), (OR wff*), (IMPLIES wff wff), (EQUIV wff wff), (XOR wff wff), (NOT wff), (A Vars wff), and (E Vars wff), where Vars is a list of distinct symbols representing bound variables. Wffs containing quantifiers, connectives, and a few other constructs introduced later are called compound wffs, while other wffs are called primitive.

Lists of the form (Vars s.t. wff) are called descriptions if wff contains no lisp forms to be evaluated, i.e., the arguments to relations must be variables or constants. These may be used as relations, e.g. (and (P x) (((y) s.t. (Q y)) x)) is a wff which means the same thing as (and (P x) (Q x)). We regard a description as a primitive relation if its wff is a primitive wff.

Descriptions are analogous to lambda expressions in lisp. The syntax of wffs also includes Funcall and Apply, analogous to the lisp functions of those names. A wff whose CAR is the symbol Funcall is interpreted by evaluating the second element and treating it as the relation to be applied to the (values of the) other arguments. A wff whose CAR is the symbol Apply is interpreted by evaluating the second element and treating it as a relation to be applied to an argument list that is constructed by evaluating all the other arguments and appending all but the last to the last one. Thus (Apply r x (list y z)) is usually equivalent to (Funcall r x y z). [The exception arises when y or z are used as bound variables. Apply wffs always treat their last arguments as lisp forms to be evaluated, whereas in funcall wffs, as in normal wffs, the arguments may refer to bound variables. Thus (E (x) (funcall r x x)) means that there is an x that is related to itself by the relation which is the value of r, whereas (E (x) (apply r x x)) means that the relation which is the value of r has at least one true tuple which can be obtained by putting something in front of the value of the lisp variable x.] We regard a wff starting with funcall, or apply as primitive when the value of the second element is a primitive relation.

Relations can also be defined by computations that cannot be expressed in wffs. Descriptions, in fact, may be regarded as a special case of relation definition. Just as it is possible to replace a relation in a wff with a description, it is also possible to replace it with a more general definition. This is described in more detail in the section on anonymous relations.

Simple Database Operations

The simplest ways to access a database are to ask whether something is true or to add or delete facts:

Macro (++ . primitivewff)
makes primitivewff true.
[The notation (A . B) refers to a list whose CAR is A and whose CDR is B.]

Macro (-- . primitivewff)
makes primitivewff false

Macro (?? . primitivewff)
returns NIL if wff is false, something else if it's true.

For example, if there were a binary relation named R, one would make it relate the symbol George to the number 3 by doing (++ R 'George 3). One could find out whether George was related by R to anything at all by doing (?? E (x) (R 'George x)).

Examples: (copy and paste from between brackets to try) [ (in-package :ap5)(defrelation r :arity 2) (?? r 'george 3) (?? e(x)(r 'george x))(++ r 'george 3)(?? e(x)(r 'george x)) (?? r 'george 3)(-- r 'george 3)(?? e(x)(r 'george x)) ]

> (in-package :ap5)
#<PACKAGE AP5>
> (defrelation r :arity 2) ;; define binary relation R
[defrel R]
#,(DBO RELATION R)
> (?? r 'george 3)
NIL
> (?? e(x)(r 'george x))
NIL
(++ r 'george 3)
NIL
> (?? e(x)(r 'george x))
T
> (?? r 'george 3)
T
> (-- r 'george 3)
NIL
> (?? e(x)(r 'george x))
NIL

Normally, non-primitive wffs can be tested but cannot be directly updated.

There is also an "update" macro that combines addition and deletion:

Macro (update relation of expression* {from expression*} to expression*)
The syntactic keywords OF, FROM, and TO may be in any package. Relation must be the name of a relation, the arity of which is the number of expressions following OF plus the number following TO. If FROM is supplied it must be followed by the same number of expressions as TO. The tuple formed by appending the OF expressions and the TO expressions is added to the relation. That tuple is allowed to have already been in the relation. If FROM is supplied, the tuple formed by appending the OF expressions and the FROM expressions is removed from the relation. If that tuple was not initially in the relation, update generates an error. If FROM is not supplied, and the tuple to be added was not initially in the relation, then if there is any tuple in the relation starting with the OF tuple, an arbitrary such tuple is removed. The updates are done atomically (see transition). Update is meant primarily for situations where the set of values following OF forms a key of the relation, in which case the result is deterministic.

Thus, for example,

    (update phone of 'George to 126)

will make the phone of George be 126, and if George originally had another phone, one such phone will be removed.

    (update phone of 'George from 225 to 126)

requires that his phone was initially 225. It removes 225 and adds 126 (although the addition may be a noop).

Examples: (copy and paste from between brackets to try) [ (in-package :ap5)(defrelation phone :arity 2)(listof phone) (update phone of 'George to 126) (listof phone) (update phone of 'George to 225)(listof phone) (update phone of 'George from 225 to 126)(listof phone) (update phone of 'George from 225 to 128) ]

> (in-package :ap5)
#<PACKAGE AP5>
> (defrelation phone :arity 2) ;; define binary phone relation
[defrel PHONE]
#,(DBO RELATION PHONE)
> (listof phone) ;; list all phone tuples
NIL
> (update phone of 'George to 126)
126
> (listof phone)
((GEORGE 126))
> (update phone of 'George to 225)
225
> (listof phone)
((GEORGE 225))
> (update phone of 'George from 225 to 126)
126
> (listof phone)
((GEORGE 126))
> (update phone of 'George from 225 to 128)

*** - required precondition not true: PHONE (GEORGE 225)

In addition there is a macro for creating an object and relating it to several other objects:

Macro (create type {relation = expression*}*)
As an example,

    (create person firstname = "John" lastname = "Smith"
            parents = mother father)

In a single atomic transition (again, see atomic), this creates a new object and adds a number of facts about it. In this case we can see its translation as follows:

Examples: (copy and paste from between brackets to try) [ (in-package :ap5) (defrelation person :arity 1) (defrelation firstname :arity 2) (defrelation lastname :arity 2) (defrelation parents :arity 3) (macroexpand-1 '(create person firstname = "John" lastname = "Smith" parents = mother father)) ]

> (in-package :ap5)
#<PACKAGE AP5>
> (defrelation person :arity 1)
[defrel PERSON]
#,(DBO TYPE PERSON)
> (defrelation firstname :arity 2)
[defrel FIRSTNAME]
#,(DBO RELATION FIRSTNAME)
> (defrelation lastname :arity 2)
[defrel LASTNAME]
#,(DBO RELATION LASTNAME)
> (defrelation parents :arity 3)
[defrel PARENTS]
#,(DBO RELATION PARENTS)
> (macroexpand-1
 '(create person firstname = "John" lastname = "Smith" parents = mother father))
(ATOMIC
 (LET ((PERSON (MAKE-DBOBJECT)))
   (++ PERSON PERSON)
   (++ FIRSTNAME PERSON "John")
   (++ LASTNAME PERSON "Smith")
   (++ PARENTS PERSON MOTHER FATHER)
   PERSON)) ;
T

In the current version (we hope to fix this!) all create's allocate a lisp object of the same representation, called dbobject (for Data Base Object). It is still possible to do your own allocation, but the create macro does not help you. Several of the relation representations presented below require this representation.

Function (make-dbobject)
creates and returns a new DBObject.

Declaring Relations

The generally useful form for declaring a new relation is DefRelation. We provide a brief overview here, although the details are postponed for later. For a beginner it is sufficient to know that an n-ary relation can be declared by supplying only a name and arity, e.g.,

Examples: (copy and paste from between brackets to try) [ (in-package :ap5) (DefRelation PhoneNumber :arity 2) ]

> (in-package :ap5)
#<PACKAGE AP5>
> (DefRelation PhoneNumber :arity 2)
[defrel PHONENUMBER]
#,(DBO RELATION PHONENUMBER)

Typically one provides documentation and declares the types (unary relations) required of objects allowed in the tuples, e.g.,

Examples: (copy and paste from between brackets to try) [ (in-package :ap5) (defrelation person :arity 1) (DefRelation PhoneNumber :arity 2 :types (person integer) :documentation "(PhoneNumber x y) means that the integer y is the phone number of the person x") ]

> (in-package :ap5)
#<PACKAGE AP5>
> (defrelation person :arity 1)
[defrel PERSON]
#,(DBO TYPE PERSON)
> (DefRelation PhoneNumber :arity 2 :types (person integer)
 :documentation "(PhoneNumber x y) means that the integer y is the phone number
 of the person x")
[defrel PHONENUMBER
WARNING: Estimated size ...
]
#,(DBO RELATION PHONENUMBER)

The other common form that will be useful to a novice is defining a relation by a description, e.g.,

Examples: (copy and paste from between brackets to try) [ (in-package :ap5) (defrelation child :arity 2) (DefRelation sibling :definition ((x y) s.t. (E (parent) (and (child parent x) (child parent y) (not (eql x y))))) :documentation "(Sibling x y) means x and y are distinct and share a parent") ]

> (in-package :ap5)
#<PACKAGE AP5>
> (defrelation child :arity 2)
[defrel CHILD]
#,(DBO RELATION CHILD)
> (DefRelation sibling
  :definition ((x y) s.t. (E (parent) (and (child parent x)
                                           (child parent y)
                                           (not (eql x y)))))
  :documentation
  "(Sibling x y) means x and y are distinct and share a parent")
[defrel SIBLING]

In this case the arity and types are not needed - they can be deduced from the definition. However they can be provided if you like.

When a DefRelation form is evaluated for an already existing relation, the assumption is that its definition is being changed, and the result ought to conform to the new definition.

As a preview and incentive to keep reading, we mention that DefRelation allows you to specify such things as

data representions of relations, e.g., hasharray, alist, etc.
relations that are computed or derived from other relations, e.g., transitive closures
cardinality constraints, e.g., every person has at least one office
how many tuples you expect to populate a relation, e.g., the typical person will be the child of 2 parents and the parent of 3 children.

Relation Names

DefRelation is analogous to Defun in lisp. It "binds" a name to a relation object, which is analogous to a lisp function object. Furthermore, it is useful not only for declaring new relations, but also for changing already existing relations. To emphasize this analogy we provide the following functions:

Function (rboundp symbol)
This is analogous to lisp Fboundp - it returns T if the argument is the name of a relation and NIL if not.

Function (symbol-relation symbol)
This is analogous to lisp Symbol-Function - it returns the relation object of a given name. It signals an error if there is none.

Function (rmakunbound symbol)
This is analogous to lisp Fmakunbound - it removes the association between a name and any associated relation object. This does not "destroy" the relation object, however. The closest thing in AP5 to "destroying" a relation is (-- Relation relation), which makes it no longer be a relation.

The functions above are trivially defined in terms of this relation:

Relation (relationname relation symbol)
means that symbol is the name of the relation object relation. Relations are normally named, in order to make them easier to refer to. (Actually, names are also needed in order to compile to a file.) Relation names are just symbols. It is possible to change the name of a relation, but, as in lisp, this tends to cause problems, since compiled files identify relations by name. The recommended way to create aliases for relations is via defined relations.

Generators

Typically a database is used not just to test whether a particular fact is true, but to find out what facts are true. The general mechanism for this is called generators. For any sequence of variables, V, (a list of distinct symbols), and wff, W which uses (only) the variables in V freely, there is a set of tuples that satisfy W, in the sense that substituting the values for corresponding variables results in a wff that is true. The usual mathematical notation for this set is {V | W}. AP5 provides access to this set via the loop keyword s.t., standing for "such that", which is preceded by V and followed by W. As a special case, if V is a symbol (other than NIL) it is treated as a list whose only element is that symbol. NIL is treated as a list of no variables. W may contain arbitrary lisp expressions as arguments in its primitive wffs. Within the loop body, the variables may be lexically accessed by name. It is considered bad style to assign to the variables within the loop body (and we do not promise that the iteration will work if this is done). The iteration may be terminated by executing RETURN from lexically within the loop body. Variables may be introduced with "#" and "@" notation (see esoteric syntax).

We mention in passing the degenerate cases:
(loop for nil s.t. true do body) will do the body once.
(loop for nil s.t. false do body) will not do the body at all.

For fans of the commonlisp DO macro and its relatives, an AP5 analog is also provided:

Macro (do-s.t. (vars wff {result-form}) . form*)
iterates over bindings of the variables vars satisfying wff. The remarks above all apply here as well. For each tuple of values generated for the variables from the wff, the statements are executed with those values lexically bound to the variables. If the generator terminates, the result form is executed and its value(s) becomes the value(s) of do-s.t.. It is NOT executed if a statement RETURNs from do-s.t.. Within the result-form, the variables are not lexically bound by the do-s.t. form; values, if any, from the enclosing environment are seen.

Useful generator macros

The following common uses for generators have been encapsulated as macros:

Macro (any vars {s.t. wff} {ifnone form*})
If there are values for vars that satisfy wff, then one such set of values is returned (multiple values if multiple vars), otherwise the ifnone forms are evaluated, and the result(s) of the last are returned. By default, ifnone generates an error. Vars may be either spliced in or a list of symbols, e.g., (any x y z s.t. ...) = (any (x y z) s.t. ...). Finally, if there is no s.t. in the form, the condition defaults to true. [Of course, "s.t. true" only makes sense if the variables are typed, e.g., (any relation) will return a relation, whereas (any x) would cause an error.]

Macro (theonly vars {s.t. wff} {ifnone form*} {ifmany form*})
is similar, but if there is more than one set of values for vars the ifmany forms (which also default to an error) are evaluated and the last value(s) returned. The ifmany forms do not have access to bindings of variables that satisfy the wff. Again, the wff defaults to true.

Macro (forany vars {s.t. wff} form* {ifnone form*})
If there is any set of values for vars, then the forms are evaluated, with the value(s) of the last being returned. The variables of vars are bound to some set of such values. If there is no such set of values, the ifnone forms are evaluated (the variables are not bound here) and the result(s) of the last returned. Again, ifnone defaults to an error, and wff defaults to true.

Macro (fortheonly vars {s.t. wff} form* {ifnone form*} {ifmany form*})
is analogous.

Macro (listof vars {s.t. wff})
If there is only one variable in vars this returns a list of the values for that variable that satisfy wff. If there are multiple variables it returns a list of lists of values (in the same order as the variable list). (This macro was written because I always found myself typing its expansion in order to browse the database. Perhaps another form would be better for programming.) Again, (listof relation) will return a list of all relations.

Macro (set-listof vars s.t. (relation . args) form)
The intent of this macro is that (listof vars s.t. (relation . args)) should return the result of evaluating form. It only works if the relation can be added and deleted. This is meant primarily as an initialization form. (See following example.)

Examples: (copy and paste from between brackets to try) [ (in-package :ap5) (defrelation phone :arity 2) (set-listof phone '((george 123)(sam 456)(sam 123))) (listof phone) (loop for (x y) s.t. (phone x y) collect (list x y)) (loop for (y) s.t. (phone 'sam y) count t) (loop for (y) s.t. (phone 'sam y) as i below 1 do (print y)) ]

> (in-package :ap5)
#<PACKAGE AP5>
> (defrelation phone :arity 2)
[defrel PHONE]
#,(DBO RELATION PHONE)
> (set-listof x y s.t. (phone x y) '((george 123)(sam 456)(sam 123))) ;; provide data
;; above is equivalent to (set-listof phone '((george 123)(sam 456)(sam 123)))
NIL
> (listof phone)
((SAM 123) (SAM 456) (GEORGE 123))
> (loop for (x y) s.t. (phone x y) collect (list x y)) ;; similar to listof
((SAM 123) (SAM 456) (GEORGE 123))
> (loop for (y) s.t. (phone 'sam y) count t) ;; count sam's phones
2
> (loop for (y) s.t. (phone 'sam y) as i below 1 do (print y))

123 
NIL

Getting Started

After having read the preceeding sections you should know almost enough to actually start using AP5. Here we try to supply what's missing.

You have to begin with a machine on which AP5 is loaded. AP5 is defined in the AP5 package. Unless you want to do everything in the AP5 package (as we do in most examples in this manual), it will be convenient to use a package that uses the AP5 package, along with the COMMON-LISP package.

Examples:
> (defpackage "MY-PACKAGE" (:use "AP5" "CL"))
#<PACKAGE MY-PACKAGE>

It turns out that there are a few symbols in the ap5 package that conflict with symbols in the common-lisp package, and you might prefer the ap5 versions to the common-lisp versions. The symbols compile, defmethod, defun and loop are all intended to be upward compatible with the common-lisp versions. Loop is extended to understand s.t. and the others are extended to include ap5 code in code compiled to file. Other conflicts:

++ the ap5 operator to add to a relation, in common lisp it's the next to last form executed by the read-eval-print loop
abort in ap5, the way to exit an atomic transaction without making any updates, a condition type in common lisp
array a relation representation in ap5, a common lisp type
structure a relation representation in ap5, a common lisp type
variable a record type in ap5, a documentation type in common lisp

Assuming that you want to write ap5 code (which is why your package uses ap5), it seems likely that you'll want something similar to the following.

Examples:
> (defpackage "MY-PACKAGE" (:use "AP5" "CL")
 (:shadowing-import-from "AP5" compile defmethod defun loop ++ abort))

In addition to this documentation you have the AP5 database to help you figure out how to do things. This is a very significant resource. Of course, its value grows as you learn more about what it contains. In order to get you started, we mention the following:

Function (describerel rel-or-name {stream})
produces some output describing a relation.

Examples: (copy and paste from between brackets to try) [ (in-package :ap5) (Describerel 'relation) ]

> (in-package :ap5)
#<PACKAGE AP5>
> (Describerel 'relation)
RELATION  implementation: DEFINED
arity: 1
Compare Relations: (EQL)
Definition:
((R) S.T. (E (X) (RELATIONARITY R X)))
(Relation relation) is true of all known relations, 
i.e., (loop for r s.t. (Relation r) collect r) will give a list of relations.
The definition of a relation is an object in the first place of a tuple of
the RelationArity relation.
Appears in the definition of: 
(#,(DBO RULE Typeconstraint-PRECEDES-IN-SEQUENCE-3)  ...)
Immediate superrelations: (REL-OR-IMP RULE-OR-RELATION)
Immediate subrelations: (DONT-TELL-ME-ABOUT INLINEREL ...)

Function (describerule rule-or-name {stream})
produces some output describing a rule. (Rules are described later.)

Function (tell-me-about object {:stream stream} {:verbose verbose} {:rels rels} {:more number})
will print to stream all true, primitive facts about object (which can be any lisp object) that it can find in the database. This tends to be a slow operation, especially the first time it's done for an object of a particular type after creating many new relations. Rels is a list of the relations you want to know about - it defaults to all relations not in the relation:

Relation (dont-tell-me-about relation)

If verbose is non-NIL (default is NIL), then it tells you what relation it's on, what relations (and positions) cannot be generated, which relations are equivalence relations (their tuples are not listed), and what relations (and positions) haven't been generated before (for these it has to compile a new generating function, which takes time). More (default 5) is the number of facts to report with object in any one position of any one relation before stopping to ask whether you want more. If more is NIL it will never ask, but just report all the facts it can find. A negative value for more means that after displaying (the absolute value of) that many facts tell-me-about should automatically go on to the next relation.

Two particularly useful relations for browsing are:
Relation (derived-from relation-or-rule relation)
and its transitive closure,
Relation (derived-from* relation-or-rule relation)
These relate a relation or rule to any relation from which it is derived. (In the case of a rule this means any relation from which any relation in the trigger is derived.) This is especially valuable for finding out what rules might react to a change in the database, or what would be affected by a change in a definition of a relation.

Annotations, representations and derivations

At this point we have described how to declare two different kinds of relations - those that are stored explicitly in memory and those defined by wff. While this is sufficient for obtaining many interesting behaviors, AP5 offers a lot more. First, it allows you to choose a representation of data that you think will make your program more efficient. You can even create your own representations, although that is described later in the manual. Next, you can provide information to the compiler that helps it to optimize the access of compound wffs. You can also optimize your program by deciding to redundantly store a representation of a relation defined by wff. (This is advantageous if you spend a lot of time reading the relation but it's not often updated.) All of these are called "annotations" because they do not affect the meaning of the specification. They may affect whether the specification can be compiled, but if it can be compiled, it will mean the same thing regardless of the annotations.

As part of the optimization process it is often useful to see what algorithms AP5 is compiling for you. The usual question is how it decides to generate a compound wff. The answer is supplied (in pseudo-code) by the following function:

Function (describe-algorithm description)
We also describe in this section some instructions that you can give AP5 which do affect the meaning of a specification. These are standard idioms for declaring how a relation is to be computed or derived from other relations. Again, the standard idioms can be augmented by your own, but that is described later in the manual.

We divide relations into three classes: stored, derived and computed.

Stored relations are those which can be directly updated. You can think of these relations as being represented in some data structure. The representation parameter in DefRelation determines what kind of data structure that is. The choice of representation for a stored relation is an annotation.

Derived relations are defined in terms of some other relation(s), and changes as those relations change. One decides whether a tuple is in a derived relation by doing a computation that access these other relations. (To get a relation that accesses both data structures and other relations you can factor out the data structures into stored relations and let a derived relation access those.) The derivation parameter is used to describe how derived relations are to be derived, and it does influence the meaning of the AP5 program

Computed relations never change. It seems academic to wonder whether tuples are tested by examining a data structure or whether the "data" is embedded in a program. A declaration that a relation is computed simply means that it is static, i.e., unchanging. In practice the mechanisms for declaring computed relations turn out to be the same as for declaring derived relations - both describe computations that, in the case of derived relations happen to access changing data.

You tell AP5 which type of relation you are declaring by supplying arguments to the DefRelation macro. Only one of the keyword arguments :derivation, :representation or :computation should be supplied - :computation for computed relations, :derivation for derived relations and :representation for stored relations. We will refer to these parameters collectively as the "implementation". A number of possible values for these parameters are described below. If none of derivation, computation or representation is supplied, then if a definition is provided, the derivation defaults to defined, otherwise the representation defaults to base.

Relations can, in fact, be both stored and derived, i.e., a data structure may redundantly record what could in principle be derived. We call these relations maintained relations. At present only relations defined by wffs may be maintained. (Anonymous relations make it convenient to define the relation that you want to maintain.) This annotation is supplied simply by providing both :definition and :representation arguments to DefRelation.

Another annotation is provided by the :size keyword to DefRelation. This is a list alternating between patterns and numbers. For instance, we might declare the Child relation as follows:

Examples: (copy and paste from between brackets to try) [ (in-package :ap5) (DefRelation Child :arity 2 :documentation "(Child x y) means that y is a child of x" :size ((input output) 3 (output input) 2 (output output) 1000)) ]

> (in-package :ap5)
#<PACKAGE AP5>
> (DefRelation Child :arity 2
 :documentation "(Child x y) means that y is a child of x"
 :size ((input output) 3 
        (output input) 2
        (output output) 1000))

This would mean that the expected values of the following programs

    (loop for x s.t. (child a x) count t)
    (loop for x s.t. (child x a) count t)
    (loop for (x y) s.t. (child x y) count t)

would be 3, 2, and 1000 repectively, i.e., a typical person would have 3 children and 2 parents, and there would be 1000 tuples in the child relation. This information is used by the compiler to choose among algorithms. For instance, suppose you write this program:

    (any x s.t. (and (child p1 x) (child x p2)))

Should this compile into a program that searches the children of p1 for one whose child is p2 or a program that searches the parents of p2 for one whose parent is p1? The answer depends on how expensive it is to find parents and children, but also how many there are. Normally when you find AP5 using a bad algorithm (even though you supplied what you thought were good representation choices!) the problem is that it doesn't know some critical size data.

In general, patterns can contain the symbols INPUT and OUTPUT, and constants. Given arbitrary objects for the arguments labeled INPUT, the size annotation tells you how many tuples to expect from generating values of the variables labeled OUTPUT. The special size NIL means that there may be infinitely many matching tuples.

Predefined representations

While it is possible to define your own representations and derivations (this is described towards the back of the manual), this is usually not necessary. The following choices are already supplied.

Base is a particularly dumb, but general representation of relations, which simply stores a list of the true tuples. This is the default if no implementation is supplied, i.e.,

    (defrelation foo :arity 2)

is equivalent to

    (defrelation foo :arity 2 :representation base)

(Partial-index [position]*) means that an index is maintained (implemented as a hash table) for the arguments at the specified postions, e.g.,

    (defrelation foo :arity 3 :representation (partial-index 0 2))

stores indices for the first and last arguments (positions 0 and 2). Such relations have quick access to all tuples for which a particular value occupies an indexed position. This implementation is similar to the "base" implementation in that it allows complete generation of relations of any arity. It's much faster for many types of generation but of course requires much more space. Specifying no slots, i.e., (partial-index), is functionally equivalent to base.

StructProp can only be used for a binary relation that relates only DBObjects to arbitrary other objects, y. A property on the property list of the DBObject is used to store a list of objects related to that DBObject.

    (defrelation phone# :arity 2 :representation structprop)

Tree relations stored the tuples of the relation as a tree with depth equal to the arity of the relation. From the root of the tree there is one branch for each value that apppears as the first argument of the relation, and a subtree associated with each branch similarly represents a relation of arity one less than its parent. If the original relation was (R x y z), then the branch associated with the value a would represent the relation corresponding to the description ((y z) s.t. (R a y z)), and it would similarly have one branch for each y value in that relation. The discrimination among branches is done by assoc or gethash, depending on the number of branches.

    (defrelation works-on% :arity 3 :representation tree)

TreeProp relations are a combination of structprop and tree relations. Treeprop can only be used for binary relations. If R is a treeprop relation and (R x y) is true, then y will be on the property list of x, which must be a DBObject, and there will also be a hashtable with an entry indexed on y having x in its value. This has the advantage of very fast access of y from x, while the ability to generate all the tuples in the relation is retained.

TreeProp+ relations are a generalization of treeprop in that domain values need not be DBObjects. Range values are stored on property lists when the domain values are DBObjects and symbols, and otherwise in a hashtable.

Two-Way-Structprop means that the relation is a binary relation that relates DBObjects to other DBObjects by storing each on the property list of the other. Thus for such a relation it is impossible to generate {x,y | (R x y)}.

    (defrelation parent :arity 2 :representation two-way-structprop)

Stored-In-Place and Stored-In-Place-Single are representations which accept as a parameter a piece of code that computes a place where values are stored. This is described in the section on creating new implementations. For the purpose of declaring relations we describe the implementation translators that DefRelation accepts to create specialized relations of these two "real" implementations.

GlobalVar means that the (unary) relation is stored as a list of values in the value of a global variable. This is declared in DefRelation by

    :representation (globalvar [variable-name])

as in

    (defvar *booktitles* nil)
    (defrelation booktitles :arity 1 :representation (globalvar *booktitles*))

Normally AP5 assumes that you will never set that variable other than with database update macros (,e.g., ++). This applies similarly to all of the Stored-In-Place variants. If you do intend to set it, you should see nonatomic.

GlobalVar-single means that there is at most one value in the (unary) relation, and it is stored in the value of a global variable. This is declared in DefRelation by

    :representation (globalvar-single [variable-name])

The remarks above still apply. The relation will be true of no values if the variable is unbound or a special empty marker.

The empty marker is the value of ap5::*no-single-value-stored*. Of course it is an error to add a tuple containing that value - it is only to be used for initializing data structures identified with "-Single" relations.

Declaring a -Single relation does not automatically declare a count constraint. You can, of course, include the constraint in DefRelation.

Structure means that the (binary) relation relates objects represented as structures to the values in a list stored in some field of the structure. This is declared in DefRelation by

    :representation (structure [structure-name] [field-name])

as in

    (defstruct ship name tons)
    (defrelation ship-name :representation (structure ship name))

Structure-Single is similar to defstruct except that the field contains either the single value related to the structure or an empty-marker.

SymbolProp means that the (binary) relation relates symbols to the values in a list stored as a property of the symbol. Note that it is not possible to generate all the tuples of such a relation. This is declared in DefRelation by

    :representation (symbolprop [property-name])

SymbolProp-single is similar but only one value is stored under the property name.

Alist means that the (binary) relation is stored as an association list. You have to supply a piece of code that (a) can be used with SETF and (b) finds the alist. The DefRelation parameter is

    :representation (alist [code to find alist])

For instance, (car *data*) would be acceptable code, since (setf (car *data) [value]) can be compiled.

Alist-single is similar but instead of associating a list of values with a value, only one value is stored.

HashTable means that the (binary) relation is stored in a hashtable that hashes on the domain to find a list of related values. Again, you must supply a form to find the hashtable:

    :representation (hashtable [code to find hashtable])

You (not AP5) must arrange for the hashtable to use the appropriate test for the canonicalized tuples (see Equivalence).

HashTable-single is similar but only stores a single value.

Array means that the relation is stored as an array. The arity of the relation must be one more than the number of dimensions of the array. Such a relation relates array indices to the elements of a list stored in the array at those indices. Again, you must supply a form to find the array, and it is your responsibility to give it the right number of dimensions:

Examples: (copy and paste from between brackets to try) [ (in-package :ap5) (defvar myarray (make-array '(2 3 4))) (defrelation arrayrel :arity 4 :representation (array myarray)) ]

> (in-package :ap5)
#<PACKAGE AP5>
> (defvar myarray (make-array '(2 3 4)))
MYARRAY
> (defrelation arrayrel :arity 4
  :representation (array myarray))

In this example (++ arrayrel a b c d) would generate an error if a were not either 0 or 1 or if b were not an integer in [0,2] or c were not an integer in [0,3]. It is assumed that different array indices are not considered equivalent by the comparison relation for the relation.

Array-single is similar but only stores a single value.

Examples: (copy and paste from between brackets to try) [ (in-package :ap5) (defvar myarray1 (make-array '(2 3 4) :initial-element ap5:*no-single-value-stored*)) (defrelation arrayrel1 :arity 4 :representation (array myarray1)) ]

> (in-package :ap5)
#<PACKAGE AP5>
> (defvar myarray1 (make-array '(2 3 4) :initial-element
			 ap5:*no-single-value-stored*))
MYARRAY1
> (defrelation arrayrel1 :arity 4
   :representation (array myarray1))

Individual and Individual-Derived are a representation and a derivation with no capabilities whatever. They are to be used for relations where all the capabilities are supplied in the DefRelation form. Some of the "representations" above and "derivations" below are actually provided via translation functions that produce Individual and Individual-Derived relations.

Multiple (redundant) representations are allowed (if they are compatible) by supplying as the representation argument to defrelation a list of the form (:multiple rep1 rep2 ...), where the elements other than the first would be allowed as representation arguments.

Predefined derivations and computations

Most of these may be used with either the :derivation or :computation parameter of DefRelation. The only difference is that :computation declares that the relation is static and :derivation declares that it is not.

Defined means that the relation is defined by a description. This is what you get by supplying a :definition parameter but no implementation parameter. For example, one might define the Sibling relation from the Parent relation as

    ((x y) s.t. (E (z) (and (Parent x z) (Parent y z) (not (eql x y)))))

In other words, two distinct objects are siblings if they share a parent. In this case it would be possible to generate the sibling relation, but not to directly update it. Recursive relations are not supported. The reason is that they typically are not really definitive, i.e., several different sets of tuples satisfy the definition. [The section on derived relations describes how you can get relations to exhibit recursive behaviors, such as transitive closures.] In fact, defined relations may not contain funcall's or apply's, and the expressions used as arguments must all be either bound variables or constants (not expressions to be evaluated at run time). [See the description of Memo if you intend to use DBObjects (,e.g., relations) or other "weird" objects as constants in wffs.]

Coded means that a relation is computed from a lisp form. This is specified to DefRelation by a :computation or :derivation parameter of the form (coded (vars s.t. lispform)). The lispform should use the variable names in vars and evaluate to something other than NIL if the relation is true of those values. The lispform should never generate an error, since relations are supposed to be either true or false for all tuples. One could define the EQL relation with

    (DefRelation EQL :arity 2 :computation (coded ((x y) s.t. (eql x y))))

In general, such relations are expected to be static. If this is not what you want, see the section on derived relations, in particular, the explanation of triggers. See SimpleGenerator and RelAdder for information about updating and generating.

Lisp-Predicate means that the relation is computed by a lisp predicate. This is specified to DefRelation by

    :derivation (Lisp-Predicate [name of lisp predicate])

, e.g.,

    (DefRelation EQL :arity 2 :computation (Lisp-Predicate eql))

The parameter supplied to Lisp-Predicate must be the name of a lisp predicate of arity arguments. The new relation is testable, but not generable, assertable, or deletable. The lisp predicate serves as the testing function. The current implementation can be used even for predicates that cause errors, e.g., < is not defined for non-numeric arguments. Testing code will treat any error as an indication that the relation is false. (Be careful if you use your own predicates!)

Lisp-Function means that the relation is computed by a lisp function. This is specified to DefRelation by

    :derivation (Lisp-Function [fname] [input-arity] [output-arity])

Fname is the name of a lisp function of input-arity arguments, returning output-arity values. The relation is true of a tuple of length input-arity + output-arity if the values of the function on the first input-arity elements are the last output-arity arguments. [In some cases it will be useful to assign equivalence relations to the output arguments - see Equivalence.] Such a relation is testable and can generate a single value for its output roles given values for its input roles. The current implementation, as above, catches errors. Any input that causes an error in the function is assumed to participate in no tuples of the relation.

BaseType means that the relation is an inherited type - see the description of basetypes in the section on types.

Examples: (copy and paste from between brackets to try) [ (in-package :ap5) (DefRelation person :derivation basetype) ]

> (in-package :ap5)
#<PACKAGE AP5>
> (DefRelation person :derivation basetype)
[defrel PERSON]
#,(DBO TYPE PERSON)

creates a basetype named person. The :types argument can be used to supply one supertype. (If more are needed they can be added separately.)

Unionclass means that the relation is a unary relation which is the union of several other unary relations.

    (Defrelation person :derivation (unionclass man woman child))

is approximately the same as

    (Defrelation person :definition ((x) s.t. (or (man x) (woman x) (child x))))

except that it also adds assertions that the types man, woman, and child are all subtypes of person.

It is possible to include a comparison (see Equivalence) as in

    (Defrelation person (unionclass :equiv equal man woman child))

That turns out not to be the same as

    (Defrelation person :equivs (equal) (unionclass man woman child))

because equivs arguments are ignored in the case of :definitions in favor of computing the equivalence from the definition. Rather, it is equivalent to

    (Defrelation person :definition ((x@equal) s.t. (or (man x) (woman x) (child x))))

Intersectionclass means that the relation is a unary relation which is the Intersection of several other unary relations. This is completely analogous to unionclass.

TClosure means that the relation is the transitive closure of another relation. The transitive closure is true of objects x and y if there is a finite sequence (with length at least two) of objects starting with x and ending with y, such that every adjacent pair of objects in the sequence is related by the other relation (called the immediate relation). This is specified to DefRelation by

    :derivation (tclosure [immediate-relname])

A common variant of transitive closure relates every object (of some type - usually the domain of the immediate relation) to itself. For example, the subtype relation relates every type to itself. This is equivalent to the transitive closure of a new relation defined as:

    ((subrel superrel) s.t.
     (or (isubtype subrel superrel)
         (and (relation subrel) (eql subrel superrel))))

Cardinality means that the relation counts the number of tuples of some other relation. For purposes of exposition, suppose we have the following relation:

    (Works-on% person project percent)

meaning that some percentage of a person's effort is spent on a given project. Suppose further that only non-zero percentages are represented by tuples, i.e., that when a person does not work on a project at all, there is simply no tuple for that person and project. We could derive the relation

    (Number-of-Projects person count)

which relates a person to the number of projects he works on as follows:

Examples: (copy and paste from between brackets to try) [ (in-package :ap5) (defrelation works-on% :arity 3 :types (string string number)) (defrelation number-of-projects :derivation (cardinality works-on% (input output output))) ]

> (in-package :ap5)
#<PACKAGE AP5>
> (defrelation works-on% :arity 3 :types (string string number))
[defrel WORKS-ON%]
#,(DBO RELATION WORKS-ON%)
> (defrelation number-of-projects :derivation
   (cardinality works-on% (input output output)))
[defrel NUMBER-OF-PROJECTS]
#,(DBO RELATION NUMBER-OF-PROJECTS)

In general, every tuple of the original relation corresponds to exactly one tuple of the derived relation. That tuple is obtained by keeping only the elements of the original tuple at the "input" positions, and adding an integer to the end. The integer at the end of a derived tuple is the number of tuples of the original relation that correspond to that derived tuple. Thus, if the pattern consists entirely of output's, the derived relation will contain at most a single one-tuple: the number of tuples in the entire relation. At the other extreme, if the pattern consists entirely of input's, then the derived relation will contain exactly as many tuples as the original relation, but each tuple will have a 1 appended to the end.

Sum is similar to cardinality, but instead of counting the tuples, the derived relation adds the elements of all the tuples at one particular position. Again using the works-on% relation above, we could derive the relation

    (Total% person total)

which relates a person to the sum of his percentages on all projects as follows:

Examples: (copy and paste from between brackets to try) [ (in-package :ap5) (defrelation works-on% :arity 3 :types (string string number)) (defrelation total% :derivation (sum works-on% (input output sum))) ]

> (in-package :ap5)
#<PACKAGE AP5>
> (defrelation works-on% :arity 3 :types (string string number))
[defrel WORKS-ON%]
#,(DBO RELATION WORKS-ON%)
> (defrelation total% :derivation
   (sum works-on% (input output sum)))
[defrel TOTAL%]
#,(DBO RELATION TOTAL%)

Again, every original tuple corresponds to exactly one derived tuple, and that tuple is obtained by keeping only the elements of the original tuple at the "input" positions. However the number at the end of the derived tuple is the sum of numbers at the "sum" position of all corresponding tuples. The implementation ASSUMES that all tuples will contain numbers in the "sum" position. Notice that it is possible for sum relation tuples to end in zero. There will be a derived tuple if and only if there are any original corresponding tuples.

Extreme is similar to cardinality and sum, but is used for finding extreme values. Again using the works-on% relation above we could derive the relation

    (Main-Project person project percentage)

which includes a tuple of works-on% if there is no other tuple for the same person with a larger percentage as follows:

Examples: (copy and paste from between brackets to try) [ (in-package :ap5) (defrelation works-on% :arity 3 :types (string string number)) (DefRelation Main-Project :derivation (extreme works-on% > (input output extreme))) ]

> (in-package :ap5)
#<PACKAGE AP5>
> (defrelation works-on% :arity 3 :types (string string number))
[defrel WORKS-ON%]
#,(DBO RELATION WORKS-ON%)
> (DefRelation Main-Project :derivation
   (extreme works-on% > (input output extreme)))
[defrel MAIN-PROJECT]
#,(DBO RELATION MAIN-PROJECT)

Notice that the tuples of an extreme relation are a subset of those of the original relation. The >= relation means that a tuple should be included if the object in its "extreme" position is greater or equal to the objects in that position of all other tuples that match in the input positions. Thus there will be at least one tuple for every person in the works-on% relation. There could also be more than one - a person might work 50% on two projects. The < relation would have meant to keep only the tuples with the smallest percentages. The relation <= does the same thing as the relation <. If we change the input to an output then all the tuples will have the same percentage - the highest percentage that anyone works on any single project. The implementation ASSUMES that the ordering relation is transitive and static.

Anonymous relations: (It may be possible to define anonymous stored relations, but these are generally not as useful as derived and computed relations.) Wherever a relation is needed in a wff, it is possible to declare an "anonymous" relation. This is just like a regular relation except that it is declared without a name. EQUAL definitions will refer to the same relations, however. The form of the anonymous relation is similar to defrelation, except that the symbol DEFRELATION is replaced by ANON-REL and there is no relation name argument.

Other implementations may appear in later versions of this manual.

Partial orders

A binary relation (or description) R is a partial order if it is non-reflexive (nothing is related to itself) and transitive. The macros below provide an expression language for defining partial orders out of other partial orders.

It is anticipated that these macros will be used in wffs, e.g., in the :definition argument to defrelation. Having defined a relation, R, that is a partial order, one might use it for such purposes as:

(sort seq #'(lambda (x y) (?? R x y)))

None of the arguments of these macros is evaluated. We use the term "ordering expression" to mean any of the following:

the NAME of a relation that is a partial order
a description, ((v1 v2) s.t. ...), that is a partial order
a list whose macroexpansion is an ordering expression

Macro (inverse-order ordering-expression)
This is just the inverse relation of the argument - if the argument relates x to y then the inverse relates y to x.

Macro (priority-order ordering-expression+)
A priority order P is a sequence of partial orders. The priority order relates x to y, if there is some member of the sequence that relates x to y, and no earlier member relates y to x.

Macro (order-by-class class-expression {ordering-expression})
A class-based ordering C causes members of the class (type) denoted by class-expression to precede non-members. Two members of the class are ordered with respect to one another by the order denoted by ordering-expression, if it is provided. Specifically, C relates x to y if and only if either, x is a member of the designated class and y is not, or both x and y are members of the class, and ordering-expression relates x to y. Class-expression may be either the name of a unary relation or a unary description. The default ordering-expression is the empty ordering -- ((x y) s.t. false).

Macro (image-order ordering-expression image-relation)
The image order orders values indirectly in terms of an ordering on values to which they are related. Image-relation may be the name of a binary relation or may be a binary description. Let R be the relation denoted by image-relation, and O be the relation denoted by ordering-expression. The image order relates x to y if and only if

(E (x' y') (and (O x' y') (R x x') (R y y')))

For an image order to be a partial order, it is sufficient that O be a partial order and that for any value x, all its images under R are interchangeable with respect to O. (The latter is true, in particular, if no value x has multiple images under R.)

Macro (Lexicographic-order ordering-expression)
A lexicographic order orders sequences in terms of an ordering on their positionally corresponding members. Suppose O is the partial order denoted by ordering-expression, and x and y are sequences (lists or vectors). Then the lexicographic order relates x to y if, there is some index, i, such that O relates the i'th element of x to the i'th element of y, and for all smaller indices, j, the j'th elements of x and y are not related (in either order) by O.

Macro (literal-order equivalence . values)
A literal order orders the elements of values according to their position in the list. It relates x to y if the first occurrence of x precedes the first occurrence of y in values, or if x occurs in values and y does not. Equivalence must be the name of an equivalence relation, or a binary description that satisfies the axioms for being an equivalence relation. It is used to define occurrence in the list. (See Equivalence.)

Rules, Constraints and Atomic Transitions

Rules provide a way to prevent or react to certain conditions arising in the database. Not surprisingly (given what has already been described), these conditions are specified in AP5 by wffs. Before rules are described, however, we point out that the presence of rules makes it important to be able to control the size (granularity) of changes to the database. As an example, suppose we have a constraint that every person must have a unique name. If every update (++ or --) were done separately, it would be impossible to create a new person or change the name of a person, since the first step would always result in a person without a unique name. Several updates have to be made "at the same time".

The Atomic construct provides control over the size of database updates. The lisp macro

Macro (atomic form* {ifabort abortform*} {ifnormal normalform*})
evaluates form* in order (barring an abort, described below). However, all attempts to update the database are held off until the end, at which time they are all done "at once". This means that no rule can react to a situation in which some but not all of these changes have been made. Every primitive update, u (a ++ or --) is semantically equivalent to (Atomic u).

As a first order approximation (refined below), we can view an AP5 program as doing a sequence of atomic updates:

(loop while t do (atomic ...))

AP5 expands each atomic update into a loop, something like:

(let (automation-queue)
  [do atomic update]
  (loop while automation-queue do 
        (funcall (pop automation-queue))))

where each atomic update may add activities to the queue, and the activities themselves may do further atomic updates. The order in which these activities are performed is not specified. However, all of them will be performed before control returns to the program making the original update. The mechanism by which activities are enqueued is called automation rules, and is described below.

Furthermore, AP5 expands each (do atomic update) into an inner loop something like:

(loop until [proposed atomic update consistent] do
   [attempt to fix atomic update]

where both consistency and the mechanism for fixing atomic updates are defined by consistency rules, also described below.

At any time inside the form* of an atomic, execution of the macro

Macro (abort tag format-string . objects)
will cause an exit to the (outermost) atomic, with no changes to the database. In this case the elements of abortform* for that atomic are evaluated, and the value(s) of the last returned from the atomic. These may refer to

Variable abortdata
This is bound to a list which starts with the arguments to abort, and then contains additional information (which we will try to formalize in a later edition of this manual), including the data built up so far to represent this transition in the global-history list, documented in the section on transition history. The default ifabort code calls error on the arguments to abort, excluding the tag. The idea is that the tag should be used by abortform* as a classification of the problem, so it can decide how to recover. Of course, it is free to use any other data returned from the abort as well. For atomics without ifabort forms, including database updates that are not contained in an explicit atomic, the format-string provides a way to describe to the user what went wrong.

The following macro is useful for debugging aborted updates:

Macro (from-abort {({:abortdata abortdata} {:updatestatus updatestatus} {:delta delta})} . body)
In its simplest form, when you're in an error caused by an abort (so abortdata is bound), (From-Abort ()) simply puts you in the context of the abort and executes (BREAK). This allows you to evaluate expressions from the "last" inconsistent state before the abort. Updates are not allowed. A saved previous value of abortdata may be used as the value of abortdata and forms to be evaluated in the aborted context may also be supplied.

There are also facilities for retrying and correcting aborted transitions. These are not detailed here because we hope they will be self explanatory.

If there is no abort, then after the database is updated, normalform* is executed and the values of the last are returned from the atomic. If no normalforms are supplied, the values of the successful atomic are those of the last of form*.

Within an atomic, (Atomic form* ifabort abortform* ifnormal normalform*) is equivalent to (progn form* normalform*), i.e., any abort will escape to the outermost atomic. Since atomics are dynamically scoped, the same piece of code might have different effects depending on whether it is executed from inside an atomic.

Macro (insist {explanation-string} . wff)
will cause the atomic to abort (in the end) if wff would not be true in the resulting database. This wff may refer to runtime values which are computed at the time of the insist. For example, if one has the lisp variable x bound to an object which should end up being in relation R, (Insist (R x)) will make sure that it is. The explanation is simply used as an error message if the wff turns out to be false.

Macro (inatomic)
returns a non-NIL value if executed inside an atomic and NIL otherwise. In particular, you can write code that is not allowed to be executed as part of an atomic by doing (when (InAtomic) (Abort ...)).

Although we don't expect it to be needed very often, we mention that at the top level of an outermost atomic, the symbol REVEAL allows things that are done afterward in the atomic to "see the effects" of updates done before the reveal. The effects do not include any rules that would react to those changes. This might be useful if, for example, you want to create a rule and, in the same atomic transition, assert something about that rule. If you do

(atomic (defautomation rule1 ...)
	(++ P (theonly r s.t. (rulename r 'rule1))))

you'll get an error: the form to find the rule named rule1 is being evaluated before there is any such rule. If we do

(atomic (defautomation rule1 ...) reveal
	(++ P (theonly r s.t. (rulename r 'rule1))))

then the rule will be visible when it is needed. Of course, if the atomic aborts then this relation will not survive.

For now we consider it an error for reveal to be used anywhere other than the top level of an outermost atomic. We may later define what it means in inner atomics, in consistency rule responses, etc.

Examining and undoing history

The ability to examine and "back up" through history, undoing database transitions makes it easier to recover from errors (and correspondingly more attractive to experiment). The database transitions are recorded in the global-history list. If all updates are reversible, events can be "undone".

Function (history ...)
is the history browser. It uses the *query-io* stream to display information, ask questions and get answers. Many print control parameters can be set interactively. Initial values for these parameters can be passed in as keyword arguments. These default to the values of special variables which can be set or bound in order to make defaults more permanent. The browser offers two types of "undo", which we'll call "forward" and "backward". Backward undoing corresponds to "going back in time". Only the most recent event can be undone backward, after which it is "forgotten" and the previous event can be undone backwards. Backward undoing ignores rules. Undoing "forward" means doing a new atomic update with the opposite effect of the current event: delete all the stored tuples added and add all the stored tuples deleted. Forward undo's are done with rules in effect. Any event may be undone forward (and in fact the undo may later be undone). Note, however, that forward undo's may abort or trigger automations.

Note: See the note in the section about decaching.

If you plan to back up to a particular point, another mechanism is more useful:

Function (history-label name)
creates an event labelled with name.

Function (undo-to name)
undoes (backward) all events back to (and including) the one labelled with name.

This facility is used to provide a database transaction macro:

Macro (transaction {label} . body)
If the first item in a transaction is a symbol then it is treated as a history label for the beginning of the transaction. A transaction may not be executed inside an atomic. It may contain atomic transactions to be executed sequentially. At any time inside a transaction the form:

Macro (abort-transaction . values)
will cause any transitions done inside the innermost transaction to be undone, and the values will be returned as the values of the transaction. If no AbortTransaction (or throw outside the transaction) is done, the values of the last form in the transaction are used as the values of the transaction. N.B. Any non-local transfer of control from inside a transaction to outside causes an abort-transaction.

Macro (intransaction)
returns T from inside a transaction, NIL otherwise. Transactions may be nested.

Consistency Rules

Consistency rules provide a way to maintain "constraints". Constraints are conditions, expressed by wffs, that are supposed to be true in any state of the database. Each constraint is associated with a consistency rule which reacts to proposed transitions that would violate the constraint by either causing the transition to abort or proposing additonal updates for the transition. We view an atomic update as requesting the database to reach a state in which all facts added by the atomic are true and all facts deleted are false. Therefore consistency rules are not allowed to remove changes from transitions. It also follows that any attempt to add and delete the same fact in the same transition must abort.

Intuitively we want atomic transitions to be augmented only with updates that are needed to satisfy the constraints. Unfortunately there may be alternative ways to satisfy this informal specification. It is also desirable for the augmentation to be easily predicted by the programmer (it helps for it to be deterministic) and relatively efficient for the machine to compute. We provide here a heuristic approximation that in practice seems to meet these requirements.

When a set of changes is proposed as an atomic transition, AP5 determines whether the state resulting from that set of changes would be consistent (satisfy all constraints). If so, the changes are made and the transition is complete. Otherwise, the consistency rules associated with the threatened constraints (those that would be violated by the proposed transition) are executed in an environment that allows access to both the (inconsistent) proposed state and the (consistent) current state. The rules do not see augmentations proposed by other rules. If no rules propose any augmentations, the transition aborts (since it's inconsistent and nothing can be done to fix it). If any rule proposes an augmentation that's incompatible with another proposed update (either original or augmentation), the transition aborts. Otherwise, all the proposed augmentations are made to the proposed set of updates and the result is proposed as an atomic transition (which may trigger more consistency rules).

This algorithm is deterministic (if the rules are deterministic) and in fact the result does not depend on the order in which the rules are considered. However infinite loops are possible, e.g., if there's a rule which reacts to the addition of R(n) for any integer n by adding R(n+1). Next, it is obvious (but important to understand) that the code in a consistency rule must be able to tolerate an inconsistent database. Also, when several cycles of consistency rules are needed, the cycle in which something happens (and thus the length of the path from one rule to another) can make a difference.

As an example, suppose we start in a situation where the (unary) relation P is everywhere false. There are two rules, R1 and R2. R1 requires that
(A (x) (implies (P x) (E (y) (Q y)))). It reacts to a threat of violation by doing (++ Q 1). R2 requires that (A (x) (implies (P x) (Q x))), and reacts to a threat of violation (for x) by doing (++ Q x). Clearly, if we do (++ P 2), the reaction of rule R2 will in some sense make the action of R1 unnecessary. However, the presence of both rules will cause the addition of two instances of Q. suppose, however, that R1 had been replaced by two rules, R1a and R1b, which communicate through a new relation P1: R1a requires that (A (x) (implies (P x) (P1 x))), and reacts by doing (++ P1 x), while R1b requires that (A (x) (implies (P1 x) (E (y) (Q y)))), and reacts by doing (++ Q 1). Now if we do (++ P 2), only one instance of Q will be added. Rule R1a will be triggered and will cause (P1 2) to be added, but after that, R1b will never be triggered, due to the addition of (Q 2).

One particularly important case is that if a rule creates some new object, and it is important that it not create two such objects, then its action should prevent it from triggering (as opposed to making an update that will cause some other rule to do something that will prevent it from triggering). Of course, if the action can be done many times without causing any trouble, such as making the same assertion, then a long path (in terms of consistency cycles) to the solution (in the sense that the rule will no longer trigger) still works.

Another point is that the inability to remove an update from a transition sometimes necessitates the toleration of redundancy. As an example, suppose we have a stored relation (R x y), but we really only want to access its transitive closure, R*. Then we might want to try to keep R minimal, in the sense that there's no need to store (R x y) when (R* x y) can be derived in its absence. However, a rule that prohibited such redundancy would prevent otherwise legal updates, such as

(atomic (++ R a b) (++ R b c) (++ R a c))

An attempt to delete the redundant (R a c) would contradict the original update. This problem can only be tackled by giving some preprocessor access to the entire set of updates. An alternative is to tolerate the redundancy locally (for the transition), but add an automation rule that deletes redundant facts in another transition.

note: We are considering how to provide the ability to install such a "preprocessor" for derived relations, but this is not yet provided.

In order to decide how to react to an inconsistent set of proposed changes, a consistency rule may want to see what has changed. For example, one interpretation of the constraint that nobody have more than one office is that whenever an office is given to a person, if he already had another office then he should be required to vacate it. In order to see what is changed, consistency rules are allowed to use a limited form of temporal reference in the wffs that they use for database access. In particular, the two special forms
(Previously wff) and
(Start wff)
are allowed as wffs. In the absence of these temporal operators a wff refers to the proposed, possibly inconsistent state. Previously refers to the (consistent) state before the atomic transition started. (Start wff) means (and wff (not (Previously wff))), i.e., the wff is proposed to become true. Temporal reference is allowed in the following places:

In reactions of consistencyrules
In triggers of consistency and automation rules (introduced below)
In functions for generating automation invocations
In Trigger code (introduced later)

Macro (previously . body)
is also defined as a lisp macro. The forms are evaluated as of the previous state. The macro generates an error if executed outside one of the situations (listed above) where temporal reference is allowed.

There are subtle differences between consistency rules that maintain time sensitive as opposed to time insensitive conditions. For example, if we prohibit (Start P), P is allowed to be true, but it may not become true after having been false. Also, the knowledge that a time insensitive wff was true (or false) before means that certain things need not be checked. For example, if we prohibit (E (x) (P x)), then whenever P becomes true of anything, the condition becomes violated. On the other hand, if we prohibit (Start (E (x) (P x))), then as long as there is already an example of P, adding a new one will not violate the condition. The great majority of consistency triggers will not use Start.

The normal ways to declare constraints are:

Macro (alwaysrequired name trigger {:repair repair} {:enforcement-level enforcement} {:reporter reporter} {:documentation documentation})

Macro (neverpermitted name trigger {:repair repair} {:enforcement-level enforcement} {:reporter reporter} {:documentation documentation})

The only difference is that AlwaysRequired creates a rule that reacts when the condition is (about to be) false, while NeverPermitted reacts when it's true. Functions can create rules with computed names, triggers, etc. by calling the following versions defined as functions rather than macros:

Function (alwaysrequire name trigger reaction enforcement {:reporter reporter} {:documentation documentation})

Function (neverpermit name trigger reaction enforcement {:reporter reporter} {:documentation documentation})
The macro versions do more at compile time (an advantage when you compile to a file). They also compile reactions that are specified as lambda-expressions.

Name is a symbol, trigger is a wff, and reaction is a lisp function. The rule may be interpreted as meaning that the wff should never be true (for neverpermitted) or false (for alwaysrequired). If it threatens to become true, then the reaction is executed in an attempt to "repair" the problem. If the trigger has the form (E vars ...) for neverpermitted or (A vars ...) for alwaysrequired (or macroexpands to such a form), then every binding of vars which serves as an example (and thus violates the constraint) will be passed to a separate invocation of the reaction. The reaction function is meant to do something to maintain the constraint (for this binding of vars).

Function (ignore ...)
is a function which does nothing with any number of arguments. This is useful as a reaction if you don't have a reaction to propose but do not want to automatically abort (it's possible that another rule might fix the problem).

Enforcement is one of {:total, :incremental, :none}, indicating the degree to which AP5 enforces the constraint. Total means that when the constraint is declared, AP5 checks to see that it is satisfied, [Actually, total means to insist that the constraint be satisfied in the state resulting from the addition of the rule.] and also notices (and takes the appropriate action) when it is about to be violated. Incremental means that AP5 assumes the constraint to be satisfied when it is declared but watches for violations. [In different versions, the rule itself may become active at different times, e.g., only after the atomic transition is complete, or after the consistency cycle in which the (proposed) database contains the rule. Of course, in the second case, it will be deactivated by an abort.] None means that AP5 simply assumes the constraint is always satisfied. Other enforcement levels may be supported in the future. Different levels are useful because sometimes it is impossible, impractical or unnecessary to enforce a constraint. The AP5 compiler will feel free to make use of constraints as assumptions, even if it is not enforcing them.

Reporter allows you to specify a function for reporting failures of consistency rules. If an atomic transition aborts because some rules could not be satisfied and if no IFABORT was supplied, this rule-specific error-reporting code is used (if present) to print error messages. The arguments to this function are the same as those for the reaction of the rule.

There's a constraint that only one rule have a given name. If you try to name a new rule with the same name as an old one, the old rule is deleted.

Ruletriggers are like wffdefns in that they are not allowed to contain arbitrary lisp forms to be evaluated. All arguments to relations must be bound variables or constants. (However, see the description of Memo.)

The RuleTrigger and Constraint-Enforcement-Level cannot be altered (other than by creating or destroying the rule). If you want to change them, you can redeclare the rule (with different values in those slots), which will have the effect of replacing the rule.

Example:

(neverpermit 'Two-offices-for-same-person
  '(E (x y person)
      (and (Office person x) (Office person y)
	   (not (eql x y))))
  #'(lambda (x y person) (declare (ignore x))
     (when (?? Previously (Office person y))
	(-- Office person y)))
  :incremental
  :reporter
  #'(lambda (ignore ignore2 person)
     (declare (ignore ignore ignore2))
     (format *error-output*
	     "~A should not have two offices" person)))

Note that when a state is proposed in which p has two offices, the rule will trigger twice - each office has a turn at being both x and y. If both offices are new, there is no reaction, and the condition remains violated - which causes an abort. An alternative reaction could have been

(when (?? Start (Office person x)) (-- Office person y))

which would have aborted from an attempt to add two new offices for a different reason - an attempt to add and delete the same fact in the same transition.

Several constraint idioms have been identified. These have been (or will be) made into relations:

subtype constraints, e.g., every integer is a number - these are discussed in the chapter on types
type constraints on relations, e.g., the FirstName relation would only relate people to strings - these are also discussed in the chapter on types.
disjoint types, e.g., nothing is both a symbol and a number - these will again be described in the chapter on types.
count restrictions, e.g., no person can have more than one spouse

Notes:
- Declaration of a rule involves compiling various pieces of triggering code. This tends to take some time.
- It may also generate errors (aborts) on the grounds that the rule cannot be triggered or tested. In some cases it may actually be possible to do what AP5 says it cannot, and I'm willing to see examples. In other cases it is not possible, and so the rule cannot be implemented, at least with the current implementations of the relations involved. See the section on large wffs.

Consistency Checkers

We have generalized consistency rules to allow constraints that cannot be conveniently expressed in our relational language.

Type (consistencychecker rule)
means that rule is one of these generalized constraints. Such rules may be declared by the macro

Macro (defconsistencychecker name description response {:reporter reporter} {:documentation documentation})
Description is either a description that could be used as an automation rule trigger (see the section on automations), or the special description "(nil s.t. true)". On every consistency cycle, every match of the description causes an invocation of response. Unlike a consistency rule, the fact that such an invocation takes place does not necessarily mean that a constraint has been violated. That is decided by the response. Like a consistency response, the response may make updates, do insist's and even abort. If it makes updates (even noops) AP5 considers the current situation to be inconsistent, i.e., it violates the constraint represented by this rule. Otherwise, if the value of the response is NIL, AP5 considers the rule to be satisfied, and any other value indicates that it is violated. This mechanism is clearly more general than consistency rules, but is usually not needed. Also, consistency rules have the advantage that the constraint is explicit in the trigger, while for consistency checkers the constraint is at least partly in the lisp code.

Count Constraints

Cardinality (or "count") constraints express facts such as "every employee has at most one office", or "every employee has at least one office". The general form is:
For every n-tuple of objects, [x₁, ..., x_n] of types t₁, ..., t_n (actually, descriptions can be used as well as types), there are count-restriction m+n-tuples in the relation R, which can be formed by inserting m arbitrary objects in specified positions in [x₁, ..., x_n].

The types and the positions at which objects are to be inserted are specified by "patterns" which are lists of length m+n, which must be the arity of R. The elements of the list are either types or descriptions, or the symbol OUTPUT which stands for an inserted object.

The currenly supported values for count-restriction are:

:none There are no such tuples, e.g., while employees may in general be assigned secretaries, there should not be any secretaries who have secretaries: the constraint for the pattern (SECRETARY OUTPUT) is :none.
:multiple The relation must contain at least one tuple, e.g., if every employee has an office, the constraint for the pattern (EMPLOYEE OUTPUT) would be :multiple.
:optional The relation must contain at most one tuple, e.g., if no employee can have more than one office, the constraint for the pattern (EMPLOYEE OUTPUT) would be :optional.

Of course, "none" implies "optional", and "multiple" is incompatible with "none". However, "multiple" and "optional" are independent. This leads to two more possible cases:

:unique Exactly one (both multiple and optional), e.g., every person has one office.
:any No constraint, e.g., a person can have any number of children.

Examples:
if we want to restrict the relation Works-On%(person, project, number) so that every employee works some amount on some project, the count-restriction would be :multiple and the pattern would be (employee output output), i.e., for every employee there is at least one project and number such that Works-On%(employee, project, number).
If we want to say that there can be no more than one number for any given person and project, we can use the pattern (employee project output) and the count-restriction :optional.

Cardinality constraints are usually declared as part of a relation declaration, with the :count argument of DefRelation. In fact, the value of this parameter is, in effect, the result of appending together a bunch of argument lists to the Restrict-Cardinality macro described below (except that the name of the relation being declared is left out).

Macro (restrict-cardinality relname pattern {:countspec countspec} {:enforcement enforcement} {:replacing replacing} {:default default} {:too-many-repair too-many-repair} {:too-few-repair too-few-repair} {:delete-any-other-countspecs delete-any-other-countspecs})
is the macro that declares count constraints. Relname is the name of a relation and pattern is a list, each element of which is a typename, description, or the symbol output. Countspec is one of {:any, :unique, :optional, :multiple, :none}. Enforcement is an enforcement level. The default countspec is :any and the default enforcement is :incremental.

If countspec is either :optional or :unique, and replacing is non-NIL, then the rule will have a reaction that deletes old tuples in order to "make room for" new ones. For example, giving a person a new name will cause his old name to be removed if the constraint is declared as replacing. The other side of the coin is that if countspec is either :unique or :multiple, the default parameter can be used to provide a reaction function capable of supplying one or more tuples when the constraint is violated. The default function will be applied to an N-tuple of values that are instances of the types named in the N typed slots of the pattern. It should return, as multiple values, zero or more M-tuples, each M-tuple containing a value for each of the M OUTPUT slots of the pattern. The M+N tuple formed by merging the input with the outputs of the function will be added to the database as the reaction. [If the function wishes to supply NO tuples in a particular case, it should return NO values, not return NIL.]

Other reaction code can be supplied via the arguments Too-Many-Repair and Too-Few-Repair. (Unique constraints are represented as two rules, so both reactions can be specified independently.) The default reactions do nothing.

In the case of Too-Many-Repair violation of an upper bound constraint will result in application of the function. If countspec is :optional, it will be applied to the N values corresponding to the typed pattern elements for which there are too many tuples. If countspec is :none, it will be applied to N+M values. They will be the slots of a tuple that violates the constraint. The arguments will be passed in the order of the slots of the relation, immaterial of which slots in the pattern were typed and which were OUTPUT. The function, like any other reaction function, may assert and retract facts to reestablish consistency. [The Replacing keyword is a special case of a Too-Many-Repair; it is an error to have non-NIL values for both keywords.]

In the case of Too-Few-Repair, the violation of a MULTIPLE constraint will result in application of the function to N values -- the same ones described above for a Default function. The function, like any other reaction function, may assert and retract facts to reestablish consistency. [The Default keyword is a special case of a Too-Few-Repair; it is an error to have non-NIL values for both keywords.]

Delete-any-other-countspecs may be T, NIL, or a list of patterns. T means the same thing as (list pattern). The meaning is that all count constraints for this relation except those for the patterns listed are to be removed. However NIL means not to remove any other constraints.

In order to find out what cardinality constraints apply in a given case, the following are provided:

Function (cardinality-of-pattern rel-or-name pattern)

Function (cardinality-for-tuple rel-or-name i-o-pattern {output-indicator}) The first accepts a relation and pattern and returns a countspec that is implied by existing count constraints. The second is similar but the pattern is altered in that all the types are replaced by objects and the "output" slots are identified by being EQ to output-indicator, which defaults to the symbol OUTPUT but can be supplied if that is one of the objects you want to pass in. This returns a countspec for those particular objects, sort of like asking about all the tuples of the types of those objects and merging the results.

These functions amount to very limited theorem provers. They recognize that count constraints imply similar constraints for more specialized types. They currently do not try to prove subrel relationships for descriptions. The also recognize that "at most n (x,y) s.t. (R a x y)" implies "at most n (y) s.t. (R x b y)". (A similar argument for "at least" is not currently used since it requires knowing that types are non-empty.)

Automation rules

An automation rule reacts to a completed atomic transition. Whatever it does cannot be part of that transition. The reaction of an automation rule does not have access to two states, like that of a consistency rule, but it can be sensitive to transitions because its trigger can refer to and thereby bind variables (available to the response) from two states. In fact, automations are required to trigger on transitions rather than just states. They are not supposed to trigger on transitions that have nothing to do with the trigger. Formally, a wff is not allowed as the trigger for an automation if there is some situation in which that wff would be true regardless of the previous situation (or some situation that would make it true of the next transition regardless of the next state). [I think this is the right condition, but I haven't yet proven it.] This implies that the wff must refer to both the current and previous state. (Actually, in the current version, it must use Start.)

(Not (Start P)) is not a legal trigger for an automation, because there are states (namely those in which P is false) for which it is true regardless of the previous state.

(Start (Not P)) is allowed.

(Or P (Start Q)) is not allowed, since if P is true then the condition is true regarless of the previous state.

(And P (Start Q)) is allowed.

Macro (defautomation name trigger action {:documentation documentation})
This is analogous to NeverPermitted.

Unlike consistency rules, the triggers for automation rules are descriptions of the form (vars s.t. wff). The action is evaluated for every binding of vars such that wff is true. This is the normal way to create automations.

Automation Generators

Sometimes it is inconvenient for an automation rule to have no access to the atomic transition other than the variables matched in its trigger. For example, one might want to print a warning that the following set of objects just entered the relation P, and not be satisfied with a separate warning for each. For this purpose we have generalized automation rules in a way analogous to consistency checkers.

Type (AutomationGenerator rule)
means that rule is such a generalized rule. These rules are declared by the macro

Macro (defautomationgenerator name description response {:documentation documentation})
Description is either a description that could be used as an automation rule trigger, or the special description "(nil s.t. true)".

On every state transition, every match of the description causes an invocation of response. Unlike automation rules, the response has access to both the previous and new states. However, it is not allowed to do updates. Rather its job is to decide what invocations should be run as automations later. In the above example, the rule that wanted to print a warning would have as its trigger (nil s.t. (E (x) (start (P x))). Its response would then collect all the data and enqueue as an automation a program to print the single warning. This last action is performed by

Macro (enqueue-automation . body)
which creates a closure out of body to be executed as an automation.

Note:
Recall that closures over loop variables do not necessarily refer to the values of those variables at closure time! If you find yourself writing enqueue-automation inside a loop, you probably want to copy the loop variables, e.g., rather than
(loop for x s.t. (P x) do (enqueue-automation (F x))), you want
(loop for x s.t. (P x) do (let ((x x)) (enqueue-automation (F x)))).

Types

In AP5, the term "type" is synonymous with "unary relation". We have endeavored to provide the generally useful features normally associated with types. Of course, one typically wants to be able to test whether an object is of a particular type, or generate the objects of some type. These capabilities are provided by virtue of types being relations.

Subtypes

Relation (subtype t1 t2)
means that t1 is a subtype of t2, i.e., every object of type t1 is also of type t2. Notice that this means every type is a subtype of itself. In fact, Subtype is a special case of the more general relation:

Relation (subrel subrel superrel)
Means that every tuple satisfying the first relation satisfies the second. (Thus SubType is a subrel of SubRel.) Subrel is actually derived from a relation named ISubRel (for "immediate subrelation"). The recommended way to update the relation (type) lattice is to update SubRel (or SubType, for types). The deletion code for SubRel deletes any ISubRel tuple and insists that the SubType tuple be false. This will abort if the SubType tuple is implied by other ISubRel tuples. The solution is to atomically delete all of the relevant SubRel tuples.

Notes:
Other semantics for updates of SubType are certainly reasonable. We choose this one because it generates errors when there is some chance that you're about to do something you didn't mean. Of course you can change this behavior if you wish by changing the update macros for SubType.
There is an automation rule that keeps the ISubRel relation minimal, i.e., deletes subrelation constraints implied by other ones.
Also, there's currently a rule that prohibits different relations from being subrelations of each other. Another way of declaring synonyms is to just give one relation two names. It's not clear whether this rule is a good idea, given that synonyms present other problems.
There is also a primitive "classifier" that attempts to recognize subtype relations between defined types and other types. (More generally, it attempts to recognize type constraints for defined relations from their definitions.) This is currently implemented as an automation rule which prints messages to tell you what it has added.

The initial type hierarchy includes:

 Entity (true for any object)
   Number (numberp)
     Integer (integerp)
   String (Stringp)
   Function (Functionp)
   Cons (Consp)
     Description ((vars s.t. wff) as in defined relation)
   Symbol (Symbolp)
   other lisp types
   DBObject 
     MatchNode 
     Rule 
       AutomationRule
       ConsistencyRule
       MaintainRule
     Relation 
       Type 
         BaseType (see below)

BaseTypes (inheritance)

In some applications (especially those where data tends to be called "knowledge") various forms of "inheritance" are desired. In the case of types the usual requirement is that by virtue of asserting that an object is of some type it should become an element of all supertypes. In AP5 this is accomplished by (you'd never guess!) first order logic definitions of the types involved. Types that "inherit" elements from their subtypes are called BaseTypes (which is an unfortunate name in that they have little to do with "base relations"). Basetypes are defined in terms of the SubType relation and another relation, named Classification. In particular, if some type, T1, is a basetype, it is defined as the set of objects, x, such that the Classification relation holds between x and some subtype T2 of T1 (which might be T1 itself).

Typically, after declaring a baseype (with DefRelation ... :derivation basetype) one makes a number of SubType assertions.

Type Constraints

A common use for constraints is assuring that objects in a particular slot of a particular relation are of a particular type. For example, to be sure that the relation (PhoneNumber x y) relates objects only to phone numbers (identified by the relation (Phone x)), one might do something like:

(Alwaysrequire 'Phone-Type-Restriction 
  '(A (x y) (implies (PhoneNumber x y) (Phone y)))
  'ignore :total)

This would abort something like (++ Phone (DBO Don) nil) (assuming NIL is not a phone), or (-- Phone 226) (if 226 was the Phone of someone), but would not abort something like
(Atomic (++ Phone (DBO Don) 'bam) (++ Phone 'bam)).

Type constraints are normally declared along with a relation as part of the DefRelation syntax. A list of types is supplied as the :types parameter. Enforcements can also be supplied (see DefRelation for details).

The default reaction code does the following: if the object in the constrained slot just ceased to be of the required type, the (now illegal) tuple is removed from the constrained relation. The intent is that when an object ceases to be of some type, any tuples involving that object, which therefore become illegal, will be removed. (AP5 does not support the notion of destroying an object.) The default reaction does nothing to react to attempts to put an object of the wrong type into a constrained relation - the attempt is allowed to abort.

Disjointness constraints

Another useful bit of domain model is the fact that certain types are disjoint, i.e., they have no common elements.

Macro (defdisjoint {enforcement} . types)
asserts that all the types are pairwise disjoint. The types may be type names or unary descriptions. The default enforcement is :none.

Programming constructs using types

The following macros have been defined to facilitate the use of types in lisp code:

Macro (declare-type . declarations)
Each declaration is a list of the form (typename . variablenames). The binding of each variable is checked to be sure it has the given type.

Macro (returning typename . body)
The body is evaluated, and the value returned is checked to be sure it has the given type. Multiple values can be checked by supplying a typename of the form (values typename1 typename2 ...).

Macro (defun-typed name lambda-list . body)

Macro (let-typed bindings . body)

Macro (prog-typed bindings . body)
All of these act just like their untyped versions, except that variables of the form type#name or type# (just like those allowed in wffs) are type checked on entry (see esoteric syntax).

Macro (aptypecase key . clauses)
This evaluates key and then looks for a clause whose CAR is the name of a type of that object. The CAR can also be a list of typenames, in which case the key matches if it is of any of those types. Finally, a CAR of T or OTHERWISE matches no matter what the key is. If such a clause is found, the remaining elements are evaluated, and the value(s) of the last returned.

Macro (wff-if . ifform)
This uses IF-THEN-ELSEIF-ELSE syntax (like interlisp if), but the conditions are wffs, not lisp expressions. Also, for any condition that starts with an existential quantifier, the corresponding forms (after the then) may use the existential variables freely. Omitted THEN clauses are treated like THEN T. Also, an omitted final ELSE clause is treated like ELSE T.

Type dependent I/O

The form in which DBObjects (not general lisp objects) are printed is dependent on their types.

Relation (proper-name dbobject symbol)
relates DBObjects to their proper-name(s).

Macro (dbo . identifying-data)
accepts a proper-name (if it is the proper-name of a unique object), and the printer for DBObjects prints a proper-name of the object if there is one. Notice that the proper-name of an object is a symbol, with a package. The printer for DBObjects does not print the package prefix, but the reader is sensitive to the package in which the name is read.

For particular types of DBObjects, the syntax (DBO type name) evaluates to the object of the given type with the given name, e.g.,
(DBO RELATION relationname) evaluates to the named relation,
(DBO TYPE typename) evaluates to the named type
(DBO RULE rulename) evaluates to the named rule.

Function (relationp x)
returns x if it is a relation, the relation named x if there is one, and otherwise NIL.

To print a DBObject (without a proper name), AP5 looks for a type of the object which has a printer function which is willing to do the printing. This function should accept the same arguments as the common lisp :print-function option of defstruct. It should return a string to print or the value NIL. If it returns NIL the search for a printer continues. The printer function is declared by adding tuples to the relation:

Relation (printer type function)

The suggested convention is that printers should print #,(DBO . x), where x is a list whose car is a type, and that types be given readers that can translate the list back into the object when invoked by DBO. Readers are similarly defined by:

Relation (reader type function)
A reader function should accept a list whose car is a type, and return either a DBObject or NIL. If it returns NIL (which is not a DBObject), the search for an acceptable reader will continue. As a convenience to those who write printers,

Macro (printdbo . x)
will print #,(DBO . x). AP5 supplies reader and printer functions for relations and rules, e.g., the way to refer to a relation object is #,(DBO Relation name). Notice that, since relations such as Type, Subtype and Printer all relate relations (rather than their names), DBO may be useful in your attempts to use AP5.

If there is no acceptable printing function, the printer prints
#,(DBO DBOBJECT address), which cannot be read back in.

The notion of "the types of an object" is really not first order, and so is not available as a relation. However, we recognize its usefulness, as in the case of printing and other "object-oriented" applications.

Function (all-types-of object)
returns a list of all the types of object, even those that are supertypes of others.

Function (most-specific-types-of list-of-types)
returns a list of the elements of its argument that are not supertypes of other elements.

Examples

(DefRelation person :derivation basetype)
NIL
(listof type)
(#,(DBO TYPE PERSON) #,(DBO TYPE RULE) ...)
(DefRelation employee :derivation basetype)
NIL
(++ Proper-Name (make-dbobject) 'Don)
NIL
(++ Classification (DBO Don) (DBO Type Employee))
NIL
(loop for p s.t. (person p) collect p)
NIL
(++ Subtype (DBO Type Employee) (DBO Type Person))
NIL
(listof person)
(DON)

Equivalence

In describing the semantics of AP5's virtual database, we have had to appeal to the concept of two references to values being references to "the same value". Unlike logicians, lisp programmers regularly deal with several theories of equality. For example they sometimes compare objects with EQ (in common-lisp, more often EQL), and sometimes with EQUAL. If we do (++ R '(1 2 3)), we might want (?? R '(1 2 3)) to return T, even though the reader considers the two lists to be different. It would be much more painful (and less efficient) to have to ask
(?? E (x) (and (R x) (Equal x '(1 2 3)))).

We will only refer to the equivalence of two values with respect to some equivalence relation. We use the term "equivalence relation" in the mathematical sense: a binary relation that is transitive, reflexive and symmetric. In addition we impose the restriction that equivalence relations are never updated. (Mathematics does not deal with updates to relations. The reason for this restriction will be described later.) The term "reflexive", meaning that every object is equivalent to itself, actually makes sense only in the context of some primitive "identity" relation. In AP5, as in lisp, that relation is EQ. Equivalence relations partition the universe of lisp values into disjoint equivalence classes. Two n-tuples of objects, W and V, are said to be equivalent with respect to an n-tuple of equivalence relations, E, iff for each i from 1 to n, Wi and V_i are equivalent with respect to (related by) E_i.

Non-uniform Equivalence

Up to now we have spoken of a relation in the database as arbitrarily partitioning tuples of lisp values into TRUE tuples and FALSE tuples. We now refine this concept. For each relation R, there is a fixed (not varying over time) n-tuple of equivalence relations E(R), where n is the arity of R. We will call this the equivalence signature of R. For any two n-tuples of values which are equivalent with respect to E(R), R must be either TRUE of both or FALSE of both. Intuitively, R cannot distinguish between tuples that are equivalent with respect to its signature. R may be thought of as mapping each tuple of equivalence classes C=(c₁, ..., c_n) to TRUE or FALSE, where c_i is the equivalence class defined by E(R)_i. Lisp objects are used merely as representatives of these classes. Thus the concept of object really corresponds to an equivalence class.

Adding or deleting tuples must make all equivalent tuples true or false. It is not possible to add and delete equivalent tuples in an atomic transition (any attempt to do so will abort). We can now explain why equivalence relations are not to be updated. Suppose the relation R uses equivalence relation E and that R holds for object o1 but not for object o2. If E starts to hold between o1 and o2 then clearly R has to change, but there seems to be no basis for deciding whether R should start to hold of o2 or cease to hold of o1.

The concept of equivalence also has semantic import for generating. When we generate (x₁ ... x_n) s.t. (R x₁ ... x_n) we obviously want each generated tuple to satisfy R, i.e., the result of testing R on the tuple would be TRUE. The other requirement is that for each tuple of objects satisfying R, exactly one equivalent tuple should be generated.

Relation (compare-relation relation position equivalence-relation)
describes the equivalence signatures of relations. (Positions start with 0 and end with one less than the arity of the relation.) This should be asserted of a relation as soon as it is declared (and before it is used). In the absence of such a fact, the default comparison is Eql.

This solves the problem of equivalence for primitive stored relations. The remaining problem is what to do about wffs (including defined relations, which are defined by wffs). As an extreme example of the problem, suppose we have a relation R with the equivalence signature [Same-Color, Same-Size]. What should be the result of generating x s.t. (R x x)? If we regard R as relating equivalence classes, we're looking for things that are both Same-Color and Same-Size equivalence classes. It seems unlikely that there are any, even if you think the concept is sensible. AP5 in general disallows wffs in which one variable appears in positions that are compared by different equivalence relations. The example above would generate a compile-time error. If all the positions in which a variable appears are compared by the same equivalence relation, that equivalence is used for the variable. For instance if you do

(listof x s.t. (and (P x) (Q x)))

where P and Q use the same equivalence relation, the result will be one representative of each equivalence class of that same equivalence relation which satisfies both P and Q. A degenerate case is that

(listof x y s.t. (R x y))

will produce the tuples of R with respect to the equivalence signature of R.

This takes care of almost but not quite all cases of interest. We now turn to the exceptions. One case is that one actually wants to ask the question at the beginning of the section: does the set R contain an element that is Equal to '(1 2 3). Suppose the comparison for R is actually Eql. The rule just stated indicates that

(?? E (x) (and (R x) (Equal x '(1 2 3))))

is ill-formed unless the Equal relation compares its arguments by Eql. (Even if it did, we wouldn't be able to ask similar questions for sets that used other comparisons.) Equivalence relations could reasonably be assigned any signature composed of finer equivalence relations (subrelations), e.g., it would make sense to consider the signature of Equal to be [Eql, Eql] or [Eq, Eq]. (This intuition is justified by a more formal argument below.) By convention, equivalence relations are considered to use themselves to compare their arguments, e.g., the equivalence signature for Equal is [Equal, Equal]. However, equivalence relations are exceptions to the rule above: variables may appear as arguments to equivalence relations if they are compared by finer equivalence relations (subrelations of the equivalence in question). Equivalence relations are thus used to translate between equivalence classes of different equivalence relations. In the example above we can think of Equal as being applied to two Eql equivalence classes: they are considered Equal if they are both subsets of the same Equal equivalence class. However this raises the question of how to interpret a request to generate

x s.t. (E (y) (and (R y) (Equal x y)))

In particular, by what equivalence should x be compared? The answer is provided by the following rule: if a variable appears only in equivalence relations, it is compared by the equivalence relation that relates two objects only if they are related by all of the equivalence relations in which the variable appears. (Usually one is finer than all the others, and that one is used. However, if there is no finest, e.g., x appears only as arguments to same-color and same-size, then objects are considered equivalent if they have both the same color and same size.) Therefore in this example, x is compared by Equal, and this query "summarizes" R, identifying elements that are Equal. In general AP5 can generate coarser equivalence classes from finer ones but not the other way around. For example, while it can generate the query above, it cannot generate

x s.t. (E (y) (and (R y) (Eq x y)))

which would be a set of larger (or equal) cardinality than R.

As another example where "translation" between equivalence classes is useful, consider how one might want to combine relations with different signatures. Suppose R1 is a set compared by Eql and R2 is a set compared by Equal. We might want to describe the "union" of these sets as compared by Eql:

x s.t. (or (R1 x) (E (y) (and (R2 y) (Equal x y))))

This set could not be generated since it might contain infinitely many objects for every element of R2. On the other hand the "union" as compared by Equal would be described as:

x s.t. (or (R2 x) (E (y) (and (R1 y) (Equal x y))))

This set could be generated (if both R1 and R2 could be generated). Of course, the "union" as compared by other equivalences can also be described by introducing two existential quantifiers.

In the same way that variables can be introduced with types, there is a syntax that allows introduction of variables with equivalence relations. If the name contains the character "@", the part after the @ is expected to be the name of an equivalence relation (see esoteric syntax). Thus person#x@equal, person#@equal and x@equal are all possible ways to introduce a variable with an equivalence.

((x@equal) s.t. (R a x))

means the same thing as

((x) s.t. (E (x') (and (R a x') (equal x x'))))

ExpandDescription (described later) can be used to see what other uses of @ mean.

In the same sense that it's "reasonable" to assign different signatures to equivalence relations, it's also "reasonable" to assign different signatures to static relations that cannot be generated at all. Recall that the signature really only affects the meanings of updates (how large an equivalence class is being updated) and generation (how large an equivalence class is represented by a generated tuple). For a relation like Integer, there is one coarsest possible comparison, which divides all objects into only two equivalence classes: integers and non-integers. However, any finer equivalence relation would be just as reasonable. In particular, it's convenient to be able to ask whether an object is an Integer in the set R, no matter what equivalence relation is used to compare elements of R, as long as that equivalence relation is finer than "both-integers-or-both-not". Notice that if R is a set of numbers compared by = (which relates the integer 1 to the non-integer 1.0), then this does not make sense. In order to allow this sort of overloading, we mark slots of such relations with the following relation:

Relation (anycomparison relation position)
This means that variables appearing in the given slot need not have the same comparison as the slot. One can view the rule for equivalence relations as implicitly treating both slots of every equivalence relation like an anycomparison slot.
Note:
Currently there is no enforcement of the constraint that such variables be compared by refinements of the comparison for that slot. This will be fixed at a later date.

Support for Multiple Equivalences

The burden of maintaining and generating tuples for individual relations so as to account for the equivalence signatures of those relations is passed along by AP5 to the individual implementations in the library. For example, BASE relations store a single representative tuple of any equivalance class of tuples, and search for tuples with the MEMBER function, using a suitable :TEST parameter to account for the relation's equivalence signature. The implementation may compile in the signature (preferably), or defer looking it up until run-time. Each implementation must either be so general as to handle any equivalence relation, whether pre- or user-defined, or must produce a compile-time error if asked to deal with an unsupported equivalence relation in the signature. It is of course acceptable for an implementation that cannot handle arbitrary equivalence signatures to be accompanied by a consistency rule that prohibits its use on relations with unacceptable signatures.

Although we have provided, we believe, meaningful semantics and useful support for multiple theories of equality, this support ends at the outer boundary of a wff. Once values obtained from a generator are passed out into the lisp world there is no "memory" of what equivalence class was used to generate them. Certain transformations you might expect to affect only efficiency actually affect semantics. For example, the program

(loop for x s.t. (and (P x) (Q x)) collect x)

might denote a list that is a superset of that denoted by

(loop for x s.t. (P x) when (?? Q x) collect x)

if Q's signature has a finer equivalence relation than is P's.

Equivalence Relations

AP5 considers an equivalence relation to be a binary relation which has been declared as an equivalence relation by asserting into the relation

Relation (equivalence-relation relation)
The initial set of equivalence relations includes the relational equivalents of the Common Lisp predicates EQ, EQL, and EQUAL. Relational versions of STRING-EQUAL, STRING=, and = are defined, which hold for two values iff the two values are EQ, or if the corresponding Common Lisp predicate evaluates, without error, to T. In general, when you invent a new equivalence relation it is a good idea to assert the appropriate SubRel facts that relate it to other equivalence relations. The compiler may need this information to recognize that certain things can actually be generated. It is the responsibility of the definer of new equivalence relations to make sure that they really are symmetric, reflexive and transitive. In particular they must relate every lisp object (as compared by EQ) to itself. AP5 cannot enforce these conditions and simply assumes that they are true.

Relation (symbol-functional-equivalentp x y)
is an equivalence relation meant for functions created by

Function (functional-equivalence-testable funarg functional-characterization {preferred-name})
This function returns a symbol whose symbol-function is funarg. However two such symbols are Symbol-Functional-Equivalentp if they were created with (non NIL) EQUAL functional-characterization's. Thus, for instance, if you have a program that updates a functional attribute, like RuleCode, and that attribute is compared by Symbol-Functional-Equivalentp, you can use Functional-Equivalence-Testable to create the values. Preferred-name (a string) is used as the name of the symbol if that name has not already been used.

AP5 currently needs some help with new equivalence relations. For reasons of efficiency it sometimes enters objects into hashtables. In order to use hashtables supported by common lisp (EQ, EQL, EQUAL, EQUALP), AP5 needs a fact of the following form:

for all x,y,
    (EQUIV x y) iff (EQ/EQL/EQUAL (f x) (f y)) and
    (EQUIV x (f x))

where EQUIV is the new equivalence relation, EQ/EQL/EQUAL is one of the relations EQ, EQL, or EQUAL, and f is a function. These facts are represented by the relation

Relation (canonicalizer equiv-relation EQ/EQL/EQUAL function)

(It would of course be possible to not require canonicalization functions and use less efficient algorithms. AP5 does not do this. We hope the problem will be solved by future extensions to common lisp hashing.)

A convenient way to declare (most) new equivalence relations is:

Macro (lisp-predicate-equivalence-relation name subrel {canonical-equiv {canon-fn}})
This declares name as a Lisp-Predicate-Relation-Name, asserts that it is an equivalence relation and asserts that it is a superrelation of subrel. It also produces a definition for the relation that guarantees that it is in fact a superrelation of subrel and that it never causes an error. The optional arguments are the canonical equivalence (EQ, EQL or EQUAL) and the canonicalizing function. An example use is:

(Lisp-Predicate-Equivalence-Relation string= equal
	equal canonical-cl-string=)

where canonical-cl-string= simply maps symbols to their names. This actually defines the String= relation by the following code:

(lambda (x y) (or (equal x y)
	          (ignore-errors (string= x y))))

Finer Control over Implementation

It is not expected that users will forever be content with the implementations described in this manual. An implementation is really nothing more than a symbol with which to associate capabilities that are shared by a set of relations (those of that implementation). This section tells you how to extend the capabilities of implementations, how to define new implementations and how to tune the systems you build. Each of these is done by telling AP5 more about the relations (or implementations) so that it can better decide how to use them (in the case of tuning), or so that it can interpret requests that previously had no meaning or implement requests that could not otherwise be implemented (in the case of extending relations or defining new implementations).

Tuning

In order to choose among several possible algorithms, AP5 estimates the cost of doing things one way or another. In the process, it uses various data that can be specified to defrelation. Most of this same information can be declared for whole classes of "implementations" (representations, derivations or computations) via the same arguments to defrepresentation, defderivation and defcomputation.

Testeffort provides an estimate of the effort to test a relation. It can be specified as a numerical argument to DefRelation, or more generally as a functional argument for either a relation or an implementation. When provided for an implementation it is taken to apply to all relations of that implementation (except those relations with their own testeffort entries). The function should take as arguments the wff to be tested, i.e., for a n-place relation it will be passed n+1 arguments, the first of which is the relation. The result should be either NIL (meaning that the relation cannot be tested), or a number proportional to the effort for doing the test. There has been little effort so far to calibrate the constant of proportionality, but the intent is that the result be the number of primitive machine instructions (assumed to be constant time) required for the test. To see AP5's estimate of the effort of testing a wff, try the function

Function (testeffort wff)
In many cases the effort is related to the size of the relation, which is estimated by

Function (relationsize vars wff)
This accesses the size data declared to DefRelation, a list alternating between patterns and sizes. A pattern is a list corresponding to arguments of the relation. The elements of the list are either the symbol OUTPUT denoting a variable to be generated, the symbol INPUT, denoting a value to be determined at runtime or a constant. Supplying the pair
(OUTPUT INPUT 13 OUTPUT) 10
in the declaration of a relation, R, means that the query:
(loop for (x y) s.t. (R x a 13 y) count t)
would normally be expected to return a number less than 10, regardless of the value for "a". [This means you can't use INPUT or OUTPUT as a constant. Does anyone need this capability and have a good idea of how to provide it?] A size of NIL means there may be infinitely many tuples. If there are no RelSize tuples for a pattern of all OUTPUT's, RelationSize uses the value of

Variable ap5::defaultrelsize
(initially 100) for this pattern. RelationSize returns the minimum over all most specific (where constants are more specific than INPUT which is more specific than OUTPUT) matching templates of size raised to the power fraction of Var's in the wff, e.g., a relation with 1000 tuples would be expected to generate 100 x, y pairs such that (R x y 1) and 10 x values such that (R x 1 2).

Examples

Relation (eql x y)
is described as follows:

 :size ((INPUT OUTPUT) 1 (OUTPUT INPUT) 1 (OUTPUT OUTPUT) NIL)
 :testeffort (LAMBDA (&REST X) 1)

In other words, the expected size of {x | (EQL a x)} is 1, the expected size of {x | (EQL x a)} is 1, the expected size of {x,y | (EQL x y)} is infinite, and any EQL relationship can be tested in 1 instruction.

The effort to test a base relation is proportional to both the number of tuples in the relation and the size of the tuples:

(DefRepresentation base ...
 :testeffort
 (lambda (&rest pat)
     (* (relationsize (cdr pat) pat) 3 (length pat))) ...)

Extending the Capabilities of a Relation

Relations are expected to present the following interfaces to the world (none is required, but relations that provide them are correspondingly more useful):

Add - a way to add a tuple to the relation
Delete - a way to delete a tuple from the relation
Test - a way to test a tuple for inclusion in the relation
Generate - a way to find (some of) the tuples in the relation
Trigger - in rare cases (described below) AP5 needs help to recognize that a tuple has been added or deleted from a relation

Triggers may only be associated with an individual relation, while the others may also be associated with an implementation (meaning that they are common to all relations of that implementation). If a relation and its implementation both have generators, all these generators are presumed to work for the relation. On the other hand, providing adders, deleters or testers for a relation will cause the corresponding interfaces for the implementation not to be used for that relation. Only if the relation itself has no associated data is the data associated with its implementation used. If neither exists, the relation is incapable of the corresponding action. The user is expected to keep the interfaces to a given relation consistent, e.g., guarantee that after a --, a ?? will return NIL.

There are currently two protocols for adding and deleting tuples. All relations (other than computed relations) may be given adders and deleters, which provide a way to add or delete a single tuple. Stored relations may be given updaters, which provide a way to make a whole set of updates together. If both are provided, AP5 is free to use either. In the case of derived relations (see the section on derived relations) which includes defined relations, the adder and deleter actually perform other database updates. For stored relations the adders, deleters and updaters alter lisp data structures.

In the cases of adding, deleting, and testing, the value is a functional object which must accept as arguments the relation and elements of the tuple to be added, deleted or tested, i.e., (apply macro wff). It should return a functional object (to be compiled) that accepts the runtime values of these same arguments. Testing code should return NIL if the tuple is not in the relation and something else if it is. Adding and deleting code have no useful values. Code for adding and deleting stored relations should not access the database at all. (During update it really will be in an inconsistent state.)

As a simple example, here's a declaration of a stored relation in which an adder, deleter and tester are provided:

(defrelation R1 :representation individual :arity 1
   :tester (lambda (&rest ignore) (declare (ignore ignore))
	      `(lambda (rel x)(declare (ignore rel))
		 (member x *R1DATA*)))
   :adder (lambda (&rest ignore) (declare (ignore ignore))
	    `(lambda (rel x) (declare (ignore rel))
		(pushnew x *R1DATA*)))
   :deleter (lambda (&rest ignore) (declare (ignore ignore))
		`(lambda (rel x) (declare (ignore rel))
		   (setf *R1DATA* (delete x *R1DATA* :count 1)))))

The Updater for a stored relation is a function of three arguments: the relation to be updated, a list of tuples to be added and a list of tuples to be deleted. Each tuple is represented as a list of objects. The function should make no other assumptions about the add and delete lists and their elements (representing the tuples). In particular, it is an error to smash them or to save pointers to them.

It is sometimes desirable for applications to provide relations to users on a read-only basis. This is done by declaring that the relation is not allowed to be explicitly updated:

Relation (not-allowed-to-update relation truthvalue)
This means that the update specified by truthvalue - T for ++ and NIL for -- - is not allowed for relation. If relation is derived, however, tuples may still become true or false. This is mostly useful for defined relations and relations with implementations that have adders or deleters. For instance you might store some data in a relation that you intend to provide on a read-only basis and then disallow further updates. Or you might want to retain the ability for your programs to update a relation that is advertised to others as read-only. This can be done by advertising the relation PUBLIC, defining it as ((x1 ... xn) s.t. (PRIVATE x1 ... xn)), and asserting NOT-ALLOWED-TO-UPDATE about the PUBLIC relation. This form of write protection can of course be overridden - for instance by deleting the Not-Allowed-to-Update tuple. (By default, NOT-ALLOWED-TO-UPDATE is allowed to be updated!)

Generators

Before describing the general case, we mention a simple mechanism for the usual case where you're willing to supply a function that computes a list of all the tuples to be generated. Of course this may be inefficient if there are many such tuples, since the cost of computing and storing them will be mostly wasted if only a few are used.

These forms may be supplied as generators to defrelation:
(SimpleGenerator pattern form {cost})
(SimpleMultipleGenerator pattern form {cost})
Pattern looks like arguments to a primitive wff, but the arguments are all either (distinct) symbols, standing for the inputs, or the symbol OUTPUT, standing for something to be generated. Form is a lisp expression that uses the symbols standing for inputs, and computes the tuples to be generated. The difference between SimpleGenerator and SimpleMultipleGenerator is that SimpleGenerator can only describe a generator for a single output position: form should return a list of such outputs. SimpleMultipleGenerator can describe generators for multiple output positions: form must return a list of tuples of objects, where the tuple corresponds to the OUTPUT's of the pattern in the same order. As a special case, a SimpleGenerator form is allowed to return a single non-listp value which is interpreted as the only value to be generated.

Examples:

(DefRelation + :computation (lisp-function + 2 1)
  :generator
  ((simplegenerator (a output c)
		    (and (numberp a) (numberp c) (- c a)))
   (simplegenerator (output a c)
		    (and (numberp a) (numberp c) (- c a)))))

declares a three place relation - the lisp-function computation supplies one generator:

(listof x s.t. (+ 3 4 x))

returns (7). The other generator arguments allow other kinds of generation, e.g.,

(listof x s.t. (+ 3 x 7))

returns (4).

The optional cost argument is a form that estimates the cost of the generator (roughly the cost of evaluating the second argument). If not supplied, SimpleGenerator assumes that it's cheap.

Generators are somewhat more complex than adders, deleters and testers because one typically wants to generate different subsets of a relation in different ways. AP5 accepts algorithms for a limited class of such subsets, namely those in which certain positions of the tuples are fixed. For instance you can supply an algorithm for generating {x | (R c x)} (where c will be known at runtime), but not for {x | (R x x)}.

The value of a generator argument is a "generator macro", which is a function to be applied to a list of arguments of the form (Subsetflg Vars Rel . Args). Vars is a list of variables, Rel is a relation object, Args is a list of expressions that correspond to the arguments of the relation, and Subsetflg is either a list or T (as its name indicates it used to be a boolean, but it has been generalized). The macro is supposed to return multiple values, each of which is an association list, ((attribute1 value1)(attribute2 value2) ...), that describes a generator (see the description of generators below). Each such generator is supposed to be relevent to the question of how to find {Vars | (Rel . Args)}. Subsetflg is a list containing a subset of vars, or T meaning the same thing as all of vars. The caller is interested both in algorithms that generate these and in algorithms that accept these as inputs. By contrast, members of vars that do not appear in subsetflg must be generated (that is, the caller will throw away any returned algorithms that do not). Any arguments not in vars will be used as inputs (that is, in order to use a returned algorithm that generates at such a position, the caller will generate tuples and discard any that do not match the input it has for that position). In most cases, user supplied generator macros will ignore this argument and only return one generator. However, any macro that actually returns different generators for generating variables at different argument positions should return all such applicable generators when subsetflg is true. In principle, we could always ask for all possible generators, but it may be more efficient to return only those that will be of some use.

The attributes of a generator are Template, Effort and InitialState. Template is a list of items that indicate which positions of the relation must be supplied as input and which can be generated as output. The convention is that the symbols Input and Output indicate the type of each argument. For example, a generator for a StructProp relation (R x y), which stores y in the property list of x, would have a generator with the template (Input Output), indicating that given x, it could generate the y's such that (R x y).

The Effort attribute should be a number that expresses the amount of work this generator will take to generate all the values of variables that are related to the input values. Remember that this estimate must be computed at compile time, before the values are known. If not supplied, Effort is taken to be proportional to the number of tuples in the entire relation (see RelationSize).

InitialState should be a function of the relation object and all of the values of non-variables (in the order in which they appear in the wff), e.g., in order to generate {x,y | (R x 1 y (f z))}, the InitialState should be a function of the form (lambda (rel a b) ...) where rel will be bound to the relation object whose name is R, a will be bound to the constant 1, and b will be bound to the runtime value of (f z). This function should return another function object (normally a closure) which, later will generate one output tuple each time it is called (with no arguments). The actual values to be returned by this function object are a flag indicating that the generator is exhausted and, if not (when the value is NIL), the next bindings of the output variables to be generated (in the order in which they appear in the wff). Of course, if the first value is T, the others don't matter.

Examples

There is only one generator for the Base implementation, which could have been defined as follows:

(DefRepresentation Base ...
    :generator
    ((lambda (subsetflg vars rel &rest args)
      `((template ,(loop for arg in args collect 'output))
	(effort
	  ,(* (relationsize args (cons rel args))
              3 (length args)))
	(initialstate
	  (lambda (rel &rest args)
	    (let ((state code to retrieve the list of tuples))
	      #'(lambda nil
	          (if state
		    (apply #'values nil (pop state))
		    t)))))))) ...)

This generator will be specialized by AP5 for the cases where some of the positions are bound at input, or where the same variable appears more than once, e.g., {x | (R x 1)} or {x | (R x x)}.

Derived Relations

Derived relations are computed from other relations and therefore will, in general, change when those other relations change. AP5 requires the code that implements such relations to follow a few conventions:

It must access the database only via AP5 access mechanisms. That is, it should not duplicate the lisp code used to implement the stored relations on which the derived relation depends. This is important because AP5 does something special to handle temporal reference for stored relations, but it assumes this is not necessary for derived relations since the code for those derived relations will access the adjusted versions of stored relations. (AP5 treats relations that cannot be updated in the same way - see PossibleToAdd.)
Update code (in reladders and reldeleters) must use ++ and -- to do updates of stored relations, and not directly alter lisp data. In other words, the update code for a derived relation translates updates of that relation into updates of the stored relations on which the derived relation depends. AP5 handles these updates during the phase of an atomic update where the atomic transition is being computed (like consistency rules), rather than afterwards when the changes are considered satisfactory and are to actually be made. Conversely, an update macro for a stored relation is expected to change the underlying lisp data and not do database updates. The update macro for a derived relation may access the database (in order to decide what changes to make). The database it sees is the (possibly inconsistent) one that resulted from the previous consistency cycle, i.e., if the update was done by a consistency rule, the macro can see the state that triggered that rule.
If the relation can change, AP5 must be made aware of when tuples are added or deleted, in order to trigger rules and compute starts. The mechanism for doing this is described below. It is also possible to explicitly admit that a relation can change but we are not telling AP5 how to detect such changes. In this case AP5 will not allow rules that must trigger on such changes. See NonTriggerable.

Triggers are declared in DefRelation via the :trigger argument. They specify that the relation being declared is derived from another relation, which kinds of changes to that relation can cause which kinds of changes to this one, and how those changes are computed. For instance, when R1 becomes true we can use F to compute the tuples of R2 (the relation being declared) which thereby become false. AP5 assumes that the set of such supporting relations declared in this way is complete - if none of them change then R2 is presumed not to change either. After AP5 computes all the updates to all relations on which R2 depends, it finds the set of functions that translate those into updates of R2, and calls each such function once to compute updates to R2. The function takes three arguments: the relation for which it is to compute the updates, whether additions to that relation are of interest and whether deletions are of interest. Most functions will compute only additions or only deletions and can thus ignore the last two arguments, but a function that computes both may be able to avoid work if only one is of interest. The function is free to report updates that are not of interest, but AP5 will ignore them. However, it is an error for a function to report updates other than those for which it was declared. The function can access the updates to R2's supporting relations, e.g., R1, or other relations on which they depend, by generating wffs containing Start. It is an error to read other relations. The function should "report updates" by using ++ and -- syntax, as if it were actually updating the derived relation. It is all right for a trigger function to report the same update more than once, but it must not report a fact that was already true as an addition or a fact that was already false as a deletion.

Trigger functions may also be used to generate tuples of the derived relation that "start" to be true or false. It is possible that the function may be called for this purpose even when none of the relations it depends on have changed. (We mention this so that you can write the function to avoid large amounts of unnecessary work.)

A possible future extension is to allow one function to compute updates to several derived relations at once.

In order to trigger rules AP5 must assume (require) that relations not change "behind its back" - all changes to derived relations must be described by trigger. All changes to stored relations must be done with ++ and --. (But see nontriggerable.) An important case to recognize, is that some implementations of lisp actually smash code, for example in macro expansions. Thus you might do

(++ R (setq a '(lambda (x) (push x foo))))
(funcall a 1)
(?? R (setq a '(lambda (x) (push x foo))))

and get the result NIL, even though R compared its arguments with EQUAL. Even worse, there might be a rule prohibiting the macroexpanded version of the lambda expression from being in R, in which case the inconsistency has gone undetected. If such smashing cannot be prevented, another possible solution is to compare the slot with an equivalence that considers the smashed object to be equivalent to the original. (The relation Symbol-Functional-Equivalentp is meant to help solve this problem.)

In addition, the rule compiler needs to know what can possibly change. This is partly a matter of optimization: there's no need to compile code to react to events that can never happen. However there are times when it's more than just optimization. A rule cannot be compiled (and the transition that defines it will abort) if it requires generating a set that cannot be generated. If this is needed only in response to an event that cannot occur, then the rule actually could be compiled if AP5 only knew that the event could not occur. Normally AP5 knows what relations can be updated by the use of :representation or :derivation as opposed to :computation. Finer detail can be provided by the arguments PossibleToAdd and PossibleToDelete.

Relation (possibletoadd relation T-OR-NIL)

Relation (possibletodelete relation T-OR-NIL)
T means that the update is possible, NIL means it is not.

Finally, it is sometimes convenient to think of lisp functions as relations even if you don't want to specify exactly what they depend on or how. Of course, these "relations" cannot be used to trigger rules. This is declared by inserting into the relation

Relation (nontriggerable relation)
This will actually prevent AP5 from compiling any rules that have to react to changes in relation.

See also the discussion of the No-Trigger relation in the section on rule triggering.

Expanding defined relations "in line"

Defined relations may be thought of as either analogous to functions or macros in lisp. There's no semantic difference between the two. However, implementation considerations may favor one over the other. Just as lisp provides both options, AP5 allows a defined relation to be treated either way. Treating a definition as a macro means that one ends up processing larger wffs (which means more compilation time). On the other hand, this makes certain optimizations possible. For instance, if P were defined as (and Q R), then (and P (not Q)) would simplify to False if P were treated like a macro. Indeed there are cases where a query can only be compiled if a definition is treated as a macro. On the other hand, functions have the advantage of allowing sharing of code, e.g., there would be more code shared between (?? or P R1) and (?? or P R2) if P were treated as a function. This translates into less compilation time again and also less space required for the compiled code.

For purposes of ++ (--), AP5 does not distinguish between function-like relations and macro-like relations. If the relation has an updater (adder, deleter), it is used. Similarly, if the relation has been declared impossible to directly add (delete), a compile time error results. Otherwise the definition is used. This may work if the definition is not a compound wff. (Actually, updates of negations and descriptions are translated in a reasonable way.)

For purposes of testing, generating and rule triggering, AP5 normally treats defined relations as functions. One exception is that a relation which is defined by a non-compound wff is normally treated as a macro. Another exception is definitions such as

((x) s.t. (or (eql x 'a) (eql x 'b)))

where the variable could be compared by several equivalence relations. These must be treated as macros since the equivalence will be determined by other uses of the variable in a larger context. If, on the other hand, the definition had been

((x@eql) s.t. (or (eql x 'a) (eql x 'b)))

then the equivalence for x would be fixed and the relation could be treated as a function.

There are two ways of overriding the defaults. One is to declare a relation "inline" with DefRelation, which means that the relation should be treated as a macro. If a relation is not in this set (and equivalence considerations do not force it to be treated as a macro) it is treated as a function. You can add or remove elements of this set.

A finer control is at the level of a given wff. The syntax for wffs allows two special cases:
(InLine wff)
(NotInLine wff)
If wff starts with a defined relation, then that occurrence will be treated as a function (for NotInLine) or macro (for InLine). These are analogous to the lisp inline and notinline declarations.

Nonatomic relations

All of the relations described so far have been assumed to change only as a result of ++ and --. However, this is not always desirable. First, there is data, such at the current time, which changes without any explicit updates. Second, the updates may be done in some program that was not originally written with AP5, but is simply being interfaced to an AP5 program. Finally, there are times when one is willing to give up such amenities as rule triggering and history recording in order to gain performance. AP5 allows such relations to be declared as "nonatomic". This means that they do not follow the normal atomic semantics of changing only as part of an atomic transition. In fact, they may change during an atomic transition, and remain changed even though the atomic aborts!

Declaring a relation as nonatomic means that

you cannot trigger any rules on updates to the relation
you will be warned if the relation appears in a rule trigger at all (for instance, it seems strange to impose the constraint that (Start (P x)) can only be true before 2:30 PM, since the constraint will be checked only at one time during the atomic. If that's 2:29:59 then the atomic might finish successfully after 2:30!)
updates to the relation take immediate effect (unlike atomic relations)
updates to the relation are not stored in the history

Any derived relation which is derived from a nonatomic relation is also nonatomic. If you provide an adder or deleter to such a relation it may only update other nonatomic relations. (Otherwise the update could not take immediate effect.)

DefRelation - putting it all together

Now that we've discussed most of the data that goes into defining a relation we supply the total documentation for DefRelation:

Macro (defrelation name &key representation derivation computation definition implementation documentation arity equivs types type-enforcements size count keep-other-countspecs updater adder deleter tester testeffort generator trigger possibletoadd possibletodelete inline nonatomic idempotent allow-interrupts cache-generators argnames alsofns)

Name is the name of the relation to be declared. The rest are keyword arguments (in this case there are so many that I prefer the normal lisp notation to the bnf shorthand for optional syntax, e.g., {:tester tester}).

Exactly one of representation, derivation, computation, definition or implementation should be supplied. Implementation is treated as a synonym for representation and is allowed for backward compatibility. Representation indicates that the relation is stored, derivation that it is derived and computation that it is computed. Definition indicates that it is defined by a description, and is either derived or computed, but this can be determined by analyzing the description. We refer to these parameters collectively as the "implementation". This is used to determine the relationimplementation of the relation.

In the case of definition, if the value is a list of length three with s.t. as the second element, that value is the description that defines the relation. Otherwise, the value is treated as a macro form to be expanded to produce the description. In these cases the iplementation of the relation is the symbol defined.

Otherwise (not definition), if the implementation parameter is the name of an implementation, that will be the implementation of the relation. If the representation is a list starting with :multiple (currently this is supported only for representation), then each remaining member of the list is a representation of the relation in the same form as described here. This is a way to get multiple redundant representations in order to have a larger choice of generators.

Otherwise the value is used like a macro to compute a new defrelation form. This value is either a list, (F . arguments) or a symbol, F, which is treated just like (F). The arguments convey more information about the relation to be declared. We call a function F used in this context an implementation translator. It should look something like this:

(Defun F (relname keywords &rest arguments) ...)

Relname and keywords are the first argument and the rest of the argument list passed to DefRelation. Arguments is the rest of the implementation parameter (which of course is also contained in the keywords list). All arguments to defrelation other than the implementation are used only by F. It should generate warnings if it decides to ignore them. Normally it will include them in its result. As an example,

(DefRelation R2 :representation (D2 1 4) :adder add-R2)

would be expanded by doing

(funcall 'D2 'R2 '(:representation (D2 1 4) :adder add-R2)
         1 4)

The result might be something like

(:arity 2 :representation MyRep
 :adder add-R2 :tester test-R2)

in which case the original defrelation would be roughly equivalent to

(DefRelation R2 :arity 2 :representation MyRep
  :adder add-R2 :tester test-R2)

The only difference is that the original implementation, in this case the list (D2 1 4) would be related to the relation R2 by the following relation:

Relation (parameter-list relation list)
This is used for describing the relation and may also be used elsewhere, e.g., test-R2 might actually read it.

It is conventional to use the same name for parameterized implementations and the functions that translate them, i.e., the SUM derivation is translated by the Sum function, so that a derivation of the form (SUM ...) translates into a derivation of SUM.

If none of implementataion parameters is supplied, the default is representation base.

Documentation is a documentation string.

Arity is the arity of the relation. If a definition is supplied the length of its argument list will be used as the arity. An error will be signalled if this disagrees with a supplied arity. Some implementation translators supply an arity. Otherwise, if equivs or types are specified, arity defaults to the maximum of their lengths. Otherwise the arity must be specified.

Equivs is a list of comparisons for the slots. (See the section on equivalence.) Each element is either the name of an equivalence relation, :any for an anycomparison slot, or :default. If the list is shorter than the arity, any missing slots will be treated like :default. If a definition is supplied, the comparisons will be computed from it, and any non-defaulted comparison that disagrees will cause an error. Otherwise, defaults are derived from the type constraint of the slots, as specified by the Types parameter.

Comparisons are derived from type constraints as follows: If the type constraint is a description, it is analyzed to find the comparison of its argument. (This could be :any.) Otherwise, if the type is NOT in the AnyComparison relation, then its comparison is used (as it must be for the typeconstraint to be well formed). Otherwise the defaults are examined.

Relation (default-equivalence type equivalence-relation)
If all of the most specific supertypes of a type that have defaults happen to agree on a single default, that one is used. Otherwise a continuable error is signaled from which the user can name a comparison.

Definition is a description of which tuples are in the relation.

Types is a list of type constraints, either type names or unary descriptions. If its length is less than arity, any remaining slots will be unconstrained (type entity).

Argnames allows you to specify a list of names (symbols) for the arguments of the relation. If this argument is not supplied, the argument names can also be supplied using the syntax described in esoteric syntax as part of the definition if there is one, or otherwise as part of the the types argument. For instance, instead of

(Defrelation supervisor :types (person person) :argnames (underling boss))

one might write

(Defrelation supervisor :types (person#underling person#boss) ...)

to indicate which argument was which. For cases where the type argument contains a description, an argument name can be specified as in

(Defrelation supervisor
   :types (((person#underling) s.t. ...) ...) ...)

The argument names for a relation are stored in the relationargnames relation.

Type-enforcements is a corresponding list of enforcements for the type constraints. If its length is less than the length of Types, any remaining enforcements will be defaulted. The default, which may also be specified by the symbol :default, is :none for derived relations and relations with definitions, and :incremental for others. In addition to an enforcement, this argument may supply a reaction in the same way as reactions are supplied to DefTypeConstraints. A given enforcement may be either an enforcement name, e.g., :none, a reaction, i.e., :remove, :norepair or (:repair function) where function is the reaction, or a 2 element list containing an enforcement name and a reaction in either order. Reactions default to :remove. Unlike DefTypeConstraints, neither enforcements nor reactions are sticky.

Size is a list alternating between input/output patterns and numbers, where each pair is a tuple to be added to the relsize relation with the new relation cons'd onto the front, e.g., :size ((input input output) 10 (output output output) 100)

Count is interpreted as a set of argument lists for restrict-cardinality: If you take a list of restrict-cardinality forms, remove the first two elements of each ('restrict-cardinality and the relation name), and then append all the lists together, you'll have an acceptable :count argument. None of these should use the :delete-any-other-countspecs argument. Instead :keep-other-countspecs can be specified in DefRelation. If it is T then all count constraints will be retained. Otherwise it should be a list of patterns of count constraints to keep. The default is NIL. Any count constraints other than these and the ones specified by the :count argument will be deleted.

Updater, Tester and Testeffort are the relupdater, reltester and testeffort of the relation.

Generator specifies a set of generators. It may be either a single generator or a list of generators. A single generator is either a symbol, which is assumed to name a function, a list starting with Lambda, which is assumed to be a function, or a list starting with SimpleGenerator or SimpleMultipleGenerator, which is just like the corresponding form that could be evaluated except that the arguments need not be quoted and the relation name is omitted from the pattern, e.g., (simplegenerator (x output y) (ignore-errors (- y x))).

Trigger, if specified, should be a list of trigger specifications, as in

(defrelation derived-rel ...
 :trigger ((r1 + f1 +) (r1 - f2 -)) ...)

which means that f1 computes additions to derived-rel (that's the second +) from additions of r1 (the first +) and f2 computes deletions from derived-rel from deletions of r1. Notice in this syntax, + and - are used to represent new truth values of true and false. The symbol +- may be used to mean both. This is equivalent to two trigger specifications, one containing + and the other -. (You can use two +-'s in the same specification, in which case it's equivalent to four specifications.) Similarly, the relation may be a list of relation names, in which case it stands for one copy of the specification for each name. If the functions are specified by symbols (function names), the definitions will have to come after the DefRelation since they will do adds and deletes of this relation. Finally, if any of the "function" specifications is NIL, this is taken to mean that there is a dependency that is not specified. This currently has the effect of declaring the relation nontriggerable.

Inline means to declare the relation to be inline. It defaults to T iff no representation is provided, a definition is provided and the definition "expands to" a non-compound wff.

Nonatomic means that the relation is nonatomic as described above. In this case it becomes relevant that additions and deletions to the relation may be idempotent, i.e., the code works whether the fact was already true or not. This can be specified by the idempotent argument, which may be NIL (the default), (:add), (:delete), or (:add :delete). If :add is in the list then it is assumed safe to call the adder or updater to add a tuple without first checking that the tuple to be added is false. Similarly, if :delete is in the list it is assumed safe to call the deleter or updater to delete a tuple without first checking that it was true. For fanatical optimizers, the allow-interrupts argument specifies that updates may be done without the protection of a critical section.

Cache-Generators means to compute a set of generators for the relation. This is likely to save time later when the relation is used. If the value is T then all possible generators will be precomputed. NIL means to compute none. Otherwise the value should be a list of patterns for which generators should be cached. Each pattern is a list whose length is the arity of the relation being declared and whose elements are symbols named INPUT or OUTPUT. The pattern means to cache a generator which accepts bindings for the INPUT positions and generates bindings for the OUTPUT positions. The default value is T for relations of arity 3 or less and NIL for others.

Alsofns is a list of functions to be called in the same transition where the implementation is stored. This is typically used for storing other related attributes of the relation. These functions are passed the relation (not its name) as their only argument.
Note -- The intent of defrelation is that it should be possible to use it to "redefine" a relation without an intermediate state that would cause such damage as removing all of the rules and relations that depend on it. (This would happen if you removed the relation and then defined it again.) Alsofns tends to be used for adding data about the relation, sometimes in the form of other relations that describe the newly defined relation, but not always. When the relation is redefined, often this additional data ought to be removed, or at least modified, and this is currently not supported by the alsofns mechanism.

I now think that this should be replaced by a declarative description of what data is to be added by the current defrelation. This should be stored with the relation, and then defrelation, if the relation already existed, should arrange to compare the new additional data to the old version, and make appropriate changes atomically.

For instance, the replacement for alsofns might report that we are to add the relationship (tclosure R1 R2) as the result of this definition of R2 (meaning that R2 is the transitive closure of R1). In that case, if the relation R2 already existed, we should find the added data for the old definition and compare it to the data to be added for the new definition in order to compute a set of changes to make, e.g., if R2 was not previously a transitive closure at all, we would remove its old data and add the tclosure tuple.

Current uses of alsofns:

unionclass and intersectionclass add subtype relationships (which are not undone on redefinition)

basetype adds subtype relationships (and matchnode?)

lisp-predicate-equivalence-relation (adds canonicalizer,equivalence-relation)

tclosure (asserts tclosure)

partial-index (can remove old index data)

slot(-single), stored-in-place(-single) and all variants (asserts stored-where)

(to be fixed)

When a DefRelation form is evaluated for an already existing relation, the assumption is that its definition is being changed, and the result ought to be something like the result of deleting the relation and then redefining it. However deleting the relation would remove certain data that might have been added after the previous definition, and it's not clear that this is the intent. Our current solution works as follows:

If the representation, derivation, or arity of the relation changes, all adder, deleter, tester, generator, trigger, size, and count data is removed from the old relation. These are presumed to be justified by the arity and representation or derivation of the relation. Regardless of whether the representation, derivation or arity changes, all values provided in these parameters are then added, and any values provided in the previous definition (using DefRelation) but not provided in this definition are removed.

Defattribute - the common special case

Most relations turn out to be binary. For this common case, the following macro offers more succinct syntax than defrelation:

Macro (defattribute relname domain-name range-name ...)
The arguments all translate directly into arguments of defrelation as follows:

Relname is the relation name - the first argument of defrelation,

Domain-name is the first element of defrelation's :types. Range-name is the second element of defrelation's :types.

The rest are keyword arguments:

:Domain-equivalence and range-equivalence are the first and second elements of defrelation's :equivs. :Domain-type-enforcement and :range-type-enforcement are the corresponding elements of defrelation's :type-enforcements.

:Countspec is the count keyword (such as :any) for the pattern (domain-name output). :Inverse-countspec is the one for (output range-name). :Replacing and :inverse-replacing specify that the too-many repairs for these should replace old values. :Default and :inverse-default are the corresponding default value functions for too-few repairs. :Count-enforcement and inverse-count-enforcement are the corresponding enforcements.

:Size is the number of tuples for the pattern (domain-name output), :inverse-size is the number for the pattern (output range-name), and :total-size is the number for the pattern (output output).

:Inverse-name is the name for another relation to be declared as the inverse of this one, i.e., if this relation relates x to y then the inverse relates y to x. (This does not correspond to any argument of defrelation.)

:Definition, :derivation, and :computation are as in defrelation.

Creating new representations, derivations, computations

New representations and derivations can be declared via these two macros:

Macro (defrepresentation name &key updater tester generators testeffort checkers idempotent)

Macro (defderivation name &key updater tester generators testeffort)

Macro (defcomputation name &key tester generators testeffort)

Database representation of domain model

While we have provided a lot of convenient macros for declaring relations and rules, these are also objects in the database. Specifically, all rules and relations are DBObjects with various attributes. For relations some of these have been mentioned before. Many others are described later (under "Finer Control over Implementation").

Formally, we do not regard the declaration of a relation as making infinitely many facts true and/or false (even though the declared relation might have infinitely many tuples). Rather, we regard it as adding a small number of facts about an existing (but previously "unused") object. The relations affected by a declaration include the following.

Relations

Relation (relationname relation name)
means that name is the name of the relation object relation.

Relation (relationarity relation arity)
means that arity is the arity of the relation object relation. An object with a RelationArity is, by definition, a relation object.

Relation (relationimplementation relation implementation)
means that implementation is the representation, computation, or derivation (a symbol) of the relation object relation.

Type (relation relation)
is true of all primitive relations, i.e., (listof relation) will give a list of relations (not their names).

Relation (wffdefn relation description)
associates relations with definitions.

Relation (relationargnames relation argumentlist)
This records the argument names supplied with the definition of the relation. The list can be nil, in a case like

(Defrelation supervisor :arity 2)

or a list of nil's, in a case like

(Defrelation supervisor :types (person person))

or a list of more useful symbols, e.g., (underling boss), in a case like

(Defrelation supervisor :types (person#underling person#boss))

or in a case like

(Defrelation supervisor :definition ((person#underling person#boss) s.t. ...))

or in a case like

(Defrelation supervisor :arity 2 :argnames (underling boss))

Relation (doc symbol symbol string)
This is the relational counterpart of the commonlisp Documentation function. In particular, relation names may have documentation associated with the symbol Relation and rule names with the symbol Rule. These are printed by DescribeRel and DescribeRule and may be included in the forms that are normally used to declare relations and rules.

Relation (basetype type)
means that x is a basetype. The adder for a basetype classifies an object as the basetype. (However there is a rule that may remove this classification if the object is also classified as a subtype.) The deleter removes such classifications.

Relation (classification x type)
means that x is of type t1. There is no requirement that for every type t1 s.t. (t1 x), x be classified as a t1 (or any subtype of t1). Of course, every member of a basetype is classified as a subtype of that basetype. Normally it's more useful to ask about (t1 x) than (Classification x t1).

changes arbitrarily As an example, suppose you were given the functions

HardCopy-Device-P(string) ; is string the name of a printer
 Add-HardCopy-Device(string) ; make string the name of a printer

You might declare a HardCopy-Device relation as follows:

(DefRelation HardCopy-Device :arity 1 :representation individual
    :tester
    (lambda (&rest ignore) (declare (ignore ignore))
	   '(lambda (rel x) (declare (ignore rel))
		    (HardCopy-Device-P x)))
    :adder
    (lambda (&rest ignore) (declare (ignore ignore))
	   '(lambda (rel x) (declare (ignore rel))
		    (Add-HardCopy-Device x)))
    :nontriggerable t)

The nontriggerable means that Add-HardCopy-Device might be called directly, as opposed to being done as a result of (++ HardCopy-Device ...). If you wish to use HardCopy-Device in rules you must "promise" that this will not happen, i.e., remove the :nontriggerable t. In this example, there is no way to remove a HardCopy-Device or to generate all HardCopy-Devices. If these capabilities existed they could be included in the DefRelation form. }

Representations, derivations and computations are all represented by the single relation:

Relation (implementation symbol)
Those that are derived or computed are represented by the subtype:

Relation (derived implementation)

Many attributes are shared among these implementation objects and relations. In general, a relation "inherits" these attributes from its implementation. The following are all parameters of defrelation:

Relation (testeffort rel-or-imp function)
Relation (relsize relation pattern size)
Relation (relupdater rel-or-imp function)
Relation (reladder rel-or-imp macro)
Relation (reldeleter rel-or-imp macro)
Relation (reltester rel-or-imp macro)
Relation (relgenerator rel-or-imp macro)

Triggers are not inherited from implementations. They are represented as follows:

Relation (trigger rel1 tv1 function rel2 tv2)
This implies that rel2 is derived from rel1, and that whenever a tuple of rel1 acquires the truth value tv1 (either T or NIL), the function will compute tuples of rel2 which therefore acquire truth value tv2.

Similarly the following are only defined for relations.

Relation (possibletoadd rel t-or-nil)
Relation (possibletodelete rel t-or-nil)
Relation (inlinerel relation)

Rules

Type (rule x)
is true if the object is a rule. More specifically, it is the union of the following subtypes of rule:

Type (consistencyrule x)
Type (consistencychecker x)
Type (automationrule x)
Type (automationgenerator x)
Type (maintainrule x)
(See below for a description of maintainrules.)

The components of consistency rules are accessible via:

Relation (ruletrigger rule trigger)
Relation (rulename rule name)
Relation (rulecode rule code)
Relation (constraint-enforcement-level rule enforcement)
Relation (inconsistency-reporter consistency-rule function)

Other rules also have names, triggers and code.

Particular types of consistency rules are also accessible from the database:

The following relations provide access to count constraints:

Relation (multiple-for relname pattern)
means that there is a rule with a trigger that looks like the kind that Restrict-Cardinality would produce if passed the relation name and the pattern with a countspec of :multiple or :unique.

Relation (optional-for relname pattern)
is the corresponding thing for countspecs of :optional or :unique.

Relation (none-for relname pattern)
is the corresponding thing for countspec :none.

Relation (unique-for relname pattern)
means bot optional-for and multiple-for.

Relation (any-for relname pattern)
means none of the above.

SubRel is actually defined as the transitive closure of:

Relation (isubrel subrel superrel)
This means that subrel is an "immediate" subrelation of superrel. ISubrel is actually a maintained relation that reflects the presence of a constraint which requires every tuple of the subrelation to be a tuple of the superrelation. Such constraints are recognized by the form of their triggers.

Relation (subreltrigger subrel superrel arity trigger)
means that a consistency rule with the given trigger would require that subrel be a subrelation of superrel (and that it is recognized as such by ISubRel).

Relation (typeconstraint relation position type)
means that there is some rule whose trigger looks like the kind created by DefTypeConstraints which requires the position'th element of any tuple in relation to be of type type. Positions start at 0.

It turns out that the form of a type constraint for a unary relation is identical to the form of a subtype constraint. In fact, the rule that is created either way will be recognized as an instance of both. The only difference is in the default enforcement and the action to be taken in order to maintain consistency. The relevant relation for disjointness constraints is:

Relation (disjointtypes type1 type2)
This is true if the two types have been declared disjoint or are subtypes of two such types.

Relation (idisjointtypes type1 type2)
This relates pairs of types that have a rule specifically declaring them disjoint.

Rules that Maintain Relations

Efficiency considerations sometimes dictate that a relation that could be defined ought to be cached. For example, the definition is expensive to access, and the relation is accessed much more often than it is updated. In AP5, the way to get the right thing to happen is to create a rule that causes the (stored) relation to track the definition. This could be (and in fact used to be) done by consistency rules. Unfortunately, such an implementation does not quite achieve the same effect as a defined relation. For example, an update that appears to cause a change in a defined relation might trigger a consistency rule which "undoes" that change. If the defined relation were replaced by a stored relation that was maintained by another consistency rule, the attempt to "undo" the change would cause an abort (whereas the defined relation is free to change back).

Instead of using consistency rules, AP5 has another type of rule, called a MaintainRule, which attempts to make a maintained relation behave just like a defined relation. This will enable users to start out by defining relations and later optimize their programs by changing them to maintained relations.

The only difference between defined and maintained relations (other than performance) is that defined relations can have update macros that say how the relations in their definitions should be altered to change the truth value of the definition. Of course a maintained relation has to be stored, so its update macros have to describe how its stored value is changed. If you want to maintain a defined relation with update macros, you'll have to divide it into two relations: R-Maintained, with the original definition, and R, which is defined as R-Maintained and has the update macro. Then you can use R as before, but with different performance.

Note:
Some versions of AP5 may assume (and not check) that maintained definitions are not circular.

AP5 has a consistency rule that causes relations to be maintained whenever they have an implementation other than DEFINED and a definition. ("Defined" relations, i.e., those that are always derived from definitions, also have definitions, but their implementation is DEFINED.) The relation

Relation (relation-in-defn rel rel-or-rule)
means that rel is a relation that appears in the wffdefn of rel-or-rule (if it's a relation) or the ruletrigger of rel-or-rule (if it's a rule). The term "appears in" includes macro expansion but not expansion of defined relations. Thus if we define R2 in terms of R1 and R3 in terms of R2, this relates R2 to R3 but not R1 to R3. For that you should use

Relation (relation-in-defn* rel rel-or-rule)
which is the transitive closure.

Note:
While this all reacts properly to changes in wffdefn's, it is assumed that macro definitions are stable in the sense that you don't change macros used in ruletriggers/wffdefns in such a way that they expand into a different set of relations. If you really must do this you should probably undeclare those rules/relations first and then redeclare them. I'm afraid the only way to find out what definitions use a macro is to iterate over all the definitions.

Maintained relations are stored in the manner specified by their implementations. For example, if you do

(DefRelation PersonPhone :implementation structprop
   :definition
   ((x y) s.t. (E (office) (and (PersonOffice x office)
				(OfficePhone office y)))))

you will get a structprop relation which tracks the definition. The relation

Relation (maintainedrel relation defn rule)
means that the rule maintains the equivalence between the relation and the definition. (In fact, MaintainedRel is a maintained relation!) The rule that maintains a relation is of type
Type (maintainrule x)

These rules always assume that, at the time of rule declaration, the relation matches the definition, i.e., that the relation need only be maintained incrementally. Usually this requires only that the definition be declared before any updates that make it true, e.g., in the PersonPhone example, before any PersonOffice or OfficePhone tuples are added. By default, AP5 will generate an error if you try to directly update a maintained relation, since this is supposed to be done only be the maintaining rule. However, if you discover that the maintained relation is wrong (for example if you failed to insert a tuple before asking that the relation be maintained), you can update it with Variable ap5::fix-maintained-relation
bound to a non-NIL value.

Miscellaneous

First come those things that are so miscellaneous that they don't even belong in one of the following sections:

The object of a high level compiler like AP5 is that you shouldn't have to think about the algorithms it chooses. In general, this is all right if the computations will be cheap. AP5 tries to warn you when this is not the case. In particular:
Variable ap5::geneffortwarning
holds a number (or NIL, meaning infinite). Whenever a generator is compiled with an estimated effort of this amount or more, a warning is printed.
Variable ap5::gensizewarning
is similar, but causes warnings when the estimated number of tuples is large.
Variable ap5::triggereffortwarning
is a value to which ap5::GenEffortWarning is rebound when compiling trigger code. Since triggering code may run on any atomic transition, the default value is somewhat lower than that of ap5::GenEffortWarning.
Variable ap5::triggersizewarning
is the analagous variable for ap5::GenSizeWarning.

Variable ap5::prototype-mode
If this is NIL (the default), the results of AP5's compilation (into lisp code) are further compiled by the lisp compiler. If you don't actually expect to run that code very much you can save compilation time by turning on this switch.

More esoteric wff syntax

By default DefRelation declares a macro for the name of the relation being declared (unless it already has a function or macro definition). We will refer to this as a relation macro. If we define a binary relation R, then we get the following macro expansions:

(R x y) => (?? R x y)
(R x) => (any y s.t. (R x y))
(R) => (any x y s.t. (R x y))

This macro also treats any symbol that looks like a variable named "$" as a new existential variable. Multiple uses of $ need not refer to the same object. Similarly, any symbol that looks like a variable named "?" (for compatibility with earlier versions of AP5, "~" is treated exactly the same as "?") is treated as a place holder for outputs to be generated. Any missing arguments are filled in by ?. Thus:

(R ? y) => (any x s.t. (R x y))
(R ? $) => (any x s.t. (E (y) (R x y)))
(R $ $) => (?? E (x y) (R x y))
(R $) => (any y s.t. (E (x) (R x y)))

In any syntactic context where a wff is expected, certain abbreviations are permitted. These are expressed by special syntax for arguments of primitive relations and are allowed regardless of whether the named relation has a relation macro. Arguments of primitive relations are interpreted as follows:

Names of bound variables refer to those variables.
$ introduces a new existential variable as in the case of relation macros.
Any list starting with a relation (name) followed by the right number of arguments or fewer, exactly one of which named "=" refers to any object which satisfies the relation when substituted for the =. Missing arguments are filled in with $'s.
Any other list starting with a wff macro (see below) is macroexpand-1'd and the result is interpreted as the argument - note this list may contain = and may also macroexpand into things containing =.
Anything else is treated as a lisp expression (or constant) to be evaluated in the context of bindings outside the wff. This case, of course, includes uses of relation macros as described above.

Note that ?, =, and $ are notational shorthands. Anything that can be expressed with them can also be expressed without them. They allow more concise (one might also say "cryptic") expression of certain things, by giving up control of where the new quantifiers are to be placed. The answer, in the case of $ and = is that the scope of the quantifier is just the primitive wff containing the $ or =. Thus

(?? not (P x (Q x =))) => (?? not (E (y) (and (P x y) (Q x y))))

as opposed to, say (?? E (y) (and (not (P x y)) (Q x y)))

(?? A (x) (implies (P x) (Q x (R = $))))] =>
 (?? A (x) (implies (P x) (E (y z) (and (R y z) (Q x y)))))

as opposed to (?? E (y z) (A (x) (implies (P x) (and (R y z) (Q x y)))))]

Although the rules above look complicated, the syntax is not really difficult to use. You can use expanddescription to find out for sure what an expression means. We present some examples for motivation:

((x) s.t. (and (spouse x $) (not (child x $))))
 ; married people without children
((x) s.t. (office x (office p =)))
 ; people who have an office in common with p
((x) s.t. (office x (office p ?)))
 ; people in an arbitrarily chosen office of p
 ; note the difference (unless p necessarily has a unique office)
((x) s.t. (phone (office x =) 226))
 ; people with an office that has 226 as a phone

Note that ((x) s.t. (office x)) is an error - the trailing $'s are supplied only for lists that appear as arguments and not for top level wffs.

Wff syntax allows variables to be introduced (either by quantifiers or descriptions) with types and equivalence relations by using a symbol containing the characters "#" and "@". In the most general form a variable is introduced as typename#varname@equivname.

If the equivalence is left out it is defaulted as described in the section on equivalence. The typename defaults to entity. If a typename is supplied then the variable name defaults to the type name. Thus person# is the same as person#person. After the variable is introduced it should be referred to by its name, e.g., (A (person#p) (Q p)), as opposed to (A (person#p) (Q person#p)). $, =, and ? can also appear in combination with @ and #. Also, commonlisp type specifiers can be used as AP5 type names.

Finally, there is a limited mechanism for extending wff syntax. Wherever AP5 expects a wff, if it sees a list whose CAR is a symbol but not a known relation name or piece of wff syntax, it will try a macroexpansion, treating the result as a wff.

Currently defined macros include:
Macro (all vars wff)
All is a synonym for A.
Macro (exists vars wff)
Exists is a synonym for E.
Macro (implies wff1 wff2)
This is true if wff1 implies wff2, i.e., (OR (NOT wff1) wff2)

Special symbols are also supported on some machines, namely those which support the use of the special characters in symbols.
Macro (equiv wff1 wff2)
This is true if wff1 has the same truth value as wff2, i.e., (AND (implies wff1 wff2) (implies wff2 wff1)).
Macro (~ wff)
This is a synonym for not.
Macro (xor wff1 wff2)
This is true if wff1 has the opposite truth value from wff2, i.e., (NOT (equiv wff1 wff2)).
Macro (if-then-else if then else)
The arguments are all wffs. If if is true then this is equivalent to then, otherwise it's equivalent to else, i.e., (OR (AND if then) (AND (NOT if) else)).
Macro (change vars wff)
(Change vars wff) is a temporal wff which is true if the truth value of wff has changed for any binding of vars. It translates into (E vars' (or (start w') (start (not w')))) where vars' and w' account for typed entries in vars.

Also see the section on partial orders, which are implemented as macros.

When you create such macros for use in wffs you must inform AP5 of that fact by adding them to the relation:
Relation (wffmacro symbol)
This says that the symbol should be macroexpanded in a wff to produce another wff. Such symbols should probably also be added to:
Relation (names-not-allowed-for-relations symbol)
This prevents the use of symbols as relation names. It's probably a bad idea to remove symbols from this set.

Under normal circumstances it's very odd to introduce variables and not use them at all. AP5 therefore generates errors or warnings for unused varibales. If you really need to do this (for instance you need to pass a binary relation to some function but you want to define it in terms of only one argument), you can use the syntax:
(ignore-vars var-name-list wff), in place of wff, as in
((x y) s.t. (ignore-vars (y) (R x 1)))
The ignore-vars should appear immediately after the introduction of the variables to be ignored, i.e.,
(E (x y) (ignore-vars (x) (and (P y) (Q y)))) rather than (E (x y) (and (P y) (ignore-vars (x) (Q y)))).

Functions useful for extending the environment

This section lists a number of internal functions that have been useful in previous attemots to extend the programming environment. We make no claims for completeness.

Function (ap5::parse-var symbol)
This takes a variable name, potentially of the form type#name@compare and returns three symbols (as multiple values), coreponding to the variable name, the name of its type and the name of its comparison. Type and comparison values are NIL if not supplied in the input symbol.

Function (ap5::get-comparison relation position {error})
Given a relation and a position this returns the comparison. It is similar to

(any x s.t. (compare-relation relation position x) 
    ifnone (dbo relation eql))

except for a few special cases, which generate errors if error is non-NIL:

If (anycomparison relation position) the value is T
If relation is an equivalence relation the value is (list relation)
If relation is defined in such a way that the specified argument is used only in equivalence relations, the value is a list of those equivalence relations.

Function (ap5::possibletoupdate relation truthvalue)
tells whether tuples can be added (if truthvalue)) or deleted (if not) to relation. It uses the relations PossibleToAdd, PossibleToDelete, relupdater, reladder and reldeleter.

Function (ap5::descriptionp x {allow-eval-args})
This is the same as (?? Description x) when allow-eval-args is NIL. Otherwise it accepts things that are almost descriptions but contain lisp expressions to be evaluated, e.g., ((x) s.t. (relationname x y)) would qualify.

Variable ap5::*potential-wff-macro-form*
When a macro is being expanded in order to make sense of a wff, this variable will be bound to the macro form. This enables you to "overload" macros to do one thing if they're used in lisp code and another if they're used in wffs.

Advising, debugging, optimizing Rules

Rule triggering is performed by a match network conceptually similar to that used by the OPS family of production systems interpreters. The match network consists of objects called MatchNode's, identified by the type
Type (ap5::matchnode x)

Every matchnode is responsible for detecting tuples matching a description corresponding to the relations:
Relation (ap5::matchvars matchnode variable-list)
Relation (ap5::matchwff matchnode wff+)

There are two ways in which the list (variable-list s.t. wff+>) differs from a normal description. First, the variables are internal structures rather than symbols. Second, Wff comes from a slightly extended set of wffs. In addition to the normal Start syntax, there is a new form, (Start* wff), which means that this matchnode detects tuples such that (Start wff) under the assumption that (Previously wff) is never true of any tuple. (It turns out that this can be detected in certain cases where (Start wff) cannot be detected, and this allows the enforcement of certain consistency rules - which after all are presumed not to have been violated in the previous state.)

Matchnodes are also related to each other via the relation:
Relation (ap5::matchsuccessor predecessor successor)
which means that the tuples recognized by predecessor are used as inputs to successor in order to help it recognize its own pattern.

The relations
Relation (ap5::matchconsistency matchnode ConsistencyRule)
Relation (ap5::matchautomation matchnode AutomationRule)
Relation (ap5::matchmaintain matchnode MaintainRule)
relate matchnodes to the rules (of corresponding type) that they trigger.

Advice for every transition

As a convenience for causing something to happen after every database transition, we advertise
Variable ap5::*advice-for-every-transition*
This is a list of functions of no arguments to which users may add their own entries. On every transition every such function will be called. The only real difference between such advice and an Automation Generator with the trigger (nil s.t. true) is that the advice will never be turned off by with-dormant-rules. (The uses we have seen for such advice are things like updating the display to reflect changes in the database.)

As a convenience to those who wish to write such code, the following function is provided:
Function (map-updates mapfn ...)
This calls mapfn for each update that matches a pattern specified by the keyword arguments. The parameters passed to the mapping function are the relation, the tuple, the truthvalue and possibly some other attributes (presently unspecified) of the matching update. (Thus you should use a function with an "&rest ignore" parameter.) An update matches if it satisfies the constraints imposed by all of the non-NIL keyword parameters:
rel is either a relation, a list of relations or one of a few keywords described below. If the value is a relation, a matching update must be an update of that relation. If the value is a list, a matching update must be an update of a relation in the list. The default value is :Stored&NotMaintained, which is equivalent to a list of all relations that are stored (not derived) and not maintained. Another legal value is :stored which is equivalent to a list of all stored relations including maintained relations. While it is legal to ask for updates to derived relations (such as transitive closures or defined relations), these may cause runtime errors (for instance the updates cannot be generated) and they are likely to be much more expensive to compute than updates to stored relations.
relpred is a predicate of relations. The relation of the update must satisfy that predicate
tuplespec is a list of objects and occurrences of the symbol :any. A matching update must be an update of a tuple of the same length as the list which is equivalent (with respect to the relation being updated) to the list in every position except those containing :any.
tuplepred is a predicate of two arguments: the new truthvalue and the tuple of objects acquiring that truth value. The truth value and update must satisfy that predicate.
tv is a list of truthvalues. The new truthvalue must be in the list. The default value is (T NIL). (This could be encoded in the tuplepred but in some cases it's more efficient to use this argument if its value is not the default.)

It is recommended that after using this function to identify interesting updates, further processing be done via normal AP5 mechanisms, so as to avoid having to worry about such things as equivalence relations.

Debugging Rules

We advertise for purposes of debugging the fact that all rules are invoked by
Function (ap5::doconsistencyrule rule inputs)
Function (ap5::doautomationrule rule inputs)
Function (ap5::domaintainrule rule inputs)
Breaking or advising these functions will provide access to every invocation. Also, during rule execution,
Variable ap5::*current-rule*
is bound to the rule that is being run.

Also, at the start of every consistency cycle,
Function (ap5::start-ccycle cycle# updates delta)
is called. Its arguments are a consistency cycle counter (starting at 1 for each atomic), the set of updates to be incorporated into delta and the delta at the start of the cycle. This is, of course, the same as the delta at the end of the previous cycle, except for the first cycle, where it's the delta at the end of the last reveal, or NIL if there was no reveal.

For purposes of performance monitoring, we advertise two internal functions used for matching rule triggers:
Function (ap5::do-matchcode matchnode inputs)
Function (ap5::do-matchcompare previous matchnode tuple)
The first of these computes tuples described by the matchnode (those related to the inputs), and the second checks whether a tuple has been found already. If certain atomic transitions take longer than you expect, monitoring these can tell you whether the time is going into trigger matching and, with the right monitoring tool, what the problem is. This in turn might influence you to change some relation representations, maintain some relations, change a rule trigger, declare some no-trigger's, etc. (We have a tool called record-calls that seems appropriate for this sort of monitoring.)

Optimizing Rules

It is not difficult to create special purpose matchnodes that interface to the match network even though they work very differently from matchnodes built by the rule compiler. However this degree of optimization seems not to be normally needed and so (until demand is demonstrated) we do not describe here how to do this. Rather we concentrate on mechanisms for giving the rule compiler advice that it can use to produce more efficient match networks.

For non primitive wffs, the relation
Relation (no-trigger description)
advises the rule compiler not to bother trying to detect instances of the description. Although descriptions are more general than the information conveyed by PossibleToAdd and PossibleToDelete, there's a corresponding disadvantage: the rule compiler will only recognize that very description, and not other equivalent forms. This relation is therefore really only useful after you've seen the matchnodes that your rules compile into. When adding tuples to this relation you should use the function
Function (canonicalize-trigger description)
as in (++ no-trigger (canonicalize-trigger '(vars s.t. wff))). This is necessary (at present) because the rule compiler uses ExpandDescription in order to canonicalize wffs, and the same wff may canonicalize differently in different worlds.

In the interest of efficiency, you may want to take a look at the triggers of the matchnodes that were built in order to enforce a rule, so you can declare some as No-Trigger's and deactivate them. Similarly, if you notice that certain updates take a long time, you might be interested in looking for costly and unnecessary matchnodes that are activated by the update.

Function (matchnodes-< rules)
returns a list of matchnodes that are used to trigger the rules. If rules is T then all rules are examined.

Function (matchnodes-> rels {changetypes})
returns a list of matchnodes that can be triggered by changes in the relations in the list rels. If rels is T then all relations are examined. Changetypes defaults to '(:add :delete) which means to find the matchnodes that can be activated by either additions or deletions of the relations. The arguments '(:add) and '(:delete) can be used to get only the matchnodes that can be activated by one or the other.

In order to look at the triggers of matchnodes, it will be convenient to use
Function (mnodetriggers MatchNode-List)
which returns a list (corresponding to the input matchnodes) of descriptions which, if added to the No-Trigger relation would prevent the matchnode from being built.

Function (rules-sensitive-to rels {changetypes})
returns a list of rules that could be triggered by changes in the relations in the list rels. If rels is T then all relations are examined. This is useful for browsing. Changetypes is interpreted as above. See also the relation Relation-in-Defn.

The following functions may be useful for debugging or for efficiency hacks, but of course, they might endanger the integrity of the database:

Function (deactivate-rule rule)
This effectively stops enforcing the rule.
Function (reactivate-rule rule)
This undoes the effect of a DeActivate-Rule.
Macro (with-dormant-rules rules . forms)
evaluates the forms as if the rules specified by rules did not exist. There are several ways of specifying the set of rules. The most extreme case is T, meaning all rules, even maintaining rules. This means not only that you are supplying a set of updates that satisfies all consistency conditions, but that you are also supplying the updates to maintained relations. Specifying T also binds ap5::Fix-Maintained-Relation to T. It doesn't make any sense to run consistency rules or automation rules without running maintaining rules, since the maintaining rules determine in part which of the other rules will trigger. The more usual cases are that you want to turn off all consistency rules or all automation rules. These are done by using :consistency and :automation, respectively. Otherwise rules should be a list of symbols. The symbols :automation and :consistency mean the same as above. Any other symbol is interpreted as the name of a rule.
Function (deactivate-matchnode matchnode {classes})
Deactivates the matchnode. Classes is a list of symbols from {:consistency :maintain :automation}. This allows you to retain the matchnode for use in the other classes of rules.

For example, one might do

(++ No-Trigger
     '((x y) s.t. (and (Start (P x)) (Previously (Q y)))))

if you knew (but AP5 didn't) that this was impossible. Similarly, you might do

(++ No-Trigger '((x y) s.t. (and (Start (P x)) (Q y))))

if you knew that this condition was not of any interest by itself but was only being used to trigger ((x y) s.t. (start (and (P x) (Q y)))), and if you knew that (and (Start (P x)) (Previously (Q y))) was impossible. The reasoning involves knowing that in order to notice (start (and (P x) (Q y))), the rule compiler combines noticers for (and (Start (P x)) (Q y)) and for (and (Start (Q y)) (P x)). Since (and (Start (P x)) (Previously (Q y))) is impossible, the only cases that would be noticed by (and (Start (P x)) (Q y)) are actually those where (and (Start (P x)) (Start (Q y))), which would also be noticed by (and (Start (Q y)) (P x)).

Of course, this declaration can also be used to tell the system not to worry about a condition that could in fact arise, if you're willing to take responsibility for either seeing that it doesn't or noticing and taking appropriate action when it does.

Transition History

An additional piece of data that may turn out to be useful (more generally than for debugging rules) is that AP5 records all the transitions. [We reserve the right to change the representation] but as of this writing:
Variable ap5::global-history
is a list of transitions in reverse chronological order.

For each transition:
Function (ap5::originalupdates transition)
is a list of update records, one for each ++ or -- that was performed in the atomic (not including defined updates or rules).
Function (ap5::consistency-cycles transition)
is a list of consistency cycle records, again in reverse chronological order. For each such cycle the following are defined:
Function (ap5::direct consistency-cycle)
Function (ap5::indirect consistency-cycle)
Function (ap5::maintain consistency-cycle)
Function (ap5::again consistency-cycle)
Function (ap5::delta consistency-cycle)

Direct records non-noop updates to stored relations (which may include inherits [This derives from a context mechanism that is no longer supported.]). DirectDelta is a version of Direct in which inherits that change the truth value of a tuple have been replaced by updates that explicitly describe the effect of the inherit (by pretending to be additions or deletions) and inherits that do not affect the truth value have been deleted. Indirect records "virtual updates" computed by triggers (described later). Maintain records updates to maintained relations. Again records updates that were noops or repetitions. Delta is the union of DirectDelta, Indirect and Maintain.

Function (delta)
displays atomic updates (the "originalupdates" component), starting with the most recent. After each one it asks whether you want to continue.

Function (showdelta {:record record} {:field field})
displays atomic transactions. Record may be either a number, meaning the record'th atomic transition from the last (default is 0, meaning the last transition), or a consistency cycle as described above. This is especially useful in debugging aborted transitions, since abortdata typically ends with an entry that was to have been added to global-history. Thus (car (consistency-cycles (car (last abortdata)))) is frequently a useful record argument to showdelta. Field is one of symbols (from the AP5 package):
Direct, DirectDelta, Indirect, Maintain, Again, Delta. The default is Delta.

Variable *max-history-length*
can be set to a positive integer to prevent global-history from growing indefinitely (and using up too much space). The value is the number of items to be kept in global-history. Setting this variable (or setting global-history itself) is generally acceptable, but doing so while in the scope of the TRANSACTION macro might prevent you from backing out of the transaction (if the history that has to be undone is lost).

Checkers

This section does not really describe any rules, but rather a hook that is occasionally useful.
Relation (checker rel-or-imp function)
can be used to insert code that will be run before updates are made to the (primitive) relation (or any relation of the implementation). Internally this has been used mainly to check for datatype constraints, e.g., a StructProp relation can only be added for a DBObject. (This is inconvenient to say in a consistency rule, because we're talking about any relation of this implementation. Rules cannot be triggered on wffs containing relations that are not identified until runtime.) Function accepts the same arguments as an update function: a relation, a list of tuples to be added and a list of tuples to be deleted. If it returns NIL then the update will abort. This could be used for debugging.

Multiple Processes

Since multiple processes are not part of commonlisp, they have received no attention in this manual. However, for machines that do support processes there are some difficult database issues. In particular, how can you write programs to run (and behave properly!) while the database is changing in unpredictable ways? Even what appears as a simple database query may return odd results. For example, suppose you do (listof x s.t. (P x)). It is conceivable that your query starts in a state where the answer ought to be NIL, that in the course of evaluating the query an atomic update occurs which yields a state in which the answer ought to be (1 2), and that the query actually returns (2)! The point is that queries take time to evaluate, and, even if they don't run while the database is being updated, they can run partly in one state and partly in another. The result might not actually be the "right" answer for any time during which the query was being evaluated! Often this is not important. However, if it is, you can use:
Macro (withwriteaccess . body)
which prevents any other process from doing a WithWriteAccess during the execution of the body. The Atomic construct puts all the forms to be evaluated along with the actual database updates in the same WithWriteAccess. This means that all database access code in the atomic will be done in the same state, and all updates from the atomic will be made to that same state. Of course, it may be undesirable to lock out other processes for a long time.

Large Wffs

Wffmacros and (inline) defined relations are a significant source of power for the user of AP5. Unfortunately, they cannot in general help AP5 to compile the code the user writes with them. (See the next section if you need convincing.) Thus one will occasionally see a concise looking piece of specification that takes a long time to compile, or cannot be compiled. One reason that it can take a long time to compile is that the short wff actually expands into a very long wff. In order to see the expanded form of a wff, you can use:
Function (expanddescription description {:allowevalargs allowevalargs} {:keepstarts keepstarts} ...)
For example:

(expanddescription '((x) s.t. (xor (relation x)(type x))))
((Var-X)
   S.T.
   (OR (AND (A (Var-F76109)
	       (NOT (RELATIONARITY Var-X Var-F76109)))
	    (RELATIONARITY Var-X 1))
       (AND (E (Var-F76108) (RELATIONARITY Var-X Var-F76108))
	    (NOT (RELATIONARITY Var-X 1)))))
NIL
NIL

The value of allowevalargs should be T if you intend to use lisp expressions as arguments. The additional values record the uses of such lisp expressions and the uses of expressions to be used as relations (in funcall and apply wffs). The value of keepstarts should be T if you do not want START's translated into PREVIOUSLY's.

On occasion, you will notice that the resulting wff could be simplified. Alas, the simplifier used by AP5 is imperfect, and this can force the compiler to do extra work, or occasionally even fail where you might be able to succeed. Although we can imagine ways of improving the simplifier, it will probably always miss some simplifications that you can see. In the example above, the first disjunct is actually impossible - if something has an arity of one, then it must have an arity! In such a case, you can always lend a helping hand and provide the simplified version of the wff to AP5.

Decaching Compiled Code (and other data)

AP5 does a lot of compilation in addition to the functions that you explicitly ask to be compiled. It internally caches most of this compiled code. Unfortunately, when you change some data that AP5 uses in order to compile this code, the cached data becomes obsolete. This section describes some of these caches and tells you how to decache it should that become necessary.
Note:
For implementation reasons we do not treat caches as relations in the sense that they are changed without doing atomic updates (for instance we don't want cached code to be lost just because it was computed during an atomic that failed). As a result, decaching may lead to problems when trying to undo history. In particular, if you undo past a point where you had to decache some code, you'll probably have to decache again in order to allow the new compiled code to be replaced by what was appropriate before.

Relation testing, adding, deleting code

The macros ??, ++ and -- each store code in the property lists of relations. If the tester, adder or deleter change, this code should be decached (there's an automation rule to do this). Similarly, if you change one of these operations without updating the database (by redefining a function that is used to macroexpand a database operation), you'll have to decache this code in order to have the correct code installed (the next time you attempt the operation). The value of the properties tester, adder and deleter are compiled functions with uncompiled definitions stored in their :previous-definition properties.

Code for Generating Relations

When code for generating a description (in this case the term "description" is used to include things with lisp expressions to be evaluated) is produced, a compiled version is cached. This code is associated with a canonical form of the description and used whenever matching descriptions are to be generated. This has two important consequences. First, subsequent generation of similar descriptions will be faster (, i.e., the first one is slower). Second, whenever the method for generating changes, the cache must be updated. There are rules to decache entries that may be affected by database activity, but if you redefine a function that is used in generating, you should decache whatever uses the old version.

The relation
Relation (generator-cache rel-or-symbol description compiled-generator)
provides database access to the generator cache. The description slot is the canonical form of the description which the last slot specifies how to generate. (The last slot can also specify that the description cannot be generated.) The rel-or-symbol is either the CAR of the wff in the description or a relation that appears in the description. All such relations and symbols are added and deleted together - if you delete one the others will follow. This relation is provided in order to allow users to examine the cache. Tuples should only be added by AP5 itself as a result of compilation. The proper way to remove a particular entry is:
Function (ap5::gendeleter description compiled-generator)
This is better than -- in that it removes entries for all appropriate rel-or-symbol's, and it saves the name of the generating function so that previously compiled code will continue to work (after recompiling the entry).

Macro (cache-generators . descs-and-rels)
computes and caches generators for the patterns described by the arguments. These can be either descriptions, corresponding to descriptions you expect to generate, or relation names. A relation name corresponds to every possible pattern of inputs and outputs for that relation. If this form appears in a file and the file is compiled, the cache entries will be compiled to the file and loaded when the file is loaded.

It is not necessary to explicitly cache generators for compound wffs that are used by functions defined by defun if you use:
Macro (ap5::defun . body)
This is just like lisp::defun except that any generators used by the function are cached. (Ap5::defun is not currently exported. You can either import it into your own package or include the ap5:: prefix only when you want this behavior.)

Since the first use of a generator is slower, some applications might want to preload the generator cache.

When you decache generators the default is not to recompile any of them until they are needed. Sometimes this is a bad tradeoff. If, for instance, you're about to save a world it would be better to recompile all the decached generatores first.
Macro (ap5::recompile-decached-generators {new-set})
recompiles all the decached generators. If the optional argument is NIL (default is T) then it tries to recompile the same set that it found on the previous attempt. Normally it searches the cache for entries that need to be recompiled.

Another use of this macro is that when you make a patch that decaches generators you can recompile them as part of the patch (so those who load the patch won't have to recompile them). This can be done by first recompiling the decached generators in your environment, then doing the decache that will be part of the patch, and finally putting (ap5::Recompile-Decached-Generators) in the patch file. Thus you evaluate something like this:

(ap5::Recompile-Decached-Generators)
 ; this recompiles everything in the environment so only the things
 ; decached below will be included in the patch file
(Decache-Rel ...)
 ; typically changes to DefRelation cause decaching too

and then compile a patch file that looks like this:

(Decache-Rel ...)  ; same as above
(ap5::Recompile-Decached-Generators)
 ; compiling this causes the generators you just decached
 ; to be recompiled and recached in your environment, and
 ; also causes them to be saved on the patch file)

Of course, you should make sure that the generators you save on the patch file only refer to relations that will be in the environments where the patch will be loaded.

It may also be useful to be able to change the data in the generator cache. The most likely thing is to want to change the function that actually does the generating. For a canonical pattern to be generated (slot 1 of a generator-cache tuple), the function
Function (ap5::lookup-gen pattern{no-error})
returns the name of a compiled function, something like AP-COMPILED::F1779 (the number will vary) which is used to generate the given pattern. Lookup-gen also compiles such a function if necessary (and possible). If the pattern cannot be generated, Lookup-Gen will either return :cannot-generate (if no-error is T) or generate an error. The original source code of the ap-compiled:: function is stored in its :previous-definition property, so you can examine and change it.

Code in the Match Network

The compilation of rules involves building "matchnodes", which contain compiled code of two kinds. One is code that generates objects that match some pattern. This code can stop working if the generators or testers become less capable. The other kind is code that recognizes whether tuples are the same. This can become obsolete if the equivalence relations used to compare the objects of the relations involved are changed (in which case other things may need to be fixed).

Code for comparing tuples of Relations

The comparison relations for a given relation are compiled into a piece of code that compares tuples for that relation. If you decide to change the comparisons for a relation after using that relation, you should decache that function. Note however, that this will not change the data stored for the relation, which might also be necessary if the comparison changes.

Expansions of Definitions

In an effort to optimize the processing of definitions for defined relations, AP5 caches an expanded version of the definition of a relation in the ap5::EXPANDED-DEFN property of the relation. Should this definition change, the value should be decached. (An automation rule does this.) (I don't know what ought to happen to other things that were computed from the old definition. Are they still what you intended?)

Decaching

Function (decache-rel rel . args)
If rel is either a relation or the name of a relation, some or all of its cached data is deleted. The remaining arguments determine which data is decached. For compatibility with older code, the arguments may either be keywords or keywords followed by NIL or T (as in lisp keyword arguments). (If a keyword appears in the argument list without being followed by NIL, it is treated as if it were followed by T. If a keyword appears more than once the first occurrence takes precedence.) The special keyword :all means to decache everything. This ought to be safe, but more expensive than special cases where you know which things need to be decached.

Decache-rel is driven by the relation
Relation (decache-rel-keyword-fn keyword function)
For each tuple in this relation, if the keyword (or :all) is passed to decache-rel then the function is called on the relation. This makes it easy for you to invent new caches for relations and get them decached when necessary. Initially defined keywords include:
:tester means decache the testing code for this relation
:adder means decache the adding code for this relation
:deleter means decache the deleting code for this relation
:comparetuple means remove the tuple comparison code
:expanded-defn means remove the expanded definition
:generator means to remove all code for generating this relation (including code for generating compound wffs that use it) from the generator cache
:fix-existing-generator means that all previously compiled code that generates this relation should be updated as well. (This has no effect if the generator argument is NIL.) If this cannot be done a warning is issued.
:recompile-rules means that any rules that use the relation should be recompiled. This is likely to be an expensive operation. Any rules that cannot be recompiled will cause warnings.

Dumping strange objects to compiled files

Compilation of some AP5 programs to files may apparently require that some nonstandard lisp object be dumped to the compiled file, e.g., a relation. For instance suppose the file contains the following:

(defun subtype-of-person (type)
  (?? subtype type #,(dbo type person)))

(defrelation subtype-of-person :definition
   ((x) s.t. (subtype x #,(dbo type person))))

The problem is that there is no commonlisp standard way to save the object #,(dbo type person) to a compiled file. In the first case we could get around this problem by changing the definition slightly:

(defun subtype-of-person (type)
  (?? subtype type (dbo type person)))

which incurs a runtime cost of computing (dbo type person) on every execution. However, this is not a solution to the second example, since defining wffs are not allowed to contain lisp expressions that have to be evaluated. They may only contain constants. Therefore a better solution would be something like:

(defconstant *type-person* (dbo type person))
(defrelation subtype-of-person :defintion
   ((x) s.t. (subtype x *type-person*)))

However, even this is not quite good enough, since macros may expand into forms that find they need to make up constants (and it may not be possible to stick defconstant forms inside the macro expansion). Our solution to this problem is
Macro (memo form)
The first time this is evaluated, the form is evaluated and its result is stored. The form is not executed on later evaluations. The stored result is always returned as the value. This form therefore satisfies the usual definition of a constant since the values on all executions are EQ. Note, however, that this may not be "constant enough" - if the value is smashed, programs that access its fields can get different results. Also remember that other forms that are EQUAL may evaluate to different things. In any case, wff syntax considers lists starting with Memo to be constants. Thus we can write:

(defrelation subtype-of-person :defintion
   ((x) s.t. (subtype x (memo (dbo type person)))))

Macro (constant form)
This is similar to memo, but it does not save the value of the first execution. Rather the form is executed every time, and AP5 trusts that the values are "constant enough".

Appendices

Additional declarations for relations and implementations
An example of an AP5 Program

Additional declarations for relations and implementations

Most of what you need to do in declaring a relation is done by the DefRelation macro. Additional information less commonly needed includes:

(++ subtype rel rel) ; for its sub/super types

(++ equivalence-relation rel) ; for equivalence relations
(++ canonicalizer rel eq/equal/eql function)

(++ reader rel function) ; for some types
(++ printer rel function)

(++ checker rel function) ; also for representations

An example of an AP5 Program

We present a development of a "topological sort", as described in CACM of June 1985 in the "programming pearls" column, p. 573. The problem is to find a linear ordering for a finite set of nodes, which conforms to a provided partial ordering. The algorithm is to repeatedly find a node which has no predecessor, put it in the list, and remove it from the graph. If this can be done until the graph is empty (has no more nodes) then the order of elements in the list is an answer. Otherwise there is no answer (the graph has cycles).

In AP5, I choose to represent the graph (which is, after all, a set of nodes and edges) by two relations, which in the prototype version will be declared as base relations:

(defrelation node :arity 1
 :documentation "the set (type) of nodes")
(defrelation edge :types (node node)
 :documentation "the set of edges")

I will actually use the type constraint to remove the edges connecting nodes which are deleted. Nodes will be deleted merely by making them cease to be nodes.

Now for some data:

(set-listof n s.t. (node n) '(a b c d e f g h i))
(set-listof n1 n2 s.t. (edge n1 n2)
      '((a b)(a f)(b f)(d c)(d h)(e b)
	(e d)(f h)(g e)(g c)(i e)(i c)))

And now for the program:

(loop while (?? E (n) (node n)) collect ; while there are nodes
      (forany node#n s.t.                ; find one with 
	      (not (E (x) (edge x n)))   ;   no predecessor
	      (-- node n)		 ; remove it from the graph
	      n                          ; and use it as next node
	      ifnone (error "cycles in graph")))
(I G E D C A B F H)                      ; this was the result

This program is, of course, not particularly efficient. If efficiency is a concern (the program is to be run often or on large problems) it can now be optimized. The most obvious problem is that the entire set of edges must be searched in order to find that a given node has no predecessors. This could be fixed by changing the implementation of the edge relation, e.g., making it a Two-Way-Structprop relation. This reduces the cost of determining whether a node has any predecessors to a constant.

Another problem is that the program may have to search the entire set of nodes on each iteration to find one without a predecessor. This could be fixed by maintaining the (unary) relation Node-With-No-Predecessor, defined as
((node#n) s.t. (not (E (x) (edge x n)))), and using that in the forany. This would reduce the cost of finding the next node to a constant. Another cost of course arises in maintaining the new relation. However, in this case the cost would be relatively small: whenever a new node is added, the maintaining rule adds it to this relation (which is easy). Whenever an edge is added it removes the target node from the relation (again easy). For each edge that is deleted (ultimately every edge, unless there are cycles), the rule checks to see whether that was the last edge to its target (this is cheap if we have a reasonable implementation for edge), and if so, puts the target back into the set of nodes with no predecessors. The rule also removes nodes from this set as they are deleted from the set of nodes.

At this point, we have reduced the cost of the program to near the lower bound. The cost of the I/O is proportional to the number of nodes plus the number of edges. The only remaining costs over and above this lower bound (as far as I can see) come from searching for a particular edge in the set of edges from or to a given node (to delete it), and similarly searching the set of nodes without predecessors. If these sets are expected to be large, the implementations can be further optimized, e.g., to use hashtables.