# Welcome to the Walla Walla University Computer Science Department!

## BAT: A Tableau Based Theorem Prover

Anthony A. Aaby
Department of Computer Science
Walla Walla College

August, 1988 This work is partially supported by a Pew foundation grant

#### abstract

This paper presents BAT, an implementation of a tableau-based theorem prover for first order logic \cite{Smullyan68}. BAT is is written in BIM Prolog and is able to solve Schubert's Steamroller problem in 0.7 seconds on a SUN 3/50. This paper reports on the design of BAT, experiences with the theorem prover and outline of future work. The complete Prolog code for BAT is given in an appendix.

#### Key words and phrases.

Theorem prover, first-order logic, analytic tableau, Prolog, LCF, Nuprl, Tactics, Tacticals.

### 1 Introduction

This paper describes BAT, the Bucknell analytic tableau based theorem prover for first-order predicate logic. BAT is an implementation of the method of analytic tableaux described by Smullyan \cite{Smullyan68}. It is intended to form the basis of several student projects for a course logic programming to be given at Bucknell University in the fall of 1988. BAT is written in Prolog and consists of less than 220 lines of code.

Preliminary testing shows that BAT compares quite favorably with other theorem provers. BAT is is written in BIM Prolog and is able to solve Schubert's Steamroller problem in 0.7 seconds on a SUN 3/50. This contrasts favorably with Walther's many-sorted resolution solution \cite{Walter85} which required a solution time of 7.11 seconds, Stickel's theorem prover solution time of 2 hours and 53 minutes \cite{Cohen85}, an ITP solution time of 660 seconds \cite{Cohen85,Walter85} and Oppacher and Suen's \cite{OppSuen88} solution time of 14 seconds on a Symbolics 3600. Manthey and Bry report that their Prolog technology theorem prover SATCHMO \cite{ManBry88} is able to solve the steamroller problem in 0.3 seconds. However, they require the problem to be stated in clausal form. Fitting \cite{Fitting88} has also developed a tableau based first order theorem prover but reports no performance figures.

BAT accepts formulas in the full first-order predicate logic, uses skolem functions to instantiate existentially quantified variables, and uses Prolog's unification mechanism to check for complementary literals. The HARP \cite{OppSuen88} theorem prover is of particular interest since it is also an implementation of the method of analytic tableaux. HARP consists of approximately 3000 lines of ZetaLisp.

In section 2, the method of analytic tableaux is reviewed. In section 3, the implementation of BAT is described. In section 4, a description of the performance of BAT on several of the 75 problems of Pelletier \cite{Pelletier86} is given. In section 5, a description of plans for future work is presented. In Appendix A, the Prolog code for BAT is given. In Appendix B, several of the 75 problems of Pelletier \cite{Pelletier86,Pelletier88} are given in the prefix form required by BAT.

## 2 First-Order Logic

### Syntax

The symbols of first order logic are the following:
• A denumerable list of symbols $x0, x1, x2,...$, called individual variables.
• A denumerable set $U$ of symbols $c0, c1, c2,...$, called individual constants.
• For each natural number n, a denumerable list of symbols $P0, P1, P2,...$, called n-ary predicates.
• Operators: $\neg , \bigvee , \bigwedge , \forall , \exists$.
If $P$ is an n-ary predicate and $m0, m1,..., mn$ are individual variables or individual constants, then $P(m0, m1,..., mn)$ is an atomic formula ($m0$, $m1$, ..., $mn$ are called the parameters of $P$). Formulas of first order logic are defined as follows.
1. If $A$ is an atomic formula then $A$ is a formula.
2. If $A$ and $B$ are formulas then so are $\neg A$ ( negation), $(A \bigwedge B)$ (conjunction), $(A \bigvee B)$ ( disunction), $(A \rightarrow B)$ (implication), and $(A \leftrightarrow B)$ (biconditional).
3. If $A$ is a formula and $x$ is an individual variable then $\forall x.A$ (universal quantification) and $\exists x.A$ (existential quantification) are formulas.
BAT requires formulas to be written with the logical operators in prefix form (a simple modification to the program would permit the standard infix notation). Variables are strings which must begin with an uppercase letter. Constants may be either a numbers or strings beginning with a lowercase letter. The following is a list of BAT's formation rules.

{\small \begin{verbatim} % FORMATION RULES % constants: numbers or strings beginning with a lowercase % letter. % variables: strings beginning with an uppercase letter. % terms: constants,variables, t0,t1,... % predicate symbols: strings beginning with a lowercase letter. % formulas: p(t0,...,tn), and if A,B are formulas then so are % not A % or(A,B) % A = B % and(A,B) % if(A,B) % iff(A,B) % qa(X,F) % qe(X,F) \end{verbatim}}

Let $A$ and $B$ be formulas, $x$ and $y$ individual variables, and $c$ an individual constant. Substitution is defined for formulas of first order logic as follows.

1. If $A$ is an atomic formula, then $A[x:c]$ is the result of substituting $c$ for each occurrence of $x$ in $A$.
2. $(\neg A)[x:c] = \neg A[x:c]$
3. $(A \bigwedge B)[x:c] = A[x:c] \bigwedge B[x:c]$.
4. $(A \bigvee B)[x:c] = A[x:c] \bigvee B[x:c]$.
5. $(A \rightarrow B)[x:c] = A[x:c] \rightarrow B[x:c]$.
6. $(A \leftrightarrow B)[x:c] = A[x:c] \leftrightarrow B[x:c]$.
7. $(\forall x.A)[x:c] = \forall x.A$
8. $(\exists x.A)[x:c] = \exists x.A$
9. $(\forall y.A)[x:c] = \forall y.(A[x:c])$ where $x$ and $y$ are different individual variables.
10. $(\exists y.A)[x:c] = \exists y.(A[x:c])$ where $x$ and $y$ are different individual variables.
BAT implements substitution via the predicate {\tt substitute(VAR, EXP, FORM, FORM')} where {\tt FORM'} is the result of replacing {\tt VAR} with {\tt EXP} in {\tt FORM}. BAT's rules are as follows: \small{

\begin{verbatim} % SUBSTITUTION substitute(X,C,not A, not Ap) :- substitute(X,C,A,Ap). substitute(X,C,and(A,B),and(Ap,Bp)) :- substitute(X,C,A,Ap), substitute(X,C,B,Bp). substitute(X,C,or(A,B), or(Ap,Bp)) :- substitute(X,C,A,Ap), substitute(X,C,B,Bp). substitute(X,C,if(A,B), if(Ap,Bp)) :- substitute(X,C,A,Ap), substitute(X,C,B,Bp). substitute(X,C,iff(A,B),iff(Ap,Bp)) :- substitute(X,C,A,Ap), substitute(X,C,B,Bp). substitute(X,C,qa(Y,A), qa(Y,Ap)) :- X \== Y, substitute(X,C,A,Ap). substitute(X,C,qa(Y,A), qa(Y, A)) :- X == Y. substitute(X,C,qe(Y,A), qe(Y,Ap)) :- X \== Y, substitute(X,C,A,Ap). substitute(X,C,qe(Y,A), qe(Y, A)) :- X == Y. substitute(X,C,Pred, Predp) :- Pred =..[P|List], not in(P,[not,and,or,if,iff,qe,qa]), list_subs(X,C,List,Listp), Predp =..[P|Listp]. list_subs(X,C,[], []). list_subs(X,C,[Y|R],[Y|Rp]) :- X \== Y, list_subs(X,C,R,Rp). list_subs(X,C,[Y|R],[C|Rp]) :- X == Y, list_subs(X,C,R,Rp). \end{verbatim}}

A formula $A$ is called a sentence iff $A[x:c]=A$ for all pairs of individual variables $x$ and individual constants $c$.

### Semantics

The method of analytic tableaux\cite{Smullyan68} provides a systematic method to test for consistency. Given a set of formulas, the method constructs a tree of formulas. Initially, the set of formulas are placed at the root of the tree. The tree is grown'' by adding formulas to the tree. Each formula added to the tree is a subformula of a formula already present on the tree. The method rests on the following facts.
1. $\neg A$ is true iff $A$ is false.
2. $\neg A$ is false iff $A$ is true.
3. $A \bigwedge B$ is true iff both $A$ and $B$ are true.
4. $A \bigwedge B$ is false iff either $A$ or $B$ is false.
5. $A \bigvee B$ is true iff either $A$ or $B$ is true.
6. $A \bigvee B$ is false iff both $A$ and $B$ are false.
7. $A \rightarrow B$ is true iff either $\neg A$ or $B$ is true.
8. $A \rightarrow B$ is false iff $A$ is true and $B$ is false.
9. $A \leftrightarrow B$ is true iff either $A \bigwedge B$ or $\neg A \bigwedge \neg B$ is true.
10. $A \leftrightarrow B$ is false iff either $A \bigwedge \neg B$ or $\neg A \bigwedge B$ is true.
11. $\forall x.A$ is true iff $A[x:c]$ is true for every $c \in U$.
12. $\forall x.A$ is false iff $A[x:c]$ is false for some $c \in U$.
13. $\exists x.A$ is true iff $A[x:c]$ is true for some $c \in U$.
14. $\exists x.A$ is false iff $A[x:c]$ is false for every $c \in U$.
Each of these facts suggests that the path on which a formula is found may be extended by adding a new formula to the path or that a branching point has been reached and subformulas extend the path along the new branches. For example, rule 3 suggests extending the path with the subformulas while rule 4 suggests the creation of a branch point and each subformula is placed separately on a new branches. The decomposition rules are treated as rewriting rules (it is not necessary for both a formula and its subformulas to remain on a branch). The decomposition rules have the form {\tt rule(Formula, Subformulas)}. They are as follows: {\small

\begin{verbatim} % DECOMPOSITION RULES rule(not, not not A, [A]). rule(alpha, and(A,B),[A, B]). rule(alpha, not or(A,B),[not A,not B]). rule(alpha, not if(A,B),[A,not B]). rule(beta, or(A,B),[A,B]). rule(beta, if(A,B),[not A,B]). rule(beta, not and(A,B),[not A,not B]). rule(beta, iff(A,B),[and(A,B), and(not A,not B)]). rule(beta, not iff(A,B),[and(A, not B),and(not A, B)]). % The gamma rule substitues a new variable for universally quantified % variables. rule(gamma, qa(X,F),[ Fp,qa(X, F)],Y) :- substitute(X,Y,F,Fp). rule(gamma, not qe(X,F),[not Fp,not qe(X,F)],Y) :- substitute(X,Y,F,Fp). % The delta rule substitues a new skolem function for an % existentially quantified variable. rule(delta, qe(X,F),[Fp],Vars) :- new_constant(C), SK =..[skf,C|Vars], substitute(X,SK,F,Fp). rule(delta, not qa(X,F),[not Fp],Vars) :- new_constant(C), SK =..[skf,C|Vars], substitute(X,SK,F,Fp). \end{verbatim}}

A literal is an atomic formula or the negation of an atomic formula. A pair of complementary literals is an atomic formula and its negation. A path is said to be inconsistent iff it contains a pair of complementary literals. If a path is not inconsistent then it is said to be consistent and the set of literals occurring on the path is said to be a model of the given set of formulas. If each path through the tree is inconsistent then the given set of formulas is said to be unsatisfiable otherwise the set of formulas is said to be satisfiable. A set of formulas $\{F1,..., Fi,..., Fn\}$ is said to be valid iff the set of formulas $\{F1,..., \neg Fi,..., Fn\}$ is unsatisfiable. If a set of formulas $\{F1,..., Fm\}$ is satisfiable, then the method will construct a model (possibly infinite) of the set of formulas. If a set of formulas $\{F1,..., Fm\}$ is unsatisfiable, then the method will terminate in a finite number of steps. Since the formula $\forall x.A$ states that any value may be substituted for $x$ in the formula $A$, $\forall x.A$ may be replaced with $A[x:x']$ where $x'$ is a variable new the path. The use of a variable new to the path insures that the semantics of the formula remain unchanged. The formula $\exists x.A$ states that there is some value which may be substituted for $x$ in the formula $A$ therefore, $x$ may be replaced by an arbitrary variable which represents the value. Thus, $\exists x.A$ may be replaced with $A[x:c]$ where $c$ is an arbitrary constant new to the branch. Since $c$ may depend on other variables, $x$ must be replaced with a function whose arguments are the appropriate variables. Such functions are called skolem functions \cite{bundy,robinson} and if the are no variables on which the choice rests, a constant may be used instead and such a constant is called a skolem constant. These two ideas enable the elimination of quantifiers from formulas. BAT generates the required skolem constants and functions for the existentially quantified variables. The generated constants are integers and the skolem functions are functions of universally quantified variables.

## Tableaux

Given a formula to prove, BAT negates the formula and places it at the root of the tree. A branch on the tree is represented by two lists. {\tt Formulas} is the list of undecomposed formulas, and {\tt Atoms} is a list of positive and negative literals. Initially, the set of formulas to be checked are placed in {\tt Formulas} and {\tt Atoms} is empty. The predicate {\tt prove1} is used to construct a branch. This sequence of activity is implemented by the following code.

\small{ \begin{verbatim} prove(F) :- new_queue(DL), enqueue(not F,DL,Formula), prove1(Formula,[],[]), !. % termination (open branch--no contradiction) prove1(Formulas,Atoms,Vars) :- is_empty(Formulas), print([satisfiable,Atoms]), nl. % termination (closed branch--contradiction) prove1(Formulas,Atoms,Vars) :- dequeue(F,Formulas,Form), contradiction(F,Atoms). % literal prove1(Formulas,Atoms,Vars) :- dequeue(F,Formulas,Form), literal(F), prove1(Form,[F|Atoms],Vars). % rule application (double negation) prove1(Formulae,Atoms,Vars) :- dequeue(F,Formulae,Form), rule(not,F,Fl), push(Fl,Form,FL), prove1(FL,Atoms,Vars). % rule application (and) prove1(Formulas,Atoms,Vars) :- dequeue(F,Formulas,Form), rule(alpha,F,Fl), push(Fl,Form,FL), prove1(FL,Atoms,Vars). % rule application (or) prove1(Formulas,Atoms,Vars) :- dequeue(F,Formulas,Form), rule(delta,F,I,Vars), push(I,Form,FL), prove1(FL,Atoms,Vars). % rule application (universal quantification) prove1(Formulas,Atoms,Vars) :- dequeue(F,Formulas,Form), rule(gamma,F,IQ,V), push(IQ,Form,FL), prove1(FL,Atoms,[V|Vars]). % rule application (existential quantification) prove1(Formulas,Atoms,Vars) :- dequeue(F,Formulas,Form), rule(beta,F,[A,B]), copy_queue(Form,Fp), push([A],Form,FL), prove1(FL,Atoms,Vars), push([B],Fp,NFL), prove1(NFL,Atoms,Vars). \end{verbatim}}

The predicate {\tt prove1} has eight cases. The first case defines termination when there are no more docomposable formulas ({\tt Formulas} is empty). In this case {\tt Atoms} contains the model of the original formula. The second case terminates the construction of a path due to the occurrence of a contradiction. The third case moves a literal from {\tt Formulas} to {\tt Atoms}. The fourth case simplifies doublely negated formulas. The fifth case case replaces conjunctions with both subformulas. The sixth case case handles disjunctions. The proof tree branches and the subformulas are in different branches. The seventh case handles universal quantification. The universal quantifier is removed, a new variable is substituted for the quantified variable, the resulting formula is added to the branch, the original formula is enqueued in the list of formulas and the original variable is added to the list of variables. The eighth case handles existential quantification. The existential quantifier is removed, a new skolem function is substituted for the quantified variable and the resulting formula is added to the branch. BAT works on a compound formula until literals are reached. It works on a branch until it closes or a model is generated.

## Performance

The implementation has been tested on several of the 75 problems of Pelletier \cite{Pelletier86}. The total time required to solve problems 1-27, 29, 34-36, 47 was 6.0 seconds. BAT required an average time of 0.2 seconds per problem. BAT was not tested on problems involving identity or functions.

## Future Work

There are several modifications of BAT that are planned.
• We would like to make provision to grow'' the proof tree in a depth first fashion. In terms of the present implementation, all branches must be present in the predicate {\tt prove1}.
• We would like to minimize branching and explore Smullyan's \cite{Smullyan68} suggestion that type alpha formulas should be decomposed first and then type $\beta$ formulas. In terms of the present implementation, rather than selecting the first formula present in {\tt Formulas}, the and'' formulas must be given precedence.
• We would like to incorporate reasoning about identity and functions. In terms of the present implementation, special actions must be taken each time a literal formula is added to {\tt Atoms}. The actions taken include the construction of equivalence classes of variables and constants, substitutions, and contradiction checking. $a=b$ is a contradiction iff there is some $c$ such that $\neg (a=c)$ and $b \in$ the same equivalence class with $c$. $\neg a=b$ is a contradiction iff $b \in$ the same equivalence class with $a$.
• We would like to incorporate the use of theorems. In terms of the present implementation, special actions must be taken each time {\tt Formulas} became empty. To facilitate the selection of appropriate theorems, the representation of each theorem will include a list of the literals it contains. Precedence will be given to theorems which contain literals complementary to literals which are present in {\tt Atoms}.
• Given a formula, BAT either announces that the formula is unsatisfiable, presents a finite model, or does not terminate (in the case of an infinite model). We would like to extend BAT with some heuristics for the recognition of infinite models \cite{OppSuen88}. In terms of the present implementation, the huristic selection would occur as an alternate to a decomposition rule selection.
• We would like to extend BAT with facilities for proof explanations and condensations. In terms of the present implementation, a trace of the deduction would be returned and an analysis of the trace will form the basis of the explanation.
• We would like to explore the use of tactics and tacticals such as are utilized in LCF \cite{Paulson87} and Nuprl \cite{ConsKnoBa85}. Natural deduction can be naturally implemented within BAT. Since the proof tree is constructed by the predicate {\tt prove1(Formulas,Atoms)}, {\tt Formulas} may be viewed as a list of the hypotheses and {\tt Atoms} as a list of the goals. The only modification to BAT required is that when a literal is moved from {\tt Formulas} to {\tt Atoms} the literal must be complemented before placing it into {\tt Atoms}.
• We would like to combine BAT with a theorem prover for temporal logic \cite{AabyNa88} thus, providing a theorem prover for first-order temporal logic.

\begin{thebibliography}{99} \bibitem{AabyNa88} Aaby, A. A. and Narayana K. T., Propositional Temporal Logic is PSPACE. 9 th International Conference on Automated Deduction LNCS No. 310, Springer Verlag, (1988) 218-237. \bibitem{bundy} Bundy, A., The Computer Modelling of Mathematical Reasoning. Academic Press, London (1983) \bibitem{Cohen85} Cohen, A. G., On the solution of Schubert's Steamroller in Many Sorted Logic. IJCAI (1985) 1169-1173. \bibitem{ConsKnoBa85} Constable, R. L., Knoblock, T. B., and Bates, J. L., Writing Programs that Construct Proofs, Journal of Automated Reasoning 1 (1985) 285-326. \bibitem{Fitting88} Fitting, M., First-Order Modal Tableaux, Journal of Automated Reasoning 4 (1988) 191-213. \bibitem{ManBry88} Manthey, R. and Bry, F., SATCHMO: a theorem prover implemented in Prolog. 9 th International Conference on Automated Deduction, LNCS No. 310, Springer Verlag, (1988), 415-434. \bibitem{OppSuen88} Oppacher, F. and Suen, E., HARP: A Tableau-Based Theorem Prover, Journal of Automated Reasoning 4 (1988) 69-100. \bibitem{Paulson87} Paulson, L. C., Logic and Computation: Interactive Proof with Cambridge LCF. Cambridge University Press, Cambridge (1987). \bibitem{Pelletier86} Pelletier, F. J., Seventy-Five Problems for Testing Automatic Theorem Provers, Journal of Automated Reasoning 2 (1986) 191-216. \bibitem{Pelletier88} Pelletier, F. J., Errata, Journal of Automated Reasoning 4 (1988) 235-236. \bibitem{robinson} Robinson, J. A., Logic: Form and Function. North-Holland, New York (1979) \bibitem{Smullyan68} Smullyan, R. M., First-Order Logic. Springer-Verlag, Berlin (1968). \bibitem{Walter85} Walter, C., A mechanical Solution of Schubert's Steamroller by Many-Sorted Resolution, Artificial Intelligence. (May 1985) 217-224 \end{thebibliography}