AARNEWS - April 2001

NEWSLETTER

No. 51, April 2001

Contents

From the AAR President
MathXpert As a Theorem Prover
Woody Bledsoe Student Travel Award
Herbrand Award: Amendments to the Selection Procedure
Conference Calls

FLoC'02:Call for Workshop Proposals

WFLP 2001

LPAR'2001

New Books

Automated Theorem Proving: Theory and Practice

Elimination Methods

Symbolic Computation and Automated Reasoning

From the AAR President, Larry Wos...

Several months ago I began an active correspondence with Michael Beeson about some of the work we have reported in AAR newsletters and some of the research I am doing in the area of missing proofs. Beeson mentioned his software program MathXpert, and we started a discussion about whether it is or is not a theorem prover. Beeson's article in this newsletter directly addresses that issue. We both welcome comments.

I also call attention to the fact that several new books have appeared recently on automated theorem proving and (the name I prefer) automated reasoning. I remind our readers that the Journal of Automated Reasoning welcomes book reviews; please submit them to the managing editor, Gail W. Pieper (pieper@mcs.anl.gov).

MathXpert As a Theorem Prover:
Introduction to MathXpert

Michael Beeson

MathXpert (the software formerly known as Mathpert) is a program designed to help people learn algebra, trigonometry, and calculus. It contains a computer algebra system, a logic module called the prover, a graphics module, and some complex code meant to provide a simple user interface. The program's primary purpose was education, and it is being marketed with that in mind (see the Web site). This article will ignore education, graphics, and user interfaces and will focus on the logical capabilities of MathXpert. We will attempt to answer the AAR president's question, "Is MathXpert a theorem prover, or not?"

A MathXpert session begins with the specification of a topic (for example, solving cubic equations, verifying trigonometric identities, or evaluating integrals--there are about 140 topics altogether). Then the user enters a problem appropriate for that topic. After the problem is accepted, the next task is the generation of a step-by-step solution of the problem. This can take place under user control, with the user choosing the steps and the program executing them. It can also take place in auto mode, in which MathXpert can generate the entire solution. These two modes are roughly analogous to what we mean when we say proof checker and theorem prover. Only auto mode solutions will be discussed in this article, since our topic is theorem-proving.

Here is a sample solution, the output of MathXpert when the input is the arithmetic-geometric mean inequality.

Is this a proof? I am certain it would be accepted as such in most mathematics classes.

One may object, however, that this is not a formal proof; I shall take up below the question whether there is a formal proof here to be extracted. At the moment, the only purpose of displaying this example is to fix the ideas as to what MathXpert actually produces.

The company marketing MathXpert is about to make available MathXpert Online, which neatly separates the auto mode capabilities of MathXpert from all others and makes the auto mode solutions (only) available over the Internet for free. When this service is available, it will be found at the Web site.

Logic and Correct Computation

MathXpert makes the connection between logic and computation by maintaining a list of assumptions. Each (visible) line of computation can be thought of as depending on a list of assumptions. If we put this list of assumptions to the left of , and the visible step A to right of , then we have a sequent , of which is called the antecedent and A the succedent. When computational steps (operations) are applied, they may have side conditions (hypotheses). These have to be verified before the operation can be legitimately applied. This mechanism prevents MathXpert from deriving false conclusions. More details of how side conditions are handled in this sequent-calculus framework can be found in [2].

Other computer algebra systems have nothing like this kind of logical underpinnings and hence can easily be led to deduce false conclusions. Witness this example from Mathematica 4.0.2:

In[3]= Integrate[1/Sin[x], {x, -1, 1}] 
 
Out[3]= 0

The integral in question is divergent, since it has a singularity at the origin. Mathematica 4.0.2. (unlike some earlier versions) correctly notes that fact when the integrand is 1/x, but this is an ad hoc fix, and making the integrand slightly more complicated, Mathematica fails to check the side condition and incorrectly produces an answer. The same integral cannot even be accepted as a problem in MathXpert because the prover checks the continuity of the function over the interval of integration before accepting the problem.

One may well say that correct computation is nice, but it does not make a theorem prover. Of course that is correct, but if we want a theorem prover that can make proofs including computational steps, such as the above example, then the computational steps must be logically correct. Correct computation is a prerequisite to a theorem prover that can take computational steps.

Kinds of Theorems MathXpert Can Prove

Most of the topics available in MathXpert begin with a mathematical expression (such as an integral, limit, derivative, or algebraic expression), and the solution takes steps that rewrite each line to an equal expression. The theorem that is proved is that the original problem equals the last line generated, provided that the assumptions on which the last line depends hold. (These can be seen in MathXpert by choosing View Assumptions.) There are three other types of problems to be considered: solving equations, verifying identities, and "solving" inequalities. When solving equations, MathXpert essentially is asked to find the values of x such that f(x) = 0, and if it succeeds, it gives an explicit disjunction that is equivalent to the existential statement. In the case of trigonometric equations, the solutions are parametrized by new integer parameters; for example, the solution of is .

In MathXpert, to "solve" an inequality for x means to reduce the inequality to a disjunction of inequalities or pairs of inequalities which express the solution set as a union of intervals. This includes as a special case the possibility that the answer might be true, as in the arithmetic-geometric mean example above. This means that the inequality has been proved. Unlike with equations, there is no clear distinction between verifying and solving an inequality.

How the Prover Works

The MathXpert prover can perhaps best be thought of as using rewrite-rule technology combined with evaluation and ad hoc control of the rewrites. It is not purely rewrite rules: the simplest example is the rule that permits rewriting x³ x⁵ as x⁸. Arithmetic is performed on the exponents, which involves evaluation. MathXpert can also evaluate much more complex expressions during application of rewrite rules. Note that some of the more complex rules are not exposed to the (student) user of MathXpert but are used only internally. For example, there are sophisticated algorithms for determining if a polynomial has a zero on a given interval. In general, the proof produced is supposed to be comprehensible to a freshman, so these algorithms are used only to check side conditions.

The Underlying Logic

Part of the prover has to deal with verifying side conditions inside the scope of a bound variable (such as in a limit term or definite integral). In the course of justifying the use of algorithms based on nonstandard analysis for verifying side conditions in a limit term, I set out a formal theory that I believe adequately captures the mathematical realm of MathXpert [1]. The variables can range over integers, real numbers, or complex numbers. The logic must deal correctly with terms that can be undefined such as x, which is accomplished by the use of the Logic of Partial Terms (LPT). It must also include term-formation rules for limits, derivatives, and integrals. It is also possible to use a function variable in MathXpert, although the vast majority of theorems do not use them. Formally speaking, it might be simplest to allow them, and to allow the formation of lambda-terms, because then limits, derivatives, and integrals can be defined by using lambda-terms. Of course, so can quantifiers then, but in practice MathXpert proofs are quantifier free.

The axioms of this theory are numerous: some 1600 mathematical operations are implemented in MathXpert. Of course, it is possible to derive these systematically from a small number of axioms. But the proofs MathXpert actually generates can appeal to any of these 1600 "axioms" at any step. In principle it would be possible to write a program that translated MathXpert proofs automatically into proofs in some simple logical theory whose axioms would amount, in essence, to the axioms of ordered fields complete with respect to zeroes of functions defined in the theory, and the definitions of the trigonometric and exponential functions, and closed under derivatives and integrals of functions defined by terms of the theory. Since there are also integer variables, a few axioms about the integers would be required, but MathXpert does not use induction.

Enabling the Prover to Handle Quantifiers

I combined MathXpert's logical engine with a very simple theorem prover that applies the rules of Gentzen's sequent calculus "backwards". This added the ability to handle quantifiers explicitly, so that, for example, epsilon-delta proofs could be reduced to the necessary inequalities. This combination program, known as Weierstrass, was able [3] to prove the epsilon-delta continuity of several specific functions such as x³ and and to prove the irrationality of e. These proofs depend at some points on some special inference rules for inequalities, but fundamentally they depend on MathXpert's logical engine for the rewriting steps.

Comparison with Otter

MathXpert does not search for a proof--it simply uses its rewrite-style engine. In that sense it is much weaker than Otter, which can generate and examine hundreds of thousands of intermediate deductions. It cannot work with arbitrary theories but only with the fixed mathematical framework used in high-school algebra and freshman calculus. It cannot, for example, deduce the ten Knuth-Bendix axioms that form a complete confluent set for group theory---one cannot even ask the question, since it assumes multiplication is commutative (except for matrices).

On the other hand, it is not quite trivial for most theorem provers (including Otter) to prove the arithmetic-geometric mean inequality, say from the axioms of ordered fields with a square-root function. I do not mean to imply that proving that inequality is a remarkable achievement--of course, the arithmetic-geometric mean inequality falls within the decidable realm of real-closed fields, and so can be proved by the well-known CAD decision algorithm. I propose it as an interesting exercise for AAR members to get their favorite theorem-prover to prove it by its usual logical means. For MathXpert it is just another problem.

Because MathXpert does not search for proofs and cannot deal with an arbitrary first-order theory, it is not a traditional theorem prover, but it does produce proofs. MathXpert's strengths, as a theorem-prover, are first, its ability to take steps involving computation, and second, its ability to handle inequalities.

Notes

The papers listed below and other papers about MathXpert can be downloaded from the Web. Other relevant links can be found on the author's
home page.

References

[1] Beeson, M., Using nonstandard analysis to verify the correctness of computations, International Journal of Foundations of Computer Science 6, no. 3 (1995) 299-338.

[2] Beeson, M., MathXpert: Computer support for learning algebra, trigonometry, and calculus, in: A. Voronkov (ed.), Logic Programming and Automated Reasoning, Lecture Notes in Artificial Intelligence 624, Springer-Verlag, Berlin (1992).

[3] Beeson, M., Automatic generation of epsilon-delta proofs of continuity, in: J. Calmet and J. Plaza (eds.), Artificial Intelligence and Symbolic Computation, Lecture Notes in Artificial Intelligence 1476, pp. 67-83, Springer-Verlag, Berlin (1998).

Woody Bledsoe Student Travel Award:
Call for Nominations

Maria Paola Bonacina
(On behalf of the CADE Inc. Board of Trustees)

The Woody Bledsoe Student Travel Award was created to honor the memory of Woody Bledsoe, for his contributions to mathematics, artificial intelligence, and automated theorem proving, and for his dedication to students. The award is intended to enable selected students to attend the International Conference on Automated Deduction (CADE) or the International Joint Conference on Automated Reasoning (IJCAR), whichever is scheduled for the year, by covering much of their expenses.

In 2001, IJCAR will take place June 18-23 in Siena, Italy (for further information see the Web site. The winners will be reimbursed (to a maximum of US $600) for their conference registration, transportation, and accomodation expenses. Preference will be given to students who do not have alternative funding. While preference will be given to students who will play an active role in the conference, including all attached workshops and events, also students who do not expect to give presentations, including students who have just begun their research in automated deduction or are considering the field, are encouraged to apply.

A nomination consists of a recommendation letter of up to 300 words from the student's supervisor. Nominations for IJCAR 2001 should be sent by e-mail to
Fabio Massacci, IJCAR 2001 Conference Chair (ijcar-cch@dii.unisi.it) and
Rajeev Gore, Alexander Leitsch, Tobias Nipkow, IJCAR 2001 Program Co-Chairs (ijcar-pch@dii.unisi.it),
with copies to

Ulrich Furbach, CADE Inc. President (uli@informatik.uni-koblenz.de) and
Neil V. Murray, CADE Inc. Treasurer (nvm@cs.albany.edu).

Nominations must arrive no later than May 7, 2001, and the winners will be notified by May 14, 2001. The awards will be presented at IJCAR; in case a winner does not attend, the chairs and trustees may transfer the award to another nominee or give no award.

Herbrand Award: Amendments to the Selection Procedure

Maria Paola Bonacina
(On behalf of the CADE Inc. Board of Trustees)

The description of the Herbrand Award from the CADE-11 proceedings has been amended by the CADE trustees. The modifications reflect discussions in the board of trustees in May-June 2000 and again in January-February 2001, terminated with a vote approving the new text in February-March 2001. As always, comments and suggestions from the AAR members are welcome.

In previous years the Herbrand Award was decided based on a vote where CADE trustees, members of the program committee, and past recipients of the award received the following ballot:

You are not asked to rank the N nominees, but simply to answer the question below for each nominee. Note that the resulting N questions are independent: the answer to one does not prejudice the answer to another. When the votes are in, the trustees will make the award to the nominee receiving the strongest support, provided that nominee receives "yes" votes from at least 2/3 of the votes cast. (Eligible to vote are trustees, program committee members, and the previous recipients of the award). In the event that no nominee satisfies this condition, no award will be made. In the event that K nominees tie with maximum support, K awards will be made if they have received at least 2/3 support.

If you think several candidates are each worthy of the award, even though others have been nominated, you may answer "yes" in several cases.

If you would prefer the award to go to certain candidates rather than others, you may answer "yes" for those you prefer and "no" to the others.

If you think that none are clearly worthy, you may answer "no" to all.

This was followed by the question

Do you consider that nominee(s) X is(are) worthy of the Herbrand Award for Distinguished Contributions to Automated Reasoning?
(YES / NO)

for all nominees X.

As this practice had raised a number of criticisms, the trustees sought to modify it according to the following principles:

The Herbrand award procedure has three stages:
* Nomination,
* Advisory voting by trustees, current PC and past (n) winners,
* Final decision by trustees only;

"Worthiness" is separated from "ranking";

Current trustees should not be eligible;

resulting in the new text below.

Herbrand Award for
Distinguished Contributions to Automated Reasoning

CADE Inc. has established since 1992 the Herbrand Award for Distinguished Contributions to Automated Reasoning, to honor an individual or (a group of) individuals for exceptional contributions to the field of Automated Deduction.

Nominations for this award can be made at any time to the president of CADE Inc. A nomination should include a letter (up to 2000 words) from a principal nominator describing the nominee's contributions, along with two other letters (up to 2000 words) of endorsement. Current members of the board of trustees of CADE Inc. are not eligible.

The CADE president will be responsible for soliciting opinions and evaluations, and carrying out a vote. Nominations will be kept confidential. The nomination of a group of individuals who are collaborators, will be considered as a single nomination.

The program committee, the award winners from the last 10 years, and the current board of trustees of CADE, Inc. will participate in the selection, with the final decision resting with the board of trustees. The award will be given at the CADE or IJCAR conference, whichever is scheduled for the year.

The selection procedure will work as follows. The members of the current program committee, winners from the past 10 years, and trustees are invited to express their opinions on the nominees in two ways:

They indicate which candidates they believe are worthy of the award.

They give a preferential ranking of all the candidates.

It is expected that only those receiving an endorsement by a 2/3 majority under vote (1) will be considered by the trustees for the award. A preferred majority candidate will be selected by the single transferrable vote principle using the preferential ranking (2).

After possible further discussion among the trustees, the president of CADE proposes a candidate which should be endorsed in a final vote by 2/3 of the trustees. It is expected that in most cases this should be the majority candidate from the advisory vote.