Logic and Proofs, Propositional Calculus

Propositional Calculus

Don't be misled by the word "calculus". Propositional calculus and predicate calculus are systems in logic; they have nothing to do with differential or integral calculus.

Propositional calculus, or sentential calculus, is a specific form of boolean logic, where the sentences are boolean expressions, and the inference rules combine sentences to create more sentences. This page presents a specific formulation of propositional calculus, for purposes of illustration - however, other schemas can and do exist. For instance, an alternate presentation, more complicated (in my opinion) than the one presented here, contains ten axioms and just one inference rule. Although it is interesting to imagine so many theorems coming from a single inference rule, I find that approach less intuitive. The following system has several rules, and no essential axioms.

Variables and Sentences

A variable is a letter or word, using upper case letters. Since this is boolean logic, variables represent true or false. Thus X might be true, and FROG might be false. Note that the set of all possible variables is countable, and a boolean "combination" assigns true or false to these variables, or to a subset of variables that we are interested in.

A sentence is a well formed boolean expression, using the aforementioned variables, parentheses, and boolean operators. I will use | for "or", & for "and", ~ for "not", → for "implies", and ⇔ for "if and only if". A straightforward context free grammar describes the syntax of a sentence, but that is not important for this discussion. I think you know a syntactically correct boolean expression when you see one.
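To make this concrete, here is a minimal sketch in Python. The nested-tuple encoding of sentences is my own, chosen for illustration; any syntax tree would do, and ASCII -> and <-> stand in for the arrows.

```python
# A sentence is a nested tuple: a variable is a string, and compound
# sentences are ("~", s), ("&", s, t), ("|", s, t), ("->", s, t),
# or ("<->", s, t).  (ASCII -> and <-> stand in for the arrows.)

def evaluate(sentence, assignment):
    """Evaluate a sentence under an assignment: a dict mapping each
    variable to True or False (a boolean "combination")."""
    if isinstance(sentence, str):          # a lone variable
        return assignment[sentence]
    op = sentence[0]
    if op == "~":
        return not evaluate(sentence[1], assignment)
    a = evaluate(sentence[1], assignment)
    b = evaluate(sentence[2], assignment)
    if op == "&":
        return a and b
    if op == "|":
        return a or b
    if op == "->":
        return (not a) or b
    if op == "<->":
        return a == b
    raise ValueError("unknown operator: " + op)

# X might be true, and FROG might be false:
print(evaluate(("|", "X", "FROG"), {"X": True, "FROG": False}))   # True
print(evaluate(("->", "X", "FROG"), {"X": True, "FROG": False}))  # False
```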

Inference Rules

Our version of propositional calculus contains ten inference rules. These rules allow us to derive true "theorems", given a set of axioms that are assumed to be true. I use lower case letters as variables, because they represent formulas, i.e. preexisting sentences, rather than boolean elements. Independent sentences are separated by a comma. Thus the second rule requires, as a precondition for its application, a sentence that is indicated by the variable a, and another sentence consisting of the exact syntax of a, possibly in parentheses, followed by an implication symbol, and then a second formula, possibly in parentheses, which I have designated as b. Given these two inputs, with this relationship, we may crank out b as a new sentence. We have "proven" b.

      Name                         Input              Output
 1    Double Negative              ~~a                a
 2    Modus Ponens                 a, a→b             b
 3    Biconditional Introduction   a→b, b→a           a⇔b
 4    Biconditional Elimination    a⇔b                a→b, b→a
 5    Conjunction Introduction     a, b               a&b
 6    Conjunction Elimination      a&b                a, b
 7    Disjunction Introduction     a                  a|b, b|a
 8    Disjunction Elimination      a|b, a→c, b→c      c
 9    Proof by Assumption          a proves b         a→b
10    Proof by Contradiction       a proves b and ~b  ~a

The last two rules are special, because they bring in an unproven hypothesis. These statements are pushed onto a stack of axioms. All inferences after that are temporary. They exist only to service rule 9 or 10, which pops the stack and places a sentence in the previous context.
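The first eight rules are purely syntactic: each checks the shape of its input sentences and emits the output sentence(s). Here is a sketch of three of them in Python, encoding sentences as nested tuples (a convention of my own, with ASCII -> for the arrow):

```python
# Sentences as nested tuples: variables are strings, ("~", s) is "not",
# and ("&", s, t), ("|", s, t), ("->", s, t) are the binary connectives.
# Each rule verifies the required shape of its inputs and returns the
# new sentence(s); if the shape is wrong, the rule does not apply.

def modus_ponens(a, imp):               # rule 2: from a and a->b, conclude b
    if imp[0] == "->" and imp[1] == a:
        return imp[2]
    raise ValueError("rule 2 does not apply")

def conjunction_introduction(a, b):     # rule 5: from a and b, conclude a&b
    return ("&", a, b)

def conjunction_elimination(conj):      # rule 6: from a&b, conclude a and b
    if conj[0] == "&":
        return conj[1], conj[2]
    raise ValueError("rule 6 does not apply")

# Given X and X->FROG, rule 2 cranks out FROG:
print(modus_ponens("X", ("->", "X", "FROG")))   # FROG
```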

Let's consider a proof strategy that you've seen before. Assume we already know, or we have demonstrated, that a implies b. Then if b fails, a must fail. This is an application of rule 10. Suppose a is true, then combine it with a→b and rule 2 to get b. Now we have b and ~b, and that allows us to invoke rule 10, and assert ~a, which is what we wanted to prove.

Given a→b, the above can be expressed concisely. Assume ~b, and obtain ~a; rule 9 turns this into ~b→~a, which is called the contrapositive. Discharging the outer assumption gives the first formulation, (a→b)→(~b→~a); a second formulation bundles the hypotheses together: ((a→b)&~b)→~a. Let's write out both proofs, properly indented, like C code.


1:  a→b { /* assume */
2:      ~b { /* assume */
3:          a { /* suppose */
4:              b /* rule 2[1,3] */
5:          } ~a /* rule 10[3,4,2] */
6:      } ~b→~a /* rule 9[2,5] */
7:  } (a→b) → (~b→~a) /* rule 9[1,6] */

/* That was the first formulation; let's prove the second. */

8:  (a→b) & ~b { /* assume */
9:      a→b /* rule 6[8] */
10:     ~b /* rule 6[8] */
11:     ~b→~a /* rule 2[7,9] */
12:     ~a /* rule 2[11,10] */
13: } ((a→b)&~b) → ~a /* rule 9[8,12] */
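Both formulations can be double-checked semantically: each is a tautology, true under every combination of a and b. A quick truth-table sweep in Python (the tuple encoding of sentences is my own, with ASCII -> for the arrow):

```python
from itertools import product

def evaluate(s, env):
    """Evaluate a tuple-encoded sentence under an assignment env."""
    if isinstance(s, str):
        return env[s]
    if s[0] == "~":
        return not evaluate(s[1], env)
    x, y = evaluate(s[1], env), evaluate(s[2], env)
    return {"&": x and y, "|": x or y, "->": (not x) or y}[s[0]]

def tautology(s, variables):
    """True iff s holds under every combination of its variables."""
    return all(evaluate(s, dict(zip(variables, vals)))
               for vals in product([True, False], repeat=len(variables)))

a_implies_b = ("->", "a", "b")
first = ("->", a_implies_b, ("->", ("~", "b"), ("~", "a")))   # (a->b) -> (~b->~a)
second = ("->", ("&", a_implies_b, ("~", "b")), ("~", "a"))   # ((a->b)&~b) -> ~a

print(tautology(first, ["a", "b"]), tautology(second, ["a", "b"]))   # True True
```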

Mathematics and software do have something in common. The real power is in creating useful routines. You don't want to write out each proof, in this much detail, every time. Better to create, and agree upon, a "library" of certain theorems that will make other theorems easier to prove. (This is similar to invoking printf, strncpy, and recvmsg from C.) For example, one can prove that conjunction and disjunction are commutative and associative, and that they distribute past each other. Combine this with De Morgan's laws and you can safely rearrange any boolean expression as you were taught to do in high school. Since propositional calculus is part of every system of logic, these tools are used again and again.

The contrapositive, shown above, is a common technique for denying a claim - deny something that it implies. Similarly, proving an implication, and then its converse, is the standard method for proving equivalence. Some people refer to the contrapositive as the converse, but that is not correct. If we are given a→b, the converse is b→a. If I am trying to prove two conditions are equivalent, I will first prove a→b, and then open a new paragraph with the word "conversely", as I prove b→a. Once these two implications are established, rule 3 asserts a⇔b.

Consistent

Let x be a set of axioms which are introduced into the system. A theorem t is true if all the boolean assignments that satisfy the axioms also satisfy t. In other words, t is implied, semantically, by x.

On the other hand, t could be derived, syntactically, from x, using the aforementioned rules of inference. We would like to know that t can be derived from x iff it is implied by x. In other words, the rules don't make any mistakes (they are sound), and they are sufficient (they are complete).
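Semantic implication is straightforward to check by brute force, at least for a finite set of axioms over a handful of variables. A sketch in Python, on tuple-encoded sentences (my own convention, with ASCII -> for the arrow):

```python
from itertools import product

def evaluate(s, env):
    """Evaluate a tuple-encoded sentence under an assignment env."""
    if isinstance(s, str):
        return env[s]
    if s[0] == "~":
        return not evaluate(s[1], env)
    x, y = evaluate(s[1], env), evaluate(s[2], env)
    return {"&": x and y, "|": x or y, "->": (not x) or y}[s[0]]

def implied(axioms, t, variables):
    """t is implied, semantically, by the axioms iff every boolean
    assignment satisfying all the axioms also satisfies t."""
    for vals in product([True, False], repeat=len(variables)):
        env = dict(zip(variables, vals))
        if all(evaluate(a, env) for a in axioms) and not evaluate(t, env):
            return False
    return True

# X together with X->Y implies Y, but Y alone does not imply X:
print(implied(["X", ("->", "X", "Y")], "Y", ["X", "Y"]))   # True
print(implied(["Y"], "X", ["X", "Y"]))                     # False
# Unsatisfiable axioms imply everything, trivially:
print(implied([("&", "X", ("~", "X"))], "Y", ["X", "Y"]))  # True
```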

Restrict attention to the first 8 rules. (I'll deal with rules 9 and 10 later.) Perform an inductive argument on the length of any derivation. To start, each sentence in x is satisfied by the boolean assignments that satisfy all the sentences of x. Then, step through each of the 8 rules in turn and verify that any boolean assignment that satisfies the input also satisfies the output. At each step the new sentence is true (under all the axioms), so long as the previous sentences are true. We can't go wrong.

At the same time, and this is a subtle difference, you should verify, via truth tables, that the output, or at least one sentence of a multi-sentence output, is untrue only when at least one of the inputs is untrue. Thus, by induction, a derivation using the first 8 rules, that leads to an untrue statement, must have, for that combination of boolean variables, an untrue axiom.

What if there are no boolean combinations that satisfy x? There is no harm in inferring the sentence t, for any sentence. Every combination that satisfies x trivially satisfies t. Once again, rules 1 through 8 do not make any mistakes. Going the other direction, an inferred sentence t that is untrue for a combination c of boolean variables, comes from at least one untrue axiom (under c), since every combination c causes at least one axiom to fail. This is really a special case of the above, but it is worth noting, because it will come into play when we apply rule 10.

Now let's move on to rule 9. We have pushed a hypothesis h onto the set of axioms x, then inferred t using a shorter derivation, which preserves truth. That means t is true whenever x and h are satisfied. It is equivalent to say h implies t in the context of x, so it is accurate and consistent to derive h→t.

Suppose you have derived h→t, using rule 9, yet the implication is untrue. Thus h is true and t is false. The axioms x+h derive t, which is untrue. This is a shorter derivation, hence one of the axioms is untrue. It isn't h, so one of the axioms in x is untrue.

Finally consider rule 10. Let x be the prior set of axioms and assumptions, on the stack at this point in the derivation. If x cannot be satisfied, there is no harm in bringing in ~h. So assume c is a boolean combination that satisfies x. Bring in h, a supplemental hypothesis that may or may not be satisfied by c. Then derive t, ~t, or both. Since this derivation is shorter, it preserves truth. Yet t and ~t cannot both be true. An untrue statement, for the combination c, has to come from an untrue axiom. All the axioms of x are satisfied, so h(c) is false. This holds for every c satisfying x, so ~h is true, and it is consistent to derive ~h.

Suppose you have derived ~h, using rule 10, yet there is a combination c satisfying h. A shorter derivation, culminating in ~h, derives an untrue statement, hence one of the axioms fails for c. It isn't h, so one of the prior axioms fails.

Throughout this proof, I was careful to demonstrate that truth leads to truth, and failure always comes from failure. This is done so that the proof of rule 10 does not become a circular argument, drawing upon rule 10.

In summary, these rules preserve truth at every step, and by induction, they preserve truth for any theorem that can be proved in finitely many steps. If you have proven it, it is so.

As a corollary, the rules cannot prove t and ~t simultaneously, provided the axioms admit at least one boolean solution. Thus we say the rules are "consistent".

Outside the Box

Note that the proof relies heavily on induction, which is not part of propositional calculus. You have to step outside the box to prove these rules of inference are consistent and complete. (Completeness is shown below.) In other words, this proof exists in numeric logic, in order to make an assertion about boolean logic. We will see that numeric logic is powerful enough to prove some theorems about itself - but boolean logic is not.

Complete

We showed that the rules are consistent, i.e. every derived statement is true. Now we want to show the rules are complete, i.e. every true statement can be derived. We will prove the contrapositive: a statement that cannot be derived is not true.

Let t be a statement that cannot be deduced from x. Since statements are countable, arrange them in an ordered list. For each statement h in turn, add h to the set of axioms iff the resulting set still does not prove t. Let u be the union of x and all the statements brought in this way. If u proves t, it does so in finitely many steps, drawing upon finitely many axioms, even though u may be infinite. Thus some earlier, finite set of axioms would prove t. This is a contradiction, hence u does not prove t. Also, u is maximal with this property: bring in one more statement h, and if t still could not be proved, we would have brought h in at the time of its consideration.
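The construction of u can be acted out on a finite scale. In the sketch below I enumerate all sentences over X and Y up to a fixed depth, and I substitute semantic implication for derivability (a legitimate stand-in once completeness is established). Everything here - the tuple encoding, the depth bound, the choice of t - is my own illustration.

```python
from itertools import product

def evaluate(s, env):
    """Evaluate a tuple-encoded sentence (~, &, | only) under env."""
    if isinstance(s, str):
        return env[s]
    if s[0] == "~":
        return not evaluate(s[1], env)
    x, y = evaluate(s[1], env), evaluate(s[2], env)
    return {"&": x and y, "|": x or y}[s[0]]

def implied(axioms, t, variables):
    """Semantic implication, by truth-table sweep."""
    return all(evaluate(t, dict(zip(variables, vals)))
               for vals in product([True, False], repeat=len(variables))
               if all(evaluate(a, dict(zip(variables, vals))) for a in axioms))

def sentences(depth):
    """All sentences over X and Y, using ~ & |, up to the given depth -
    a finite stand-in for the countable list of all sentences."""
    if depth == 0:
        return ["X", "Y"]
    smaller = sentences(depth - 1)
    out = list(smaller)
    out += [("~", s) for s in smaller]
    out += [(op, s, r) for op in ("&", "|") for s in smaller for r in smaller]
    return out

t = ("&", "X", "Y")   # a statement the (empty) set of axioms does not imply
u = []                # grow a maximal set that still fails to imply t
for h in sentences(2):
    if h not in u and not implied(u + [h], t, ["X", "Y"]):
        u.append(h)

print(implied(u, t, ["X", "Y"]))    # False: u still avoids t
print("X" in u, ("~", "Y") in u)    # True True: u pins down a combination c
```

For every skipped sentence h, u plus h implies t; in particular u holds s or ~s for each s, pinning the combination c = {X: true, Y: false}.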

Our set u is consistent. If u contains h and ~h, assume ~t, derive h and ~h (trivially), apply rule 10 to get ~~t, and apply rule 1 to prove t - contradicting the fact that u cannot prove t.

If some of the axioms in u can be used to create a new statement h, not in u, using our rules of inference, fold h into u and find a larger set that cannot prove t. (If it could prove t, then the axioms of u are sufficient to prove t.) Therefore u is closed under the rules of inference.

Let s be an arbitrary sentence, and suppose u contains neither s nor ~s. Bringing in s proves t, hence, by rule 9, we have s→t. This is derived from u, and is part of u. Similarly ~s→t is part of u.

Let h be the hypothesis ~s&s. Bring in h and derive both ~s and s, a contradiction. Apply rule 10, giving ~h. By De Morgan's laws, this becomes s|~s. Apply rule 8, using s→t and ~s→t from the previous paragraph, and prove t. This is a contradiction, hence u contains either s or ~s for every sentence s.

Each boolean variable acts as its own sentence, thus each boolean variable, or its complement, stands alone in u. This assigns a value, true or false, to each variable. Call this configuration c. Thus c satisfies all the atomic sentences of u. Furthermore, c is the only configuration that might satisfy u. Let's see if it does.

Let a statement s consist of two smaller statements joined by a double arrow. This can be derived from the axioms iff both implications can be derived. This is merely an application of rules 3 and 4. Thus double arrows can be "normalized" away. In the same way, a→b can be derived iff ~a|b can be derived. Thus implications can be removed. Finally, use De Morgan's laws to push complement down to the variables. Thus s is derived iff its normalization, consisting of "and", "or", and (possibly negated) variables, can be derived.
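This normalization is mechanical. A sketch in Python, on tuple-encoded sentences (my own convention, with ASCII -> and <-> for the arrows):

```python
def normalize(s):
    """Rewrite s so that only &, |, and possibly negated variables remain:
    replace <-> with two implications, replace a->b with ~a|b, and push
    ~ down to the variables with rule 1 and De Morgan's laws."""
    if isinstance(s, str):
        return s
    op = s[0]
    if op == "<->":                        # a<->b  becomes  (a->b)&(b->a)
        return normalize(("&", ("->", s[1], s[2]), ("->", s[2], s[1])))
    if op == "->":                         # a->b  becomes  ~a|b
        return normalize(("|", ("~", s[1]), s[2]))
    if op in ("&", "|"):
        return (op, normalize(s[1]), normalize(s[2]))
    # op is "~": push the complement inward
    t = s[1]
    if isinstance(t, str):
        return ("~", t)                    # a negated variable stands
    if t[0] == "~":
        return normalize(t[1])             # rule 1: double negative
    if t[0] == "&":                        # De Morgan: ~(a&b) = ~a|~b
        return normalize(("|", ("~", t[1]), ("~", t[2])))
    if t[0] == "|":                        # De Morgan: ~(a|b) = ~a&~b
        return normalize(("&", ("~", t[1]), ("~", t[2])))
    return normalize(("~", normalize(t)))  # ~(a->b), ~(a<->b): rewrite inside first

print(normalize(("->", "a", "b")))         # ('|', ('~', 'a'), 'b')
```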

Let s be a statement that is true under c. Show that s can be derived from the individual atoms that establish c. Proceed by induction on the length of s. If s is the "and" of two shorter sentences then both those sentences are true. They can be derived, and by rule 5, s can be derived. If s is the "or" of shorter sentences then one of them is true, and can be derived, whence s can be derived by rule 7. Finally, if s is a single atom it has to be one of the atoms that defines c. In other words, s is derived trivially. Therefore s can be derived, and is part of u.

In summary, u contains every sentence that is true under c, and since u is consistent, i.e. it cannot contain both s and ~s, u omits every statement that is false under c.

Since u cannot prove t, u does not contain t, hence u contains ~t. This means ~t(c) is true, and t(c) is false. The original axioms, which are part of u, are satisfied by c, and t is not. This means t is not true. Every theorem that cannot be derived is not true, and therefore, every theorem that is true can be derived. This system of logic is consistent and complete.

Constructivists

The constructivists, a fringe group of mathematicians, attempt to prove theorems without rule 10. They don't like indirect proofs. This seems like a silly intellectual exercise to me, making things harder than they need to be, but that's just my opinion.

You may also be missing out on some theorems that are true. Consider the completeness theorem above. It relies on rule 10. If you aren't allowed to use rule 10, you can't prove u contains s or ~s, and the completeness argument breaks down. So there may indeed be theorems, i.e. simple boolean formulas, that are true under your axioms, but can't be proven. This result extends up through higher systems of logic. Your favorite theorem from topology or number theory may become unprovable if you eschew indirect proofs.

More to come; this topic is unfinished.