Complexity Theory, Satisfiability

Satisfiability

Let the satisfiable language be the set of well-formed boolean expressions that are true for some combination of their inputs. for instance, x|y is in the language, since there are plenty of ways to make that expression true, but x&~x is not. This language is np complete. In other words, finding a solution to a boolean expression is np complete.

I've glossed over a subtle difference here. Satisfiability says a solution exists, without telling you what it is. As it turns out, knowing there is a solution, and finding one, are equivalent. Obviously the latter implies the former. Conversely, assume you have a polynomial time procedure that tells you whether a given boolean expression has a solution. Take the expression you are interested in and see if it has a solution. If it does, and it with x and try again. If a solution persists then you are free to set x = 1 throughout. If the solution disappears then x has to be 0. Do the same for y, and z, and so on. Repeat this process until all variables are established.

The string of length n has at most n variables, far fewer in practice, so we are multiplying the execution time by n, and it remains polynomial. Knowing there is a solution is the same as finding one.

There are many ways to encode a boolean expression using a finite alphabet. You wouldn't use letters like x and y, you'd use numbers, representing boolean variables x₁ x₂ x₃ x₄ etc, hence the number of variables is not constrained. Reserved symbols indicate | & ~ ( ). It's pretty easy to verify the symtax of the expression in linear time, so let's take that as a given.

An ntm sets each variable to true or false, evaluates the expression, and halts, indicating success if the expression is true. Of course the ntm sets the variables to true or false nondeterministically, trying all combinations in parallel, and if any of them satisfies the expression, the ntm returns success. Keep the stack in a separate scratch area, and you can evaluate the expression in linear time. Therefore satisfiability is an np problem.

Now for the converse, which is a bit tricky, but very beautiful. Let l₁ be an np language, as evidenced by the ntm m₁. In other words, m₁ recognizes w ∈ l₁, in p(|w|) time, for a fixed polynomial p(n).

Given a word w, which may or may not be accepted by m₁, let p be shorthand for p(|w|). In other words, m₁ accepts or rejects w in at most p steps.

Let the string #x₀#x₁#x₂#x₃#…#x_p# represent successive snapshots of the tape, as m₁ marches along. After i moves, x_i is on the tape. Clearly x₀ is the input word w.

Include blanks, so the length of each intermediate string is p. In p steps, the string cannot grow longer than p, hence the tape is effectively bounded, and the length of each snapshot can be set to p.

If m₁ accepts early, repeat the final snapshot again and again, out to x_p. There are now p+1 snapshots, each having length p. Bring in the # delimiters, and the entire string has length (p+1)²+1.

Assume m₁ has q states. Given any one of these states, and 0 1 or blank (input), assume m₁ has at most k choices, at most k possible state transitions. Since a transition writes a digit, moves left or right, and goes to a new state, k is at most 4q.

Create a new symbol for each valid state transition. The symbol encodes the input (0 1 or blank), the current state, and the path taken, e.g. an integer from 1 to k. There are at most 3qk such composite symbols. Our alphabet is now 0, 1, blank, and these composite symbols.

As mentioned earlier, x_i consists of zeros, ones, and blanks; however, one character in each x_i is a composite, determining the symbol on the tape, the current state, and the transition selected at this point.

For each position in the string, and for each character in our alphabet, create a boolean variable that will be true if that character appears in that position in the string. Then build constraints that will be anded together to create a large boolean expression. For starters, the boolean variables that assign characters to x₀ are set, according to the input word w. Remember that the first character in x₀ is an exception. It is not 0 1 or blank, hence those three boolean variables are false. Instead, the first character is a composite that encodes the first symbol in w, the start state, and the first transition of m₁. There can be k such characters, so or the corresponding boolean variables together. Thus m₁ begins in its start state, and has to follow one of its legal state transitions.

Across the entire string, exactly one character can be placed in each position. This rule is encoded by straightforward, albein large, boolean expressions.

One of the characters in x_p denotes a final state. This is an "or" of many boolean variables that place final states anywhere in x_p.

Transition derived boolean expressions tie the chain together. The i^th character is determined by the three characters at positions i-p-2, i-p-1, and i-p. If a composite symbol is involved, we could be applying a state transition, so set the i^th character accordingly. Otherwise make it the same, 0 1 or blank, as its counterpart in position i-p-1. These are all implications, e.g. this implies that, easily encoded as boolean expressions.

A consistent solution to this boolean expression defines an execution path that allows m₁ to accept w, and if m₁ excepts w, the snapshots set the boolean variables in a manner that is consistent with our boolean expression. In other words, m₁ accepts w iff our boolean expression can be satisfied. The language l₁, which is an arbitrary np language, is a reduction, i.e. a specific instance, of satisfiability.

We still neeed to show the conversion takes place in polynomial time. Given w, measure its length and compute p(|w|), which I called p earlier. The size of our alphabet, approximately 3qk, is a constant relative to p. Thus the number of boolean variables is proportional to p², which is still a polynomial. Each boolean variable produces a set of logical constraints, independent of the other variables. Again, we are multiplying by a constant. Put it all together and the conversion is proportional to p². The transformation is well behaved, and satisfiability is np complete. Solve the satisfiability problem, and all np problems fall.

Equivalent Formulas

If two expressions e₁ and e₂ are built using the same variables, the task of determining whether they have the same solutions is np complete.

By evaluating e₁&~e₂ | e₂&~e₁, we can search for a solution to one expression and not the other, hence the problem is np.

Now suppose equivalence can be determined in polynomial time, and let e₁ be an arbitrary boolean expression. If e₁ is built from the variables a through z, let e₂ = e₁&a. If e₁ and e₂ have the same solutions then forcing a to be true changes nothing. Otherwise there is a solution with a false. Substitute for a in e₁, reducing the number of variables. Continue this process until e₁ contains only one variable, whence satisfiability is easily determined.