Context Free, Push Down Machines

Push Down Machines

To provide more power, add a push down stack to a finite state machine, producing a push down machine (pdm).

A fixed set of letters acts as the stack alphabet. These are the symbols that can be pushed onto, or popped off of, the stack. This is usually different from the language alphabet. The words of the language might be written in the letters ABCDE, while the stack alphabet employs qrstu. By convention, the language uses digits or upper case letters, while lower case letters represent states and/or stack symbols.

Each transition pops a symbol off the stack. Based on this symbol, the current state, and the input character, the pdm enters a new state and pushes a list of symbols (possibly empty) onto the stack. The word is accepted if it leaves the pdm in a final state.
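
To make this concrete, a transition table can be pictured as a map from (state, input character, popped symbol) to (new state, string to push). The following Python fragment is only a sketch; the states, characters, and stack symbols are invented for illustration.

    # A hypothetical pdm fragment.  Each entry maps
    # (state, input character, popped symbol) to (new state, string to push).
    # The push string is pushed left to right, so its last symbol becomes the new top of stack.
    delta = {
        ('p0', 'A', 's'): ('p1', 'sq'),   # pop s, push s then q; q is the new top
        ('p1', 'B', 'q'): ('p1', ''),     # pop q, push nothing
        ('p1', 'B', 's'): ('p2', 's'),    # pop s, push s right back
    }
    final_states = {'p2'}                 # the word is accepted if it ends here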

The pdm has a start state, and a start symbol preloaded on its stack.

If the stack is empty, there is no symbol to pop, and the machine cannot continue. Longer words are not accepted, and the current word, read thus far, is accepted iff the pdm is in a final state.

If every transition pops the start symbol s off of the stack, and pushes s right back, the stack never changes, and the pdm is simply an fsm. In other words, every regular language is accepted by a pdm.
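
Concretely, an fsm's transition table can be wrapped into a pdm table in one pass. A sketch, assuming the fsm is stored as a dictionary from (state, character) to the next state:

    # Hypothetical helper: turn an fsm into a pdm that ignores its stack.
    def fsm_to_pdm(fsm_delta, stack_symbol='s'):
        pdm_delta = {}
        for (state, ch), nxt in fsm_delta.items():
            # pop the symbol and push the same symbol: the stack never changes
            pdm_delta[(state, ch, stack_symbol)] = (nxt, stack_symbol)
        return pdm_delta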

Nondeterministic

A nondeterministic push down machine (npdm) is, well, nondeterministic, like a nondeterministic fsm. Each combination of stack symbol, state, and input character may present several choices, and somehow, the npdm makes all of them in parallel. It also permits an E transition, which carries a state and a popped stack symbol to a new state, pushing a (possibly empty) string of symbols onto the stack, without reading, or even interrogating, an input character. Since E transitions could push the stack to infinity, they consume a clock cycle, just like a character driven state transition. (This is different from the fsm, where E transitions were folded into the next state transition.) At each step, the npdm makes all possible E transitions and all possible character transitions.

Here is another (equivalent) way to look at it. At each step, the npdm makes the correct transition, with or without the benefit of the next input character. Having made the correct choices, one after another, the npdm winds up in a final state, and the word is accepted.
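
The parallel exploration can be simulated directly: keep a set of configurations (position in the word, state, stack) and expand them breadth first. The sketch below assumes transition tables shaped like the earlier fragment, except that each entry now maps to a set of (state, push string) pairs; the search cap is an artificial safeguard, since E transitions can in principle grow the stack without bound.

    from collections import deque

    # A sketch, not an efficient recognizer.  delta maps (state, character, popped symbol)
    # to a set of (new state, push string) pairs; e_delta maps (state, popped symbol)
    # to a set of (new state, push string) pairs for the E transitions.
    def npdm_accepts(word, delta, e_delta, start_state, start_symbol, finals, cap=10000):
        start = (0, start_state, (start_symbol,))      # (position, state, stack), top of stack last
        queue, seen = deque([start]), {start}
        while queue and len(seen) < cap:
            pos, state, stack = queue.popleft()
            if pos == len(word) and state in finals:
                return True                            # whole word read, final state reached
            if not stack:
                continue                               # empty stack: this branch is stuck
            top, rest = stack[-1], stack[:-1]
            moves = [(pos, m) for m in e_delta.get((state, top), ())]
            if pos < len(word):
                moves += [(pos + 1, m) for m in delta.get((state, word[pos], top), ())]
            for new_pos, (new_state, push) in moves:
                cfg = (new_pos, new_state, rest + tuple(push))
                if cfg not in seen:
                    seen.add(cfg)
                    queue.append(cfg)
        return False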

Empty Stack

Acceptance by empty stack is an equivalent criterion for success. Given a traditional (final state) machine m1, build a new machine m2 that starts with b on the stack, for "bottom of stack". On an E transition, m2 pushes s, the start symbol of m1, and moves to the start state of m1. Now m2 runs as m1, but if it reaches a final state, it may use E transitions to empty its stack, down to and including b. Of course m2 is nondeterministic, so it empties the stack only when the word is over and this final state is the one that accepts it. An npdm always knows what to do.

If b pops up while m1 is in a nonfinal state, m1 has run into an empty stack, which is invalid in that context. In this case m2 puts b back on the stack, and stays in the same state, eating characters. Nothing is accepted beyond this point. Therefore m2 has an empty stack iff m1 ends up in a final state.

Conversely, assume m1 succeeds with empty stack. Again, build a new machine m2 that starts the stack with the symbol b, and uses an E transition to push s and move to the start state of m1. New E transitions carry b, and any state of m1, to a new final state in m2. Since b is popped and nothing is pushed back, the stack is empty, and m2 stops in its final state. Thus m2 accepts, by final state, exactly when m1 empties its stack; if m1 never empties its stack, m2 never sees b, and never reaches its final state.

The criteria are equivalent, and the empty stack criterion is often more convenient.
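
The first construction can be written down as a transformation on the E transition table, using the conventions from the earlier sketches. The helper below is an illustration only: the fresh names b, start2, and drain are invented, m1's character transitions carry over unchanged, and the special handling of b surfacing in a nonfinal state is omitted for brevity.

    # Hypothetical: convert acceptance by final state into acceptance by empty stack.
    # e_delta maps (state, popped symbol) to a set of (new state, push string) pairs.
    def final_state_to_empty_stack(e_delta, finals, old_start, start_symbol,
                                   stack_alphabet, bottom='b', new_start='start2', drain='drain'):
        e2 = {k: set(v) for k, v in e_delta.items()}
        # From the new start state, push s on top of b and enter the old machine.
        e2.setdefault((new_start, bottom), set()).add((old_start, bottom + start_symbol))
        # From any final state of m1, an E transition may begin draining the stack.
        for f in finals:
            for sym in list(stack_alphabet) + [bottom]:
                e2.setdefault((f, sym), set()).add((drain, ''))
        # The draining state keeps popping until the stack, including b, is gone.
        for sym in list(stack_alphabet) + [bottom]:
            e2.setdefault((drain, sym), set()).add((drain, ''))
        return e2, new_start, bottom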

In the deterministic world the criteria are not equivalent. If m1 accepts a word via empty stack it cannot continue, and cannot accept a longer word. Even a language as simple as 0^n is inaccessible. So we revert to the original "final state" definition. Now m1 can accept a word, and a longer word, and a longer word, and so on, as it moves in and out of its final states.

cfl iff npdm

A language is context free iff it is accepted by an npdm.

Given a context free grammar, convert to Greibach normal form. The terminals are the letters of the alphabet, and the nonterminals are the stack symbols. There is but one state. The production x → Cyz corresponds to a transition on the stack symbol x, with input character C, that pushes zy onto the stack. Notice that the right side of the production is reversed, then pushed onto the stack. An input word leaves the stack empty, and is accepted, iff it is produced by the grammar.
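
Under these conventions, the conversion is nearly mechanical. Here is a sketch; each production is assumed to be stored as a triple (nonterminal, terminal, trailing nonterminals), and the single state is named 'q' for illustration.

    # Hypothetical: build the one-state npdm from a grammar in Greibach normal form.
    # A production x -> C y z is stored as ('x', 'C', 'yz').
    def gnf_to_npdm(productions, start_nonterminal):
        delta = {}
        for x, c, tail in productions:
            # Pop x, read C, push the tail reversed, so its first nonterminal ends up on top.
            delta.setdefault(('q', c, x), set()).add(('q', tail[::-1]))
        # One state, stack preloaded with the start symbol; accept by empty stack.
        return delta, 'q', start_nonterminal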

Conversely, start with an npdm and build a context free grammar that generates the same language. A grammar doesn't have an ancillary state machine, so the states must somehow be folded into the nonterminals. Create a nonterminal for each state / stack-symbol / state triple. Focus on a state transition that pops x off the stack, reads C from the input, moves from state q3 to state q7, and pushes zyw on the stack. This single transition is going to create a lot of productions. A sample production looks like this.

q3xq15 → C q7wq2 q2yq9 q9zq15

You'll recognize the states q3 and q7 in this production; the other states, q2, q9, and q15, are arbitrary. If the machine has n states, there are n^3 productions associated with this state transition.

If this were an E transition, there would be no terminal in the production. Simply remove C from the above example. All these transitions combine to build a context free grammar - but where do we start?

Introduce a start symbol s, and let unit productions take s to q0bqj, where q0 is the machine's start state, and b is the symbol at the bottom of the stack. By selecting one of these nonterminals, the grammar anticipates the first state transition; the machine will move from state q0 to state qj, having popped b off the stack, and having read the first character C from the input word. (C is not read if the first step is an E transition.) The first unit production, which trades s in for a more useful nonterminal, can be followed by a second production iff the selected nonterminal corresponds to the machine's first state transition. The process has begun.

Return to the transition above, the one corresponding to the production x → Cwyz. As the machine eats C and pushes zyw onto the stack, the grammar selects just the right production, so that the next time the machine pops w (right away), and y (sometime later), and z (later still), it will be in the states that appear on the left of those nonterminals. The grammar is nondeterministic, like the machine, so it has no trouble selecting the right production for the situation. The grammar generates the input word as the machine empties its stack. Conversely, every word generated by the grammar has a derivation, in canonical (leftmost) form, that traces out the state transitions of the machine. Once again the machine empties its stack as the last nonterminal is consumed.
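
The bookkeeping is easier to see in code. The sketch below generates the productions contributed by a single transition; a nonterminal is written as a (state, stack symbol, state) triple, and the arbitrary intermediate states are supplied by itertools. The function name and the data layout are invented for illustration.

    from itertools import product

    # Productions for one transition that pops x, reads C (C = None for an E transition),
    # moves from state q to state r, and pushes the string `pushed`
    # (pushed left to right, so its last symbol is the new top of stack).
    def productions_for_transition(states, q, c, x, r, pushed):
        prods = []
        if not pushed:
            # Nothing pushed: popping x carries the machine straight from q to r.
            prods.append(((q, x, r), [c] if c is not None else []))
            return prods
        pops = pushed[::-1]                  # the order in which the pushed symbols will be popped
        for mids in product(states, repeat=len(pops)):
            chain = [r] + list(mids)         # r, then one arbitrary state per popped symbol
            body = [c] if c is not None else []
            for i, sym in enumerate(pops):
                body.append((chain[i], sym, chain[i + 1]))
            prods.append(((q, x, chain[-1]), body))
        return prods

Calling it with q = q3, r = q7, the character C, and pushed = 'zyw' yields n^3 productions, one of which is the sample production shown above.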

In summary, a language is context free, i.e. generated by a context free grammar, iff it is accepted by a nondeterministic pdm, using either the "final state" or the "empty stack" criterion.

Deterministic PDM

Return to the world of deterministic pdms, and build a machine that accepts 0^n1^n. Since the empty string is legal, the start state is a final state. With b on the stack, and 0 coming in, move from the start state to a new state, leaving b on the stack. As more 0's come in, push them onto the stack; then, when you read a 1, switch states and pop a 0. Continue popping 0's as long as 1's come in, until you pop b, which signals success and carries the machine to a second final state. This machine reaches a final state and empties its stack when a word is recognized.
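
Here is a direct simulation of this machine; the state names are invented, but the transitions follow the description above, and the word is accepted when it ends with the machine in a final state.

    def accepts_0n1n(word):
        # One concrete deterministic pdm for 0^n1^n; 'start' and 'done' are the final states.
        delta = {
            ('start', '0', 'b'): ('zeros', 'b'),    # first 0: leave b on the stack
            ('zeros', '0', 'b'): ('zeros', 'b0'),   # later 0's: push a counter
            ('zeros', '0', '0'): ('zeros', '00'),
            ('zeros', '1', '0'): ('ones', ''),      # first 1: start popping counters
            ('zeros', '1', 'b'): ('done', ''),      # the word was 01
            ('ones', '1', '0'): ('ones', ''),
            ('ones', '1', 'b'): ('done', ''),       # popped b: the counts match
        }
        state, stack = 'start', ['b']               # top of stack is the end of the list
        for ch in word:
            if not stack:
                return False                        # stack already empty: no longer word is accepted
            move = delta.get((state, ch, stack.pop()))
            if move is None:
                return False
            state, push = move
            stack.extend(push)                      # pushed left to right, last symbol on top
        return state in ('start', 'done')

For example, accepts_0n1n('000111') returns True, while accepts_0n1n('00111') and accepts_0n1n('0110') return False.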

Use the pumping lemma to show 0^n1^n is not a regular language. Thanks to its stack, a pdm is more powerful than an fsm.

However, some context free languages are inaccessible to a deterministic pdm. Consider the language of binary palindromes, such as 010001100010. With each passing bit of an arbitrarily long half word, the pdm must store one bit of information. If it has n states, log_2(n) bits can be stored in the states, but the rest must be stored on the stack. As it compares this against the second half of the word, the stack must be popped, and the information lost. It might use states to validate the very end of the palindrome, but the stack must store and compare most of the word.

Beyond a certain length, a word that is accepted leaves the pdm in a final state, with a stack whose depth is bounded by a fixed constant corresponding to the number of states in the pdm. There are only so many final states, and so many ways to push a bounded number of symbols onto the stack. Some of these patterns must repeat. There are different palindromes, x and w, that leave the pdm in the same success configuration. If we continue on from here, xw and ww will both succeed or both fail together, yet ww is a palindrome whenever w is, while xw, for a suitable choice of x and w, is not.

We have found a context free language that is not accepted by a deterministic pdm.

Perhaps the languages accepted by these machines should be declared type 2.5, somewhere between context free and regular. They are known as the deterministic context free languages.