Languages, Grammar Types

Grammar Types

Grammars are partitioned into 4 types, based on their complexity. A grammar is type 3, or regular, if its rules are as follows. Here x and y are arbitrary nonterminals, T is an arbitrary terminal, and E is the empty string.

x → E | T | y | yT

Alternatively, we coulde replace the format x → yT with x → Ty, spinning terminals off to the left as we go. The language is essentially the same, except the words are reversed. Use whichever form is convenient.

A grammar is type 2, or context free, if the left side of each rule is a single nonterminal.

A grammar is type 1, or context sensitive, if each string generates a string that is at least as long. Intermediate strings never get shorter as you proceed through the derivation. The production s → E is tolerated, if the language includes the empty word.

Type 0 grammars are not constrained in any way.

By definition, a type x language can be defined by a type x grammar, but not a type x+1 grammar. We will eventually show that type x languages exist, for x in 3 2 1 0. This hierarchy really does represent increasing levels of complexity.

Finite Languages are Regular

When a language consists of a finite set of words, a regular grammar defines that language.

In this construction, wordi,j is the jth letter of the ith word. The variables xi,j are nonterminals that support the grammar. (In both cases, we start numbering at 0.) For each wordi in the language, construct a rule:

s → xi,0

Thus the first stepin the derivation selects the word. Subsequent steps spell out the word, as we shall see. Add the following rules, for each letter in each word.

xi,j → wordi,j xi,j+1
xi,j → wordi,j (for the last letter in the ith word)

The rule s → E is added if the language includes the empty word.

This is a theoretical result with little practical value, since natural (English) and artificial (fortran) languages, although finite, are much too large for exhaustive enumeration. Besides, a simple list of "legal" strings denies the structure of the language, which is necessary for comprehension, whether you are a compiler or a human brain.