Grammar Structure

The format for a context free grammar is as follows:
  1. Any reference to "punctuation" when describing legal identifiers is restricted to certain pieces of punctuation. In no particular order, here are the punctuation characters that are legal: ~ ! @ # * ( ) _ + ' ; : / ?
  2. Terminals are strings which either start with a lowercase letter, a number, or a piece of punctuation, followed by any amount of letters, numbers and punctuation. i.e. a terminal matches the regular expression [{punctuation}a-z0-9][{punctuation}a-zA-Z0-9]*. Examples are id + begin end plus 9 8' 7+? '' ....
  3. Nonterminals start with uppercase letters, followed by any number of legal characters. i.e. a nonterminal matches the regular expression [A-Z][{punctuation}a-zA-Z0-9]*. Examples are S A EXPR TERM~3 Stop ....
  4. It is assumed that the start symbol is the first nonterminal whose productions are given.
  5. The productions associated with a nonterminal are indicated as
    head -> RHS1
          | RHS2
          | RHSn.
    where the alternative righthand sides of the productions are separated by a slash, "|", and terminated by a period, ".".
  6. The RHS of a production is a sequence of terminals or nonterminals separated by spaces. For example: Expr add Expr.
  7. C-style comments are allowed, e.g.
    /* this is a comment 
    possibly spanning many lines */

Empty grammars

The grammar X IS EMPTY. for any nonterminal X is a valid grammar. This represents a grammar with no productions. The nonterminal X is needed as the start symbol, as all context-free grammars must have start symbols.

Empty grammars form a subset of the grammars for the empty language, and there is a subtle difference between the two. For example, the non-empty grammar S -> T. T -> S. does not produce any strings — i.e., the grammar forms the empty language. This owes to the start symbol S being unrealizable. The grammar S IS EMPTY. is a grammar for the same language with the start symbol, and additionally the grammar itself is empty. Indeed transforming the former grammar via removing unrealizable productions will yield the latter grammar.