intro_grammar, R J Botting (CSUSB Comp Sci Dept), Fri Mar 26 08:02:20 PST 2004

Contents


    Form of a grammar

    A grammar is a collection of rules or definitions that describe how to construct valid or correct strings in a language. The symbols that make up the strings in the language (characters, words, or lexemes) are called the terminal symbols. The set of terminal symbols is symbolized as T,
  1. T::Sets=given set of terminal symbols.

    The terms used to talk about the language, forming its meta-language, are a separate set of symbols (N) that does not overlap with T,

  2. N::Sets=given set of nonterminal symbols
  3. |- (disjoint): T & N = {}.

    A grammar generates the strings in its language by replacing non-terminal symbols by terminal ones. This is why we use the terms terminal and non-terminal.

    The grammar has a vocabulary V made up of symbols in either T or N.

  4. V::= T | N. The vocabulary V is split into the terminals or elements (T) and the defined terms or nonterminals (N).
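
    As a minimal sketch of formulas 1 to 4, the sets can be modelled with Python sets. The particular symbols "a", "b" and "L" are assumptions borrowed from the example grammar used later in these notes; they are not part of the definitions themselves.

```python
# Assumed example symbols, taken from the grammar L := "" | a L b.
T = {"a", "b"}   # formula 1: the given set of terminal symbols
N = {"L"}        # formula 2: the given set of nonterminal symbols

# Formula 3: T and N are disjoint, so their intersection is empty.
assert T & N == set()

# Formula 4: the vocabulary is made up of the symbols in either T or N.
V = T | N
```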

    The rules describe a collection of operations or functions that take a string of symbols from both N and T and change it to another one. The rules only apply when the string has at least one non-terminal, and typically replace it by a string of terminals and nonterminals.

    In context free grammars the rules are expressed as a definition of a nonterminal as a string of terminals and nonterminals. The mapping is to substitute the nonterminal by its expression as a string. For example, the rule ' L:=a L b' describes a mapping/function/operation that substitutes the string a L b for an occurrence of L.

    A MATHS grammar gives each term n in N a formula (D[n](m1,m2,...)) that defines n in terms of the nonterminals m1,m2,... (which may include n itself) and some terminals, using the following operations "(&|~#)".

    In the above example L:=a L b defines "L" in terms of "a", "b", and "L". So the L-rule is to replace a string x by "a" x "b". Thus:

  5. For all x, D["L"](x) = "a" x "b". or
  6. D["L"] = \lambda[x]( "a" x "b").
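
    Formulas 5 and 6 can be sketched in Python as follows. The names d_L, d_L_lambda and apply_n are hypothetical, chosen only for this illustration, and juxtaposition in the formula is read here as ordinary string concatenation.

```python
def d_L(x: str) -> str:
    """Formula 5: D["L"](x) = "a" x "b"."""
    return "a" + x + "b"

# Formula 6: the same mapping written as a lambda expression.
d_L_lambda = lambda x: "a" + x + "b"

def apply_n(x: str, n: int) -> str:
    """Apply the L-rule n times to the string x."""
    for _ in range(n):
        x = d_L(x)
    return x
```

    Under these assumptions d_L("") is "ab" and apply_n("", 3) is "aaabbb".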

    Meaning of a grammar

    Each terminal and nonterminal represents a set of strings of symbols in an alphabet or vocabulary.

    There are several models of strings and some have been translated into MATHS [ intro_strings.html ] [ logic_6_Numbers..Strings.html ] [ math_61_String_Theories.html ] [ math_62_Strings.html ] [ math_66_SuperStrings.html ]

    I assume in the theory of grammars that strings are generated by starting with an empty string ("") and putting symbols (in T) onto it using the concatenation operation (!).

    The pre-defined operations of union ('|'), intersection ('&'), complementation ('~'), concatenation, and iteration ('#') apply to sets of strings. I treat a string as a set when the context requires it: a string s is cast into the singleton set {s} when necessary.

  7. For s:string, s::sets of strings = { s }.

    For A,B: sets of strings,

  8. A B ={c || c=a!b and a in A and b in B},
  9. A & B ={c || c in A and c in B},
  10. A | B ={c || c in A or c in B},
  11. A ~ B ={c || c in A and not( c in B) }.

    Note: { x || P } is short for the set of x such that P is true. [ intro_sets.html ] [ logic_30_Sets.html ] [ logic_32_Set_Theory.html ]
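
    Formulas 7 to 11 map closely onto Python's built-in sets, so a small sketch may help. The helper names as_set and concat and the sample sets A and B are assumptions made for illustration. The MATHS operation A ~ B is rendered with Python's set difference A - B.

```python
def as_set(s):
    """Formula 7: cast a string s into the singleton set {s}; leave sets as they are."""
    return {s} if isinstance(s, str) else set(s)

def concat(A, B):
    """Formula 8: A B = {a!b || a in A and b in B}."""
    return {a + b for a in as_set(A) for b in as_set(B)}

# Assumed sample sets of strings.
A = {"a", "ab"}
B = {"ab", "b"}

AB = concat(A, B)         # formula 8:  {"aab", "ab", "abab", "abb"}
both = A & B              # formula 9:  {"ab"}
either = A | B            # formula 10: {"a", "ab", "b"}
only_A = A - B            # formula 11: {"a"}
```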

    The Kleene closure operator (#) can be read "zero or more of". The closure #A can be defined as the smallest set of strings that contains the null string and the results of concatenating an element of A onto an element of the closure:

  12. #A =least{ X || X=({""} | A X) }. '#' could be defined many other ways. In MATHS it is a standard operator.
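
    Since #A is infinite whenever A contains a non-empty string, it cannot be listed in full. The sketch below, which is only one possible reading of formula 12, repeats the step X := {""} | A X from the empty set and keeps only strings up to an assumed bound max_len. The function name kleene_upto and the bound are hypothetical, and A is assumed to be a finite set.

```python
def kleene_upto(A, max_len):
    """Return the strings of #A whose length is at most max_len (A finite)."""
    X = set()
    while True:
        # One step of X := {""} | A X, discarding strings that are too long.
        nxt = {""} | {a + x for a in A for x in X if len(a + x) <= max_len}
        if nxt == X:      # nothing new within the bound: stop
            return X
        X = nxt
```

    For example, kleene_upto({"ab"}, 4) gives {"", "ab", "abab"}.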

    A set of definitions associates a meaning (M(n)) with each defined term (n) by the rule that each M(n) is the smallest set of strings such that all the definitions are true simultaneously:

  13. For all n:N, M(n) = D[n](M(m1),M(m2),M(m3),...),

    For example, the grammar

  14. Net{ L:="" | a L b } defines L to be the smallest set of strings such that
  15. M(L)={""}| {c || for some x in M(L) (c =a!x!b)}

    This definition does not tell us how to find the M's, nor does it prove that they exist or that they are unique. Finally, the term 'smallest' has not been defined. It can be shown, for a wide class of grammars (see later), that we can get as complete a listing of the M's as we like by a simple process:


    1. Initially, for n:N do M(n)={}.

    2. Iterate the following steps as often as you want:
      Net
      1. For n:N do P(n) := D[n](M(m1),M(m2),M(m3),...).
      2. For n:N do M(n) := P(n).

      (End of Net)
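
    The process above can be sketched in Python as follows. This is an illustration under assumptions, not a definitive implementation: each D[n] is represented as a function taking the whole table M of current approximations, and the names concat, D and iterate are hypothetical. The grammar used is L := "" | a L b.

```python
def concat(A, B):
    """Concatenation of two sets of strings."""
    return {a + b for a in A for b in B}

# D[n] for each nonterminal n, as a function of the current approximations M.
D = {
    "L": lambda M: {""} | concat(concat({"a"}, M["L"]), {"b"}),
}

def iterate(D, steps):
    """Run the two-step process `steps` times, starting from empty sets."""
    M = {n: set() for n in D}            # step 1: initially M(n) = {}
    for _ in range(steps):
        # Step 2.1: compute every P(n) from the old M before any update.
        P = {n: D[n](M) for n in D}
        # Step 2.2: replace all the M(n) at once.
        M = P
    return M
```

    Computing all the P(n) before updating any M(n) mirrors the separation of the two inner steps: every definition sees the same previous approximation.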

    For example,
  16. { L:=""| a L b } gives the following sets as successive approximations to M(L):
    Net
    1. {}
    2. {""}|a{}b = {""}
    3. {""}|a{""}b = {"","ab"}
    4. {""}|a{"","ab"}b = {"","ab","aabb"}
    5. {""}|a{"","ab","aabb"}b = {"","ab","aabb","aaabbb"}
    6. {""}|a{"","ab","aabb","aaabbb"}b = {"","ab","aabb","aaabbb","aaaabbbb"}
    7. ...

    (End of Net)
    It seems intuitively obvious that these sets are `tending towards a limit`:

  17. M(L)={a^n!b^n||n:0..}.

    The reason that these seem to converge to M(L) is that we have to look at longer and longer examples of strings in L to find one that has not been generated by the iteration. The strings on which the iterates and M(L) disagree are getting longer and longer. More formally, if we iterate long enough, then the iterates will match the limit up to any preselected length.
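
    The claim about matching the limit up to a preselected length can be checked mechanically for this one grammar. The sketch below assumes the limit is the set in formula 17 and uses hypothetical helper names (step, iterate, limit_upto, agrees_upto). It is a check on examples, not a proof.

```python
def step(X):
    """One iteration for L := "" | a L b."""
    return {""} | {"a" + x + "b" for x in X}

def iterate(steps):
    """Apply step `steps` times, starting from the empty set."""
    X = set()
    for _ in range(steps):
        X = step(X)
    return X

def limit_upto(max_len):
    """Strings a^n b^n of formula 17 whose length is at most max_len."""
    return {"a" * n + "b" * n for n in range(max_len // 2 + 1)}

def agrees_upto(steps, max_len):
    """Does the iterate match the limit on all strings of length <= max_len?"""
    approx = {s for s in iterate(steps) if len(s) <= max_len}
    return approx == limit_upto(max_len)
```

    After i iterations the approximation is {a^n b^n || n:0..i-1}, so it agrees with the limit on every string shorter than 2i. Hence agrees_upto(5, 8) holds while agrees_upto(5, 10) does not.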


Formulae and Definitions in Alphabetical Order