Navigation
  • Home
  • Recent
  • Most Active
  • Popular
  • Blog
  • Credits
  • RSS
  •   Interaction
  • Register
  • Statistics
  •   Help
  • Suggestions
  • Contact Us
  • How to Edit
  • Help



  • [Edit]


    In linguistics and computer science, a context-free grammar (CFG) is a formal grammar in which every production rule is of the form:V → w
    where V is a nonterminal symbol and w is a string consisting of terminals and/or non-terminals. The term "context-free" comes from the fact that the non-terminal V can always be replaced by w, regardless of the context in which it occurs. A formal language is context-free if there is a context-free grammar that generates it.

    Context-free grammars are powerful enough to describe the syntax of most programming languages; in fact, the syntax of most programming languages are specified using context-free grammars. On the other hand, context-free grammars are simple enough to allow the construction of efficient parsing algorithms which, for a given string, determine whether and how it can be generated from the grammar. An Earley parser is an example of such an algorithm, while LR and LL parsers only deal with more restrictive subsets of context-free grammars.

    BNF (Backus-Naur Form) is the most common notation used to express context-free grammars.

    Not all formal languages are context-free — a well-known counterexample is , the set of strings containing some number of a's, followed by the same number of b's and the same number of c's.


        Context-free grammar
            Formal definition
                Example 1
                Example 2
                Example 3
                Example 4
                Other examples
            Derivations and syntax trees
            Normal forms
            Undecidable problems
            Properties of context-free languages
            See also

    top

    Formal definition
    Just as any formal grammar, a context-free grammar G can be defined as a 4-tuple:

    G = (V_t, V_n, P, S)
    where

      V_t is a finite set of terminals
      V_n is a finite set of non-terminals
      P is a finite set of production rules
      S is an element of V_n, the distinguished starting non-terminal.
      elements of P are of the form
    V_n longrightarrow (V_t cup V_n)^

    A language L is said to be a Context-Free-Language (CFL) if its grammar is Context-Free. More precisely, it is a language whose words, sentences and phrases are made of symbols and words from a Context-Free-Grammar. Usually, CFL is of the form L=L(G). Given below are examples for CFG but not for CFL.

    top

    Example 1
    A simple context-free grammar is given as.......
    S → aSb | ε

    where | is used to separate multiple options for the same non-terminal, and ε stands for the empty string. This grammar generates the language

    which is not regular.

    top

    Example 2
    Here is a context-free grammar for syntactically correct infix algebraic expressions in the variables x, y and z:
    S → x | y | z | S + S | S - S | S
      S | S/S | (S)

    This grammar can, for example, generate the string "( x + y )
      x - z
        y / ( x + x )".

    top

    Example 3
    A context-free grammar for the language consisting of all strings over which contain a different number of a's to b's is
    S → U | V

    U → TaU | TaT

    V → TbV | TbT

    T → aTbT | bTaT | ε

    Here, T can generate all strings with the same number of a's as b's, U generates all strings with more a's than b's and V generates all strings with fewer a's than b's.

    top

    Example 4
    Another example of a context-free language is . This is not a regular language, but it is context free as it can be generated by the following CFG (Context Free Grammar):
    S → bSbb | A

    A → aA | ε


    top

    Other examples
    Context-free grammars are not limited in application to mathematical ("formal") languages.
    For example, it has been suggested that a class of Tamil poetry called Venpa is governed by a context-free grammar.

    top

    Derivations and syntax trees
    There are two common ways to describe how a given string can be derived from the start symbol of a given grammar. The simplest way is to list the consecutive strings of symbols, beginning with the start symbol and ending with the string, and the rules that have been applied. If we introduce a strategy such as "always replace the left-most nonterminal first" then for context-free grammars the list of applied grammar rules is by itself sufficient. This is called the leftmost derivation of a string. For example, if we take the following grammar:
    (1) S → S + S

    (2) S → 1

    (3) S → a


    and the string "1 + 1 + a" then a left derivation of this string is the list (1), (1), (2), (2), (3) . Analogously the rightmost derivation is defined as the list that we get if we always replace the rightmost nonterminal first. In this case this could be the list (1), (3), (1), (2), (2).

    The distinction between leftmost derivation and rightmost derivation is important because in most parsers the transformation of the input is defined by giving a piece of code for every grammar rule that is executed whenever the rule is applied. Therefore it is important to know whether the parser determines a leftmost or a rightmost derivation because this determines the order in which the pieces of code will be executed. See for an example LL parsers and LR parsers.

    A derivation also imposes in some sense a hierarchical structure on the string that is derived. For example the structure of the string "1 + 1 + a" would, according to the leftmost derivation, be:

    S→S+S (1)

    S→S+S+S (1)

    S→1+S+S (2)

    S→1+1+S (2)

    S→1+1+a (3)


    S

    where S indicates a substring recognized as belonging to S. This hierarchy can also be seen as a tree:

    S
    /|
    / |
    / |
    S '+' S
    /| |
    / | |
    S '+' S 'a'
    | |
    '1' '1'

    This tree is called a concrete syntax tree (see also abstract syntax tree) of the string. In this case the presented leftmost and the rightmost derivation define the same syntax tree, however there is another (leftmost) derivation of the same string possible

    S→ S + S (1)

    S→ 1 + S (2)

    S→ 1 + S + S (1)

    S→ 1 + 1 + S (2)

    S→ 1 + 1 + a (3)


    and this defines the following syntax tree:

    S
    /|
    / |
    / |
    S '+' S
    | /|
    | / |
    '1' S '+' S
    | |
    '1' 'a'

    If for certain strings in the language of the grammar there is more than one parsing tree then the grammar is said to be an ambiguous grammar. Such grammars are usually hard to parse because the parser cannot always decide which grammar rule it has to apply.

    top

    Normal forms
    Every context-free grammar that does not generate the empty string can be transformed into an equivalent one in Chomsky normal form or Greibach normal form. "Equivalent" here means that the two grammars generate the same language.

    Because of the especially simple form of production rules in Chomsky Normal Form grammars, this normal form has both theoretical and practical implications. For instance, given a context-free grammar, one can use the Chomsky Normal Form to construct a polynomial-time algorithm which decides whether a given string is in the language represented by that grammar or not (the CYK algorithm).

    top

    Undecidable problems
    Although some operations on context-free grammars are decidable due to their limited power, CFGs do have interesting undecidable problems. One of the simplest and most cited is the problem of deciding whether a CFG accepts the language of all strings. A reduction can be demonstrated to this problem from the well-known undecidable problem of determining whether a Turing machine accepts a particular input. The reduction uses the concept of a computation history, a string describing an entire computation of a Turing machine. We can construct a CFG that generates all strings that are not accepting computation histories for a particular Turing machine on a particular input, and thus it will accept all strings only if the machine does not accept that input.

    As a consequence of this, it is also undecidable whether two CFGs describe the same language, since we can't even decide whether a CFG is equivalent to the trivial CFG deciding the language of all strings.

    On the other hand, the problem of determining whether a CFG accepts at least one string is decidable.

    top

    Properties of context-free languages
      An alternative and equivalent definition of context-free languages employs non-deterministic push-down automata: a language is context-free if and only if it can be accepted by such an automaton.
      A language can also be modeled as a set of all sequences of terminals which are accepted by the grammar. This model is helpful in understanding set operations on languages.
      The union and concatenation of two context-free languages is context-free, but the intersection need not be.
      The reverse of a context-free language is context-free, but the complement need not be.
      The intersection of a context-free language and a regular language is always context-free.
      To prove that a given language is not context-free, one may employ the pumping lemma for context-free languages.
      Another point worth mentioning is that the problem of determining if a context-sensitive grammar describes a context-free language is undecidable.

    top

    See also
     
    Search more:
     

       
    Source Privacy License Download Contact Us Atlas
    Scientus.org Dictionary (Yet Another Wiki) RC : 1.39
    MIT OpenCourseWare
    This article is licensed under the GNU Free Documentation License [copyleft]. It uses material from the Wikipedia article "Context-free grammar". link