Computer Science 456/656
Automata and Formal Languages

Every context-free language (CFL) is in the class P-Time. This is proved by giving a polynomial time algorithm for deciding whether a given string is in a given CFL. The best known such algorithm is called the CYK algorithm, and uses dynamic programming to build a collection of derivations of substrings from grammar symbols.
Let G be a CFG. Let L be the language generated by G.
For the sake of simplicity, I shall assume that G is in Chomsky normal form. In actual fact, the CYK algorithm can be adapted to an arbitrary context-free grammar, but the details are much more complex. This complexity makes explaining it far more difficult, but adds nothing to your understanding, because the CNF case gives you the essence of the algorithm.
Let m be the number of grammar symbols of G. We name this grammar symbols x1, x2, ... xm. We can assume that the start symbol is x1. We can also assume that x1 ... xr are variables, and that xr+1 ... xm are terminals.
Let w[p,d] be the substring of w of length d starting from the pth symbol of w. Let P[i,p,d] be the boolean array where P[i,p,d] means that xi derives w[p,d]. Note that the size of the array is O(n2), because the size of the grammar is treated as a constant.
The steps of the CYK algorithm:
  1. Initialize all P[i,p,d] to false.
  2. For all i from r+1 to m, and for all p from 1 to n, if xi = w[p,1], then assign P[i,p,1] to be true.
  3. For each production of the form xj -> xi, where j is between 1 and p (i.e., xj is a variable) and i is between r+1 and m (i.e., xi is a terminal), and for each p from 1 to n, if P[i,p,1], then assign P[j,p,1] to be true.
  4. Execute the following for all d, starting with d=2 and ending with d=n.
    1. Execute the following for every production of the form xj -> xjxk
        for all p from 1 to n-d+1,
        for q from p+1 to p+d-1,
        if P[j,p,q-p] and P[k,q,d+p-q] then assign P[i,p,d] to true.
  5. If P[1,1,n] then return Yes else return No.